Posts

Comments

Comment by Devansh Shah (devansh-shah) on An example elevator pitch for AI doom · 2023-04-15T15:40:10.138Z · LW · GW

Contrary opinion: Agents like #AutoGPT are more aligned than the underlying LLMs because of chaining. For example, if the model says "I need to make $1M" and then answers itself with "Stealing is the best plan to achieve it", the underlying LLM gets two separate prompts at which it can refuse to help.
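
To make the chaining point concrete, here is a toy sketch. It assumes each step of the agent's chain is sent to the underlying LLM as its own prompt, so each step is a separate chance to refuse. `run_chain` and `toy_refuses` are hypothetical names, and the keyword check is only a stand-in for a real model's refusal behavior, not how AutoGPT or any LLM actually works.

```python
from typing import Callable, List, Optional


def run_chain(steps: List[str], refuses: Callable[[str], bool]) -> Optional[int]:
    """Return the index of the first step that is refused, or None if no step is refused."""
    for i, prompt in enumerate(steps):
        # Every chained step passes through the model again,
        # so every step is another opportunity to refuse.
        if refuses(prompt):
            return i
    return None


def toy_refuses(prompt: str) -> bool:
    # Hypothetical stand-in for the underlying LLM's refusal behavior.
    return "steal" in prompt.lower()


chain = [
    "I need to make $1M",
    "Stealing is the best plan to achieve it",
]

first_refusal = run_chain(chain, toy_refuses)
```

In this toy version the goal passes and the plan is what gets refused. With a single prompt there would be only one such check; with a chain of n steps there are n.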