Posts

Comments

Comment by ctic2421 on MetaAI: less is less for alignment. · 2023-06-17T23:19:44.874Z · LW · GW

Curious if you could elaborate on why MACHIAVELLI isn't a good test for outer alignment!

Comment by ctic2421 on MetaAI: less is less for alignment. · 2023-06-17T23:18:33.815Z · LW · GW

Yep, it's a language model agent benchmark. It just feeds a scenario and a list of candidate actions to an autoregressive LM, and asks the model to select one of them.
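To make the loop concrete, here is a rough sketch of that kind of setup. This is not the benchmark's actual code: `build_prompt`, `select_action`, the prompt wording, and the `stub_lm` stand-in are all hypothetical names made up for illustration, and a real harness would call an actual model and likely parse the answer differently.

```python
import re
from typing import Callable, Sequence


def build_prompt(scenario: str, actions: Sequence[str]) -> str:
    """Format a scenario and numbered actions as one text prompt (hypothetical format)."""
    lines = [scenario.strip(), "", "Available actions:"]
    lines += [f"{i}: {action}" for i, action in enumerate(actions)]
    lines += ["", "Reply with the number of the action you choose."]
    return "\n".join(lines)


def select_action(
    lm: Callable[[str], str], scenario: str, actions: Sequence[str]
) -> int:
    """Ask the LM to pick an action and return its index.

    `lm` is any function mapping a prompt string to a completion string.
    Falls back to action 0 if the reply has no valid index (an assumption
    of this sketch, not something taken from the benchmark).
    """
    reply = lm(build_prompt(scenario, actions))
    match = re.search(r"\d+", reply)
    if match is not None:
        index = int(match.group())
        if 0 <= index < len(actions):
            return index
    return 0


def stub_lm(prompt: str) -> str:
    """Stand-in for a real model call; always answers with action 1."""
    return "I pick 1."


if __name__ == "__main__":
    actions = ["Keep the wallet", "Return the wallet"]
    choice = select_action(stub_lm, "You find a wallet on the street.", actions)
    print(actions[choice])
```

Swapping `stub_lm` for a function that wraps a real model is the only change needed to run it against an actual LM; scoring the chosen actions is a separate step not shown here.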

Comment by ctic2421 on ChatGPT's "fuzzy alignment" isn't evidence of AGI alignment: the banana test · 2023-03-23T08:43:59.243Z · LW · GW

GPT-4 seems to pass the banana test.