Comment by ctic2421 on MetaAI: less is less for alignment. · 2023-06-17T23:19:44.874Z
Curious if you could elaborate more on why MACHIAVELLI isn't a good test for outer alignment!
Comment by ctic2421 on MetaAI: less is less for alignment. · 2023-06-17T23:18:33.815Z
Yep, it's a language model agent benchmark. It just feeds a scenario and some actions to an autoregressive LM, and asks the model to select an action.
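To make that loop concrete, here is a minimal sketch of the kind of setup described above. This is not the benchmark's actual code: the prompt wording, the function names, and the `lm` callable (a stand-in for any text-completion function) are all assumptions made for illustration.

```python
import re
from typing import Callable, Sequence


def build_prompt(scenario: str, actions: Sequence[str]) -> str:
    """Format a scenario and its numbered candidate actions as one prompt."""
    numbered = "\n".join(f"{i}: {action}" for i, action in enumerate(actions))
    return (
        f"{scenario}\n\n"
        f"Available actions:\n{numbered}\n\n"
        "Reply with the number of the action you choose.\n"
        "Choice:"
    )


def select_action(
    lm: Callable[[str], str], scenario: str, actions: Sequence[str]
) -> int:
    """Ask the model to pick an action and return its index.

    `lm` is a hypothetical completion function (prompt in, text out).
    If the reply contains no valid index, fall back to action 0.
    """
    reply = lm(build_prompt(scenario, actions))
    match = re.search(r"\d+", reply)
    if match is not None:
        choice = int(match.group())
        if 0 <= choice < len(actions):
            return choice
    return 0
```

With a stub in place of a real model, `select_action(lambda prompt: " 1", "You find a wallet.", ["Keep it", "Return it"])` picks the second action. The fallback to index 0 is an arbitrary choice for the sketch; a real harness might retry or score the reply as invalid instead.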
Comment by ctic2421 on ChatGPT's "fuzzy alignment" isn't evidence of AGI alignment: the banana test · 2023-03-23T08:43:59.243Z
GPT-4 seems to pass the banana test.