Posts

Can We Predict Persuasiveness Better Than Anthropic? 2024-08-04T14:05:33.668Z
A Visual Task that's Hard for GPT-4o, but Doable for Primary Schoolers 2024-07-26T17:51:28.202Z

Comments

Comment by Lennart Finke (l-f) on Can We Predict Persuasiveness Better Than Anthropic? · 2024-08-05T14:02:23.695Z · LW · GW

Good point, and I was conflicted about whether to put my thoughts about this at the end of the post. My best theory is that increased persuasion abilities look something like "totalitarian government agents doing solid scaffolding on open-source models to DM people on Facebook". We will see persuasive agents get better, but not know why or how. As stated in the introduction, persuasion detection is dangerous, but it is one of the few capabilities that could also be used defensively (e.g. detecting persuasion in an incoming email -> displaying a warning in the UI and offering to rephrase).

In conclusion, I definitely agree that we should consider closed-sourcing any improvements upon the above baseline and only showing them to safety orgs instead. Some people at AISI I have talked to while working on persuasion are probably interested in this.

Comment by Lennart Finke (l-f) on Can We Predict Persuasiveness Better Than Anthropic? · 2024-08-05T07:34:00.435Z · LW · GW

Thanks, fixed!

Comment by Lennart Finke (l-f) on A Visual Task that's Hard for GPT-4o, but Doable for Primary Schoolers · 2024-07-28T07:29:18.204Z · LW · GW

Agreed, although that in turn makes me wonder why it does perform a bit better than random. Maybe there is some nondeclarative knowledge about the image, or some blurred position information? Next, I might test how much vision is the bottleneck here by providing a text representation of the grid, as in Ryan Greenblatt's work on ARC-AGI.
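
To make the text-representation idea concrete, here is a minimal sketch of how a grid could be serialized for a text-only prompt. Everything in it is an assumption for illustration: the function name `grid_to_text`, the use of `None` for empty cells, and the example grid contents are hypothetical and not taken from the post or from the ARC-AGI work.

```python
def grid_to_text(grid, empty="."):
    """Render a 2D grid as one line of text per row, cells separated by spaces.

    `grid` is a list of rows; each cell is either None (empty) or a label.
    This encoding is an illustrative assumption, not the post's actual format.
    """
    return "\n".join(
        " ".join(empty if cell is None else str(cell) for cell in row)
        for row in grid
    )


# Hypothetical 3x3 grid; None marks an empty cell.
example = [
    ["A", None, "B"],
    [None, "C", None],
    ["D", None, None],
]

# The resulting string could be pasted into a prompt in place of the image.
prompt_fragment = grid_to_text(example)
print(prompt_fragment)
```

If the model does much better on the text version than on the image, that would suggest vision is the bottleneck; similar performance would point to the reasoning step instead.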