The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better
post by Thane Ruthenis · 2025-02-21T20:15:11.545Z · LW · GW
First, let me quote my previous ancient post on the topic [LW · GW]:
Effective Strategies for Changing Public Opinion
The titular paper [EA · GW] is very relevant here. I'll summarize a few points.
- The main two forms of intervention are persuasion and framing.
- Persuasion is, to wit, an attempt to change someone's set of beliefs, either by introducing new ones or by changing existing ones.
- Framing is a more subtle form: an attempt to change the relative weights of someone's beliefs by emphasizing different aspects of the situation, recontextualizing it.
- There's a dichotomy between the two. Persuasion is found to be very ineffective if used on someone with high domain knowledge. Framing-style arguments, on the other hand, are more effective the more the recipient knows about the topic.
- Thus, persuasion is better used on non-specialists, and it's most advantageous the first time it's used. If someone tries it and fails, they raise the recipient's domain knowledge, and the second persuasion attempt would be correspondingly hampered. Cached thoughts are also in effect.
- Framing, conversely, is better for specialists.
My sense is that, up to this point, AI risk advocacy targeted the following groups of people:
- ML researchers and academics, who want "scientifically supported" arguments.
- Advocacy methods: theory-based arguments, various proof-of-concept empirical evidence of misalignment, model organisms, et cetera.
- US policymakers, who want either popular support or expert support to champion a given cause.
- Advocacy methods: behind-the-scenes elbow-rubbing, polls showing bipartisan concern for AI, parading around the experts concerned about AI.
- Random Internet people with interests or expertise in the area.
- Advocacy methods: viral LW/Xitter blog posts laying out AI X-risk arguments.
Persuasion
I think none of the above demographics are worth trying to persuade further at this point in time. Persuasion was very productive before, when they didn't yet have high domain knowledge related to AI risk specifically, and there have been some major wins.
But further work in this space (and therefore work on all corresponding advocacy methods, yes) is likely to have ~no value.
- ~All ML researchers and academics that care have already made up their mind regarding whether they prefer to believe in misalignment risks or not. Additional scary papers and demos aren't going to make anyone budge.
- The relevant parts of the USG are mostly run by Musk and Vance nowadays, who have already decided either that they've found the solution to alignment (curiosity, or whatever Musk is spouting nowadays), or that AI safety is about wokeness. They're not going to change their minds. They're also going to stamp out any pockets of X-risk advocacy originating from within the government, so lower-level politicians are useless to talk to as well.
- Terminally online TPOT Xitters have already decided that it's about one of {US vs. China, open source vs. totalitarianism, wokeness vs. free speech, luddites vs. accelerationism}, and aren't going to change their mind in response to blog posts/expert opinions/cool papers.
Among those groups, we've already convinced ~everyone we were ever going to convince. That work was valuable and high-impact, but the remnants aren't going to budge in response to any evidence short of a megadeath AI catastrophe.[1]
Hell, I am 100% convinced that AI X-risk is real, and even I'm getting nauseated at how tone-deaf, irrelevant, and impotent the arguments for it sound nowadays, in the spaces in which we keep trying to make them.
A Better Target Demographic
Here's whom we should actually be trying to convince, or rather inform: normal people. The General Public.
- This demographic is very much a distinct demographic from the terminally online TPOT xitter users.
- This demographic is also dramatically bigger and more politically relevant.
- Polls have demonstrated that this demographic shows wide bipartisan support for the position that AI is existentially threatening, if their attention is directed to it.
- However: this demographic is largely unaware of what's been happening.
- If they've used AI at all, they mostly think it's all just chatbots (and probably the free tier of ChatGPT, at that).
- Ideas like hard takeoff, AI accelerating AI research, or obvious-to-us ways to turn chatbots into agents, are very much not obvious to them. The connection between "this funny thing stuck in a dialog window" and "a lightcone-eating monstrosity" requires tons of domain expertise to make.
- Most of them don't even know the basics, such as that we don't know how AI works [LW · GW]. They think it's all manually written code underneath, all totally transparent and controllable. And if someone does explain, they tend to have appropriate reactions to that information.
- This demographic is not going to eat out of the AGI Labs' hands when they say they're being careful and will share the benefits with humanity. "Greedy corporations getting us all killed in the pursuit of power" is pretty easy to get.
- This demographic is easily capable of understanding the grave importance of X-risks (see the recent concerns regarding a 3% chance of asteroid impact in 2032).
If we can raise awareness of AGI doom among the actual general public (again, not the small demographic of terminally online people), that will create significant political pressure on the USG, giving politicians an incentive to have platforms addressing the risks.
The only question is how to do that. I don't have a solid roadmap here. But it's not by writing viral LW/Xitter blog posts.
Some scattershot thoughts:
- Comedians seem like a useful vector.
- Newspapers and podcasts too. More stuff in the vein of Eliezer's Time article would be good. Podcast-wise, we want stuff with a broad audience of "normies". (So, probably not whatever podcasts you are listening to, median LW reader.)
- "Who will control the ASI if they can control it?" is another potentially productive question to pose. There's wide distrust in/dissatisfaction with all of {governments, corporations, billionaires, voting procedures}. Nobody wants them to have literal godlike power. Raising people's awareness regarding what the AGI labs are even saying they are doing, and what implications that'd have – without even bringing in misalignment concerns – might have the desired effect all on its own. (Some more on that [LW(p) · GW(p)].)
- This one is kinda tricky, though.
- @harfe [LW · GW]'s galaxy-brained idea here [LW(p) · GW(p)] about having someone run in the 2028 election on an AI Notkilleveryoneism platform. Not with the intent to win, but with the intent to raise awareness and force the other candidates to speak on the topic.
- I am not sure how sensible this is, and also 2028 might be too late. But it'd be big if workable.
Overall, I expect that there's a ton of low-hanging, high-impact fruit in this space, and that even more high-impact clever interventions are possible (in the vein of harfe's idea).
Extant Projects in This Space?
Some relevant ones I've heard about:
- My impression is that MIRI is on it, with their change of focus. I haven't seen much come of that besides the Time article plus Eliezer appearing on a few podcasts, though.
- I think Conjecture might be doing this stuff too, with their Compendium [LW · GW] et cetera? I think they've been talking about appeals to the (actual) general public as well. But I haven't been following them closely.
- AI Notkilleveryoneism Memes shows some examples of what not to do:
- Mostly speaking to a Twitter-user demographic.
- Using shrill, jargon-heavy (therefore exclusionary) terminology. Primarily, constantly calling AI models "shoggoths" with no explanation.
- Overall posture seems mostly optimized for creating an echo chamber of AI-terrified fanatics, not for maximally broad public outreach.
- PauseAI is a mixed bag. They get some things right, but they're also acting prematurely in ways that risk being massively net negative.
- Protests' purpose is to cause a signaling cascade, showing to people that there are tons of other people sharing their opinions and concerns. If done well, they cause a snowball effect, with subsequent protests being ever-bigger.[2]
- There's no chance of causing this yet: as I'd said, the general public's opinion on AI is mostly the null value. You need to raise awareness first, then aim for a cascade.
- As-is, this is mostly going to make people's first exposure to AI X-risk be "those crazy fringe protestors". See my initial summary regarding effective persuasion: that would be lethal, gravely sabotaging our subsequent persuasion efforts.
Framing
Technically, I think there might be some hope for appealing to researchers/academics/politicians/the terminally online, by reframing the AI Risk concerns in terms they would like more.
All the talk about "safety" and "pauses" has led to us being easy to misinterpret as unambitious, technology-concerned, risk-averse luddites. That's of course incorrect. I, at least, am 100% onboard with enslaving god, becoming immortal, merging with the machines, eating the galaxies, perverting the natural order to usher in an unprecedented age of prosperity, forcing the wheels of time into reverse to bring the dead back to life, and all that good stuff. I am pretty sure most of us are like this (if perhaps not in those exact terms).
The only reason I/we are not accelerationists is because the current direction of AI progress is not, in fact, on the track to lead us to that glorious future. It's instead on the track to get us all killed like losers.
So a more effective communication posture might be to emphasize this: frame the current AI paradigm as a low-status sucker's game, and suggest alternative avenues for grabbing power. Uploads [LW · GW], superbabies [LW · GW], adult intelligence enhancement [LW · GW], more transparent/Agent Foundations-y AI research, etc. Reframing "AI Safety" as being about high-fidelity AI Control might also be useful. (It's mostly about making AIs Do What You Mean, after all, and the best alignment work is almost always dual-use.)
If the current paradigm of AI capability advancement visibly stumbles in its acceleration[3], this type of messaging would become even more effective. The black-box DL paradigm would open itself to derision for being a bubble, an empty promise.
I mention this reluctantly, for comprehensiveness' sake. I think this is a high-variance approach: most attempts at it are going to land badly and will amount to nothing or have a negative effect. But it is a possible option.
Messaging aimed at the general public is nevertheless a much better, and more neglected, avenue.
- ^
Or maybe not even then, see the Law of Continued Failure [? · GW].
- ^
The toy model there is roughly:
- Protest 1 is made up of some number of people, n_1, who are willing to show their beliefs in public even with the support of zero other people.
- Protest 2 is joined by people who are willing to show their beliefs in public if they have the support of n_1 other people.
- ...
- Protest k is joined by people who are willing to show their beliefs in public if they have the support of n_1 + n_2 + ... + n_(k-1) other people.
(Source, Ctrl+F in the transcript for "second moving part is diverse threshold". A small simulation sketch of this model follows below.)
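For concreteness, here's a minimal sketch of that diverse-threshold cascade in Python. This is my own illustration of the toy model above, not code from the linked source, and the population sizes and threshold values are made up. The point it shows: if few people have low thresholds (i.e. awareness is low), the cascade stalls after the first protest, which is the concern about premature protests.

```python
import random

def simulate_cascade(thresholds):
    """Each person joins once the number of prior participants meets their threshold.
    Returns cumulative participation after each successive protest."""
    participants = 0
    remaining = list(thresholds)
    rounds = []
    while True:
        joiners = [t for t in remaining if t <= participants]
        if not joiners:
            break  # cascade stalls: nobody's threshold is met anymore
        remaining = [t for t in remaining if t > participants]
        participants += len(joiners)
        rounds.append(participants)
    return rounds

if __name__ == "__main__":
    random.seed(0)
    # A small core with threshold 0 (will protest with zero supporters),
    # plus a wider population whose thresholds are spread between 1 and 200.
    population = [0] * 5 + [random.randint(1, 200) for _ in range(500)]
    print(simulate_cascade(population))
```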
- ^
Which I do mostly expect. AGI does not seem just around the corner on my inside model of AI capabilities. The current roadmap seems to be "scale inference-time compute, build lots of RL environments, and hope that God will reward those acts of devotion by curing all LLM ailments and blessing them with generalization". Which might happen, DL is weird. But I think there's a lot of room for skepticism with that idea.
I think the position that The End Is Nigh is being deliberately oversold by powerful actors: the AGI Labs. It's in their corporate interests to signal hype to attract investment, regardless of how well research is actually progressing. So the mere fact that they're acting optimistic carries no information.
And those of us concerned about relevant X-risks are uniquely vulnerable to buying into that propaganda. Just with the extra step of transmuting the hype into despair. We're almost exactly the people this propaganda is optimized for, after all – and we're not immune to it.
4 comments
comment by Mitchell_Porter · 2025-02-22T00:09:58.615Z · LW(p) · GW(p)
frame the current AI paradigm as a low-status sucker's game, and suggest alternative avenues for grabbing power
You say you want to target normal people, but even in the 2020s, normal people are not transhumanists. The most politically effective message would be anti-AI in general, anti-transhumanism in general, and would portray the desire to play God as the core problem.
↑ comment by Thane Ruthenis · 2025-02-22T00:11:08.134Z · LW(p) · GW(p)
The section under "framing" isn't for targeting normal people, it's an alternate approach for targeting the academics/researchers/politicians/tech-enthusiasts.
comment by Davey Morse (davey-morse) · 2025-02-21T22:14:47.690Z · LW(p) · GW(p)
agreed that comedy/memes might be a strategic route for spreading ai x-risk awareness to general population. this kind of thinking inspired this silly alignment game https://dontsedateme.org.
some other silly/unlikely ideas:
- attempting to reframe religions/traditional god-concepts around the impending superintelligence. i'm sure SI, once here, will be considered a form of god to many.
- AGI ice bucket challenge.
- Ads with simple one-liners like, "We are building tech we won't be able to control."
comment by Ebenezer Dukakis (valley9) · 2025-02-22T06:30:54.421Z · LW(p) · GW(p)
It looks like the comedian whose clip you linked has a podcast:
https://www.joshjohnsoncomedy.com/podcasts
I don't see any guests in their podcast history, but maybe someone could invite him on a different podcast? His website lists appearances on other podcasts. I figure it's worth trying stuff like this for VoI.
I think people should put more emphasis on the rate of improvement in this technology. Analogous to the early days of COVID: it's not where we are that's worrisome; it's where we're headed.