The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better
post by Thane Ruthenis · 2025-02-21T20:15:11.545Z · LW · GW
First, let me quote my previous ancient post on the topic [LW · GW]:
Effective Strategies for Changing Public Opinion
The titular paper [EA · GW] is very relevant here. I'll summarize a few points.
- The main two forms of intervention are persuasion and framing.
- Persuasion is, to wit, an attempt to change someone's set of beliefs, either by introducing new ones or by changing existing ones.
- Framing is a more subtle form: an attempt to change the relative weights of someone's beliefs by emphasizing different aspects of the situation, recontextualizing it.
- There's a dichotomy between the two. Persuasion is found to be very ineffective if used on someone with high domain knowledge. Framing-style arguments, on the other hand, are more effective the more the recipient knows about the topic.
- Thus, persuasion is better used on non-specialists, and it's most advantageous the first time it's used. If someone tries it and fails, they raise the recipient's domain knowledge, and the second persuasion attempt would be correspondingly hampered. Cached thoughts are also in effect.
- Framing, conversely, is better for specialists.
My sense is that, up to this point, AI risk advocacy targeted the following groups of people:
- ML researchers and academics, who want "scientifically supported" arguments.
- Advocacy methods: theory-based arguments, various proof-of-concept empirical evidence of misalignment, model organisms, et cetera.
- US policymakers, who want either popular support or expert support to champion a given cause.
- Advocacy methods: behind-the-scenes elbow-rubbing, polls showing bipartisan concern for AI, parading around the experts concerned about AI.
- Random Internet people with interests or expertise in the area.
- Advocacy methods: viral LW/Xitter blog posts laying out AI X-risk arguments.
Persuasion
I think none of the above demographics are worth trying to persuade further at this point in time. Persuasion was very productive before, when they didn't yet have high domain knowledge related to AI risk specifically, and there have been some major wins.
But further work in this space (and therefore work on all corresponding advocacy methods, yes) is likely to have ~no value.
- ~All ML researchers and academics that care have already made up their mind regarding whether they prefer to believe in misalignment risks or not. Additional scary papers and demos aren't going to make anyone budge.
- The relevant parts of the USG are mostly run by Musk and Vance nowadays, who have already decided either that they've found the solution to alignment (curiosity, or whatever Musk is spouting nowadays), or that AI safety is about wokeness. They're not going to change their minds. They're also going to stamp out any pockets of X-risk advocacy originating from within the government, so lower-level politicians are useless to talk to as well.
- Terminally online TPOT Xitters have already decided that it's about one of {US vs. China, open source vs. totalitarianism, wokeness vs. free speech, luddites vs. accelerationism}, and aren't going to change their mind in response to blog posts/expert opinions/cool papers.
Among those groups, we've already convinced ~everyone we were ever going to convince. That work was valuable and high-impact, but the remnants aren't going to budge in response to any evidence short of a megadeath AI catastrophe.[1]
Hell, I am 100% convinced that AI X-risk is real, and even I'm getting nauseated at how tone-deaf, irrelevant, and impotent the arguments for it sound nowadays, in the spaces in which we keep trying to make them.
A Better Target Demographic
Here's whom we should actually be trying to convince, or rather inform: normal people. The General Public.
- This demographic is very much a distinct demographic from the terminally online TPOT xitter users.
- This demographic is also dramatically bigger and more politically relevant.
- Polls have demonstrated that this demographic shows wide bipartisan support for the position that AI is existentially threatening, if their attention is directed to it.
- However: this demographic is largely unaware of what's been happening.
- If they've used AI at all, they mostly think it's all just chatbots (and probably the free tier of ChatGPT, at that).
- Ideas like hard takeoff, AI accelerating AI research, or obvious-to-us ways to turn chatbots into agents, are very much not obvious to them. The connection between "this funny thing stuck in a dialog window" and "a lightcone-eating monstrosity" requires tons of domain expertise to make.
- Most of them don't even know the basics, such as that we don't know how AI works [LW · GW]. They think it's all manually written code underneath, all totally transparent and controllable. And if someone does explain, they tend to have appropriate reactions to that information.
- This demographic is not going to eat out of the AGI Labs' hands when they say they're being careful and will share the benefits with humanity. "Greedy corporations getting us all killed in the pursuit of power" is pretty easy to get.
- This demographic is easily capable of understanding the grave importance of X-risks (see the recent concerns regarding a 3% chance of asteroid impact in 2032).
If we can raise awareness of AGI doom among the actual general public (again, not the small demographic of terminally online people), that will create significant political pressure on the USG, giving politicians an incentive to have platforms addressing the risks.
The only question is how to do that. I don't have a solid roadmap here. But it's not by writing viral LW/Xitter blog posts.
Some scattershot thoughts:
- Comedians seem like a useful vector.
- Newspapers and podcasts too. More stuff in the vein of Eliezer's Time article would be good. Podcast-wise, we want stuff with a broad audience of "normies". (So, probably not whatever podcasts you are listening to, median LW reader.)
- "Who will control the ASI if they can control it?" is another potentially productive question to pose. There's wide distrust in/dissatisfaction with all of {governments, corporations, billionaires, voting procedures}. Nobody wants them to have literal godlike power. Raising people's awareness regarding what the AGI labs are even saying they are doing, and what implications that'd have – without even bringing in misalignment concerns – might have the desired effect all on its own. (Some more on that [LW(p) · GW(p)].)
- This one is kinda tricky, though.
- @harfe [LW · GW]'s galaxy-brained idea here [LW(p) · GW(p)] about having someone run in the 2028 election on an AI Notkilleveryoneism platform. Not with the intent to win, but with the intent to raise awareness and force the other candidates to speak on the topic.
- I am not sure how sensible this is, and also 2028 might be too late. But it'd be big if workable.
Overall, I expect that there's a ton of low-hanging, high-impact fruit in this space, and that even more high-impact clever interventions are possible (in the vein of harfe's idea).
Extant Projects in This Space?
Some relevant ones I've heard about:
- My impression is that MIRI is on it, with their change of focus. I haven't seen much come of that besides the Time article plus Eliezer appearing on a few podcasts, though.
- I think Conjecture might be doing this stuff too, with their Compendium [LW · GW] et cetera? I think they've been talking about appeals to the (actual) general public as well. But I haven't been following them closely.
- AI Notkilleveryoneism Memes shows some examples of what not to do:
- Mostly speaking to a Twitter-user demographic.
- Using shrill, jargon-heavy (therefore exclusionary) terminology. Primarily, constantly calling AI models "shoggoths" with no explanation.
- Overall posture seems mostly optimized for creating an echo chamber of AI-terrified fanatics, not for maximally broad public outreach.
- PauseAI is a mixed bag. They get some things right, but they're also acting prematurely in ways that risk being massively net negative.
- Protests' purpose is to cause a signaling cascade, showing to people that there are tons of other people sharing their opinions and concerns. If done well, they cause a snowball effect, with subsequent protests being ever-bigger.[2]
- There's no chance of causing this yet: as I'd said, the general public's opinion on AI is mostly the null value. You need to raise awareness first, then aim for a cascade.
- As-is, this is mostly going to make people's first exposure to AI X-risk be "those crazy fringe protestors". See my initial summary regarding effective persuasion: that would be lethal, gravely sabotaging our subsequent persuasion efforts.
Framing
Technically, I think there might be some hope for appealing to researchers/academics/politicians/the terminally online, by reframing the AI Risk concerns in terms they would like more.
All the talk about "safety" and "pauses" has led to us being easy to misinterpret as unambitious, technology-concerned, risk-averse luddites. That's of course incorrect. I, at least, am 100% onboard with enslaving god, becoming immortal, merging with the machines, eating the galaxies, perverting the natural order to usher in an unprecedented age of prosperity, forcing the wheels of time into reverse to bring the dead back to life, and all that good stuff. I am pretty sure most of us are like this (if perhaps not in those exact terms).
The only reason I/we are not accelerationists is because the current direction of AI progress is not, in fact, on the track to lead us to that glorious future. It's instead on the track to get us all killed like losers.
So a more effective communication posture might be to emphasize this: frame the current AI paradigm as a low-status sucker's game, and suggest alternative avenues for grabbing power. Uploads [LW · GW], superbabies [LW · GW], adult intelligence enhancement [LW · GW], more transparent/Agent Foundations-y AI research, etc. Reframing "AI Safety" as being about high-fidelity AI Control might also be useful. (It's mostly about making AIs Do What You Mean, after all, and the best alignment work is almost always dual-use.)
If the current paradigm of AI capability advancement visibly stumbles in its acceleration[3], this type of messaging would become even more effective. The black-box DL paradigm would open itself to derision for being a bubble, an empty promise.
I mention this reluctantly, for comprehensiveness' sake. I think this is a high-variance approach: most attempts at it are going to land badly and will amount to nothing or have a negative effect. But it is a possible option.
Messaging aimed at the general public is nevertheless a much better, and more neglected, avenue.
- ^
Or maybe not even then, see the Law of Continued Failure [? · GW].
- ^
The toy model there is roughly:
- Protest 1 is made up of some number of people, n_1, who are willing to show their beliefs in public even with the support of zero other people.
- Protest 2 is joined by people who are willing to show their beliefs in public if they have the support of n_1 other people.
- ...
- Protest k is joined by people who are willing to show their beliefs in public if they have the support of n_1 + n_2 + ... + n_(k-1) other people.
(Source, Ctrl+F in the transcript for "second moving part is diverse threshold". A small simulation sketch of this model follows below.)
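For concreteness, here's a minimal sketch of that diverse-threshold cascade in Python. This is my own illustration of the toy model above, not code from the linked source, and the population sizes and threshold values are made up. The point it shows: if few people have low thresholds (i.e. awareness is low), the cascade stalls after the first protest, which is the concern about premature protests.

```python
import random

def simulate_cascade(thresholds):
    """Each person joins once the number of prior participants meets their threshold.
    Returns cumulative participation after each successive protest."""
    participants = 0
    remaining = list(thresholds)
    rounds = []
    while True:
        joiners = [t for t in remaining if t <= participants]
        if not joiners:
            break  # cascade stalls: nobody's threshold is met anymore
        remaining = [t for t in remaining if t > participants]
        participants += len(joiners)
        rounds.append(participants)
    return rounds

if __name__ == "__main__":
    random.seed(0)
    # A small core with threshold 0 (will protest with zero supporters),
    # plus a wider population whose thresholds are spread between 1 and 200.
    population = [0] * 5 + [random.randint(1, 200) for _ in range(500)]
    print(simulate_cascade(population))
```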
- ^
Which I do mostly expect. AGI does not seem just around the corner on my inside model of AI capabilities. The current roadmap seems to be "scale inference-time compute, build lots of RL environments, and hope that God will reward those acts of devotion by curing all LLM ailments and blessing them with generalization". Which might happen, DL is weird. But I think there's a lot of room for skepticism with that idea.
I think the position that The End Is Nigh is being deliberately oversold by powerful actors: the AGI Labs. It's in their corporate interests to signal hype to attract investment, regardless of how well research is actually progressing. So the mere fact that they're acting optimistic carries no information.
And those of us concerned about relevant X-risks are uniquely vulnerable to buying into that propaganda. Just with the extra step of transmuting the hype into despair. We're almost exactly the people this propaganda is optimized for, after all – and we're not immune to it.
4 comments
comment by Mitchell_Porter · 2025-02-22T00:09:58.615Z · LW(p) · GW(p)
frame the current AI paradigm as a low-status sucker's game, and suggest alternative avenues for grabbing power
You say you want to target normal people, but even in the 2020s, normal people are not transhumanists. The most politically effective message would be anti-AI in general, anti-transhumanism in general, and would portray the desire to play God as the core problem.
↑ comment by Thane Ruthenis · 2025-02-22T00:11:08.134Z · LW(p) · GW(p)
The section under "framing" isn't for targeting normal people, it's an alternate approach for targeting the academics/researchers/politicians/tech-enthusiasts.
comment by Davey Morse (davey-morse) · 2025-02-21T22:14:47.690Z · LW(p) · GW(p)
agreed that comedy/memes might be a strategic route for spreading ai x-risk awareness to general population. this kind of thinking inspired this silly alignment game https://dontsedateme.org.
some other silly/unlikely ideas:
- attempting to reframe religions/traditional god-concepts around the impending superintelligence. i'm sure SI, once here, will be considered a form of god to many.
- AGI ice bucket challenge.
- Ads with simple one-liners like, "We are building tech we won't be able to control."
comment by Ebenezer Dukakis (valley9) · 2025-02-22T06:30:54.421Z · LW(p) · GW(p)
It looks like the comedian whose clip you linked has a podcast:
https://www.joshjohnsoncomedy.com/podcasts
I don't see any guests in their podcast history, but maybe someone could invite him on a different podcast? His website lists appearances on other podcasts. I figure it's worth trying stuff like this for VoI.
I think people should put more emphasis on the rate of improvement in this technology. Analogous to the early days of COVID: it's not where we are that's worrisome; it's where we're headed.