Anti-memes: x-risk edition
post by WillPetillo · 2025-04-10
I first heard of the concept of an anti-meme while listening to an interview with Connor Leahy (relevant section is from 1:02:28-1:05:42). Leahy's definition of an anti-meme is:
"An idea that, by its very nature, resists being known."
While memes are defined by their ability to stick in one's mind and propagate, anti-memes are slippery, tending to fade from conscious thought. I do not know if Leahy picked up this concept from somewhere else; an Internet search mostly leads to examples following other definitions, such as images that subvert expectations. This post will be using Leahy's definition of anti-meme, as I have since found it to be exceptionally useful in explaining concepts in AI safety as well as in attuning my own mind to concepts I might otherwise miss. Some of the content in this post will be from Leahy's description, but most is based on my own experience.
What anti-memes are not
Arbitrary: ideas that people don't remember because they simply aren't important enough to bother with. Examples include random phrases, jokes without a punchline, and boring or useless facts. What makes anti-memes worth naming as a concept is their membership in a class of ideas that resist being known despite being worth knowing. In fact, the anti-memes I will be cataloging in this post all have a fair number of people putting considerable effort into spreading them, but this effort is a consequence of the ideas' perceived utility, not the appeal of the ideas themselves.
Controversial: ideas that people are predisposed to disbelieve. When saying something controversial, one is often met with clear, deliberate rejection. In contrast, when explaining an anti-meme, people often agree...but then continue to speak & act in a way that is inconsistent with the concept. When using an anti-meme in an argument, you will often find yourself being weirdly ignored or consistently misunderstood as saying something very different.
Complicated: anti-memes can be very simple—even boringly so. What makes them slippery is that they are difficult to internalize, especially on the emotional level. Anti-memes' lack of emotional connection also makes them difficult to remember...which has made this post rather difficult to write.
The common theme across these negative categories is that each offers an alternative explanation for an idea resisting being known. If an idea is arbitrary, controversial, or complicated, those are all good reasons to resist internalizing it, so there is no need to invoke any additional mechanism for its mental slipperiness. The applicability of one or more of these negative categories therefore does not automatically disqualify an idea from being an anti-meme, but one should first consider how much of the idea's resistance to being known is likely to be the result of these more mundane explanations.
Example anti-memes:
I've managed to catalog a few anti-memes by writing them down each time I come across an idea that seems to display their core characteristics. AI safety appears to be especially rife with anti-memes, as does self-help...but perhaps it only seems this way to me because these are the areas where I have been paying attention. Anyway, here are some examples, with explanations:
The world can end
This one is deceptive because it is adjacent to the highly memetic idea of impending end times, which has been continuously popular among religious groups for millennia, environmentalists for decades, and has also been a persistent theme in fiction. Almost all apocalyptic visions, however, include subtle deflections from the anti-meme.
Religious stories of the apocalypse often frame the human experience of the world as a chapter in the story of reality. The end of the world is thus not really the end of the world, but rather the end of the familiar world.
Environmental and AI safety advocates often seem to assume, at least on an emotional level, that the world will be OK in the end, if only as a near miss with existential catastrophe, saved at the last moment by a coalition of heroic scientists and activists.
Apocalyptic fiction focuses on the surviving remnants of the world after the Very Bad Thing has happened.
"Don't Look Up" is a rare counter-example, but even here the last scenes are of an off-world settlement and Jonah Hill's character posting images of the wreckage on social media. Both are clearly doomed, but images of life are nonetheless the last moments the audience sees, rather than, for example, the camera hovering over a dead world in silence for an uncomfortably long time until the credits roll.
There is no rule that we must have a chance of surviving
In 2022, Rob Miles posted There's No Rule That Says We'll Make It, arguing that people, including those working in AI safety, don't really take seriously the idea that humanity could fail to survive this century (or decade). He uses the thought experiment that if a very large asteroid had been heading towards Earth two hundred years ago, we would have had zero chance of stopping it—no matter how valiantly humanity unified and rose to the challenge—and that simply would have been the end of the human story.
Ironically, in 2024, Rob Miles posted AI Ruined My Year, in which he admitted that he still hadn't fully internalized extinction risk from AI as a real thing that could actually happen in his lifetime.
These discussions match my own experience pretty well, both as an individual and as a volunteer organizer at PauseAI who regularly welcomes new members and connects them with projects. Navigating the cognitive dissonance between intellectual acceptance of x-risk and the subconscious avoidance of mortality is a central part of many of these conversations. Some people internalize this idea and are emotionally overwhelmed by it, but these are the exception and the process of internalization takes a lot of effort.
AI consciousness is irrelevant to AI safety
"It's about competence, not consciousness" has been an obligatory cliche in AI safety communication since the beginning. The fact that this disclaimer is still necessary is telling. And I have noticed from personal experience that almost every conversation about why consciousness is not necessary will inevitably turn into a conversation about the nature of consciousness and whether AI might have it.
There is no single "best" way to communicate x-risk (or anything) persuasively
People differ in their background knowledge and prior intuitions, so the types of arguments they are likely to find persuasive vary wildly.
One person might dismiss x-risk as "too sci-fi" while being very receptive to the idea of AI disrupting society by taking peoples' jobs, whereas another person might be deeply concerned about x-risk while seeing job loss as a temporary bump in the road to progress.
One person might be receptive to simple, attention-grabbing statements of expert opinion but get confused and lose interest if you try to explain the finer points of instrumental convergence, whereas another person might be skeptical of any claim that asks them to take anything on faith or handwaves over details.
One person might agree with your message and need to be persuaded that it is worthwhile to take action, whereas another may need to be depolarized from actively pushing opposing narratives and policy.
Asking for drastic policy measures at a negotiating table can stall out the discussion and miss out on an opportunity to make tangible progress, but in a different context those same asks can shift the range of what is possible to talk about and thus make moderate proposals more likely to succeed.
These are just a few examples; I go over quite a few more in this video on communication strategy. Any strategy that works well for some people will fail to connect with others. And if you don't know the person you are speaking to well enough to make an informed decision about how to connect...that's a problem; take some time to listen and understand them.
The above point is obvious to the point of sounding banal, but I see it come up on a regular basis as AI safety communication strategists continue to engage in endless internal debate regarding the "best" way to communicate with the public. To be fair, there are contexts where it is important to have a unified message, which decreases the extent to which one can flexibly adapt the message to the audience. When one must use a "one-size-fits-all" strategy, finding the size that fits most makes sense. But this is a special case, not a default consideration.
The world is capable of having multiple problems
An example of this anti-meme is the frequent insistence from the "AI ethics" community that x-risk is a "distraction" from present-day harms. This claim makes zero sense as a rational argument and is also thoroughly unsupported by empirical evidence. As Max Tegmark has observed, arguing that extinction is a distraction from current harms is like arguing that global warming is a distraction from recent environmental disasters.
When faced with these obvious and central objections or pressed on why advocacy for "near" and "long" term risks is zero sum, proponents of the "x-risk as distraction" claim consistently change the subject--or simply don't reply.
Coordination does not involve self-sacrifice
Possibly the most dangerous narrative in AI is that it is an arms race. Most AI companies were founded out of concern for what would happen if someone else built AI first: DeepMind was that concern for OpenAI and Elon Musk, and OpenAI was that concern for Anthropic and Safe Superintelligence. And on the national level, US interest in accelerating AI development is fueled by concerns of competition with China. This concern was raised especially pointedly in Situational Awareness and now seems to be the guiding philosophy of the US government since Trump's re-election.
Think about how crazy this is: the idea of international coordination is shot down on the grounds that the US slowing down will allow China to race ahead. That's not what an agreement is! If you and I decide to bind ourselves to a contract, I don't simply adhere to the terms on my own and hope for the best, we agree in advance on the details of the terms, figure out an effective means of monitoring and enforcing those terms, visibly agree to bind ourselves, and only after all of this has happened will I change my behavior. This is common knowledge for anyone who has ever entered into a contract--or has collaborated with anyone on anything--and yet the United States Government not only pretends not to get it, but expects to get away with this nonsense because a sufficient proportion of the general public doesn't get it either.
Coordination is a winning strategy that frequently enables better outcomes than any individual can achieve on their own through competition. It is the basis for stable relations between states and organizations, for the effective functioning of institutions, for healthy interpersonal relationships, and even for a well-balanced mind. Self-sacrifice, by contrast, assumes an adversarial relationship and accepts a losing role in exchange for avoiding destructive conflict. They are not the same.
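To make the structural point concrete, here is a minimal sketch of the underlying game in Python. The payoff numbers, the penalty parameter, and the best_response helper are illustrative assumptions of mine, not anything from the sources above; the only point is that without monitoring and enforcement, racing dominates (so unilateral restraint really would be a losing move), whereas a verified agreement with credible penalties makes mutual slowdown the stable choice.

```python
# Illustrative only: hypothetical payoffs for a two-party "race vs. slow down" game.
# Higher numbers are better; the specific values are made up, the structure is the point.

PAYOFFS = {  # (my_action, their_action) -> my payoff
    ("slow", "slow"): 3,   # coordinated slowdown: safety plus shared progress
    ("slow", "race"): 0,   # unilateral restraint: the thing nobody is actually proposing
    ("race", "slow"): 4,   # defecting against a restrained rival
    ("race", "race"): 1,   # mutual racing: the current default
}

def best_response(their_action, penalty_for_verified_defection=0):
    """Return my best action given the other party's action.

    penalty_for_verified_defection models monitoring plus enforcement:
    a verified breach of the agreement costs the defector this much.
    """
    def my_payoff(my_action):
        payoff = PAYOFFS[(my_action, their_action)]
        if my_action == "race":
            payoff -= penalty_for_verified_defection
        return payoff
    return max(["slow", "race"], key=my_payoff)

# Without monitoring or enforcement, racing dominates -- which is exactly why
# no one advocates a unilateral slowdown.
assert best_response("slow") == "race"
assert best_response("race") == "race"

# With credible verification and enforcement (here, a penalty of 3),
# mutual slowdown becomes a stable agreement rather than self-sacrifice.
assert best_response("slow", penalty_for_verified_defection=3) == "slow"
print("An enforced agreement changes the game; unilateral restraint never enters it.")
```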
Say it out loud so it sticks:
- No one is advocating for a unilateral slowdown!
- Unilateral contracts are not a thing!
Anyone who tells you that agreeing to end a race means losing the race (without having comprehensively explored all possible monitoring and enforcement options) is not thinking about what they are saying.
The fate of the world depends on people grokking this concept. Be a part of the solution.
How to find an anti-meme
Anti-memes are not widely understood as a concept, so there isn't much in the way of established responses, but here are a few ideas that make sense to me:
Notice when an idea has anti-meme characteristics. When it is brought up in argument, is the response a direct objection, a dismissal for an identifiable reason, or is the idea weirdly ignored?
Intentionally reinforce it over and over (like a mantra)
Actively search for ways to apply the concept and tie it to more familiar ideas.
Connor Leahy's advice: be circuitous, use stories, and "update on the vibe."
All of the above is for noticing when an idea you have encountered is anti-memetic and then absorbing it anyway. All such anti-memes were, at some point, generated by someone else. I don't have any advice for being that "someone else" who can originate an anti-meme.
Or if I have heard of such a technique, I don't remember it.