A Disciplined Way to Avoid Wireheading
post by amitlevy49 · 2025-05-07T15:20:27.893Z · LW · GW · 6 comments
This is a link post for https://ivy0.substack.com/p/a-disciplined-way-to-avoid-wireheading
Contents
What is wireheading?
On the possibility of just not inserting a wire into your brain
How to avoid edging toward the right side of the scale
6 comments
This is a crosspost of a post from my blog, Metal Ivy. The original is here: A Disciplined Way To Avoid Wireheading.
What is wireheading?
The term wireheading comes from the observation that if you put a wire in someone’s brain such that it is near their brain’s pleasure (or craving) center, and give them a lever that when pressed activates the current through the wire, they will repeat that action forever, to the point of dying from starvation. Conditional on them trying it once, of course. These experiments have been performed on mice, dolphins, monkeys, and, in the distant unethical annals of the 1970s, humans.
Personally, though I am not alone in this view, I see wireheading not as an independent phenomenon but as the edge of a spectrum. These neurons were not created by evolution for some other task, which wireheading then tricks into producing a craving to pull a lever. They evolved precisely to give you a craving to perform an action to get dopamine. Evolution just didn’t want you to do it quite so directly (if you oppose the framing of “goals” for evolution, take it as a rhetorical device, or call it Moloch[1]).
Evolution has many pathways for getting you to do what it wants. The craving mechanism is one of them, but you also have a more stable sense of utility. That sense tells you that, actually, being addicted to pulling a lever to directly stimulate your brain is “bad”. You want to get pleasure in a less direct, more natural way. You don’t want to trick your own brain. The craving and the perceived, externalized utility conflict, especially if you were to reflect on the subject before the first pull of the lever. This article takes it as an axiom that this sense of utility is important, since we find it to be important by its very definition. We do not want it to conflict with our actions, which means we don’t want to stimulate our dopamine-secreting neurons so directly.
On the possibility of just not inserting a wire into your brain
Directness is a spectrum, trickiness is a sliding scale, and whether something is natural is not well defined. Put wireheading at the right extreme of that spectrum. Fentanyl is not as direct as wireheading, but it’s not all that different either. TikTok is further to the left, in this sense, but it can be placed on the same spectrum. The typical activities that you find rewarding in both senses belong on the left extreme.
The only sense in which there is a phase change on this spectrum — a fundamental, qualitative difference — is the point at which the craving overpowers your self control, which I’ll call addiction (regardless of the formal definition of addiction, I think most can agree that repeatedly pressing a lever is the quintessential example of it). But I think even that is not binary. The mice in the wireheading experiments died from starvation, not thirst. They could have been more addicted than they were. The human subject in Moan 1972 (unethical experiment trigger warning) didn’t starve, he just “vigorously” protested disconnection (unfortunately what that means was not expanded upon in the paper).
I’ll argue that there isn’t actually any point at which you should prefer going toward the right of this spectrum. The hedonic treadmill means that whenever you are exposed to a new, more direct method of stimulating yourself with more dopamine, your happiness level will rapidly re-adjust such that you will be just as happy (or unhappy) as before.
As such, even if you are a hedonist, at present it’s not clear that wireheading would actually make you happier long term. It will just get you very addicted, and very depressed once you stop (and unfortunately, the hedonic treadmill is slower to adjust in the other direction). Exposing your brain to a new, never seen before level of septal stimulation is therefore only good for you, long term, if it corresponds to a new, never seen before level of actual utility to you, i.e. if it is not a move rightward on the spectrum described above. So instead of the harm appearing at some arbitrary point, in the fentanyl range or at the wire-in-brain stage, it is “whatever shows your brain a new, never seen before peak of dopamine without a corresponding, never seen before peak of utility”.
How to avoid edging toward the right side of the scale
This situation, where moving to the right could be as easy as popping a pill while moving to the left (breaking an addiction) is one of the most challenging things a human can undertake, is systemically troubling. If you’re a transhumanist and think there’s a chance you will live ~forever, the situation is even worse. You can imagine wireheading as something like a black hole: once you’re inside, you aren’t getting out, and it’s very easy to get pulled closer.
More than this, technology improves. The most addictive drugs today are more addictive than the most addictive drugs of the past, and we’ve found ways to get people addicted to things without direct access to the insides of their body, through a combination of visual and auditory channels (TikTok, its predecessors and its future successors). This trend seems likely to continue.
Still, maybe we shouldn’t be too fatalistic. Most of us don’t go around in constant fear of getting addicted to a new drug. Even when you eat something you’ve never eaten before, you are not concerned that it is going to be uniquely addictive. And technologies occasionally get invented to help push you away from the black hole, or slow your descent: Ozempic’s craving-suppressing properties seem promising. But my empirical observation is that, looking at the trend line, we are moving to the right as technology improves. As he describes in his personal diary, Marcus Aurelius was worried about the addictive potential of poetry two millennia ago, and we are now worried about the addictive potential of TikTok, but they are not the same. Those who fall to TikTok today spend more time on it, and feel stronger cravings for it, than Romans felt for poetry.
I think the only way you could be confident that you don’t accidentally move to the right (that the new technology you’re trying isn’t going to shock you with its level of stimulation and get you addicted, that it isn’t the “next TikTok”) is to not try new technologies as they come out. Wait until you’ve seen evidence they aren’t addictive, or verify that you understand the technology well enough to be confident in advance that it is not addictive.
Otherwise I don’t see how you would have avoided the original Coca-Cola (which famously contained cocaine) in 1886, or TikTok in 2016 (you may have tried both of these things and not gotten addicted, but you couldn’t have been confident of that if you tried them as they came out). And so I unfortunately find myself pointing in a technology-fearing direction in this respect, even as the techno-optimist that I am. But this isn’t the only respect in which technology needs to be evaluated, and intuitively you don’t actually want to avoid all new technologies.
The fact that technology usually has real benefits means that waiting to try a new technology until it is confirmed not addictive (or worth it regardless) is not free. But I think this consideration should be taken into account, because the only way to be sure you don’t move toward the wireheading black hole is to decide on a date (say, today’s date) and to avoid trying new technologies created after that date, unless they are confirmed not addictive.
I personally chose a cutoff date for just new entertainment technologies instead. They have the most potential to be out-of-distribution addictive, and the least benefit from technological advancement, once you take the hedonic treadmill into account. This is at least a stopgap measure, until we have some technology capable of painlessly stopping a severe addiction, or the depths of audio/visual mediums are fully explored and we are confident that they could never cause something approaching true wireheading.
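For readers who think better in code, the cutoff rule above can be written down as a tiny decision function. This is only a sketch of the rule as stated in this post: the `Technology` record, its field names, the `ok_to_try` function, and the example entries are all hypothetical, and "confirmed not addictive" is treated as a plain flag even though deciding it is the hard part.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Technology:
    """Hypothetical record describing a technology you might try."""
    name: str
    created: date                      # when the technology first appeared
    is_entertainment: bool             # whether it is an entertainment technology
    confirmed_not_addictive: bool = False


def ok_to_try(tech: Technology, cutoff: date, entertainment_only: bool = True) -> bool:
    """Apply the cutoff-date rule.

    With entertainment_only=False this is the strict rule (every new
    technology is held back); with entertainment_only=True it is the
    narrower version restricted to entertainment technologies.
    """
    if tech.created <= cutoff:
        return True   # it existed at the cutoff, so it is not "new"
    if tech.confirmed_not_addictive:
        return True   # new, but evidence says it is safe
    if entertainment_only and not tech.is_entertainment:
        return True   # new, but outside the restricted category
    return False      # new, unverified, and in scope: wait


# Illustrative cutoff and made-up entries, not real products.
cutoff = date(2025, 5, 7)
old_game = Technology("old video game", date(2004, 11, 23), is_entertainment=True)
new_feed = Technology("hypothetical new video feed", date(2027, 1, 1), is_entertainment=True)
new_tool = Technology("hypothetical new work tool", date(2027, 1, 1), is_entertainment=False)

print(ok_to_try(old_game, cutoff))   # True
print(ok_to_try(new_feed, cutoff))   # False
print(ok_to_try(new_tool, cutoff))   # True
print(ok_to_try(new_tool, cutoff, entertainment_only=False))  # False
```

The last two calls show the cost being traded off: the strict rule also blocks the useful non-entertainment tool, while the entertainment-only version lets it through.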
[1] Or call it POSIWID, which in my opinion, in the original academic sense, is isomorphic to the Moloch concept, though Scott himself disagrees.
6 comments
comment by localdeity · 2025-05-07T22:16:20.848Z · LW(p) · GW(p)
I thought you were going to conclude by saying that, since it’s unviable to assume you’ll never get exposed to anything new that’s farther to the right of this spectrum, it’s important to develop skills of bouncing off such things, unaddicting yourself, or otherwise dealing with it.
To that end: I think it helps to perceive the creators of a thing as being malicious manipulators trying to exploit you, and to think of certain things as being Skinner boxes or other known exploits. Why does this game or app do this thing this way? If they wanted me to get maximum value out of it and waste minimal time, they would do it another way. Therefore they’re trying to screw with me. I’m not gonna put up with that.
By the way, I do in fact avoid trying out things like skiing, “just to see what it’s like”, partly because I do not want to discover that I really like it, and then spend all kinds of money and inconvenience and risk on it. (A friend of mine has gotten like three concussions skiing, the cumulative effects of which have serious neurological consequences that are disrupting his daily life, and my impression is that he still wants to ski more. (It’s not his profession—he’s a programmer.)) Likewise I’m not interested in “trying out” foods like ice cream that I’m confident I don’t want to incorporate into my regular diet; if it’s a social event then I’ll relax this attitude, but if such events start happening too frequently in a short period then I resume frowning at foods I think are too, erm, high in the calories:nutrition and especially sugar:nutrition ratio.
Replies from: amitlevy49
↑ comment by amitlevy49 · 2025-05-13T17:17:48.903Z · LW(p) · GW(p)
I thought you were going to conclude by saying that, since it’s unviable to assume you’ll never get exposed to anything new that’s farther to the right of this spectrum, it’s important to develop skills of bouncing off such things, unaddicting yourself, or otherwise dealing with it.
I think this would be nice, but it assumes that it's possible to develop a skill of bouncing off addictions. I think it's both fairly genetically predetermined, and hard on average at the extremes (heroin or potentially future TikTok) even for people in the top 10% of ability in this regard.
By the way, I do in fact avoid trying out things like skiing, “just to see what it’s like”, partly because I do not want to discover that I really like it, and then spend all kinds of money and inconvenience and risk on it. (A friend of mine has gotten like three concussions skiing, the cumulative effects of which have serious neurological consequences that are disrupting his daily life, and my impression is that he still wants to ski more. (It’s not his profession—he’s a programmer.)) Likewise I’m not interested in “trying out” foods like ice cream that I’m confident I don’t want to incorporate into my regular diet; if it’s a social event then I’ll relax this attitude, but if such events start happening too frequently in a short period then I resume frowning at foods I think are too, erm, high in the calories:nutrition and especially sugar:nutrition ratio.
What I'm saying is basically what you're saying, just with an additional automatic ban on new types of entertainment. The only reason you can evaluate the harm of skiing or ice cream is that they've been around for a while and you know how they work. When <future TikTok> comes out, you can't know immediately that it isn't going to have an algorithm that surpasses your ability to "bounce off addictions", so you should wait.
comment by MalcolmMcLeod · 2025-05-07T21:24:13.916Z · LW(p) · GW(p)
One of my favorite genres of LW posts is "puts a name and a framework to my deeply felt intuition." This is a sterling example. Have you read Infinite Jest? It's substantially about how tech & America grease our rightward slide on this continuum.
Replies from: amitlevy49
↑ comment by amitlevy49 · 2025-05-13T17:19:47.504Z · LW(p) · GW(p)
Haven't yet, added it to my reading list, thanks!
comment by JustisMills · 2025-05-08T02:14:45.569Z · LW(p) · GW(p)
I actually think most of my "to the right" stuff is synthetic! For example, when I've posted on LessWrong, (too) frequently checking to see how the post is received is self-reinforcing, with a randomly spaced reward. Nobody did this to me per se; mostly my brain created the loop itself from the available landscape.
On the other hand, deliberately addictive stuff is (in my experience) usually self-limiting. Like, I enjoy Fall Guys because it scratches the large-group competition itch but isn't violent or scary. It also has all the dark patterns to keep you engaged. I voluntarily fell for those a bit, but never became a durable Fall Guys addict. It just sort of got old. I expect TikTok would be the same. Even WoW, that glorious ambrosia, wears out its welcome within a year or two (much faster now that I'm married, courtesy of my wife).
So I don't think avoiding new stuff would work for me as a heuristic here; there'd be false positives and false negatives. Perhaps I'm unusual!
Replies from: myron-hedderson
↑ comment by Myron Hedderson (myron-hedderson) · 2025-05-08T16:08:41.797Z · LW(p) · GW(p)
I expect the fact that addictive things wear out their welcome faster thanks to your wife generalizes.
Maybe one defense against new addictive things is to have people with whom you are in close relationships, who are not trying the same things at the same time, and can remind you it would be bad to sink a bunch of time or effort into an addictive thing, because (among other things) it would hurt your important relationships.
If you are relying on your own self-control to deal with something designed to overcome or compete with your self-control, one misstep could be doom. But with someone else whose judgment you rely on, both minds would have to be compromised at the same time.
Doesn't work for instantly-effective perfect addictors, but for anything you can recover from, a supportive group of people will plausibly help. Maybe as our tech advances, "our friends monitor us for new addictions and help us recover" should become a low-shame social norm. Along with "friends avoid trying the same new thing at the same time".