I am universally trying to reject the Mind Projection Fallacy: consequences

post by YanLyutnev (YanLutnev) · 2024-08-30T17:42:30.733Z · LW · GW · 0 comments

Jan 2025 - translation updated

My essay continues the reasoning in Yudkowsky's article "Mind Projection Fallacy" and tries to expand it by providing more gears for how the fallacy works. When I finally understood the mechanism it describes (not on the first read), it changed my thinking a great deal. Here I want to describe my view of that mechanism: how the mind projection fallacy affects people's motivations by distorting their map.

For a while I analyzed jumps in my motivation while reading texts and having conversations, including while reading Yudkowsky's articles, and I noted the fragments of phrases during which a jump occurred. I discovered a pattern that repeated too often to ignore: my motivation jumped when I read words like "useful", "wrong", "right", "good", "bad", "should", "important", and other words that from the inside seemed to me to be one-place functions rather than two-place functions.

Here's how I understood the "mind projection fallacy" from Yudkowsky:

What is the mind projection fallacy? It's a cognitive distortion that is found everywhere around us. "This thing is good, you're a kitty, you're a bad person, this movie is disgusting, you're a monster, the woman is sexy, this is harmful, this is useful, this is important, this is not important, this is right": each of these phrases may or may not be an instance of the mind projection fallacy. How can you tell whether a phrase fits the pattern behind the label?

The mind projection fallacy is the perception of your sensations towards an object not as your sensations towards the object, but as a property of the object, coupled with it and independent of the observer.

You look at a kitty and it seems cute, beautiful, pleasant, and so on to you. You experience certain sensations that are verbalized with the word "pleasant". From the inside, it feels like the kitty has some property of being "beautiful" and "pleasant", similar to how the sky has the property of being "blue", and the surface of a stone has the "property" of being smooth.

Let me define the word "property" here - these are stable patterns in manifestations that we notice in an object.

You see the sky as blue, so it has some property of being "blue" directly "coupled" with the sky, and it seems that you can somehow verify this. The smoothness of a stone can also be verified in some ways. And thus a habit is formed that if you see some pattern in a piece of reality, and this pattern is confirmed by other people seeing it exactly the same way - you develop a habit of generalizing this pattern further than where you noticed it. For example, if you saw that grass is green in several areas of the forest, you will generalize the expectation that it will be approximately the same color in unvisited areas of the forest and won't be wrong.

The mind projection fallacy is precisely the result of this habit.

The formulation can be difficult to understand at first, so I'll provide an example that you can always refer to in order to grasp the intuition.

I took this example from Yudkowsky's article "Mind Projection Fallacy". Yudkowsky took the term from the physicist E.T. Jaynes, who originally used it for errors in how probability is interpreted.

In the early days of science fiction, alien invaders might occasionally kidnap a girl in a torn dress and drag her away with the intention of rape, which was depicted on many old magazine covers. It's somewhat strange that aliens never hunted for men in torn shirts.

People may assume that all minds are structured alike. You have no access to how an alien experiences the world from the inside and don't know how to model it, so there is a temptation to take the easy path and model the alien on your own perception: I feel this way, so others probably feel the same way.

From the inside, it may seem that sexuality is an innate, direct attribute of the object "woman", rather than a word the observer uses to name their own sensations while looking at a woman. A woman is attractive, so the alien will see the attribute "sexuality" and feel attracted to her. Logical, right?

Imagine that capybaras suddenly became intelligent, learned human language, and began saying that female capybaras are universally sexy and possess a property of sexuality that any other intelligent species should notice. Humans, whose brains were not shaped to feel the sensations called sexuality while looking at female capybaras, would begin to argue with the capybaras, saying that this property of sexuality doesn't exist and that the capybaras are projecting their own sensation onto the object "capybara" as a property.

Even a child could see this error if a capybara started pushing this story to them.

But when people talk among themselves and say "this building is beautiful", the analogy with capybaras somehow becomes less obvious. With capybaras, you don't see the property of capybara sexuality from any angle. But when a person points out that a building is "beautiful", you can, with effort, imagine how this supposedly existing property of "beauty" feels from the inside. And since the desire to argue depends on whether, and how strongly, you feel the supposed property from the inside, if you do feel it you might even accept "the building is beautiful" as a true statement.

Here is how false properties (false patterns that generate false predictions) are born. Let's say your friend guessed what you were doing yesterday evening from a couple of phrases. Since you believe one needs to be a "genius" to guess so accurately, you hang the property "genius" on your friend.

From the inside, this is perceived as: "My friend has some property of power and coolness. They know something smart that I'm too lazy to comprehend, so I'll bow before the unknown and settle into a long-term feeling of awe towards my friend and their property of genius, which I generalized from a single case. To save cognitive resources, I'll attach all my intuitions about the word 'genius' to my expectations of them." Now the friend has the false property of "genius", which generates a false prediction that this property will manifest in a similar way in the future, for example that the friend will again amazingly guess what you were doing. In fact, you forgot to turn off the microphone and the friend overheard; that is how they knew. They don't have the property of "genius", but you can hang it on them anyway, and then be disappointed and confused when the friend makes a stupid mistake. How is that possible, if they're a "genius"?

In the same way, you can hang the false property of "good", "harmful" or "disgusting". And forget that you were naming your sensations with this word. After all, if a friend possesses the property of "genius", then what difference does it make what you feel, this property is felt from the inside as objective, it's a pattern.

When people talk, they usually choose words that reliably trigger the intended associations and feelings in the listener; they don't normally use a language unfamiliar to you or words you clearly won't understand. As a result, the mind projection fallacy flourishes everywhere.

Suppose your beloved wife or husband states that "the building is beautiful", and you look at the building and experience feelings that you yourself might call "beautiful". Why spend additional cognitive resources on adding a level of indirectness (an indication of how many perceptions the information has passed through), that is, "I call my sensations while looking at this building 'beautiful'"? People don't usually talk like that, and you don't automatically add a level of indirectness to words that fly by so quickly and call an object or strategy good. There is a temptation to simply activate the sensations familiar from the word "good": calmness, loyalty, a sense of value, and stress at the prospect of loss.

If an alien crystal declares that this lilac crystal is sexy, but you don't feel any sexuality towards this crystal, then why would you go down the branch where you'll try to activate the same sensations of sexuality familiar to you towards this crystal? The crystal doesn't have the property of "sexuality", therefore you won't try to mirror these feelings at all, to avoid arguing with the alien. Or will you? Do people often solve the task of training their brain in such a way that crystals excite them? Seems not.

But if another person similar to you declares that a girl is attractive, your brain is already structured so that the word "attractive" names certain sensations you already have towards some girls. You try to activate these sensations, and you succeed. And to avoid arguing with this person, it's easier for you to agree with the delusion that the girl has the property of attractiveness, coupled only with her (a 1-place word), and not with her plus your perception (a 2-place word).
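The difference between a 1-place and a 2-place word can be sketched in code. This is only an illustration under my own assumptions: the observers, the entities, the `REACTIONS` table, and the function names `attractive` and `project` are all hypothetical and made up for the example. The point it tries to show is that the 1-place version is just the 2-place version with the observer silently fixed.

```python
from typing import Callable

# Hypothetical table of who reacts to what. All entries are illustrative
# assumptions, not data about real humans or capybaras.
REACTIONS = {
    ("human", "woman"): True,
    ("human", "female capybara"): False,
    ("capybara", "female capybara"): True,
    ("capybara", "woman"): False,
}


def attractive(observer: str, entity: str) -> bool:
    """2-place word: the answer depends on the observer AND the entity."""
    return REACTIONS.get((observer, entity), False)


def project(observer: str) -> Callable[[str], bool]:
    """Fix the observer and hide it, producing what looks like a 1-place word."""
    def attractive_1place(entity: str) -> bool:
        return attractive(observer, entity)
    return attractive_1place


# From the inside, each observer only ever sees their own 1-place version.
human_view = project("human")
capybara_view = project("capybara")

print(human_view("woman"))               # True
print(capybara_view("woman"))            # False
print(capybara_view("female capybara"))  # True
```

When the human and the capybara argue about whether a female capybara "is attractive", they are comparing the outputs of two different 1-place functions. Writing the observer back in as an explicit argument is what I call adding a level of indirectness.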

And if people chatter a bunch of words per minute with a projection error in each sentence, then to avoid the cognitive overload of adding a level of indirectness 20 times per minute, you just relax and start automatically decoding words, at the level of sensations, as sensations coming from the object itself.

The mind projection fallacy has not been sitting in people for centuries for nothing, and it isn't going to leave easily. If you try to reject it completely, your speech will become similar to mine (I talk the same way I write in this article), which people usually verbalize as "robot", "strange", "cringe" and "you lack emotions". Moreover, the mind projection fallacy is almost the main source of the emotions we get from human speech. Emotions are often generated by world models and expectations, so the word "cool" will feel different at the emotional level with and without the mind projection fallacy.

With the mind projection fallacy, if a friend compliments you with "you're so cool", you might experience a complex of pleasant emotions, because this exceeds your expectations in a pleasant direction. The friend seems to testify that you have some property of being "cool", independent of the friend's perception, similar to how a girl has an "objective" property of being "sexy", only this time the girl is you. Suppose the word "cool" names a set of feelings, for example admiration and exceeded expectations. If you treat this supposed property as permanent and hardwired into you, then other people should notice it too, just as they notice the "beauty of the building" or the "sexuality of the girl". But in your normal life you haven't found much confirmation that you have this property: you walk down the street and people don't admire you. And if the property of "coolness" were present in you, they definitely would.

And then they tell you that you're "actually cool". You thought you didn't have the property of being "cool", but here is a person who sees it. This feels like evidence for the hypothesis "I'm cool". You fall for the mind projection fallacy and deceive yourself, and at the level of sensations you don't even need to engage any analytical component to do so. As a reward for the self-deception, you get the pleasant sensation of justified expectations.

And since the carousel of these sensations brings you diverse and pleasant sensory experience, then why would you reject it? Other people haven't been rejecting it for centuries.

But what's the alternative? What will happen if you try to completely reject the mind projection fallacy? I've been rejecting it for my internal judgments for six months already, but sometimes I use it for the purpose of quickly activating pleasant sensations in the interlocutor's head, for example by stating that "they're a kitty". Here I exploit the mind projection fallacy for my purposes, for example as reinforcement, so that the person is more likely to do a useful service for me again or subscribe to the channel for a new portion of oxytocin, which I predict in some people after "you're a kitty".

Rejecting the mind projection fallacy should in theory completely change your view of the world. What you considered "objectively good, right or important" will be replaced with "I experience certain sensations towards this".

And what you considered obviously "disgusting and bad" will also be reconsidered as "I experience unpleasant feelings towards this something for some reasons". After rejecting the mind projection fallacy, you will no longer be able to deceive yourself that a woman has an objective property of "sexuality" that everyone around will notice.

Rejecting the projection fallacy, if you had it (and I expect that it exists in everyone who hasn't purposefully tried to fight it), will inevitably destroy expectations built on standard thinking habits, with the corresponding side emotions. If you know that even a small violation of expectations turns into an emotional drama for you, then rejecting the projection fallacy will lead to maximum drama and many secondary unpleasant emotions from no longer being able to comfortably experience the familiar emotion towards a given cluster of things. But the brain will adapt if you do this often enough.

If you want to experiment with rejecting the projection fallacy, my method is the technique of "adding a level of indirectness" (an indication of the perceiver). Every time you notice in your thoughts or speech a statement like "this thing is good" or "this thing is bad", with no indication or intuition in the wording that this is only your perception, add to the phrase "I verbalize my sensations towards this thing as". Because that's really what you're doing: you have certain sensations towards this thing and you name them, and there's no deception in that. But your perception of the thing can suddenly change. I tested this on several friends, and they reported that their perception changes when they add a level of indirectness.

If you used to want to argue with people who called your favorite movie "trash", just apply the level of indirectness: "this person verbalizes their sensations towards the movie as 'trash'". Usually after this, the branches of argument aimed at proving that the movie has some property of being "good" that the person fails to see simply stop, and you lose interest in them, because those arguments rested on a wrong belief: the mind projection fallacy.

And finally: people usually get hooked on familiar emotions towards things. If these emotions first weaken and then leave after you reject the mind projection fallacy, this can be verbalized as "life loses meaning, magic and pleasantness". At the transition stage, I fell into something similar myself and wondered why my happiness level had dropped sharply.

But no magic leaves the world — because it wasn't there initially. Magic is the verbalization of your own sensations, it's the familiar reactions that leave, because they were based on the false belief that the object has some property.

The brain apparatus allows gluing any sensations to any object, and you can bring these sensations back simply by training them. The reasons why you called the building beautiful and kittens cute are still here; these mechanisms work and are part of the laws of physics. If your sensations were coupled with reality, they will remain. Sensations that can be destroyed by truth will be under attack. But I remind you: human brains are capable of experiencing various sensations towards the non-existent. Everything can be returned if you want, except for those emotions that were coupled with delusion; they can only be returned if you force yourself to believe in the delusion again.

Now I have a very high level of happiness, even with rejecting this fallacy, you can perceive this as evidence. But in the first 3 months of adaptation it was painful. In the text of this video I tried to minimize the mind projection fallacy. Perhaps you experienced some emotions despite the fact that I actively tried to remove it, which is evidence towards the fact that the absence of the projection fallacy doesn't destroy your emotions.
