The hostile telepaths problem

post by Valentine · 2024-10-27T15:26:53.610Z · LW · GW · 74 comments

Contents

  Newcomblike self-deception
    Sketch of a real-world version
    Possible examples in real life
  Other solutions to the problem
    Having power
    Occlumency
    Solution space is maybe vast
  Ending the need for self-deception
    Welcome self-deception
    Look away when directed to
    Hypothesize without checking
    Does this solve self-deception?
  Summary
74 comments

Epistemic status: model-building based on observation, with a few successful unusual predictions. Anecdotal evidence has so far been consistent with the model. This puts it at risk of seeming more compelling than the evidence justifies just yet. Caveat emptor.


Imagine you're a very young child. Around, say, three years old.

You've just done something that really upsets your mother. Maybe you were playing and knocked her glasses off the table and they broke.

Of course you find her reaction uncomfortable. Maybe scary. You're too young to have detailed metacognitive thoughts, but if you could reflect on why you're scared, you wouldn't be confused: you're scared of how she'll react.

She tells you to say you're sorry.

You utter the magic words, hoping that will placate her.

And she narrows her eyes in suspicion.

"You sure don't look sorry. Say it and mean it."

Now you have a serious problem. You don't have an internal "actually mean it" button. And yet here's Mom peering into your soul and demanding that you both have that button and press it. Trying to appease her didn't work. She needs you to be different — and she's checking.

What can you do now?

This is a template for what I've come to call "the hostile telepaths problem". I think it's a common feature of social problems. The hostile telepaths problem is when you're dealing with a being (a) who can kind of read your internal experiences and (b) who you don't trust not to make your situation worse because of what they find in you.

There are lots of solutions to the hostile telepaths problem. I don't claim to know all of them. But recognizing some common ones has helped clarify a lot of my thinking — particularly around self-deception and akrasia.

And getting very clear on the nature of the problem makes identifying real solutions way easier. This fact produces some previously-surprising-to-me predictions, especially for trauma processing and for making emotionally difficult decisions.

I'll try to spell out what I mean with some theory and a few examples.

 

Newcomblike self-deception

There's one really tricky solution to the hostile telepaths problem. It deserves some special front-loaded attention before I name some other solutions.

Here I'll try to spell out its logic with a modification of Newcomb's Problem.

Suppose that instead of Omega, you're dealing with Omega-V. Omega-V is an extremely good but imperfect predictor of someone's box choice.

Through some clever sleuthing, you deduce why. Omega-V is scanning only folks' visual thinking for a little while prior to their box choice. It's extremely good at tracking the difference between the mental visualizations of one-boxers vs. two-boxers… provided they don't know Omega-V is scanning specifically their visual thinking.

So now you have an edge. Maybe you can trick Omega-V. Maybe you can visually think like a one-boxer and use other modes of reasoning (like inner talk) to plan on actually two-boxing. You just have to be very careful to have any "leaks" of reasoning into visualization look totally normal for a one-boxer.

If you decide it's worth the risk to try for this deception, the visual reasoning part of your mind has to be kind of siloed from the rest. It's like you have a subagent that truly honestly believes you're a one-boxer. It's the non-visual reasoning parts of you that are aware of the real plan.

Another way of saying this is: in normal Newcomb's Problem, you can't tell whether you're in the real world being faced with Omega's offer or if you're instead Omega's simulation of real-you and you're revealing what real-you will do. The situation is similar with Omega-V, but in this case it's specifically your visual reasoner (viewed as kind of a subagent) that has to be built in such a way that it can't tell whether it's a simulation. Many other parts of you can tell perfectly well what's going on![1] But for the deception to work, the way in which you plan on two-boxing cannot be something your inner visual reasoner can even deduce.
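The Omega-V setup can be sketched as a toy model. This is only an illustration under stated assumptions: the `Agent` class, the two named "channels", and the standard Newcomb payoff amounts are hypothetical and do not come from the post. The sketch shows why a predictor that reads a single channel can be beaten only if that channel stays sincerely siloed.

```python
from dataclasses import dataclass

# Standard Newcomb payoffs (an assumption; the post does not state amounts).
OPAQUE_BOX = 1_000_000
TRANSPARENT_BOX = 1_000


@dataclass
class Agent:
    """Hypothetical agent whose reasoning is split into two channels."""
    visual_plan: str      # what the visual reasoner sincerely "believes"
    nonvisual_plan: str   # what the rest of the agent actually intends

    def act(self) -> str:
        # In this toy model the non-visual plan drives the final choice.
        return self.nonvisual_plan


def omega_v_predicts(agent: Agent) -> str:
    """Omega-V reads only the visual channel."""
    return agent.visual_plan


def payoff(agent: Agent) -> int:
    """Opaque box is filled iff Omega-V predicted one-boxing."""
    prediction = omega_v_predicts(agent)
    choice = agent.act()
    opaque = OPAQUE_BOX if prediction == "one-box" else 0
    return opaque + (TRANSPARENT_BOX if choice == "two-box" else 0)


honest_one_boxer = Agent(visual_plan="one-box", nonvisual_plan="one-box")
honest_two_boxer = Agent(visual_plan="two-box", nonvisual_plan="two-box")
siloed_deceiver = Agent(visual_plan="one-box", nonvisual_plan="two-box")

for name, agent in [
    ("honest one-boxer", honest_one_boxer),
    ("honest two-boxer", honest_two_boxer),
    ("siloed deceiver", siloed_deceiver),
]:
    print(name, payoff(agent))
```

In this model the siloed deceiver collects both boxes. If the two-boxing plan ever leaks into `visual_plan`, the agent becomes indistinguishable from the honest two-boxer and gets only the small box, which is why the visual reasoner must not be able to deduce the real plan.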

Now, let's take this one step further.

Let's suppose we're instead dealing with Omega-C. Also an imperfect predictor. But this time you learn that Omega-C scans specifically all of — and only — your conscious thinking.

At this point, the right strategy is to consciously decide you're going to one-box.

But some people still find themselves two-boxing and winning. Even while consciously intending to one-box. Sometimes they get confused and say the wrong thing or push the wrong button. Some of them trip and accidentally grab both boxes. Some laugh and grab both boxes and find them both full.

Assuming that at least some of those outcomes are strategic… how?

Hint: now you (by which I mean conscious parts of you) are maybe the siloed part. Basically an illusion meant to trick Omega-C.

The thing is, you cannot tell. Because if you could, then the deception — if it were there — wouldn't work.

Which is to say: on the inside, you cannot tell the difference between running this Omega-C deception strategy and honestly truly just being a one-boxer.

And furthermore, it's actually to your advantage not to check. Even if you were to find some way of being able to tell the difference!

But because of that, if you're aware of this deception strategy, it might work best if you honestly (from your perspective) look for some way to tell if you're running it. Actively trying to "come clean". That's a pretty powerful signal to Omega-C that you really are a one-boxer.

So if you can make it so that it seems to you like you really are looking, including accounting for awareness of possible non-conscious "shadow" reasoning, and you find none, but you still two-box in the end…

…well, you win!

…while the conscious parts of you are confused and surprised. And possibly quite dismayed.
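Here is a minimal sketch of the Omega-C case, again as a toy model rather than anything the post specifies. The `Mind` class, the `introspect` method, the `can_see_hidden` flag, and the payoff amounts are all assumptions made for illustration. It shows two things from the argument above: a sincere one-boxer and a hidden two-boxer can have identical conscious contents, and successfully "checking" is what would break the strategy.

```python
from dataclasses import dataclass, field

OPAQUE_BOX = 1_000_000      # assumed standard Newcomb payoffs
TRANSPARENT_BOX = 1_000
LEAK = "part of me plans to two-box"


@dataclass
class Mind:
    hidden_plan: str  # what the mind will in fact do; invisible to Omega-C
    conscious: set = field(default_factory=lambda: {"I will one-box"})

    def introspect(self, can_see_hidden: bool) -> None:
        """A sincere search for a hidden plan; finding it makes it conscious."""
        self.conscious.add("I looked for a hidden plan")
        if can_see_hidden and self.hidden_plan == "two-box":
            self.conscious.add(LEAK)


def omega_c_predicts(mind: Mind) -> str:
    # Omega-C scans all of, and only, the conscious contents.
    return "two-box" if LEAK in mind.conscious else "one-box"


def payoff(mind: Mind) -> int:
    opaque = OPAQUE_BOX if omega_c_predicts(mind) == "one-box" else 0
    return opaque + (TRANSPARENT_BOX if mind.hidden_plan == "two-box" else 0)


sincere = Mind(hidden_plan="one-box")
deceiver = Mind(hidden_plan="two-box")
sincere.introspect(can_see_hidden=False)
deceiver.introspect(can_see_hidden=False)

# From the inside, the two are indistinguishable.
print(sincere.conscious == deceiver.conscious)
print(payoff(sincere), payoff(deceiver))

# A deceiver whose search succeeds exposes the plan to Omega-C.
exposed = Mind(hidden_plan="two-box")
exposed.introspect(can_see_hidden=True)
print(payoff(exposed))
```

The design choice worth noticing is that `introspect` always records the sincere search, whether or not it finds anything. That search is the "coming clean" signal, and it only helps when `can_see_hidden` is false.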

 

Sketch of a real-world version

I call this pattern "Newcomblike self-deception" as a nod to Nate Soares' point that Newcomblike problems are the norm.

The deal is that people are kind of telepathic, in the sense of using things like vocal tone and body language and microexpressions and so on to intuitively deduce what might otherwise be hidden in others. They're scanning for things like intent and strategy and emotional tone. Often this telepathy is in service of synching up ("Hey, you seem upset, is something wrong?"), but it doubles as threat detection.

This telepathy is imperfect. Which means that sometimes Newcomblike self-deception is in fact a viable strategy.

I'd like to name one way I think this type of self-deception can actually happen in a person. It might be the main way it happens, or it might be rare. I honestly don't know. But it's one I've in fact seen in myself[2] and I think I've observed in some others.

By some mysterious method, it's possible to contract your awareness — by which I mean, the space of things you're actually aware of can be smaller than the space of sensory inputs (including mental experiences like thoughts and memories). Lots of people experience this when watching TV (losing awareness of the room), or when deep in flow work (not noticing hunger for hours while programming).

If you construct a sort of fake self in your mind, and then contract your awareness around that fake self, it can seem to you on the inside like you really are the way depicted in the fake self. Like it's not fake, it really is who you are.

If you also build up explanations to your fictitious self about why things outside that fiction either are consistent with it or don't matter, then you both (a) can honestly display to hostile telepaths that you (here meaning fake you) are being fully sincere in not hiding anything and (b) possibly give the telepaths ways of discounting the unavoidable signals that you (here meaning you holistically) are hiding something.

For instance, as a child whose mother says to you "Say you're sorry and mean it", you might be able to strategically misinterpret your fear of Mom's Wrath as "being really sorry". As long as you're not aware that that's what you're doing, it might work very well! She might read your distress as you really meaning it. ("I'm sorry I'm sorry I won't do it again please Mom I'm sorry…!") And you can keep yourself from being aware of this whole strategy by keeping your awareness contracted on the fictitious version of yourself that's "bad" and "very sorry", and keeping your understanding of the real problem outside of your awareness.

 

Possible examples in real life

Here are some examples I think I've actually seen — in culture, in others, and in myself:

I'm not trying to be exhaustive here. There are tons more examples.

 

Other solutions to the problem

We can't actually penetrate our own Newcomblike self-deception without having another viable (to us) strategy for dealing with hostile telepath problems.

However, if we do have another strategy in a given instance, then in that instance it can be safe to look. The self-deception can lift.

 

Having power

One alternative strategy type is coming to trust that you're able to handle the consequences of being accurately seen.

Such as the moms in the abusive partners example above: each one could acknowledge her self-deception once it was safe for her abusive partner to know too. She got enough power (financial or social) to protect herself and her child, making the telepathic scan no longer a dire threat.

I think a lot of "trauma processing" amounts to this self-empowerment strategy. But it's more like, noticing you already have power. I bet a lot of foundational self-deception habits come from being a child faced with telepaths (adults) who have a lot of power over them. A kid who deals with Mother's "Say sorry and mean it" demand with self-deception might then grow up to become really apologetic and "have low self-esteem". But it's just an old strategy for dealing with Mother that hasn't made contact with the fact that Mother isn't that powerful over them anymore. It's now actually just fine for her to know they're not "really sorry". If this raw physical truth comes into contact with the impulse to "be sorry", the mental firewall might simply collapse, and the mislabeling will stop.

So in many cases, "trauma processing" can basically mean noticing you're not a child anymore. You have power. So you don't have to appease the hostile telepaths just because they're adults. They can just know your internal state, and you (trust that you) can handle the consequences of them knowing.

Building emotional resilience is like this, I think. If you (trust that you) can handle the emotional and somatic sensations of others being upset with you, then you don't have to hide the parts of you that might make them upset. They can just be upset. While you might not like it, you know you'll be fine.

(Not to say anything about what's ultimately good to do here. Caring about others' reactions totally makes sense for other reasons, like the health of the community we're in. Here I'm focusing specifically on what can solve the hostile telepaths problem without self-deception.)

 

Occlumency

Another solution type is occlumency. Which is to say, if you trust you can keep your real goals and/or strategies hidden from a hostile telepath even if you consciously know what your goals/strategies are, then it's safe to consciously know them.

(This is something like switching from Omega-C to Omega-V.)

A classic example is in WWII when Nazis come knocking and ask if you're harboring any Jews. The analog of one-boxing here is just not harboring Jews. Newcomblike self-deception doesn't seem plausible to me here. You very much don't have the power to handle the consequences of being caught "two-boxing". So if you're helping refugees, you probably have to lie convincingly. And even if self-deception were a plausible strategy here, you wouldn't need it, to the extent that you trust your ability to hide the truth from the Nazis while knowing it yourself.

I think many psychopaths[3] use occlumency quite a lot. I've met some who know full well that they're trying to manipulate others and are presenting a façade to do so. It works for them in part because they don't send implicit distress signals around thinking they're bad for being manipulative: they're not nervous, so they don't need to explain their nervousness away.

There's a moral tangle here. Honesty is important for connection, integrity, and communal health. But you might not trust that it's safe to reveal the truth to a hostile telepath.[4] In this case, the moral injunction not to lie makes occlumency harder (because of fear of being caught, plus doubt about whether you should be using occlumency at all). This situation can leave self-deception as your only viable solution — which, incidentally, means you're still not being honest!

I think this means that if you care both about (a) wholesomeness and (b) ending self-deception, it's helpful to give yourself full permission to lie as a temporary measure as needed. Creating space for yourself so you can (say) coherently build power such that it's safe for you to eventually be fully honest.

 

Solution space is maybe vast

I've named three solutions to the hostile telepaths problem: Newcomblike self-deception, having power, and occlumency.

These aren't the only ones. A pretty simple one is simply running away and avoiding them. Another is investigating whether the telepaths are in fact hostile and discovering they're not (if that's true). Yet another is to jam telepathic scans with emotional charge that backs privacy norms. ("It's none of your business whether I 'really am' sorry!")

The important part isn't that we have a full taxonomy. That might be helpful, I don't know. The important part, as far as I'm concerned, is that by being very clear about what problem we're solving, we can tell when something is — and is not — a solution.

 

Ending the need for self-deception

By this model, to end (Newcomblike) self-deception, we have to remove the need for it. This means solving each instance of the hostile telepath problem some other way.

This is kind of tricky in practice. When you use self-deception to deal with a hostile telepath, you can't know that that's what you're doing. You[5] can't even know which hostile telepath problem you're solving! So how do you come up with another solution?

I don't have a provably general answer, but I have a pretty general approach that makes sense to me and has clearly worked several times. I'll share that approach here.

 

Welcome self-deception

First is welcoming that I'll self-deceive.

But this isn't "Well, I'm going to do it anyway, so I might as well be okay with it." That's nonsense: you probably can't just "be okay" with it. And trying probably makes the problem worse![6]

I mean something more wholehearted. If I self-deceive, it's because it's the best solution I have to some hostile telepath problem. If I don't have a better solution, then I want to keep deceiving myself. I don't just tolerate it. I actively want it there. I'll fight to keep it there!

This is somewhat akin to dealing with Omega-C by saying:

Look, I know it's possible I'm running a deception strategy. I could spend a bunch of energy trying to suss it out as a costly signal that it's not there. But at a policy level I'm just not going to do that. Not because I have evidence that I'm not following up on, but because I don't want to add stress to myself in the world where I really am self-deceiving. Since I'm doing this regardless of whether the deception strategy is running, it's not information about whether I'm secretly trying to two-box.

This relieves pressure. If I have some sense that I'm self-deceiving, and my attitude is to back the deception instead of trying to penetrate it, then the hidden part of me running the deception doesn't have to engage in an internal arms race with me. We become same-sided.
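The claim in the policy statement above, that behavior chosen regardless of the hidden state carries no information about that state, is just Bayes' rule with equal likelihoods. A small sketch, with a hypothetical function name and made-up probabilities chosen only for illustration:

```python
def posterior_deceiving(prior: float,
                        p_skip_if_deceiving: float,
                        p_skip_if_sincere: float) -> float:
    """P(deceiving | observed "did not go looking"), by Bayes' rule."""
    numerator = p_skip_if_deceiving * prior
    denominator = numerator + p_skip_if_sincere * (1.0 - prior)
    return numerator / denominator


prior = 0.3  # an observer's assumed prior that a deception strategy is running

# Blanket policy: never go looking, whichever world this is.
same_policy = posterior_deceiving(prior, 1.0, 1.0)

# No policy: suppose sincere minds usually go looking and deceivers usually don't.
no_policy = posterior_deceiving(prior, 0.8, 0.2)

print(same_policy, no_policy)
```

With the blanket policy the posterior stays at the prior (up to floating-point rounding), so not looking reveals nothing. Without the policy, the same behavior would push the observer's estimate well above the prior, which is the pressure the policy removes.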

 

Look away when directed to

Once I really back my own self-deception, it becomes easier to notice signs I'm doing it.

This works way better if I trust my occlumency skills here. If I don't feel like I have to reveal the self-deceptions I notice to others, and I trust that I can and will hide it from others if need be, then I'm still safe from hostile telepaths.

Seeing where I self-deceive doesn't mean I see what the deception is. In practice it's more indirect than that. What I mean are things like:

I don't mean this as an exhaustive list. Nor do I mean it as things to look out for. Nor do I mean that these always imply that self-deception is going on.

What I mean is, there are things a person does to maintain self-deception. If you basically promise the strategic not-conscious-to-you part that you really will respect the strategy, then it doesn't have to keep you so firmly out of the loop. Then you can potentially start picking up on some signposts like these ones.

Part of the deal is, when you notice such a possible signpost, you look away. You notice it and you drop the inquiry. Because until you have a non-self-deceptive strategy for whatever the real problem is, you don't want to break the one strategy you have.

For instance, sometimes I'll think about responding to an email… and I start getting sleepy. If I push, I start wanting to watch YouTube. These are signs that something in me doesn't trust it's safe for me to look there. Maybe it involves a decision that requires me to ask myself an unsafe question. I don't know — and I don't try to figure it out. At least not right away. Instead I back off and direct my attention elsewhere. Maybe I go cook something, or take a walk. I consciously distract myself from the tension point.

In my experience, this alone can often eliminate most of the stress involved in self-deception. It becomes fine. Annoying, glitchy, but no longer fraught with anxiety and self-doubt.

 

Hypothesize without checking

After a while I kind of get a "negative space" sense of what the self-deception is about. I continue not to look, out of something like respect. But I still have a hint.

Like if there's an email I keep freezing around. I can tell there's something there. I might even have some intuitive guesses about what it is!

…but I do not check. I don't introspect on whether my guesses feel right.

Instead, I hypothesize. What hostile telepath problem might someone in my shoes be trying to solve such that this behavior arises?

For instance, let's suppose the person is asking for me to run an event this weekend. I might hypothesize like this, intentionally referring to myself in third person:

Maybe Valentine doesn't actually want to do it, but he's scared that letting them know will make them think he's actually uninterested in them in general, which might have them closing opportunities he wants with them in the future.

Importantly, I am not introspectively checking. I'm not asking if I think the above really is what's going on with me. I'm just noticing that, viewing myself in third person, this model does seem to fit the evidence.

I'm also not trying to construct a plan to verify what's going on! Here Nature wants her secrets kept. I do not try to peek under her skirt.

Instead, I notice what Valentine (i.e., me in third person) in this hypothetical could maybe do instead of Newcomblike self-deception. What would be a viable alternative strategy for him?

Maybe Valentine could meditate on their possible disapproval, and come up with a plan for what happens next in which he's okay. (Building power.)

At this point I could just implement this possible solution. I don't have to check if it's relevant to my situation: there's not much cost in leaving myself a line of retreat this way.

If it turns out there's been Newcomblike self-deception going on, and if this hypothetical solution really did resolve the core problem that the self-deception was solving, then the self-deception should basically just lift.[7]

And if I still have an ugh field around the email, then I haven't addressed the real problem yet. Which is fine. Not ideal, but I'm still going to back any self-deception that might be there while I don't have a better option!

I can repeat this process. Hypothesize without checking, implement solutions that would work in the hypothetical, and find out what happens.

…at least unless and until I start getting frozen about this process. That might mean I'm getting too close to understanding the strategy before it's safe to do so.

Then I back off.
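The control flow of this process can be written down as a loop. This is a loose sketch, not a claim about how the process works in a mind: the function name and the three callbacks are hypothetical stand-ins for things a person would simply notice. The structural point is that nothing in the loop ever verifies a hypothesis directly. It only acts on each one and observes whether the avoidance is still there.

```python
from typing import Callable, Iterable, Optional, Tuple

Hypothesis = str
Remedy = Callable[[], None]


def hypothesize_without_checking(
    candidates: Iterable[Tuple[Hypothesis, Remedy]],
    still_avoiding: Callable[[], bool],
    feeling_frozen: Callable[[], bool],
) -> Optional[Hypothesis]:
    """Try third-person hypotheses in turn; never introspect to confirm one.

    Returns the hypothesis whose remedy was followed by the avoidance lifting,
    or None if the candidates ran out or it was time to back off.
    """
    for hypothesis, remedy in candidates:
        if feeling_frozen():
            return None        # back off: too close, too soon
        remedy()               # implement the alternative strategy anyway
        if not still_avoiding():
            return hypothesis  # any self-deception appears to have lifted
    return None


# Toy usage with a made-up state flag standing in for "is it safe yet?".
state = {"safe_to_decline": False}


def build_fallback_plan() -> None:
    state["safe_to_decline"] = True


result = hypothesize_without_checking(
    candidates=[("fears disapproval if he declines", build_fallback_plan)],
    still_avoiding=lambda: not state["safe_to_decline"],
    feeling_frozen=lambda: False,
)
print(result)
```

Returning `None` is not a failure here. It corresponds to continuing to back whatever self-deception might be present until a better option exists.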

 

Does this solve self-deception?

I don't know.

I didn't originally set out to make sense of self-deception. I was just trying to understand why people sometimes view themselves as flawed and in need of fixing.

It just turned out that that question was tied to a lot of others. Self-deception being one of them. A lot of them unified by considering the problem of hostile telepaths.

It seems worth noting that a bunch of the method I describe here — particularly the "hypothesize without checking" part — is derived. It amounted to a prediction that I tested and discovered worked as the model anticipated.

Likewise, occlumency being helpful. There might be other explanations for why getting better at privacy makes more thoughts thinkable. But I derived it from this one. And, again, it (anecdotally) seems to have worked as predicted.

These approaches work remarkably well on shame too, by the way. I might write a separate post on shame. Its logic is a bit different, but with a few adjustments I've found that shame dissolves extremely well in contact with these ideas.

With all that said, I don't think I'm in a position to say that I've solved self-deception. I don't know how I could know that. I'm not even convinced I've solved Newcomblike self-deception! My method seems plausibly general, but I don't have even the sketch of a necessity argument yet.

So, more work needed.

 

Summary

It seems to me that self-deception is solving a real problem. If we don't solve that real problem differently in a given instance, then in that instance we can't stop self-deceiving.

It seems to me that the real problem is (at least sometimes) hostile telepaths.

When I view hostile telepaths as the real problem I'm trying to solve, the perspective suggests what alternative solutions might look like, and it lets me check whether a given approach even can work as a solution.

And it seems to me that when I implement those alternative solutions, the result is sometimes that self-deception visibly falls away, non-mysteriously. It becomes obvious to me what was going on, and why.

I don't know if this model captures all cases of what we might want to call "self-deception". Maybe it does. But my impression is that it at least captures some cases that matter, and quite a lot of them.

  1. ^

    Note that having non-visual ways of thinking isn't enough to know you're not a simulation. What tells you you're not an Omega-V simulation is that you can reason in ways that (a) cannot be derived from your visual thinking and (b) change what you in fact do.

  2. ^

    Of course, this is something I became aware of after unraveling the structure in a few cases. It's not something that reveals itself while the structure works.

  3. ^

    By "psychopath" I mean someone with the cluster B personality disorder. I don't mean something derogatory. Nor am I (necessarily) referring to Gervais Principle psychopaths.

  4. ^

    To be clear, "hostile telepath" is a role, not an identity. Someone is a hostile telepath to you when they're scanning your mind and you don't trust they won't create problems for you based on what they find. Someone being a hostile telepath is less like them being a criminal and more like them being your lover or your foe. I say this because it's not a solution to identify "the hostile telepaths" in a community and reform or expel them; that approach is gibberish made of confused reification.

  5. ^

    If I were carefully describing this from the outside, I'd say that your false self can't know. "Self-deception" is really false self deception (as a strategy for deceiving hostile telepaths). The thing is, on the inside it doesn't feel like "your false self". That's the whole point! I'm describing this model in a way that's hopefully legible to the internal experience of actually running the strategy. Otherwise any instructions might make theoretical sense but won't be actionable. Sadly, this way of talking results in some ambiguities — precisely because the whole point of the strategy is to make something difficult to see clearly. Hopefully you can correct for this confusion as needed, sort of shifting to third-person and renaming things when the theory isn't clear.

  6. ^

    Why? Well, you need to "be okay" with it. But you're not. So what do you do with the fact that you're not okay with it? Loosely speaking, you've just turned your own conscious mind into an internal hostile telepath!

  7. ^

    In practice I find that not only does this work quite often, but now it sometimes works once I think of the alternative solution. I don't always need to implement it first. It feels to me like this result comes from having built internal trust that I really can and will respect my need for some strategy.

74 comments


comment by Gordon Seidoh Worley (gworley) · 2024-10-27T18:48:05.584Z · LW(p) · GW(p)

Some cultures used to, and maybe still do, have a solution to the hostile telepaths problem you didn't list: perform rituals even if you don't mean them.

If a child breaks their mom's glasses, the mom doesn't care if they are really sorry or not. All she cares about is if they perform the sorry-I-broke-your-glasses ritual, whatever that looks like. That's all that's required.

The idea is that the meaning comes later. We have some non-central instances of this in Western culture. For example, most US school children recite the Pledge of Allegiance every day (or at least they used to). I can remember not fully understanding what the words meant until I was in middle school, but I just went along with it. And wouldn't you know it, it worked! I do have an allegiance to the United States as a concept.

The world used to be more full of these rituals and strategies for appeasing hostile telepaths, who just chose not to use their telepathy because everyone agreed it didn't matter so long as the rituals were performed. But the spread of Christianity and Islam has brought a demand for internalized control of behaviors to much of the world, and with it we get problems like shame and guilt.

Now I'm not saying that performing rituals even if you don't mean them is a good solution. There are a lot of tradeoffs to consider, and guilt and shame offer some societal benefits that enable higher trust between strangers. But it is an alternative solution, and one that, as my Pledge of Allegiance example suggests, does sometimes work.

Replies from: Valentine, quila
comment by Valentine · 2024-10-28T01:26:27.483Z · LW(p) · GW(p)

Some cultures used to, and maybe still do, have a solution to the hostile telepaths problem you didn't list: perform rituals even if you don't mean them.

Ah, yep! True that!

Your point relates more directly to my main interest, memetics. I bet there are memes that encourage both (a) these rituals and (b) the telepathic attacks that make those rituals necessary.

comment by quila · 2024-10-27T19:46:43.649Z · LW(p) · GW(p)

For example, most US school children recite the Pledge of Allegiance every day (or at least they used to). I can remember not fully understanding what the words meant until I was in middle school, but I just went along with it. And wouldn't you know it, it worked! I do have an allegiance to the United States as a concept.

Can you explain how it caused that, and maybe what it feels like?

(I find it alarming that being forced to recite a pledge as a child can actually have that effect -- I knew humans were culturally programmable, but not that {forcing someone to say "I endorse x!" when they don't know what it means nor want to say it} every day would actually cause them to endorse x later on. Actually, I notice I'm skeptical that that was the real cause in your case; what's your reason for believing it was the cause?)

(No pressure to answer my questions of course - interpret them as statements of curiosity rather than requests in the human/social sense)

Replies from: gworley
comment by Gordon Seidoh Worley (gworley) · 2024-10-27T20:30:49.232Z · LW(p) · GW(p)

I'm sure my allegiance to these United States was not created just by reciting the Pledge thousands of times. In fact, I resented the Pledge for a lot of my life, especially once I learned more about its history.

But if I'm honest with myself, I do feel something like strong support for the ideals of the United States, much stronger than would make sense if someone had convinced me as an adult that its founding principles were a good idea. The United States isn't just my home. I yearn for it to be great, to embody its values, and to persist, even as I disagree with many of the details of how we're implementing the dream of the founders today.

Why do I think the Pledge mattered? It helped me get the feeling right. Once I had positive feelings about the US, of course I wanted to actually like the US. I latched onto the part of it that resonates with me: the founding principles. Someone else might be attracted to something else, or maybe would even find they don't like the United States, but stay loyal to it because they have to.

I'm also drawing on my experience with other fake-it-until-you-make-it rituals. For example, I and many other people really have come to feel more grateful for the things we have in life by explicitly acknowledging that gratitude. At the start it's fake: you're just saying words. But eventually those words start to carry meaning, and before long it's not fake. You find the gratitude that was already inside you and learn how to express it.

In the opening example, I bet something similar could work for getting kids to apologize. No need to check if they are really sorry, just make them say sorry. Eventually the sadness at having caused harm will become real and flow into the expression of it. It's like a kind of reverse training, where you create handles for latent behaviors to crystallize around, and by creating the right conditions when the ritual is performed, you stand a better-than-chance possibility of getting the desired association.

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2024-10-28T10:08:37.517Z · LW(p) · GW(p)

I bet something similar could work for getting kids to apologize.

Also, for getting them to say thank you. When kids are at a certain age, adults frequently seem to be reminding them to say thank you for gifts and such; I have a vague memory of adults also reminding me of this, when I was at that age. But these days I automatically say thank you for various things, and mean it.

comment by sitomin724 · 2024-10-28T12:05:22.856Z · LW(p) · GW(p)

Corollaries:

Honesty

  • If you want to become more honest and less self-deceiving, acquire power
  • If you want to make other people more honest and less self-deceiving, provide them with power (including power to protect themselves from you)
  • If you know someone who is more powerful than you but can't guarantee an upper bound on their power (and future power), then occlumency no longer works

Unboundedness

  • If you want an unlimited amount of power (such as a utility maximiser), there will almost always be coalitions of people more powerful than you against whom self-deception works
  • As long as there exist (hostile) coalitions of people unboundedly more powerful than you, completely removing self-deception from yourself is impossible

More than just yourself

  • If you want more examples of honesty and lack of self-deception available to you, ask powerful people to speak about their life experience. If you want these examples to be public, make them public
  • If you want two agents hostile to each other to both simultaneously be honest and not self-deceiving, provide them defensive rather than offensive power
  • If you want to achieve world peace, consider building defensive but not offensive power for every level of self-organisation - individual, family, ideological group, geographic group, etc etc

Time

  • If you don’t want someone to learn skills of dishonesty and self-deception, provide them with power as early as possible
  • If you don’t want to learn skills of dishonesty and self-deception, acquire power as early as possible

Hiding

  • If you want to acquire power without dishonesty or self-deception, ensure that your mental state and all causally downstream changes to world state are indistinguishable from noise in the eyes of more powerful actors.

I’m sure you can find real world examples of most of these.

Also, there might exist edge cases where some of the above corollaries don’t hold; those edge cases are worth exploring.

Also: this post is refreshing because it is not about AI and has the same vibes as Lesswrong 2008 (back when LW was actually good). I made an alt just to reply to it.

Replies from: Chipmonk
comment by Chipmonk · 2024-11-18T00:45:48.985Z · LW(p) · GW(p)

What would you say that the main types of power are?

My list (for humans): physical security, financial security, social security, emotional security [LW · GW] (this one you can only give yourself though)

comment by Ivan Vendrov (ivan-vendrov) · 2024-10-28T04:09:10.493Z · LW(p) · GW(p)

I like this a lot! A few scattered thoughts

  • This theory predicts and explains "therapy-resistant dissociation", or the common finding that none of the "woo" exercises like focusing, meditation, etc., actually work (cf. Scott's experience as described in https://www.astralcodexten.com/p/are-woo-non-responders-defective). If there's an active strategy of self-deception, you'd expect people to react negatively (or learn to not react via yet deeper levels of self-deception) to straightforward attempts to understand and untangle one's psychology.
  • It matches and extends Robert Trivers' theory of self-deception, wherein he predicts that when your mind is the site of a conflict between two sub-parts, the winning one will always be subconscious, because the conscious mind is visible to the subconscious but not vice versa, and being visible makes you weak. Thus, counterintuitively, the mind we are conscious of - in your phrase the false self - is always the losing part.
  • It connects to a common question I have for people doing meditation seriously - why exactly do you want to make the subconscious conscious? Why is it such a good thing to "become more conscious"? Now I can make the question more precise - why do you think it's safe to have more access to your thoughts and feelings than your subconscious gave you? And how exactly do you plan to deal with all the hostile telepaths out there (possibly including parts of yourself?). I expect most people find themselves dealing with (partly) hostile telepaths all the time, and so Occlumency is genuinely necessary unless one lives in an extraordinarily controlled environment such as a monastery.
  • Social deception games like Avalon or Diplomacy provide a fertile ground for self- and group experimentation with the ideas in this essay.
Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2024-10-28T12:06:59.032Z · LW(p) · GW(p)

Now I can make the question more precise - why do you think it's safe to have more access to your thoughts and feelings than your subconscious gave you? And how exactly do you plan to deal with all the hostile telepaths out there (possibly including parts of yourself?).

An answer I'd give is that for a lot of people, most of the hostile telepaths are ultimately not that dangerous if you're confident enough to be able to deal with them. As Valentine mentioned, often it's enough to notice that you are no longer in the kind of situation where the strategies would be necessary.

Unfortunately, many of the strategies also behave in such a way as to make themselves necessary, or to prevent the person from noticing that they could be abandoned:

  • Maybe I had a parent that wanted me to be dependent on them, so that they could control me. Even if I manage to break away from that parent, I may still have the belief that if someone wants to control me, then I have to genuinely believe that I cannot escape their control or they'll hurt me. This belief will tend to get me into abusive relationships... and then that strategy again becomes necessary for protecting me while in the relationship, when I would never have gotten into that relationship in the first place if not for that very strategy!
  • Maybe I believe that if I cause someone else any discomfort, I have to say I'm really sorry and experience genuine distress. As a result, I always execute this strategy, believing it to be crucial for my safety. If I were to ever not execute it, I might notice that some people are actually okay with me not reacting in such an extreme way... but because I always execute it, I never get the chance to notice that it'd be safe not to.

One of the ways by which these kinds of strategies get implemented [? · GW] is that the psyche develops a sense of extreme discomfort around acting in the "wrong" way, with successful execution of that strategy then blocking that sense of discomfort. For example, the thought of not apologizing when you thought someone might be upset at you might feel excruciatingly uncomfortable, with that discomfort subsiding once you did apologize.

I believe this is also related to the way that awareness narrows around the strategy - feeling the original discomfort is very unpleasant, and the mind tends to want to contract awareness in ways that keep discomfort out. If awareness were to broaden, then it would become aware of the unpleasant thing that the strategy is trying to push out of awareness. So for example, in the center of that discomfort of not-yet-having-apologized might be a memory of a time when you weren't really sorry and your mother was upset at you... and if you were to instead execute the strategy of desperately apologizing, then that would feel somewhat less painful and your awareness would naturally contract around that act of desperate apology, causing the original memory and the pain associated with that to drop away.

And something that practices like meditation can do is to bring the original discomfort into awareness in such a way that it can gradually stop feeling so unpleasant. (Though this can also go badly and bring something painful into awareness faster than the person is capable of dealing with it.) If that happens so that the original pain stops feeling so painful, then the self-deceptive strategies can stop creating situations where they perpetuate their own need to exist.

Now that's not to say that you would be guaranteed to be safe. A brief discussion I had on Twitter:

Me: I wonder to what extent significant parts of Buddhism got so focused on renunciation because that's the "safe" kind of mental transformation in the sense of not upsetting secular rulers.

While the kind of practice that dismantles societal programming and makes you go out in the world to change things, can easily become a threat for established power structures and a target for being rooted out.

Societal forces exerting evolutionary pressure on spiritual practice and selecting it for increased harmlessness/renunciation.

(Parallels of this idea in the context of corporate mindfulness training programs and such left as an exercise for the reader.)

(Or for that matter, parallels in the context of notions like "our group of ten people meditating is by itself an act of healing the world", which have some truth to them but also conveniently keep any change pretty localized and non-threatening.)

David Chapman: Yes this is very much the case in the history of sutrayana vs vajrayana. Vajrayana was typically reserved for the aristocratic elite, for this reason, and intermittently also appropriated by anti-establishment forces when they could.

Romeo Stevens (@romeostevensit [LW · GW] ): The ambitious sects were indeed wiped out

Aneesh Mulye: This wasn't so much of a thing in India; yes, it happened, but engagement with the world and with rulers was def a part of Indian Buddhist (and Shaiva, and most if not all other) traditions.

One solution involved only an elite having access to the hardcore agentifying stuff.

The extermination of all Indian Buddhism (what's called 'Tibetan' today, but that's just because it survived only in Tibet), and all Tantrik institutions (and Indic, generally), engaged with the world as they were, at the hands of the hateful Muslims, is why this didn't survive.

So apparently there were times in history when meditators did get a lot more confidence and self-insight, used that to become more powerful until they were seen as threats and wiped out, and that's why so many of the surviving meditative traditions are focused on things like withdrawing from the world and living as ascetics.

Replies from: PoignardAzur
comment by PoignardAzur · 2024-11-12T11:32:13.101Z · LW(p) · GW(p)

One of the ways by which these kinds of strategies get implemented is that the psyche develops a sense of extreme discomfort around acting in the "wrong" way, with successful execution of that strategy then blocking that sense of discomfort. For example, the thought of not apologizing when you thought someone might be upset at you might feel excruciatingly uncomfortable, with that discomfort subsiding once you did apologize.

Interesting. I've had friends who had this "really needs to apologize when they think they might have upset me" thing, and something I noticed is that when they don't over-apologize they feel the need to point it out too.

I never thought too deeply about it, but reading you, I'm thinking maybe their internal experience was "I just felt really uncomfortable for a moment and I still overcame my discomfort, I'm proud of that, I should tell him about it".

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2024-11-12T12:28:02.047Z · LW(p) · GW(p)

Sounds plausible to me. Alternatively, telling you that they didn't over-apologize still communicates that they would have over-apologized in different circumstances, so it can be a covert way of still delivering that apology.

comment by Chipmonk · 2024-10-27T17:08:26.912Z · LW(p) · GW(p)

This reminds me… maybe muscle tension is a frequent solution to this problem?

Some context: Lately I've been wondering, Why do we often experience feelings as things in the body? For example, why do I feel anxiety in my chest rather than just “knowing” I'm anxious? 

For example, my previous chronic neck pain seemed to be related to information that manifested in my neck: 

I suspect the feeling in my neck represented the information "I have the choice to leave the social situation I'm in right now" and/or "I am disliking/suppressing myself."

Why might this feeling have manifested in my neck?

What if feelings use the body as a screen to communicate information with others? If you have a certain feeling in your chest, maybe others can see that

BUT: What if a feeling represents information that your system doesn't want other people to know? Hostile telepaths problem.

In my case:

The feeling represented the awareness that I was insecure, and there were probably situations (probably social situations) in which it partially benefited me to be partially unaware of the fact that I was insecure. 

Well, in that case, your system could create muscle tension to "jam the signal"

If the muscles are stiff, maybe they can't be used as a screen anymore.

Replies from: Valentine
comment by Valentine · 2024-10-28T01:19:58.911Z · LW(p) · GW(p)

Oh huh. Yeah. It's not a solution by itself since there are lots of other cues hostile telepaths can use. But rigidity might dampen what they can read for sure!

This is testable. It predicts that improved skill with occlumency and/or gaining power should sometimes cause a release of chronic tension.

Replies from: Camille Berger, Chipmonk, Lulie
comment by Camille Berger (Camille Berger) · 2024-11-04T00:02:13.317Z · LW(p) · GW(p)

I read these comments a few days ago. It prompted me to try applying something inspired by what was written in the post, but immediately on my muscle tension: I slightly Focus on it, then tell myself to "side with" the tension / feeling, while also telling myself that it's Ok to do so, not trying to "bust" it or put it into words, and using chipmonk's technique (cf his blog) to explore resistance around being seen displaying "the underlying emotion".

I have the very clear impression that it weakens the tension quite fast (just timed it, it took about 30 seconds). I'm not having any insight on what the tension was about specifically.

That's a purely subjective experience report, and might be heavily biased.

comment by Chipmonk · 2024-10-28T01:22:15.828Z · LW(p) · GW(p)

I think it's true that people who have more power (whether emotional security or social status etc) generally have less muscle tension yea. 

But that reminds me that I should check with my clients if they accidentally experience much less muscle tension

Replies from: mr-hire
comment by Matt Goldenberg (mr-hire) · 2024-10-29T11:38:50.590Z · LW(p) · GW(p)

IME you can usually see in someone's face or body when they have a big release, just from the release of tension.

But I think it's harder to distinguish this from other hypotheses I've heard like "negative emotions are stored in the tissues" or "muscular tension is a way of stabilizing intentions."

comment by Lulie · 2024-11-10T21:55:39.380Z · LW(p) · GW(p)

This is testable. It predicts that improved skill with occlumency and/or gaining power should sometimes cause a release of chronic tension.

That wouldn’t be a test of the theory that hostile telepaths use muscle cues, since those things could cause muscle release for other reasons (as per Popper: tests can only be disproving, and they require a rival theory to decide between).

If gaining power never causes a release of tension, that still doesn’t disprove the theory, since again they could be tracking other things as well.

A more direct question would be something like: Can hostile telepaths in fact read people who are physically rigid better than people who have low muscle tension? Do their reads get better or worse when tension is added? Does it change the type of information they can read (and perhaps give more information for some axes and less for others)?

My impression is muscle tension gives a big sign on your back that you are hiding something, but makes it more muddy to non-trained people what exactly is being hidden.

It reminds me of Mark Lippmann’s blog post on virtual machines, and how we often have layers of virtual machines. Or in plain language: if you close your eyes and imagine your environment, and imagine making an escape within that imaginary environment, real-you might not tighten your muscles in such a way that you’d be readable.

I remember hearing that when we are seriously thinking about standing up, our heart rate and blood pressure rise in anticipation, but if we just hypothesise that we might stand up and keep it very abstract, the body doesn’t start those physical processes.

But it’s very obvious when someone has gone into their head! So hostile telepaths often want some kind of emoting or ‘really listening’ or ‘paying attention’ or ‘be present with me’.

So, yeah it conceals some information, but then it adds other information (such as meta information about concealment).

Actors might be interesting to study, here.

comment by kave · 2024-10-28T05:39:37.922Z · LW(p) · GW(p)

From the related book Elephant in the Brain:

Here is the thesis we’ll be exploring in this book: We, human beings, are a species that’s not only capable of acting on hidden motives—we’re designed to do it. Our brains are built to act in our self-interest while at the same time trying hard not to appear selfish in front of other people. And in order to throw them off the trail, our brains often keep “us,” our conscious minds, in the dark. The less we know of our own ugly motives, the easier it is to hide them from others.

Replies from: romeostevensit
comment by romeostevensit · 2024-10-30T14:12:38.244Z · LW(p) · GW(p)

I was reading this earlier and it dovetails very well with this post. Framing defending yourself against hostile people and processes as primarily selfish itself serves the hostile.

comment by Kaj_Sotala · 2024-10-28T11:17:04.229Z · LW(p) · GW(p)

Like if there's an email I keep freezing around. I can tell there's something there. I might even have some intuitive guesses about what it is!

…but I do not check. I don't introspect on whether my guesses feel right.

Instead, I hypothesize. What hostile telepath problem might someone in my shoes be trying to solve such that this behavior arises?

I tried doing this and it felt promising, and then I noticed a familiar feeling of wanting to tell a person affected by my possible self-deception how I'd now solved the problem and would behave differently from now on. And I remembered that on each previous time when I'd had that feeling and told the other person something like that, my behavior had in fact not changed at all as a consequence.

And now I'm chuckling at myself.

Replies from: PoignardAzur
comment by PoignardAzur · 2024-11-12T11:33:08.956Z · LW(p) · GW(p)

Yeah, bad habits are a bitch.

comment by Vanessa Kosoy (vanessa-kosoy) · 2024-10-28T08:55:33.618Z · LW(p) · GW(p)

I've been thinking along very [LW(p) · GW(p)] similar [LW(p) · GW(p)] lines [LW(p) · GW(p)] for a while (my inside name for this is "mask theory of the mind": consciousness is a "mask"). But my personal conclusion is very different. While self-deception is a valid strategy in many circumstances, I think that it's too costly when trying to solve an extremely difficult high-stakes problem (e.g. stopping the AI apocalypse). Hence, I went in the other direction: trying to self-deceive little, and instead be self-honest about my[1] real motivations, even if they are "bad PR". In practice, this means never making excuses to myself such as "I wanted to do A, but I didn't have the willpower so I did B instead", but rather owning the fact I wanted to do B and thinking how to integrate this into a coherent long-term plan for my life.

My solution to "hostile telepaths" is dividing other people into ~3 categories:

  1. People that are adversarial or untrustworthy, either individually or as representatives of the system on behalf of which they act. With such people, I have no compunction to consciously lie ("the Jews are not in the basement... I packed the suitcase myself...") or act adversarially.
  2. People that seem cooperative, so that they deserve my good will even if not complete trust. With such people, I will be at least metahonest: I will not tell direct lies, and I will be honest about the circumstances in which I'm honest (i.e. reveal all relevant information). More generally, I will act cooperatively towards such people, expecting them to reciprocate. My attitude towards this group is that I don't need to pretend to be something other than I am to gain cooperation; I can just rely on their civility and/or (super)rationality.
  3. Inner circle: People that have my full trust. With them I have no hostile telepath problem because they are not hostile. My attitude towards this group is that we can resolve any difference by putting all the cards on the table and doing whatever is best for the group in aggregate.

Moreover, having an extremely difficult high-stakes problem is not just a strong reason to self-deceive less, it's also a strong reason to become more truth-oriented as a community. This means that people with such a common cause should strive to put each other at least in category 2 above, tentatively moving towards 3 (with the caveat of watching out for bad actors trying to exploit that).

  1. ^

    While making sure to use the word "I" to refer to the elephant/unconscious-self and not to the mask/conscious-self.

Replies from: romeostevensit, Valentine, keenan-pepper
comment by romeostevensit · 2024-10-30T14:15:49.593Z · LW(p) · GW(p)

Agree with the approach, with the caveat that some people in group 2 are naive cooperators and therefore second-order defectors, since they are suckers for group 1. E.g. the person who will tell the truth to the Nazis out of mistaken theories of ethics or just behavioral conditioning.

comment by Valentine · 2024-10-29T20:55:28.987Z · LW(p) · GW(p)

…I went in the other direction: trying to self-deceive little, and instead be self-honest about my real motivations, even if they are "bad PR".

Yep. I'm not sure why you think this is a "very different" conclusion. I'd say the same thing about myself. The key question is how to handle the cases where becoming conscious of a "bad PR" motivation means it might get exposed.

And you answer that! In part at least. You divide people into three categories based on (a) whether you need occlumency with them at all and (b) whether you need to use occlumency on the fact that you're using occlumency.

I don't think of it in terms this explicit, but it's pretty close to what I do now. People get to see me to the extent that I trust them with what I show them. And that's conscious.

Am I misunderstanding you somehow?

 

Moreover, having an extremely difficult high-stakes problem is not just a strong reason to self-deceive less, it's also a strong reason to become more truth-oriented as a community. This means that people with such a common cause should strive to put each other at least in category 2 above, tentatively moving towards 3 (with the caveat of watching out for bad actors trying to exploit that).

I both agree and partly disagree. I tagged your comment with where.

Totally, yes, having a real and meaningful shared problem means we want a truth-seeking community. Strong agreement.

But I think how we "strive" to be truth-seeking might be extremely important. If it's a virtue instead of an engineering consideration, and if people are shamed or punished for having non-truth-seeking behaviors, then the collective "striving" being talked about will encourage individual self-deception and collective untalkaboutability. It's an example of inducing adaptive entropy [LW · GW].

Relatedly: mathematicians don't have truth-seeking collaboration because they're trying hard to be truth-seeking. They're trying to solve problems, and they can verify whether their proposed solutions actually solve the problems they're working on. That means truth-seeking is more useful for what they're doing than any alternatives are. There's no need for focusing on the Virtue of Seeking Truth as a culture.

Likewise, there's no Virtue of Using a Hammer in carpentry.

What puts someone in category 2 or 3 for me isn't something I can strive for. It's more like, I can be open to the possibility and be willing to look for how they and I interact. Then I discover how my trust of them shifts. If I try to trust people more than I do, I end up in more adaptive entropic confusion. I'm pretty sure this is lawful on par with thermodynamics.

This might be what you meant. If so, sorry to set up and take a swing at a strawman of what you were saying.

comment by Keenan Pepper (keenan-pepper) · 2024-10-29T21:13:26.638Z · LW(p) · GW(p)

...never making excuses to myself such as "I wanted to do A, but I didn't have the willpower so I did B instead", but rather owning the fact I wanted to do B and thinking how to integrate this...

 

AKA integrating the ego-dystonic into the homunculus [LW · GW]

comment by Kaj_Sotala · 2024-10-28T11:12:46.375Z · LW(p) · GW(p)

So in many cases, "trauma processing" can basically mean noticing you're not a child anymore. You have power. So you don't have to appease the hostile telepaths just because they're adults.

Yes, definitely. And this is also why it's often so important for the therapist - if this is done in the context of therapy - to exhibit unconditional positive regard toward the client. If the therapist is genuinely accepting of any thoughts and feelings that the client brings up, then that opens the door for the client's parts to start considering the possibility that maybe they can tell the truth and still be accepted. And once it has become possible to tell the truth to at least one person, it becomes possible to tell it to yourself as well.

(Though maybe I should say that the therapist needs to either experience unconditional positive regard toward the client, or successfully deceive themselves and the client into thinking that they do. Heh.)

One additional tangle is that often the client's issue is less about needing to act in a certain way, and more about needing to be a certain way. At some point, one frequently goes from "it's bad to break something and not be genuinely sorry on that particular instance" to "it's bad to be the kind of person who wouldn't automatically feel sorry and who needed to fake being sorry". 

This makes it harder to get to the point where the therapist could provide evidence that they are fine with you not being sorry in that particular instance, because getting there would require you to reveal that it's possible for you to not automatically feel sorry, and that feels dangerous by itself!

And what you've written also gets to the limitations of therapy - that no matter how much positive regard the therapist might have toward their client, if they are still e.g. living with an abusive partner, just the therapist's warmth and support may not be enough to produce a shift. (I haven't had clients with situations that extreme, but I've certainly noticed times when we started making much more progress once they broke up with a partner or quit a job that they had been trying to force themselves to do, and then suddenly new parts of them came to awareness that could now be convinced they were safe.)

Replies from: romeostevensit, Valentine
comment by romeostevensit · 2024-10-30T14:20:55.794Z · LW(p) · GW(p)

It's worth noting that many therapists break therapeutic alliance for ideological or liability reasons, and this is one of the reasons that self-therapy, peer therapy, LLMs, and workbooks can sometimes be better.

comment by Valentine · 2024-10-31T15:10:39.810Z · LW(p) · GW(p)

(Though maybe I should say that the therapist needs to either experience unconditional positive regard toward the client, or successfully deceive themselves and the client into thinking that they do. Heh.)

I mean, technically they don't even need to deceive themselves. They can be consciously judgy as f**k as long as they can mask it effectively. Psychopaths might make for amazing therapists in this one way!

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2024-10-31T18:45:16.629Z · LW(p) · GW(p)

True, though I think that judgment tends to be hard to effectively mask in this kind of context (though maybe psychopaths would be able to fake it; I don't know). At least my own experience inclines me to agree with this person:

I’ve worked with and/or done swaps with a lot of different practitioners (IFS, aletheia, VIEW, regular talk therapy, bodywork, voice work etc), and what I found to be the most effective element of their skill set (for me) is: non-judgmental, loving presence… 

many times I have explored the same topic with two different practitioners within a few days of each other; and it’s in those cases that the impact of the difference in the quality of non-judgmental loving presence is most noticeable.

the degree to which the quality of the presence is non-judgmental can be VERY subtle, but the system can pick up on it. it might not even be a strong enough signal to notice it consciously, but it will greatly impact how the session unfolds.

comment by Tao Lin (tao-lin) · 2024-10-30T05:07:59.526Z · LW(p) · GW(p)

I'm often surprised how little people notice, adapt to, or even punish self-deception. It's not very hard to detect when someone's deceiving themselves; people should notice more and disincentivise that.

Replies from: Ratios, Valentine
comment by Ratios · 2024-10-30T12:03:14.281Z · LW(p) · GW(p)

This reads to me as, "We need to increase the oppression even more."

comment by Valentine · 2024-10-31T14:33:53.942Z · LW(p) · GW(p)

It's not very hard to detect when someone's deceiving themselves…

A few notes:

  • Sometimes this is obviously true. I agree.
  • It's a curious question why many folk turn their attention away from someone else's self-deception when it's obvious. Often they don't, but sometimes they do. Why they (we) do that is an interesting question worthy of some sincere curiosity.
  • Confirmation bias. You don't notice the cases where you don't pick up on someone else's self-deception.

 

…people should notice more and disincentivise that

Boy oh boy do I disagree.

If someone's only option for dealing with a hostile telepath is self-deception, and then you come in and punish them for using it, thou art a dick.

Like, do you think it helps the abused mothers I named if you punish them somehow for not acknowledging their partners' abuse? Does it even help the social circle around them?

Even if the "hostile telepath" model is wrong or doesn't apply in some cases, people self-deceive for some reason. If you don't dialogue with that reason at all and just create pain and misery for people who use it, you're making some situation you don't understand worse.

I agree that getting self-deception out of a culture is a great idea. I want less of it in general.

But we don't get there by disincentivizing it.

Replies from: jimmy
comment by jimmy · 2024-11-12T07:14:28.410Z · LW(p) · GW(p)

If someone's only option for dealing with a hostile telepath is self-deception, and then you come in and punish them for using it, thou art a dick.

Like, do you think it helps the abused mothers I named if you punish them somehow for not acknowledging their partners' abuse? Does it even help the social circle around them?

 

If that's their only option, and the hostility in your telepathy is antisocial, then yes. In some cases though, people do have other options and their self-deception is offensive, so hostile telepathy is pro-social. 

For example, it would probably help those mothers if the men knew to anticipate punishment for not acknowledging their abuse of their partners. I bet at least one of those abusive husbands/boyfriends will give his side of the story that's a bit more favorable than "I'm a bad guy, lol", and that it will start to fall apart when pressed. In those cases, he'll have to choose between admitting wrongdoing or playing dumb, and people often do their best to play really dumb. The self-deception there is a ploy to steal someone else's second box, so fuck that guy.

I think the right response is to ignore the "self" part of the deception and treat it like any other deception. If it's okay to lie to the Nazis about hiding Jews, then it's okay to deceive yourself into believing it too. If we're going to make it against the law to lie under oath, then making it legal so long as they lie to themselves too is only going to increase the antisocial deception.

comment by Ninety-Three · 2024-10-27T19:06:58.486Z · LW(p) · GW(p)

By "psychopath" I mean someone with the cluster B personality disorder.

There isn't a cluster B personality disorder called psychopathy. Psychopathy has never been a formal disorder and the only time we've ever been close to it is way back in 1952 when the DSM-1 had a condition called "Sociopathic Personality Disturbance". The closest you'll get these days is Antisocial Personality Disorder, which is a garbage bin diagnosis that covers a fairly broad range of antisocial behaviours, including the thing most people have in mind when they say "psychopath", but also plenty of other personality archetypes that don't seem particularly psychopathic, like adrenaline junkies and people with impulse control issues.

Replies from: Seth Herd, Valentine
comment by Seth Herd · 2024-10-28T04:49:35.875Z · LW(p) · GW(p)

Okay; so what's the reality about the people we're thinking of when we say psychopathic? The term seems to still be in use among some professionals, for bad or good reasons.

A garbage bin diagnosis seems like a step down if psychopathy or sociopathy was pointing to a more specific set of attitudes and tendencies.

Replies from: Ninety-Three
comment by Ninety-Three · 2024-10-28T12:22:33.733Z · LW(p) · GW(p)

I think Valentine gave a good description of psychopath as "people who are naturally unconstrained by social pressures and have no qualms breaking even profound taboos if they think it'll benefit them", where just eyeballing human nature, that seems to be a "real" category that would show up as a distinct blip in a graph of human behaviour and not just "how constrained by social pressures people are is a normally distributed property and people get called psychopaths in linear proportion to how far left they are on the bell curve".

comment by Valentine · 2024-10-28T01:35:43.099Z · LW(p) · GW(p)

Cool. I knew there at least used to be "antisocial personality disorder", which I thought was under cluster B along with narcissism and borderline. And I thought "psychopathy" was a different term for APD. Thanks for the correction.

The main thing I wanted to gesture at there is that I wasn't using "psychopath" as something derogatory. I didn't mean "bad guys". I meant something more like "people who are naturally unconstrained by social pressures and have no qualms breaking even profound taboos if they think it'll benefit them". (I just now made that up.) It seems to me that it's a pretty specifically different mental/emotional architecture.

Replies from: Ninety-Three
comment by Ninety-Three · 2024-10-28T03:07:45.596Z · LW(p) · GW(p)

Yep, your intended meaning about the distinctive mental architecture was pretty clear, just wanted to offer the factual correction.

comment by Ben Pace (Benito) · 2024-11-12T04:56:21.923Z · LW(p) · GW(p)

Curated![1] 

I think this is an excellent post on a tricky subject. I found here an articulate description of a great many internal experiences and thoughts I've had but have never well-named or seen written down clearly (e.g. 'occlumency' is a skill I have practiced a lot). I find this topic pretty hard to talk and think openly about, in large part due to the adversarial dynamics, so I am especially grateful for this post (and the ensuing discussion section). One of my favorite posts on LW this year, I think.

Personally, I frame the "Having power" solution as "Gaining independence". I think power is a bit goodhartable on in a corruptible way, and the true goal is to be able to think whichever thoughts you'd think if you had no influences on you, not the thoughts you'd think if you had immense power.

  1. ^

    "Curated", a term which here means "This just got emailed to 30,000 people, of whom typically half open the email, and it gets shown at the top of the frontpage to anyone who hasn't read it for ~1 week."

Replies from: Valentine
comment by Valentine · 2024-11-12T05:41:42.366Z · LW(p) · GW(p)

Ah yeah, I think "gaining independence" is a better descriptor of (what I meant by) that solution type.

comment by Measure · 2024-10-28T12:50:04.517Z · LW(p) · GW(p)

it's not information about whether I'm secretly trying to two-box

It's still Bayesian evidence. Someone with a different policy (always deeply investigating themselves), could get Omega-C to have a higher credence of them one-boxing.  We'd have to specify how sure Omega has to be to offer the large payment (and what priors Omega has) to know if the choice of policy matters.

Replies from: Valentine
comment by Valentine · 2024-10-28T23:14:12.094Z · LW(p) · GW(p)

I think I disagree. I'll add some precision to point out how. Happy to hear if I'm missing something.

E is Bayesian evidence of X if E is more likely to happen when X is true than when it's not.

If Bob says "As a policy, I'm not going to check whether I'm running an Omega-C deception", that's equally likely whether Bob is running a deception or not. (Hence the "as a policy" part.) It just fully happens in both cases. So from Omega-C's point of view, it's not Bayesian evidence that distinguishes between the two versions of Bob.
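A minimal sketch of this point in Python, with toy numbers that are purely assumptions for illustration. The helper name `posterior` and every probability below are hypothetical and not part of the Omega-C setup. It shows that when an observation is equally likely under both hypotheses, the posterior equals the prior, whereas a "reactive" choice that is more likely under deception does shift it.

```python
def posterior(prior: float, p_e_given_x: float, p_e_given_not_x: float) -> float:
    """Bayes' rule: return P(X | E) from P(X), P(E | X), and P(E | not X)."""
    joint_x = prior * p_e_given_x
    joint_not_x = (1.0 - prior) * p_e_given_not_x
    return joint_x / (joint_x + joint_not_x)


# Hypothetical prior that Bob is running a deception.
prior = 0.3

# Unconditional policy: Bob announces it with the same probability either way,
# so the likelihood ratio is 1 and Omega-C learns nothing from it.
unconditional = posterior(prior, p_e_given_x=0.8, p_e_given_not_x=0.8)

# Reactive choice ("that might be self-deception, so I won't look"):
# assumed here to be more likely when Bob is in fact deceiving.
reactive = posterior(prior, p_e_given_x=0.8, p_e_given_not_x=0.2)

print(unconditional, reactive)
```

Under these assumed numbers the unconditional policy leaves the credence at the prior, while the reactive choice raises it.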

It would be evidence if the choice were made from a stance of "Oh shoot, that might be self-deception! Well, I'm now going to adopt the no-looking policy so that I don't have to check it!" Then yeah, sure, that's clearly evidence — which is precisely why that method of deciding not to look isn't what can work.

The policy of always deeply investigating oneself can produce evidence for Omega-C, but the act of choosing that policy might not. Choosing the policy not to look just doesn't produce evidence.

Or at least that's how it seems to me.

Replies from: Measure
comment by Measure · 2024-10-29T11:05:54.926Z · LW(p) · GW(p)

The fact that Bob has this policy in the first place is more likely when he's being self-deceptive. Sure, some people will glomarize even when they have nothing to hide, but more often it will be the result of Bob noticing that he's the sort of person who might have something to hide.

It's a general rule [LW · GW] that if E is strong evidence for X, then ~E is at least weak evidence for ~X.
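A small numeric sketch of that rule in Python, under assumed numbers. The function name `posteriors` and all probabilities are hypothetical. It relies only on the identity that the prior equals the probability-weighted average of the posteriors, so if seeing E raises the credence in X, not seeing E must lower it.

```python
def posteriors(prior: float, p_e_given_x: float, p_e_given_not_x: float):
    """Return (P(E), P(X | E), P(X | not E)) via Bayes' rule."""
    p_e = prior * p_e_given_x + (1.0 - prior) * p_e_given_not_x
    post_e = prior * p_e_given_x / p_e
    post_not_e = prior * (1.0 - p_e_given_x) / (1.0 - p_e)
    return p_e, post_e, post_not_e


# Hypothetical setup: E is rare overall but 20x more likely when X is true.
prior = 0.5
p_e, post_e, post_not_e = posteriors(prior, p_e_given_x=0.2, p_e_given_not_x=0.01)

# Conservation of expected evidence: the expected posterior equals the prior.
expected_posterior = p_e * post_e + (1.0 - p_e) * post_not_e

print(post_e, post_not_e, expected_posterior)
```

With these assumed inputs, observing E moves the credence from 0.5 to about 0.95 (strong evidence for X), while observing ~E moves it only to about 0.45 (weak evidence for ~X), and the weighted average stays at 0.5.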

Replies from: gwern, Valentine
comment by gwern · 2024-10-31T15:14:21.362Z · LW(p) · GW(p)

The fact that Bob has this policy in the first place is more likely when he's being self-deceptive.

A fun fictional example here is Bester's The Demolished Man: how do you plan & carry out an assassination when telepaths are routinely eavesdropping on your mind? The protagonist visits a company musician, requesting a musical earworm for a company song to help the workers' health or something; alas! the earworm gets stuck in his head, and so all any telepath hears is the earworm. And you can't blame a man for having an earworm stuck in his head, now can you? He has an entirely legitimate reason for that to be there, which 'explains away' the evidence of the deception hypothesis that telepathic-immunity would otherwise support.

comment by Valentine · 2024-10-31T14:53:22.009Z · LW(p) · GW(p)

The fact that Bob has this policy in the first place is more likely when he's being self-deceptive.

I don't know if that's true. It might be. But some possible counterpoints:

  • People can distrust systems that demand they check. "You have nothing to fear if you have nothing to hide" can get a response of "No" even from people who don't have anything to hide.
  • If someone subconsciously thinks they can pull off the illusion of honestly looking while in fact finding nothing, they become more likely to choose to look because they're self-deceiving.
  • Someone with a policy of not looking might be better at making their own self-deception unnecessary.

 

…more often it will be the result of Bob noticing that he's the sort of person who might have something to hide.

Sure, that way of deciding doesn't work.

Likewise, if you're inclined to decide you're going to dig into possible sources of self-deception because you think it's unlikely that you have any, then you can't do this trick.

The hypothetical respect for any self-deception that might be there needs to be unconditional on its existence. Otherwise, for the reason you say, it doesn't work as well.

(…with some caveats about how people are imperfect telepaths, so some fuzz in implementation here is in practice fine.)

That said, I think you're right in that if Omega-C is looking only at the choice of whether to look or not, then yes, Omega-C would be right to take the choice as evidence of a deception.

But the whole point is that Omega-C can read what conscious processes you're using, and can see that you're deciding for a glomarizing reason.

That's why the reason you choose what you do matters so much here, not just what you choose.

 

It's a general rule [LW · GW] that if E is strong evidence for X, then ~E is at least weak evidence for ~X.

Conservation of expected evidence is what makes looking relevant. It's not what makes deciding to look relevant.

If I decide to appease Omega-C by looking, and then I find that I'm self-deceiving, the fact that I chose to look gets filtered. The fact that this is possible is why not finding evidence can matter at all. Otherwise it'd just be a charade.

Relatedly: I have a coin in my pocket. I don't feel like checking it for bias. Does that make it more likely that the coin is biased? Maybe. But if I could magically show you that I'm not looking because I honestly do not care one way or the other and don't want to waste the effort, and it doesn't affect me whether it's biased or not… then you can't use my disinterest in checking the coin for bias as evidence of some kind of subconscious deception about the coin's bias. I'm just refusing to do things that would inform you of the coin's possible bias.

If this kind of reasoning weren't possible, then it seems to me that glomarization wouldn't be possible.

comment by romeostevensit · 2024-10-27T16:52:59.127Z · LW(p) · GW(p)

I can secondhand lend some affirmation to the newcomb case. A friend with DID from a childhood with a BPD mom later became a meditator and eventually rendered transparent the shell game that was being played with potentially dangerous preferences and goals to keep them out of consciousness, since the mom was extremely good at telepathy and was hostile for the standard BPD reason: other beings with other goals are inherently threatening to their extremely fragile sense of their own preferences and goals.

Another solution is illegible-ization/orthogonalization of preferences to the hostile telepath so that you don't overlap in anything they might care about or overpower you with. I think this is one of the things to think about in terms of rationalist avoidance of conflict theory.

Replies from: Valentine
comment by Valentine · 2024-10-28T01:42:34.230Z · LW(p) · GW(p)

I can secondhand lend some affirmation to the newcomb case.

Oh yeah, that's a cool example.

Another solution is illegible-ization/orthogonalization of preferences to the hostile telepath so that you don't overlap in anything they might care about or overpower you with.

You mean something like, look boring to them? Like, I don't care how good Putin is at reading people, I just don't have anything he wants, so I'm safe as long as I keep (apparently) not having anything he wants?

Replies from: romeostevensit
comment by romeostevensit · 2024-10-28T18:32:37.043Z · LW(p) · GW(p)

Yes, though this often involves some self-deception about your true utility function. I suspect that some ace people did this to themselves to avoid zero-sum competition they expect to painfully lose.

comment by Chipmonk · 2024-10-27T15:58:22.126Z · LW(p) · GW(p)

I'm very glad you wrote this

comment by Alex Lintz (alex-lintz) · 2024-11-08T16:03:55.732Z · LW(p) · GW(p)

This jogged a lot of thinking about how it fits into various modalities. I think the lack of an actual solution to hostile mind-reading might be a flaw in several modalities I've tried which could be part of why I've struggled to have the progress I made with them stick. Many of these at least point toward alternative methods of dealing with self-deception which could be useful and I think authentic relating suggests at least one idea for an alternative method of occlumency which feels a little more virtuous (definitely felt some aversion to your solutions because of this). 

Existential Kink

  • This is a pretty weird book. Very much not my style but have found it useful on recommendation of Sasha Chapin's post about persistent self-love. Still new to this so might be getting it a little wrong.
  • Existential kink is like 'find the desire behind the self-deception and learn to love it, appreciate it. Learn to be ok with the presence of this desire.'
  • I think it doesn't really have a solution to actually dealing with hostile mindreaders effectively but maybe I haven't got there yet.
  • Main practice is something like: Find a pattern which is coming up often (e.g. losing interest in someone quickly after they show interest in you) and then dive into the deep desires underlying that. What is thrilling and great about this pattern, what does it get you? E.g. maybe it allows you to imagine you'll find the perfect person, allows you to think of yourself as superior in some confusing way (i.e. I'd never date someone willing to stoop so low as to date me), helps avoid the potential for rejection (e.g. if I reject them first I'm perfectly safe from rejection and being safe from rejection is NICE).
  • So in this case you'd make peace with the desire to avoid rejection and the sense of superiority doing the rejecting gives you. Now you can notice those senses without needing to avoid them.
  • When it comes to how you engage with that in the real world... Unclear.
  • In many cases it seems like the desire itself recedes somewhat and I'm not sure why.
  • E.g. I confronted this and am currently dating someone. I notice sometimes the fear of rejection and the sense of 'I could find someone better' but they feel not that important to me. I want her attention and I get it and that feels nice. I have my uncertainties sure but idk, I guess other things feel more important after having acknowledged these hidden desires?
  • Where did the desire to be superior, the desire to find the absolute best go? Why would recognizing it and embodying it reduce it?
  • Perhaps now it shares the stage with other emotions and I can recognize that in comparison I care more about those other things? E.g. a stable relationship and someone I enjoy being around just does seem better than being able to avoid rejection and feeling superior to attractive women when they're both in the clear.
  • Why would a desire be stronger when it's ONLY in the subconscious?
    • It's got wrong priorities or doesn't understand the situation. E.g. it's optimizing for more of a childhood scenario? Not sure...

Nondual practice

  • Coming out of self-deception is a key aspect of nondual practice. Adyashanti talks about a practice of coming to accept all parts of ourselves. Mostly this is through meditation and just being ok with everything that arises. He also has another technique that seems more hands-on, but he doesn't really give much of a description that I've seen, so I'm not really sure what the process is - would like to know!
  • Loch Kelly uses IFS as a tool for full self-acceptance and a way to move through self-deception (more on that below).
  • When I've been in a nondual state for longer periods on retreat I've noticed that I can find desires which I generally don't acknowledge and feel totally alright with them. The threat of consequences for the belief feels unimportant because I know that I can live with the consequences and still feel this deep sense of okayness.
  • E.g. I could be really critical of myself and notice failures and just be like 'wow, so cool that I'm noticing these failures, well done'. And this isn't something I'd practiced it just came along for the ride.
  • Unfortunately this hasn't stuck around much and seems to require a lot of steady dedication to re-entering the state. I think probably I could do this somewhat reliably but for some reason I don't want to or don't trust it? I'm not really sure what's going on here but I mostly don't inhabit this state anymore and I'm not entirely sure why. Probably some self-deception going on?

IFS

  • IFS talks a lot about protector parts which are often self-deceiving in order to protect some other part. A crucial part of every IFS session is to ask the protector what age they think you are (often, at least in examples, it would say something like 5-12) and then you could reveal to it that actually you're 30 (or whatever). And sometimes this would lift the burden and the protector would suddenly be like 'oh, I guess we don't really need to have all these walls up, we can handle some difficult reactions now'. I think this is basically the power solution to self-deception.
  • This sometimes has just worked for me in a clear and discontinuous way. That said, it's usually reverted at least to some extent after a few days or weeks. I never really figured out how to stop the reversion which is why I ended up giving up on IFS (mostly). Could be in part due to lack of solution to hostile telepaths problem. 

TEAM CBT

  • This is my current favorite modality (along with existential kink).
  • A lot of it is finding a belief that you might not even really believe rationally but which feels true (e.g. I'm worthless) and then exploring that. There are a million different ways to explore the beliefs so I'm finding it a little harder to connect it with hostile mind-reading.
  • One of my favorite methods in TEAM CBT is cost-benefit analysis. So with 'I'm worthless' you might set a timer for 5-10min and fill out the positives. Some examples might be:
    • I'm not responsible for anything, I'm worthless so there's no obligation for me to fix the world's problems.
    • If people see that I think I'm worthless then they'll probably want to comfort me and provide support - that can be pretty nice!
  • Then you'd do the negatives (e.g. it makes me feel bad all the time, makes work harder, etc) and when looking at them side by side you might recognize that you do care about meeting some of the more hidden desires (e.g. being cared for) but this belief is not the best way to go about that. 

All of these methods have mostly been about self-deception and recognizing it, but they mostly don't deal with the hostile mind-reading aspect. That's perhaps a key reason why they haven't fully worked for me! I've had the feeling that I could accept these things but haven't wanted to admit them. I do think Authentic Relating has a useful solution here, though, that has helped at least somewhat:

Circling & Authentic Relating:

  • Though you're trying to reveal as much as you can about yourself it's not useful to push too far beyond what you're comfortable with. Pushing too far can be somewhat traumatizing and lead to setbacks. Instead you can wholeheartedly welcome some thoughts but decide you don't need to share them. Rather than stonewalling or lying, you can simply say things like 'I'm feeling some shame right now around a thought that's coming up. I'm not quite ready to share it but it feels good to voice that it's there'.
  • This idea that one could fully accept something happening inside that might feel shameful to reveal while not sharing it was pretty big for me. It empowered me to feel I could explore more scary stuff. For example, a problem I've had in some circling situations was thoughts like 'wow she's attractive, I notice myself feeling some arousal but it would be weird to share that' or 'I don't find her attractive I want to make clear I'm not interested but this would be really offensive to say out loud'. Both of these are things that, when I first did authentic relating I was really trying to do some Newcomblike self-deception around. I was trying to be someone who doesn't think these thoughts, I would distract myself, latch onto other thoughts arising that I would feel comfortable revealing. Once I felt empowered to have these thoughts without revealing them that became somewhat easier - I could reveal some nervousness and discomfort but not reveal the actual thought. And sometimes I would reveal these thoughts and that would usually go quite well.
  • To be fair, I don't think I've fully internalized this and this maybe makes it sound like more of a solved problem than it is. That said, I think this is an important step and seems kinda similar to occlumency but a little more virtuous? (which is my main problem with occlumency as described)

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2024-11-11T14:51:06.270Z · LW(p) · GW(p)

A crucial part of every IFS session is to ask the protector what age they think you are (often, at least in examples, it would say something like 5-12) and then you could reveal to it that actually you're 30 (or whatever).

I wouldn't put it as strongly as to say that it's a crucial part of every IFS session. It can sometimes be a very useful question and approach, sure, but I've had/facilitated plenty of great sessions that didn't use that question at all. And there are people who that question just doesn't resonate with.

comment by Hastings (hastings-greer) · 2024-11-04T21:27:16.941Z · LW(p) · GW(p)

Organizations and communities can also face hostile telepaths. My pet theory that sort of crystallized while reading this is that p-hacking is academia’s response to a hostile telepath that banned publication of negative results.

This of course sucks for nontraditional researchers and especially journalists who don’t even subconsciously know that p=0.05002 r=1e-7 “breakthrough in finding relationship between milk consumption and toenail fungus” is code for “We have conclusively found no effect and want to broadcast to the community that there is no effect here; yet we cannot ever consciously acknowledge that we found nothing because our mortgages depend on fooling a hostile telepath into believing this is something.”

comment by jwray · 2024-11-19T05:43:38.507Z · LW(p) · GW(p)

My experience is very different. I feel unitary, without any IFS or Jungian shadow or other sort of subconscious parts trying to deceive my conscious self. I violate quite a lot of social norms without feeling any shame or guilt about it, because I've got an 'internal scorecard'. So long as I'm true to my own values/morality, and I can protect myself with some combination of power / occlumency / disengaging, all three of which come easily to me, social norms don't matter in private.

Replies from: Valentine
comment by Valentine · 2024-11-19T15:00:33.636Z · LW(p) · GW(p)

To me this is exciting. I deduced that the mental architecture you're describing should be possible. It's extremely cool to hear someone just name it as a lived experience. Like, what would a mind that's actually systematically free of Newcomblike self-deception have to be like, assuming the hostile telepaths problem is real? This is one possible solution. Assuming I haven't misunderstood what you're describing!

comment by lemonhope (lcmgcd) · 2024-11-08T08:33:47.196Z · LW(p) · GW(p)

What gaslighting goes on in math class?

Replies from: Valentine
comment by Valentine · 2024-11-08T22:32:09.322Z · LW(p) · GW(p)

A few examples:

  • Framing kids as "disruptive" or "inattentive" or otherwise having the wrong nature if they feel disengaged. This is after informing them what they're going to study without consulting what's relevant or interesting to them, and then using social power to require them to study those things. But the problem is supposedly the student, not the system.
  • Claiming that they'll need these math tools later in life, and that this justifies adults pressuring the kids to learn those skills now. (This is more bullshit-flavored than gaslight-flavored, but I think they're psychological neighbors.)
  • Pretending that because a word problem touches on a topic kids care about, the math is relevant to what the kids like about that topic.
  • Insisting that forcing kids to take math classes is for their own good, and if the kids don't see why or don't agree, then they should believe the adults over their own sense of things.

It makes me so angry. It's perfectly antithetical to the essence of math as I see it.

Replies from: lcmgcd
comment by lemonhope (lcmgcd) · 2024-11-09T00:37:33.648Z · LW(p) · GW(p)

Your examples fit the definition quite well. Apparently this is in the dictionary now. https://www.merriam-webster.com/dictionary/gaslighting

comment by CuoreDiVetro · 2024-11-02T22:56:18.074Z · LW(p) · GW(p)

This is coherent with my experience. I'm pretty sure self-deception solves problems other than hostile telepaths. One such problem, which I'm pretty sure I've seen in people, is preserving motivation: if something is really important to me, and I need to put in a lot of effort to make it happen, and the probability of success is very low (let's say epsilon), and if knowing that the probability of success is epsilon would totally annihilate my motivation to work towards it, then maybe hiding that low probability from myself safeguards my motivation to put in all the work necessary.

Think here of a situation where there is a natural catastrophe and someone's loved one is caught beneath the rubble: the person refuses to believe that their loved one might be dead and does everything to remove the rubble as fast as possible.

Maybe this is also where the planning fallacy comes from in some cases. 

comment by VaRuna · 2024-11-12T19:49:42.717Z · LW(p) · GW(p)

I think this is a great outline of how these strategies form. A very similar idea is described in The Elephant in the Brain, but this is straightforwardly written and more visceral in a way I felt the book (and most other attempts to describe it) lacked. Kudos!

The drive to be "perfectly rational" and push all slivers of self-deception out with force is, I think, one of the core psychological errors made in rationalist circles (including the writing) for exactly the reasons you laid out. Well explained!

Honesty, and specifically self-honesty, is held as one of the highest virtues. I think it's even cited as the only example of an instrumental good that can and should just be treated as an inherent good. Considering that, I would have loved to see you engage a little more in the discussion of how what you propose interferes with this and the tradeoff there.

Your proposed solution resembles IFS therapy, focusing on not forcing the part of you that tries to do something you might consciously disapprove of, and instead being understanding and accepting of that mechanism as well. The difference is that the intention is to find out what is happening, and not to try to solve it from a distance.

The last point touches on my central issue with your suggestion. The intention of really letting the self-deception sit does make it easier to notice when it happens, but so does investigation when it is done gently and without force. The problem with building up the habit to let self-deception run is that you will notice it more, but it will also continue to happen habitually, or the propensity for it might even grow stronger.

If there were actually plenty of hostile telepaths running around, that might not be a bad thing, but I think it is actually a rare occurrence in adulthood. To be specific, what is rare is a situation in which you really do want to self-deceive in that way and not see the avoided problem instead. Almost all of these situations are cases in which the solution will be "investigating whether the telepaths are in fact hostile and discovering they're not," as you wrote. There are obvious situations in which another person might be dismayed by your true mental state, but keeping that hidden only serves you in situations where the other person does not actually care about your well-being (in the form of your true mental state) anyway, in which case a detectable lie or insincere appeasement will work just as well and does not require the Newcomb-like deception in the first place. Having better information about your own internal state and preferences is a net good in all these situations.

Of course, there are situations in which you really would want to lie, but I would argue that you have to make the distinction between whom you can be honest with and whom to lie to anyway, subconsciously. So if you do, then you can also do that consciously, and rather learn to face the discomfort of confronting conflict. And, in these rare cases, realize and integrate that if someone is truly hostile enough to not just disregard your true preferences, but will even use knowledge of them against you, then you should be consciously aware of that (not just subconsciously, as you have to be in order to pull off Newcomb-like deception in the first place) and treat them appropriately, which includes lying to them with impunity. And if you do not make the distinction between whom you can really be honest with about everything, then that itself poses a significant issue, and bringing that to attention is invaluable in itself.

Another practical problem with the method is that most of the situations in which this would apply in fact do not have a practical solution, not even a concrete emotional one, because the problem is the avoidance of the feeling itself. There can be practical things you can do to make an instance of shame or procrastination less overwhelming, but when the badness of the problem is inherent, then no strategy can ease it enough to make the self-deception superfluous. In most cases, that means that the problem just festers until the urgency forces it out, in which case nothing is learned or gained, just a solution postponed. Or, without a timeline, the deception just stays unaddressed, you never learning what is behind it, which is tragic, considering that there might be something positive in there, and the negative is in fact only in the anticipation, since there are no hostile telepaths raining consequences on you as a result of your self-knowledge. In my experience, being in a safe environment suffices as a prerequisite to the safety of lifting any self-deception, with time and patience. Another person who can assure you that it's okay and safe can be helpful, but also not strictly needed.

I would be curious to hear your thoughts. Most of the above comes down to the tradeoff between solving the problem and the high value of self-honesty and of having accurate models.

comment by MikkW (mikkel-wilson) · 2024-10-30T14:58:05.311Z · LW(p) · GW(p)

This post does a good job of laying out compelling arguments for thoughts adjacent to areas I've previously already enjoyed thinking about.

For the record, this sentence popped into my head while reading this: "Wait, but what if I'm Omega-V, and [Valentine] is a two boxer?"

(Edit: the context for this thought is my previous thoughts from reading other posts by Valentine, which I find quite elucidating but which have also somehow left me feeling a bit creeped out; that being said, my opinion of this post itself is strongly positive)

comment by Kabir Kumar (kabir-kumar) · 2024-11-16T19:57:39.771Z · LW(p) · GW(p)

I thought this was going to be an allegory for interpretability.

comment by NickH · 2024-11-12T10:14:08.037Z · LW(p) · GW(p)

I like this except for the reference to "Newcomblike" problems, which, I feel, is misleading and obfuscates the whole point of Newcomb's paradox. Newcomb's paradox is about decision theory - if you allow cheating then it is no longer Newcomb's paradox. This article is about psychology (and possibly deceptive AI) - cheating is always a possible solution.

comment by lemonhope (lcmgcd) · 2024-11-08T08:47:46.592Z · LW(p) · GW(p)

Regarding this

Such as the moms in the abusive partners example above: each one could acknowledge her self-deception once it was safe for her abusive partner to know too. She got enough power (financial or social) to protect herself and her child, making the telepathic scan no longer a dire threat.

I would add that most abusive people don't really like crushing their loved ones and it is sometimes easy to get them to stop, eg by having a peer of the abuser get a private word with the two parties separately. I think it is common for there to be simple miscommunication/misunderstanding — the abuser does not typically actually benefit from the accusative situation.

Why haven't abuser & abusee already talked and figured this out? Well there is some force field where you can't have a normal conversation with someone who is hitting you (or you are hitting) about the hitting. Although I don't know how to put it in your terms here from this post.

Replies from: Valentine
comment by Valentine · 2024-11-08T22:18:13.874Z · LW(p) · GW(p)

In broad strokes I agree with you. Here I was sharing my observation of four cases where a friend was involved this way. One case might have been miscommunication but it doesn't seem likely to me. The other three definitely weren't. In one of those I personally knew the guy; I liked him, but he was also emotionally very unstable and definitely not a safe father. I don't think the abuse was physical in any of those four cases.

Replies from: lcmgcd
comment by lemonhope (lcmgcd) · 2024-11-09T00:41:31.382Z · LW(p) · GW(p)

Aw man we used the same word for different things again

comment by Lorec · 2024-10-31T00:21:24.681Z · LW(p) · GW(p)

I think this means that if you care both about (a) wholesomeness and (b) ending self-deception, it's helpful to give yourself full permission to lie as a temporary measure as needed. Creating space for yourself so you can (say) coherently build power such that it's safe for you to eventually be fully honest.

The first sentence here, I think, verbalizes something important.

The second [instrumental-power] is a bad justification, to the extent that we're talking about game-theoretic power [as opposed to power over reductionistic, non-mentalizing Nature]. LDT is about dealing with copies of myself. They'll all just do the same thing [lie for power] and create needless problems.

You do give a good justification that, I think, doesn't create any needless aggression between copies of oneself, and which I think suffices to justify "backing self-deception" as promising:

I mean something more wholehearted. If I self-deceive, it's because it's the best solution I have to some hostile telepath problem. If I don't have a better solution, then I want to keep deceiving myself. I don't just tolerate it. I actively want it there. I'll fight to keep it there! [...]

This works way better if I trust my occlumency skills here. If I don't feel like I have to reveal the self-deceptions I notice to others, and I trust that I can and will hide it from others if need be, then I'm still safe from hostile telepaths.

[emphases mine]

"I'm not going to draw first, but drawing second and shooting faster is what I'm all about" but for information theory.

Replies from: Valentine
comment by Valentine · 2024-10-31T15:07:31.917Z · LW(p) · GW(p)

I think the word "power" might be creating some confusion here.

I mean something pretty specific and very practical. I'm not sure how to precisely define it, but here are some examples:

  • If someone threatens to freak out at you if you disagree with them, and you tend to get overwhelmed and panic when they freak out at you, then they have a kind of power over you. Building power here probably looks like learning to experience them freaking out without you getting overwhelmed.
  • If someone pays for your rent and food but might stop if they get any hint that you're gay, it might not be safe to even ask yourself honestly whether you are. You build power here by getting an income, or a source of rent and food, that doesn't depend on the hostile telepathic benefactor.
  • If your lover gets turned on by you politically agreeing with them and turned off by disagreement, you might find your political views drifting toward theirs for "unrelated" reasons. One way to build power here is to get other access to sex. Another is to diminish your libido. Another is to break up with them. (Not saying any of these are a great idea. I'm just naming what the solution of "building power" might look like here.)

I'm not familiar with LDT. I can't comment on that part. Sorry if that means what I just said misses your point.

Replies from: Lorec
comment by Lorec · 2024-10-31T16:33:58.068Z · LW(p) · GW(p)

! I'm genuinely impressed if you wrote this post without having a mental frame for the concepts drawn from LDT.

LDT says that, for the purposes of making quasi-Kantian [not really Kantian but that's the closest thing I can gesture at off the top of my head that isn't just "read the Yudkowsky"] correct decisions, you have to treat the hostile telepaths as copies of yourself.

Indexical uncertainty, ie not knowing whether you're in Omega's simulation or the real world, means that, even if "I would never do that", if someone is "doing that" to me, in ways I can't ignore, I have to act as though I might ever be in a situation where I'm basically forced to "do that".

I can still preferentially withhold reward from copies of myself that are executing quasi-threats, though. And in fact this is correct because it minimizes quasi-threats in the mutual copies-of-myself negotiating equilibrium.

"Acquire the ability to coerce, rather than being coerced by, other agents in my environment" is not a solution to anything - because the quasi-Rawlsian [again, not really Rawlsian, but I don't have any better non-Yudkowsky reference points off the top of my head] perspective means that if you precommit to acquire power, you end up in expectation getting trodden on just as much as you trod on the other copies of you. So you're right back where you started.

Basically, you have to control things orthogonal to your position in the lineup, to robustly improve your algorithm for negotiating with others.

And I think "be willing to back deceptions" is in fact such a socially-orthogonal improvement.

Replies from: Valentine
comment by Valentine · 2024-10-31T17:45:51.963Z · LW(p) · GW(p)

! I'm genuinely impressed if you wrote this post without having a mental frame for the concepts drawn from LDT.

Thanks. :)

And thanks for explaining. I'm not sure what "quasi-Kantian" or "quasi-Rawlsian" mean, and I'm not sure which piece of Eliezer's material you're gesturing toward, so I think I'm missing some key steps of reasoning.

But on the whole, yeah, I mean defensive power rather than offensive. The offensive stuff is relevant only to the extent that it works for defense. At least that's how it seems to me! I haven't thought about it very carefully. But the whole point is, what could make me safe if a hostile telepath discovers a truth in me? The "build power" family of solutions is based on neutralizing the relevance of the "hostile" part.

I think you're saying something more sophisticated than this. I'm not entirely sure what it is. Like here you say:

Basically, you have to control things orthogonal to your position in the lineup, to robustly improve your algorithm for negotiating with others.

I'm not sure what "the lineup" refers to, so I don't know what it means for something to be orthogonal to my position in it.

I think I follow and agree with what you're saying if I just reason in terms of "setting up arms races is bad, all else being equal".

Or to be more precise, if I take the dangers of adaptive entropy [LW · GW] seriously and I view "create adaptive entropy to get ahead" as a confused pseudo-solution. It might be that that's my LDT-like framework.

Replies from: Lorec
comment by Lorec · 2024-10-31T20:30:17.422Z · LW(p) · GW(p)

I once thought "slack mattered more than any outcome". But whose slack? It's wonderful for all humans to have more slack. But there's a huge game-theoretic difference between the species being wealthier, and thus wealthier per capita, and being wealthy/high-status/dominant/powerful relative to other people. The first is what I was getting at by "things orthogonal to the lineup"; the second is "the lineup". Trying to improve your position relative to copies of yourself in a way that is zero-sum is "the rat race", or "the Red Queen's race", where running will ~only ever keep you in the same place, and cause you and your mirror-selves to expend a lot of effort that is useless if you don't enjoy it.

[I think I enjoy any amount of "the rat race", which is part of why I find myself doing any of it, even though I can easily imagine tweaking my mind such that I stop doing it and thus exit an LDT negotiation equilibrium where I need to do it all the time. But I only like it so much, and only certain kinds.]

comment by Ratios · 2024-10-28T11:47:54.248Z · LW(p) · GW(p)

It is worth noting that Ziz has already proposed the same idea in False Faces, although I think Valentine did a better job of systematizing and explaining the reasons for its existence.

Another interesting direction of thought is the connection to Gregory Bateson’s theory that double binds cause schizophrenia. Spitballing here: it could be that a double bind triggers an attempt to construct a "false face" (a self-deceptive module), similar to a normal situation involving a hostile telepath. However, because the double bind is contradictory, the internal mechanism that tries to create the false face to appease the hostile telepath malfunctions, resulting in mental chaos.

comment by João Ribeiro Medeiros (joao-ribeiro-medeiros) · 2024-10-28T05:31:21.557Z · LW(p) · GW(p)

Very powerful reasoning. I would add that a relevant form of self-deception that should be investigated in this framework is religious faith, given its place as foundational to societies worldwide.

Religious faith seems like an optimal form of solution to the hostile telepaths problem. In certain contexts it seems like a mixture of the three solutions you outlined (Newcomblike self-deception, having power, and occlumency).

Religious faith seems to provide psychological power through the feelings of absolute certainty and over-confidence that religious people experience. At the same time, religious conversion is correlated with overcoming PTSD and addiction (step 2 of the 12-step program: "Came to believe that a Power greater than ourselves could restore us to sanity.")

I think there is an underlying problem of concept hierarchy which may precede self-deception. Maybe we are able to hide concepts and thoughts while they occupy a peripheral part of the mind; this could also be linked to a continuous formulation of the Newcomblike problem in decision theory. I am not sure how this unfolds, and will be trying to explore that in the weeks to come.

Thank you for sharing!

comment by Kabir Kumar (kabir-kumar) · 2024-11-16T20:01:33.286Z · LW(p) · GW(p)

I think this is really along the wrong path and misunderstands a lot of things, but it is so far along the incorrect path of thought, and misunderstands so much, that it's hard to untangle.

Replies from: kabir-kumar
comment by Kabir Kumar (kabir-kumar) · 2024-11-16T20:05:41.585Z · LW(p) · GW(p)

To be a bit less useless - I think this fundamentally misses the problem of respect and actually being able to communicate with yourself and fully do things, if you've done so - and that you can do these when you have full faith and respect in yourself (meaning all of yourself - may include love as well, not sure how necessary that is for this). Could maybe be done in other ways as well, but I find those less beautiful, personally.