[Valence series] 4. Valence & Liking / Admiring

post by Steven Byrnes (steve2152) · 2024-06-10T14:19:51.194Z · LW · GW · 12 comments


  4.1 Post summary / Table of contents
  4.2 Key concept: “liking / admiring”
    4.2.1 Intuitive (extreme) example of “liking / admiring”
    4.2.2 Examples of “liking” without “liking / admiring”
    4.2.3 Examples of “liking / admiring” without “admiring”
  4.3 Proposal: “Beth likes / admires Alice” = “the concept of ‘Alice’ has positive valence in Beth’s mind”
    4.3.1 What’s happening with valence in cases where “liking” comes apart from “liking / admiring”?
  4.4 An innate “drive to feel liked / admired”
    4.4.1 Claim: People’s motivation to feel liked / admired is an innate drive, not just a learned strategy
    4.4.2 How might an innate “drive to be liked / admired” work?
    4.4.3 Side note: Should we make AGIs with a “drive to be liked / admired”?
  4.5 Our tendency to pick careers, preferences, clothes, beliefs, etc. that seem “high-status”
    4.5.1 Path 1: (I like / admire Alice) & (Alice likes X) → (I like X) → (I try to do X)
    4.5.2 Path 2: (Alice likes X) → (if I do X, then Alice will like / admire me more) → (I try to do X)
  4.6 Our tendency to want people we like / admire to “lead”—i.e., to afford them more “social status”
    4.6.1 Side note: prestige versus dominance
  4.7 My self-esteem (i.e., the valence I assign to “myself”) is not the same as my tendency to be liked / admired. But it is strongly affected by that.
    4.7.1 Connection to active self-concept formation, externalization of ego-dystonic tendencies, etc.
  4.8 Conclusion

4.1 Post summary / Table of contents

Part of the Valence series [? · GW].

(This is my second attempt to write the 4th post of my valence series. If you already read the previous attempt [LW · GW] and are unsure whether to read this too, see footnote→[1]. Also, note that this post has a bit of overlap with (and self-plagiarism from) my post Social status part 2/2: everything else [LW · GW], but the posts are generally quite different.)

The previous three posts built a foundation about what valence is, and how valence relates to thought in general. Now we’re up to our first more specific application: the application of valence to the social world.

Here’s an obvious question: “If my brain really assigns valence to any and every concept in my world-model, well, how about the valence that my brain assigns to the concept of some other person I know?” I think this question points to an important and interesting phenomenon that I call “liking / admiring”—I made up that term, because existing terms weren’t quite right. This post will talk about what “liking / admiring” is, and some of its important everyday consequences related to social status, mirroring, deference, self-esteem, self-concepts, and more.

4.2 Key concept: “liking / admiring”

I’m using the term “liking / admiring” to talk about a specific thing. I’ll try to explain what it is. Note that it doesn’t perfectly line up with how people commonly use the English words “liking” or “admiring”.

4.2.1 Intuitive (extreme) example of “liking / admiring”

I’m Beth, a teenage fan-girl of famous pop singer Alice, whom I am finally meeting in person. Let’s further assume that my demeanor right now is “confident enthusiasm”: I am not particularly worried or afraid about the possibility that I will offend Alice, nor am I sucking up to Alice in expectation of favorable treatment (in fact, I’m never going to see her again after today). Rather, I just really like Alice! I am hanging on Alice’s every word like it was straight from the mouth of God. My side of the conversation includes things like “Oh wow!”, “Huh, yeah, I never thought about it that way!”, and “What a great idea!”. And (let us suppose) I’m saying all those things sincerely, not to impress or suck up to Alice.

That’s a good example of what I mean by “Beth likes / admires Alice”.

One side effect of really liking Alice is that I’ll tend to also want to do things that Alice does—or if I don’t want to do them myself, I’ll at least be more likely to think of them as good things to do. If Alice likes going to a certain bar, then (in my mind) it must be a friggin’ awesome bar! In other words, I’m applying the halo effect to Alice (see §3.4.4 [LW · GW])—more discussion in §4.5–§4.6 below.

I’m picking a very extreme example to make it clear. For example, I happen to like / admire the actor Tom Hanks to a small degree; but I like / admire him much, much, much less strongly than how much Beth likes / admires Alice in the story above.

4.2.2 Examples of “liking” without “liking / admiring”

The reason I’m using the term “like / admire”, instead of just “like”, is that it’s a specific kind of liking—something a bit like “Beth likes Alice in Alice’s capacity as a person with beliefs and desires and agency”. So some non-examples of liking / admiring would be:

4.2.3 Examples of “liking / admiring” without “admiring”

The reason I’m using the term “like / admire”, instead of just “admire” (or “respect”), is because my “liking / admiring” does not have to be reflectively-endorsed, or ego-syntonic, or associated with an all-things-considered desire to emulate the person. Nor does it imply that you think of the target as somehow above or better than yourself. As some examples:

4.3 Proposal: “Beth likes / admires Alice” = “the concept of ‘Alice’ has positive valence in Beth’s mind”

In §2.4.1 [LW · GW], I proposed a “linear model” [LW · GW], where “thoughts” are compositional (i.e., basically made of lots of little interlocking pieces), and that the total valence is linearly additive over those thought-pieces (a.k.a. “concepts”).

My proposal is simple: “Beth likes / admires Alice” to the extent that Beth’s brain assigns positive valence to the “Alice” concept.
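
To make the proposal concrete, here is a minimal sketch of the linear model applied to a person concept. The concept names, the numbers, and the function names `thought_valence` and `likes_admires` are all hypothetical choices for illustration. Flooring at zero is just one reading of “to the extent that Beth’s brain assigns positive valence”.

```python
# Hypothetical concept valences in Beth's mind (illustrative numbers only).
concept_valence = {
    "Alice": 0.9,
    "karaoke": 0.1,
    "Monday morning": -0.4,
}


def thought_valence(concepts, valences):
    """Linear model: a thought's valence is the sum over its concepts.

    Concepts with no stored valence are treated as neutral (0.0),
    which is an assumption of this sketch.
    """
    return sum(valences.get(c, 0.0) for c in concepts)


def likes_admires(person, valences):
    """Degree of liking / admiring, read off the person concept.

    Assumption: only the positive part of the valence counts.
    """
    return max(0.0, valences.get(person, 0.0))


# A thought that combines "Alice" and "karaoke" inherits valence from both.
combined = thought_valence(["Alice", "karaoke"], concept_valence)
beth_likes_alice = likes_admires("Alice", concept_valence)
```

In this toy version, anything that raises the valence of the “Alice” concept raises both how much Beth likes / admires Alice and the valence of every thought that includes Alice.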

4.3.1 What’s happening with valence in cases where “liking” comes apart from “liking / admiring”?

In §4.2.2 just above, I mentioned two examples where the everyday notion “liking” comes apart from “liking / admiring”—namely, “I like Milhouse (as an object of derision)”, and “Bob likes Alice (as an object of sexual desire)”. Those have something to do with positive valence, but I also said that they were not examples of “like / admire”. So, what’s going on?

My answer is: It’s the same idea as I discussed in § [LW · GW]. There, I gave the example where, for a pro-Israel reader, “Hamas” would have negative valence, but “Hamas as the subject of my righteous indignation” would have positive valence. Well, in exactly the same way, it’s entirely possible for my brain to assign negative valence to “Milhouse”, while assigning positive valence to “Milhouse as an object of my condescension”.

4.4 An innate “drive to feel liked / admired”

4.4.1 Claim: People’s motivation to feel liked / admired is an innate drive, not just a learned strategy

For everything I’ve said so far in this post, there needn’t be anything special and specific in the brain underlying liking / admiring per se. The same brain mechanisms that associate positive valence with the thought of a particular chair, can likewise associate positive valence with the thought of a particular person.

But I do think there's something special and specific that the genome builds into the brain for a drive to be liked / admired. This would be a reflex that says: if I believe that someone else likes / admires me—especially if it’s someone else I like / admire in turn—then that belief is itself intrinsically rewarding to me.

(In an earlier version of this post, I was using the term “status drive” for this reflex. And it certainly has plenty to do with status-seeking! But now I think “drive to be liked / admired” is a much better and more specific term. I think “status-seeking” in full generality is a more complicated topic, probably involving at least two different innate drives[2] in conjunction with various learned strategies.)

Stepping back a bit: As I’ve mentioned in §2.5 [LW · GW] and discussed in much more detail elsewhere [LW · GW], I think there’s a sharp and important distinction between “innate drives” versus the various products of within-lifetime learning. One way to tell them apart is that, if something is not a human cross-cultural universal, then it’s unlikely to be directly related to an innate drive. But the converse is not true: If something is a cross-cultural universal, then maybe it’s directly related to an innate drive, or an alternative possibility is that everyone has similar learning algorithms, and everyone has similar life experience (in certain respects), so maybe everyone winds up adopting the same habits. Let’s call that alternative possibility “convergent learning”.

Applying this general idea to the phenomenon of “wanting to be liked / admired”, I believe that this phenomenon is a cross-cultural human universal. So two hypotheses would be: (1) it’s a direct innate drive, or alternatively (2) it’s “convergent learning”—each person learns from life experience that lots of good things happen when other people like / admire them.

Anyway, my strong belief is that it’s (1) not (2)—a direct innate drive, not “convergent learning”. That belief comes from various sources, including how early in life liking/admiration-seeking starts, how reliable it is, the person-to-person variability in how much people care about being liked / admired, and the general inability of people to not care about being liked / admired, even in situations where it has no other downstream consequences.

Here’s another piece of evidence, maybe: I think some high-functioning sociopaths are (in many but not all respects) examples of what it looks like for a person to operate in the social world via pure learned strategy rather than innate social drives. How does their liking / admiration-seeking behavior compare to normal? My impression is: they are substantially more open-minded to forgoing liking / admiration than normal. In particular, there’s a strategy of “getting other people to pity me”. This strategy seems to be a good way to extract favors from people, and high-functioning sociopaths famously use this strategy way more than most people.[3] But this strategy seems to require a lack of liking / admiration-seeking—if you’re being pitied, then you’re not being liked / admired. So maybe that’s another bit of evidence that the pursuit of liking / admiration normally derives from an innate drive, not from within-lifetime learning of instrumentally-useful social strategies.

4.4.2 How might an innate “drive to be liked / admired” work?

If I’m right, then how does that innate drive work? Neuroscientific details would be way out of scope (and I don’t know them anyway). But in broad strokes, my proposal is:

If I like / admire Tom, and I have a thought wherein I imagine Tom to be liking / admiring me in turn, then that thought is positive valence, a.k.a. intrinsically motivating.

Spelling out the recipe in a bit more detail:

The even-more-detailed version would involve a mechanism that enables my brainstem to detect and react to transient empathetic simulations. In a post last year [LW · GW], I surmised that most human social innate drives, from schadenfreude to compassion, centrally involve transient empathetic simulations. But I didn’t have any good examples at the time. Well, the above “drive to be liked / admired” recipe is my first good example! Or so I hope—I still need to flesh it out into a more detailed model, like with nuts-and-bolts pseudocode along with how it’s implemented in neuroanatomy. (And then proving that hypothesis experimentally would be far harder still.)

Two more details:

First, there might be an adaptation mechanism—if you’re used to being strongly liked / admired, then thoughts of other people liking / admiring you gradually lose some or most of their positive valence. Instead you get positive valence for thoughts of other people liking / admiring you more than the baseline expectation.

Second, if I have a thought wherein I imagine Tom liking / admiring me, that thought doesn’t have to be consciously-endorsed, or plausible-upon-reflection. I think people can make decisions that turn their life upside-down, based on a feeling that their idols would be impressed by those decisions if they ever learned about them, when in fact that feeling is wildly divorced from what those idols would actually think.[4] Motivated reasoning (§3.3 [LW · GW]) is relevant here, as everywhere.
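
The adaptation mechanism in the first detail above can be sketched as a reward measured relative to a slowly moving baseline. This is only an illustration: the exponential-moving-average update, the class name `AdmirationDrive`, and the `rate` parameter are assumptions of the sketch, not claims about how the brain actually does it.

```python
class AdmirationDrive:
    """Toy model: reward is imagined admiration relative to a baseline.

    The baseline drifts toward whatever level of admiration I have been
    experiencing (an assumed exponential moving average).
    """

    def __init__(self, baseline=0.0, rate=0.1):
        self.baseline = baseline  # expected level of being liked / admired
        self.rate = rate          # how quickly the expectation adapts

    def reward(self, imagined_admiration):
        """Return the reward for one thought, then adapt the baseline."""
        surprise = imagined_admiration - self.baseline
        self.baseline += self.rate * surprise
        return surprise


drive = AdmirationDrive()
first = drive.reward(1.0)   # novel admiration: large reward
second = drive.reward(1.0)  # same admiration again: somewhat smaller reward
```

Under these assumptions, a constant level of admiration yields a reward that shrinks toward zero over time, and only admiration above the current expectation feels rewarding.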

4.4.3 Side note: Should we make AGIs with a “drive to be liked / admired”?

(If you don’t know what “AGI” stands for, see context here [? · GW], or maybe just skip this section, it’s irrelevant to the rest of this series.)

There’s a long history of otherwise-intelligent people proposing to build future powerful AGI agents with motivations and drives that would just really really obviously (from my perspective) make those AGIs behave in a dangerous and antisocial way—see here and here [LW · GW] for two of many examples.

I think the prospect of an AGI displaying the full suite of human status-seeking behaviors is likewise terrifying—see for example The Status Game by Will Storr (example excerpt here [LW · GW]) for a dark picture of the consequences of status.

On the other hand, a “drive to be liked / admired” is just one piece of status-seeking, and maybe by itself it’s not all bad?? In particular, it seems like it would be nice to know how to make AGIs that follow human norms [LW · GW], and I think the “drive to feel liked / admired” is a major part of why humans follow human norms (see §4.5 below). Hence, if we make brain-like AGI [? · GW], a drive to be liked / admired might be a piece of that puzzle towards making it safe and beneficial.

(Incidentally, LLMs are not brain-like, and insofar as they seem to follow human norms, they do it via a very different path, as discussed here [LW · GW].)

That’s just food for thought. I don’t have a strong opinion right now. I want to make much more progress in assembling a more complete list of human innate social drives, and understanding their consequences, and only then revisit which of those drives (if any) we would want to put into future AGIs.

4.5 Our tendency to pick careers, preferences, clothes, beliefs, etc. that seem “high-status”

I think there’s a general tendency wherein, if people that I like / admire are doing Thing X, then I’ll be tempted to do X too. This applies to choosing careers, clothes, beliefs, behaviors, slang, and so on, and also includes subconscious “mirroring”. Incidentally, we might start thinking of these careers, clothes, beliefs, etc. as “high-status” or “prestigious”.

I think there are two different, mutually-reinforcing paths that lead to this same behavior:

4.5.1 Path 1: (I like / admire Alice) & (Alice likes X) → (I like X) → (I try to do X)

This path does not involve the “drive to feel liked / admired” of §4.4 above.

In fact, I don’t think this path requires any specific innate social brain mechanisms beyond the general concepts that I’ve already discussed in this series. Instead, I think it’s just the same thing as the phenomenon of §2.5.1 [LW · GW]: if different concepts “go together”, then TD learning will tend to push their respective valences towards each other. Thus, if the thought of Alice tends to evoke highly positive valence, and I often think about how Alice is doing Thing X, then the valence that my brain assigns to Thing X is liable to go up as well. And then, naturally (§2.4.3 [LW · GW]), I’m going to want to do Thing X myself (or at least, I’ll think it’s a good thing to do in general, even if it’s not really a good fit for me personally).
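
Here is a toy sketch of Path 1. It is a deliberately simplified stand-in for TD learning, not a real TD implementation: each time two concepts are thought about together, their valences are nudged toward each other. The function name `co_think`, the symmetric update, and the `rate` value are assumptions made for illustration.

```python
def co_think(valences, a, b, rate=0.1):
    """Nudge the valences of two co-occurring concepts toward each other."""
    va, vb = valences[a], valences[b]
    valences[a] = va + rate * (vb - va)
    valences[b] = vb + rate * (va - vb)


# Hypothetical starting point: I like / admire Alice, and Thing X is neutral.
valences = {"Alice": 1.0, "Thing X": 0.0}

# I repeatedly think about how Alice is doing Thing X.
for _ in range(5):
    co_think(valences, "Alice", "Thing X")

# By now valences["Thing X"] has risen well above zero.
```

In this version the valence of “Alice” also drifts down a little, because the update is symmetric. An asymmetric rule, where a strongly held valence moves less, would be an equally reasonable sketch.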

4.5.2 Path 2: (Alice likes X) → (if I do X, then Alice will like / admire me more) → (I try to do X)

This path does involve the “drive to feel liked / admired” of §4.4 above. It’s kinda the mirror image of the previous path: In Path 1, positive valence bleeds over from a person I respect to their idiosyncratic fashion accessories. Whereas here in Path 2, positive valence will bleed over from already-trendy fashion accessories to me, in the eyes of the people whom I like / admire—or at least, that’s what I’m imagining / fantasizing.

4.6 Our tendency to want people we like / admire to “lead”—i.e., to afford them more “social status”

The section heading is a reference to the term “leading” as defined in Social status part 1/2: negotiations over object-level preferences (§1.2) [LW · GW], and to the term “social status” as defined in Social status part 2/2: everything else (§2.4) [LW · GW].

If I find myself with someone I greatly like / admire, I tend to defer to them in questions about what to do, where to go, etc. What’s happening in my brain, such that I do that? My answer is: it’s the same idea as the previous section.

Suppose Alice says “we should go to karaoke”. Bam, I have learned something important about Alice: she thinks karaoke is a good idea right now. So “Path 1” of the previous section says: my brain assigns positive valence to Alice, and then I think about how Alice likes karaoke right now, and so my brain increments the valence of karaoke-right-now (§2.5.1 [LW · GW]). And “Path 2” of the previous section says: I can expect that if Alice learns that I’m also enthusiastic about karaoke-right-now, then Alice’s brain will do the reverse thing, incrementing its valence for me—i.e., Alice will like / admire me marginally more, which in turn is strongly intrinsically motivating because of my “drive to be liked / admired” (§4.4 above).

Either way, the end result is that I’m trying to preemptively suss out Alice’s preferences and go along with them.

4.6.1 Side note: prestige versus dominance

In “dual strategies theory” (see Elephant in the Brain for a friendly introduction), there are two kinds of “status”, namely prestige and dominance. I think this is oversimplified, but pointing at something real. See my post Social status part 2/2: everything else [LW · GW] for extensive discussion.

Anyway, “liking / admiring” is centrally involved in “prestige”, whereas it has very little to do with “dominance”.

4.7 My self-esteem (i.e., the valence I assign to “myself”) is not the same as my tendency to be liked / admired. But it is strongly affected by that.

I have a self-concept too, and like all concepts, it has a valence—something like “how good or bad I feel about myself in general right now”. Let’s call that valence by the name “self-esteem”. Equivalently, this would be the extent to which I like / admire myself.

I claim that there’s a strong connection between self-esteem, and being liked / admired by other people (especially people whom you like / admire yourself). Here’s how I think that works:

As mentioned above (§4.5), we tend to settle into the same valence assignments as our friends and in-group. For example, if my friends and in-group think that Marvel movies are great, I’m liable to wind up feeling that way too, other things equal.

Well, by the exact same mechanism, if my friends and in-group (i.e., the people whom I like / admire) like / admire me in turn, then I’m liable to wind up liking / admiring myself as well, other things equal. And conversely, if the people whom I like / admire tend to dislike and scorn me, then I’m liable to wind up disliking and scorning myself.

I’m not 100% sure, and I can’t prove it, but I don’t think there’s any direct innate drive for self-esteem to be high. I think we care about self-esteem only for reasons that directly or indirectly route through other people, and especially through the “drive to be liked / admired”. I think low self-esteem is demotivating / aversive only because of its mental association (described above) with not being liked / admired, and conversely I think high self-esteem is motivating only because of its mental association with being liked / admired.

4.7.1 Connection to active self-concept formation, externalization of ego-dystonic tendencies, etc.

I think there’s also some connection between those ideas and self-concept formation. For example [LW · GW], a food snob might say “I love fine chocolate”, while a dieter might say “I have an urge to eat fine chocolate”. These two people are talking about the same kind of brain signal, but the food snob is treating that signal as ego-syntonic and “internalizing” it as a core part of themselves, whereas the dieter is treating that signal as ego-dystonic and “externalizing” it as an unwelcome intrusion from outside their core self.

I think drawing the boundaries of a self-concept is (partly) a choice, and like all choices, my brain (tautologically—see §1.5.3 [LW · GW]) makes the choice that has higher valence.[5]

As discussed in §2.5 [? · GW], valence assignments are determined to some extent by every innate drive, in conjunction with a lifetime of experience including culture. But I do think that a major factor in self-concept formation in most people stems from the “drive to be liked / admired”.

When that drive is the determining factor in self-concept decisions, then for reasons discussed above, we’re not only making decisions that maximize “drive to be liked / admired”, but we’re also making decisions that maximize self-esteem. In other words, we’ll conceptualize ourselves in a way that makes us think most highly of ourselves, which correlates with making other people we like / admire think highly of us.

Thus, socially-disapproved (by people we like / admire, such as our in-group) behaviors tend to get externalized as ego-dystonic intrusions, as opposed to part of “our true self”. Similarly, rationalizations are concocted and memories distorted as much as possible in a way that vibes with in-group social approval, via motivated reasoning (§3.3) [LW · GW]. I think that’s the big kernel of truth behind the Robert Trivers self-deception school of thought.

Confusingly, things like humility and sincerity are often socially approved, in which case the very process described in this subsection will be downplaying its own existence! This would happen via the same mechanisms mentioned above, like externalization, rationalization, and other sorts of motivated reasoning (§3.3) [LW · GW].

4.8 Conclusion

I still have some lingering uncertainties, but the basic connection between valence, liking / admiring, and (one aspect of) social status seems really obvious to me in hindsight—almost trivial. And thus I find it weird that I don’t recall ever seeing it in the literature, or really anywhere else. (Old Scott Alexander blog posts are closest.) Has anyone else? I’m very interested to hear your thoughts, ideas, references, counterexamples, and so on in the comments section.

The next post [LW · GW] will be the last of the series, discussing how I think valence signals might shed light on certain aspects of mental health and personality.

Thanks to Rafael Harth, Seth Herd, Aysja Johnson, Justis Mills, Charlie Steiner, Adele Lopez, and Garrett Baker for critical comments on earlier drafts.

  1. ^

    The previous version is here [LW · GW]. I wrote it in December 2023, centered around an idea (which I still think is right, and remains the core of this new version) that there’s a very important phenomenon associated with people assigning valence to other people, and that this phenomenon has something to do with social status and “prestige”. But I was pretty confused about social status and prestige, so the post wound up with a core good idea along with a bunch of stuff that I no longer endorse. Then in February 2024 I read a bunch more about social status and wrote up my take in the pair of posts Social status part 1/2: negotiations over object-level preferences [LW · GW] and Social status part 2/2: everything else [LW · GW]. Those posts were not mainly about valence, but part 2 referred back to my valence idea in a couple places, including (implicitly or explicitly) some corrections to what I had said before. So if you carefully read all three of those earlier posts, then you can probably figure out what I currently think about everything, and there’s not much new for you here in this post. But it occurred to me a few days ago that it’s annoying to make readers jump through hoops like that. New readers coming across the Valence series [? · GW] are entitled to read something that’s clean and self-contained and hopefully-mostly-correct. So I rewrote this post.

  2. ^

    In particular, there might be a “drive to feel feared” in parallel to my proposed “drive to feel liked / admired”. But if so, that’s out of scope for this post.

  3. ^

    Source: Martha Stout’s book: “After listening for almost twenty-five years to the stories my patients tell me about sociopaths who have invaded and injured their lives, when I am asked, “How can I tell whom not to trust?” the answer I give usually surprises people. The natural expectation is that I will describe some sinister-sounding detail of behavior or snippet of body language or threatening use of language that is the subtle giveaway. …None of those things is reliably present. Rather the best clue is, of all things, the pity play. …Pity from good people is carte blanche… Perhaps the most easily recognized example is the battered wife whose sociopathic husband beats her routinely and then sits at the kitchen table, head in his hands, moaning that he cannot control himself and that he is a poor wretch whom she must find it in her heart to forgive. There are countless other examples, a seemingly endless variety, some even more flagrant than the violent spouse and some almost subliminal.” Also, I’ve known two high-functioning sociopaths in my life (I think), and they were both very big into the “pity play”.

  4. ^

    More humdrum example: I happen to have a nerdy little kid, and sometimes he evidently has a very, very intense desire to tell me about the exciting thing that he did in Zelda. He begs me to listen. I can tell him a million times that I’m not gonna be impressed, and then I listen to what he has to say, and then immediately afterwards I could say “yup, I’m not impressed, I really don’t care, and I’m carrying a very heavy object up the stairs right now, can you please let me by?” And it wouldn’t put him off for a second! He’s just delighted to have shared his story, and he’ll do the same thing tomorrow with equal enthusiasm. I think he’s just typical-mind-fallacy [? · GW]-ing me really hard. I can tell him that I’m unimpressed, but deep in his subconscious, he doesn’t really believe me. His mental model of discovering the Zelda secret has such a high valence in his head, that when he does a transient empathetic simulation [LW · GW] of me thinking about that same discovery, he imagines my brain assigning it a super-high valence too, no matter how much I protest that my brain isn’t doing that. (It’s very cute, and I’m sure I did the same thing when I was a nerdy little kid!)

  5. ^

    I say “my brain makes the choice that has higher valence” rather than “I make the choice that has higher valence”, because the choice concerns what the definition of “I” is! It can be kinda mind-bending to think about. I’ll leave it there rather than getting into a long off-topic digression.


Comments sorted by top scores.

comment by Gunnar_Zarncke · 2024-06-11T10:03:55.562Z · LW(p) · GW(p)

If I like / admire Tom, and I have a thought wherein I imagine Tom to be liking / admiring me in turn, then that thought is positive valence, a.k.a. intrinsically motivating.

I wonder how you think the brain is going to reinforce such thoughts. One layer I get. It could be via the thought assessor that is responding to "presence" of others and the associated valence. But there is no nesting of such assessors in the brain, right?

comment by Steven Byrnes (steve2152) · 2024-06-11T13:26:29.577Z · LW(p) · GW(p)

I mean, I don’t really know—it’s something I’m still trying to pin down. But just as an example, let’s say the spatial attention as a “tell” story [LW · GW] is true (which I’m not at all confident in). Let me try.

So there are three relevant brainstem signals.

  • The brainstem has a “center of spatial attention” signal that zig-zags around the local environment
  • The brainstem has a “valence” signal that jumps up and down multiple times per second
  • The brainstem has an “is-a-person” signal that combines visual, sound, smell and other heuristics built into the genome. (And whenever the signal triggers, it simultaneously pulls spatial attention (and other types of attention) onto the (apparent) person.)

Going along with these,

  • Probably all three of these above signals get sent into the world-model as interoceptive sensory input (see §1.5.4 [LW · GW])
    • …and as a result we can imagine a valence, and (probably) imagine a center of spatial attention, and (probably) imagine an is-a-person feeling, separately from what the real current brainstem settings are.
  • Each of the three signals above is associated with a Thought Assessor
    • So there’s a “valence guess” (a.k.a. value function, see Appendix A [LW · GW]), and a “spatial attention guess” and a “is-a-person guess” Thought Assessor, each trained over a lifetime of experience via brainstem ground truth.

Now I’m hanging out in my bed staring at the ceiling, and it occurs to me that Zoe would be very proud of what I did this morning, and that thought is inherently motivating. How does that work? I think it’s actually not just one thought but two consecutive transient thoughts, i.e. following each other within a fraction of a second:

  • The first transient thought is not a transient empathetic simulation, but rather just that I’m thinking about Zoe from my own perspective.
  • The second transient thought is a transient empathetic simulation of something that I think is happening in Zoe’s mind, and that thought happens to be about me.

Then some cell group in the hypothalamus or brainstem would be looking for the following two tell-tale signs:

  • The first thought has “spatial-attention guess” away from my body, and the second thought has “spatial-attention guess” at my body.
  • Both thoughts have a high value of “is-a-person guess”.

When both of those are present, then it’s time for this little cell group to spring into action! And now what happens is:

  • The “valence guess” of the first thought is implicitly interpreted as “how much I like / admire [this other person]”
  • The “valence guess” of the second thought is implicitly interpreted as “how much I feel like [this other person] likes / admires me”
  • The cell group multiplies those two values together (well, maybe not literally multiplication, but it performs some function of two inputs that increases with each of its inputs when the other is positive) to get the “inherent value” of the second thought, and it feeds that value into the (ground-truth / brainstem / “override” [LW · GW]) valence signal.
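
As a rough sketch of that recipe (assuming the spatial-attention “tell” story is right, which again I’m not confident in), the cell group’s logic might look like the following. The type `TransientThought`, the function `liked_admired_reward`, and the `PERSON_THRESHOLD` value are hypothetical names and numbers invented for this sketch, and plain multiplication is only a placeholder for whatever two-input function is really used.

```python
from dataclasses import dataclass


@dataclass
class TransientThought:
    valence_guess: float          # output of the "valence guess" Thought Assessor
    attention_at_my_body: bool    # simplified "spatial-attention guess"
    is_a_person_guess: float      # assumed to lie in [0, 1]


PERSON_THRESHOLD = 0.5  # hypothetical cutoff for "high is-a-person guess"


def liked_admired_reward(first, second):
    """Inherent value of the second of two consecutive transient thoughts.

    Returns 0.0 unless both tell-tale signs are present.
    """
    attention_flipped = (not first.attention_at_my_body) and second.attention_at_my_body
    both_personlike = (
        first.is_a_person_guess > PERSON_THRESHOLD
        and second.is_a_person_guess > PERSON_THRESHOLD
    )
    if not (attention_flipped and both_personlike):
        return 0.0
    # first.valence_guess: how much I like / admire the other person.
    # second.valence_guess: how much I feel that they like / admire me.
    return first.valence_guess * second.valence_guess


thinking_about_zoe = TransientThought(0.8, False, 0.9)
zoe_thinking_about_me = TransientThought(0.5, True, 0.9)
value = liked_admired_reward(thinking_about_zoe, zoe_thinking_about_me)
```

One thing the sketch makes visible: a literal product is positive when both inputs are negative (I dislike someone who I imagine dislikes me), which is one place where plain multiplication might be the wrong choice of function.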

I haven’t thought this through in great detail and am interested in criticism.  :)

comment by Gunnar_Zarncke · 2024-06-12T10:31:26.007Z · LW(p) · GW(p)

If the story is right there should be a lot of interesting effects and potential experiments related to mirrors. After all, your attention on your image in the mirror is outside your body. Maybe narcissism results from too much attention on your mirror image? What is the effect of seeing yourself in a video feed all the time? Seeing yourself in photos?

comment by Steven Byrnes (steve2152) · 2024-06-12T18:52:42.412Z · LW(p) · GW(p)

Interesting thought, thanks!!

I think that after a lifetime of looking in the mirror, your brain has long ago stopped applying “is-a-person” to the image.† Like, I still “feel alone” if there’s a mirror in my room.

I can’t immediately find examples of adults (or articulate kids) seeing themselves in the mirror for the first time. That’s probably because reflections are hard to avoid: for example, you can see your reflection in still water, which is very common in the world. (There’s a famous-ish video from Papua New Guinea of people seeing a mirror for the first time but it’s apparently at least somewhat fake and maybe very fake.)

You can however buy a non-left-right-reversing mirror made from a pair of front-reflecting mirrors at a precise 90° angle. So your reflection doesn’t quite look like a normal mirror, nor does it move like a normal mirror (you move left and the reflection moves right). And people indeed seem to have strong feelings in reaction to seeing their reflection in this kind of mirror the first time—see e.g. NYT, collection of testimonials. I don’t think that really proves anything specific, but I guess it’s vaguely compatible with the kind of story I was imagining.


† How does that work? I.e., how do we learn from experience to not trigger “is-a-person” (much or at all) upon seeing ourselves in a mirror? Umm, one thing might be, certain sensory inputs cause startle / orienting reactions along with physiological arousal, but those reactions can be suppressed if we predict (“expect”) the inputs in complete detail, e.g. you can’t startle yourself by moving your arm in front of your eyes, and likewise a marionette being controlled by someone else offstage seems less “alive” than a marionette that you’re controlling with your own hand. Maybe the presence of unpredicted motion / sound is an especially important heuristic behind the is-a-person ground truth. If so, after enough time with a mirror you develop an excellent predictive model of what the image will do, such that there’s nothing left to trigger any amount of startle / orienting / salience. Also, people with schizophrenia can spend hours staring at themselves in the mirror, apparently, which would (in my view [LW · GW]) be an exception that proves the rule, related to their cortex’s failure to build an excellent predictive model of what their mirror image will do under different circumstances, so the startle and physiological arousal never goes away, and it remains mesmerizing.

I don’t think photos trigger the “is-a-person” detector appreciably—at least not for us western adults who are very used to them. Again, there’s a lack of self-generated motion and physiological arousal. I guess videos do trigger it to some extent, and people do indeed seem to enjoy watching people interact on TV for hours straight, which suggests it’s triggering something inherently motivating. And people also react socially to TV / movie characters, e.g. wanting revenge on the bad guy. As for seeing myself on video, I don’t do that often enough to have any opinion about that. … when I do, my strong feelings of embarrassment overwhelm everything else. :-P

Replies from: Gunnar_Zarncke
comment by Gunnar_Zarncke · 2024-06-12T22:47:04.506Z · LW(p) · GW(p)

Some more ideas: 

  • Show people videos of themselves from a while back, so they don't remember the details and can't predict well and see if that triggers admiration.
  • Let people imagine that the appreciation goes to their younger self, or a character they are playing, or to a person/memory, say their grandparents, they have "in their hearts".
  • Something something identical twins. 
comment by Gunnar_Zarncke · 2024-06-11T21:12:09.491Z · LW(p) · GW(p)

OK. That story does make a lot of sense to me. It doesn't require nesting, but just combining ("multiplying") two signals that are present at the same time. To check that I understand: That hypothetical cell group reinforces positive thoughts about another person focused on them co-occurring with positive thoughts about another person focused on oneself, correct?

That should be testable! The co-occurrence time can't be too long, so separating the parts of the thought sufficiently should make it less rewarding. That can plausibly be tested retrospectively as well as traditionally with neuroimaging. The centers of self and other attention seem to be known (MPFC/PCC vs. TPJ/STS). See Neural activity associated with self-reflection. It should also be possible to implement this in simulated agents.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2024-06-12T19:29:27.876Z · LW(p) · GW(p)

I dunno … I think much of human cognition involves rapid-fire sequences of several “thoughts” within a fraction of a second, and those “thoughts” all blend together on fMRI because the time-resolution of fMRI is bad. EEG & MEG have much better time resolution but much worse spatial resolution. ECoG is a bit better but rare. Etc. Then there’s a bunch of monkey experiments, and even more rodent experiments, but those are hard to interpret in this context (e.g. you can’t ask monkeys what they’re thinking about; instead scientists tend to make things up and I often wind up thinking they got it wrong). I guess it’s not entirely impossible that something useful could be gleaned from EEG/MEG experiments. I haven’t looked into that.

The real goldmine would be finding the relevant cluster of cells in (I’d guess) the hypothalamus, but that would be extraordinarily hard in humans. There’s almost no data on the human hypothalamus. Not too much in monkeys either (although I haven’t yet tried looking specifically, it’s on my to-do list), and rodents might not even have a “drive to feel liked / admired” in the first place.

Also, I’m not sure I agree with “The centers of self and other attention seem to be known (MPFC/PCC vs. TPJ/STS).” For example, your link in the next sentence talks about TPJ activation during “self-reflection and -perception” (they call it “inferior parietal lobule” rather than “TPJ” but IPL is a subset of TPJ, and it’s the subset which is relevant to this discussion, IMO). And conversely here’s a paper mentioning PCC in the context of “emotional empathy”.

Again, I do hope to spend some time thinking in more detail about the neuroscience implementation of the alleged “drive to feel liked / admired”, and probably write another post. Hopefully this would happen in the next couple months. So I appreciate the discussion, keep it coming :)

(I can’t follow your first paragraph enough to agree or disagree, sorry.)

Replies from: Gunnar_Zarncke
comment by Gunnar_Zarncke · 2024-06-12T22:36:26.246Z · LW(p) · GW(p)

OK, let's forget about the testability via neuroscience. But it could be testable by introspection. One could ask people how much admiration they feel in different scenarios, with the scenarios constructed so that the two thoughts are separated in time to different degrees. It should also be possible to construct thought sequences that are superficially not about admiration, but should also trigger the feeling.

Zoe would be very proud of what I did this morning.
I’m thinking about Zoe (is-person) from my own perspective (spatial attention far)
I think of something positive in Zoe’s mind (is-person) about me (spatial attention near)


Zoe was feeling good while I was watching her.

= same pattern (should also feel like admiration).


Zoe was proud, but I forgot about what it was. Ah yes, it was something I did this morning.

= same pattern, but more separated in time.

It is, of course, likely in the latter case that people construct the co-occurring thoughts, but the admiration should, on average, feel weaker.

comment by Seth Herd · 2024-06-10T19:22:02.949Z · LW(p) · GW(p)

Very nice. The explanation is definitely smoother here. This rings true to my knowledge of cognitive psychology, and my limited knowledge of clinical and social psychology.

The valence series is a pretty good attempt at an explanation of why people do everything they do, including choosing their beliefs. Which is obviously pretty important for a variety of purposes, alignment not least among them.

So I think you're underselling it, but I guess that's better than overselling it.

comment by Carl Feynman (carl-feynman) · 2024-06-10T17:29:12.138Z · LW(p) · GW(p)

Thanks for rewriting this. It’s clearer; I either didn’t understand the old posts or found them tautological.

You have examples of the effect of “liking/admiring”, so I have a pretty clear understanding of what you’re getting at. But you didn’t say anything about the causes. What are the business rules in the steering system by which it is applied?

Five minutes of thought suggests “increase the valence of people who demonstrate skill superior to the average” and “decrease the valence of those who hurt you” and “increase the valence of those who incur a cost to your benefit.” But it’s debatable whether those notions are simple enough to be business rules built into the hypothalamus.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2024-06-10T18:09:52.151Z · LW(p) · GW(p)

Thanks for the kind words!

What are the business rules in the steering system by which it is applied?

I’m not aware of anything special—just the same general idea as in §2.5 [LW · GW]. Valence transfers from concepts to other concepts when they immediately (within a fraction of a second) follow each other, and also, valence can come from innate drives.

I think all your examples are plausibly consistent with that kind of thing. For example:

  • If Beth has negative valence on getting punched (as is very common!), and then Alice punches Beth, then in Beth’s brain (as she replays the memory over and over), she’s thinking about Alice, and then a fraction of a second later she’s thinking about getting punched by Alice; and then the negative valence of the latter will transfer over to Alice by TD learning (as in §2.5 [LW · GW]).
  • If Beth has (for some reason) previously come to believe that skateboarding is really cool (positive valence), then Alice can get some liking / admiration by being a good skateboarder. The story would be: in Beth’s brain, she’s sometimes thinking about Alice and then a fraction of a second later she’s thinking about “skillful skateboarding”, and so some of the preexisting positive valence on good skateboarding will transfer over to Alice by TD learning (as in §2.5 [LW · GW]).
  • If Beth likes getting gifts, and Alice gives her a gift … etc.
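
As a rough illustration of the punching example, here is a toy sketch of valence transfer between two consecutive thoughts. It assumes a simplified TD(0)-style update, which is not necessarily the exact rule in §2.5; the function name `td_transfer`, the learning rate, the starting valences, and the number of replays are all hypothetical.

```python
def td_transfer(valence, first, second, learning_rate=0.1):
    """Nudge the valence of `first` toward the valence of `second`,
    modeling the case where `second` is thought a fraction of a second
    after `first`. Simplified TD(0)-style update (an assumption)."""
    valence[first] += learning_rate * (valence[second] - valence[first])


# Beth's starting valences (made-up numbers for illustration).
valence = {"Alice": 0.0, "getting punched by Alice": -1.0}

# Beth replays the memory over and over: first "Alice", then the punch.
for _ in range(20):
    td_transfer(valence, "Alice", "getting punched by Alice")

# Alice's valence has drifted negative; the punch's valence is unchanged.
print(valence)
```

The skateboarding and gift examples would work the same way with a positive valence on the second thought.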
Replies from: carl-feynman
comment by Carl Feynman (carl-feynman) · 2024-06-10T18:43:16.942Z · LW(p) · GW(p)

Quite right.

The valence-sharing mechanism accounts for the effectiveness of television advertising that makes no rational sense.  An ad shows happy laughing young beautiful people on a beach, enjoying Pedro’s Tortilla Chips, with no logical connection.  Show the ad 100 times and the positive valence of happy laughing young beautiful beach people transfers over to Pedro’s Tortilla Chips.  Then next time you’re in the store you reach for Pedro’s without knowing why.

Portions of the social purpose of the “like/admire” system could be replaced by rational negotiation. If we were soulless, we would note who is particularly good to affiliate with, and then agree with them that we will help each other as needed. But that assumes the existence of language, which is possibly no more than 50,000 years old. The like/admire system is much older than that: my dogs love me, and are loyal and obedient. Or, to be precise, they act out an emotion toward me that looks homologous to the human like/admire feeling.