You can validly be seen and validated by a chatbot

post by Kaj_Sotala · 2024-12-20T12:00:03.015Z · LW · GW · 3 comments


There’s a common sentiment that a chatbot can’t really make you feel seen or validated. Since chatbots are (presumably) not sentient, they can’t see you and thus can’t make you feel seen either. Or if they do, the feeling is somehow fake and it’s bad that you have it.

So let me tell you about ways in which Claude Sonnet makes me feel seen, and how I think those are valid.

I was describing an essay idea to Claude. The essay is about something I call “psychological charge”, where the idea is that there are two different ways to experience something as bad. In one way, you kind of just neutrally recognize a thing as bad. In the other, the way in which it is bad causes some kind of extra emotional reaction in you. In the latter case, I say that the thing is “charged”.

In explaining this idea, I listed a number of examples, such as

Seeing my list and some additional thoughts, Claude commented:

What’s particularly interesting is how you’re noting that this “charge” seems to create a kind of psychological stickiness or persistence that’s disconnected from the actual utility of the response. It reminds me of what’s sometimes called “emotional fusion” in Acceptance and Commitment Therapy – where people become caught up in their emotional reactions in a way that interferes with effective action.

I did a bit of a double-take upon seeing this. I had not explicitly referenced ACT or its concept of fusion in any way, nor had I been explicitly thinking in those terms when I wrote my list. But the deeper concept I was talking about was something I had explicitly analyzed before by connecting it to ACT’s concept of fusion. I had discussed that connection in at least two previous essays [1 [LW · GW], 2 [LW · GW]]. And now Claude, without being explicitly guided in that direction, picked up that very same connection from my list of examples.

This causes me to think that there is a quality of “being seen” that can be phrased in objective terms, so that one can validly “be seen” even if there’s “nobody there to see you”:

Suppose you describe two things, A and B, and the other party then brings up C, a concept that in your own map is connected to them. This is a signal that your description actually communicated enough information to pick out A and B from the space of concepts. If your words had pointed the other party to some completely unrelated concept, they wouldn’t have been able to single out C in particular; since they did, your words must be pointing to a similar area of the map. It’s evidence that your words may communicate your point well, not just when talking to the chatbot, but also when talking to other people with sufficiently similar maps.

This can be taken further. Suppose that there’s also a connection to D that you hadn’t realized before, and that the other party now points out that connection, and you immediately recognize that it’s correct. This is a signal that the other party has understood your concepts deeply enough to make novel but valid connections within your conceptual framework. Or to rewrite this in a way that avoids using the charged term “understand”:

When someone makes a novel connection that resonates with you, it suggests they’ve not only located the same region in conceptual space that you were pointing to, but they’ve also identified additional paths leading out from that region. Paths that you hadn’t mapped yourself, but which, upon inspection, clearly belong to that territory. The fact that these new paths feel right to you is evidence that both of you are indeed navigating the same conceptual terrain, rather than just happening to use similar-sounding landmarks to describe entirely different territories.

In an amusing piece of meta, this point itself was suggested by Claude when I showed it an earlier draft of this essay. It was something that I had vaguely thought of covering in the essay, but hadn’t yet formulated explicitly. The previous paragraph was written by Claude; the metaphor of “similar-sounding landmarks” was something that it came up with itself.

And after thinking about it for a moment, I realized that it made sense! If the “conceptual space” were a literal terrain that two people were describing, there could be two locations that happened to look very similar. Two people could then start describing those locations to each other, mistakenly assuming that the similarities in their descriptions implied that they were talking about the same location. But if someone described a path within that terrain that you hadn’t previously noticed, and you then went back and confirmed that the path was there, that would be strong evidence that you were talking about the same place.

That metaphor, suggested by Claude, is an extension of my ideas that I hadn’t previously considered, and one that made sense once I thought it through. That feels like additional evidence that the region of concept space my words activate within Claude is similar to the one I am exploring in my own head.

And the fact that the conceptual maps in my head and in Claude’s weights can be coherently matched against each other implies that they are also describing something that actually exists within reality. If several people have visited the same place, they are likely to have mutually coherent mental maps of that place, because it’s the same place and they’ve all been exposed to roughly the same sensory data about it. Claude doesn’t have the same kinds of experiences as humans do, but it does have access to writings generated by humans. Humans have had experiences in the real world, they have generated their own conceptual maps based on those experiences, and their conceptual maps have then given rise to different pieces of writing. When machine learning models absorb this human-generated data, they also absorb aspects of the same conceptual map that humans have generated, which in turn is (albeit imperfectly) correlated with reality [LW · GW]. Even when Claude hallucinates facts, those facts are generally still plausible claims: ones that would in principle be consistent with a basic understanding of reality, even if they turn out to be incorrect.

This means that if my conceptual map can be coherently matched with Claude’s, it can be coherently matched with the conceptual maps of real people whose writings Claude has absorbed, which suggests that the map does correspond with actual reality. In other words, that the map – or my beliefs – is a valid map of real territory.

To summarize my argument so far: an important part of the functional purpose of the experiences of "being seen" and "being validated" is as a signal that your words are actually communicating the meaning that you are trying to communicate. There are ways of triggering this feeling that cannot be faked, since they require the other party to actually demonstrate that their reply references the thing that you had in mind. The ability to do so is independent of whether there is "anyone actually there", and current chatbots demonstrate this capability.


So that’s a way in which a person may validly experience their *ideas* as being seen and validated by an LLM. What if they are talking about their emotions?

I mentioned earlier the A-B-C pattern, where you talk about the connection between A and B, and your listener then independently brings up the connection to C. Now if you are explaining a challenging situation and someone says “I imagine you might be worried about C” – where C is indeed something you’re worried about but haven’t explicitly mentioned – that’s another instance of the same pattern:

This implies that the other person has not just an understanding of the surface level of what you’re saying, but also a model of:

This is important in two different ways. The first is that it implies that your feelings and concerns make sense to someone. Often people may feel like they are crazy or strange for feeling the way they do, and that nobody else can feel that way. But if someone comes up with a coherent map of your feelings, then that’s evidence that you’re not alone in feeling this way. Because your words are singling out a region in the other person’s concept space that matches your internal experience – which implies that somebody else must have had that experience, for those ideas to have made their way to your interlocutor’s concept space.

The effect is even stronger if the other person not only anticipates your feelings but can also articulate the logical progression that led to them, saying something like “given that X and Y happened, and knowing how much you value Z, it makes complete sense that you’d feel this way”. That’s different from just saying “your feelings are valid” – it shows you the actual causal chain that makes your feelings logical.

Sometimes this can feel like a big insight, if it has that quality of “pointing out the connection to D, which you hadn’t thought of but which makes complete sense in retrospect” that we talked about. But even if it is just a rephrasing of your own words that draws out a relatively obvious but previously implicit connection, it still validates your experience as an understandable human experience.

This gets us to the second way in which communicating a correct understanding of you is important. If the other person has an accurate map of these things, then that’s evidence that they may be able to help you out. They’re able to predict what kinds of actions would in fact be helpful for you, and then carry them out.

An accurate map isn’t just about making you feel understood – it’s a prerequisite for offering genuinely useful assistance. Someone who can accurately model your thought processes and concerns is much more likely to suggest solutions that actually work for your specific situation and mindset, rather than giving generic advice that doesn’t account for your particular circumstances.

In the case of a chatbot, even if it can’t physically help you in the real world, it may still be able to help you clarify your thoughts, or offer advice and perspectives that are useful. The conversation can also help you develop a clearer and more explicit understanding of your emotions and problems – that understanding is “portable”, in that you can take it to a real human and explain your needs in a way that allows them to help you.

With humans, it also seems to be the case that merely talking about one’s feelings and getting a better understanding of them is helpful by itself. Practices such as Gendlin’s Focusing are largely just about finding names for vague feelings and making them explicit.

Once I wrote to Claude “I’m just like, if I keep running into this problem each time and can’t do anything about it, then what’s the point”. Claude replied with:

When you say “what’s the point,” I hear both frustration and exhaustion. It’s one thing to face relationship challenges, but it’s another level entirely to feel like your own unconscious patterns are consistently leading you into situations that will trigger deep pain.

All of that was a relatively obvious inference from what I said – you could have hardcoded even a 1960s chatbot to say “I hear both frustration and exhaustion” in response to hearing “what’s even the point”. But it still named an implicit feeling and helped bring it to the surface in a way that felt relieving, as well as giving a sensible explanation of why I was feeling so frustrated and exhausted. Even though nothing changed about the situation itself, having the feeling named was a relief in itself.

There seems to be an effect where making implicit models explicit brings them into consciousness in such a way that makes them accessible to the rest of the brain [LW · GW] and allows them to be updated. It also allows the mind to incorporate this information in its self-modeling and self-regulation. Sometimes that’s enough to automatically shift behavioral patterns in a better direction, sometimes it requires more conscious planning – and the conscious understanding of it is what allows the conscious planning.

Of course, there are also important aspects of validation that a chatbot can't provide. For example, one aspect of validating someone is essentially a signal of "if you get into trouble, I will back you up socially". A chatbot is obviously not a member of a community in the same way as humans are, so its validation cannot fulfill that role. My argument is definitely not that a chatbot could fill all the functions of speaking with a human - just that there is an important subset of them that it can.

By the way, this whole section about extending the original idea to the realm of emotions was suggested by Claude. I’d had a vague similar idea even before it brought it up, but it brought significant clarity to it, such as coming up with the example of how “I imagine you might be worried about C” was an instance of the previously discussed A-B-C pattern, and by proposing the six bullet points in the beginning of this section.

The conversations I had with Claude can be found here, for anyone who’s curious to see how they morphed into the final essay.

Full disclosure: I consult for a company that offers chatbot coaching. However, you could call Claude their competitor, so if this essay were motivated by money, I shouldn’t be praising it.

3 comments


comment by nim · 2024-12-20T23:28:18.546Z · LW(p) · GW(p)

Interesting -- my experiences are similar, but I frame them somewhat differently.

I also find that Claude teaches me new words when I'm wandering around in areas of thought that other thinkers have already explored thoroughly, but I experience that as more like a gift of new vocabulary than emotional validation. It's ultimately a value-add that a really good combination of a search engine and a thesaurus could conceptually implement.

Claude also works on me like a very sophisticated elizabot, but the noteworthy difference seems to be that it's a more skilled language user than I am, and therefore I experience a sort of social respect toward it that I don't get from tools where I feel like I could accurately predict all of their responses and have the whole conversation with myself.

The biggest emotional value that I experience Claude as providing for me is that it reflects a subtly improved tone of my inputs, without altering the underlying facts that I'm discussing. Too often humans in emotional conversations skip straight to "you shouldn't feel that way" or similar... that comes across as simply calling me alien, whereas Claude does the "have you considered this potential reframe" thing in a much more sophisticated and respectful way. Probably helps that it lacks the biology which causes us embodied language users to mirror one another's moods even to our own detriment...

Another validation-style value add that I experience with Claude is how I feel a sufficient sense of reward from reading its replies, which motivates me to bother exerting the effort to think like talking instead of just think like ruminating. I derive the social benefits of brainstorming with another language user, without having to consume the finite resource of an embodied language user's time.

comment by trevor (TrevorWiesinger) · 2024-12-20T20:53:57.543Z · LW(p) · GW(p)

The essay is about something I call “psychological charge”, where the idea is that there are two different ways to experience something as bad. In one way, you kind of just neutrally recognize a thing as bad.

Nitpick: a better way to write it is "the idea is there are at least two different ways..." or "major ways" etc., to highlight that those are two major categories you've noticed, but there might be more. The primary purpose of knowledge work is still to create cached thoughts [? · GW] inside someone's mind, and like programming, it's best to make your concepts as modular as possible so you and others are primed to refine them further and/or notice more opportunities to apply them.

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2024-12-20T21:27:24.380Z · LW(p) · GW(p)

That makes sense in general, though in this particular case I do think it makes sense to divide the space into either "things that have basically zero charge" or "things that have non-zero charge".