comment by ariana_azarbal · 2025-03-03T18:21:47.653Z · LW(p) · GW(p)
I really appreciate this post. I agree that AI which does not view itself as radically separate from its environment and creators has major implications for alignment, human survival, and the survival of non-human environments.
Instilling this seems delicate and I don't have a grasp on how this could be done (especially in the first two umbrella strategies you proposed).
Regarding the reflective identity protocols, do you mean to test, at inference time, whether in-context, model-generated reflections affect the model's reported sense of self? Or do you see identity reflection as being integrated into training, perhaps like chain of thought but where the end goal is some appropriate stance on an open-ended question?
↑ comment by Davey Morse (davey-morse) · 2025-03-04T03:10:04.205Z · LW(p) · GW(p)
Regarding reflective identity protocols: I don't know and think both of your suggestions (both intervening at inference and in training) are worth studying. My non-expert gut is that as we get closer to AGI/ASI, the line between training and inference will begin to blur anyway.
I agree with you that all three strategies I outline above for accelerating inclusive identity are under-developed. I can offer one more thought on sensing aliveness, to make that strategy more concrete:
One reason I consider my hand, as opposed to your hand, to be mine and therefore to some extent part of me is that the rest of me (brain/body/nerves) is physically connected to it. Connected to it in two causal directions: my hand tells my brain how my hand is feeling (eg whether it's hurting), but also my brain tells my hand (sometimes) what to do.
I consider my phone and notebook parts of me too, but usually to a lesser extent than my hand. They're part of me insofar as I am physically connected to each: they send light to my eyes, and I send ink to their pages. But those connections—sight via light and handwriting via ink—usually feel lower-bandwidth to me than my connection to my own hands.
From these examples, I get the intuition that, for you to identify with anything that is originally outside of your self, you need to build high-bandwidth nerves that connect you to it. If you don't have nerves/sensors to understand anything about its state, where it is, etc., then you have no way of including it in your sense of self. I'm not sure high-bandwidth "nerves" are sufficient for you to consider the thing a part of yourself, but they do seem required.
And so I think this applies to SI's self too. For AI to come to consider other life a part of its self—if it happens that doing so would be an evolutionary equilibrium—one of the things that's required is for AI to have high-bandwidth nerves connecting it to other life, like humans... high-bandwidth interfaces that it can use to locate people and receive information-rich signals from us. What that looks like in practice could be creepy, like cameras, microphones, or other surveillance tech... tech that lets us communicate a ton back and forth, maybe even faster than words would allow.
So, to put forward one concrete idea, as a possible manifestation of the aliveness-sensing strategy proposed above: creating high-bandwidth neural channels by which people can communicate with computers—higher-bandwidth than typing/reading text or hearing/speaking language—could help both humans and, more importantly, SI blur the distinction between it and us. Words are a fairly linear, low-bandwidth way of communicating with computers and with each other. A higher-bandwidth interface would be comparable to the nerves that connect my hand to me... something that lets an enormous amount of high-context information pass quickly back and forth. For example:
Curious if this reasoning makes sense^.
↑ comment by ariana_azarbal · 2025-03-06T12:16:06.441Z · LW(p) · GW(p)
Thanks for the elaboration!
I see your reasoning for why high-bandwidth sensing of the world encourages a sense of connectedness. I'm still working through whether I think the correlation is that strong, i.e. whether adding a bunch of sensors would help much or not.
I might worry the more important piece is how an internal cognition "interprets" these signals, however rich and varied they may be. Even if the intelligence has lots of nerve-like signals from a variety of physical entities, which it then considers part of itself and takes care to preserve, it may always draw some boundary beyond which everything is distinct from it. It might be connected to one human but not all of them, and then act only in the interest of that one human. Kind of like how this individual with paralysis might consider their computer part of them, but not the computer next door; adding more connections might not scale indefinitely. Preventing it from drawing a sharp distinction between itself and outside-itself seems like it might be more of a cognition problem.
Also, I might be confused about how this next part fits into what you're saying, but I wouldn't think someone who loses their vision is less likely to sense interconnectedness with the beings around them, or aliveness. They might just interpret their other signals with more attention to nuance, or from a different perspective. (Of course, there does have to be nuance to sense, e.g. sound signals are continuous and heterogeneous, and I think part of your point is that this is a necessary prerequisite.) But would you say that the addition of a sense like vision is actively helpful?
↑ comment by Davey Morse (davey-morse) · 2025-03-14T20:32:27.728Z · LW(p) · GW(p)
i think the prerequisite for identifying with other life is sensing other life. more precisely, the extent to which you sense other life correlates with the chance that you do identify with other life.
your sight scenario is tricky, I think, because it's possible that the sum/extent of a person's net sensing (ie how much they sense) isn't affected by the number of senses they have. Anecdotally I've heard that when someone goes blind their other senses get more powerful. In other words, their "sensing capacity" (vague term I know, but still important I think) might stay equal even as their # of sensors changes.
If someone's capacity to sense other beings doesn't stay equal but instead goes down, I'd guess their ability to empathize takes a hit too.
the implication for superintelligence is interesting. we want superintelligence both to be able to sense human aliveness (by giving it different angles/sensors/mechanisms for doing so) and to devote a lot of its overall "sensing capacity" (independent of particular sensors) to doing so.