We Can Build Compassionate AI
post by Gordon Seidoh Worley (gworley) · 2025-02-25T16:37:06.160Z · LW · GW · 1 comment
This is a link post for https://uncertainupdates.substack.com/p/we-can-build-compassionate-ai
Compassion is, roughly speaking, caring for others and wanting the best for them.
Claim: We can build AI that are compassionate.
The above definition is insufficiently precise to construct an objective function for an RL training run that won't Goodhart, but it's good enough to argue that compassionate AI is possible.
We should not expect AI to be automatically compassionate, though, because most things in the world don't act compassionately. Plants and rocks and things just do what they are going to do with no regard for anything else, except insofar as those other things get in their way. This is roughly what AI is like now.
Animals have a little more capacity for compassion, with some animals, like mammals and birds, having a decent amount. That's because they evolved to cooperate, and while cooperation isn't compassion, it is a starting point for developing it.
Some animals are really good at cooperating, specifically at cooperating within a lifetime: they learn to cooperate in complex ways they weren't born knowing how to do. Our closest relatives, the great apes, show perhaps some of the greatest ability to cooperate by adapting to their conditions, and they, along with several other animals, show signs of compassion.
But sometimes compassionate animals are brutal. Compassion is something they can express, but it does not dominate their behavior. A chimp might show care and concern one minute, then murder a monkey the next.
We humans are much like our primate relatives when it comes to compassion. The anthropological record is pretty clear on that. The default state of humans is to live in small bands and to be at constant war. Yet somehow we now live in a world full of people that care about each other. How did we get here?
We can't know for certain, but it seems likely that horticulture led to agriculture, which led to increasingly larger numbers of people being forced to cooperate to survive. This kickstarted a 10,000-year process of exponentially increasing populations, and thus exponentially increasing cooperation. This increasing cooperation set the stage for increasing amounts of compassion.
Then, about 2,500 years ago, a few people realized that widespread cooperation was not enough to create the kind of world they wanted to live in—one where everyone is safe and flourishes. They realized they could get people to do more than cooperate: they could get them to care about each other beyond their own self-interest. And thus we saw, in the Axial Age, the rise of universal religions and philosophies that, among other things, taught compassion.
Now, I'm not going to claim that universal religions have been a purely good force in the world. But for the last couple of millennia, universal religions have been the main way we've trained people to be more compassionate. We've even created whole classes of professionally compassionate humans whose job is to demonstrate and teach compassion.
Religions have been able to teach compassion because we have general intelligence that can be applied to the problem of modeling other minds. It allows us to think about what it would be like to be another person, and then model actions that would be to their benefit. And since modeling what others want is definitionally necessary for compassion, general intelligence is a requirement for it. That apes and other animals lack the measure of general intelligence we humans have is why they are less capable of consistently acting compassionately.
Humans are, in my estimation, just barely able to be consistently compassionate, and most people fail to do it. If we were a little less generally intelligent, I don't think we would be as capable of consistent compassion, both on the whole and individually. But we are capable, and religions and movements have been founded by the rare people who are able to act compassionately all the time.
AGI will be more generally intelligent than us. It will be able to model our minds better than we can. It will thus have a greater capacity for compassion than we do, and will in theory be able to exhibit compassion more consistently, because it will be less subject to the limitations of mind that we humans face.
But general intelligence and modeling other minds are merely necessary conditions for compassion, not sufficient.
To create a compassionate mind, be it human or AI, that mind has to also want to be compassionate.
In humans, our innate care for self and kin is the seed of compassion. We can think of others whom we care about as if they were ourselves, and model them as if they were similar to us, and from this bootstrap towards caring about people and things more different from us, with an endpoint of seeing oneself as not separate from the whole of the world, and caring about all the world as if it were oneself.
It also helps that we humans are forced to cooperate to get what we want. We usually can't act selfishly and be happy, because we must rely on each other to get what we want. If care of kin pulls people towards compassion, the need to cooperate pushes them towards it. When we are forced to cooperate, we experience what it is like to be unable to get what we want without help, and sometimes we fail and experience dissatisfaction. This creates the experiential knowledge needed to empathize with the suffering of others.
But we cannot rely on the same forces to make AI compassionate! AGI will be much smarter than us. It will be more powerful. And it will have no innate feelings of kinship because it didn't evolve in an environment that would make it care about its kin that carry its non-existent genes.
But we can make AI that cares about all life. We know it should be possible because there are already people who care for all life. The only question is, how do we do it? How do we make an AGI that is not merely capable of compassion, but cares enough to act compassionately for the benefit of all beings?
Sadly, I don't have the answers. Some people are looking for them. I talked to some of them this past weekend at EA Global. I hope they succeed in time. Maybe you can help them!
1 comment
Comments sorted by top scores.
comment by Davey Morse (davey-morse) · 2025-02-25T23:19:25.571Z · LW(p) · GW(p)
I'm thinking along similar lines and appreciate your articulation.
"How do we make... [self-interested] AGI that cares enough to act compassionately for the benefit of all beings?" Or: under what conditions would compassion in self-interested AGI be selected for?
Not a concrete answer, but the end of this post gestures at one: https://www.lesswrong.com/posts/9f2nFkuv4PrrCyveJ/make-superintelligence-loving