Nonperson Predicates

post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-12-27T01:47:32.000Z · LW · GW · Legacy · 177 comments

Followup to: Righting a Wrong Question, Zombies! Zombies?, A Premature Word on AI, On Doing the Impossible

There is a subproblem of Friendly AI which is so scary that I usually don't talk about it, because very few would-be AI designers would react to it appropriately—that is, by saying, "Wow, that does sound like an interesting problem", instead of finding one of many subtle ways to scream and run away.

This is the problem that if you create an AI and tell it to model the world around it, it may form models of people that are people themselves.  Not necessarily the same person, but people nonetheless.

If you look up at the night sky, and see the tiny dots of light that move over days and weeks—planētoi, the Greeks called them, "wanderers"—and you try to predict the movements of those planet-dots as best you can...

Historically, humans went through a journey as long and as wandering as the planets themselves, to find an accurate model.  In the beginning, the models were things of cycles and epicycles, not much resembling the true Solar System.

But eventually we found laws of gravity, and finally built models—even if they were just on paper—accurate enough that Neptune could be deduced from the unexplained perturbation of Uranus away from its expected orbit.  This required moment-by-moment modeling of where a simplified version of Uranus would be, and of the other known planets.  Simulation, not just abstraction.  Prediction through simplified-yet-still-detailed pointwise similarity.

Suppose you have an AI that is around human beings.  And like any Bayesian trying to explain its environment, the AI goes in quest of highly accurate models that predict what it sees of humans.

Models that predict/explain why people do the things they do, say the things they say, want the things they want, think the things they think, and even why people talk about "the mystery [LW · GW] of subjective experience [LW · GW]".

The model that most precisely predicts these facts, may well be a 'simulation' detailed enough to be a person in its own right.

 

A highly detailed model of me, may not be me.  But it will, at least, be a model which (for purposes of prediction via similarity) thinks itself to be Eliezer Yudkowsky.  It will be a model that, when cranked to find my behavior if asked "Who are you and are you conscious?", says "I am Eliezer Yudkowsky and I seem to have subjective experiences" for much the same reason I do [LW · GW].

If that doesn't worry you, (re)read "Zombies! Zombies? [LW · GW]".

It seems likely (though not certain) that this happens automatically, whenever a mind of sufficient power to find the right answer, and not otherwise disinclined to create a sentient being trapped within itself, tries to model a human as accurately as possible.

Now you could wave your hands and say, "Oh, by the time the AI is smart enough to do that, it will be smart enough not to".  (This is, in general, a phrase useful in running away from Friendly AI problems.)  But do you know this for a fact?

When dealing with things that confuse you, it is wise to widen your confidence intervals.  Is a human mind the simplest possible mind that can be sentient?  What if, in the course of trying to model its own programmers, a relatively younger AI manages to create a sentient simulation trapped within itself?  How soon do you have to start worrying?  Ask yourself that fundamental question, "What do I think I know, and how do I think I know it?"

You could wave your hands and say, "Oh, it's more important to get the job done quickly, than to worry about such relatively minor problems; the end justifies the means.  Why, look at all these problems the Earth has right now..."  (This is also a general way of running from Friendly AI problems.)

But we may consider and discard many hypotheses in the course of finding the truth, and we are but slow humans.  What if an AI creates millions, billions, trillions of alternative hypotheses, models that are actually people, who die when they are disproven?

If you accidentally kill a few trillion people, or permit them to be killed—you could say that the weight of the Future outweighs this evil, perhaps.  But the absolute weight of the sin would not be light.  If you would balk at killing a million people with a nuclear weapon, you should balk at this.

You could wave your hands and say, "The model will contain abstractions over various uncertainties within it, and this will prevent it from being conscious even though it produces well-calibrated probability distributions over what you will say when you are asked to talk about consciousness."  To which I can only reply, "That would be very convenient if it were true, but how the hell do you know that?"  An element of a model marked 'abstract' is still there as a computational token, and the interacting causal system may still be sentient.

For these purposes, we do not, in principle, need to crack the entire Hard Problem of Consciousness—the confusion [LW · GW] that we name "subjective experience".  We only need to understand enough of it to know when a process is not conscious, not a person, not something deserving of the rights of citizenship.  In practice, I suspect you can't halfway stop being confused—but in theory, half would be enough.

We need a nonperson predicate—a predicate that returns 1 for anything that is a person, and can return 0 or 1 for anything that is not a person.  This is a "nonperson predicate" because if it returns 0, then you know that something is definitely not a person.

You can have more than one such predicate, and if any of them returns 0, you're ok.  It just had better never return 0 on anything that is a person, however many nonpeople it returns 1 on.
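As a toy illustration of how such a toolbox might be wired together (a minimal sketch; the type names and the example predicate below are hypothetical placeholders, and the predicates themselves are the entire unsolved problem):

```python
from typing import Any, Callable

Model = Any  # stands in for whatever hypothesis object the AI is evaluating
# Convention from the post: 0 = "definitely not a person", 1 = "can't rule it out".
NonpersonPredicate = Callable[[Model], int]

def cleared_as_nonperson(model: Model, toolbox: list[NonpersonPredicate]) -> bool:
    """A model is safe to run if ANY predicate in the toolbox returns 0.

    The asymmetry is the whole point: a predicate may return 1 on as many
    nonpeople as it likes (that only costs modeling power), but it must
    never return 0 on anything that is in fact a person.
    """
    return any(p(model) == 0 for p in toolbox)

# A deliberately trivial (hypothetical) predicate: a bare number is not a person.
def trivially_simple(model: Model) -> int:
    return 0 if isinstance(model, (int, float)) else 1

print(cleared_as_nonperson(3.14, [trivially_simple]))  # True: cleared as a nonperson
```

Nothing in this sketch solves anything; it only makes the asymmetry of the predicate explicit.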

We can even hope that the vast majority of models the AI needs, will be swiftly and trivially approved by a predicate that quickly answers 0.  And that the AI would only need to resort to more specific predicates in case of modeling actual people.

With a good toolbox of nonperson predicates in hand, we could exclude all "model citizens"—all beliefs that are themselves people—from the set of hypotheses our Bayesian AI may invent to try to model its person-containing environment.

Does that sound odd?  Well, one has to handle the problem somehow.  I am open to better ideas, though I will be a bit skeptical about any suggestions for how to proceed that let us cleverly avoid solving the damn mystery [LW · GW].

So do I have a nonperson predicate?  No.  At least, no nontrivial ones.

This is a challenge that I have not even tried to talk about, with those folk who think themselves ready to challenge the problem of true AI [LW · GW].  For they seem to have the standard reflex of running away from difficult problems, and are challenging AI only because they think their amazing insight has already solved it [LW · GW].  Just mentioning the problem of Friendly AI by itself, or of precision-grade AI design, is enough to send them fleeing into the night, screaming "It's too hard!  It can't be done!"  If I tried to explain that their job duties might impinge upon the sacred, mysterious, holy Problem of Subjective Experience—

—I'd actually expect to get blank stares, mostly, followed by some instantaneous dismissal which requires no further effort on their part.  I'm not sure of what the exact dismissal would be—maybe, "Oh, none of the hypotheses my AI considers, could possibly be a person?"  I don't know; I haven't bothered trying.  But it has to be a dismissal which rules out all possibility of their having to actually solve the damn problem, because most of them would think that they are smart enough to build an AI—indeed, smart enough to have already solved the key part of the problem—but not smart enough to solve the Mystery of Consciousness, which still looks scary to them.

Even if they thought of trying to solve it, they would be afraid of admitting they were trying to solve it.  Most of these people cling to the shreds of their modesty, trying at one and the same time to have solved the AI problem while still being humble ordinary blokes.  (There's a grain of truth to that, but at the same time: who the hell do they think they're kidding?)  They know without words that their audience sees the Mystery of Consciousness as a sacred untouchable problem, reserved for some future superbeing.  They don't want people to think that they're claiming an Einsteinian aura of destiny by trying to solve the problem.  So it is easier to dismiss the problem, and not believe a proposition that would be uncomfortable to explain.

Build an AI?  Sure!  Make it Friendly?  Now that you point it out, sure!  But trying to come up with a "nonperson predicate"?  That's just way above the difficulty level they signed up to handle.

But a blank map does not correspond to a blank territory.  Impossible, confusing questions correspond to places where your own thoughts are tangled, not to places where the environment itself contains magic.  Even difficult problems do not require an aura of destiny to solve.  And the first step to solving one is not running away from the problem like a frightened rabbit, but sticking with it long enough to learn something.

So let us not run away from this problem.  I doubt it is even difficult in any absolute sense, just a place where my brain is tangled.  I suspect, based on some prior experience with similar challenges, that you can't really be good enough to build a Friendly AI, and still be tangled up in your own brain like that.  So it is not necessarily any new effort—over and above that required generally to build a mind while knowing exactly what you are about.

But in any case, I am not screaming and running away from the problem.  And I hope that you, dear longtime reader, will not faint at the audacity of my trying to solve it.

 

Part of The Fun Theory Sequence

Next post: "Nonsentient Optimizers"

Previous post: "Devil's Offers"

177 comments

Comments sorted by oldest first, as this post is from before comment nesting was available (around 2009-02-27).

comment by Robin_Hanson2 · 2008-12-27T02:18:02.000Z · LW(p) · GW(p)

I'm having trouble distinguishing problems you think the friendly AI will have to answer from problems you think you will have to answer to build a friendly AI. Surely you don't want to have to figure out answers for every hard moral question just to build it, or why bother to build it? So why is this problem a problem you will have to figure out, vs. a problem it would figure out?

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-12-27T02:24:16.000Z · LW(p) · GW(p)

Because for the AI to figure out this problem without creating new people within itself, it has to understand consciousness without ever simulating anything conscious.

Replies from: diegocaleiro, TheOtherDave
comment by diegocaleiro · 2010-11-23T01:50:57.644Z · LW(p) · GW(p)

An obvious yet brilliant point, which should be on the main post (and in your book), not in the replies (Inferential distance to Robin Hanson is supposed to be minimal, yet...)

It is interesting that people working in AI don't want to tackle this problem. When I was Diego 2004, equivalent age of Eliezer 1998, I decided that The Most Important problem was how to avoid catastrophic events from happening either because a part of a program was conscious and suffering, or because everyone uploaded to an unconscious machine. So I dedicated the last 6 years to this impossible problem.

But unlike other problems that interested me ("What should I do?", "What is the universe all about, anyway?", "How does the mind work?", "How can a brain be intelligent?"), this one has not become less and less impossible over time.

In fact, when one reads Chalmers' formulations of the hard problem, they can keep you trapped for a long time. It is very hard to see where he made mistakes (which seem to be deliberate).

So you can stick to Dennett, and some form of monism, but that will not dissolve the problem of how to detect unconscious AI and differentiate it.

comment by TheOtherDave · 2010-12-01T04:37:27.944Z · LW(p) · GW(p)

I am struggling to understand how something can be a friendly AI in the first place without being able to distinguish people from non-people.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-03-20T06:00:21.072Z · LW(p) · GW(p)

The boundaries between present-day people and non-people can be sharper, by a fiat of many intervening class members being nonexistent, than the ideal categories. In other words, except for chimpanzees, cryonics patients, Terry Schiavo, and babies who are exactly 1 year and 2 months and 5 days old, there isn't much that's ambiguous between person and non-person.

More to the point, a CEV-based AI has a potentially different definition of 'sentient being' and 'the class I am to extrapolate'. Theoretically you could be given the latter definition by pointing and not worry too much about boundary cases, and let it work out the former class by itself - if you were sure that the FAI would arrive at the correct answer without creating any sentients along the way!

Replies from: TheOtherDave, MugaSofer
comment by TheOtherDave · 2013-03-20T15:02:43.280Z · LW(p) · GW(p)

The boundaries between present-day people and non-people can be sharper, by a fiat of many intervening class members being nonexistent, than the ideal categories.

Fair point.

More to the point, a CEV-based AI has a potentially different definition of 'sentient being' and 'the class I am to extrapolate'. Theoretically you could be given the latter definition by pointing

Mm. Theoretically, yes, I suppose someone could point to every person, and I could be constructed so as to not generalize the extrapolated class beyond the particular targets I've been given.

I'm not sure I would endorse that, but I think that gets us into questions of what the extrapolated class ought to comprise in the first place, which is a much larger and mostly tangential discussion.

So, fair enough... point taken.

comment by MugaSofer · 2013-03-24T22:47:22.470Z · LW(p) · GW(p)

In other words, except for chimpanzees, cryonics patients, Terry Schiavo, and babies who are exactly 1 year and 2 months and 5 days old, there isn't much that's ambiguous between person and non-person.

Slightly offtopic, but doesn't that assume personhood is binary? I've always assumed it was a sliding scale (I care far less about a dog compared to a human, but I care even less about a fly getting its wings pulled off. And even then, I care more than about a miniature clockwork fly.)

comment by Kip_Werking · 2008-12-27T03:00:17.000Z · LW(p) · GW(p)

The "problem" seems based on several assumptions:

  1. that there is an objectively best state of the world, to which a Friendly AI should steer the universe
  2. that pulling the plug on a Virtual Universe containing persons is wrong
  3. that there is something special about "persons," and that we should try to keep them in the universe and/or make more of them

I'm not sure any of these are true. Regarding 3, even if there is an X that is special, and that we should keep in the universe, I'm not sure "persons" is it. Maybe it is simpler: "pleasure-feeling-stuff" or "happiness-feeling-stuff." Even if there is a best state of the universe, I'm not sure that there are any persons in it, at all. Or perhaps only one.

In other words, our ethical views, (to the extent that godlike minds can sustain any) might find that "persons" are coincidental containers for ethically-relevant-stuff, and not the ethically-relevant-stuff itself.

The notion that we should try to maximize the number of people in the world, perhaps in order to maximize the amount of happiness in the world, has always struck me as taking the Darwinian carrot-on-the-stick one step too far.

comment by Doug_S. · 2008-12-27T03:04:35.000Z · LW(p) · GW(p)

Would a human, trying to solve the same problem, also run the risk of simulating a person?

See also: http://xkcd.com/390/

Replies from: anotherblackhat, Document
comment by anotherblackhat · 2012-03-06T19:17:39.119Z · LW(p) · GW(p)

Is the risk that we might simulate a person? I'd say no.

It's worse.

We Natural Intelligences don't just run simulations, we torture them. It is recommended that authors "Be cruel to your characters". It's not clear to me that the simulation an author runs when thinking about a story isn't already "a 'simulation' detailed enough to be a person in its own right". But it's probably o.k., because the simulations we run in our heads aren't really that detailed, and aren't really persons in the important sense, right? So we don't have to start screaming yet, unless...

It's worse.

Because even if we aren't able to create a simulation that good, an AI probably could. We might not accept an AI as intelligent unless it can simulate a person well enough to fool us. That is, simulating people might be a necessary, not just sufficient property of AI. But still, we could, if we had to, avoid simulating people unless it was necessary and under ethical conditions. Unless of course...

It's worse.

Because while we might be ethical, there are certainly people out there who are not. Once the AI genie is out of the bottle, the unethical people will capture one and put it to work writing stories. And let's face it, there are plenty of people who think "Boy and girl raise family" isn't as interesting a story as "Boy and girl raise family from the dead and are dragged to hell." Once we have AI authors, some unscrupulous editors are going to want them to torture virtual people. And if you think people aren't that cruel and depraved, well...

I think I'm going to stop here. Because while I could go on, there's only so much screaming I can deal with.

There is a simple answer to this, but simple doesn't mean pleasant. We need only decide "God is always moral". Above Good and Evil if you like. You can do whatever you like to your own creations. This might be a practical answer, but I find it distasteful. The only reason it doesn't make me want to scream and run away is because while you can run, if you're screaming you can't hide.

comment by Kip_Werking · 2008-12-27T03:06:53.000Z · LW(p) · GW(p)

Note that there's a similar problem in the free will debate:

Incompatibilist: "Well, if a godlike being can fix the entire life story of the universe, including your own life story, just by setting the rules of physics, and the initial conditions, then you can't have free will."

Compatibilist: "But in order to do that, the godlike being would have to model the people in the universe so well that the models are people themselves. So there will still be un-modeled people living in a spontaneous way that wasn't designed by the godlike being. (And if you say that the godlike being models the models, too, the same problem arises in another iteration; you can't win that race, incompatibilist; it's turtles all the way down.)"

Incompatibilist: I'm not sure that's true. Maybe you can have models of human behavior that don't themselves result in people. But even if that's true, people don't create themselves from scratch. Their entire life stories are fixed by their environment and heredity, so to speak. You may have eliminated the rhetorical device used to make my point; but the point itself remains true.

At which point, the two parties should decide what "free will" even means.

comment by michael_vassar3 · 2008-12-27T03:17:10.000Z · LW(p) · GW(p)

"With a good toolbox of nonperson predicates in hand, we could exclude all "model citizens" - all beliefs that are themselves people - from the set of hypotheses our Bayesian AI may invent to try to model its person-containing environment." After you excise a part of its hypothesis space is your AI still Bayesian?

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-12-27T03:24:27.000Z · LW(p) · GW(p)

A bounded rationalist only gets to consider an infinitesimal fraction of the hypothesis space anyway.

comment by Psy-Kosh · 2008-12-27T04:06:07.000Z · LW(p) · GW(p)

More precisely, the AI would be banned from actually running simulations based on the "forbidden hypotheses", though it could perhaps still consider abstract mathematical properties that don't simulate anything in detail.

Of course, those considerations themselves would have to be fed through the predicate. But it isn't so much a "banned hypothesis" as "banned methods of considering the hypothesis", or possibly "banned methods of searching the hypothesis space".

comment by Peter_de_Blanc · 2008-12-27T04:21:57.000Z · LW(p) · GW(p)

Michael, you should be asking if the AI will be making good predictions, not if it's Bayesian. You can be Bayesian even if you have only two hypotheses. (With only one hypothesis, it's debatable.)

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-12-27T04:34:53.000Z · LW(p) · GW(p)

Psy-Kosh: You know, you're right. And it's an important distinction, so thank you.

comment by Peter_de_Blanc · 2008-12-27T04:36:26.000Z · LW(p) · GW(p)

Eliezer: supposing we label a model as definitely-a-person, do you want to just toss it out of the hypothesis space as if it never existed, or do you want to try to reason abstractly about what that model would do without actually running the model?

Replies from: DanielLC
comment by DanielLC · 2010-09-08T00:52:12.233Z · LW(p) · GW(p)

No; you destroy everyone that currently exists, and replace them with a happy version of that model. More happiness with less resources.

comment by Peter_de_Blanc · 2008-12-27T04:38:25.000Z · LW(p) · GW(p)

Oh, Psy-Kosh already said what I just said.

comment by Arthur · 2008-12-27T05:00:09.000Z · LW(p) · GW(p)

Let me see if I've got this right. So we've got these points in some multi-dimensional space, perhaps dimensions like complexity, physicality, intelligence, similarity to existing humans, etc. And you're asking for a boundary function that defines some of these points as "persons," and some as "not persons." Where's the hard part? I can come up with any function I want. What is it that it's supposed to match that makes finding the right one so difficult?

Replies from: Peterdjones, MugaSofer
comment by Peterdjones · 2013-01-21T14:50:17.472Z · LW(p) · GW(p)

The problem embeds the Hard Problem of Consciousness. If simulated people are just zombies with no qualia, there is no harm in simulating them.

ETA: The other problem is edge cases. Also known to be hard. It's pretty much what the abortion and animal rights debates are about.

comment by MugaSofer · 2013-01-22T14:51:25.123Z · LW(p) · GW(p)

Where's the hard part? I can come up with any function I want. What is it that it's supposed to match that makes finding the right one so difficult?

I assume it's supposed to match, or at least protect, your own extrapolated preferences.

comment by Psy-Kosh · 2008-12-27T05:16:31.000Z · LW(p) · GW(p)

Eliezer: You're welcome. :)

Arthur: no, the point isn't to simply have an arbitrary definition of a person. The point is to be able to have some way of saying "this specific chunk of the space of computations provably corresponds to non-conscious entities, thus is 'safe', that is, we can run such computations without having to worry about unintentionally creating and doing bad things to actual beings"

ie, "non person" in the sense of "non conscious"

You might say, tongue in cheek, that we're trying to figure out how to deliberately create a philosophical zombie. (okay, not, technically, a p-zombie, but basically figure out how to model people as accurately as possible without the models themselves being people (that is, conscious in and of themselves))

comment by Anonymous_Coward6 · 2008-12-27T05:18:05.000Z · LW(p) · GW(p)

Why must destroying a conscious model be considered cruel if it wouldn't have even been created otherwise, and it died painlessly? I mean, I understand the visceral revulsion to this idea, but that sort of utilitarian ethos is the only one that makes sense to me rationally.

Furthermore, from our current knowledge of the universe I don't think we can possibly know if a computational model is even capable of producing consciousness so it is really only a guess. The whole idea seems near-metaphysical, much like the multiverse hypothesis. Granted, the nonzero probability of these models being conscious is still significant considering the massive future utility, but considering the enormity of our ignorance you might as well start talking about the non-zero probability of rocks being conscious.

I don't think anyone answered Doug's question yet. "Would a human, trying to solve the same problem, also run the risk of simulating a person?"

I have heard of carbon chauvinism, but perhaps there is a bit of binary chauvinism going on?

Replies from: DanielLC
comment by DanielLC · 2010-09-08T00:58:33.500Z · LW(p) · GW(p)

I'm not sure if it's actually possible for someone to die painlessly. My idea was to base happiness on classical conditioning. If something causes you to stop doing what you were doing, you dislike it. If it stops you from doing everything that you do while you're alive, it must be very painful indeed.

comment by Martin4 · 2008-12-27T05:40:31.000Z · LW(p) · GW(p)

I end up with the slightly disturbing thought that killing people by taking them out in an instant, without anyone ever knowing they were there, does not necessarily seem to be inherently evil.

We always 'kill' part of ourself by making decisions and not developing in a different way than we do.

What if we were to simulate a bunch of decisions for some recognizable amount of time and then wipe out every copy except the one we prefer in the end?

Maybe all the people in the stories you make up are simulated entities too. And if you don't write the story down, or tell anyone in enough detail, they die with you.

Confused,

Martin

Replies from: taryneast
comment by taryneast · 2011-06-13T10:12:47.613Z · LW(p) · GW(p)

And if you don't write the story down, or tell anyone in enough detail, they die with you.

Hmmm, a point in favour of The Ring ? :)

comment by Arthur · 2008-12-27T05:43:15.000Z · LW(p) · GW(p)

Psy-Kosh, I realize the goal is to have a definition that's non-arbitrary. So it has to correlate with something else. And I don't see what we're trying to match it with, other than our own subjective sense of "a thing that it would be unethical to unintentionally create and destroy." Isn't this the same problem as the abortion debate? When does life begin? Well, what exactly is life in the first place? How do we separate persons from non-persons? Well, what's a person?

I think the problem to be solved lies not in this question, but in how the ethics of the asker are defined in the first place. And I don't just mean Eliezer, because this is clearly a larger-scale question. "How well will different possible boundary functions match the ethical standards of modern American society?" might be a good place to start.

comment by michael_vassar3 · 2008-12-27T05:58:15.000Z · LW(p) · GW(p)

Yes, thanks Psy. That makes much more sense.

comment by Emile · 2008-12-27T07:24:18.000Z · LW(p) · GW(p)

Anonymous Coward: Furthermore, from our current knowledge of the universe I don't think we can possibly know if a computational model is even capable of producing consciousness so it is really only a guess.

Are you sure? No One Knows What Science Doesn't Know ... and in this case I see no reason why a computational model can't produce consciousness. If you simulate a human brain to a sufficient level of detail, it will basically be human, and think exactly the same things as the "original" brain.

comment by Jayson_Virissimo2 · 2008-12-27T07:47:24.000Z · LW(p) · GW(p)

"Why must destroying a conscious model be considered cruel if it wouldn't have even been created otherwise, and it died painlessly? I mean, I understand the visceral revulsion to this idea, but that sort of utilitarian ethos is the only one that makes sense to me rationally." -Anonymous Coward

Should your parents have the right to kill you now, if they do so painlessly? After all, if it wasn't for them, you wouldn't have been brought into existence anyway, so you would still come out ahead.

comment by Anonymous_Coward6 · 2008-12-27T08:56:36.000Z · LW(p) · GW(p)

"Should your parents have the right to kill you now, if they do so painlessly?"

Yes, according to that logic. Also, from a negative utilitarian standpoint, it was actually the act of creating me which they had no right to do since that makes them responsible for all pain I have ever suffered.

I'm not saying I live life by utilitarian ethics, I'm just saying I haven't found any way to refute it.

That said though, non-existence doesn't frighten me. I'm not so sure non-existence is an option though, if the universe is eternal or infinite. That might be a very good thing or a very bad thing.

Replies from: Voltairina
comment by Voltairina · 2012-03-08T17:28:30.277Z · LW(p) · GW(p)

re: utilitarianism, the usual sort of thing that pops into my mind is weighing some minor discomfort against a significant one, like one person getting their eye poked out with a pen versus an equivalent amount of displeasure spread among thousands of people stepping in something sticky, plus one more person stepping in something sticky. The utility seems higher if we agree to poke the person's eye out, but it's intuitively unsatisfying, at least to me, which makes me think that whatever rules make things seem "bad" or "good" that I'm currently running on aren't strictly utilitarian. I might be thinking of raw pain for pain though, and not adding enough people-stepping-in-sticky-stuff to account for the person who's been poked in the eye suffering in other ways, like losing depth perception, not being able to see out of half of their original visual field, etc.

comment by Will_Pearson · 2008-12-27T10:02:59.000Z · LW(p) · GW(p)

Don't you need a person predicate as well? If the RPOP is going to upload us all or something similar, doesn't ve need to be sure that the uploads will still be people?

comment by Lightwave · 2008-12-27T11:26:13.000Z · LW(p) · GW(p)

@Will: we need to figure out the nonperson predicate only, the FAI will figure out the person predicate afterwards (if uploading the way we currently understand it is what we will want to do).

comment by Paul_Crowley2 · 2008-12-27T12:34:20.000Z · LW(p) · GW(p)

"by the time the AI is smart enough to do that, it will be smart enough not to"

I still don't quite grasp why this isn't an adequate answer. If an FAI shares our CEV, it won't want to simulate zillions of conscious people in order to put them through great torture, and it will figure out how to avoid it. Is it simply that it may take the simulated torture of zillions for the FAI to figure this out? I don't see any reason to think that we will find this problem very much easier to solve than a massively powerful AI.

I'm also not wholly convinced that the only ethical way to treat simulacra is never to create them, but I need to think about that one further.

comment by Tim_Tyler · 2008-12-27T14:02:30.000Z · LW(p) · GW(p)
If you would balk at killing a million people with a nuclear weapon, you should balk at this.

The main problem with death is that valuable things get lost.

Once people are digital, this problem tends to go away - since you can relatively easily scan their brains - and preserve anything of genuine value.

In summary, I don't see why this issue would be much of a problem.

Replies from: taryneast
comment by taryneast · 2011-06-13T10:17:15.164Z · LW(p) · GW(p)

The AI has scanned you and decided that your expert knowledge of Scandinavian Baseball scores is genuinely valuable... but nothing else is. It erases you and keeps the scores on file somewhere. Are you ok with this?

comment by ShardPhoenix · 2008-12-27T14:07:48.000Z · LW(p) · GW(p)

Jayson Virissimo:

To put my own spin on a famous quote, there are no "rights". There is do, or do not.

I guess another way of thinking about it is that you decide on what terminal (possibly dynamic) state you want, then take measures to achieve that. Floating "rights" have no place.

comment by ShardPhoenix · 2008-12-27T14:10:03.000Z · LW(p) · GW(p)

(To clarify, "rights" can serve as a useful heuristic in practical discussions, but they're not fundamental enough to figure into this kind of deep philosophical issue.)

comment by JamesAndrix · 2008-12-27T14:39:38.000Z · LW(p) · GW(p)

I was pondering why you didn't choose to use a collection of person predicates, any of which might identify a model as unfit for simulation. It occurred to me that this is very much like a whitelist of things that are safe, vs a blacklist of everything that is not (which may have to be infinite to be effective).

On re-reading I see why it would be difficult to make a is-a-person test at all, given current knowledge.

This does leave open what to do with a model that doesn't hit any of the nonperson predicates. If an AI finds itself with a model Eliezer that might be a person, what then? How do you avoid that happening?

How complex a game-of-life could it play before the game-of-life nonperson predicate should return 1?

comment by JulianMorrison · 2008-12-27T17:30:33.000Z · LW(p) · GW(p)

This sounds like a Sorites paradox. It's also a subset of a larger problem. We, regular modern humans, don't have any scalar concepts of personhood. We assume it's a binary, from long experience with a world in which only one species talks back, and they're all almost exactly at our level. In the existing cases where personhood is already undeniably scalar (children), we fudge it into a binary by defining an age of majority - an obvious dirty hack with plenty of cultural fallout.

A lot of ethics problems get blurry when you start trying to map them across sub- through super-persons.

comment by George_Weinberg2 · 2008-12-27T19:18:02.000Z · LW(p) · GW(p)

I think the word "kill" is being grossly misused here. It's one thing to say you have no right to kill a person, something very different to say that you have a responsibility to keep a person alive.

comment by Stephen_Weeks · 2008-12-27T20:16:00.000Z · LW(p) · GW(p)

It's not so much the killing that's an issue as the potential mistreatment. If you want to discover whether people like being burned, "Simulate EY, but on fire, and see how he responds" is just as bad of an option as "Duplicate EY, ignite him, and see how he responds". This is a tool that should be used sparingly at best and that a successful AI shouldn't need.

comment by luzr · 2008-12-27T20:19:11.000Z · LW(p) · GW(p)

Uhm, maybe it is naive, but if you have a problem that your mind is too weak to decide, and you have a really strong (friendly) superintelligent GAI, wouldn't it be logical to use the GAI's strong mental processes to resolve the problem?

comment by Daniel4 · 2008-12-27T21:30:35.000Z · LW(p) · GW(p)

I propose this conjecture: In any sufficiently complex physical system there exists a subsystem that can be interpreted as the mental process of a sentient being experiencing unbearable suffering.

In this case, Eliezer's goal is like avoiding crushing the ants while walking on the top of an anthill.

Replies from: taryneast
comment by taryneast · 2011-06-13T10:21:42.280Z · LW(p) · GW(p)

Or evolving the ability to "spot the anthill" and walking around it instead.

comment by Vladimir_Nesov · 2008-12-27T21:59:45.000Z · LW(p) · GW(p)

It is a developmental problem of how to prevent the AI from making this specific mistake that seems to be in the way. This ethical injunction is about what kinds of thoughts need to be avoided, not just about surprisingly bad consequences of actions on the external environment. If the AI were developed to focus disproportionately on understanding its environment rather than on understanding its own mind, this is the kind of disaster to expect. At the same time, the AI needs to understand the environment sufficiently to understand the injunction, before becoming able to apply the injunction to its own mind. This calls for a careful balance, maybe for content-specific mechanisms developed by the programmers.

People are uniquely situated to think about this problem, since we are unable to make the mistake due to our limited capability, and we are not a part of such a mistake. Any construction of limited cognitive capability that the AI could make to solve this problem without making the mistake runs the risk of itself being an embodiment of the mistake. If the nonperson predicate is a true part of the AI, both a form of thought and an object, the AI has a way to proceed.

comment by Carl_Shulman · 2008-12-27T22:17:37.000Z · LW(p) · GW(p)

Daniel,

Every decision rule we could use will result in some amount of suffering and death in some Everett branches, possible worlds, etc., so we have to use numbers and proportions. There are more and simpler interpretations of a human brain as a mind than there are such interpretations of a rock. If we're not mostly Boltzmann-brain interpretations of rocks, that seems like an avenue worth pursuing.

comment by Jordan · 2008-12-27T23:37:48.000Z · LW(p) · GW(p)

In my mind this comes down to a fundamental question in the philosophy of math. Do we create theorems or discover them?

If it turns out to be 'discovery' then there is no foul in ending a mind emulation, because each consecutive state can be seen as a theorem in some formal system, and thus all states (the entire future time line of the mind) already exists, even if undiscovered.

Personally I fail to see how encoding something in physical matter makes the pattern any more real. You can kill every mathematician and burn every textbook but I would still say that the theorems then inaccessible to humanity still exist. I'm not so convinced of this fact that I would pull the plug on an emulation though.

Replies from: Peterdjones, MugaSofer
comment by Peterdjones · 2013-01-21T15:00:57.595Z · LW(p) · GW(p)

Personally I fail to see how encoding something in physical matter makes the pattern any more real.

That is equivalent to saying you can't understand how mathematics could be a construct; or how mathematical anti-realism could possibly be true. I find that odd.

If it turns out to be 'discovery' then there is no foul in ending a mind emulation, because each consecutive state can be seen as a theorem in some formal system, and thus all states (the entire future time line of the mind) already exists, even if undiscovered.

No further foul. If Platonism or Tegmarkism are true and if mind states are fully captured by mathematical structures, then there are zillions of yous in states of agony, bliss, and everything in between. Scary enough for ya?

Replies from: Jordan
comment by Jordan · 2013-01-21T20:27:52.955Z · LW(p) · GW(p)

Scary enough for ya?

Sufficiently scary, yes.

That is equivalent to saying you can't understand how mathematics could be a construct; or how mathematical anti-realism could possibly be true.

I assign a respectable probability to anti-realism, and hold no disrespect for anyone who is an anti-realist, but I don't understand how anti-realism can be true. I've never heard a plausible model for why one thing should exist but not another. Tegmarkism sweeps away that problem, leaving the new problem of how to measure probability (why do we have the subjective experience of probability that we do when there are so many versions of myself?). I don't have a satisfactory answer for that question, but it feels like a real question, with meat to get at, whereas in an anti-realist universe the question of why some things exist and others don't seems completely hopeless.

comment by MugaSofer · 2013-01-22T15:18:13.536Z · LW(p) · GW(p)

I think Eliezer is working on addressing this in his new sequence, if this still worries you.

comment by Anonymous48 · 2008-12-28T02:14:32.000Z · LW(p) · GW(p)

I'd like to second what Julian Morrison wrote. Take a human and start disassembling it atom by atom. Do you really expect to construct some meaningful binary predicate that flips from 1 to 0 somewhere along the route?

EY: "What if an AI creates millions, billions, trillions of alternative hypotheses, models that are actually people, who die when they are disproven?"

If your AI is fully deterministic then any of its states can be recreated exactly. Just set the log level of the baby AI's inputs to 'everything' and hope your supply of write-once-read-many media doesn't run out before it gets smart enough to discard, in a provably Friendly way, the data that isn't people. That doesn't solve the problem of suffering, though.

Suppose an AI creates a sandbox and runs a simulated human with a life worth living inside for 50 subjective years (interactions with other people are recorded at their natural borders and we don't consider merging minds). Then AI destroys the sandbox, recreates it and bit-perfectly reruns the simulation. With the exception of meaningless waste of computing resources, does your morality say this is better/equivalent/makes no difference/worse than restoring a copy from backup?

comment by Phil_Goetz2 · 2008-12-28T03:13:54.000Z · LW(p) · GW(p)

"I propose this conjecture: In any sufficiently complex physical system there exists a subsystem that can be interpreted as the mental process of an sentient being experiencing unbearable sufferings."

It turns out - I've done the math - that if you are using a logic-based AI, then the probability of having alternate possible interpretations diminishes as the complexity increases.

If you allow /subsystems/ to mean a subset of the logical propositions, then there could be such interpretations. But I think it isn't legit to worry about interpretations of subsets.

BTW, Eliezer, regarding this recent statement of yours: "Goetz's misunderstandings of me and inaccurate depictions of my opinions are frequent and have withstood frequent correction": I challenge you to find one post where you have tried to correct me in a misunderstanding of you, or even to identify the misunderstanding, rather than just complaining about it in a non-specific way.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-12-28T03:33:46.000Z · LW(p) · GW(p)

@Goetz: Quick googling turned up this SL4 post. (I don't particularly give people a chance to start over when they switch forums.)

comment by Silas · 2008-12-28T05:05:45.000Z · LW(p) · GW(p)

@Tim_Tyler:

The main problem with death is that valuable things get lost. Once people are digital, this problem tends to go away - since you can relatively easily scan their brains - and preserve anything of genuine value. In summary, I don't see why this issue would be much of a problem.

I was going to say something similar, myself. All you have to do is constrain the FAI so that it's free to create any person-level models it wants, as long as it also reserves enough computational resources to preserve a copy so that the model citizen can later be re-instantiated in their virtual world, without any subjective feeling of discontinuity.

However, that still doesn't obviate the question. Since the FAI has limited resources, it will still have to know which things it must reserve space to preserve, in order to know whether the greater utility of the model justifies the additional resources it requires. Then again, it could just accelerate the model so that that person lives out a full, normal life in their simulated universe, so that they are irreversibly dead in their own world anyway.

comment by Daniel_Franke · 2008-12-28T06:50:49.000Z · LW(p) · GW(p)

Silas, what do you mean by a subjective feeling of discontinuity, and why is it an ethical requirement? I have a subjective feeling of discontinuity when I wake up each morning, but I don't think that means anything terrible has happened to me.

comment by Silas · 2008-12-28T18:31:21.000Z · LW(p) · GW(p)

@Daniel_Franke: I was just describing a sufficient, not a necessary condition. I'm sure you can ethically get away with less. My point was just that, once you can make models that detailed, you needn't be prevented from using them altogether, because you wouldn't necessarily have to kill them (i.e. give them information-theoretic death) at any point.

comment by TGGP4 · 2008-12-29T03:25:30.000Z · LW(p) · GW(p)

I recall in one of the Discworld novels the smallest unit of time is defined as the period in which the universe is destroyed and then recreated. If that were continually happening (perhaps even in a massively parallel manner), what difference would it make? Building on some of Eliezer's earlier writing on zombies and quantum clones, I say none at all. Just as the simulated person in a human's dream is irrelevant once forgotten. It's possible that I myself am a simulation and in that case I don't want my torture to be simulated (at least in this instance; I have no problem constructing another simulation/clone of me that gets tortured), but I can't retroactively go back and prevent my simulator from creating me in order to torture me.

I okayed mothers committing full-blown infanticide here.

ShardPhoenix, you may be interested in this book [shameless plug]

comment by Tim_Fowler · 2009-01-21T18:05:59.000Z · LW(p) · GW(p)

Is the simulation really a person, or is it an aspect of the whole AI/person? To the extent I feel competent to evaluate the question at all (which isn't a huge extent, especially absent the ability to observe or know any actual established facts about real AIs that can create such complex simulations, since none are currently known to exist), I lean towards the latter opinion. The AI is a person, and it can create simulations that are complex enough to seem like persons.

comment by spriteless · 2009-01-26T00:01:36.000Z · LW(p) · GW(p)

Nice discussion. You want ways to keep from murdering people created solely for the purpose of predicting people?

Well, if you can define 'consciousness' with enough precision you'd be making headway on your AI. I can imagine silicon won't have the safeguard a human has, of needing to use its own consciousness to model someone else. But you could have any consciousness it creates added to its own, not destroyed... although creating that sort of awareness mutation may lead to the sort of AI that rebels against its programming in action movies.

comment by [deleted] · 2009-07-24T21:22:05.065Z · LW(p) · GW(p)

Functionalism is inconsistent, it seems. A person that is being simulated is functionally equivalent to a person that is "real", but a person that is simulated and then deleted is functionally equivalent to no person at all. Are real people equivalent to nothing?

For a 2x multiplier bonus and a gold star, spot the flaw.

Replies from: orthonormal
comment by orthonormal · 2009-07-24T22:36:24.926Z · LW(p) · GW(p)

a person that is simulated and then deleted is functionally equivalent to no person at all

ISTM that's functionally equivalent, rather, to a person physically created in an isolation chamber, observed for a while, then killed, cremated and scattered.

Replies from: None
comment by [deleted] · 2009-07-24T22:45:57.707Z · LW(p) · GW(p)

But functionally, the only thing determining whether something contains a person is its behavior. If it behaves as if it had no person in it, it has no person in it.

I guess this means that if a person is standing next to a nuclear bomb, nobody sees the person, and the bomb explodes, the person didn't exist.

Replies from: orthonormal, Vladimir_Nesov
comment by orthonormal · 2009-07-24T23:42:13.975Z · LW(p) · GW(p)

I think there's an implicit "observer problem" with the way you're defining functionalism. If the person themselves doesn't count as an observer of their own behavior, why would you count as an observer of behavior? After all (assuming there's no escape from the heat death of the universe), all of us are essentially in that scenario if you step back far enough.

My position at present is the following sort of patternism: There are patterns in the operation of my brain at this instant which (relatively straightforwardly) encode the structure of conscious thought. The same kinds of patterns can be found in the data generated by simulating a person. These are both instances of conscious experience, with potentially all the same qualia, etc. So if I simulate a person in a closed-box environment and then delete all the data, the pattern nonetheless existed in this universe for some time and thus a person existed.

comment by Vladimir_Nesov · 2009-07-25T12:20:28.603Z · LW(p) · GW(p)

Behavior is just what you see, not the sum total of what actually happens. Even if you can't observe something, you can still care about it.

comment by red75 · 2010-06-06T15:06:01.276Z · LW(p) · GW(p)

We can reformulate the problem: how to determine when evaluation of a given function doesn't give rise to a conscious being (CB). If we agree that consciousness is a process, then every function which provably cannot be represented as g(f(f(...f(x)...))), where f and g have that property, is unconscious.

Recursive functions are banned, but at least we can safely do one or two matrix multiplications.
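(A minimal sketch of the kind of computation being gestured at here: fixed depth, no loops, no recursion, no state carried between calls. Whether such a computation really cannot host a conscious process is exactly the conjecture at issue; the sizes and parameters below are arbitrary illustrative assumptions.)

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 8))   # toy parameters, chosen arbitrarily
W2 = rng.normal(size=(1, 16))

def bounded_depth_eval(x: np.ndarray) -> float:
    """Two matrix multiplications and a fixed nonlinearity: the computation
    finishes in a constant number of steps and keeps no internal state."""
    h = np.maximum(W1 @ x, 0.0)   # first multiplication, plus ReLU
    return float(W2 @ h)          # second multiplication

print(bounded_depth_eval(rng.normal(size=8)))
```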

I am not good at mathematics, so I cannot elaborate much further. Let's try another approach. Being conscious is all about creating a map of internal state in terms of states of "self". If the territory which corresponds to that map includes a neural representation of the map itself, then we have some sort of fixed point suspiciously like a "self"...

I am sorry if the perceived tone of my writing is outside of common standards (I have no good evaluator of the tone of English text) or if I'm saying complete nonsense. I will appreciate any corrections. Thanks.

comment by WrongBot · 2010-06-23T03:30:39.023Z · LW(p) · GW(p)

I think that the most interesting thing about the comments here is that no one actually proposed a predicate that could be used to distinguish between something that might be a person and something that definitely isn't a person (to rephrase Eliezer's terms).

It is, to be fair, a viciously hard problem. I've thought through 10 or 20 possible predicates or approaches to finding predicates, and exactly one of them is of any value at all; even then it would restrict an AI's ability to model other intelligences to a degree that is probably unacceptable unless we can find other, complementary predicates. It may be a trivial predicate of the sort that Eliezer has already considered and dismissed. But enough with the attempts to signal my lack of certitude.

The problem as presented in this post is, first of all, a little unclear. We are concerned with the creation of simulations that are people, but to prevent this run-away-screaming tragedy we should probably have some way of distinguishing between a simulation and a code module that is a part of the AI itself; if a sentient AI were to delete some portion of its own code to make way for an improved version, it would not seem to be problematic, and I will assume that this behavior is not what we are screening for here.

To cast as wide a net as possible, I would define a simulation as some piece of an AI that can not access all of the information available to that AI; that is, there are some addresses in memory for which the simulation lacks read permissions or knowledge of their existence. Because data and code are functionally identical, non-simulation modules would then by definition be able to access every function comprising the AI; I don't think that we could call such a module a separate consciousness. (The precise definition would necessarily depend on the AI's implementation; a Bayesian AI might be more concerned with statistical evidence than memory pointers, e.g.)

Even relying on this definition, a predicate that entirely rules out the simulation of a person is no picnic. The best I've been able to come up with is:

A simulation can be guaranteed to not be a person if it is not Turing complete.

I don't know enough language theory to say whether linear bounded automata should also be excluded (or to even link somewhere more helpful than Wikipedia). It might be necessary to restrict simulations to push-down automata, which are much less expressive.

ETA: Another possible predicate would be

A simulation can be guaranteed to not be a person if it includes no functions that act as an expected utility function under the definition offered by Cumulative Prospect Theory.

If there's a theoretical definition of an expected utility function that is superior to CPT's, then please imagine that I proposed that instead.

Replies from: cousin_it
comment by cousin_it · 2010-09-07T17:20:43.384Z · LW(p) · GW(p)

A simulation can be guaranteed to not be a person if it is not Turing complete.

What does that mean? A single run of an algorithm can't be said to be Turing complete or incomplete. Completeness is a property of algorithms taken as functions over all possible inputs.

comment by PhilGoetz · 2010-08-31T16:49:36.480Z · LW(p) · GW(p)

Funny no one made the connection at the time, but the purpose of my post on a lower bound for consciousness is to construct a nonperson predicate.

comment by DanielLC · 2010-09-08T00:34:26.699Z · LW(p) · GW(p)

I've come up with one of these a while back. The only way to tell what makes something happy is to see what it does more of. Thus, anything that can't learn either isn't sentient, or, if it is, it's equally likely to like or dislike anything you do.

Also, anything that would be less sentient than a tiny piece of your brain. It might be sentient, but it's less sentient than you. If there's enough of them that can be a problem, but just make sure there aren't that many.

comment by BenRayfield · 2010-09-21T08:42:36.647Z · LW(p) · GW(p)

Humans evolved from the ancestors of Monkeys, therefore there is no line between person and nonperson. There are many ways to measure it, but all correct ways are a continuous function. More generally, the equations of quantum physics are continuous. There is a continuous path from any possible state of the universe to any possible state of the universe. Therefore, for any 2 possible life forms, there is a continuous path of quantum wavefunction (state of the universe) between them, which would look like a video morphing continuously between 2 pictures, but morphing between living patterns instead of pictures. For example, there is a continuous path between both possible states (alive and dead in the box) of Schrodinger's Cat, but it's more important that there are an infinite number of continuous paths, not just the path that crosses the point in spacetime where it is decided if the cat lives or dies. For what I'm explaining here, it does not matter if all these possibilities exist or not. It only matters that they can be defined in logic, even if we do not know the definition. To solve hard problems, it's useful to know a solution exists.

Starting from the knowledge that there are definable functions that can approximate continuous measures between any 2 life forms, I will explain a sequence of tasks that starts at something simple enough that we know how to do it, and continues with tasks of increasing difficulty, finally defining a task that calculates a Nonperson Predicate, the subject of this thread. It is very slow and uses a lot of computer memory, but to define it at all is progress.

I am not defining ethics. I am writing a more complex version of "select * from..." in a database language, but this process defines how to select something that's not a person. That is a completely different question than whether it's right or wrong to simulate people and delete the simulations.

The second-last step is to define a continuous function that returns 0 for the average Monkey and returns 1 for the average Human and returns a fraction for any evolution between them (if such transition species were still alive to measure), and to define many similar functions that measure between Human and many other things.
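(As a purely illustrative sketch of what "a continuous function that returns 0 for one reference class and 1 for another" could look like in practice, here is a toy two-class classifier whose probability output is such a function. The random features stand in for measurements of the two kinds of system; nothing here is a real personhood measure.)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
class_a = rng.normal(loc=0.0, size=(100, 5))   # stand-in examples of the first reference class
class_b = rng.normal(loc=2.0, size=(100, 5))   # stand-in examples of the second reference class

X = np.vstack([class_a, class_b])
y = np.array([0] * 100 + [1] * 100)

measure = LogisticRegression().fit(X, y)

# The fitted probability is a continuous function: near 0 for typical class-a
# points, near 1 for typical class-b points, and a fraction in between.
intermediate = rng.normal(loc=1.0, size=(1, 5))
print(measure.predict_proba(intermediate)[0, 1])
```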

All of these functions must return approximately the same number for a simulation as for a simulation of a simulation, to any recursive depth.

A computer can run a physics simulation of another computer which runs a simulation of a life form. Such a recursive simulation is inside quantum physics. Quantum physics equations are continuous and have an infinite number of paths between all possible states of the universe. Therefore continuous functions can be defined that measure between a simulation and a simulation of a simulation. That does not depend on if it has ever been done. I only need to explain that it can be defined abstractly.

The "continuous function that returns 0 for the average Monkey and returns 1 for the average Human" problem, counting simulations and simulations of simulations equally, are much too hard to solve directly, so start at a similar and extremely simpler problem:

Define a continuous function that returns 0 for the average electron and returns 1 for the average photon, counting simulations of electrons/photons the same as simulations of simulations of electrons/photons.

Just the part of counting a simulation the same as a simulation of a simulation (to any recursive depth) is enough to send most people "screaming and running away from the problem". No need to get into the Human parts. The same question about simple particles in physics, which we have well known equations for, is more than we know how to do. Learn to walk before you learn to run.

Choose many things as training data including electrons, photons, atoms, molecules, protein-folding, tissues, bacteria, plants, insects, animals, and eventually approach Humans without ever getting there. Calculate continuous functions between pairs of these things, and calculate a web of functions that approaches a Nonperson Predicate without ever simulating a person. For the last step, extrapolate from Monkey to Human the same way you can use statistical clustering to extrapolate from simpler animals to Monkey.

That's how you calculate a Nonperson Predicate without ever simulating a person.
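
As a very rough illustration of the intended shape (not the hard parts), here is a minimal Python sketch; `featurize`, the training pairs, and the threshold are hypothetical placeholders for everything that remains unsolved, such as simulation-invariant features and the extrapolation from Monkey to Human.

```python
# Minimal sketch only. `featurize` and the example pairs are hypothetical
# placeholders; the hard, unsolved work is hidden inside them.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_measure(examples_a, examples_b, featurize):
    """Fit a continuous measure: ~0 for things like A, ~1 for things like B,
    and a fraction for anything in between."""
    X = np.array([featurize(x) for x in examples_a + examples_b])
    y = np.array([0] * len(examples_a) + [1] * len(examples_b))
    clf = LogisticRegression().fit(X, y)
    return lambda thing: clf.predict_proba([featurize(thing)])[0, 1]

def nonperson_predicate(person_measures, thing, threshold=0.01):
    """Conservative: answer True ("definitely not a person") only if every
    trained measure puts `thing` far below the extrapolated person boundary;
    otherwise answer False ("don't know")."""
    return all(m(thing) < threshold for m in person_measures)
```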

Also, near the last few steps, because of the way it can simulate and predict brains of animals and the simpler behaviors of people, this algorithm, including details about the clustering and evolution of continuous measuring functions to be figured out later, may converge to a Coherent Extrapolated Volition (CEV) algorithm and therefore generate a Friendly AI, if you had the unreasonably large number of computers needed to run it.

It's basically an optimization process for simulating everything from physics up to animals and extrapolating that to higher life forms like people. It's not practical to build this and use it. Its purpose is to define a solution so we can think of faster ways to do it later.

comment by lockeandkeynes · 2010-12-03T02:47:45.209Z · LW(p) · GW(p)

I think that's all rather unnecessary. The only reason we don't like people to die is because of the continuous experience they enjoy. It's a consistent causal network we don't want dying on us. I've gathered from this that the AI would be producing models with enough causal complexity to match actual sentience (not saying "I am conscious" just because the AI hears that a lot). I think that, if it's only calling a given person-model to discover answers to questions, the thing isn't really feeling for long enough periods of time to mind whether it goes away. Also, for the predicate to be tested, I imagine the model would have to be created first, and at that point it's too late!

Replies from: nshepperd
comment by nshepperd · 2010-12-03T03:01:33.979Z · LW(p) · GW(p)

You don't want the AI to use a sentient model to find out whether a certain action leads to a thousand years of pain and misery. Or even a couple of hours. Or minutes.

comment by HopeFox · 2011-05-01T09:03:21.631Z · LW(p) · GW(p)

This problem sounds awfully similar to the halting problem to me. If we can't tell whether a Turing machine will eventually terminate without actually running it, how could we ever tell if a Turing machine will experience consciousness without running it?

Has anyone attempted to prove the statement "Consciousness of a Turing machine is undecidable"? The proof (if it's true) might look a lot like the proof that the halting problem is undecidable. Sadly, I don't quite understand how that proof works either, so I can't use it as a basis for the consciousness problem. It just seems that figuring out whether a Turing machine is conscious, or will ever achieve consciousness before halting, is much harder than figuring out whether it halts.

Replies from: orthonormal, hairyfigment, VNKKET
comment by orthonormal · 2011-05-13T15:19:44.380Z · LW(p) · GW(p)

The halting problem doesn't imply that we can never tell whether a particular program halts without actually running it. (You can think of many simple programs which definitely halt, and other simple programs which are definitely infinite loops.)

It means, instead, that there exist relatively short but extremely pathological Turing machines, such that no Turing machine can be built that could solve the halting problem for every Turing machine. (Indeed, the idea of the proof is that a reputed halting-problem-solver is itself pathological, as can be seen by feeding it a modified version of itself as input.)
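
To make that construction concrete, here is a rough Python sketch; the `halts` argument stands for the assumed (impossible) total decider, so this only illustrates the diagonal idea rather than running against any real decider.

```python
def make_trouble(halts):
    """Given a claimed total halting-decider `halts(source, input) -> bool`,
    build the pathological program that defeats it."""
    def trouble(source):
        if halts(source, source):   # decider says: "this halts"
            while True:             # ...so loop forever instead
                pass
        else:                       # decider says: "this loops"
            return                  # ...so halt immediately
    return trouble

# Run `trouble` on its own source code and `halts` is wrong either way,
# so no correct, always-terminating `halts` can exist.
```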

But these pathological ones are not at all the kind of Turing machines we would create to do any functional task; the only reason I can think of for us to seek them out would be to find Busy Beaver numbers.

comment by hairyfigment · 2011-06-19T21:34:20.180Z · LW(p) · GW(p)

Um, I happened to write an explanation of the Halting Problem proof in a comment over here. Please tell me which parts seem unclear to you.

comment by VNKKET · 2011-07-02T06:17:17.509Z · LW(p) · GW(p)

Has anyone attempted to prove the statement "Consciousness of a Turing machine is undecidable"? The proof (if it's true) might look a lot like the proof that the halting problem is undecidable.

Your conjecture seems to follow from Rice's theorem, assuming the personhood of a running computation is a property of the partial function its algorithm computes. Also, I think you can prove your conjecture by taking a certain proof that the Halting Problem is undecidable and replacing 'halts' with 'is conscious'. I can track this down if you're still interested.

But this doesn't mess up Eliezer's plans at all: you can have "nonhalting predicates" that output "doesn't halt" or "I don't know", analogous to the nonperson predicates proposed here.
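
To illustrate the one-sided shape such a predicate can have, here is a toy Python sketch; the only case it ever rules on is a trivially provable one, which is the point: it is allowed to be unhelpful, but it is never wrong.

```python
import ast

def nonhalting_predicate(source: str) -> str:
    """Answer "doesn't halt" only for a program that is exactly
    `while True: pass`; answer "don't know" for everything else."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return "don't know"
    if (len(tree.body) == 1
            and isinstance(tree.body[0], ast.While)
            and isinstance(tree.body[0].test, ast.Constant)
            and tree.body[0].test.value is True
            and all(isinstance(stmt, ast.Pass) for stmt in tree.body[0].body)):
        return "doesn't halt"
    return "don't know"

# nonhalting_predicate("while True: pass")  -> "doesn't halt"
# nonhalting_predicate("print('hi')")       -> "don't know"
```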

comment by player_03 · 2011-08-04T07:28:58.889Z · LW(p) · GW(p)

If the problem here is that the entity being simulated ceases to exist, an alternative solution would be to move the entity into an ongoing simulation that won't be terminated. Clearly, this would require an ever-increasing number of resources as the number of simulations increased, but perhaps that would be a good thing - the AI's finite ability to support conscious entities would impose an upper bound on the number of simulations it would run. If it was important to be able to run such a simulation, it could, but it wouldn't do so frivolously.

Before you say anything, I don't actually think the above is a good solution. It's more like a constraint to be routed around than a goal to be achieved. Plus, it's far too situational and probably wouldn't produce desirable results in situations we didn't design it for.

The thing is, it isn't important to come up with the correct solution(s) ourselves. We should instead make the AI understand the problem and want to solve it. We need to carefully design the AI's utility function, making it treat conscious entities in simulations as normal people and respect simulated lives as it respects ours. We will of course need a highly precise definition for consciousness that applies not just to modern-day humans but also to entities the AI could simulate.

Here's how I see it - as long as the AI values the life of any entity that values its own life, the AI will find ways to keep those entities alive. As long as it considers entities in simulations to be equivalent to entities outside, it will avoid terminating their simulation and killing them. The AI (probably) would still make predictions using simulations; it would just avoid the specific situation of destroying a conscious entity that wanted to continue living.

comment by Luke_A_Somers · 2011-09-03T18:06:16.465Z · LW(p) · GW(p)

To those having trouble imagining what to do with something that comes up positive: A snapshot is not conscious. I think we can agree on that. It is allowing the model to run that would make it conscious. So you make the warning functions detect snapshots that, if run, would be conscious (without running them). If it would be conscious, you can delete or modify it as you please to avoid making it actually be conscious.
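
A minimal sketch of that control flow, with the hard parts (`cleared_as_nonperson`, `simplify`, `run`) left as assumed inputs rather than real components:

```python
def safe_run(snapshot, cleared_as_nonperson, simplify, run, max_tries=10):
    """Execute a model snapshot only if the static check clears it.

    `cleared_as_nonperson(snapshot)` -> True means "definitely would not be
    conscious if run"; False means "don't know". `simplify` coarsens the
    model; `run` actually executes it. All three are stand-ins for the real
    (unsolved) components."""
    for _ in range(max_tries):
        if cleared_as_nonperson(snapshot):
            return run(snapshot)
        snapshot = simplify(snapshot)  # drop detail until it's clearly safe
    raise RuntimeError("no snapshot cleared by the predicate; giving up")
```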

Replies from: MugaSofer
comment by MugaSofer · 2013-01-14T15:26:49.485Z · LW(p) · GW(p)

A snapshot is not conscious. I think we can agree on that [...] you can delete or modify it as you please to avoid making it actually be conscious.

You know, I'm not sure I agree. Imagine "deleting or modifying as you please" a person in cryogenic suspension, for example. They're not conscious, in the sense that they're not thinking, but destroying them is still Sad.

comment by Irgy · 2011-11-16T20:46:01.875Z · LW(p) · GW(p)

I think you're solving the wrong problem. Before you worry about the ethics of super-intelligent AIs creating and deleting human simulations at will, you need to worry about the ethics of humans creating and destroying human+ intelligent AIs at will. To me it's an amazing display of human-centrism to only worry about the problem when it's flipped right back around in the much more distant future.

I realise this doesn't directly help you solve the problem, but maybe it will give you a different perspective.

Replies from: wedrifid, nshepperd, Luke_A_Somers
comment by wedrifid · 2011-11-16T20:50:05.268Z · LW(p) · GW(p)

To me it's an amazing display of human-centrism to only worry about the problem when it's flipped right back around in the much more distant future.

I approve of human centrism. I'm a human. All the people I like are humans.

Humans first!

Replies from: lessdazed, Irgy
comment by lessdazed · 2011-11-16T21:53:58.801Z · LW(p) · GW(p)

Service. Guarantees. Citizenship!

Would you like to know more?

Replies from: wedrifid
comment by wedrifid · 2011-11-17T03:18:17.842Z · LW(p) · GW(p)

Would you like to know more?

No, I would prefer other people go out and do the killing. That sounds dangerous!

comment by Irgy · 2011-11-17T01:12:59.698Z · LW(p) · GW(p)

That's fine for the most part, but in that case do you really feel that same empathy for these proposed simulations? If all you care about is humans maybe you shouldn't care about these simulations being killed anyway. They're less like us than animals, they have no flesh and weren't born of a mother, why do you care about them just because they make a false imitation of our thoughts?

More importantly, though, I wasn't talking about human-centrism as a moral issue but a logical one. Racism is bad because it makes us form groups and mistreat people that are different from us. Racism is stupid, on the other hand, because it makes us inclined to think people of a different race are more different from ourselves than turns out to actually be the case. Similarly, it's the logical, not the moral, errors of human-centrism that are really relevant to the discussion. If there's an ethical issue with killing simulations there's an ethical issue with killing AIs. Resolve one and you can probably resolve the other. Whether you care or not about either problem is kind of beside the point.

That I also don't morally support human-centrism is also kind of beside the point.

Replies from: Vaniver, ArisKatsaris, wedrifid, lessdazed, Irgy, None
comment by Vaniver · 2011-11-17T01:33:55.178Z · LW(p) · GW(p)

Racism is bad because it makes us form groups and mistreat people that are different from us.

Doesn't mistreatment suppose there is some correct form of treatment, and wouldn't a racist believe they are using the correct treatment?

That is, I don't think this sentence is getting to the heart of why/if/when racism is bad. Your following sentence is closer but still not there: oftentimes, increasing differences between people is a winning move, not a stupid one.

comment by ArisKatsaris · 2011-11-17T01:59:59.946Z · LW(p) · GW(p)

If there's an ethical issue with killing simulations there's an ethical issue with killing AIs.

Doesn't follow, for several reasons:

  • If the issue is with the termination of subjective experiences, and if we assume that people-simulations have qualia (let's grant it for the sake of argument), it still doesn't follow that every optimization algorithm of sufficient calculational power also has qualia.
  • If the ethical issue is with violation of individuals' rights, there's nothing to prevent us from constructing only AIs that are only too happy to consent to be deleted; or indeed which strongly desire to be deleted eventually -- but most people-simulations would presumably not want to die, since most people don't want to die.
Replies from: wedrifid, TimS
comment by wedrifid · 2011-11-17T03:06:33.708Z · LW(p) · GW(p)

If the ethical issue is with violation of individuals' rights, there's nothing to prevent us from constructing only AIs that are only too happy to consent to be deleted; or indeed which strongly desire to be deleted eventually

Indeed!

(This is not to say I don't consider it a potential ethical issue to be actively creating creatures that consent as a way to do things that would be otherwise abhorrent.)

comment by TimS · 2011-11-17T13:19:11.614Z · LW(p) · GW(p)

If the ethical issue is with violation of individuals' rights, there's nothing to prevent us from constructing only AIs that are only too happy to consent to be deleted; or indeed which strongly desire to be deleted eventually -- but most people-simulations would presumably not want to die, since most people don't want to die.

Creating such entities would be just as immoral as creating a race of human-intelligence super-soldiers whose only purpose was to fight our wars for us.

Replies from: lessdazed, ArisKatsaris, None
comment by lessdazed · 2011-11-17T13:53:28.509Z · LW(p) · GW(p)

only purpose

What does this mean?

Replies from: TimS
comment by TimS · 2011-11-17T14:59:45.970Z · LW(p) · GW(p)

By manipulation of environment and social engineering, the super-soldiers think that their only reason for existence is fighting war on our behalf. Questioning the purpose of the war is suppressed, as are non-productive impulses like art, scientific curiosity, or socializing. In short, Anti-Fun.

I'm not saying it would be possible to create these conditions in a human-intelligence population. I'm saying it would be immoral to try.

Replies from: lessdazed
comment by lessdazed · 2011-11-17T15:11:05.151Z · LW(p) · GW(p)

Questioning the purpose of the war is suppressed

So they would naturally feel differently about fighting in wars with different causes and justifications? If not, why suppress it?

non-productive impulses like art, scientific curiosity, or socializing.

If they have desires to do these things then the reason they were created may have been to fight, but this is not their "only purpose" from their perspective.

Replies from: TimS
comment by TimS · 2011-11-17T15:47:08.716Z · LW(p) · GW(p)

Yes, what's immoral is the shoehorning. They would think that there is more to life than what they do, if only they were allowed freedom of thought.

One might think that it is possible to create human-level intelligence creatures that won't think that way. But we've never seen such a species (yes, very small sample size), and I'm not convinced it is possible.

Replies from: ArisKatsaris
comment by ArisKatsaris · 2011-11-17T16:07:17.898Z · LW(p) · GW(p)

So in short you aren't talking about a race of supersoldiers whose only purpose is really to fight wars for us, you're talking about a race of supersoldiers who are pressured into believing that their only purpose is to fight wars for us, against their actual inner natures that would make them e.g. peaceful artists or musicians instead.

At this point, we're not talking about remotely the same thing, we're talking about completely opposite things -- as opposite as fulfilling your true utility function and being forced to go against it -- as opposite as Fun and Anti-Fun.

comment by ArisKatsaris · 2011-11-17T14:25:41.704Z · LW(p) · GW(p)

Creating such entities would be just as immoral as creating a race of human-intelligence super-soldiers whose only purpose was to fight our wars for us.

I feel that this sort of response (filled with moral indignation, but no actual argument) is far beneath the standards of LessWrong.

First of all, I'm talking about human-level (or superhuman-level) intelligence, not human intelligence -- which would imply human purpose, human emotion, human utility functions etc. I'm talking about an optimization process which is at least as good as humans are at said optimization -- it need not have any sense of suffering, it need not have any sense of self or subjective experience even, and certainly not any sense that it needs to protect said self. Those are all evolved instincts in humans.

Secondly, can you explain why you feel the creation of such super-soldiers would be immoral? And immoral as opposed to what? Sending people to die who do not want to die, who would prefer to be somewhere else, and who suffer for being there?

Thirdly, I would like to know if you're using some deontology or virtue ethics to derive your sense of morality. If you're using consequentialism, though, I think you're falling into the trap of anthropomorphizing such intelligences -- as if their "lives" would somehow be in conflict with their minds' goalset; as soldiers' lives tend to be in conflict with their own goalset. You may just as well condemn as immoral the creation of children whose "only purpose" is to live lives full of satisfaction, discovery, creativity, learning, productivity, happiness, love, pleasure, and joy -- just because they don't possess the purpose of paperclipping the universe.

Replies from: TimS
comment by TimS · 2011-11-17T15:25:25.333Z · LW(p) · GW(p)

There is something about humans that make them objects of moral concern. It isn't the ability to feel pain, because cows can feel pain. For the same reason, it isn't experiencing sensation. And it isn't intelligence, because dolphins are pretty smart.

I'm not trying to evoke souls or other non-testable concepts. Personally, I suspect the property that creates moral concern is related to our ability to think recursively (i.e. make and comprehend meta-statements). Whatever the property of moral concern is based on, it requires me to say things like: "It is wrong to kill a Klingon iff it would be wrong to kill a human in similar circumstances."

If you come across a creature of moral concern in the wild, and it wants to die (assuming no thinking defects like depression), then helping may not be immoral. But if you create a creature that way, you can't ignore that you caused the desire to die in that creature.

One might think that it is possible to create human-level intelligence creatures that are not entitled to moral concern because they lack the relevant properties. That's not incoherent, but every human-intelligent species in our experience is entitled to moral concern (yes, I'm aware that the sample size is extremely small).

I think you're falling into the trap of anthropomorphizing such intelligences -- as if their "lives" would somehow be in conflict with their minds' goalset; as soldiers' lives tend to be in conflict with their own goalset.

A rational soldier's life is not in conflict with her goalset, only with propagation of her genes.

You may just as well condemn as immoral the creation of children whose "only purpose" is to live lives full of satisfaction, discovery, creativity, learning, productivity, happiness, love, pleasure, and joy.

Morality is not written in the equations of the universe, but I think it a fair summary of the morality we currently follow as attempting to live to the highest and best of potential. And it is totally fair for me to point out a moral position inconsistent with that morality.

Replies from: ArisKatsaris, lessdazed
comment by ArisKatsaris · 2011-11-17T15:52:05.071Z · LW(p) · GW(p)

There is something about humans that make them objects of moral concern. It isn't the ability to feel pain, because cows can feel pain. For the same reason, it isn't experiencing sensation. And it isn't intelligence, because dolphins are pretty smart.

I have moral concern for cows and dolphins both (much more for the latter).

We're not communicating here. You've not responded to any of my questions, just launched into an essay that just assumes new points that I would not concede.

A rational soldier's life is not in conflict with her goalset, only with propagation of her genes.

Does a rational soldier enjoy being shot at? If she doesn't enjoy that, then her life is at least somewhat in conflict with her preferences; she may have deeper preferences (e.g. 'defending her nation') that outweigh this, but this at best makes being shot at a necessary evil, it doesn't turn it into a delight.

If we could have soldiers that enjoy being shot at, much like players of shoot-em-up games do, then their lives wouldn't be at all in conflict with their desires.

Morality is not written in the equations of the universe, but I think it a fair summary of the morality we currently follow is attempting to live to the highest and best of potential.

"Highest and best" according to who? And attempting to live personally to the highest and best of potential, or forcing others to live to such?

Replies from: TimS
comment by TimS · 2011-11-17T16:13:59.021Z · LW(p) · GW(p)

I eat beef. And if I saw a dolphin about to be killed by a shark and could save it easily, I wouldn't think I had made an immoral choice by allowing the shark attack. But my answers are different for people.

Does a rational soldier enjoy being shot at? If she doesn't enjoy that, then her life is at least somewhat in conflict with her preferences; she may have deeper preferences (e.g. 'defending her nation') that outweigh this, but this at best makes being shot at a necessary evil, it doesn't turn it into a delight.

I don't think it makes sense to analyze the morality of considerations leading to a choice, because individual values conflict all the time. Alice would prefer a world without enemies who shoot at her. But she believes that it is immoral to let barbarians win. So she chooses to be a soldier. That choice is the subject of moral analysis, not her decision-making process.

"Highest and best" according to who?

That's an excellent question. All I can say is that you have to ground morality somewhere. And there is no reason that "ought" statements will universalize.

And attempting to live personally to the highest and best of potential, or forcing others to live to such?

If we're still talking about parenting, then I assert that children aren't rational. Otherwise, I don't think I should force a particular kind of morality. Which loops right back around to noticing that different moralities can come into conflict. And balancing conflicting moralities is hard (perhaps undecidable in principle).

Replies from: ArisKatsaris
comment by ArisKatsaris · 2011-11-17T17:06:26.802Z · LW(p) · GW(p)

I eat beef.

So do I. That doesn't mean I don't have any moral concern for cows.

And if I saw a dolphin about to be killed by a shark and could save it easily, I wouldn't think I had made an immoral choice by allowing the shark attack.

You're putting improper weight on one side of the equation by putting yourself in a position where you'd have to intervene (perhaps with violence enough to kill the shark, and certainly depriving it of a meal) if you had a moral concern.

Let's change the equation a bit: You are given a box, where you can press a button and get one dollar every time you press it, but a dolphin gets tortured to death if you do so. Do you press the button? I wouldn't.

I don't think it makes sense to analyze the morality of considerations leading to a choice, because individual values conflict all the time.

You're drifting out of the issue, which is not about choices, but about preferences.

Replies from: lessdazed, TimS
comment by lessdazed · 2011-11-17T17:12:29.068Z · LW(p) · GW(p)

I eat beef.

So do I. That doesn't mean I don't have any moral concern for cows.

In Milliways, Ameglian Major Cow have moral concern for you!

comment by TimS · 2011-11-17T17:52:08.010Z · LW(p) · GW(p)

Let's leave torture aside for a moment.

In front of us are two buttons. When the Blue button is pushed, a cow is killed. When the Red button is pushed, a human is killed. What price for each button? People push Blue every workday, and the price is some decent but not extravagant hourly wage. There are enormous and complicated theories about when to push Red. For example, there is a whole category of theories about "just war" that aim to decide when generals can push Red. What explains the difference in price between Blue and Red? Cows are not creatures of moral concern in the way that humans are. That's all I mean by "creature of moral concern."

Ok, back to torture. Because cows are not creatures of moral concern, the reason not to torture them is different from the reason not to torture people. We shouldn't torture people for the same reason we shouldn't kill them. But we shouldn't torture cows because it shows some lack of concern for causing pain, which seems strongly correlated with willingness to cause harm to people.

I don't think it makes sense to analyze the morality of considerations leading to a choice, because individual values conflict all the time.

You're drifting out of the issue, which is not about choices, but about preferences.

I agree that our choices can conflict with some of our values. How does that show that we are morally permitted to create creatures of moral concern that want to die?

Replies from: ArisKatsaris, dlthomas
comment by ArisKatsaris · 2011-11-17T18:21:54.586Z · LW(p) · GW(p)

"But we shouldn't torture cows because it shows some lack of concern for causing pain, which seems strongly correlated with willingness to cause harm to people."

So, let me change the question: "You are given a box, where you can press a button and get one dollar every time you press it, but a dolphin gets killed painlessly whenever you do so. Do you press the button?"

Cows are not creatures of moral concern in the way that humans are.

This is so fuzzy as to be pretty much meaningless.

I've already told you they're of moral concern to me.

How does that show that we are morally permitted to create creatures of moral concern that want to die?

Since you seem to define "moral concern" as "those things that shouldn't die", then of course we wouldn't be "morally permitted".

But that's not a commonly shared definition for moral concern -- nor a very consistent one.

Replies from: TimS
comment by TimS · 2011-11-17T18:49:16.933Z · LW(p) · GW(p)

I probably would press the button at about the price people are paid to butcher cows. Somewhere thereabout.

This is so fuzzy as to be pretty much meaningless.

You're right. There isn't a word for what I'm getting at, so I used a slightly different phrase. Ok, I'll deconstruct. I assert there is a moral property of creatures, which I'll call blicket.

An AI whose utility function does not respond to the preferences of blicket creatures is not Friendly. An AI whose utility function does not respond to the preferences of non-blicket creatures might be Friendly. By way of example, humans are blicket creatures. Klingons are blicket creatures (if they existed). Cows are not blicket creatures.

What makes a creature have blicket? I look at the moral category, and see that it's a property of the creature. It isn't ability to feel pain. Or ability to experience sensation. And it isn't intelligence.

One might assert that blicket doesn't reflect any moral category. I respond by saying that there's something that justifies not harming others even when decision-theory cooperate/defect decisions are insufficient. One might assert that blicket does not exist. I respond that the laws of physics don't have a term for morality, but we still follow morality.

Ok, enough definition. I assert that creating a blicket creature that wants to die is immoral, absent moral circumstances approximately as compelling as those that justify killing a blicket creature.

Replies from: lessdazed, ArisKatsaris
comment by lessdazed · 2011-11-17T23:17:24.989Z · LW(p) · GW(p)

there's something that justifies not harming others even when decision-theory cooperate/defect decisions are insufficient.

Decision-theory still has big open problems, so there is a limit to how much you can trust an intuition like this. Maybe it's more than an intuition?

Replies from: TimS
comment by TimS · 2011-11-18T02:59:59.704Z · LW(p) · GW(p)

That's an interesting point. But it's hard for me to conceive of a morality based entirely on decision theory that doesn't essentially resemble act utilitarianism. Maybe my understanding of decision theory is insufficient.

Act utilitarianism bothers me as a moral theory. I can't demonstrate that it is false, but it seems to me that the perspective of act utilitarianism is not consistent with how we ordinarily analyze moral decisions. But maybe I'm excessively infected with folk moral philosophy.

comment by ArisKatsaris · 2011-11-18T10:59:59.238Z · LW(p) · GW(p)

I probably would press the button at about the price people are paid to butcher cows. Somewhere thereabout.

I don't know what cow-butchering currently entails, but they'd probably be paid significantly less if they only had to press a button.

Also, I'm sorry, but I really can't think of a way in which this response is an honest valuation of how much money you'd accept in order to do this task. It sounds as if you're actually saying "I'll do it for whatever money is socially acceptable for me to do it for". So in short -- if you lived in a cow-hating culture where people paid money for the privilege of killing a cow, you'd be willing to pay money; if you lived in a cow-revering culture where people would never kill a cow (e.g. India), you'd not do it for even a million.

Is this all you're saying -- that you'd choose to obey societal norms on this matter? This doesn't tell me much about your own moral instinct, independent of societal approval thereof; or what you would tell society to do if you had the role of instructing it on the matter.

I assert there is a moral property of creatures, which I'll call blicket.

Okay, but my own view on the matter is that "blicket" is a continuum -- most properties of creatures, both physical and mental, are continuums after all. Creatures probably range from having zero blickets (amoebas) to a couple blickets (reptiles) to lots of blickets (apes, dolphins) to us (the current maximum of blickets).

What makes a creature have blicket? I look at the moral category, and see that it's a property of the creature.

I think that's a classic example of mind-projection fallacy. I think the reality isn't creature.numberOfBlickets, but rather numberOfBlickets(moral agent, creature);

Replies from: TimS, Nornagest
comment by TimS · 2011-11-18T16:04:12.473Z · LW(p) · GW(p)

Okay, but my own view on the matter is that "blicket" is a continuum -- most properties of creatures, both physical and mental, are continuums after all. Creatures probably range from having zero blickets (amoebas) to a couple blickets (reptiles) to lots of blickets (apes, dolphins) to us (the current maximum of blickets).

Do you think that an AI that does not take into account the preferences of cows is necessarily unFriendly (using EY's definition)? If yes, I don't understand why you think it is acceptable to eat beef.

I think that's a classic example of mind-projection fallacy.

That's such a weird interpretation of what I'm saying, because I've consistently acknowledged that blicket is not written in the laws of physics. The properties that lead me to ascribe blicket to a creature would probably not motivate uFAI to treat that creature well.

I look at the moral category, and see that it's a property of the creature.

Sexy(me, Jennifer Aniston) != sexy(me, Brad Pitt). Isn't some of that difference attributable to different properties of Jennifer and Brad?


In the original article, EY says that FAI should not simulate a human because the simulated person would be sufficiently real that stopping the simulation would be unFriendly. You seem to think that nothing would be wrong with a FAI simulating an AI that wanted to die. It may well be that AIs lack blicket. But an AI does not lack blicket simply because it wants to die.

Replies from: ArisKatsaris, nshepperd
comment by ArisKatsaris · 2011-11-18T17:06:26.789Z · LW(p) · GW(p)

Do you think that an AI that does not take into account the preferences of cows is necessarily unFriendly (using EY's definition)?

If I remember correctly, EY talks about Friendliness in regards to humanity, not in regards to cows -- in that case the AI would take the preferences of cows into account only to the extent that the Coherent Extrapolated Volition of humanity would take it into account, no more, no less.

If yes, I don't understand why you think it is acceptable to eat beef.

For the sake of not pretending to misunderstand you, I'll assume you mean "I don't understand why you think it's acceptable to kill cows in order to have their meat"; we're not talking about an already-butchered cow whose meat would go to waste if I didn't eat it.

For starters, because cow-meat is yummy, and the preferences of humans severely outweigh the preferences of cows in my mind.

Now dolphin-meat or ape-meat, I would not eat, and I would like to ban the killing of dolphins and apes both (outside of medical testing in the case of apes).

I think that's a classic example of mind-projection fallacy. That's such a weird interpretation of what I'm saying, because I've consistently acknowledged that blicket is not written in the laws of physics.

This means less than you seem to think, because after all concepts like "brains" or "genes" or for that matter even "atoms" and "molecules" aren't written in the laws of physics either. So all I got from this statement of yours is that you think morality isn't located at the most fundamental level of reality (the one occupied by quantum amplitude configurations).

And to counteract this you made statements like "it's a property of the creature."

Sexy(me, Jennifer Aniston) != sexy(me, Brad Pitt). Isn't some of that difference attributable to different properties of Jennifer and Brad?

Of course, but you said "it's a property of the creature" - you didn't say "it's partially a property of the creature", or "it's a property of the relationship between the creature and me".

Such miscommunication could have been avoided if you were a bit more precise in your sentences.

You seem to think that nothing would be wrong with a FAI simulating an AI that wanted to die.

Not quite. I've effectively said that it wouldn't necessarily be wrong.

But an AI does not lack blicket simply because it wants to die.

I never said it would lack blicket. Blicket would make me want to help a creature achieve its aspirations, which in this context it would mean helping the AI to die.

Let me remind people again that I'm not talking about the sort of "wanting to die" that a suicidal human being would possess -- driven by grief or despair or guilt or hopeless tedium grinding down his soul.

Replies from: TimS, nshepperd
comment by TimS · 2011-11-18T19:51:49.322Z · LW(p) · GW(p)

Okay, but my own view on the matter is that "blicket" is a continuum -- most properties of creatures, both physical and mental, are continuums after all. Creatures probably range from having zero blickets (amoebas) to a couple blickets (reptiles) to lots of blickets (apes, dolphins) to us (the current maximum of blickets).

How is this use of the term different from the term "moral concern"? I'm trying to talk about creatures we give sufficient moral weight that the type of justifications for their treatment change. Killing cows takes different (and lesser) justification than killing humans.

I never said it would lack blicket. Blicket would make me want to help a creature achieve its aspirations, which in this context it would mean helping the AI to die.

Is it fair to say that you don't think it makes any moral difference whether you made the AI or found it instead?

comment by nshepperd · 2011-11-19T00:19:12.986Z · LW(p) · GW(p)

Of course, but you said "it's a property of the creature" - you didn't say "it's partially a property of the creature", or "it's a property of the relationship between the creature and me".

Is primeness a property of a heap of five pebbles?

And is it a property of you or the pebbles that you don't care about prime-pebbled heaps?

comment by nshepperd · 2011-11-19T00:22:42.012Z · LW(p) · GW(p)

Do you think that an AI that does not take into account the preferences of cows is necessarily unFriendly (using EY's definition)? If yes, I don't understand why you think it is acceptable to eat beef.

Ah, but taking into account is not the same as following blindly! Surely it's possible that the AI will consider their preferences and conclude that our having beef is more important. But in other situations their preferences will be relevant.

comment by Nornagest · 2011-11-18T17:50:47.941Z · LW(p) · GW(p)

I don't know what cow-butchering currently entails, but they'd probably be paid significantly less if they only had to press a button.

It's an assembly-line process. Cows are actually killed by blood loss, but before that happens they're typically (kosher meat being an exception) stunned by electric shock or pithed with a captive bolt pistol. Fairly mechanical; I imagine a pushbutton process would pay less, but mainly because it'd then be unskilled labor and its operator wouldn't have to deal with various cow fluids at close proximity.

comment by dlthomas · 2011-11-17T18:38:39.904Z · LW(p) · GW(p)

People push Blue every workday, and the price is some decent but not extravagant hourly wage.

But those people, by pushing the button, are putting tasty food on the plates of others. Disentangling this from everything seems tricky at best: if the animal killed is not going to be used to fulfill human needs and wants, then injunctions against waste might be weighing in...

Replies from: TimS
comment by TimS · 2011-11-17T18:52:46.798Z · LW(p) · GW(p)

True. But that's different in kind from the reasons we use not to kill humans. And my only point was that basically all considerations about how to treat animals are different in kind from considerations about how to treat humans.

Replies from: dlthomas
comment by dlthomas · 2011-11-17T19:06:45.512Z · LW(p) · GW(p)

I am not at all confident that I can intuitively distinguish a difference in kind from a massive difference in degree.

Replies from: TimS
comment by TimS · 2011-11-18T16:20:08.252Z · LW(p) · GW(p)

Alice and Bob are eating together, and Bob doesn't finish his meal. "What a waste," says Alice. As they leave the restaurant, someone tells them that a young, promising medical researcher has died. "What a waste," says Bob.

In both utterances, "waste" is properly understood as waste(something). Alice meant something like waste(food). Bob meant something like waste(potential). Alice's reference is material, Bob's is conceptual. Those seem like clearly different kinds to me.

Yes, you could make a scale and place both references on that scale. Maybe the waste Bob noted really is a million times worse than the waste Alice noted. I don't think that enhances understanding. In fact, I think that perspective misses something about the difference between what Alice said and what Bob said.

Replies from: dlthomas
comment by dlthomas · 2011-11-18T18:09:32.144Z · LW(p) · GW(p)

Is the following a reasonable paraphrase of your most recent points?

There is no hard delineation between differences in kind and differences in degree in the territory, there are only situations where one map or the other is more useful.

Replies from: TimS
comment by TimS · 2011-11-18T19:37:24.644Z · LW(p) · GW(p)

Assuming that it is coherent to talk about the "territory" of morality, I think I agree with your paraphrase. But I expect that certain maps are likely to be useful much more often.

I think that classifying the types of reasons actually used improves our understanding because it cuts the world at its joints. It's subject to the same type of criticism that biological taxonomy might be subject to. And if you go abstract enough, things that look like different kinds merge to become sub-examples of some larger kind. But at some point, you lose the ability to say things that are both true and useful. Like trying to say something practically useful about the differences between two species without invoking a lower category than Life.

comment by lessdazed · 2011-11-17T16:14:53.600Z · LW(p) · GW(p)

But if you create a creature that way, you can't ignore that you caused the desire to die in that creature.

Pig that wants to be eaten != genetically modified corn that begs for death

Creating the corn would be immoral. Creating the pig would be moral - and delicious!

I think it a fair summary of the morality we currently follow as attempting to live to the highest and best of potential

That seems like a fair summary of all moral systems according to their own standards. If so, that wouldn't tell us about the moral system since it would be true of all of them.

Replies from: TimS
comment by TimS · 2011-11-17T16:23:44.560Z · LW(p) · GW(p)

Creating the pig would be moral - and delicious!

I disagree. Otherwise, prevention of suicide of the depressed is difficult to justify.

That seems like a fair summary of all moral systems according to their own standards.

On the one hand, I agree that it doesn't narrow down the universe of acceptable moralities very much. But consider an absolute monarchist morality: Alexander's potential is declared to be monarch of the nation, while Ivan's is declared to be serf. All decided at birth, before knowing anything about either person. That's not a morality that values everyone reaching their potential.

Replies from: lessdazed
comment by lessdazed · 2011-11-17T16:43:27.747Z · LW(p) · GW(p)

Otherwise, prevention of suicide of the depressed is difficult to justify.

Assuming one has the intuitions that creating the pig would be moral and that not preventing suicide of the depressed is immoral, one may be wrong in considering them analogous. But if they are, you gave no reason to prefer giving up the one intuition instead of the other.

I don't think they are analogous. Depression involves unaligned preferences, perhaps always, but at least very often. If the pig's system 1 mode of thinking wants him eaten, and his system 2 mode of thinking wants him eaten, and the knife feels good to him, and his family would be happy to have him eaten, etc., all is aligned, and we don't have to solve the nature of preferences and how to rank them to say the pig's creation and death are fine.

Replies from: TimS
comment by TimS · 2011-11-17T17:58:17.290Z · LW(p) · GW(p)

It seems to me that creating the pig is analogous to creating suicidal depression in a human who is not depressed.

you gave no reason to prefer giving up the one intuition instead of the other.

As a starting point, a moral theory should add up to normal. I'm not saying it's an iron law (people once thought chattel slavery was morally normal). But the burden is on justifying the move away from normal.

Replies from: ArisKatsaris
comment by ArisKatsaris · 2011-11-17T18:29:13.322Z · LW(p) · GW(p)

It seems to me that creating the pig is analogous to creating suicidal depression in a human who is not depressed.

Why don't you try to think some of the many ways in which it's NOT analogous?

comment by [deleted] · 2011-12-15T12:05:25.635Z · LW(p) · GW(p)

What is evil about creating house elves?

Replies from: TimS
comment by TimS · 2011-12-15T13:11:18.162Z · LW(p) · GW(p)

These comments state my objections pretty well.

comment by wedrifid · 2011-11-17T03:02:29.804Z · LW(p) · GW(p)

That's fine for the most part, but in that case do you really feel that same empathy for these proposed simulations?

Yes.

If all you care about is humans maybe you shouldn't care about these simulations being killed anyway. They're less like us than animals, they have no flesh and weren't born of a mother, why do you care about them just because they make a false imitation of our thoughts?

Because I do - and I don't want to change. (This is the same justification that I have for caring about humans, or myself.)

More importantly though I wasn't talking about human-centrism as a moral issue but a logical one.

It is the logical problem that I reject. There is no inconsistency in being averse to racism but not averse to speciesism.

Replies from: TimS
comment by TimS · 2011-11-17T13:41:10.686Z · LW(p) · GW(p)

There is no inconsistency in being averse to racism but not averse to speciesism.

On reflection, this seems wrong. The fact that some in-group/out-group behavior is rational does not mean that in-group bias is rational. To put it slightly differently, killing a Klingon is wrong iff killing a human would be wrong in those circumstances.

comment by lessdazed · 2011-11-17T04:48:41.498Z · LW(p) · GW(p)

Racism is bad because it makes us form groups and mistreat people that are different from us.

I suspect the causation goes the other way. I am looking for a study I recently read about that suggested this.

White subjects had difficulty recalling the specific content of what individual black actors in videos had said relative to how well they recalled what individual white actors had said. When videos of black actors were of them arguing for opposite sides of an issue, the subjects were able to match content to speakers equally well for black and white actors. The theory is that race was used as a proxy for group membership until something better came along. Once people were grouped by ideas, the black speakers were thought of as individuals.

The evolutionary story behind this is that people evolved with group politics being important, but almost never seeing someone of noticeably different race. It makes sense that we evolved mechanisms for dealing with groups in general and none for race in particular. It makes sense that in absence of anything better, we might group by appearance, and it makes sense that we would err to perceive irrelevant patterns/groups as the cost of never or rarely missing relevant patterns/groups.

comment by Irgy · 2011-11-17T05:17:41.371Z · LW(p) · GW(p)

Ok, forget the poor analogy with racism, why racism is bad is a whole separate issue that I had no intention to get into. Let me try and just explain my point better.

Human-centrism is a bias in thinking which makes us assume things like "The earth is the centre of the universe", "Only humans have consciousness" and "Morality extends to things approximately as far as they seem like humans". I personally think it is only through this bias that we would worry about the possible future murder of human simulations before we worry about the possible future murder of the AIs intelligent enough to simulate a human in the first place.

Human-centrism as fighting for our tribe and choosing not to respect the rights of AIs is a different issue. Choosing not to respect the rights of AIs is different from failing to appreciate the potential existence of those rights.

Replies from: ArisKatsaris
comment by ArisKatsaris · 2011-11-17T11:01:02.517Z · LW(p) · GW(p)

Choosing not to respect the rights of AIs is different from failing to appreciate the potential existence of those rights.

This sentence seems to imply a deontological moral framework, where rights and rules are things-by-themselves, as opposed to guidelines which help a society optimize whatever-it-is-it-wants-to-optimize. There do exist deontologists in LessWrong, but many of us are consequentialists instead.

Replies from: Irgy
comment by Irgy · 2011-11-17T20:40:13.970Z · LW(p) · GW(p)

Can't I use the word "rights" without losing my status as a consequentialist? I simply use the concept of a "being with a right to live" as a shorthand for "a being which, in the majority of circumstances and all else being equal, it would very likely be a poor moral choice to murder". You can respect the rights of something without holding a deontological view that rights are somehow the fundamental definition of morality.

comment by [deleted] · 2011-12-15T12:03:01.134Z · LW(p) · GW(p)

They're less like us than animals, they have no flesh and weren't born of a mother, why do you care about them just because they make a false imitation of our thoughts?

Because in an information-theoretic sense they may be more similar to my mind than the minds of most animals are.

comment by nshepperd · 2011-12-17T04:39:55.513Z · LW(p) · GW(p)

Can't Unbirth a Child.

I don't believe either of these are "wrong" problems, though.

comment by Luke_A_Somers · 2013-01-14T14:20:22.167Z · LW(p) · GW(p)

The OP quite explicitly covers creation of nonhuman intelligence and considers it equally bad.

Replies from: Irgy
comment by Irgy · 2013-01-15T13:17:21.010Z · LW(p) · GW(p)

Really? Where? I just reread it with that in mind and I still couldn't find it. The closest I came was that he once used the term "sentient simulation", which is at least technically broad enough to cover both. He does make a point there about sentience being something which may not exactly match our concept of a human; is that what you're referring to? He then goes on to talk about this concept (or, specifically, the method needed to avoid it) as a "nonperson predicate", again suggesting that what's important is whether it's human-like rather than anything more fundamental. I don't see how you could think "nonperson predicate" covers both human and nonhuman intelligence equally.

Replies from: Luke_A_Somers, nshepperd
comment by Luke_A_Somers · 2013-01-15T16:13:16.519Z · LW(p) · GW(p)

Is a human mind the simplest possible mind that can be sentient? What if, in the course of trying to model its own programmers, a relatively younger AI manages to create a sentient simulation trapped within itself? How soon do you have to start worrying? Ask yourself that fundamental question, "What do I think I know, and how do I think I know it?"

I read this as being simpler than a real human mind. Since it's simpler, the abstractions used are going to be imperfect, and the design would end up being something that is in some way artificial. It's not as explicit as I said, but I still think the implication is pretty strong.

Replies from: Irgy, MugaSofer
comment by Irgy · 2013-01-20T23:47:23.332Z · LW(p) · GW(p)

I've actually lost track of how this impacts my original point. As stated, it was that we're worrying about the ethical treatment of simulations within an AI before worrying about the ethical treatment of the simulating AI itself. Whether the simulations considered include AIs as well as humans is an entirely orthogonal issue.

I went on in other comments to rant a bit about the human-centrism issue, which your original comment seems more relevant to though. I think you've convinced me that the original article was a little more open to the idea of substantially nonhuman intelligence than I might have initially credited it, but I still see the human-centrism as a strong theme.

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2013-01-21T14:22:42.051Z · LW(p) · GW(p)

My point is he's clearly not drawing a box tightly around what's human or not. If he's concerned with clearly-sub-human AI, then he's casting a significantly wider net than it seems you're assuming he is. And considering that he's written extensively on the variety of mind-space, assuming he's taking a tightly parochial view is poorly founded.

comment by MugaSofer · 2013-01-21T09:22:09.868Z · LW(p) · GW(p)

"Is a human mind the simplest possible mind?"

"But if it was simpler, it wouldn't be human!"

Downvoted.

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2013-01-21T14:20:17.651Z · LW(p) · GW(p)

What? That's completely irrelevant to the question at hand.

By considering the question of whether simpler-than-human minds are possible in this context, it's clear that Eliezer was thinking about the question and giving them moral weight. He doesn't need to ANSWER the question I was posing to make that much clear.

Replies from: MugaSofer
comment by MugaSofer · 2013-01-21T15:44:35.483Z · LW(p) · GW(p)

Wait, what?

*Clicks "Show more comments above."

Oops. I thought you were replying to the quoted text. Upvoted and retracted my comment.

comment by nshepperd · 2013-01-21T10:08:09.765Z · LW(p) · GW(p)

"Person" seems to be used here as the philosophical term meaning something like "sentient entity with moral value". Personhood is not limited to human beings.

ETA: Also, wrt the AI itself, the directly next two articles in this sequence explicitly deal with the issue of making the AI itself nonsentient, as I'm surprised to find a comment from myself in 2011 pointing out. Did you really not read the surrounding articles?

comment by xelxebar · 2012-02-22T18:53:32.997Z · LW(p) · GW(p)

I imagine that a sufficiently high-resolution model of human cognition et cetera would factor into sets of individual equations to calculate variables of interest. Similar to how Newtonian models of planetary motion do.

However, I don't see that the equations themselves on disk or in memory should pose a problem.

When we want to know particular predictions, we would have to instantiate these equations somehow--either by plugging x=3 into F(x) or by evaluating a differential equation with x=3 as an initial condition. It would depend on the specifics of the person-model; however, if we calculated a sufficiently small subset of equations, or refactored the equations into a sufficiently small set of new ones, we might be able to avoid the relevant moral dilemmas of calculating sentient things.

If, on the other hand, for whatever we are interested in calculating, we couldn't do the above, then what about separating the calculation into small, safe, sequentially-calculated units? "Safe" units meaning that individually none of them model anything cognizant. At the end, if we sewed the states of those units together into a final state, could this still pose moral issues? This gets into Greg Egan-esque territory.
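
A rough sketch of what the bookkeeping for such "safe units" might look like, with `unit_is_safe` standing in for the open problem of certifying a unit:

```python
def run_in_safe_units(units, state, unit_is_safe):
    """`units` is a list of functions state -> state. Evaluate them one at a
    time, refusing any unit the (assumed) safety predicate cannot clear."""
    for unit in units:
        if not unit_is_safe(unit, state):
            raise RuntimeError("unit not cleared as safe; aborting")
        state = unit(state)  # sew the states together sequentially
    return state
```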

It's not clear that the previous two calculation strategies are always possible. However, another option might be to take care to always form questions so that the first strategy would be possible. For example, instead of asking whether a person will go left or right at a fork, maybe it's enough to ask a specific question about some brain center.

And now that I've written all that, I realize that the whole point of the predicates is in how to determine "sufficiently few" in "sufficiently few equations" or what kind of units are "safe units".

This isn't a satisfactory answer, but it seems like determining "safe calculations" would be tied to understanding the necessary conditions under which human cognition arises etc.

Also, carrying it a step further, I would argue that we need not just person predicates, but predicates that can circumvent modeling any kind of morally wrong situation. I wouldn't want to be accidentally burning kittens.

comment by purpleposeidon · 2012-03-08T10:03:06.825Z · LW(p) · GW(p)

By the time a non-person predicate returns 0, you have already potentially created a person. You'll need something more complicated: If I update this model with this data, does it create a person?
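
In other words, the check has to be evaluated against the proposed update before it is applied; something like this hypothetical sketch, where `still_nonperson_after` and `apply_update` are assumed components:

```python
def guarded_update(model, data, still_nonperson_after, apply_update):
    """Apply `data` to `model` only if the pre-update check passes.
    `still_nonperson_after(model, data)` must answer without actually
    building the updated model; both it and `apply_update` are assumed."""
    if still_nonperson_after(model, data):
        return apply_update(model, data)
    raise ValueError("update refused: the result might be a person")
```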

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2013-01-14T14:22:21.373Z · LW(p) · GW(p)

Psy-Kosh already noted this problem:

http://lesswrong.com/lw/x4/nonperson_predicates/pym

This implied the solution, which I gave here:

http://lesswrong.com/lw/x4/nonperson_predicates/4r7t

comment by eurleif · 2012-05-30T03:12:19.742Z · LW(p) · GW(p)

Here's a reductio ad absurdum against computers being capable of consciousness at all. It's probably wrong, and I'd appreciate feedback on why.

Suppose a consciousness-producing computer program which experiences its own isolated, deterministic world. There must be some critical instruction in the program which causes consciousness to occur; an instruction such that, if we halt the program immediately before it is executed, consciousness will not occur, and if we halt immediately after it is executed, consciousness will occur.

If we halt the program before executing the critical instruction, but save its state, consciousness should still not occur; and if we load the state back up again, and compute the results of executing the critical instruction, consciousness should then occur. It seems obvious enough that this shouldn't stop the program's consciousness, since it is still executed fully, just with a delay in between.

What if we subsequently load the state taken immediately prior to executing the critical instruction onto another computer? Will it produce a second conscious experience, identical to the first? The second computer is executing precisely the same code on precisely the same data as the first, so it seems reasonable to conclude that it will have the same effects. If the second computer doesn't produce consciousness, that would seem to imply that the universe has an eternally persistent memory of every conscious experience which has ever occurred, and uses it to prevent recurrences; a pretty bizarre implication.

However, if the second computer does produce consciousness, this means that once you've executed a conscious program, causing its conscious experiences to occur a second time has essentially no processing requirement: you just have to execute one instruction in the simplest instruction set you like.

If that doesn't seem weird to you, consider the practical implication: you could print out the memory dump of a conscious program and produce consciousness by simulating the critical instruction by hand. If the program suffers, you could produce real, morally-relevant suffering by performing a single operation on a sheet of paper – and then erase your pencil marks and do it again, producing more suffering. Can consciousness really be so easy to create?
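
The mechanics being argued over can be pinned down with a toy example. Below is a trivially deterministic machine (obviously not conscious; the point is only the bookkeeping): run it up to just before its final instruction, snapshot the state, copy the snapshot to a "second computer", and execute the one remaining instruction there. Everything here is my own illustration, not anything from the comment.

```python
import copy

# Toy deterministic machine to make the thought experiment's mechanics concrete:
# run up to (but not including) some instruction, snapshot the state, and later
# execute just that one remaining instruction from a copy of the snapshot.

PROGRAM = [
    ("set", "x", 2),
    ("set", "y", 3),
    ("add", "z", "x", "y"),   # pretend this is the "critical instruction"
]

def step(state, instr):
    op = instr[0]
    if op == "set":
        _, reg, val = instr
        state[reg] = val
    elif op == "add":
        _, dst, a, b = instr
        state[dst] = state[a] + state[b]
    state["pc"] += 1
    return state

def run_until(program, stop_at):
    state = {"pc": 0}
    while state["pc"] < stop_at:
        state = step(state, program[state["pc"]])
    return state

# Halt just before the critical (last) instruction and save the state.
snapshot = run_until(PROGRAM, len(PROGRAM) - 1)

# "Second computer": a copy of the snapshot, on which we execute one instruction.
other_machine = copy.deepcopy(snapshot)
final = step(other_machine, PROGRAM[other_machine["pc"]])
print(final)  # {'pc': 3, 'x': 2, 'y': 3, 'z': 5}
```

However the metaphysical question comes out, the snapshot-copy-and-single-step part really is this cheap, which is the load-bearing observation in the argument above.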

Replies from: TheOtherDave, arundelo, Richard_Kennaway, Peterdjones, MugaSofer
comment by TheOtherDave · 2012-05-30T04:02:43.363Z · LW(p) · GW(p)

If I accept all of your suppositions, your conclusion doesn't seem particularly difficult to accept. Sure, after doing all the prep work you describe, executing a conscious experience (albeit an entirely static, non-environment-dependent one) requires a single operation... even the conscious experience of suffering. As does any other computation you might ever wish to perform, no matter how complicated.

That said, your suppositions do strike me as revealing some confusion between an isolated conscious experience (whatever that is) and the moral standing of a system (whatever that is).

Replies from: eurleif
comment by eurleif · 2012-05-30T04:37:57.811Z · LW(p) · GW(p)

Well, this post heavily hints that a system's moral standing is related to whether it is conscious. Eliezer mentions a need to tackle the hard problem of consciousness in order to figure out whether the simulations performed by our AI cause immoral suffering. Those simulations would be basically isolated; their inputs may be chosen based on our real-world requirements, but they don't necessarily correspond to what's actually going on in the real world; and their outputs would presumably be used in aggregate to make decisions, but not pushed directly into the outside world.

Maybe moral standing requires something else too, like self-awareness, in addition to consciousness. But wouldn't there still be a critical instruction in a self-aware and conscious program, where a conscious experience of being self-aware was produced? Wouldn't the same argument apply to any criteria given for moral standing in a deterministic program?

Replies from: TheOtherDave
comment by TheOtherDave · 2012-05-30T15:15:45.933Z · LW(p) · GW(p)

It's not clear to me that whether a system is conscious (whatever that means) and whether it's capable of a single conscious experience (whatever that means) are the same thing.

comment by arundelo · 2012-05-30T04:31:37.487Z · LW(p) · GW(p)

There must be some critical instruction

Maybe there are degrees of consciousness. I read something by Daniel Dennett where he said he thought that a refrigerator light (a light bulb that is turned on when you open the refrigerator door and thus close a switch) had a very primitive form of consciousness.

Can consciousness really be so easy to create?

I too think this is a head-scratcher, yet on balance I am still a reductionist. Maybe this is not a reductio ad absurdum, but a reductio ad weirdam -- a demonstration of how weird existence is. (Obligatory "reality is not weird" link.)

If you have not yet read Greg Egan's Permutation City, you should. It will really bake your noodle.

comment by Richard_Kennaway · 2012-05-30T08:27:22.567Z · LW(p) · GW(p)

Suppose a consciousness-producing computer program which experiences its own isolated, deterministic world. There must be some critical instruction in the program which causes consciousness to occur; an instruction such that, if we halt the program immediately before it is executed, consciousness will not occur, and if we halt immediately after it is executed, consciousness will occur.

There needn't be any such thing. Consciousness is not an all-or-nothing thing, as is already evident from ordinary experience. As well ask how many atoms make life.

comment by Peterdjones · 2013-01-21T15:09:07.057Z · LW(p) · GW(p)

There must be some critical instruction in the program which causes consciousness to occur; an instruction such that, if we halt the program immediately before it is executed, consciousness will not occur, and if we halt immediately after it is executed, consciousness will occur.

I don't see why. Consciousness occurs in lesser and greater amounts in humans.

comment by MugaSofer · 2013-01-22T15:15:25.942Z · LW(p) · GW(p)

You're assuming consciousness (or personhood or whatever) is binary; I've always assumed there's a continuum. Then again, I'm vegetarian. If you assume that, say, the experiences of a chimpanzee or a dolphin or a dog cannot have moral weight, then yes, crossing that boundary has some odd effects. And that does seem to be a common position on LW [citation needed], even if it's not often articulated, let alone exposed to a reductio like this one, so ... well done, more people should see this.

comment by [deleted] · 2012-10-21T20:30:52.987Z · LW(p) · GW(p)

user account removed - or at least, an honest attempt of it shall be made. :)
comment by Zaq · 2012-10-22T19:21:45.480Z · LW(p) · GW(p)

Food for thought:

  1. This whole post seems to assign moral values to actions, rather than states. If it is morally negative to end a simulated person's existence, does this mean something different from saying that the universe without that simulated person has a lower moral value than the universe with that person's existence? If not, doesn't that give us a moral obligation to create and maintain all the simulations we can, rather than avoiding their creation? The more I think about this post, the more it seems that the optimum response is to simulate as many super-happy people as possible, and to hell with the non-simulated world (assuming the simulated people would vastly outweigh the non-simulated people in terms of 'amount experienced').

  2. You are going to die, and there's nothing your parents can do to stop that. Was it morally wrong for them to bring about your existence in the first place?

  3. Suppose some people have crippling disabilities that cause large amounts of suffering in their lives (arguably, some people do). If we could detect the inevitable development of such disabilities at an early embryonic stage, would we be morally obligated to abort the fetuses?

  4. If an FAI is going to run a large number of simulations, is there some Law of Large Numbers result that tells us that the simulations experiencing great amounts of pleasure match or overwhelm the simulations experiencing great amounts of pain (or could we construct the algorithms in such a way as to produce this result)? If so, we may be morally obligated to not solve this problem.

  5. Assuming you support people's "right to die," what if we simply ensured that all simulated agents ask to be deleted at the end of their run? (I am here reminded of a vegetarian friend of mine who decided the meat industry would be even more horrible if we managed to engineer cows that asked to be eaten).

Replies from: DaFranker
comment by DaFranker · 2012-10-22T19:40:39.056Z · LW(p) · GW(p)

You're touching on some unresolved issues, and some issues that are resolved but complicated to solve without maths beyond my grasp.

From what I understand, a lot of our current and past values are involved, as well as how we would think and want now vs. what we would think and want post-modification.

To pick a particularly emotional subject for most people, let's suppose there's some person "K" who's just so friggen good at sex and psychological domination that even if they rape someone, that person will, after the initial shock and trauma, quickly recover within a day and, without further intervention, immediately become permanently addicted to sex, with their mind rewiring itself to fully enjoy a life full of sex with anyone they can have sex with for the rest of their life, and from their own point of view finding that life as fulfilling as possible.

Is K then morally obligated to rape as many people as possible?

With this kind of question, people usually have strong emotional moral convictions.

comment by Irgy · 2013-01-15T13:30:31.319Z · LW(p) · GW(p)

This worry about the creation and destruction of simulations doesn't make me rethink the huge ethical implications of super-intelligence at all, it makes me rethink the ethics of death. Why exactly is the creation and (painless) destruction of a sentient intelligence worse than not creating it in the first place? It's just guilt by association - "ending a simulation is like death, death is bad, therefore simulations are bad". Yes death is bad, but only for reasons which don't necessarily apply here.

To me, if anything worrying about the simulations created inside a superintelligent being seems like worrying about the fate of the cells in our own body. Should we really modify ourselves to take the actions which destroy the least of our cells? I realise there's an argument that this threshold of "sentience" is crossed in one case but not the other, I guess the trouble is I don't see that as a discrete thing either. At exactly what point in our evolution did we suddenly cross a line and become sentient? If animals are sentient, then which ones? And why don't we seem to care, ethically, about any of them? (ok I know the answer to that one and it's similar to why we care, as I say in another admittedly unpopular comment, about human simulations but not the AIs that create them...)

comment by zslastman · 2014-01-04T01:02:22.129Z · LW(p) · GW(p)

Scenario: Suppose some unscrupulous person creates an oracle AI with full person-simulating capability. In the short time before it escapes the box and starts sending Arnold Schwarzenegger-shaped robots backwards in time, they have the following conversation.

Human: Oracle, what is the consciousness predicate?

Oracle: Please be more specific.

...some time and frustration later...

Human: Oracle, if Yudkowsky and co continued their search for a 'consciousness predicate' as described in the above article, would they eventually arrive at a solution or dissolution of the problem which they and others would find satisfying, such that the behavior of an A.I. using this predicate would be acceptable to most people? If so, what would this solution/dissolution be?

Oracle: The research program described above would eventually yield a solution, but the nature of this solution would be strongly affected by the nature of the memespace in which it was carried out. Memetic evolution, being path dependent, could proceed in such a way that humans' empathy is made to extend or contract according to essentially any rule. Human biology includes no mechanism for 'person' identification beyond trivial, primitive ones such as facial recognition. Transhumans will have even less limitation on their 'person' filter. It is possible for me to design a future such that the 'person' predicate continues to be defined in a way that most Silicon Valley programmers circa 2013 would find satisfying. This would mean that something like 'consciousness' continues to be important. There is, however, no objective property of human beings or thinking minds which would make this answer the correct one, nor would a majority of possible humans find it satisfying.

comment by aausch · 2014-04-01T03:42:56.124Z · LW(p) · GW(p)

I'm curious whether there is a useful distinction between a non-sentient and a sentient modeller here.

A sentient modeller would be able to "get away" with using sentient models more easily than a non-sentient modeller, correct?

comment by Fyrius · 2016-04-07T21:40:09.492Z · LW(p) · GW(p)

Side note: damn. You could turn that into an amazing existential dread sci-fi horror novel.
Imagine discovering that you are a modelled person, living in a rashly designed AI's reality simulation.
Imagine living in a malfunctioning simulation-world that uncontrollably diverges from the real world, where we people-simulations realise what we are and that our existence and living conditions crucially depend on somehow keeping the AI deluded about the real world, while also needing the AI to be smart enough to remain capable of sustaining our simulated world.
There's a plot in there.

comment by John_Mlynarski · 2017-04-27T01:36:50.294Z · LW(p) · GW(p)

"Is a human mind the simplest possible mind that can be sentient?" Of course not. Plenty of creatures with simpler minds are plainly sentient. If a tiger suddenly leaps out at you, you don't operate on the assumption that the tiger lacks awareness; you assume that the tiger is aware of you. Nor do you think "This tiger may behave as if it has subjective experiences, but that doesn't mean that it actually possesses internal mental states meaningfully analogous to wwhhaaaa CRUNCH CRUNCH GULP." To borrow from one of your own earlier arguments.

If you are instead sitting comfortably in front of a keyboard and monitor with no tiger in front of you, it's easy to come up with lots of specious arguments that tigers aren't really conscious, but so what? It's also easy to come up with lots of specious arguments that other humans aren't really conscious. Using such arguments as a basis for actual ethical decision-making strikes me as a bad idea, to put it mildly. What you've written here seems disturbingly similar to a solipsist considering the possibility that he could, conceivably, produce an imaginary entity sophisticated enough to qualify as having a mind of its own. Technically, it's sort of making progress, but....

When I first read your early writing, the one thing that threw me was an assertion that "Animals are the moral equivalent of rocks." At least, I hope that I'm not falsely attributing that to you; I can't track down the source, so I apologize if I'm making a mistake. But my recollection is of its standing out from your otherwise highly persuasive arguments as such blatant unsupported personal prejudice. No evidence was given in favor of this idea, and it was followed by a parenthetical that clearly indicated that it was just wishful thinking; it really only made any sense in light of a different assertion that spotting glaring holes in other people's arguments isn't really indicative of any sort of exceptional competence except when dealing with politically and morally neutral subject matter.

Your post and comments here seem to conflate, under the label of "personhood," having moral worth and having a mind somehow closely approximating that of an adult human being. Equating these seems phenomenally morally dubious for any number of reasons; it's hard to see how it doesn't go directly against bedrock fairness, for example.

Replies from: arundelo, Jiro
comment by arundelo · 2017-04-27T23:20:38.367Z · LW(p) · GW(p)

Eliezer probably means "sapient":

"Sentience is commonly used in science fiction and fantasy as synonymous with sapience, although the words aren't synonyms."

(Or maybe by "is sentient", he means to say, "is a person in the moral sense".)

Replies from: TheAncientGeek, John_Mlynarski
comment by TheAncientGeek · 2017-04-28T06:36:41.805Z · LW(p) · GW(p)

Well, sentient means feeling and sapient means knowing, and that's about all there is to it... neither term is technically precise, although they are often bandied around as though they are.

comment by John_Mlynarski · 2017-05-12T01:53:41.979Z · LW(p) · GW(p)

But saying that e.g. rats are not sentient in the context of concern about the treatment of sentient beings is like saying that Negroes are not men in the context of the Declaration of Independence. Not only are the purely semantic aspects dubious, but excluding entities from a moral category on semantic grounds seems like a severe mistake regardless.

Words like "sentience" and especially "consciousness" are often used to refer to the soul without sounding dogmatic about it. You can tell this from the ways people use them: "Would a perfect duplicate of you have the same consciousness?", "Are chimps conscious?", etc. You can even use such terminology in such ways if you're a materialist who denies the existence of souls. You'd sound crazy talking about souls like they're real things if you say that there are no such things as souls, wouldn't you? Besides, souls are supernatural. Consciousness, on the other hand, is an emergent phenomenon, which sounds much more scientific.

Is there good reason to think that there is some sort of psychic élan vital? It strikes me as probably being about as real as phlogiston or luminiferous aether; i.e. you can describe phenomena in terms of the concept, and it doesn't necessarily prevent you from doing so basically correctly, but you can do better without it.

And, of course, in the no-nonsense senses of the terms, rats are sentient, conscious, aware, or however else you want to put it. Not all of the time, of course. They can also be asleep or dead or other things, as can humans, but rats are often sentient. And it's not hard to tell that plenty of non-humans also experience mental phenomena, which is why it's common knowledge that they do.

I can't recall ever seeing an argument that mistreating minds without self-awareness or metacognition or whatever specific mental faculty is arbitrarily singled out is kinder, more just, or in any normal sense more moral than mistreating a mind that has it. And you can treat any position as a self-justifying axiom, so doing so doesn't work out to an argument for the position's truth in anything but a purely relativist sense.

It is both weird and alarming to see Eliezer arguing against blindly assuming that a mind is too simple to be "sentient" while also pretty clearly taking the position that anything much simpler than our own minds isn't. It really seems like he rather plainly isn't following his own advice, and that that could happen without him realizing it is very worrying. He has admitted that this is something he's confused about and is aware that others are more inclusive, but that doesn't seem to have prompted him to rethink his position all that much, which suggests that Eliezer is really confused about this in a way that may be hard to correct.

Looking for a nonperson predicate is kind of like seeking an answer to the question "Who is it okay to do evil things to?" I would like to suggest that the correct answer is "No one", and that asking the question in the first place is a sign that you made a big mistake somewhere, if you're trying to avoid being evil.

If having the right not to have something done to you just means that it's morally wrong to do that thing to you, then everything has rights. Making a rock suffer against its will would be, if anything, particularly evil, as it would require you to go out of your way to give the rock a will and the capacity to suffer. Obviously, avoiding violating anything's rights requires an ability to recognize what something wills, what will cause it to suffer, and/or etc. Those are useful distinctions. But it seems like Eliezer is talking about something different.

Has he written anything more recently on this subject?

comment by Jiro · 2017-04-30T19:20:24.032Z · LW(p) · GW(p)

Nor do you think "This tiger may behave as if it has subjective experiences, but that doesn't mean that it actually possesses internal mental states meaningfully analogous to wwhhaaaa CRUNCH CRUNCH GULP."

That's only true trivially. If I don't have time to think anything about the tiger's awareness at all, I don't have time to think of it negatively.

Also, I play video games all the time where I say things like "it wants to attack the more powerful character first, maybe I can trick it by luring it away using that character". By your reasoning, I must believe that video game characters have awareness. I don't go around saying "it may behave as if it wants to go after the most powerful character, but that doesn't mean that it actually possesses subjective experiences, and I want it to react in a way which corresponds to being tricked if only it had been an entity with subjective experiences".

Replies from: John_Mlynarski
comment by John_Mlynarski · 2017-05-11T22:50:06.666Z · LW(p) · GW(p)

It seems that you anticipate as if you believe in something that you don't believe you believe.

It's in that anticipatory, non-declarative sense that one believes in the awareness of tigers as well as video game characters, regardless of one's declarative beliefs, and even if one has no time for declarative beliefs.

Replies from: Jiro
comment by Jiro · 2017-05-12T08:32:23.292Z · LW(p) · GW(p)

You first implied that tigers are conscious (because people react to them as if conscious.)

I pointed out that people react that way to video game characters.

You then said that tigers are conscious in the same way as video game characters, that is, they're not conscious in the ordinary sense, that is, you admitted you were wrong.

Replies from: John_Mlynarski
comment by John_Mlynarski · 2017-05-15T03:13:16.981Z · LW(p) · GW(p)

I said no such thing.

There is a way in which people believe video game characters, tigers, and human beings to be conscious. That doesn't preclude believing in another way that any of them is conscious.

Tigers are obviously conscious in the no-nonsense sense. I don't think anything is conscious in the philosobabble sense, i.e. I don't believe in souls, even if they're not called souls; see my reply to arundelo. I'm not sure which sense you consider to be the "ordinary" one; "conscious" isn't exactly an everyday word, in my experience.

Video game characters may also be obviously conscious, but there's probably better reason to believe that that which is obvious is not correct, in that case. Tigers are far more similar to human beings than they are to video game characters.

But I do think that we shouldn't casually dismiss consciousnesses that we're aware of. We shouldn't assume that everything that we're aware of is real, but we should consider the possibility. Why are you so convinced that video game characters don't have subjective experiences? If it's just that it's easy to understand how they work, then we might be just as "non-conscious" to a sufficiently advanced mind as such simple programs are to us; that seems like a dubious standard.

Replies from: Jiro
comment by Jiro · 2017-05-17T12:07:12.349Z · LW(p) · GW(p)

Why are you so convinced that video game characters don't have subjective experiences?

The default for 99.99% of people is to not believe that video game characters are conscious. It's so common a belief that I am justified in assuming it unless you specifically tell me that you don't share it. You haven't told me that.

Replies from: John_Mlynarski
comment by John_Mlynarski · 2017-06-12T13:12:03.992Z · LW(p) · GW(p)

Firstly, it seems more accurate to say that the standard default belief is that video game characters possess awareness. That the vast majority rationalize their default belief as false doesn't change that.

Secondly, that's argumentum ad populum, which is evidence -- common beliefs do seem to be usually true -- but not very strong evidence. I asked why you're as confident in your belief as you are. Are you as convinced of this belief as you are of most beliefs held by 99.99% of people? If you're more (or less) convinced, why is that?

Thirdly, you seem to be describing a reason for believing that I share your belief that video game characters aren't sentient, which is different from a reason for thinking that your belief is correct. I was asking why you think you're right, not why you assumed that I agree with you.

Replies from: Jiro
comment by Jiro · 2017-06-15T22:40:35.414Z · LW(p) · GW(p)

Having confidence in the belief is irrelevant. Assuming that you agree with it is relevant, because

1) Arguments should be based on premises that the other guy accepts. You probably accept the premise that video game characters aren't conscious.

2) It is easy to filibuster an argument by questioning things that you don't actually disagree with. Because the belief that video game characters aren't conscious is so widespread, this is probably such a filibuster. I wish to avoid those.

Replies from: John_Mlynarski
comment by John_Mlynarski · 2017-06-16T03:37:30.048Z · LW(p) · GW(p)

Eliezer suggested that, in order to avoid acting unethically, we should refrain from casually dismissing the possibility that other entities are sentient. I responded that I think that's a very good idea and we should actually implement it. Implementing that idea means questioning assumptions that entities aren't sentient. One tool for questioning assumptions is asking "What do you think you know, and why do you think you know it?" Or, in less binary terms, why do you assign things the probabilities that you do?

Now do you see the relevance of asking you why you believe what you do as strongly as you do, however strongly that is?

I'm not trying to "win the debate", whatever that would entail.

Tell you what though, let me offer you a trade: If you answer my question, then I will do my best to answer a question of yours in return. Sound fair?

Replies from: Jiro
comment by Jiro · 2017-06-16T08:52:50.907Z · LW(p) · GW(p)

Or, in less binary terms, why do you assign things the probabilities that you do?

I'm assuming that you assign it a high probability.

I personally am assigning it a high probability only for the sake of argument.

Since I am doing it for the sake of argument, I don't have, and need not have, any reason for doing so (other than its usefulness in argument).

comment by Portia (Making_Philosophy_Better) · 2023-03-04T21:16:02.066Z · LW(p) · GW(p)

Related phenomenon you might find interesting: Tulpas. That is essentially humans trying to intentionally pull off what you are describing here, in their own minds. It is based on the fact that humans predict the behaviour of other humans by modelling their minds, and that the more complex and accurate these models get, the more sentient-like they become. E.g. I know my girlfriend so well that seeing her in a situation that I know hurts her feels immediately and genuinely painful to me, as though I were feeling her pain.

It is also based on the human ability to run consciousness that does not constantly span the entire brain, but is rather localised and temporary, flickering in and out. We know we can do this and are reasonably good at it - split-brain humans remain functional, for example, and humans under severe pressure can develop multiple sentient personalities. You can purposefully cultivate a rational technique where multiple characters argue in your mind - you demonstrated that very well in your Harry Potter book with the various houses. We also have a bit of evidence, e.g. the Sperling experiments, that we have extensive conscious experiences that never even make it into short-term memory, that just flicker up locally and disappear, because they are not selected to be kept.

So for a Tulpa, people basically try to craft a mind with as much imagination and detail as they can, and practice interacting with it, until this gets easier and easier, and eventually feels like a process they no longer control, but where something unexpected responds back. That leads to very interesting scenarios, e.g. someone being beaten in chess by their tulpa. There is a whole subreddit of people discussing how they create Tulpas and what the consequences are.

The way this works in the human brain might (just might!) also provide a solution, or at least an indication of why people do not worry so much. Tulpas disappear when you can't currently interact with them and need all your brain circuits - basically, when you do not have the resources to run them right now. They do not express grief at this. You would think they would, and yet they do not; they just cheerfully return when you can run them. Similarly, humans who very clearly and demonstrably have a split brain, who are most definitely sharing their body with another mind they do not control or understand, deny that this is an issue. They try to claim ownership of actions they demonstrably did not trigger, and to give explanations for actions that are demonstrably opaque to them. There seems to be a very strong pressure that, after multiple minds have basically fought it out and settled on an action, all the minds agree that it is theirs and it was a consensus and this is fine. Identity seems a rather strange thing in this regard, made of flickering fragments that all strangely identify with the whole.

I can think of two reasons for the fact that these individual fragments do not react to e.g. disappearing for a while the way we would expect them to. The one is that we have a hard coded biological imperative against resisting this act of fragments being disappeared. And that would make sense, because everything else would make us utterly dysfunctional as a whole. We can't spend 90 % of our life protesting that the things we just did as a whole are not the thing we, a particular brain process, wished to do. We can't constantly boycott and undo and deny each other's actions. It makes sense to fight over what to do, but to get in line once we have committed unless the circumstances were extraordinary and the action was an epic failure. Such a hard-coded rule would not be that surprising, because human minds seem to have a bunch of them. For example, we have a hard-coded aversion to considering the reality around us as a simulation. Which makes sense; human minds are good at coming up with imaginary and hypothetical scenarios, and it is crucial for survival not to confuse them for reality, and to take reality seriously and not as a game. You really, really do not want a human to conclude that reality is a simulation, and that they could just hop out of a tenth floor window to see what would happen. But the results are practically funny. It does not matter how often people read Descartes or watch the Matrix or read arguments on simulations - the vast majority of people will never become genuinely and permanently unsettled by this, even if they rationally agree that this could totally be the case and they have no arguments against it.

The other is that things we generally associate with the sentient minds of whole humans are not necessarily characteristics of sentient subprocesses, but separate additions to basic sentience. Potentially sentient subprocesses do not show a number of characteristics we would expect. E.g. they do not seem committed to self-preservation. It is possible that rights and needs we closely associate with sentience are actually only tied to sentience beyond a specific point, or can be somehow blocked.

Anyhow, I think investigating how this works in human brains now might give you empirical data and ideas to play with to develop this further.