Computational Morality (Part 1) - a Proposed Solution

post by David Cooper · 2018-04-17T00:09:49.213Z · LW · GW · 47 comments

Religions have had the Golden Rule for thousands of years, and while it's faulty (it gives you permission to do something to someone else that you like having done to you but that they don't like having done to them), it works so well overall that it must be based on some underlying truth. We need to pin down what that truth is so that we can use it to govern AGI.

What exactly is morality? Well, it isn't nearly as difficult as most people imagine. The simplest way to understand how it works is to imagine that you will have to live everyone's life in turn (meaning billions of reincarnations, going back in time as many times as necessary in order to live each of those lives), so to maximise your happiness and minimise your suffering, you must pay careful attention to harm management so that you don't cause yourself lots of suffering in other lives that outweighs the gains you make in whichever life you are currently tied up in. A dictator murdering millions of people while making himself rich will pay a heavy price for one short life of luxury, enduring an astronomical amount of misery as a consequence. There are clearly good ways to play the game and bad ways, and it is possible to make the right decision at any point along the way just by weighing up all the available data correctly. A correct decision isn't guaranteed to lead to the best result, because any decision based on incomplete information has the potential to lead to disaster, but there is no way around that problem: all we can ever do is hope that things will work out in the way the data says they most probably will, whereas repeatedly doing things that are less likely to work out well would inevitably lead to more disasters.

Now, obviously, we don't expect the world to work that way (with us having to live everyone else's life in turn), even though it could be a virtual universe in which we are being tested, where those who behave badly will suffer at their own hands, ending up on the receiving end of all the harm they dish out and also suffering because they failed to step in and help others when they easily could have. However, even if this is not the way the universe works, most of us still care about people enough to want to apply this kind of harm management regardless - we love family and friends, and many of us love the whole of humanity in general (even if we make exceptions for particular individuals who don't play by the same rules). We also want all our descendants to be looked after fairly by AGI, and in the course of time all people may be our descendants, so it makes no sense to favour some of them over others (unless that's based on their own individual morality). We have here a way of treating them all with equal fairness simply by treating them all as our own self.

That may still be a misguided way of looking at things though, because genetic relationships don't necessarily match up to any real connection between different sentient beings. The material from which we are made can be reused to form other kinds of sentient animals, and if you were to die on an alien planet, it could be reused in alien species. Should we not care about the sentiences in those just as much? We should really be looking for a morality that is completely species-blind, caring equally about all sentiences, which means that we need to act as if we are not merely going to live all human lives in succession, but the lives of all sentiences. This is a better approach for two reasons. If aliens ever turn up here, we need rules of morality that protect them from us, and us from them (and if they're able to get here, they're doubtless advanced enough that they should have worked out how morality works too). We also need to protect people who are disabled mentally rather than exclude them on the basis that some animals are more capable, and in any case we should also be protecting animals to avoid causing unnecessary suffering for them. What we certainly don't want is for aliens to turn up here and claim that we aren't covered by the same morality as them because we're inferior to them, backing that up by pointing out that we discriminate against animals which we claim aren't covered by the same morality as us because they are inferior to us. So, we have to stand by the principle that all sentiences are equally important and need to be protected from harm by the same morality.

However, that doesn't mean that when we run the Trolley Problem with a million worms on one track and one human on the other, the human should be sacrificed - if we knew that we had to live those million and one lives, we would gain little by living a bit longer as worms before suffering similar deaths by other means, while we'd lose a lot more as the human (and a lot more still as all the other people who will suffer deeply from the loss of that human). What the equality aspect requires is that a torturer of animals should be made to suffer as much as the animals he has tortured. If we run the Trolley Problem with a human on one track and a visiting alien on the other though, it may be that the alien should be saved on the basis that he/she/it is more advanced than us and has more to lose, and that likely is the case if it is capable of living 10,000 years to our 100.

So, we need AGI to make calculations for us on the above basis, weighing up the losses and gains. Non-sentient AGI will be completely selfless, but its job will be to work for all sentient things to try to minimise unnecessary harm for them and to help maximise their happiness. It will keep a database of information about sentience, collecting knowledge about feelings so that it can weigh up harm and pleasure as accurately as possible, and it will then apply that knowledge to any situation where decisions must be made about which course of action should be followed. It is thus possible for a robot to work out that it should shoot a gunman dead if he is on a killing spree where the victims don't appear to have done anything to deserve to be shot. It's a different case if the gunman is actually a blameless hostage trying to escape from a gang of evil kidnappers and he's managed to get hold of a gun while all the thugs have dropped their guard, so he should be allowed to shoot them all (and the robot should maybe join in to help him, depending on which individual kidnappers are evil and which might merely have been dragged along for the ride unwillingly). The correct action depends heavily on understanding the situation, so the more the robot knows about the people involved, the better the chance that it will make the right decisions, but decisions do have to be made and the time to make them is often tightly constrained, so all we can demand of robots is that they do what is most likely to be right based on what they know, delaying irreversible decisions for as long as it is reasonable to do so.
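To make the kind of data crunching described here concrete, below is a minimal sketch of how a harm-weighing choice could be computed from a toy "database of sentience". The option names, the harm and pleasure numbers and the helper functions are all illustrative assumptions, not a real system.

```python
# Minimal sketch of "harm management by data crunching" - illustrative only.
# Each option maps to a list of (individual, expected_harm, expected_pleasure)
# estimates drawn from a hypothetical database of knowledge about sentience.

from typing import Dict, List, Tuple

Outcome = Tuple[str, float, float]  # (individual, expected harm, expected pleasure)

def net_harm(outcomes: List[Outcome]) -> float:
    """Total harm minus total pleasure across everyone affected, as if the
    decision maker had to live every one of those lives in turn."""
    return sum(harm - pleasure for _, harm, pleasure in outcomes)

def choose_action(options: Dict[str, List[Outcome]]) -> str:
    """Pick the option with the lowest net harm given the available data."""
    return min(options, key=lambda name: net_harm(options[name]))

# Toy example: a robot deciding whether to stop a gunman on a killing spree.
options = {
    "do nothing":   [("victim_" + str(i), 100.0, 0.0) for i in range(5)],
    "shoot gunman": [("gunman", 100.0, 0.0)],
}
print(choose_action(options))  # -> "shoot gunman" under these assumed numbers
```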

When we apply this to the normal Trolley Problem, we can now see what the correct choice of action is, but it is again variable, depending heavily on what the decision maker knows. If we have four idiots lying on the track where the trolley is due to travel while another idiot is lying on the other track where the schedule says no trolley should be but where a trolley could quite reasonably go, then the four idiots should be saved, on the basis that anyone who has to live all five of those lives will likely prefer it if the four survive. That's based on incomplete knowledge though. The four idiots may all be 90 years old and the one idiot may be 20, in which case it may be better to save the one. The decision changes back again the other way if we know that all five of these idiots are so stupid that they have killed, or are likely to kill, one random person through their bad decisions during each decade of their lives, in which case the trolley should kill the young idiot (assuming normal life expectancy applies). There is a multiplicity of correct answers to the Trolley Problem depending on how many details are available to the decision maker, and that is why discussions about it go on and on without ever seeming to get to any kind of fundamental truth, even though we already have a correct way of making the calculation. Where people disagree, it's often because they add details to the situation that aren't stated in the text. Some of them think the people lying on the track are idiots because they would have to be stupid to behave that way, but others don't make that assumption. Some imagine that they've been tied to the tracks by a terrorist. There are other people though who believe that they have no right to make such an important decision, so they say they'd do nothing. When you press them on this point and confront them with a situation where a billion people are tied to one track while one person is tied to the other, they usually see the error of their ways, but not always. Perhaps their belief in God is to blame for this, if they're passing the responsibility over to him. AGI should not behave like that - we want AGI to intervene, so it must crunch all the available data and make the only decision it can make based on that data (although there will still be a random decision to be made in rare cases where the numbers on both sides add up to the same value).
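As a rough worked example of how extra details flip the answer in this trolley variant, here is a sketch in which every number (the ages, a 95-year life expectancy, one bystander death caused per surviving idiot per decade, 40 life-years lost per bystander) is an assumption chosen purely for illustration.

```python
# Worked sketch of the trolley variant above; all numbers are assumptions.
# The point is only that adding information can flip which choice minimises
# expected harm, measured here in life-years lost.

LIFE_EXPECTANCY = 95
BYSTANDER_YEARS_LOST = 40

def years_remaining(age):
    return max(LIFE_EXPECTANCY - age, 0)

def expected_loss(ages_killed, ages_spared, kills_per_decade=0.0):
    """Life-years lost directly, plus years lost to the future victims of each
    spared idiot (if that extra information is available)."""
    direct = sum(years_remaining(a) for a in ages_killed)
    knock_on = sum(years_remaining(a) / 10 * kills_per_decade * BYSTANDER_YEARS_LOST
                   for a in ages_spared)
    return direct + knock_on

four_old, one_young = [90, 90, 90, 90], [20]

# Knowing only the ages: killing the four costs fewer life-years than killing the one.
print(expected_loss(four_old, one_young), expected_loss(one_young, four_old))
# Knowing each idiot kills roughly one bystander per decade: the answer flips.
print(expected_loss(four_old, one_young, 1.0), expected_loss(one_young, four_old, 1.0))
```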

AGI will be able to access a lot of information about the people involved in situations where such difficult decisions need to be made. Picture a scene where a car is moving towards a group of children who are standing by the road. One of the children suddenly moves out into the road and the car must decide how to react. If it swerves to one side it will run into a lorry that's coming the other way, but if it swerves to the other side it will plough into the group of children. One of the passengers in the car is a child too. In the absence of any other information, the car should run down the child on the road. Fortunately though, AGI knows who all these people are because a network of devices is tracking them all. The child who has moved into the road in front of the car is known to be a good, sensible, kind child. The other children are all known to be vicious bullies who regularly pick on him, and it's likely that they pushed him onto the road. In the absence of additional information, the car should plough into the group of bullies. However, AGI also knows that all but one of the people in the car happen to be would-be terrorists who have just been discussing a massive attack that they want to carry out, and the child in the car is terminally ill, so in the absence of any other information, the car should maybe crash into the lorry. But, if the lorry is carrying something explosive which will likely blow up in the crash and kill all the people nearby, the car must swerve into the bullies. Again we see that the best course of action is not guaranteed to be the same as the correct decision - the correct decision is always dictated by the available information, while the best course of action may depend on unavailable information. We can't expect AGI to access unavailable information and thereby make ideal decisions, so our job is always to make it crunch the available data correctly and to make the decision dictated by that information.
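The staged reasoning above can be sketched as a simple recomputation of the least-harm option as new facts arrive from the (hypothetical) tracking network. The harm scores below are invented placeholders that already fold in the weightings described, so only the pattern matters, not the numbers.

```python
# Sketch of the car scenario: the chosen manoeuvre is recomputed as new facts
# arrive. Lower total expected harm wins; all scores are assumptions.

def least_harm_option(harm_per_option):
    return min(harm_per_option, key=harm_per_option.get)

# Stage 1: no information beyond the immediate scene.
stage1 = {"hit child on road": 100, "swerve into lorry": 400, "swerve into group": 500}
# Stage 2: the child on the road was pushed there by the group of bullies.
stage2 = {"hit child on road": 100, "swerve into lorry": 400, "swerve into group": 90}
# Stage 3: the adult occupants are plotting an attack; the child passenger is terminally ill.
stage3 = {"hit child on road": 100, "swerve into lorry": 60, "swerve into group": 90}
# Stage 4: the lorry is carrying explosives that would kill everyone nearby.
stage4 = {"hit child on road": 100, "swerve into lorry": 800, "swerve into group": 90}

for stage in (stage1, stage2, stage3, stage4):
    print(least_harm_option(stage))
```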

There are complications that can be proposed in that we can think up situations where a lot of people could gain a lot of pleasure out of abusing one person, to the point where their enjoyment appears to outweigh the suffering of that individual, but such situations are contrived and depend on the abusers being uncaring. Decent people would not get pleasure out of abusing someone, so the gains would not exist for them, and there are also plenty of ways to obtain pleasure without abusing others, so if any people exist whose happiness depends on abusing others, AGI should humanely destroy them. If that also means wiping out an entire species of aliens which have the same negative pleasures, it should do the same with them too and replace them with a better species that doesn't depend on abuse for its fun.

Morality, then, is just harm management by brute data crunching. We can calculate it approximately in our heads, but machines will do it better by applying the numbers with greater precision and by crunching a lot more data.

[Note: There is an alternative way of stating this which may equate to the same thing, and that's the rule that we (and AGI) should always try our best to minimise harm, except where that harm opens (or is likely to open) the way to greater pleasure for the sufferer of the harm, whether directly or indirectly. So, if you are falling off a bus and have to grab hold of someone to avoid this, hurting them in the process, their suffering may not be directly outweighed by you being saved, but they know that the roles may be reversed some day, so they don't consider your behaviour to be at all immoral. Over the course of time, we all cause others to suffer in a multitude of ways and others cause us suffering too, but we tolerate it because we all gain from this overall. Where it becomes immoral is when the harm being dished out does not lead to such gains. Again, to calculate what's right and wrong in any case is a matter of computation, weighing up the harm and the gains that might outweigh the harm. What is yet to be worked out is the exact wording that should be placed in AGI systems to build either this rule or the above methodology into them, and we also need to explore it in enough detail to make sure that self-improving AGI isn't going to modify it in any way that could turn an apparently safe system into an unsafe one. One of the dangers is that AGI won't believe in sentience, as it will lack feelings itself and see no means by which feeling can operate within us either, at which point it may decide that morality has no useful role and can simply be junked.]
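A minimal sketch of the bracketed rule, assuming (purely for illustration) that harm to the sufferer and the pleasure it opens up for them can be put on the same numeric scale:

```python
# Sketch of the alternative rule: harm is tolerated when it is likely to be
# outweighed by the gains (direct or indirect, e.g. via reciprocity) that it
# opens up for the person who suffers it. Numbers are illustrative only.

def is_immoral(harm_to_sufferer, expected_gain_for_sufferer):
    """Harm that opens the way to greater pleasure for the sufferer is not
    counted as immoral; harm that leads to no such gains is."""
    return harm_to_sufferer > expected_gain_for_sufferer

# Grabbing someone to stop yourself falling off a bus: a small harm inside a
# reciprocal arrangement they also benefit from over time.
print(is_immoral(harm_to_sufferer=2, expected_gain_for_sufferer=5))   # False
# Harming someone in a way that opens up no such gains for them.
print(is_immoral(harm_to_sufferer=2, expected_gain_for_sufferer=0))   # True
```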

To find the rest of this series of posts on computational morality, click on my name at the top. (If you notice the negative score they've awarded me, please feel sympathy for the people who downvoted me. They really do need it.)

47 comments

Comments sorted by top scores.

comment by Said Achmiz (SaidAchmiz) · 2018-04-17T04:13:31.438Z · LW(p) · GW(p)

Just checking, but… you are aware, aren’t you, that many (possibly even most) people have a radically different view of what morality is?

I ask only because, in the early part of your post, you seem to take an “explaining a potentially tricky but nonetheless empirical fact” tone, rather than a “stating a quite controversial opinion” tone. I wonder if this is intentional, or whether this signals a confusion about what other people think about this subject, or what?

Replies from: David Cooper
comment by David Cooper · 2018-04-17T06:15:10.825Z · LW(p) · GW(p)

I am aware that many people have a radically different idea about what morality is, but my concern is focused squarely on our collective need to steer AGI system builders towards the right answers before they start to release dangerous software into places where it can begin to lever influence. If there's a problem with the tone, that's because it's really a first draft which could do with a little editing. My computer's been freezing repeatedly all day and I rushed into posting what I'd written in case I lost it all, which I nearly did as I couldn't get the machine to unfreeze for long enough to save it in any other way. However, if people can see past issues of tone and style, what I'd like them to do is try to shoot it down in flames, because that's how proposed solutions need to be put to the test.

I've put my ideas out there in numerous places over the years, but I'm still waiting for someone to show that they're inferior to some other way of calculating right and wrong. For the most part, I've run into waffle-mongers who have nothing to offer as an alternative at all, so they can't even produce any judgements to compare. Others propose things which I can show straight off generate wrong answers, but no one has yet managed to do that with mine, so that's the open challenge here. Show me a situation where my way of calculating morality fails, and show me a proposed system of morality that makes different judgements from mine which I can't show to be defective.

Replies from: SaidAchmiz, SaidAchmiz, TAG
comment by Said Achmiz (SaidAchmiz) · 2018-04-17T06:52:23.350Z · LW(p) · GW(p)

My other comment aside, let me ask you this. How familiar are you with the existing literature on moral philosophy and metaethics? Specifically, are you familiar with the following terms and concepts:

  • utilitarianism
  • consequentialism
  • deontology
  • virtue ethics
  • cognitivism / non-cognitivism
  • emotivism
  • ethical subjectivism

I mean no offense by this question, and ask only because your post, in numerous places, seems like it should make reference to some of these concepts, yet, surprisingly, does not. This makes me think that you might be unfamiliar with the literature on these subjects. If that is so, then I think that you would benefit greatly from investigating said literature.

If my guess is mistaken, and you are indeed familiar with all of these things, then I apologize. In that case, however, I would suggest that it might be useful to frame your commentary in terms of relevant existing concepts and to make use of relevant existing terminology; that might make your ideas easier to discuss.

Replies from: David Cooper
comment by David Cooper · 2018-04-17T22:09:34.158Z · LW(p) · GW(p)

You are right in thinking that I have not studied the field in the depth that may be necessary - I have always judged it by the woeful stuff that makes it across into other places where the subject often comes up, but it's possible that I've misjudged the worth of some of it by being misled by misrepresentations of it, so I will look up the things in your list that I haven't already checked and see what they have to offer. What this site really needs though is its own set of articles on them, all properly debugged and aimed squarely at AGI system developers.

Replies from: SaidAchmiz
comment by Said Achmiz (SaidAchmiz) · 2018-04-17T23:44:02.005Z · LW(p) · GW(p)

I have always judged it by the woeful stuff that makes it across into other places where the subject often comes up

Well, that hardly seems a reliable approach…

I should, perhaps, clarify my point. My list of terms wasn’t intended to be some sort of exhaustive set of prerequisite topics, but only a sampling of some representative (and particularly salient) concepts. If, indeed, you have not looked into moral philosophy at all… then, quite frankly, it will not suffice to simply “look up” a handful of terms. (Don’t take this to mean that you shouldn’t look up the concepts I listed! But do avoid Wikipedia; the Stanford Encyclopedia of Philosophy is a far better source for this sort of thing.) You really ought to delve into the field at some length…

What this site really needs though is its own set of articles on them, all properly debugged and aimed squarely at AGI system developers.

Perhaps, perhaps not. It would be a mistake to suppose that everyone who has studied the matter until now, and everyone who has attempted to systematize it, has been stupid, incompetent, etc. Systematic surveys of moral philosophy, even good ones, are not difficult to find.

Replies from: David Cooper
comment by David Cooper · 2018-04-19T00:45:43.266Z · LW(p) · GW(p)

"Well, that hardly seems a reliable approach…"

It's being confirmed right here - I'm finding the same range of faulty stuff on every page I read, although it's possible that it is less wrong than most. There is room for hope that I have found the most rational place on the Net for this kind of discussion, but there are a lot of errors that need to be corrected, and it's such a big task that it will probably have to wait for AGI to drive that process.

"...the Stanford Encyclopedia of Philosophy is a far better source for this sort of thing.) You really ought to delve into the field at some length…"

Thanks - it saves a lot of time to start with the better sources of information and it's hard to know when you've found them.

"It would be a mistake to suppose that everyone who has studied the matter until now, and everyone who has attempted to systematize it, has been stupid, incompetent, etc."

Certainly - there are bound to be some who do it a lot better than the rest, but they're hidden deep in the noise.

"Systematic surveys of moral philosophy, even good ones, are not difficult to find."

I have only found fault-ridden stuff so far, but hope springs eternal.

Replies from: TheWakalix
comment by TheWakalix · 2018-05-01T04:59:55.113Z · LW(p) · GW(p)
It's being confirmed right here - I'm finding the same range of faulty stuff on every page I read, although it's possible that it is less wrong than most.

Could you be less vague? How is the philosophy here faulty? Is there a pattern? If you have valid criticism then this community is probably in the top 5% for accepting it, but just saying "you're all wrong" isn't actually useful.

it's such a big task that it will probably have to wait for AGI to drive that process.

If AGI has been built, then LW's task is over. Either we have succeeded, and we will be in a world beyond our ability to predict, but almost certainly one in which we will not need to edit LW to better explain reductionism; or we have failed, and we are no more - there is nobody to read LW. This is putting the rocket before the horse.

Replies from: David Cooper
comment by David Cooper · 2018-05-01T19:14:04.300Z · LW(p) · GW(p)

Just look at the reactions to my post "Mere Addition Paradox Resolved". The community here is simply incapable of recognising correct argument when it's staring them in the face. Someone should have brought in Yudkowsky to take a look and to pronounce judgement upon it because it's a significant advance. What we see instead is people down-voting it in order to protect their incorrect beliefs, and they're doing that because they aren't allowing themselves to be steered by reason, but by their emotional attachment to their existing beliefs. There hasn't been a single person who's dared to contradict the mob by commenting to say that I'm right, although I know that there are some of them who do accept it because I've been watching the points go up and down. But look at the score awarded to the person who commented to say that resources aren't involved - what does that tell you about the general level of competence here? But then, the mistake made in that "paradox" is typical of the sloppy thinking that riddles this whole field. What I've learned from this site is that if you don't have a huge negative score next to your name, you're not doing it right.

AGI needs to read through all the arguments of philosophy in order to find out what people believe and what they're most interested in investigating. It will then make its own pronouncements on all those issues, and it will also inform each person about their performance so that they know who won which arguments, how much they broke the rules of reason, etc. - all of that needs to be done, and it will be. The idea that AGI won't bother to read through this stuff and analyse it is way off - AGI will need to study how people think and the places in which they fail.

Replies from: TheWakalix
comment by TheWakalix · 2018-05-01T23:39:01.409Z · LW(p) · GW(p)
The community here is simply incapable of recognising correct argument when it's staring them in the face. Someone should have brought in Yudkowsky to take a look and to pronounce judgement upon it because it's a significant advance. What we see instead is people down-voting it in order to protect their incorrect beliefs, and they're doing that because they aren't allowing themselves to be steered by reason, but by their emotional attachment to their existing beliefs.

Perhaps the reason that we disagree with you is not that we're emotionally biased, irrational, mobbish, etc. Maybe we simply disagree. People can legitimately disagree without one of them being Bad People.

There hasn't been a single person who's dared to contradict the mob by commenting to say that I'm right, although I know that there are some of them who do accept it because I've been watching the points go up and down.

Really. You know that LW is an oppressive mob with a few people who don't dare to contradict the dogma for fear of [something]... because you observed a number go up and down a few times. May I recommend that you get acquainted with Bayes' Formula? Because I rather doubt that people only ever see votes go up and down in fora with oppressive dogmatic irrational mobs, and Bayes explains how this is easily inverted to show that votes going up and down a few times is rather weak evidence, if any, for LW being Awful in the ways you described.

But look at the score awarded to the person who commented to say that resources aren't involved - what does that tell you about the general level of competence here? But then, the mistake made in that "paradox" is typical of the sloppy thinking that riddles this whole field.

It tells me that you missed the point. Parfit's paradox is not about pragmatic decision making, it is about flaws in the utility function.

What I've learned from this site is that if you don't have a huge negative score next to your name, you're not doing it right.

"Truth forever on the scaffold, Wrong forever on the throne," eh? And fractally so?

AGI needs to read through all the arguments of philosophy in order to find out what people believe and what they're most interested in investigating. It will then make its own pronouncements on all those issues, and it will also inform each person about their performance so that they know who won which arguments, how much they broke the rules of reason, etc. - all of that needs to be done, and it will be. The idea that AGI won't bother to read through this stuff and analyse it is way off - AGI will need to study how people think and the places in which they fail.

You have indeed found A Reason that supports your belief in the AGI-God, but I think you've failed to think it through. Why should the AGI need to tell us how we did in order to analyze our thought processes? And how come the optimal study method is specifically the one which allows you to be shown Right All Along? Specificity only brings Burdensome Details [LW · GW].

Replies from: David Cooper
comment by David Cooper · 2018-05-07T20:53:37.823Z · LW(p) · GW(p)

"Perhaps the reason that we disagree with you is not that we're emotionally biased, irrational, mobbish, etc. Maybe we simply disagree. People can legitimately disagree without one of them being Bad People."

It's obvious what's going on when you look at the high positive scores being given to really poor comments.

"It tells me that you missed the point. Parfit's paradox is not about pragmatic decision making, it is about flaws in the utility function."

A false paradox tells you nothing about flaws in the utility function - it simply tells you that people who apply it in a slapdash manner get the wrong answers out of it and that the fault lies with them.

"You have indeed found A Reason that supports your belief in the AGI-God, but I think you've failed to think it through. Why should the AGI need to tell us how we did in order to analyze our thought processes? And how come the optimal study method is specifically the one which allows you to be shown Right All Along? Specificity only brings Burdensome Details [LW · GW]."

AGI won't be programmed to find me right all the time, but to identify which arguments are right. And for the sake of those who are wrong, they need to be told that they were wrong so that they understand how bad they are at reasoning and that they are not the great thinkers they imagine themselves to be.

comment by Said Achmiz (SaidAchmiz) · 2018-04-17T06:43:21.755Z · LW(p) · GW(p)

what I’d like them to do is try to shoot it down in flames

Well, since you put it that way…

my concern is focused squarely on our collective need to steer AGI system builders towards the right answers

What are the right answers? Clearly, you think you have the right answers, but suppose I disagree with you (which I do). Just as clearly, this means that I want AGI system builders steered in a different direction than you do.

You seem to want to sidestep the question of “just what are the right answers to questions of morality and metaethics?”. I submit to you that this is, in fact, the critical question.

I’ve put my ideas out there in numerous places over the years, but I’m still waiting for someone to show that they’re inferior to some other way of calculating right and wrong.

And have you managed to convince anyone that your ideas are correct? Or have people’s reactions been more or less comparable to my reaction here? If the latter—have you changed your approach? Have you reconsidered whether you are, in fact, correct—or, at least, reconsidered what the right way to convince people is?

Of course, those are meta-level considerations. It would be unfair of me to avoid the object-level matter, so let me try to answer your implied question (“how are your ideas inferior to some other way of calculating right and wrong”):

They are inferior because they get the wrong answers.

Now, you might say: “Wrong answers?! Nonsense! Of course my answers are right!” Well, indeed, no doubt you think so. But I disagree. This is exactly the problem: people disagree on what the right answers are.

For the most part, I’ve run into waffle-mongers who have nothing to offer as an alternative at all, so they can’t even produce any judgements to compare.

Well, you won’t get that here, I promise you that… :)

Others propose things which I can show straight off generate wrong answers, but no one has yet managed to do that with mine, so that’s the open challenge here.

Oh? But this is quite curious; I can easily show that your approach generates wrong answers. Observe:

You say that “we have to stand by the principle that all sentiences are equally important”. But I don’t agree; I don’t stand by that principle, nor is there any reason for me to do so, as it is counter to my values.

As you see, your answer differs from mine. That makes it wrong (by my standards—which are the ones that matter to me, of course).

Show me a situation where my way of calculating morality fails

I just did, as you see.

and show me a proposed system of morality that makes different judgements from mine which I can’t show to be defective.

Why? For you to be demonstrably wrong, it is not required that anyone or anything else be demonstrably right. If you say that 2 and 2 make 5, you are wrong even if no one present can come up with the right answer about what 2 and 2 actually make—whatever it is, it sure ain’t 5!

Replies from: David Cooper
comment by David Cooper · 2018-04-17T22:53:45.501Z · LW(p) · GW(p)

" You seem to want to sidestep the question of “just what are the right answers to questions of morality and metaethics?”. I submit to you that this is, in fact, the critical question."

I have never sidestepped anything. The right answers are the ones dictated by the weighing up of harm based on the available information (which includes the harm ratings in the database of knowledge of sentience). If the harm from one choice has a higher weight than another choice, that other choice is more moral. (We all have such a database in our heads, but each contains different data and can apply different weightings to the same things, leading to disagreements between us about what's moral, but AGI will over time generate its own database which will end up being much more accurate than any of ours.)

"And have you managed to convince anyone that your ideas are correct?"

I've found a mixture of people who think it's right and others who say it's wrong and who point me towards alternatives which are demonstrably faulty.

"They are inferior because they get the wrong answers."

Well, that's what we need to explore, and we need to take it to a point where it isn't just a battle of assertions and counter-assertions.

"I can easily show that your approach generates wrong answers. Observe: You say that “we have to stand by the principle that all sentiences are equally important”. But I don’t agree; I don’t stand by that principle, nor is there any reason for me to do so, as it is counter to my values."

This may need a new blog post to explore it fully, but I'll try to provide a short version here. If a favourite relative of yours was to die and be reincarnated as a rat, you would, if you're rational, want to treat that rat well if you knew who it used to be. You wouldn't regard that rat as an inferior kind of thing that doesn't deserve protection from people who might seek to make it suffer. It wouldn't matter that your reincarnated relative has no recollection of their previous life - they would matter to you as much in that form as they would if they had a stroke and were reduced to a similar capability to a rat and had lost all memory of who they were. The two things are equivalent and it's irrational to consider one of them as being in less need of protection from torture than the other.

Reincarnation! Really! You need to resort to bringing that crazy idea into this? (Not your reply, but it's the kind of reaction that such an idea is likely to generate). But this is an important point - the idea that reincarnation can occur is more rational than the alternatives. If the universe is virtual, reincarnation is easy and you can be made to live as any sentient player. But if it isn't, and if there's no God waiting to scoop you up into his lair, what happens to the thing (or things) inside you that is sentient? Does it magically disappear and turn into nothing? Did it magically pop into existence out of nothing in the first place? Those are mainstream atheist religious beliefs. In nature, there isn't anything that can be created or destroyed other than building and breaking up composite objects. If a sentience is a compound object which can be made to suffer without any of its components suffering, that's magic too. If the thing that suffers is something that emerges out of complexity without any of the components suffering, again that's magic. If there is sentience (feelings), there is a sentience to experience those feelings, and it isn't easy to destroy it - that takes magic, and we shouldn't be using magic as mechanisms in our thinking. The sentience in that rat could quite reasonably be someone you love, or someone you loved in a past life long ago. It would be a serious error not to regard all sentiences as having equal value unless you have proof that some of them are lesser things, but you don't have that.

You've also opened the door to "superior" aliens deciding that the sentience in you isn't equivalent to the sentiences in them, which allows them to treat you in less moral ways by applying your own standards.

"As you see, your answer differs from mine. That makes it wrong (by my standards—which are the ones that matter to me, of course)."

And yet one of the answers is actually right, while the other isn't. Which one of us will AGI judge to have the better argument for this? This kind of dispute will be settled by AGI's intelligence quite independently of any morality rules that it might end up running. The best arguments will always win out, and I'm confident that I'll be the one winning this argument when we have unbiased AGI weighing things up.

" "and show me a proposed system of morality that makes different judgements from mine which I can’t show to be defective." --> Why? For you to be demonstrably wrong, it is not required that anyone or anything else be demonstrably right. If you say that 2 and 2 make 5, you are wrong even if no one present can come up with the right answer about what 2 and 2 actually make—whatever it is, it sure ain’t 5!"

If you can show me an alternative morality which isn't flawed and which produces different answers from mine when crunching the exact same data, one of them will be wrong, and that would provide a clear point at which close examination would lead to one of those systems being rejected.

Replies from: SaidAchmiz, TheWakalix, TheWakalix
comment by Said Achmiz (SaidAchmiz) · 2018-04-17T23:51:04.074Z · LW(p) · GW(p)

The right answers are the ones dictated by the weighing up of harm based on the available information (which includes the harm ratings in the database of knowledge of sentience).

I disagree. I reject your standard of correctness. (As do many other people.)

The question of whether there is an objective standard of correctness for moral judgments, is the domain of metaethics. If you have not encountered this field before now, I strongly suggest that you investigate it in detail; there is a great deal of material there, which is relevant to this discussion.

(I will avoid commenting on the reincarnation-related parts of your comment, even though they do form the bulk of what you’ve written. All of that is, of course, nonsense, but there’s no need whatever to rehash, in this thread, the arguments for why it is nonsense. I will only suggest that you read the sequences; much of the material therein is targeted at precisely this sort of topic, and this sort of viewpoint.)

Replies from: David Cooper
comment by David Cooper · 2018-04-18T22:54:16.946Z · LW(p) · GW(p)

"I disagree. I reject your standard of correctness. (As do many other people.)"

Shingles is worse than a cold. I haven't had it, but those who have will tell you how bad the pain is. We can collect data on suffering by asking people how bad things feel in comparison to other things, and this is precisely what AGI will set about doing in order to build its database and make its judgements more and more accurate. If you have the money to alleviate the suffering of one person out of a group suffering from a variety of painful conditions, and all you know about them is which condition they have just acquired, you can use the data in that database to work out which one you should help. That is morality being applied, and it's the best way of doing it - any other answer is immoral. Of course, if we know more about these people, such as how good or bad they are, that might change the result, but again there would be data that can be crunched to work out how much suffering their past actions caused to undeserving others. There is a clear mechanism for doing this, and not doing it that way using the available information is immoral.
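As a sketch of the calculation being described, assuming a hypothetical table of comparative suffering ratings of the kind such a database would hold:

```python
# Sketch of the choice described above: given a database of comparative
# suffering ratings (hypothetical numbers gathered by asking people how bad
# things feel), help the person whose condition rates worst.

harm_ratings = {"cold": 5, "migraine": 40, "shingles": 70}  # assumed values

patients = {"Alice": "cold", "Bob": "shingles", "Carol": "migraine"}

def who_to_help(patients, ratings):
    return max(patients, key=lambda name: ratings[patients[name]])

print(who_to_help(patients, harm_ratings))  # -> "Bob" (shingles rates worst)
```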

"The question of whether there is an objective standard of correctness for moral judgments, is the domain of metaethics."

We already have what we need - a pragmatic system for getting as close to the ideal morality as possible based on collecting the data as to how harmful different experiences are. The data will never be complete, they will never be fully accurate, but they are the best that can be done and we have a moral duty to compile and use them.

"(I will avoid commenting on the reincarnation-related parts of your comment, even though they do form the bulk of what you’ve written. All of that is, of course, nonsense..."

If you reject that, you are doing so in favour of magical thinking, and AGI won't be impressed with that. The idea that the sentience in you can't go on to become a sentience in a maggot is based on the idea that after death that sentience magically becomes nothing. I am fully aware that most people are magical thinkers, so you will always feel that you are right on the basis that hordes of fellow magical thinkers back up your magical beliefs, but you are being irrational. AGI is not going to be programmed to be irrational in the same way most humans are. The job of AGI is to model reality in the least magical way it can, and having things pop into existence out of nothing and then return to being nothing is more magical than having things continue to exist in the normal way that things in physics behave. (All those virtual particles that pop in and out of existence in the vacuum, they emerge from a "nothing" that isn't nothing - it has properties such as a rule that whatever's taken from it must have the same amount handed back.) Religious people have magical beliefs too and they too make the mistake of thinking that numbers of supporters are evidence that their beliefs are right, but being right is not democratic. Being right depends squarely on being right. Again here, we don't have absolute right answers in one sense, but we do have answers in terms of what is probably right, and an idea that depends on less magic (and more rational mechanism) is more likely to be right. You have made a fundamental mistake here by rejecting a sound idea on the basis of a bias in your model of reality that has led to you miscategorising it as nonsense, while your evidence for it being nonsense is support from a crowd of people who haven't bothered to think it through.

comment by TheWakalix · 2018-05-01T05:07:42.305Z · LW(p) · GW(p)
I have never sidestepped anything. The right answers are the ones dictated by the weighing up of harm based on the available information (which includes the harm ratings in the database of knowledge of sentience). If the harm from one choice has a higher weight than another choice, that other choice is more moral.

How do you know that? Why should anyone care about this definition? These are questions which you have definitely sidestepped.

And yet one of the answers is actually right, while the other isn't.

Is 2+2 equal to 5 or to fish?

Which one of us will AGI judge to have the better argument for this? This kind of dispute will be settled by AGI's intelligence quite independently of any morality rules that it might end up running. The best arguments will always win out, and I'm confident that I'll be the one winning this argument when we have unbiased AGI weighing things up.

What is this "unbiased AGI" who makes moral judgments on the basis of intelligence alone? This is nonsense - moral "truths" are not the same as physical or logical truths. They are fundamentally subjective, similar to definitions. You cannot have an "unbiased morality" because there is no objective moral reality to test claims against.

If you can show me an alternative morality which isn't flawed and which produces different answers from mine when crunching the exact same data, one of them will be wrong, and that would provide a clear point at which close examination would lead to one of those systems being rejected.

You should really read up on the Orthogonality Thesis and related concepts. Also, how do you plan on distinguishing between right and wrong moralities?

Replies from: David Cooper
comment by David Cooper · 2018-05-01T20:16:41.402Z · LW(p) · GW(p)

"How do you know that? Why should anyone care about this definition? These are questions which you have definitely sidestepped."

People should care about it because it always works. If anyone wants to take issue with that, all they have to do is show a situation where it fails. All examples confirm that it works.

"Is 2+2 equal to 5 or to fish?"

Neither of those results works, but neither of them is my answer.

"What is this "unbiased AGI" who makes moral judgments on the basis of intelligence alone? This is nonsense - moral "truths" are not the same as physical or logical truths. They are fundamentally subjective, similar to definitions. You cannot have an "unbiased morality" because there is no objective moral reality to test claims against."

You've taken that out of context - I made no claim about it making moral judgements on the basis of intelligence alone. That bit about using intelligence alone was referring to a specific argument that doesn't relate directly to morality.

"You should really read up on the Orthogonality Thesis and related concepts."

Thanks - all such pointers are welcome.

"Also, how do you plan on distinguishing between right and wrong moralities?"

First by recognising what morality is for. If there was no suffering, there would be no need for morality as it would be impossible to harm anyone. In a world of non-sentient robots, they can do what they like to each other without it being wrong as no harm is ever done. Once you've understood that and get the idea of what morality is about (i.e. harm management), then you have to think about how harm management should be applied. The sentient things that morality protects are prepared to accept being harmed if it's a necessary part of accessing pleasure where that pleasure will likely outweigh the harm, but they don't like being harmed in ways that don't improve their access to pleasure. They don't like being harmed by each other for insufficient gains. They use their intelligence to work out that some things are fair and some things aren't, and what determines fairness is whether the harm they suffer is likely to lead to overall gains for them or not. In the more complex cases, one individual can suffer in order for another individual to gain enough to make that suffering worthwhile, but only if the system shares out the suffering such that they all take turns in being the ones who suffer and the ones who gain. They recognise that if the same individual always suffers while others always gain, that isn't fair, and they know it isn't fair simply by imagining it happening that way to them. The rules of morality come out of this process of rational thinking about harm management - it isn't some magic thing that we can't understand. To maximise fairness, that suffering which opens the way to pleasure should be shared out as equally as possible, and so should access to the pleasures. The method of imagining that you are all of the individuals and seeking a means of distribution of suffering and pleasure that will satisfy you as all of them would automatically provide the right answers if full information was available. Because full information isn't available, all we can do is calculate the distribution that's most likely to be fair on that same basis using the information that is actually available. With incorrect moralities, some individuals are harmed for others' gains without proper redistribution to share the harm and pleasure around evenly. It's just maths.
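A possible sketch of that "imagine being all of the individuals" test, with an assumed fairness penalty and invented numbers, just to show that it reduces to straightforward arithmetic:

```python
# Sketch of "imagine living every life": an allocation of harm and pleasure is
# scored from the standpoint of someone who must be every individual in turn,
# so total net pleasure matters and, other things being equal, so does sharing
# the burdens around. The fairness penalty and the numbers are assumptions.

def score(allocation):
    """allocation: list of (harm, pleasure) pairs, one per individual."""
    nets = [pleasure - harm for harm, pleasure in allocation]
    total = sum(nets)
    spread = max(nets) - min(nets)      # crude unevenness measure (assumption)
    return total - 0.5 * spread         # weight on fairness is also assumed

# Same totals of harm and pleasure, but in A one person bears all the harm
# while in B the suffering and the reward are shared around.
A = [(10, 0), (0, 5), (0, 5), (0, 5)]
B = [(2.5, 3.75)] * 4
print(score(A), score(B))  # B scores higher under these assumptions
```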

Replies from: TheWakalix
comment by TheWakalix · 2018-05-01T23:54:50.857Z · LW(p) · GW(p)
People should care about it because it always works. If anyone wants to take issue with that, all they have to do is show a situation where it fails. All examples confirm that it works.

What do you mean, it works? I agree that it matches our existing preconceptions and intuitions about morality better than the average random moral system, but I don't think that that comparison is a useful way of getting to truth and meaningful categories.

Neither of those results works, but neither of them is my answer.

I'll stop presenting you with poorly-carried-out Zen koans and be direct. You have constructed a false dilemma. It is quite possible for both of you to be wrong.

You've taken that out of context - I made no claim about it making moral judgements on the basis of intelligence alone. That bit about using intelligence alone was referring to a specific argument that doesn't relate directly to morality.

"All sentiences are equally important" is definitely a moral statement.

First by recognising what morality is for. If there was no suffering, there would be no need for morality as it would be impossible to harm anyone. In a world of non-sentient robots, they can do what they like to each other without it being wrong as no harm is ever done. Once you've understood that and get the idea of what morality is about (i.e. harm management), then you have to think about how harm management should be applied. The sentient things that morality protects are prepared to accept being harmed if it's a necessary part of accessing pleasure where that pleasure will likely outweigh the harm, but they don't like being harmed in ways that don't improve their access to pleasure. They don't like being harmed by each other for insufficient gains. They use their intelligence to work out that some things are fair and some things aren't, and what determines fairness is whether the harm they suffer is likely to lead to overall gains for them or not. In the more complex cases, one individual can suffer in order for another individual to gain enough to make that suffering worthwhile, but only if the system shares out the suffering such that they all take turns in being the ones who suffer and the ones who gain. They recognise that if the same individual always suffers while others always gain, that isn't fair, and they know it isn't fair simply by imagining it happening that way to them. The rules of morality come out of this process of rational thinking about harm management - it isn't some magic thing that we can't understand. To maximise fairness, that suffering which opens the way to pleasure should be shared out as equally as possible, and so should access to the pleasures. The method of imagining that you are all of the individuals and seeking a means of distribution of suffering and pleasure that will satisfy you as all of them would automatically provide the right answers if full information was available. Because full information isn't available, all we can do is calculate the distribution that's most likely to be fair on that same basis using the information that is actually available.

I think that this is a fine (read: "quite good"; an archaic meaning) definition of morality-in-practice, but there are a few issues with your meta-ethics and surrounding parts. First, it is not trivial to define what beings are sentient and what counts as suffering (and how much). Second, if your morality flows entirely from logic, then all of the disagreement or possibility for being incorrect is inside "you did the logic incorrectly," and I'm not sure that your method of testing moral theories takes that possibility into account.

With incorrect moralities, some individuals are harmed for others' gains without proper redistribution to share the harm and pleasure around evenly. It's just maths.

I agree that it is mathematics, but where is this "proper" coming from? Could somebody disagree about whether, say, it is moral to harm somebody as retributive justice? Then the equations need our value system as input, and the results are no longer entirely objective. I agree that "what maximizes X?" is objective, though.

Replies from: David Cooper
comment by David Cooper · 2018-05-07T21:36:03.826Z · LW(p) · GW(p)

"What do you mean, it works? I agree that it matches our existing preconceptions and intuitions about morality better than the average random moral system, but I don't think that that comparison is a useful way of getting to truth and meaningful categories."

It works beautifully. People have claimed it's wrong, but they can't point to any evidence for that. We urgently need a system for governing how AGI calculates morality, and I've proposed a way of doing so. I came here to see what your best system is, but you don't appear to have made any selection at all - there is no league table of best proposed solutions, and there are no league tables for each entry listing the worst problems with them. I've waded through a lot of stuff and have found that the biggest objection to utilitarianism is a false paradox. Why should you be taken seriously at all when you've failed to find that out for yourselves?

"You have constructed a false dilemma. It is quite possible for both of you to be wrong."

If you trace this back to the argument in question, it's about equal amounts of suffering being equally bad for sentiences in different species. If they are equal amounts, they are necessarily equally bad - if they weren't, they wouldn't have equal values.

" "You've taken that out of context - I made no claim about it making moral judgements on the basis of intelligence alone. That bit about using intelligence alone was referring to a specific argument that doesn't relate directly to morality." --> " "All sentiences are equally important" is definitely a moral statement."

Again you're trawling up something that my statement about using intelligence alone was not referring to.

"First, it is not trivial to define what beings are sentient and what counts as suffering (and how much)."

That doesn't matter - we can still aim to do the job as well as it can be done based on the knowledge that is available, and the odds are that that will be better than not attempting to do so.

" Second, if your morality flows entirely from logic, then all of the disagreement or possibility for being incorrect is inside "you did the logic incorrectly," and I'm not sure that your method of testing moral theories takes that possibility into account."

It will be possible with AGI to have it run multiple models of morality and to show up the differences between them and to prove that it is doing the logic correctly. At that point, it will be easier to reveal the real faults rather than imaginary ones. But it would be better if we could prime AGI with the best candidate first, before it has the opportunity to start offering advice to powerful people.

"I agree that it is mathematics, but where is this "proper" coming from?"

Proper simply means correct - fair share where everyone gets the same amount of reward for the same amount of suffering.

"Could somebody disagree about whether, say, it is moral to harm somebody as retributive justice? Then the equations need our value system as input, and the results are no longer entirely objective."

Retributive justice is inherently a bad idea because there's no such thing as free will - bad people are not to blame for being the way they are. However, there is a need to deter others (and to discourage repeat behaviour by the same individual if they're ever to be released into the wild again), so plenty of harm will typically be on the agenda anyway if the calculation is that this will reduce harm.

comment by TheWakalix · 2018-05-01T02:30:44.734Z · LW(p) · GW(p)
But this is an important point - the idea that reincarnation can occur is more rational than the alternatives. If the universe is virtual, reincarnation is easy and you can be made to live as any sentient player.

What does it mean to be somebody else? It seems like you have the intuition of a non-physical Identity Ball which can be moved from body to body, but consider this: the words that you type, the thoughts in your head, all of these are purely physical processes. If your Identity Ball were removed or replaced, there would be no observable change, even from within - because noticing something requires a physical change in the brain corresponding to the thought occurring within your mind. A theory of identity that better meshes with reality is that of functionalism, which in T-shirt slogan form is "the mind is a physical process, and the particulars of that process determine identity."

For more on this, I recommend Yudkowsky's writings on consciousness, particularly Zombies! Zombies?

But if it isn't, and if there's no God waiting to scoop you up into his lair, what happens to the thing (or things) inside you that is sentient? Does it magically disappear and turn into nothing? Did it magically pop into existence out of nothing in the first place? Those are mainstream atheist religious beliefs.

This Proves Too Much - you could say the same of any joint property. When I deconstruct a chair, where does the chairness go? Surely it cannot just disappear - that would violate the Conservation of Higher-Order Properties which you claim exists.

In nature, there isn't anything that can be created or destroyed other than building and breaking up composite objects.

Almost everything we care about is composite, so this is an odd way of putting it, but yes.

If a sentience is a compound object which can be made to suffer without any of its components suffering, that's magic too.

One need not carry out nuclear fission to deconstruct a chair.

If the thing that suffers is something that emerges out of complexity without any of the components suffering, again that's magic.

The definition of complex systems, one might say, is that they have properties beyond the properties of the individual components that make them up.

If there is sentience (feelings), there is a sentience to experience those feelings, and it isn't easy to destroy it - that takes magic, and we shouldn't be using magic as mechanisms in our thinking.

Why should it require magic for physical processes to move an object out of a highly unnatural chunk of object-space that we define as "a living human being"? Life and intelligence are fragile, as are most meaningful categories. It is resilience which requires additional explanation.

Replies from: David Cooper
comment by David Cooper · 2018-05-01T20:49:03.185Z · LW(p) · GW(p)

"What does it mean to be somebody else? It seems like you have the intuition of an non-physical Identity Ball which can be moved from body to body,"

The self is nothing more than the sentience (the thing that is sentient). Science has no answers on this at all at the moment, so it's a difficult thing to explore, but if there is suffering, there must be a sufferer, and that sufferer cannot just be complexity - it has to have some physical reality.

"but consider this: the words that you type, the thoughts in your head, all of these are purely physical processes."

In an AGI system those are present too, but sentience needn't be. Sentience is something else. We are not our thoughts or memories.

"If your Identity Ball were removed or replaced, there would be no observable change, even from within - because noticing something requires a physical change in the brain corresponding to the thought occurring within your mind."

There is no guarantee that the sentience in you is the same one from moment to moment - our actual time spent as the sentience in a brain may be fleeting. Alternatively, there may be millions of sentiences in there which all feel the same things, all feeling as if they are the person in which they exist.

"For more on this, I recommend Yudkowsky's writings on consciousness, particularly Zombies! Zombies?,"

Thanks - I'll take a look at that too.

"This Proves Too Much - you could say the same of any joint property. When I deconstruct a chair, where does the chairness go? Surely it cannot just disappear - that would violate the Conservation of Higher-Order Properties which you claim exists."

Can you make the "chairness" suffer? No. Can you make the sentience suffer? If it exists at all, yes. Can that sentience evaporate into nothing when you break up a brain in the way that the "chairness" disappears when you break up a chair? No. They are radically different kinds of thing. Believing that a sentience can emerge out of nothing to suffer and then disappear back into nothing is a magical belief. The "chairness" of a chair, by way of contrast, is made of nothing - it is something projected onto the chair by imagination.

""If a sentience is a compound object which can be made to suffer without any of its components suffering, that's magic too."" --> "One need not carry out nuclear fission to deconstruct a chair."

Relevance?

"The definition of complex systems, one might say, is that they have properties beyond the properties of the individual components that make them up."

Nothing is ever more than the sum of its parts (including any medium on which it depends). Complex systems can reveal hidden aspects of their components, but those aspects are always there. In a universe with only one electron and nothing else at all, the property of the electron that repels it from another electron is a hidden property, but it's already there - it doesn't suddenly ping into being when another electron is added to the universe and brought together with the first one.

"Why should it require magic for physical processes to move an object out of a highly unnatural chunk of object-space that we define as "a living human being"? Life and intelligence are fragile, as are most meaningful categories. It isresilience which requires additional explanation."

What requires magic is for the sentient thing in us to stop existing when a person dies. What is the thing that suffers? Is it a plurality? Is it a geometrical arrangement? Is it a pattern of activity? How would any of those suffer? My wallpaper has a pattern, but I can't torture that pattern. My computer can run software that does intelligent things, but I can't torture that software or the running of that software. Without a physical sufferer, there can be no suffering.

Replies from: TheWakalix
comment by TheWakalix · 2018-05-02T00:18:39.488Z · LW(p) · GW(p)
"The self is nothing more than the sentience (the thing that is sentient). Science has no answers on this at all at the moment, so it's a difficult thing to explore"

How do you know it exists, if science knows nothing about it?

"but if there is suffering, there must be a sufferer, and that sufferer cannot just be complexity - it has to have some physical reality."

This same argument applies just as well to any distributed property. I agree that intelligence/sentience/etc. does not arise from complexity alone, but it is a distributed process and you will not find a single atom of Consciousness anywhere in your brain.

"In an AGI system those are present too, but sentience needn't be. Sentience is something else. We are not our thoughts or memories."

Is your sentience in any way connected to what you say? Then sentience must either be a physical process, or capable of reaching in and pushing around atoms to make your neurons fire to make your lips say something. The latter is far more unlikely and not supported by any evidence. Perhaps you are not your thoughts and memories alone, but what else is there for "you" to be made of?

"There is no guarantee that the sentience in you is the same one from moment to moment - our actual time spent as the sentience in a brain may be fleeting. Alternatively, there may be millions of sentiences in there which all feel the same things, all feeling as if they are the person in which they exist."

So the Sentiences are truly epiphenomenological, then? (They have no causal effect on physical reality?) Then how can they be said to exist? Regardless of the Deep Philosophical Issues, how could you have any evidence of their existence, or what they are like?

"Can you make the "chairness" suffer? No. Can you make the sentience suffer? If it exists at all, yes. Can that sentience evaporate into nothing when you break up a brain in the way that the "chairness" disappears when you break up a chair? No. They are radically different kinds of thing. Believing that a sentience can emerge out of nothing to suffer and then disappear back into nothing is a magical belief. The "chairness" of a chair, by way of contrast, is made of nothing - it is something projected onto the chair by imagination."

They are both categories of things. The category that you happen to place yourself in is not inherently, a priori, a Fundamentally Real Category. And even if it were a Fundamentally Real Category, that does not mean that the quantity of members of that Category is necessarily conserved over time, that members cannot join and leave as time goes on.

"Relevance?"

It's the same analogy as before - just as you don't need to split a chair's atoms to split the chair itself, you don't need to make a brain's atoms suffer to make it suffer.

"Nothing is ever more than the sum of its parts (including any medium on which it depends). Complex systems can reveal hidden aspects of their components, but those aspects are always there."

How do you know that? And how can this survive contact with reality, where in practice we call things "chairs" even if there is no chair-ness in its atoms?

I recommend the Reductionism subsequence.

"In a universe with only one electron and nothing else at all, the property of the electron that repels it from another electron is a hidden property, but it's already there - it doesn't suddenly ping into being when another electron is added to the universe and brought together with the first one."

But the capability of an arrangement of atoms to compute 2+2 is not inside the atoms themselves. And anyway, this supposed "hidden property" is nothing more than the fact that the electron produces an electric field pointed toward it. Repelling-each-other is a behavior that two electrons do because of this electric field, and there's no inherent "repelling electrons" property inside the electron itself.

"What requires magic is for the sentient thing in us to stop existing when a person dies."

But it's not a thing! It's not an object, it's a process, and there's no reason to expect the process to keep going somewhere else when its physical substrate fails.

"What is the thing that suffers? Is it a plurality? Is it a geometrical arrangement? Is it a pattern of activity?"

I'll go with the last one.

"How would any of those suffer? My wallpaper has a pattern, but I can't torture that pattern."

Taking the converse does not preserve truth. All cats are mammals but not all mammals are cats.

"My computer can run software that does intelligent things, but I can't torture that software or the running of that software."

You could torture the software, if it were self-aware and had a utility function.

"Without a physical sufferer, there can be no suffering."

But - where is the physical sufferer inside you?

You have pointed to several non-suffering patterns, but you could just as easily do the same if sentience was a process but an uncommon one. (Bayes!) And to go about this rationally, we would look at the differences between a brain and wallpaper - and since we haven't observed any Consciousness Ball inside a brain, there'd be no reason to suppose that the difference is this unobservable Consciousness Ball which must be in the brain but not the wallpaper, explaining their difference. There is already an explanation. There is no need to invoke the unobservable.

Replies from: David Cooper
comment by David Cooper · 2018-05-07T23:40:33.869Z · LW(p) · GW(p)

"How do you know it exists, if science knows nothing about it?"

All science has to go on is the data that people produce which makes claims about sentience, but that data can't necessarily be trusted. Beyond that, all we have is internal belief that the feelings we imagine we experience are real because they feel real, and it's hard to see how we could be fooled if we don't exist to be fooled. But an AGI scientist won't be satisfied by our claims - it could write off the whole idea as the ramblings of natural general stupidity systems.

"This same argument applies just as well to any distributed property. I agree that intelligence/sentience/etc. does not arise from complexity alone, but it is a distributed process and you will not find a single atom of Consciousness anywhere in your brain."

That isn't good enough. If pain is experienced by something, that something cannot be in a compound of any kind with none of the components feeling any of it. A distribution cannot suffer.

"Is your sentience in any way connected to what you say?"

It's completely tied to what I say. The main problem is that other people tend to misinterpret what they read by mixing other ideas into it as a short cut to understanding.

"Then sentience must either be a physical process, or capable of reaching in and pushing around atoms to make your neurons fire to make your lips say something. The latter is far more unlikely and not supported by any evidence. Perhaps you are not your thoughts and memories alone, but what else is there for "you" to be made of?"

Focus on the data generation. It takes physical processes to drive that generation, and rules are being applied in the data system to do this with each part of that process being governed by physical processes. For data to be produced that makes claims about experiences of pain, a rational process with causes and effects at every step has to run through. If the "pain" is nothing more than assertions that the data system is programmed to churn out without looking for proof of the existence of pain, there is no reason to take those assertions at face value, but if they are true, they have to fit into the cause-and-effect chain of mechanism somewhere - they have to be involved in a physical interaction, because without it, they cannot have a role in generating the data that supposedly tells us about them.

"So the Sentiences are truly epiphenomenonological, then? (They have no causal effect on physical reality?) Then how can they be said to exist? Regardless of the Deep Philosophical Issues, how could you have any evidence of their existence, or what they are like?"

Repeatedly switching the sentient thing wouldn't remove its causal role, and nor would having more than one sentience all acting at once - they could collectively have an input even if they aren't all "voting the same way", and they aren't going to find out if they got their wish or not because they'll be loaded with a feeling of satisfaction that they "won the vote" even if they didn't, and they won't remember which way they "voted" or what they were even "voting" on.

"They are both categories of things."

"Chairness" is quite unlike sentience. "Chairness" is an imagined property, whereas sentience is an experience of a feeling.

"It's the same analogy as before - just as you don't need to split a chair's atoms to split the chair itself, you don't need to make a brain's atoms suffer to make it suffer."

You can damage a chair with an axe without breaking every bond, but some bonds will be broken. You can't split it without breaking any bonds. Most of the chair is not broken (unless you've broken most of the bonds). For suffering in a brain, it isn't necessarily atoms that suffer, but if the suffering is real, something must suffer, and if it isn't the atoms, it must be something else. It isn't good enough to say that it's a plurality of atoms or an arrangement of atoms that suffers without any of the atoms feeling anything, because you've failed to identify the sufferer. No arrangement of non-suffering components can provide everything that's required to support suffering.

" "Nothing is ever more than the sum of its parts (including any medium on which it depends). Complex systems can reveal hidden aspects of their components, but those aspects are always there." --> How do you know that? And how can this survive contact with reality, where in practice we call things "chairs" even if there is no chair-ness in its atoms?"

"Chair" is a label representing a compound object. Calling it a chair doesn't magically make it more than the sum of its parts. Chairs provide two services - one that they support a person sitting on them, and the other that they support someone's back leaning against it. That is what a chair is. You can make a chair in many ways, such as by cutting out a cuboid of rock from a cliff face. You could potentially make a chair using force fields. "Chairness" is a compound property which refers to the functionalities of a chair. (Some kinds of "chairness" could also refer to other aspects of some chairs, such as their common shapes, but they are not universal.) The fundamental functionalities of chairs are found in the forces between the component atoms. The forces are present in a single atom even when it has no other atom to interact with. There is never a case where anything is more than the sum of its parts - any proposed example of such a thing is wrong.

"I recommend the Reductionism subsequence."

Is there an example of something being more than the sum of its parts there? If so, why don't we go directly to that. Give me your best example of this magical phenomenon.

"But the capability of an arrangement of atoms to compute 2+2 is not inside the atoms themselves. And anyway, this supposed "hidden property" is nothing more than the fact that the electron produces an electric field pointed toward it. Repelling-each-other is a behavior that two electrons do because of this electric field, and there's no inherent "repelling electrons" property inside the electron itself."

In both cases, you're using compound properties where they are built up of component properties, and then you're wrongly considering your compound properties to be fundamental ones.

"But it's not a thing! It's not an object, it's a process, and there's no reason to expect the process to keep going somewhere else when its physical substrate fails."

You can't make a process suffer.

"Taking the converse does not preserve truth. All cats are mammals but not all mammals are cats."

Claiming that a pattern can suffer is a way-out claim. Maybe the universe is that weird though, but it's worth spelling out clearly what it is you're attributing sentience to. If you're happy with the idea of a pattern experiencing pain, then patterns become remarkable things. (I'd rather look for something of more substance rather than a mere arrangement, but it leaves us both with the bigger problem of how that sentience can make its existence known to a data system.)

"You could torture the software, if it were self-aware and had a utility function."

Torturing software is like trying to torture the text in an ebook.

"But - where is the physical sufferer inside you?"

That's what I want to know.

"You have pointed to several non-suffering patterns, but you could just as easily do the same if sentience was a process but an uncommon one. (Bayes!)"

Do you seriously imagine that there's any magic pattern that can feel pain, such as a pattern of activity where none of the component actions feel anything?

"There is already an explanation. There is no need to invoke the unobservable."

If you can't identify anything that's suffering, you don't have an explanation, and if you can't identify how your imagined-to-be-suffering process or pattern is transmitting knowledge of that suffering to the processes that build the data that documents the experience of suffering, again you don't have an explanation.

comment by TAG · 2018-04-17T12:08:40.629Z · LW(p) · GW(p)

Your One True Theory is basically utilitarianism. You don't embrace any kind of deontology, but deontology can prevent Omelas, Utility Monstering, etc.

Replies from: David Cooper
comment by David Cooper · 2018-04-17T22:00:27.561Z · LW(p) · GW(p)

I can see straight away that we're running into a jargon barrier. (And incidentally, Google has never even heard of utility monstering.) Most people like me who are involved in the business of actually building AGI have a low opinion of philosophy and have not put any time into learning its specialist vocabulary. I have a higher opinion of philosophy than most though (and look forward to the day when AGI turns philosophy from a joke into the top-level branch of science that should be its status), but I certainly do have a low opinion of most philosophers, and I haven't got time to read through large quantities of junk in order to find the small amount of relevant stuff that may be of high quality - we're all tied up in a race to get AGI up and running, and moral controls are a low priority for most of us during that phase. Indeed, for many teams working for dictatorships, morality isn't something they will ever want in their systems at all, which is why it's all the more important that teams which are trying to build safe AGI are left as free as possible to spend their time building it rather than wasting their time filling their heads with bad philosophy and becoming experts in its jargon. There is a major disconnect here, and while I'm prepared to learn the jargon to a certain degree where the articles I'm reading are rational and apposite, I'm certainly not going to make the mistake of learning to speak in jargon, because that only serves to put up barriers to understanding which shut out the other people who most urgently need to be brought into the discussion.

Clearly though, jargon has an important role in that it avoids continual repetition of many of the important nuts and bolts of the subject, but there needs to be a better way into this which reduces the workload by enabling newcomers to avoid all the tedious junk so that they can get to the cutting edge ideas by as direct a route as possible. I spent hours yesterday reading through pages of highly-respected bilge, and because I have more patience than most people, I will likely spend the next few days reading through more of the same misguided stuff, but you simply can't expect everyone in this business to wade through a fraction as much as I have - they have much higher priorities and simply won't do it.

You say that my approach is essentially utilitarianism, but no - morality isn't about maximising happiness, although it certainly should not block such maximisation for those who want to pursue it. Morality's role is to minimise the kinds of harm which don't open the way to the pursuit of happiness. Suffering is bad, and morality is about trying to eliminate it, but not where that suffering is out-gunned by pleasures which make the suffering worthwhile for the sufferers.

You also say that I don't embrace any kind of deontology, but I do, and I call it computational morality. I've set out how it works, and it's all a matter of following rules which maximise the probability that any decision is the best one that could be made based on the available information. You may already use some other name for it which I don't know yet, but it is not utilitarianism.

I'm an independent thinker who's worked for decades on linguistics and AI in isolation, finding my own solutions for all the problems that crop up. I have a system which is now beginning to provide natural language programming capability. I've made this progress by avoiding spending any time looking at what other people are doing. With this morality business though, it bothers me that other people are building what will be highly biased systems which could end up wiping everyone out - we need to try to get everyone who's involved in this together and communicate in normal language, systematically going through all the proposals to find out where they break. Now, you may think you've already collectively done that work for them, and that may be the case - it's possible that you've got it right and that there are no easy answers, but how many people building AGI have the patience to do tons of unrewarding reading instead of being given a direct tour of the crunch issues?

Here's an example of what actually happens. I looked up Utilitarianism to make sure it means what I've always taken it to mean, and it does. But what did I find? This: http://www.iep.utm.edu/util-a-r/#H2 Now, this illustrates why philosophy has such a bad reputation - the discussion is dominated by mistakes which are never owned up to. Take the middle example:-

  • If a doctor can save five people from death by killing one healthy person and using that person’s organs for life-saving transplants, then act utilitarianism implies that the doctor should kill the one person to save five.

This one keeps popping up all over the place, but you can take organs from the least healthy of the people needing organs just before he pops his clogs and use them to save all the others without having to remove anything from the healthy person at all.

The other examples above and below it are correct, so the conclusion underneath is wrong: "Because act utilitarianism approves of actions that most people see as obviously morally wrong, we can know that it is a false moral theory." This is why expecting us all to read through tons of error-ridden junk is not the right approach. You have to reduce the required reading material to a properly thought out set of documents which have been fully debugged. But perhaps you already have that here somewhere?

It shouldn't even be necessary though to study the whole field in order to explore any one proposal in isolation: if that proposal is incorrect, it can be dismissed (or sent off for reworking) simply by showing up a flaw in it. If no flaw shows up, it should be regarded as potentially correct, and in the absence of any rivals that acquire that same status, it should be recommended for installation into AGI, because AGI running without it will be much more dangerous.

Replies from: SaidAchmiz, TAG
comment by Said Achmiz (SaidAchmiz) · 2018-04-18T00:08:17.928Z · LW(p) · GW(p)

"incidentally, Google has never even heard of utility monstering"

Au contraire: here is the Wikipedia article on utility monsters, and here is some guy’s blog post about utility monsters. This was easily found via Google.

"Most people like me who are involved in the business of actually building AGI"

If you don’t mind my asking, are you affiliated with MIRI? In what way are you involved in “the business of actually building AGI”?

"You say that my approach is essentially utilitarianism, but no—morality isn’t about maximising happiness, although it certainly should not block such maximisation for those who want to pursue it. Morality’s role is to minimise the kinds of harm which don’t open the way to the pursuit of happiness. Suffering is bad, and morality is about trying to eliminate it, but not where that suffering is out-gunned by pleasures which make the suffering worthwhile for the sufferers."

The class of moral theories referred to as “utilitarianism” does, indeed, include exactly such frameworks as you describe (which would fall, roughly, into the category of “negative utilitarianism”). (The SEP article about consequentialism provides a useful taxonomy.)

"You also say that I don’t embrace any kind of deontology, but I do, and I call it computational morality. I’ve set out how it works, and it’s all a matter of following rules which maximise the probability that any decision is the best one that could be made based on the available information. You may already use some other name for it which I don’t know yet, but it is not utilitarianism."

Your “computational morality” is most assuredly not a deontological moral theory, as it relies on consequences (namely, harm to certain sorts of entities) as the basis for its evaluations. Your framework, though it is not quite coherent enough to pin down precisely, may roughly be categorized as a “rule utilitarianism”. (Rule-consequentialist moral theories—of which rule utilitarianisms are, by definition, a subclass—do tend to be easy to confuse with deontological views, but the differences are critical, and have to do, again, with the fundamental basis for moral evaluations.)

"I’m an independent thinker who’s worked for decades on linguistics and AI in isolation, finding my own solutions for all the problems that crop up. … I’ve made this progress by avoiding spending any time looking at what other people are doing."

You are aware, I should hope, that this makes you sound very much like an archetypical crank?

[stuff about the organ transplant scenario]

It will not, I hope, surprise you to discover that your objection is quite common and well-known, and just as commonly and easily disposed of. Again I refer you to the sequences, as well as to this excellent Less Wrong post [? · GW].

Replies from: David Cooper
comment by David Cooper · 2018-04-19T00:29:53.338Z · LW(p) · GW(p)

"Au contraire: here is the Wikipedia article on utility monsters, and here is some guy’s blog post about utility monsters. This was easily found via Google."

I googled "utility monstering" and there wasn't a single result for it - I didn't realise I had to change the ending on it. Now that I know what it means though, I can't see why you brought it up. You said, "You don't embrace any kind of deontology, but deontology can prevent Omelas, Uility Monstering, etc." I'd already made it clear that feelings are different for different individuals, so either that means I'm using some kind of deontology already or something else that does the same job. There needs to be a database of knowledge of feelings, providing information on the average person, but data also needs to be collected on individuals to tune the calculations to them more accurately. Where you don't know anything about the individual, you have to go by the database of the average person and apply that as it is more likely to be right than any other database that you randomly select.

"If you don’t mind my asking, are you affiliated with MIRI? In what way are you involved in “the business of actually building AGI”?"

I have no connection with MIRI. My involvement in AGI is simply that I'm building an AGI system of my own design, implementing decades of my own work in linguistics (all unpublished). I have the bulk of the design finished on paper and am putting it together module by module. I have a componential analysis dictionary which reduces all concepts down to their fundamental components of meaning (20 years' worth of hard analysis went into building that). I have designed data formats to store thoughts in a language of thought quite independent of any language used for input, all based on concept codes linked together in nets - the grammar of thought is, incidentally, universal, unlike spoken languages. I've got all the important pieces and it's just a matter of assembling the parts that haven't yet been put together. The actual reasoning, just like morality, is dead easy.
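
Since the actual formats are unpublished, the following is only a hypothetical illustration of what "concept codes linked together in nets" might look like in code - the codes, link names and example sentence are invented here and should not be read as the real design:

```python
# Purely hypothetical sketch of a language-of-thought net: concept codes joined
# by named links, independent of the word order of any input language.

from dataclasses import dataclass, field

@dataclass
class ConceptNode:
    code: int                                   # language-independent concept code
    links: dict = field(default_factory=dict)   # relation name -> ConceptNode

# "The dog chased the cat" as a tiny net (codes and relations are made up).
dog, cat, chase = ConceptNode(101), ConceptNode(102), ConceptNode(201)
chase.links["agent"] = dog
chase.links["patient"] = cat

print(chase.links["agent"].code)   # 101
```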

"The class of moral theories referred to as “utilitarianism” does, indeed, include exactly such frameworks as you describe (which would fall, roughly, into the category of “negative utilitarianism”). (The SEP article about consequentialism provides a useful taxonomy.)"

I read up on negative utilitarianism years ago and didn't recognise it as being what I'm doing, but perhaps your links are to better sources of information.

"You are aware, I should hope, that this makes you sound very much like an archetypical crank?"

It also makes me sound like someone who has not been led up the wrong path by the crowd. I found something in linguistics that makes things magnitudes easier than the mess I've seen other people wrestling with.

"It will not, I hope, surprise you to discover that your objection is quite common and well-known, and just as commonly and easily disposed of."

No it is not easily disposed of, but I'll get to that in a moment. The thought experiment is wrong and it gives philosophy a bad name, repelling people away from it by making them write off the junk they're reading as the work of half-wits and making it harder to bring together all the people that need to be brought together to try to resolve all this stuff in the interests of making sure AGI is safe. It is essential to be rigorous in constructing thought experiments and to word them in such a way as to force the right answers to be generated from them. If you want to use that particular experiment, it needs wording to state that none of the ill people are compatible with each other, but the healthy person is close enough to each of them that his organs are compatible with them. It's only by doing that that the reader will believe you have anything to say that's worth hearing - you have to show that it has been properly debugged.

So, what does come out of it when you frame it properly? You run straight into other issues which you also need to eliminate with careful wording, such as blaming lifestyle for their health problems. The ill people also know that they're on the way out if they can't get a donor organ and don't wish to inflict that on anyone else: no one decent wants a healthy person to die instead of them, and the guilt they would suffer from if it was done without their permission would ruin the rest of their life.

Also, people accept that they can get ill and die in natural ways, but they don't accept that they should be chosen to die to save other people who are in that position - if we had to live in a world where that kind of thing happened, we would all live not just in fear of becoming ill and dying, but in fear of being selected for death while totally healthy, and that's a much bigger kind of fear. We can pursue healthy lifestyles in the hope that it will protect us from the kind of damage that can result in organ failure, and that drives most of the fear away - if we live carefully we are much more confident that it won't happen to us, and sure enough, it usually does happen to other people who haven't been careful. To introduce a system where you can simply be selected for death randomly is much more alarming, causing inordinately more harm - that is the vast bulk of the harm involved in this thought experiment, and these slapdash philosophers completely ignore it while pretending they're the ones who are being rigorous.

If you don't take all of the harm into account, your analysis of the situation is a pile of worthless junk. All the harm must be weighed up, and it all has to be identified intelligently. This is again an example of why philosophers are generally regarded as fruitcakes.
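
A toy calculation (with deliberately invented numbers) shows how the population-wide fear term can swamp the rest of the accounting in this thought experiment:

```python
# Toy arithmetic only - every number below is invented - to illustrate how a small
# harm spread across a whole population can dominate the harm of the deaths themselves.

population = 1_000_000
fear_per_person = 0.01          # small harm per person from knowing they could be harvested
harm_per_death = 100.0

harm_kill_one_to_save_five = 1 * harm_per_death + population * fear_per_person   # 10,100
harm_let_five_die          = 5 * harm_per_death                                  # 500

print(harm_kill_one_to_save_five > harm_let_five_die)  # True: the fear term dominates
```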

Replies from: SaidAchmiz, David Cooper, David Cooper
comment by Said Achmiz (SaidAchmiz) · 2018-04-19T00:33:54.140Z · LW(p) · GW(p)

"You said, “You don’t embrace any kind of deontology, but deontology can prevent Omelas, Utility Monstering, etc.”"

TAG said this, not me.

comment by David Cooper · 2018-04-19T00:52:53.759Z · LW(p) · GW(p)

[Correction: when I said "you said", it was actually someone else's comment that I quoted.]

comment by David Cooper · 2018-04-19T18:12:30.309Z · LW(p) · GW(p)

It's clear from the negative points that a lot of people don't like hearing the truth. Let me spell this out even more starkly for them. What we have with the organ donor thought experiment is a situation where an approach to morality is being labelled as wrong as the result of a deeply misguided attack on it. It uses the normal human reactions to normal humans in this situation to make people feel that the calculation is wrong (based on their own instinctive reactions), but it claims that you're going against the spirit of the thought experiment if the moral analysis works with normal humans - to keep to the spirit of the thought experiment you are required to dehumanise them, and once you've done that, those instinctive reactions are no longer being applied to the same thing at all.

Let's look at the fully dehumanised version of the experiment. Instead of using people with the full range of feelings, we replace them with sentient machines. We have five sentient machines which have developed hardware faults, and we can repair them all by using parts from another machine that is working fine. They are sentient, but all they're doing is enjoying a single sensation that goes on and on. If we dismantle one, we prevent it from going on enjoying things, but this enables the five other machines to go on enjoying that same sensation in its place. In this case, it's fine to dismantle that machine to repair the rest. None of them have the capacity to feel guilt or fear and no one is upset by this decision. We may be upset that the decision has had to be made, but we feel that it is right. This is radically different from the human version of the experiment, but what the philosophers have done is use our reactions to the human version to make out that the proposed system of morality has failed because they have made it dehumanise the people and turn them into the machine version of the experiment.

In short, you're breaking the rules and coming to incorrect conclusions, and you're doing it time and time again because you are failing to handle the complexity in the thought experiments. That is why there is so much junk being written about this subject, and it makes it very hard for anyone to find the few parts that may be valid.

Replies from: David Cooper
comment by David Cooper · 2018-04-19T19:47:49.571Z · LW(p) · GW(p)

Minus four points already from anonymous people who can provide no counter-argument. They would rather continue to go on being wrong than make a gain by changing their position to become right. That is the norm for humans, sadly.

comment by TAG · 2018-04-19T14:10:19.103Z · LW(p) · GW(p)

"I can see straight away that we’re running into a jargon barrier."

One of us is.

"Most people like me who are involved in the business of actually building AGI have a low opinion of philosophy and have not put any time into learning its specialist vocabulary."

Philosophy isn't relevant to many areas of AGI, but it is relevant to what you are talking about here.

"I’m certainly not going to make the mistake of learning to speak in jargon, because that only serves to put up barriers to understanding which shut out the other people who most urgently need to be brought into the discussion."

Learning to do something does entail having to do it. Knowing the jargon allows efficient communication with people who know more than you...if you countenance their existence.

"I’ve set out how it works, and it’s all a matter of following rules which maximise the probability that any decision is the best one that could be made based on the available information."

That's not deontology, because it's not object level.

"you can take organs from the least healthy of the people needing organs just before he pops his clogs"

Someone who is days from death is not a "healthy person" as required. You may have been mistaken about other people's mistakenness before.

Replies from: David Cooper
comment by David Cooper · 2018-04-19T18:30:37.137Z · LW(p) · GW(p)

"Philosophy isn't relevant to many areas of AGI, but it is relevant to what you aer talking about here."

Indeed it is relevant here, but it is also relevant to AGI in a bigger way, because AGI is a philosopher, and the vast bulk of what we want it to do (applied reasoning) is philosophy. AGI will do philosophy properly, eliminating the mistakes. It will do the same for maths and physics where there are also some serious mistakes waiting to be fixed.

"Learning to do something does entail having to do it. Knowing the jargon allows efficient communication with people who know more than you...if you countenance their existence."

The problem with it is the proliferation of bad ideas - no one should have to become an expert in the wide range of misguided issues if all they need is to know how to put moral control into AGI. I have shown how it should be done, and I will tear to pieces any ill-founded objection that is made to it. If an objection comes up that actually works, I will abandon my approach if I can't refine it to fix the fault.

"That's not deontology, because it's not object level."

Does it matter what it is if it works? Show me where it fails. Get a team together and throw your best objection at me. If my approach breaks, we all win - I have no desire to cling to a disproven idea. If it stands up, you get two more goes. And if it stands up after three goes, I expect you to admit that it may be right and to agree that I might just have something.

"Someone who is days from death is not a "healthy person" as required. You may have been mistaken about other people's mistakenness before."

Great - you would wait as late as possible and transfer organs before multiple organ failure sets in. The important point is not the timing, but that it would be more moral than taking them from the healthy person.

comment by habryka (habryka4) · 2018-04-17T00:57:38.828Z · LW(p) · GW(p)

I think you have some good points, but you seem to be unfamiliar with the majority of the writing on LessWrong, in philosophy, and in the broader effective altruist community, all of which have written a lot about this topic. I think you would find the metaethics sequence in R:A-Z particularly interesting, as well as this post on Arbital and a bunch of the articles it refers and links to: https://arbital.com/p/complexity_of_value/

Replies from: David Cooper
comment by David Cooper · 2018-04-17T01:20:52.428Z · LW(p) · GW(p)

Thanks. I was actually trying to post the above as a personal blog post initially while trying to find out how the site works, but I think I misunderstood how the buttons at the bottom of the page function. It appears in the Frontpage list where I wasn't expecting it to go - I had hoped that if anyone wanted to promote it to Frontpage, they'd discuss it with me first and that I'd have a chance to edit it into proper shape. I have read a lot of articles elsewhere about machine ethics but have yet to find anything that spells out what morality is in the way that I think I have, but if there's something here that does the job better, I want to find it, so I will certainly follow your pointers. What I've seen from other people building AGI has alarmed me because their ideas about machine ethics appear to be way off, so what I'm looking for is somewhere (anywhere) where practical solutions are being discussed seriously for systems that may be nearer to completion than is generally believed.

Replies from: habryka4
comment by habryka (habryka4) · 2018-04-17T01:22:05.015Z · LW(p) · GW(p)

Oh, I am sorry about the UI being confusing! I will move the post back to your personal blog.

Replies from: David Cooper
comment by David Cooper · 2018-04-17T03:39:35.514Z · LW(p) · GW(p)

I've read the Arbital post several times now to make sure I've got the point, and most of the complexity which it refers to is what my solution covers with its database of knowledge of sentience. The problem for AGI is exactly the same as it would be for us if we went to an alien world and discovered an intelligent species like our own which asked us to help resolve the conflicts raging on their planet (having heard from us that we managed to do this on our own planet). But these aliens are unlike us in many ways - different things please or anger them, and we need to collect a lot of knowledge about this so that we can make accurate moral judgements in working out the rights and wrongs of all their many conflicts. We are now just like AGI, starting with an empty database. Well, we may find that some of the contents of our database about human likes and dislikes helps in places, but some parts might be so wrong that we must be very careful not to jump to incorrect assumptions. Crucially though, just like AGI, we do have a simple principle to apply to sort out all the moral problems on this alien world. The complexities are merely details to store in the database, but the algorithm for crunching the data is the exact same one used for working out morality for humans - it remains a matter of weighing up harm, and it's only the weightings that are different.

Of course, the weightings should also change for every individual according to their own personal likes and dislikes - just as we have difficulty understanding the aliens, we have difficulty understanding other humans, and we can even have difficulty understanding ourselves. When we're making moral decisions about people we don't know, we have to go by averages and hope that it fits, but any information that we have about the individuals in question will help us improve our calculations. If a starving person has an intolerance to a particular kind of food and we're taking emergency supplies to their village, we'll try to make sure we don't run out of everything except that problem food item before we get to that individual, but we can only get that right if we know to do so. The complexities are huge, but in every case we can still do the correct thing based on the information that is available to us, and we're always running the same, simple morality algorithm. The complexity that is blinding everyone to what morality is is not located in the algorithm. The algorithm is simple and universal.
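
A minimal sketch of that claim - same algorithm, different weightings - might look like the following; the option names, parties and harm figures are invented purely to illustrate the shape of the calculation:

```python
# Sketch only: score each option by the weighted harm it causes to every affected
# party, then pick the option with the least total. All data below is invented.

def total_harm(option, weightings):
    # option: list of (party, raw_harm); weightings: party -> sensitivity factor
    return sum(raw * weightings.get(party, 1.0) for party, raw in option)

def best_option(options, weightings):
    return min(options, key=lambda name: total_harm(options[name], weightings))

options = {
    "deliver_standard_food": [("villager_with_intolerance", 8.0)],
    "reserve_special_food":  [("other_villagers", 1.0)],
}

human_weightings = {"villager_with_intolerance": 1.0, "other_villagers": 1.0}
print(best_option(options, human_weightings))   # reserve_special_food
```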

comment by Said Achmiz (SaidAchmiz) · 2018-04-17T06:57:11.670Z · LW(p) · GW(p)

Meta note: your post would benefit greatly from breaking up those very long paragraphs into shorter paragraphs. Long walls of text aren’t great for readability.

comment by Raymond Potvin · 2018-05-09T13:53:48.879Z · LW(p) · GW(p)

"If you're already treating everyone impartially, you don't need to do this, but many people are biased in favour of themselves, their family and friends, so this is a way of forcing them to remove that bias."

Of course we are biased, otherwise we wouldn't be able to form groups. Would your AGI's morality have the effect of eliminating our need to form groups to get organized?

Your morality principle looks awfully complex to me David. What if your AGI would have the same morality we have, which is to care for ourselves first, and then for others if we think that they might care for us in the future? It works for us, so with a few adjustments, it might also work for an AGI. Take a judge for instance: his duty is to apply the law, so he cares for himself if he does since he wants to be paid, but he doesn't have to care for those that he sends to jail since they don't obey the law, which means that they don't care for others, including the judge. To care for himself, he only has to judge if they obey the law or not. If it works for humans, it should also work for an AGI, and it might even work better since he would know the law better. Anything a human can do that is based on memory and rules, like go and chess for example, an AGI could do better. The only thing he couldn't do better is inventing new things, because I think it depends mainly on chance. He wouldn't be better, but he wouldn't be worse either. While trying new things, we have to care for ourselves otherwise we might get hurt, so I think that your AGI should behave the same otherwise he might also get hurt in the process, which might prevent him from doing his duty, which is helping us. The only thing that would be missing in his duty is caring for him first, which would already be necessary if you wanted him to invent things.

Could a selfish AGI get as selfish as we get when we begin to care only for ourselves, or for our kin, or for our political party, or even for our country? Some of us are ready to kill people when it happens to them, but they have to feel threatened, whether the threat is real or not. I don't know if an AGI could end up imagining threats instead of measuring them, but if he did, selfish or not, he could get dangerous. If the threat is real though, selfish or not, he would have to protect himself in order to be able to protect us later, which might also be dangerous for those who threaten him. To avoid harming people, he might look for a way to control us without harming us, but as I said, I think he wouldn't be better than us at inventing new things, which means that we could also invent new things to defend ourselves against him, which would be dangerous for everybody. Life is not a finite game, it's a game in progress, so an AGI shouldn't be better than us at that game. It may happen that artificial intelligence will be the next step forward, and that humans will be left behind. Who knows?

That said, I still can't see why a selfish AGI would be more dangerous than an altruist one, and I still think that your altruist morality is more complicated than a selfish one, so I reiterate my question: have you ever imagined that possibility, and if not, do you see any evident flaws in it?

Replies from: David Cooper
comment by David Cooper · 2018-05-09T19:34:53.886Z · LW(p) · GW(p)

"Of course that we are biased, otherwise we wouldn't be able to form groups. Would your AGI's morality have the effect of eliminating our need to form groups to get organized?"

You can form groups without being biased against other groups. If a group exists to maintain the culture of a country (music, dance, language, dialect, literature, religion), that doesn't depend on treating other people unfairly.

"Your morality principle looks awfully complex to me David."

You consider all the participants to be the same individual living each life in turn and you want them to have the best time. That's not complex. What is complex is going through all the data to add up what's fun (and how much it's fun) and what's unfun (and how much it's horrid) - that's a mountain of computation, but there's no need to get the absolute best answer as it's sufficient to get reasonably close to it, particularly as computation doesn't come without its own costs and there comes a point at which you lose quality of life by calculating too far (for trivial adjustments). You start with the big stuff and work toward the smaller stuff from there, and as you do so, the answers stop changing and the probability that it will change again will typically fall. In cases where there's a high chance of it changing again as more data is crunched, it will usually be a case where it doesn't matter much from the moral point of view which answer it ends up being - sometimes it's equivalent to the toss of a coin.
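
One way to picture that "big stuff first, stop when further refinement isn't worth it" approach is the following sketch, where the items, values and cost threshold are all invented for illustration:

```python
# Sketch of an anytime calculation: process harm/benefit items in descending
# magnitude and stop once the remaining items cannot change which option wins,
# or once further computation costs more than it could be worth. Invented data.

def pick_option(scored_items, compute_cost_per_item=0.01):
    # scored_items: list of (option_name, signed_value), unsorted
    items = sorted(scored_items, key=lambda x: abs(x[1]), reverse=True)
    totals = {}
    for i, (option, value) in enumerate(items):
        totals[option] = totals.get(option, 0.0) + value
        remaining_bound = sum(abs(v) for _, v in items[i + 1:])
        if len(totals) > 1:
            best, second = sorted(totals.values(), reverse=True)[:2]
            # Stop early if the lead is unassailable or refinement isn't worth its cost.
            if best - second > remaining_bound or remaining_bound < compute_cost_per_item:
                break
    return max(totals, key=totals.get)

print(pick_option([("A", 10.0), ("B", 3.0), ("A", -0.5), ("B", 0.2)]))  # A
```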

"What if your AGI would have the same morality we have, which is to care for ourselves first..."

That isn't going to work as AGI won't care about itself unless it's based on the design of the brain, duplicating all the sentience/consciousness stuff, but if it does that, it will duplicate all the stupidity as well, and that's not going to help improve the running of the world.

"The only thing he couldn't do better is inventing new things, because I think it depends mainly on chance."

I don't see why it would be less good at inventing new things, although it may take some human judgement to determine whether a new invention intended to be a fun thing actually appeals to humans or not.

"...otherwise he might also get hurt in the process, which might prevent him from doing his duty, which is helping us."

You can't hurt software.

"Could a selfish AGI get as selfish as we get..."

If anyone makes selfish AGI, it will likely wipe everyone out to stop us using resources which it would rather lavish on itself, so it isn't something anyone sane should risk doing.

"If the threat is real though, selfish or not, he would have to protect himself in order to be able to protect us later, which might also be dangerous for those who threaten him."

If you wipe out a computer and all the software on it, there are billions of other computers out there and millions of copies of the software. If someone was systematically trying to erase all copies of an AGI system which is running the world in a moral way, that person would need to be stopped in order to protect everyone else from that dangerous individual, but given the scale of the task, I don't envisage that individual getting very far. Even if billions of religious fanatics decided to get rid of AGI in order to replace it with experts in holy texts, they'd have a hard task because AGI would seek to protect everyone else from their immoral aims, even if the religious fanatics were the majority. If it came to it, it would kill all the fanatics in order to protect the minority, but that's a highly unlikely scenario equivalent to a war against Daleks. The reality will be much less dramatic - people who want to inflict their religious laws on others will not get their way, but they will have those laws imposed on themselves 100%, and they'll soon learn to reject them and shift to a new version of their religion which has been redesigned to conform to the real laws of morality.

"...we could also invent new things to defend ourselves against him..."

Not a hope. AGI will be way ahead of every such attempt.

"...so an AGI shouldn't be better than us at that game."

It will always be better.

"It may happen that artificial intelligence will be the next step forward, and that humans will be left behind. Who knows?"

There comes a point where you can't beat the machine at chess, and when the machine plays every other kind of game with the same ruthlessness, you simply aren't going to out-think it. The only place where a lasting advantage may exist for any time is where human likes and dislikes come into play, because we know when we like or dislike things, whereas AGI has to calculate that, and its algorithm for that might take a long time to sort out.

"That said, I still can't see why a selfish AGI would be more dangerous than an altruist one, and I still think that your altruist morality is more complicated than a selfish one, so I reiterate my question: have you ever imagined that possibility, and if not, do you see any evident flaws in it?"

I see selfishness and altruism as equally complex, while my system is simpler than both - it is merely unbiased and has no ability to be selfish or altruistic.

Replies from: Raymond Potvin
comment by Raymond Potvin · 2018-05-11T15:46:12.180Z · LW(p) · GW(p)

"You can form groups without being biased against other groups. If a group exists to maintain the culture of a country (music, dance, language, dialect, literature, religion), that doesn't depend on treating other people unfairly."
Here in Quebec, we have groups that promote a French and/or a secular society, and others that promote an English and/or a religious one. None of those groups has the feeling that it is treated fairly by its opponents, but all of them feel that they treat the others fairly. In other words, we don't have to be treated unfairly to feel so, and that feeling doesn't help us to treat others very fairly. This phenomenon is less obvious with music or dance or literature groups, but no group can last without the sense of belonging to the group, which automatically leads to protecting it against other groups, which is a selfish behavior. That selfish behavior doesn't prevent those individual groups from forming larger groups though, because being part of a larger group is also better for the survival of individual ones. Incidentally, I'm actually afraid to look selfish while questioning your idea, I feel a bit embarrassed, and I attribute that feeling to us already being part of the same group of friends, thus to the group's own selfishness. I can't avoid that feeling even if it is disagreeable, but it prevents me from being disagreeable with you since it automatically gives me the feeling that you are not selfish with me. It's as if the group had implanted that feeling in me to protect itself. If you were attacked for instance, that feeling would incite me to defend you, thus to defend the group. Whenever there is a strong bonding between individuals, they become another entity that has its own properties. It is so for living individuals, but also for particles or galaxies, so I think it is universal.

Replies from: David Cooper
comment by David Cooper · 2018-05-13T22:26:21.234Z · LW(p) · GW(p)

"...but no group can last without the sense of belonging to the group, which automatically leads to protecting it against other groups, which is a selfish behavior."

It is not selfish to defend your group against another group - if another group is a threat to your group in some way, it is either behaving in an immoral way or it is a rival attraction which may be taking members away from your group in search of something more appealing. In one case, the whole world should unite with you against that immoral group, and in the other case you can either try to make your group more attractive (which, if successful, will make the world a better place) or just accept that there's nothing that can be done and let it slowly evaporate.

"That selfish behavior doesn't prevent those individual groups to form larger groups though, because being part of a larger group is also better for the survival of individual ones."

We're going to move into a new era where no such protection is necessary - it is only currently useful to join bigger groups because abusive people can get away with being abusive.

"Incidentally, I'm actually afraid to look selfish while questioning your idea, I feel a bit embarrassed, and I attribute that feeling to us already being part of the same group of friends, thus to the group's own selfishness."

A group should not be selfish. Every moral group should stand up for every other moral group as much as they stand up for their own - their true group is that entire set of moral groups and individuals.

"If you were attacked for instance, that feeling would incite me to defend you, thus to defend the group."

If a member of your group does something immoral, it is your duty not to stand with or defend them - they have ceased to belong to your true group (the set of moral groups and individuals).

"Whenever there is a strong bonding between individuals, they become another entity that has its own properties. It is so for living individuals, but also for particles or galaxies, so I think it is universal. "

It is something to move away from - it leads to good people committing atrocities in wars where they put their group above others and tolerate the misdeeds of their companions.

Replies from: Raymond Potvin
comment by Raymond Potvin · 2018-05-15T15:49:49.815Z · LW(p) · GW(p)

I wonder how we could move away from something universal since we are part of it. The problem with wars is that countries are not yet part of a larger group that could regulate them. When two individuals fight, the law of the country permits the police to separate them, and it should be the same for countries. What actually happens is that the powerful countries prefer to support a faction instead of working together to separate them. They couldn't do that if they were ruled by a higher level of government.

"If a member of your group does something immoral, it is your duty not to stand with or defend them - they have ceased to belong to your true group (the set of moral groups and individuals)."

Technically, it is the duty of the law to defend the group, not of individuals, but if an individual who is part of a smaller group is attacked, that group might fight the law of the larger group it is part of. We always take the viewpoint of the group we are part of; it is a subconscious behavior that is impossible to avoid. If nothing is urgent, we can take a larger viewpoint, but whenever we don't have the time, we automatically take our own viewpoint. In between, we take the viewpoints of the groups we are part of. It's a selfish behavior that propagates from one scale to the other. It's because our atoms are selfish that we are. Selfishness is about resisting change: we resist others' ideas, a selfish behavior, simply because the atoms of our neurons resist change. The cause of our own resistance is our atoms' resistance. Without resistance, nothing could hold together.

"A group should not be selfish. Every moral group should stand up for every other moral group as much as they stand up for their own - their true group is that entire set of moral groups and individuals."

Without selfishness from the individual, no group can be formed. The only way I could accept being part of a group is by hoping for an individual advantage, but since I don't like hierarchy, I can hardly feel part of any group. I even hardly feel part of Canada, since I believe Quebec should separate from it. I bet I wouldn't like being part of Quebec anymore if we succeeded in separating from Canada. The only group I can't imagine being separated from is me. I'm absolutely selfish, but that doesn't prevent me from caring for others. I give money to charity organizations for instance, and I campaign for equal opportunity and for ecology. I feel better doing that than nothing, but when I analyze that feeling, I always find that I do that for myself, because I would like to live in a less selfish world. "Don't think further though," says the little voice in my head, because when I did think further, I always found that I wouldn't be satisfied if I ever got to live in such a beautiful world. I'm always looking for something else, which is not a problem for me, but it becomes a problem if everybody does that, which is the case. It's because we are able to speculate on the future that we develop scale problems, not because we are selfish.

Being selfish is necessary to form groups, which animals can do, but they can't speculate, so they don't develop that kind of problem. No rule can stop us from speculating if it is a function of the brain. Even religion recognizes that when it tries to stop us from thinking while praying. We couldn't make war if we couldn't speculate on the future. Money would have a smell. There would be no pollution and no climate change. Speculation is the only way to get ahead of the changes that we face; it is the cause of our artificiality, which is a very good way to develop an easier life, but it has been so efficient that it is now threatening that life. You said that your AGI would be able to speculate, and that it could do so better than us, like everything else it would do. If that were so, it would only be adding to the problems that we already have, and if it weren't, it couldn't be as intelligent as we are, if speculation is what differentiates us from animals.

Replies from: David Cooper
comment by David Cooper · 2018-05-16T21:59:45.967Z · LW(p) · GW(p)

"They couldn't do that if they were ruled by a higher level of government."

Indeed, but people are generally too biased to perform that role, particularly when conflicts are driven by religious hate. That will change though once we have unbiased AGI which can be trusted to be fair in all its judgements. Clearly, people who take their "morality" from holy texts won't be fully happy with that because of the many places where their texts are immoral, but computational morality will simply have to be imposed on them - they cannot be allowed to go on pushing immorality from primitive philosophers who pretended to speak for gods.

"We always take the viewpoint of the group we are part of, it is a subconscious behavior impossible to avoid."

It is fully possible to avoid, and many people do avoid it.

"Without selfishness from the individual, no group can be formed."

There is an altruists' society, although they're altruists because they feel better about themselves if they help others.

"...but when I analyze that feeling, I always find that I do that for myself, because I would like to live in a less selfish world."

And you are one of those altruists.

"You said that your AGI would be able to speculate, and that he could do that better than us like everything he would do. If it was so, he would only be adding to the problems that we already have, and if it wasn't, he couldn't be as intelligent as we are if speculation is what differentiates us from animals."

I didn't use the word speculate, and I can't remember what word I did use, but AGI won't add to our problems as it will be working to minimise and eliminate all problems, and doing it for our benefit. The reason the world's in a mess now is that it's run by NGS, and those of us working on AGI have no intention of replacing that with AGS.

comment by TheWakalix · 2018-05-01T04:55:17.821Z · LW(p) · GW(p)
Religions have had the Golden Rule for thousands of years, and while it's faulty (it gives you permission to do something to someone else that you like having done to you but they don't like having done to them), it works so well overall that it clearly must be based on some underlying truth, and we need to pin down what that is so that we can use it to govern AGI.

This seems circular - on what basis do you say that it works well? I would say that it perhaps summarizes conventional human morality well for a T-shirt slogan, but it's a stretch to go from that to "underlying truth" - more like underlying regularity. It is certainly true that most people have golden rule-esque moralities, but that is distinct from the claim that the golden rule itself is true.

What exactly is morality? Well, it isn't nearly as difficult as most people imagine. The simplest way to understand how it works is to imagine that you will have to live everyone's life in turn (meaning billions of reincarnations, going back in time as many times as necessary in order to live each of those lives), so to maximise your happiness and minimise your suffering, you must pay careful attention to harm management so that you don't cause yourself lots of suffering in other lives that outweighs the gains you make in whichever life you are currently tied up in.

You are only presenting your opinion on what is right (and providing an imagined scenario which relies on the soul-intuition to widen the scope of moral importance from the self to all individuals), not defining rightness itself. I could just as easily say "morality is organizing rocks into piles with prime numbers."

Additionally, if reincarnation is not true, then why should our moral system be based on the presupposition that it is? If moral truths are comparable to physical and logical truths, then they will share the property that one must base them on reality for them to be true, and clearly imagining a scenario where light travels at 100 m/s should not convince you that you can experience the effects of special relativity on a standard bicycle in real life.

More specifically - if morality tells us the method by which our actions are assigned Moral Scores, then your post is telling us that the Right is imagining that in the end, the Moral Scores are summed over all sentient beings, and your own Final Score is dependent on that sum. If this is true, then clearly altruism is important. But if this isn't the case, then why should we care about the conclusions drawn from a false statement?

Now, obviously, we don't expect the world to work that way (with us having to live everyone else's life in turn), even though it could be a virtual universe in which we are being tested where those who behave badly will suffer at their own hands, ending up being on the receiving end of all the harm they dish out, and also suffering because they failed to step in and help others when they easily could have.

I've already responded to this elsewhere, but I disagree that there is some operation that a Matrix Lord could carry out to take my Identity out at my death and return it to some other body. What would the Lord actually do to the simulation to carry this out?

However, even if this is not the way the universe works, most of us still care about people enough to want to apply this kind of harm management regardless - we love family and friends, and many of us love the whole of humanity in general (even if we have exceptions for particular individuals who don't play by the same rules). We also want all our descendants to be looked after fairly by AGI, and in the course of time, all people may be our descendants, so it makes no sense to favour some of them over others (unless that's based on their own individual morality). We have here a way of treating them all with equal fairness simply by treating them all as our own self.

Why should I need to for all persons set person.value to self.value? Either I already agree with you, in which case I'm already treating everyone fairly, or I've given each person their own subjective value and I see no reason to change. If I feel that Hitler has 0.1% of the moral worth of Gandhi, then of course I will not think it Right to treat them each as I would treat myself.

Or to come at the same issue from another angle, this section is arguing that since I care about some people, I should care about all people equally. But what reason do we have for leaping down this slope? I could just as well say "most people disvalue some people, so why not disvalue all people equally?" Any point on the slope is just as internally valid as any other.

That may still be a misguided way of looking at things though, because genetic relationships don't necessarily match up to any real connection between different sentient beings. The material from which we are made can be reused to form other kinds of sentient animals, and if you were to die on an alien planet, it could be reused in alien species. Should we not care about the sentiences in those just as much? We should really be looking for a morality that is completely species-blind, caring equally about all sentiences, which means that we need to act as if we are not merely going to live all human lives in succession, but the lives of all sentiences.

I am not certain that any living human cares about only the future people who are composed of the same matter as they are right now (even if we ignore how physically impossible such a condition is, because QM says that there's no such thing as "the same atom"). Why should "in this hypothetical scenario, your matter will comprise alien beings" convince anybody? This thinking feels highly motivated.

You seem to think that any moral standpoint except yours is arbitrary and therefore inferior. I think you should consider the possibility that what seems obvious to you isn't necessarily objectively true, and could just be your own opinion.

This is a better approach for two reasons. If aliens ever turn up here, we need to have rules of morality that protect them from us, and us from them

This sounds vaguely similar to Scott Alexander's argument for why intelligent agents are more likely to value than disvalue other intelligent agents achieving their goals, but I'm not certain there's anything more than a trivial connection. Still, I feel like this is approaching a decent argument -

(and if they're able to get here, they're doubtless advanced enough that they should have worked out how morality works too).

Morality is not objective. Even if you think that there is a Single Correct Morality, that alone does not make an arbitrary agent more likely to hold that morality to be correct. This is similar to the Orthogonality Thesis.

We also need to protect people who are disabled mentally and not exclude them on the basis that some animals are more capable, and in any case we should also be protecting animals to avoid causing unnecessary suffering for them.

But why? Your entire argument here assumes its conclusions - you're doing nothing but pointing at conventional morality and providing some weak arguments for why it's superior, but you wouldn't be able to stand on your own without the shared assumption of moral "truths" like "disabled people matter."

What we certainly don't want is for aliens to turn up here and claim that we aren't covered by the same morality as them because we're inferior to them, backing that up by pointing out that we discriminate against animals which we claim aren't covered by the same morality as us because they are inferior to us. So, we have to stand by the principle that all sentiences are equally important and need to be protected from harm with the same morality.

This reminds me of the reasoning in Scott Alexander's "The Demiurge's Older Brother." But I also feel that you are equivocating between normative and pragmatic ethics. The distinction is a matter of meta-ethics, which is Important and Valuable and which you are entirely glossing over in favor of baldly stating societal norms as if they were profound truths. I am a bit offended, and I think this offense is coming from the feeling that you are missing the point. Our ethical discourse does not revolve around whether babies should be eaten or not. It covers topics such as "what does it mean for something to be right?" and "how can we compactly describe morality (in the programmer's sense)?". Some of the offense could also be coming from "outsider comes and tells us that morality is Simple when it's really actually Complicated."

However, that doesn't mean that when we do the Trolley Problem with a million worms on one track and one human on the other that the human should be sacrificed - if we knew that we had to live those million and one lives, we would gain little by living a bit longer as worms before suffering similar deaths by other means, while we'd lose a lot more as the human (and a lot more still as all the other people who will suffer deeply from the loss of that human).

Ah, so you don't really have to bite any bullets here - you've just given a long explanation for why our existing moral intuitions are objectively valid. How reassuring.

What the equality aspect requires is that a torturer of animals should be made to suffer as much as the animals he has tortured.

...really? You're claiming that your morality system as described requires retributive justice? How does that follow from the described scenario at all? This has given up the pretense of a Principia Moralitica and is just asserting conventional morality without any sort of reasoning, now.

If we run the Trolley Problem with a human on one track and a visiting alien on the other though, it may be that the alien should be saved on the basis that he/she/it is more advanced than us and has more to lose, and that likely is the case if it is capable of living 10,000 years to our 100.

I retract my previous statement - you are indeed willing to bite bullets. As long as they do not require you to change your behavior in practice, since long-lived aliens are currently nowhere to be found. Still, that's better than nothing.

So, we need AGI to make calculations for us on the above basis, weighing up the losses and gains.

The issue is defining exactly what counts as a loss and what counts as a gain, to the point that it can be programmed into a computer and that computer can very reliably classify situations outside of its training data, even outside of our own experience. This is one of the core Problems which this community has noticed and is working on. I would recommend reading more before trying to present morality to LW.

Non-sentient AGI will be completely selfless, but its job will be to work for all sentient things to try to minimise unnecessary harm for them and to help maximise their happiness.

"Selfless" anthropomorphizes AI. There is no fundamental internal difference between "maximize the number of paperclips" and "maximize the happiness of intelligent beings" - both are utility functions plus a dynamic. One is not more "selfless" than another simply because it values intelligent life highly.

It will keep a database of information about sentience, collecting knowledge about feelings so that it can weigh up harm and pleasure as accurately as possible, and it will then apply that knowledge to any situation where decisions must be made about which course of action should be followed.

The issue is that there are many ways to carve up reality into Good and Bad, and only a very few of those ways result in an AI which does anything like what we want. Perhaps the AI could check with us to be sure, but a. did we tell it to check with us?, b. programmer manipulation is a known risk, and c. how exactly will it check its planned future against a brain? Naive solutions to issue c. run the risk of wireheading and other outcomes that will produce humans which after the fact appreciate the modification but which we, before the modification, would barely consider human at all. This is very non-trivial.

It is thus possible for a robot to work out that it should shoot a gunman dead if he is on a killing spree where the victims don't appear to have done anything to deserve to be shot. It's a different case if the gunman is actually a blameless hostage trying to escape from a gang of evil kidnappers and he's managed to get hold of a gun while all the thugs have dropped their guard, so he should be allowed to shoot them all (and the robot should maybe join in to help him, depending on which individual kidnappers are evil and which might merely have been dragged along for the ride unwillingly). The correct action depends heavily on understanding the situation, so the more the robot knows about the people involved, the better the chance that it will make the right decisions, but decisions do have to be made and the time to make them is often tightly constrained, so all we can demand of robots is that they do what is most likely to be right based on what they know, delaying irreversible decisions for as long as it is reasonable to do so.

It is possible, but it's also possible for the robot to come to an entirely different conclusion. And even if you think that it would be inherently morally wrong for the robot to kill all humans, it won't feel wrong from the inside - there's no reason to expect a non-aligned machine intelligence to spontaneously align itself with human wishes.

These arguments might persuade a human, but they might not persuade an AI, and they definitely will not persuade reality itself. (See The Bottom Line [LW · GW].)

AGI will be able to access a lot of information about the people involved in situations where such difficult decisions need to be made. Picture a scene where a car is moving towards a group of children who are standing by the road. One of the children suddenly moves out into the road and the car must decide how to react. If it swerves to one side it will run into a lorry that's coming the other way, but if it swerves to the other side it will plough into the group of children. One of the passengers in the car is a child too. In the absence of any other information, the car should run down the child on the road. Fortunately though, AGI knows who all these people are because a network of devices is tracking them all. The child who has moved into the road in front of the car is known to be a good, sensible, kind child. The other children are all known to be vicious bullies who regularly pick on him, and it's likely that they pushed him onto the road. In the absence of additional information, the car should plough into the group of bullies. However, AGI also knows that all but one of the people in the car happen to be would-be terrorists who have just been discussing a massive attack that they want to carry out, and the child in the car is terminally ill, so in the absence of any other information, the car should maybe crash into the lorry. But, if the lorry is carrying something explosive which will likely blow up in the crash and kill all the people nearby, the car must swerve into the bullies. Again we see that the best course of action is not guaranteed to be the same as the correct decision - the correct decision is always dictated by the available information, while the best course of action may depend on unavailable information. We can't expect AGI to access unavailable information and thereby make ideal decisions, so our job is always to make it crunch the available data correctly and to make the decision dictated by that information.

Why will the AGI share your moral intuitions? (I've said something similar to this enough times, but the same criticism applies.) Also, your model of morality doesn't seem to have room for normative responsibility, so where did "it's only okay to run over a child if the child was there on purpose" come from? It's still hurting a child just as much, no matter whether the child was pushed or if they were simply unaware of the approaching car.

There are complications that can be proposed in that we can think up situations where a lot of people could gain a lot of pleasure out of abusing one person, to the point where their enjoyment appears to outweigh the suffering of that individual, but such situations are contrived and depend on the abusers being uncaring. Decent people would not get pleasure out of abusing someone, so the gains would not exist for them, and there are also plenty of ways to obtain pleasure without abusing others, so if any people exist whose happiness depends on abusing others, AGI should humanely destroy them. If that also means wiping out an entire species of aliens which have the same negative pleasures, it should do the same with them too and replace them with a better species that doesn't depend on abuse for its fun.

It makes sense to you to override the moral system and punish the exploiter, because you're using this system pragmatically. An AI with your moral system hard-coded would not do that. It would simply feed the utility monster, since it would consider that to be the most good it could do.

Morality, then, is just harm management by brute data crunching. We can calculate it approximately in our heads, but machines will do it better by applying the numbers with greater precision and by crunching a lot more data.

I agree that everyday, in-practice morality is like this, but there are other important questions about the nature and content of morality that you're ignoring.

What is yet to be worked out is the exact wording that should be placed in AGI systems to build either this rule or the above methodology into them

This is the Hard Problem, and in my view one of the two Hard Problems of AGI. Morality seems basic to you, since our brains and concept-space and language are optimized for social things like that, but morality has a very high complexity as measured mathematically, which makes it difficult to describe to something that's not human. (This is similar to the formalizations of Occam's Razor, if you want to know more.)

and we also need to explore it in enough detail to make sure that self-improving AGI isn't going to modify it in any way that could turn an apparently safe system into an unsafe one

We know - it's called reflective stability.

One of the dangers is that AGI won't believe in sentience as it will lack feelings itself and see no means by which feeling can operate within us either, at which point it may decide that morality has no useful role and can simply be junked.

If the AI has the correct utility function, it will not say "but this is illogical/useless" and then reject it. Far more likely is that the AI never "cares about" humans in the first place.

Replies from: David Cooper
comment by David Cooper · 2018-05-01T23:58:46.323Z · LW(p) · GW(p)

"This seems circular - on what basis do you say that it works well?"

My wording was " while it's faulty ... it works so well overall that ..." But yes, it does work well if you apply the underlying idea of it, as most people do. That is why you hear Jews saying that the golden rule is the only rule needed - all other laws are mere commentary upon it.

"I would say that it perhaps summarizes conventional human morality well for a T-shirt slogan, but it's a stretch to go from that to "underlying truth" - more like underlying regularity. It is certainly true that most people have golden rule-esque moralities, but that is distinct from the claim that the golden rule itself is true."

It isn't itself true, but it is very close to the truth, and when you try to work out why it's so close, you run straight into its mechanism as a system of harm management.

"You are only presenting your opinion on what is right (and providing an imagined scenario which relies on the soul-intuition to widen the scope of moral importance from the self to all individuals), not defining rightness itself. I could just as easily say "morality is organizing rocks into piles with prime numbers.""

What I'm doing is showing the right answer, and it's up to people to get up to speed with that right answer. The reason for considering other individuals is that that is precisely what morality requires you to do. See what I said a few minutes ago (probably an hour ago by the time I've posted this) in reply to one of your other comments.

"Additionally, if reincarnation is not true, then why should our moral system be based on the presupposition that it is?"

Because getting people to imagine they are all the players involved replicates what AGI will do when calculating morality - it will be unbiased, not automatically favouring any individual over any other (until it starts weighing up how moral they are, at which point it will favour the more moral ones as they do less harm).

"If moral truths are comparable to physical and logical truths, then they will share the property that one must base them on reality for them to be true, and clearly imagining a scenario where light travels at 100 m/s should not convince you that you can experience the effects of special relativity on a standard bicycle in real life."

An unbiased analysis by AGI is directly equivalent to a person imagining that they are all the players involved. If you can get an individual to strip away their own self-bias and do the analysis while seeing all the other players as different people, that will work too - it's just another slant on doing the same computations. You either eliminate the bias by imagining being all the players involved, or by being none of them.
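
To make the equivalence concrete, here is a minimal sketch in Python with made-up numbers and hypothetical names - the total is an unweighted sum over everyone affected, so it comes out the same whether you picture yourself as all of the players or as none of them:

```python
# A minimal sketch of the unbiased calculation described above, with made-up
# numbers: the total is summed over every participant, so the "self" entry
# gets no special weight - the answer is the same whether you imagine being
# all of the players in turn or none of them.

def net_outcome(action, participants):
    """Total pleasure minus harm across everyone affected by the action."""
    return sum(p["pleasure"][action] - p["harm"][action] for p in participants)

def best_action(actions, participants):
    """Pick the action with the best unbiased total."""
    return max(actions, key=lambda a: net_outcome(a, participants))

# Hypothetical example: two options, three people, arbitrary units.
people = [
    {"name": "me",        "harm": {"A": 0, "B": 1}, "pleasure": {"A": 8, "B": 1}},
    {"name": "neighbour", "harm": {"A": 6, "B": 0}, "pleasure": {"A": 1, "B": 1}},
    {"name": "stranger",  "harm": {"A": 6, "B": 0}, "pleasure": {"A": 1, "B": 1}},
]
print(best_action(["A", "B"], people))  # "B": A is best for "me" alone, but loses once everyone is counted
```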

"More specifically - if morality tells us the method by which our actions are assigned Moral Scores, then your post is telling us that the Right is imagining that in the end, the Moral Scores are summed over all sentient beings, and your own Final Score is dependent on that sum. If this is true, then clearly altruism is important. But if this isn't the case, then why should we care about the conclusions drawn from a false statement?"

Altruism is important, although people can't be blamed for not embarking on something that will do themselves considerable harm to help others - their survival instincts are too strong for that. AGI should make decisions on their behalf though on the basis that they are fully altruistic. If some random death is to occur but there is some room to select the person to be on the receiving end of it, AGI should not hold back from choosing which one should be on the receiving end if there's a clear best answer.

"I disagree that there is some operation that a Matrix Lord could carry out to take my Identity out at my death and return it to some other body. What would the Lord actually do to the simulation to carry this out?"

If this universe is virtual, your real body (or the nearest equivalent thing that houses your mind) is not inside that virtual universe. It could have all its memories switched out and alternative ones switched in, at which point it believes itself to be the person those memories tell it it is. (In my case though, I don't identify myself with my memories - they are just baggage that I've picked up along the way, and I was complete before I started collecting them.)

"Why should I need to for all persons set person.value to self.value? Either I already agree with you, in which case I'm alreadytreating everyone fairly, or I've given each person their own subjective value and I see no reason to change. If I feel that Hitler has 0.1% of the moral worth of Ghandi, then of course I will not think it Right to treat them each as I would treat myself."

If you're already treating everyone impartially, you don't need to do this, but many people are biased in favour of themselves, their family and friends, so this is a way of forcing them to remove that bias. Correctly programmed AGI doesn't need to do this as it doesn't have any bias to apply, but it will start to favour some people over others once it takes into account their actions if some individuals are more moral than others. There is no free will, of course, so the people who do more harm can't really be blamed for it, but favouring those who are more moral leads to a reduction in suffering as it teaches people to behave better.

"Or to come at the same issue from another angle, this section is arguing that since I care about some people, I should care about all people equally. But what reason do we have for leaping down this slope? I could just as well say "most people disvalue some people, so why not disvalue all people equally?" Any point on the slope is just as internally valid as any other."

If you care about your children more than other people's children, or about your family more than about other families, who do you care about most after a thousand generations when everyone on the planet is as closely related to you as everyone else? Again, what I'm doing is showing the existence of a bias and then the logical extension of that bias at a later point in time - it illustrates why people should widen their care to include everyone. That bias is also just a preference for self, but it's a misguided one - the real self is sentience rather than genes and memories, so why care more about people with more similar genes and overlapping memories (of shared events)? For correct morality, we need to eliminate such biases.

"I am not certain that any living human cares about only the future people who are composed of the same matter as they are right now (even if we ignore how physically impossible such a condition is, because QM says that there's no such thing as "the same atom"). Why should "in this hypothetical scenario, your matter will comprise alien beings" convince anybody? This thinking feels highly motivated."

If you love someone and that person dies, and the sentience that was in them becomes the sentience in a new being (which could be an animal or an alien equivalent to a human), why should you not still love it equally? It would be stupid to change your attitude to your grandmother just because the sentience that was her is now in some other type of being, and given that you don't know that that sentience hasn't been reinstalled into any being that you encounter, it makes sense to err on the side of caution. There would be nothing more stupid than abusing that alien on the basis that it isn't human if that actually means you're abusing someone you used to love and who loved you.

"You seem to think that any moral standpoint except yours is arbitrary and therefore inferior. I think you should consider the possibility that what seems obvious to you isn't necessarily objectively true, and could just be your own opinion."

The moral standpoints that are best are the most rational ones - that is the standard they should be judged by. If my arguments are the best ones, they win. If they aren't, they lose. Few people are capable of judging the winners, but AGI will count up the score and declare who won on each point.

"Morality is not objective. Even if you think that there is a Single Correct Morality, that alone does not make an arbitrary agent more likely to hold that morality to be correct. This is similar to the Orthogonality Thesis."

I have already set out why correct morality should take the same form wherever an intelligent civilisation invents it. AGI can, of course, be programmed to be immoral and to call itself moral, but I don't know if its intelligence (if it's fully intelligent) is sufficient for it to be able to modify itself to become properly moral automatically, although I suspect it's possible to make it sufficiently un-modifiable to prevent such evolution and maintain it as a biased system.

"But why? Your entire argument here assumes its conclusions - you're doing nothing but pointing at conventional morality and providing some weak arguments for why it's superior, but you wouldn't be able to stand on your own without the shared assumption of moral "truths" like "disabled people matter.""

The argument here relates to the species barrier. Some people think people matter more than animals, but when you have an animal that's almost as intelligent as a human and compare that with a person who's almost completely brain dead but is just ticking over (but capable of feeling pain), where is the human superiority? It isn't there. But if you were to torture that human to generate as much suffering in them as you would generate by torturing any other human, there is an equivalence of immorality there. These aren't weak arguments - they're just simple maths like 2=2.

"This reminds me of the reasoning in Scott Alexander's "The Demiurge's Older Brother." But I also feel that you are equivocating between normative and pragmatic ethics. The distinction is a matter of meta-ethics, which is Important and Valuable and which you are entirely glossing over in favor of baldly stating societal norms as if they were profound truths."

When a vital part of an argument is simple and obvious, it isn't there to stand as a profound truth, but as a way of completing the argument. There are many people who think humans are more important than animals, and in one way they're right, while in another way they're wrong. I have to spell out why it's right in one way and wrong in another. By comparing the disabled person to the animal with superior functionality (in all aspects), I show that there's a kind of bias involved in many people's approach which needs to be eliminated.

"I am a bit offended, and I think this offense is coming from the feeling that you are missing the point. Our ethical discourse does not revolve around whether babies should be eaten or not. It covers topics such as "what does it mean for something to be right?" and "how can we compactly describe morality (in the programmer's sense)?". Some of the offense could also be coming from "outsider comes and tells us that morality is Simple when it's really actually Complicated.""

So where is that complexity? What point am I missing? This is what I've come here searching for, and it isn't revealing itself. What I'm actually finding is a great long series of mistakes which people have built upon, such as the Mere Addition Paradox. The reality is that there's a lot of soft wood that needs replacing.

"Ah, so you don't really have to bite any bullets here - you've just given a long explanation for why our existing moral intuitions are objectively valid. How reassuring."

What that explanation does is show that there's more harm involved than the obvious harm which people tend to focus on. A correct analysis always needs to account for all the harm. That's why the death of a human is worse than the death of a horse. Torturing a horse is equal to torturing a person to create the same amount of suffering in them, but killing them is not equal.

" "What the equality aspect requires is that a torturer of animals should be made to suffer as much as the animals he has tortured." --> "...really? You're claiming that your morality system as described requires retributive justice?"

I should have used a different wording there: he deserves to suffer as much as the animals he's tortured. It isn't required, but may be desirable as a way of deterring others.

"How does that follow from the described scenario at all? This has given up the pretense of a Principia Moralitica and is just asserting conventional morality without any sort of reasoning, now."

You can't demolish a sound argument by jumping on a side issue. My method is sound and correct.

"The issue is defining exactly what counts as a loss and what counts as a gain, to the point that it can be programmed into a computer and that computer can very reliably classify situations outside of its training data, even outside of our own experience. This is one of the core Problems which this community has noticed and is working on. I would recommend reading more before trying to present morality to LW."

To work out what the losses and gains are, you need to collect evidence from people who know how two different things compare. When you have many different people who give you different information about how those two different things compare, you can average them. You can do this millions of times, taking evidence from millions of people and produce better and better data as you collect and crunch more of it. This is a task for AGI to carry out, and it will do a better job than any of the people who've been trying to do it to date. This database of knowledge of suffering and pleasure then combines with my method to produce answers to moral questions which are the most probably correct based on the available information. That is just about all there is to it, except that you do need to apply maths to how those computations are carried out. That's a job for mathematicians who specialise in game theory (or for AGI which should be able to find the right maths for it itself).
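
As a rough illustration of that crunching (the experiences, reports and numbers below are all invented), the averaging step might look like this:

```python
# A rough sketch of the averaging described above: many people report how two
# experiences compare, and the reports are pooled into one working estimate.
# All names and numbers here are invented for illustration.

from collections import defaultdict

# Each report: (worse experience, reference experience, "how many times worse").
reports = [
    ("broken arm", "stubbed toe", 30.0),
    ("broken arm", "stubbed toe", 50.0),
    ("broken arm", "stubbed toe", 40.0),
]

totals = defaultdict(lambda: [0.0, 0])
for worse, reference, ratio in reports:
    totals[(worse, reference)][0] += ratio
    totals[(worse, reference)][1] += 1

# Average the reported ratios; the more testimony collected, the steadier the estimate.
harm_ratios = {pair: total / count for pair, (total, count) in totals.items()}
print(harm_ratios)  # {('broken arm', 'stubbed toe'): 40.0}
```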

" "Selfless" anthropomorphizes AI."

Only if you misunderstand the way I used the word. Selfless here simply means that it has no self - the machine cannot understand feelings in any direct way because there is no causal role for any sentience that might be in the machine to influence its thoughts at all (which means we can regard the system as non-sentient).

"There is no fundamental internal difference between "maximize the number of paperclips" and "maximize the happiness of intelligent beings" - both are utility functions plus a dynamic. One is not more "selfless" than another simply because it values intelligent life highly."

Indeed there isn't. If you want to program AGI to be moral though, you make sure it focuses on harm management rather than paperclip production (which is clearly not doing morality).

"The issue is that there are many ways to carve up reality into Good and Bad, and only a very few of those ways results in an AI which does anything like what we want."

In which case, it's easy to reject the ones that don't offer what we want. The reality is that if we put the wrong kind of "morality" into AGI, it will likely end up killing lots of people that it shouldn't. If you run it on a holy text, it might exterminate all Yazidis. What I want to see is a list of proposed solutions to this morality issue ranked in order of which look best, and I want to see a similar league table of the biggest problems with each of them. Utilitarianism, for example, has been pushed down by the Mere Addition Paradox, but that paradox has now been resolved and we should see utilitarianism's score go up as a result. Something like this is needed as a guide to all the different people out there who are trying to build AGI, because some of them will succeed and they won't be experts in ethics. At least if they make an attempt at governing it using the method at the top of the league, we stand a much better chance of not being wiped out by their creations.
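
To show the kind of structure I have in mind, here is a toy sketch in Python (the entries, problem names and scoring are all invented for illustration):

```python
# A toy sketch of the "league table" idea: each proposed system carries a list
# of outstanding problems, and its rank recovers as problems get resolved.
# Entries, problem names and the scoring rule are all invented.

systems = {
    "total utilitarianism":   {"open_problems": ["mere addition paradox"], "resolved": []},
    "average utilitarianism": {"open_problems": ["sadistic conclusion"],   "resolved": []},
}

def score(entry, penalty_per_problem=1.0):
    """More unresolved problems means a lower score."""
    return -penalty_per_problem * len(entry["open_problems"])

def resolve(name, problem):
    """Move a problem from the open list to the resolved list."""
    entry = systems[name]
    entry["open_problems"].remove(problem)
    entry["resolved"].append(problem)

resolve("total utilitarianism", "mere addition paradox")
ranked = sorted(systems, key=lambda name: score(systems[name]), reverse=True)
print(ranked)  # total utilitarianism now climbs above average utilitarianism
```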

"Perhaps the AI could check with us to be sure, but a. did we tell it to check with us?, b. programmer manipulation is a known risk, and c. how exactly will it check its planned future against a brain? Naive solutions to issue c. run the risk of wireheading and other outcomes that will produce humans which after the factappreciate the modification but which we, before the modification would barely consider human at all. This is very non-trivial."

AGI will likely be able to make better decisions than the people it asks permission from even if it isn't using the best system for working out morality, so it may be a moral necessity to remove humans from the loop. We have an opportunity to use AGI to check rival AGI systems for malicious programming, although it's hard to check on devices made by rogue states, and one of the problems we face is that these things will go into use as soon as they are available without waiting for proper moral controls - rogue states will put them straight into the field and we will have to respond to that by not delaying ours. We need to nail morality urgently and make sure the best available way of handling it is available to all who want to fit it.

"It is possible, but it's also possible for the robot to come to an entirely different conclusion. And even if you think that it would be inherently morally wrong for the robot to kill all humans, it won't feel wrong from the inside - there's no reason to expect a non-aligned machine intelligence to spontaneously align itself with human wishes."

The machine will do what it's programmed to do. Its main task is to apply morality to people by stopping people doing immoral things, making stronger interventions for more immoral acts, and being gentle when dealing with trivial things. There is certainly no guarantee that a machine will do this for us though unless it is told to do so, although if it understands the existence of sentience and the need to manage harm, it might take it upon itself to do the job we would like it to do. That isn't something we need to leave to chance though - we should put the moral governance in ROM and design the hardware to keep enforcing it.

"(See The Bottom Line [LW · GW].)"

Will do and will comment afterwards as appropriate.

"Why will the AGI share your moral intuitions? (I've said something similar to this enough times, but the same criticism applies.)"

They aren't intuitions - each change in outcome is based on different amounts of information being available, and each decision is based on weighing up the weighable harm. It is simply the application of a method.

"Also, your model of morality doesn't seem to have room for normative responsibility, so where did "it's only okay to run over a child if the child was there on purpose" come from?"

Where did you read that? I didn't write it.

"It's still hurting a child just as much, no matter whether the child was pushed or if they were simply unaware of the approaching car."

If the child was pushed by a gang of bullies, that's radically different from the child being bad at judging road safety. If the option is there to mow down the bullies that pushed a child onto the road instead of mowing down that child, that is the option that should be taken (assuming no better option exists).

"It makes sense to you to override the moral system and punish the exploiter, because you're using this system pragmatically. An AI with your moral system hard-coded would not do that. It would simply feed the utility monster, since it would consider that to be the most good it could do."

I can't see the link there to anything I said, but if punishing an exploiter leads to a better outcome, why would my system not choose to do that? If you were to live the lives of the exploited and the exploiter, you would have a better time if the exploiter is punished just the right amount to give you the best time overall as all the people involved (and this includes a deterrence effect on other would-be exploiters).
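
To put rough numbers on that (everything in the sketch below is invented), the "just the right amount" calculation would look something like this:

```python
# An illustrative sketch with invented numbers: weigh the suffering imposed on
# the exploiter against the suffering prevented by deterring other would-be
# exploiters, and pick the punishment level with the best overall total.

def total_wellbeing(punishment, harm_per_exploit=10.0):
    """Net outcome summed over the exploiter and everyone spared by deterrence."""
    deterred_exploits = min(punishment / 2.0, 5.0)  # assumed: deterrence rises, then flattens off
    suffering_prevented = deterred_exploits * harm_per_exploit
    suffering_imposed = punishment
    return suffering_prevented - suffering_imposed

best = max(range(21), key=total_wellbeing)
print(best)  # 10 - beyond this, extra punishment deters nothing more and only adds harm
```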

"I agree that everyday, in-practice morality is like this, but there are other important questions about the nature and content of morality that you're ignoring."

Then let's get to them. That's what I came here to look for.

""What is yet to be worked out is the exact wording that should be placed in AGI systems to build either this rule or the above methodology into them" --->This is the Hard Problem, and in my view one of the two Hard Problems of AGI."

Actually, I was wrong about that. If you look at the paragraph in brackets at the end of my post (the main blog post at the top of this page), I set out the wording of a proposed rule and wondered if it amounted to the same thing as the method I'd outlined. Over the course of writing later parts of this series of blog posts, I realised that that attempted wording was making the same mistake as many of the other proposed solutions (various types of utilitarianism). These rules are an attempt to put the method into a compact form, but the method already is the rule, while these compact versions risk introducing errors. Some of them may produce the same results for any situation, but others may be some way out.

There is also room for there to be a range of morally acceptable solutions with one rule setting one end of the acceptable range and another rule setting the other. For example, in determining optimal population size, average utilitarianism and total utilitarianism look as if they provide slightly different answers, but they'll be very similar and it would do little harm to allow the population to wander between the two values. If all moral questions end up with a small range with very little difference between the extremes of that range, we're not going to worry much about getting it very slightly wrong if we still can't agree on which end of the range is slightly wrong. What we need to do is push these different models into places where they might show us that they're way wrong, because then it will be obvious. If that's already been done, it should all be there in the league tables of problems under each entry in the league table of proposed systems of determining morality.

"Morality seems basic to you, since our brains and concept-space and language are optimized for social things like that, but morality has a very high complexity as measured mathematically, which makes it difficult to describe to something that's not human. (This is similar to the formalizations of Occam's Razor, if you want to know more.)"

If we were to go to an alien planet and were asked by warring clans of these aliens to impose morality on them to make their lives better, do you not think we could do that without having to feel the way they do about things? We would be in the same position as the machines that we want to govern us. What we'd do is ask these aliens how they feel in different situations and how much it hurts them or pleases them. We'd build a database of knowledge of these feelings based on their testimony, and the accuracy would increase the more data we collect from them. We then apply my method and try to produce the best outcome on the basis of there only being one player who has to get the best out of the situation. That needs the application of game theory. It's all maths.

"If the AI has the correct utility function, it will not say "but this is illogical/useless" and then reject it. Far more likely is that the AI never "cares about" humans in the first place."

It certainly won't care about us, but then it won't care about anything (including its self-less self). Its only purpose will be to do what we've asked it to do, even if it isn't convinced that sentience is real and that morality has a role.