Waser's 3 Goals of Morality

post by mwaser · 2010-11-02T19:12:49.132Z · LW · GW · Legacy · 25 comments

In the spirit of Asimov’s 3 Laws of Robotics:

  1. You should not be selfish
  2. You should not be short-sighted or over-optimize
  3. You should maximize the progress towards and fulfillment of all conscious and willed goals, weighting numbers and diversity equally, and your own goals and those of others equally

It is my contention that Yudkowsky’s CEV converges to the following 3 points:

  1. I want what I want
  2. I recognize my obligatorily gregarious nature; realize that ethics and improving the community are the community’s most rational path towards maximizing the progress towards and fulfillment of everyone’s goals; and realize that, to be rational and effective, the community should punish anyone who is not being ethical or improving the community (even if the punishment is “merely” withholding help and cooperation)
  3. I shall, therefore, be ethical and improve the community in order to obtain assistance, prevent interference, and most effectively achieve my goals

I further contend that, if this CEV is translated to the 3 Goals above and implemented in a Yudkowskian Benevolent Goal Architecture (BGA), the result would be a Friendly AI.

It should be noted that evolution and history say that cooperation and ethics are stable attractors while submitting to slavery (when you don’t have to) is not.  This formulation expands Singer’s Circles of Morality as far as they’ll go and tries to eliminate irrational Us-Them distinctions based on anything other than optimizing goals for everyone — the same direction that humanity seems headed in and exactly where current SIAI proposals come up short.

Once again, cross-posted here on my blog (unlike my last article, I have no idea whether this will be karma'd out of existence or not ;-)

25 comments

Comments sorted by top scores.

comment by jimrandomh · 2010-11-02T19:31:22.645Z · LW(p) · GW(p)

This is too confused to follow as a human, and much too confused to program an AI with.

Also, ambiguity aside, (2) is just bad. I'm having trouble imagining a concrete interpretation of "don't over-optimize" that doesn't reduce to "fail to improve things that should be improved". And while short-sightedness is a problem for humans who have trouble modelling the future, I don't think AIs have that problem, and there are some interesting failure modes (of the destroys-humanity variety) that arise when an AI takes too much of a long view.

Replies from: mwaser, mwaser
comment by mwaser · 2010-11-02T20:41:54.445Z · LW(p) · GW(p)

One of my frequent criticisms of LessWrong denizens is that they are very quick to say "This is too confused" when they should be saying "I don't understand and don't care to take the time to try to understand".

And if you can't imagine a concrete interpretation of "don't over-optimize" then you have obviously done no simulation work whatsoever. One of the most common problems (if not the most common) is to watch a simulation chugging along with all the parameters within normal ranges, only to have everything suddenly rabbithole to extreme and unlikely values because some minor detail (a rule excluding a virtually impossible edge case) is missing from the simulation.
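
A minimal, purely illustrative sketch of that failure mode (all names and numbers are hypothetical, not from the original comment): a toy planner whose profit model omits one real-world constraint, so the optimizer exploits the gap and drives the plan to an extreme, implausible value.

```python
# Toy "simulation" whose objective omits one real-world constraint
# (demand saturates above ~1,000 units). The optimizer exploits the gap
# and drives the plan to an extreme value. Hypothetical names and numbers.

def modelled_profit(units: int, price: float = 5.0) -> float:
    # Naive model: revenue grows without bound because the demand cap
    # was never encoded in the simulation.
    return units * price

def optimize(candidates, constraint=None):
    # Pick the candidate plan with the highest modelled profit,
    # optionally restricted by a feasibility constraint.
    feasible = [u for u in candidates if constraint is None or constraint(u)]
    return max(feasible, key=modelled_profit)

if __name__ == "__main__":
    candidates = range(0, 1_000_000, 1_000)
    print(optimize(candidates))                                   # 999000: rabbitholes to an extreme plan
    print(optimize(candidates, constraint=lambda u: u <= 1_000))  # 1000: stays within plausible range
```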

Or, can you really not see someone over-optimizing their search for money at the expense of their happiness or the rest of their life?

Of course, this comment will rapidly be karma'd into oblivion and the echo chamber will continue.

Replies from: jimrandomh, Kingreaper
comment by jimrandomh · 2010-11-02T21:05:55.897Z · LW(p) · GW(p)

One of my frequent criticisms of LessWrong denizens is that they are very quick to say "This is too confused" when they should be saying "I don't understand and don't care to take the time to try to understand".

The burden of clarity falls on the writer. Not all confusion is the writer's fault, but confused writing is a very major problem in philosophy. In fact, I would say it's more of a problem than falsehood is. There's no shame in being confused - almost everyone is, especially around complex topics like morality. But you can't expect to make novel contributions that are any good until you've untangled the usual confusions and understood the progress that's previously been made.

Or, can you really not see someone over-optimizing their search for money at the expense of their happiness or the rest of their life.

If someone sacrifices happiness to seek money, the problem is not that they're doing too good a job of earning money, it's that they're optimizing the wrong thing entirely. An AI wouldn't see your advice against over-optimizing and put more resources into finding happiness for people; instead, it would waste some of its money to make sure it didn't have too much.

Replies from: mwaser
comment by mwaser · 2010-11-02T21:42:51.854Z · LW(p) · GW(p)

The burden of clarity falls on the writer. Not all confusion is the writer's fault, but confused writing is a very major problem in philosophy. In fact, I would say it's more of a problem than falsehood is. There's no shame in being confused - almost everyone is, especially around complex topics like morality. But you can't expect to make novel contributions that are any good until you've untangled the usual confusions and understood the progress that's previously been made.

A good point and well written. My counter-point is that numerous other people have not had problems with my logic; have not needed special definitions of "terms" that were pretty clear standard English; have not insisted on throwing up strawmen; etc.

Your assumption is that I haven't untangled the usual confusions and that I haven't read the literature. It's an argument from authority, but I can't help pointing out that I was a Philosophy major 30 years ago and have been reading and learning constantly since then. Further, the outside view is that it is LessWrong that is generally confused and intolerant of outside views.

Your second argument is a classic case of a stupid super-intelligent AI.

Replies from: NihilCredo
comment by NihilCredo · 2010-11-02T21:51:30.070Z · LW(p) · GW(p)

Then apparently Less Wrong readers are more stupid or more ignorant than your previous audience. In which case I am afraid you will have to dumb down your writing so that it is comprehensible and useful to your current target audience.

Replies from: mwaser
comment by mwaser · 2010-11-02T22:10:05.225Z · LW(p) · GW(p)

Then apparently Less Wrong readers are more stupid or more ignorant than your previous audience.

This is the type of strawman that frustrates me. I said nothing of the sort.

An equally valid interpretation (and my belief) is that LessWrong readers are much less tolerant of common English phrases and more prone to inventing strawmen, to the point of making communication at any decent speed nearly impossible. I'm starting to get the lesson that LessWrong really is conservative to an extreme (this is not a criticism at all).

Your point about altering my writing for the current target audience is dead on the money. In general, your post was as adversarial as my writing is interpreted as being. There's a definite double standard here (but since I'm here as a guest, I shouldn't complain).

Replies from: grouchymusicologist
comment by grouchymusicologist · 2010-11-02T22:29:33.578Z · LW(p) · GW(p)

LW readers are, perhaps, more cautious than average about "accepting common English phrases" because a major topic in rationality is precisely the fact that such common phrases often conceal fatal vagueness. Whether or not I agree with you that you've been using certain words and phrases to mean exactly what an ordinary English speaker would understand them to mean, this kind of caution surrounding ordinary language is generally considered to be a feature, not a bug, of discourse around here.

As far as the double standard thing, it seems like the one hypothesis you can't bring yourself to entertain is that nobody can figure out what you're talking about, despite some fairly sympathetic attempts to do so. After a few times around, everyone will have lost patience with you, yes. But that's not a double standard. (I say this as emphatically an outsider: I don't comment here much and no one at LW knows me from Adam.)

(Sorry in advance that I won't be able to reply to any comments for at least 24 hours, since I'm traveling -- musicology conference this week!)

comment by Kingreaper · 2010-11-02T21:06:29.274Z · LW(p) · GW(p)

That's over-optimising a single aspect resulting in overall under-optimisation.

It's not over-optimising overall.

Replies from: mwaser
comment by mwaser · 2010-11-02T21:19:53.206Z · LW(p) · GW(p)

True. And I did not say over-optimising overall. Humans are very prone to over-optimization (e.g. money at the expense of happiness and/or a life). How would you have phrased that?

Replies from: NihilCredo, DaveX, Kingreaper
comment by NihilCredo · 2010-11-02T21:27:54.783Z · LW(p) · GW(p)

Humans usually phrase it as "You should keep your priorities straight".

Replies from: mwaser
comment by mwaser · 2010-11-02T22:00:00.872Z · LW(p) · GW(p)

Thank you but I don't feel that that clearly expresses my point.

comment by DaveX · 2010-11-08T19:24:15.895Z · LW(p) · GW(p)

In your example, money versus happiness is a choice between alternatives. Whatever goal you are trying to optimize towards should provide the guidance in making the choices between alternatives.

Language about "over-optimizing" one alternative at the expense of another distracts from identifying your real goals and how you make the tradeoffs to achieve them.

comment by Kingreaper · 2010-11-02T21:28:29.678Z · LW(p) · GW(p)

"Do not over-optimise one aspect at the expense of overall utility"

Replies from: mwaser
comment by mwaser · 2010-11-02T21:49:21.515Z · LW(p) · GW(p)

Good phrasing.

but which is better . . . .

  1. You should not over-optimize or be short-sighted at the expense of overall utility.

OR

  1. You should not be short-sighted or over-optimize at the expense of overall utility.

"at the expense of overall utility" applies to both halves of the statement

Is it still just as bad? Or was the initial comment a bit hasty and unwarranted in that respect?

Replies from: Kingreaper
comment by Kingreaper · 2010-11-02T21:56:09.692Z · LW(p) · GW(p)

"at the expense of overall utility" is unnecessary for the "short-sighted" bit: that is implied by the phrase. Short-sighted-ness is a well known character flaw.

And your version is still bad. "Over-optimising at the expense of overall utility" is hard to parse. You're missing "one aspect": you shouldn't over-optimise one aspect at the expense of overall utility.

comment by mwaser · 2010-11-02T20:43:36.377Z · LW(p) · GW(p)

How does a post that says "I'm having trouble imagining a concrete interpretation of 'don't over-optimize' that doesn't reduce to 'fail to improve things that should be improved'" almost immediately get 4 upvotes?

Can everyone here really not see someone over-optimizing their search for money at the expense of their happiness or the rest of their life?

Yeah, yeah, I know. You are all so eager to deep-six the original post that you'll upvote any detractors no matter how bad.

Replies from: Kingreaper, Vladimir_Nesov
comment by Kingreaper · 2010-11-02T21:09:52.974Z · LW(p) · GW(p)

You're not expressing yourself very clearly, and you're creating multiple top-level discussion posts for reasons that are not very clearly thought through.

You're then getting indignant when others don't understand your points, being unwilling to accept that your explanation may have been flawed, and proceeding to insult the community in general.

This is not likely to result in you getting the benefit of the doubt.

If you took a break from top-level posting, and tried to work out your positions more clearly, and less adversarially, you might stand a better chance.

Replies from: mwaser
comment by mwaser · 2010-11-02T21:36:09.955Z · LW(p) · GW(p)

Um. Do I have a choice about creating multiple top-level posts? (Yes, that is a serious question) Once a post is below threshold . . . .

I'm perfectly willing to accept that I'm not expressing myself in a fashion that this community is happy with (the definition of clear is up for grabs ;-)

I'm not willing to accept that my different posts are not clearly thought through (and the short time between them is an artifact of my having written a lot of posts during the several-month period when I wasn't updating my blog).

Indignant is your interpretation. I haven't felt that emotion after the first few days. ;-)

My explanation was clearly either poorly communicated or flawed. I disagree with necessarily flawed.

I will argue that making the criticism that LessWrong denizens are very quick to say "This is too confused" when they should be saying "I don't understand and don't care to take the time to try to understand" is much more in the line of constructive criticism than an insult.

Yes, as a newbie and a boat-rocker, I will not get the benefit of the doubt regardless of what I do.

My positions are pretty clear and have been vetted by a decent number of other people. My admittedly biased view is that I am not taking an adversarial role (except for some idiot slips), and that most of my statements (while necessarily biased) about bad argumentation practices are meant to be constructive, not insulting. But the way LessWrong treats all newcomers is unnecessarily harsh, to the extent that you have an established reputation for having built an "echo chamber"; that, however, can be changed.

Replies from: Alicorn, Kingreaper
comment by Alicorn · 2010-11-02T21:46:55.851Z · LW(p) · GW(p)

Um. Do I have a choice about creating multiple top-level posts? (Yes, that is a serious question) Once a post is below threshold...

Yes, you have a choice about making top-level posts. If you keep making such poor ones, so often, with so little improvement, that choice will be taken away. If you made better ones you could become a valuable contributor; if you made poor ones infrequently, you'd be a very low-level nuisance not worth tackling. As is, I'm more tempted every time you post to ban the noise you throw around in defense of the signal-to-noise ratio.

I endorse everything Kingreaper said in the grandparent. You would do well to take such kind advice more seriously.

Replies from: mwaser
comment by mwaser · 2010-11-02T22:25:34.495Z · LW(p) · GW(p)

Got it. Believe it or not, I am trying to figure out the rules (which are radically different than a number of my initial assumptions) and not trying solely to be a pain in the ass.

I'll cool it on the top level posts.

Admittedly, a lot of my problem is that there is either a really huge double standard or I'm missing something critical. To illustrate: Kingreaper's comment says "Something is clear if it is easily understood by those with the necessary baseline knowledge." My posts are, elsewhere, considered very clear by people with less baseline knowledge. If my post was logically incorrect to someone with more baseline knowledge, then they should be able to dissect it and get to the root of the problem. Instead, what I'm seeing is a tremendous number of strawmen. The lesson seems to be: "If you don't go slowly and you fail to rule out every single strawman that I can possibly raise, I will refuse to let you go further (and I will do it by insisting that you have actively embraced the strawman)." Am I starting to get it, or am I way off base?

Note: I am never trying to insult (except one ill-chosen all-caps response). But the community seems to be acting against its own goals as I perceive it has stated them. Would it be fair to say that your expectations (and apparently even your goals) are not clear to new posters? (Not newcomers: I have read and believe I grok all of the sequences, to the extent that virtually any link that gets pointed to, I've already seen.)

One last comment. At the top of discussion posts, it says "This part of the site is for the discussion of topics not yet ready or not suitable for normal top-level posts." That is what led me to believe that posting a couple of posts that I obviously considered ready for normal prime-time (i.e. not LessWrong) wouldn't be a problem. I am now being told that it is a problem and I will abide. But can you make any clarification?

Thanks.

Replies from: jimrandomh, Alicorn
comment by jimrandomh · 2010-11-02T23:35:55.394Z · LW(p) · GW(p)

Kingreaper's definition of clarity is actually not quite right. In order to be clear, you have to carve reality at the joints. That's what the problem was with the Intelligence vs. Wisdom post; there wasn't anything obviously false, at least that I noticed, but it seemed to be dividing up concept space in an unnatural way. Similarly with this post. For example, "selfish" is a natural concept for humans, who have a basic set of self-centered goals by default, which they balance against non-self-centered goals like improving their community. But if you take that definition and try to transfer it to AIs, you run into trouble, because they don't have those self-centered goals, so if you want to make sense of it you have to come up with a new definition. Is an AI that optimizes the happiness of its creator, at the expense of other humans, being selfish? How about the happiness of its creator's friends, at the expense of humanity in general? How about humanity's happiness, at the expense of other terrestrial animals?

Using fuzzy words in places where they don't belong hides a lot of complexity. One way that people respond to that is by coming up with things that the words could mean, and presenting them as counterexamples. You seem to have misinterpreted that as presenting straw-men; it's not saying that the best interpretation is wrong, but rather, saying that the phrasing was vague enough to admit some bad interpretations.

I would also like to add that detecting confusion, both in our own thoughts and in things we read, is one of the main skills of rationality. People here are, on average, much more sensitive to confusion than most people.

comment by Alicorn · 2010-11-02T22:52:52.778Z · LW(p) · GW(p)

At the top of discussion posts, it says "This part of the site is for the discussion of topics not yet ready or not suitable for normal top-level posts." That is what led me to believe that posting a couple of posts that I obviously considered ready for normal prime-time (i.e. not LessWrong) wouldn't be a problem.

Just because the standards are properly lower here than on main LW doesn't mean that you can post an arbitrary volume of arbitrarily ill-received posts without being told to stop.

comment by Kingreaper · 2010-11-02T21:51:53.579Z · LW(p) · GW(p)

Um. Do I have a choice about creating multiple top-level posts? (Yes, that is a serious question) Once a post is below threshold . . . .

You can let the subject lie, and carry on commenting on other people's posts.

I'm perfectly willing to accept that I'm not expressing myself in a fashion that this community is happy with (the definition of clear is up for grabs ;-)

No, the definition of clear is not up for grabs. Something is clear if it is easily understood by those with the necessary baseline knowledge. Your posts are not.

You are acting indignant, whether you are or not, and that is not endearing.

My explanation was clearly either poorly communicated or flawed. I disagree with necessarily flawed.

Your communication is an essential part of your explanation. If your communication is poor (aka flawed), then your explanation is poor (aka flawed).

the way LessWrong treats all newcomers is unnecessarily harsh to the extent that you all have an established reputation of having built an "echo chamber" but that this can be changed.

I've been here less time than you. I came in with the idea that I'd learn how the culture works, and behave appropriately within it while improving rationality.

I'm not exactly popular, and I've been in some rather heated debates, but you see me as part of the establishment. Why? Because I made an effort. Make that effort. Try and be part of the community, rather than setting yourself apart deliberately. Think things through before you do them.

comment by Vladimir_Nesov · 2010-11-03T08:58:59.385Z · LW(p) · GW(p)

The first and third paragraphs are attacks on the group, and the second is a rhetorical question (in the right direction). Please stick to the object level. If you feel that your views are misrepresented, don't take offense: we try to make more precise sense of what you say than your words let on, and can err in interpretation. Progress is made by moving forward, correcting the errors, and arriving at common understanding. Progress is halted by taking offense, which discourages further conversation even without outright stopping it.

If you consider a single top-level goal, then disclaimers about subgoals are unnecessary. Instead of saying "Don't overly optimize any given subgoal (at the expense of the other subgoals)", just say "Optimize the top-level goal". This is simpler and tells you what to do, as opposed to what not to do, with the latter suffering from all the problems of nonapples.
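
A minimal sketch of the difference (the names, weights, and effort budget are hypothetical, chosen only for illustration): fold the subgoals into one top-level objective and optimize that directly, rather than attaching a "don't over-optimize X" disclaimer to each subgoal.

```python
# Sketch: a single top-level objective that already encodes the tradeoff
# between two subgoals, so no separate "don't over-optimize money" rule
# is needed. Names, weights, and the effort budget are hypothetical.

def top_level_utility(money_effort: float, leisure_effort: float) -> float:
    # Diminishing returns on each subgoal; summing them encodes the tradeoff.
    return money_effort ** 0.5 + leisure_effort ** 0.5

if __name__ == "__main__":
    # Split a fixed budget of 10 effort units between the two subgoals
    # and pick the split that maximizes the top-level objective.
    grid = [i / 10 for i in range(0, 101)]
    best = max(((m, 10 - m) for m in grid),
               key=lambda split: top_level_utility(*split))
    print(best)  # (5.0, 5.0): the balanced split falls out automatically
```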

Replies from: mwaser
comment by mwaser · 2010-11-03T20:34:25.926Z · LW(p) · GW(p)

Now that I've got it, this is clear, concise, and helpful. Thank you.

I also owe you (personally) an apology for previous behavior.