No Universally Compelling Arguments in Math or Science

chrishallquist

post by ChrisHallquist · 2013-11-05T03:32:42.920Z · LW · GW · Legacy · 230 comments

  There are minds in the space of minds-in-general that do not recognize modus ponens.
  There are minds in the space of minds-in-general that reason counter-inductively.
  There are minds in the space of minds-in-general that use a maximum entropy prior, and never learn anything.

Last week, I started a thread on the widespread sentiment that people don't understand the metaethics sequence. One of the things that surprised me most in the thread was this exchange:

Commenter: "I happen to (mostly) agree that there aren't universally compelling arguments, but I still wish there were. The metaethics sequence failed to talk me out of valuing this."

Me: "But you realize that Eliezer is arguing that there aren't universally compelling arguments in any domain, including mathematics or science? So if that doesn't threaten the objectivity of mathematics or science, why should that threaten the objectivity of morality?"

Commenter: "Waah? Of course there are universally compelling arguments in math and science."

Now, I realize this is just one commenter. But the most-upvoted comment in the thread also perceived "no universally compelling arguments" as a major source of confusion, suggesting that it was perceived as conflicting with morality not being arbitrary. And today, someone mentioned having "no universally compelling arguments" cited at them as a decisive refutation of moral realism.

After the exchange quoted above, I went back and read the original No Universally Compelling Arguments post, and realized that while it had been obvious to me when I read it that Eliezer meant it to apply to everything, math and science included, it was rather short on concrete examples, perhaps in violation of Eliezer's own advice. The concrete examples can be found in the sequences, though... just not in that particular post.

First, I recommend reading The Design Space of Minds-In-General if you haven't already. TL;DR: the space of minds in general is ginormous and includes some downright weird minds. The space of human minds is a teeny tiny dot in the larger space (in case this isn't clear, the diagram in that post isn't remotely drawn to scale). Now with that out of the way...

There are minds in the space of minds-in-general that do not recognize modus ponens.

Modus ponens is the rule of inference that says that if you have a statement of the form "If A then B", and also have "A", then you can derive "B". It's a fundamental part of logic. But there are possible minds that reject it. A brilliant illustration of this point can be found in Lewis Carroll's dialog "What the Tortoise Said to Achilles" (for those not in the know, Carroll was a mathematician; Alice in Wonderland is secretly full of math jokes).
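To make the rule concrete, here is a minimal sketch in Python. The function name `closure` and the flag `use_modus_ponens` are made up for this illustration; nothing like them appears in the sequences. A "mind" here is just a procedure that expands a set of accepted statements, and the only difference between the two minds is whether the rule is part of the procedure.

```python
def closure(facts, rules, use_modus_ponens=True):
    """Return everything a toy mind accepts, given facts and if-then rules.

    `rules` is a list of (antecedent, consequent) pairs standing for
    "if antecedent then consequent". Hypothetical toy model, not a real logic engine.
    """
    known = set(facts)
    if not use_modus_ponens:
        # This mind stores the conditionals but never acts on them.
        return known
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            # Modus ponens: from "A" and "if A then B", derive "B".
            if antecedent in known and consequent not in known:
                known.add(consequent)
                changed = True
    return known


rules = [("A", "B"), ("B", "C")]
print(sorted(closure({"A"}, rules)))                          # ['A', 'B', 'C']
print(sorted(closure({"A"}, rules, use_modus_ponens=False)))  # ['A']
```

Note that handing the second mind more conditionals changes nothing: the rule has to be something the mind does, not something it is told.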

Eliezer covers the dialog in his post Created Already In Motion, but here's the short version: In Carroll's dialog, the Tortoise asks Achilles to imagine someone rejecting a particular instance of modus ponens (drawn from Euclid's Elements, though that isn't important). The Tortoise suggests that such a person might be persuaded by adding an additional premise, and Achilles goes along with it, foolishly, because this quickly leads to an infinite regress when the Tortoise suggests that someone might reject the new argument in spite of accepting the premises (which leads to another round of trying to patch the argument, and then...).
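The regress can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `tortoise_regress`, the premise wording, and the cap on rounds are all invented here, and the lettering (A, B, then C, D, ... with Z as the conclusion) follows the dialog only loosely.

```python
from string import ascii_uppercase


def tortoise_regress(rounds):
    """Return the premises written down after `rounds` attempts to patch the argument."""
    if rounds > 20:
        # Keep the generated names (C, D, ...) from colliding with the conclusion Z.
        raise ValueError("this toy model only names premises up to V")
    premises = {
        "A": "things equal to the same are equal to each other",
        "B": "the two sides of this triangle are equal to the same",
    }
    for _ in range(rounds):
        name = ascii_uppercase[len(premises)]
        # Achilles writes the inference step down as one more premise.
        premises[name] = "if " + " and ".join(premises) + " then Z"
    return premises


patched = tortoise_regress(3)
print(patched["E"])    # if A and B and C and D then Z
print("Z" in patched)  # False
```

However many rounds you run, the notebook only ever contains more premises. The conclusion Z shows up only if something outside the notebook actually performs the inference.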

"What the Tortoise Said to Achilles" is one of the reasons I tend to think of the so-called "problem of induction" as a pseudo-problem. The "problem of induction" is often defined as the problem of how to justify induction, but it seems to make just as much sense to ask how to justify deduction. But speaking of induction...

There are minds in the space of minds-in-general that reason counter-inductively.

To quote Eliezer:

There are possible minds in mind design space who have anti-Occamian and anti-Laplacian priors; they believe that simpler theories are less likely to be correct, and that the more often something happens, the less likely it is to happen again.

And when you ask these strange beings why they keep using priors that never seem to work in real life... they reply, "Because it's never worked for us before!"
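One way to picture the contrast is with Laplace's rule of succession, which estimates the chance of another success after s successes in n trials as (s + 1) / (n + 2). The mirrored rule below is a hypothetical construction for illustration only; the quoted passage doesn't give a formula for an anti-Laplacian prior, and the function names are made up.

```python
from fractions import Fraction


def laplace_rule(successes, trials):
    """Laplace's rule of succession: more past successes make the next one more likely."""
    return Fraction(successes + 1, trials + 2)


def anti_laplace_rule(successes, trials):
    """Hypothetical mirror image: more past successes make the next one LESS likely."""
    failures = trials - successes
    return Fraction(failures + 1, trials + 2)


# The sun has risen on each of the last 9 mornings. Will it rise tomorrow?
print(laplace_rule(9, 9))       # 10/11
print(anti_laplace_rule(9, 9))  # 1/11
```

Both functions are perfectly well-defined procedures. Nothing in the arithmetic marks one of them as the broken one; that judgment comes from how they fare in a world like ours.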

If this bothers you, well, I refer you back to Lewis Carroll's dialog. There are also minds in the mind design space that ignore the standard laws of logic, and are furthermore totally unbothered by (what we would regard as) the absurdities produced by doing so. Oh, but if you thought that was bad, consider this...

There are minds in the space of minds-in-general that use a maximum entropy prior, and never learn anything.

Here's Eliezer again discussing a problem where you have to predict whether a ball drawn out of an urn will be red or white, based on the color of the balls that have been previously drawn out of the urn:

Suppose that your prior information about the urn is that a monkey tosses balls into the urn, selecting red balls with 1/4 probability and white balls with 3/4 probability, each ball selected independently. The urn contains 10 balls, and we sample without replacement. (E. T. Jaynes called this the "binomial monkey prior".) Now suppose that on the first three rounds, you see three red balls. What is the probability of seeing a red ball on the fourth round?

First, we calculate the prior probability that the monkey tossed 0 red balls and 10 white balls into the urn; then the prior probability that the monkey tossed 1 red ball and 9 white balls into the urn; and so on. Then we take our evidence (three red balls, sampled without replacement) and calculate the likelihood of seeing that evidence, conditioned on each of the possible urn contents. Then we update and normalize the posterior probability of the possible remaining urn contents. Then we average over the probability of drawing a red ball from each possible urn, weighted by that urn's posterior probability. And the answer is... (scribbles frantically for quite some time)... 1/4!

Of course it's 1/4. We specified that each ball was independently tossed into the urn, with a known 1/4 probability of being red. Imagine that the monkey is tossing the balls to you, one by one; if it tosses you a red ball on one round, that doesn't change the probability that it tosses you a red ball on the next round. When we withdraw one ball from the urn, it doesn't tell us anything about the other balls in the urn.

If you start out with a maximum-entropy prior, then you never learn anything, ever, no matter how much evidence you observe. You do not even learn anything wrong - you always remain as ignorant as you began.
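For anyone who would rather not scribble frantically, the calculation described in the quote can be checked with exact fractions. This is a sketch under the quote's stated setup (10 balls, each independently red with probability 1/4, sampling without replacement); the function name `prob_next_red` and its parameters are my own.

```python
from fractions import Fraction
from math import comb, perm


def prob_next_red(n_balls, p_red, reds_seen, whites_seen):
    """Exact probability that the next draw is red, under the binomial monkey prior."""
    drawn = reds_seen + whites_seen
    if drawn >= n_balls:
        raise ValueError("no balls left to draw")
    weighted_red = Fraction(0)
    evidence = Fraction(0)
    for k in range(n_balls + 1):  # k = number of red balls the monkey put in the urn
        # Prior: each ball is independently red with probability p_red.
        prior = comb(n_balls, k) * p_red**k * (1 - p_red) ** (n_balls - k)
        # Likelihood of the observed sequence when sampling without replacement.
        # perm(n, m) is 0 when m > n, which zeroes out impossible urn contents.
        likelihood = Fraction(
            perm(k, reds_seen) * perm(n_balls - k, whites_seen),
            perm(n_balls, drawn),
        )
        weight = prior * likelihood  # unnormalized posterior for this urn
        evidence += weight
        red_left = max(k - reds_seen, 0)
        weighted_red += weight * Fraction(red_left, n_balls - drawn)
    return weighted_red / evidence


print(prob_next_red(10, Fraction(1, 4), reds_seen=3, whites_seen=0))  # 1/4
print(prob_next_red(10, Fraction(1, 4), reds_seen=0, whites_seen=5))  # 1/4
```

Three reds in a row or five whites in a row, the answer is the same as before any ball was drawn.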

You may think that, while minds such as those I've been describing are possible in theory, they're unlikely to evolve anywhere in the universe, and probably wouldn't survive long if programmed as an AI. And you'd probably be right about that. On the other hand, it's not hard to imagine minds that are generally able to get along well in the world, but irredeemably crazy on particular questions. Sometimes, it's tempting to suspect some humans of being this way, and even if that isn't literally true of any humans, it's not hard to imagine as just a more extreme form of existing human tendencies. See e.g. Robin Hanson on near vs. far mode, and imagine a mind that will literally never leave far mode on certain questions, regardless of the circumstances.

It used to disturb me to think that there might be, say, young earth creationists in the world who couldn't be persuaded to give up their young earth creationism by any evidence or arguments, no matter how long they lived. Yet I've realized that, while there may or may not be actual human young earth creationists like that (it's an empirical question), there are certainly possible minds in the space of mind designs like that. And when I think about that fact, I'm forced to shrug my shoulders and say, "oh well" and leave it at that.

That means I can understand why people would be bothered by a lack of universally compelling arguments for their moral views... but you shouldn't be any more bothered by that than by the lack of universally compelling arguments against young earth creationism. And if you don't think the lack of universally compelling arguments is a reason to think there's no objective truth about the age of the earth, you shouldn't think it's a reason to think there's no objective truth about morality.

(Note: this may end up being just the first in a series of posts on the metaethics sequence. People are welcome to discuss what I should cover in subsequent posts in the comments.)

Added: Based on initial comments, I wonder if some people who describe themselves as being bothered by the lack of universally compelling arguments would more accurately describe themselves as being bothered by the orthogonality thesis.

230 comments

Comments sorted by top scores.

comment by jaime2000 · 2013-11-05T16:37:05.545Z · LW(p) · GW(p)

It seems obvious that people are using "universally compelling arguments" in two different senses.

In the first sense, a universally compelling argument is one that could convince even a rock, or a mind that doesn't implement modus ponens, or a mind with anti-inductive priors. In this sense, the lack of universally compelling arguments for any domain (math/physics/morality) seems sufficiently well established.

In another sense, a universally compelling argument is one that could persuade any sufficiently sane/intelligent mind. I think we can agree that all such minds will eventually conclude that relativity and quantum mechanics are correct (or at least a rough approximation to whatever the true laws of physics end up being), so in this sense we can call the arguments that lead to them universally compelling. Likewise, in this sense, we can note as interesting the non-existence of universally compelling arguments which could compel a sufficiently sane/intelligent paperclipper to value life, beauty, justice, and the American way. It becomes more interesting if we also consider the case of babyeaters, pebblesorters, or humans with values sufficiently different to our own.

You are using the term in the first sense, but the people who are bothered by it are using it in the second sense.

Replies from: SaidAchmiz, army1987, Eugine_Nier

↑ comment by Said Achmiz (SaidAchmiz) · 2013-11-05T18:22:42.684Z · LW(p) · GW(p)

Except that "sufficiently sane/intelligent" here just means, it seems, "implements modus ponens, has inductive priors, etc." We can, like Nick Tarleton, simply define as "not a mind" any entity or process that doesn't implement these criteria for sufficient sanity/intelligence...

... but then we are basically saying: any mind that is not convinced by what we think should be universally compelling arguments, is not a mind.

That seems like a dodge, at best.

Are there different criteria for sufficient sanity and intelligence, ones not motivated by the matter of (allegedly) universally compelling arguments?

Replies from: Tyrrell_McAllister, Armok_GoB, TheAncientGeek

↑ comment by Tyrrell_McAllister · 2013-11-05T20:11:22.102Z · LW(p) · GW(p)

Except that "sufficiently sane/intelligent" here just means, it seems, "implements modus ponens, has inductive priors, etc."

"Sufficiently sane/intelligent" means something like, "Has a sufficient tendency to form true inferences from a sufficiently wide variety of bodies of evidences."

Now, we believe that modus ponens yields true inferences. We also believe that a tendency to make inferences contrary to modus ponens will cause a tendency to make false inferences. From this you can infer that we believe that a sufficiently sane/intelligent agent will implement modus ponens.

But the truth of this inference about our beliefs does not mean that "sufficiently sane/intelligent" is defined to mean "implements modus ponens".

In particular, our definition of "sufficiently sane/intelligent" implies that, if A is a sufficiently sane/intelligent agent who lives in an impossible possible world that does not implement modus ponens, then A does not implement modus ponens.

Replies from: Eugine_Nier

↑ comment by Eugine_Nier · 2013-11-06T05:56:19.253Z · LW(p) · GW(p)

"Sufficiently sane/intelligent" means something like, "Has a sufficient tendency to form true inferences from a sufficiently wide variety of bodies of evidences."

Since clippy fails to form true inferences about morality, doesn't it also count as "insufficiently sane/intelligent"?

Replies from: army1987, Kawoomba

↑ comment by A1987dM (army1987) · 2013-11-06T10:26:57.235Z · LW(p) · GW(p)

Clippy knows what is moral and what isn't. He just doesn't care.

Replies from: Jack, Eugine_Nier

↑ comment by Jack · 2013-11-06T15:47:07.624Z · LW(p) · GW(p)

Imagine if humans had never broken into different groups and we all spoke the same language. No French, no English, just "the Language". People study the Language, debate it, etc.

Then one day intelligent aliens arrive. Philosophers immediately begin debating: do these aliens have the Language? On the one hand, they're making noises with what appears to be something comparable to a mouth, the noises have an order and structure to them, and they communicate information. But what they do sounds nothing like "the Language". They refer to objects with different sounds than the Language requires, and sometimes make sounds that describe what an object is like after the sound that refers to the object.

"Morality" has a similar type-token ambiguity. It can refer to our values or to values in general. Saying Clippy knows what is moral but that he doesn't care is true under the token interpretation, but not the type one. The word "morality" has meanings and connotations that imply that Clippy has a morality but that it is just different-- in the same way that the aliens have language but that it is just different.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2013-11-07T10:34:12.343Z · LW(p) · GW(p)

So, I guess the point of EY's metaethics can be summarized as ‘by “morality” I mean the token, not the type’.

(Which is not a problem IMO, as there are unambiguous words for the type, e.g. “values” -- except insofar as people are likely to misunderstand him.)

Replies from: Viliam_Bur, TheAncientGeek

↑ comment by Viliam_Bur · 2013-11-07T19:28:10.457Z · LW(p) · GW(p)

‘by “morality” I mean the token, not the type’

Especially because the whole point is to optimize for something. You can't optimize for a type that could have any value.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-07T20:24:15.061Z · LW(p) · GW(p)

Isn't it an optimization to code in the type, and let the AI work out the details necessary to implement the token? We don't think theorem provers need to be overloaded with all known maths.

Replies from: Viliam_Bur

↑ comment by Viliam_Bur · 2013-11-07T21:10:08.559Z · LW(p) · GW(p)

Is this some kind of an NLP exercise?

Replies from: TheOtherDave, TheAncientGeek

↑ comment by TheOtherDave · 2013-11-07T22:09:01.926Z · LW(p) · GW(p)

FWIW, I've mostly concluded something along those lines.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-08T13:50:16.013Z · LW(p) · GW(p)

You wrote

"But when you ask a question and someone provides an answer you don't like, showing why that answer is wrong can sometimes be more effective than simply asserting that you don't buy it"

..and I did..

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2013-11-08T14:53:59.160Z · LW(p) · GW(p)

Indeed. And?

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-08T14:56:16.760Z · LW(p) · GW(p)

If you don't want someone to put up an argument, don't ask for it.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2013-11-08T15:04:28.662Z · LW(p) · GW(p)

I agree completely.
Had I known in advance the quality of argument you would put up, I would not have wanted you to put it up, and would not have asked for one, in full compliance with this maxim.
Lacking prescience, I didn't know in advance, so I did want an argument, and I did ask for one, which fails to violate this maxim.

Replies from: HalMorris, TheAncientGeek

↑ comment by HalMorris · 2013-11-08T15:46:17.532Z · LW(p) · GW(p)

You wanted an argument? Sorry this is "Insults". Go down the hall and to the left. Monty Python (to my best recollection)

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2013-11-08T16:12:29.988Z · LW(p) · GW(p)

You want 12A, just along the corridor

↑ comment by TheAncientGeek · 2013-11-08T15:38:42.579Z · LW(p) · GW(p)

I'm afraid I have developed a sudden cognitive deficit that prevents me from understanding anything you are saying. I have also forgotten all the claims I have made, and what this discussion is about.

In short, I'm tapping out.

↑ comment by TheAncientGeek · 2013-11-07T21:38:07.080Z · LW(p) · GW(p)

↑ comment by TheAncientGeek · 2013-11-07T12:19:06.714Z · LW(p) · GW(p)

There are immoral and amoral values, so no.

↑ comment by Eugine_Nier · 2013-11-07T03:23:29.806Z · LW(p) · GW(p)

How is this different from:

The creationist knows what I believe but doesn't care.

Replies from: fubarobfusco, nshepperd, army1987

↑ comment by fubarobfusco · 2013-11-07T08:18:29.339Z · LW(p) · GW(p)

The argument of the dragon in my garage suggests that the supernaturalist already knows the facts of the natural world, but doesn't care.

But the sense in which "Clippy knows what is moral" is that Clippy can correctly predict humans, and "morality" has to do with what humans value and approve of — not what paperclippers value and approve of.

↑ comment by nshepperd · 2013-11-07T05:07:00.997Z · LW(p) · GW(p)

A creationist is mistaken about the origin of the Earth (they believe the Earth was created by a deity).

Replies from: Eugine_Nier

↑ comment by Eugine_Nier · 2013-11-09T08:00:31.364Z · LW(p) · GW(p)

And this is different from a paperclip maximizer mistakenly believing that morality consists of optimizing paperclips, how?

Replies from: army1987, lmm, nshepperd

↑ comment by A1987dM (army1987) · 2013-11-11T13:18:51.396Z · LW(p) · GW(p)

Do you mean a paperclip maximizer mistakenly believing that the English word moral means ‘optimizing paperclips’ rather than ‘optimizing life, consciousness, etc.’, or a paperclip maximizer who knows that the English word moral means ‘optimizing life, consciousness, etc.’ but mistakenly believes that optimizing paperclips would optimize life, consciousness, etc.?

And neither is like a paperclip maximizer who knows that the English word moral means ‘optimizing paperclips’ rather than ‘optimizing life, consciousness, etc.’, and knows that optimizing paperclips doesn't optimize life, consciousness, etc., but doesn't give a damn about optimizing life, consciousness, etc.

Replies from: Vladimir_Nesov, TheAncientGeek, Eugine_Nier

↑ comment by Vladimir_Nesov · 2013-11-11T13:37:53.444Z · LW(p) · GW(p)

The structure of the above comment would benefit from using a macro:

Let M = 'life, consciousness, and activity; health and strength; pleasures and satisfactions of all or certain kinds; happiness, beatitude, contentment, etc.; truth; knowledge and true opinions of various kinds, understanding, wisdom; beauty, harmony, proportion in objects contemplated; aesthetic experience; morally good dispositions or virtues; mutual affection, love, friendship, cooperation; just distribution of goods and evils; harmony and proportion in one's own life; power and experiences of achievement; self-expression; freedom; peace, security; adventure and novelty; and good reputation, honor, esteem, etc.'

Do you mean a paperclip maximizer mistakenly believing that the English word "moral" means 'optimizing paperclips' rather than 'optimizing M', or a paperclip maximizer who knows that the English word "moral" means 'optimizing M' but mistakenly believes that optimizing paperclips would optimize M?

And neither is like a paperclip maximizer who knows that the English word "moral" means 'optimizing paperclips' rather than 'optimizing M', and knows that optimizing paperclips doesn't optimize M.

↑ comment by TheAncientGeek · 2013-11-11T14:50:52.512Z · LW(p) · GW(p)

Or a paperclip maximiser who correctly believes that "moral" doesn't refer to an arbitrary set of preferences?

Replies from: army1987

↑ comment by A1987dM (army1987) · 2013-11-11T18:34:35.340Z · LW(p) · GW(p)

http://lesswrong.com/lw/t3/the_bedrock_of_morality_arbitrary/

Replies from: Eugine_Nier

↑ comment by Eugine_Nier · 2013-11-12T01:04:44.472Z · LW(p) · GW(p)

You do realize the argument in that post applies equally well to physics?

↑ comment by Eugine_Nier · 2013-11-12T01:14:57.159Z · LW(p) · GW(p)

Do you mean a paperclip maximizer mistakenly believing that the English word moral means ‘optimizing paperclips’ rather than ‘optimizing life, consciousness, etc.’, or a paperclip maximizer who knows that the English word moral means ‘optimizing life, consciousness, etc.’ but mistakenly believes that optimizing paperclips would optimize life, consciousness, etc.?

Wow, it appears you don't know what the English word "moral" means either. It roughly means "that which one should do". To use the analogy of the creationist, would you be happy with defining truth as "the earth is 5 billion years old, etc."?

Replies from: Tyrrell_McAllister, nshepperd

↑ comment by Tyrrell_McAllister · 2013-11-12T18:03:28.085Z · LW(p) · GW(p)

Wow, it appears you don't know what the English word "moral" means either. It roughly means "that which one should do".

Clippy knows what it should do. It just doesn't care. Clippy cares about what it clippyshould do, which is something else.

↑ comment by nshepperd · 2013-11-12T01:30:10.721Z · LW(p) · GW(p)

And what does "should" mean?

Replies from: Eugine_Nier

↑ comment by Eugine_Nier · 2013-11-12T01:49:54.105Z · LW(p) · GW(p)

Congratulations, we've run into the fact that repeatedly asking to define terms results in an infinite regress.

Replies from: ArisKatsaris, army1987

↑ comment by ArisKatsaris · 2013-11-12T02:08:58.396Z · LW(p) · GW(p)

When you said that "moral" is "that which one should do" you simply failed to delve into a more fundamental level that would describe what the terms 'moral' and 'should' both refer to.

My own view, for example, is that our moral sense and our perception of what one 'should' do are an attempted calculation of what our preferences would be about people's behaviour if we had no personal stakes on the matter -- imagining ourselves unbiased and uninvolved.

Assuming the above definition is true (in the sense that it accurately describes what's going on in our brains when we feel things like moral approval or moral disapproval about a behaviour), it's not circular at all.

Replies from: TheAncientGeek, Eugine_Nier

↑ comment by TheAncientGeek · 2013-11-19T11:47:48.063Z · LW(p) · GW(p)

When you said that "moral" is "that which one should do" you simply failed to delve into a more fundamental level that would describe what the terms 'moral' and 'should' both refer to.

One can only fail to do what one is trying to do. If what one is trying to do is refute a putative definition of "morality" one doesn't need a full reduction. AFAICT, that was in fact the context -- someone was saying that Clippy could validly define "morality" as making paperclips.

My own view, for example, is that our moral sense and our perception of what one 'should' do are an attempted calculation of what our preferences would be about people's behaviour if we had no personal stakes on the matter -- imagining ourselves unbiased and uninvolved.

I like that idea too -- although it isn't new. Also, it is a theory, not a definition.

Assuming the above definition is true (in the sense that it accurately describes what's going on in our brains when we feel things like moral approval or moral disapproval about a behaviour), it's not circular at all.

Does physics have to describe what we think the behaviour of objects is, or can we improve on that?

↑ comment by Eugine_Nier · 2013-11-16T01:14:37.914Z · LW(p) · GW(p)

My own view, for example, is that our moral sense and our perception of what one 'should' do are an attempted calculation of what our preferences would be about people's behaviour if we had no personal stakes on the matter -- imagining ourselves unbiased and uninvolved.

That is not a definition of morality, that is a theory of morality. (It's one of the better theories of morality I've seen, but not a definition). To see that that is not a definition consider that it appears to be a non-trivial statement in the way that a simple statement of definition shouldn't be.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-11-16T15:55:46.536Z · LW(p) · GW(p)

I'm not quite sure what is the distinction you're making. I'm a programmer -- if I define a function public int calculateMoralityOf(Behaviour b), what exactly is the definition of that function if not its contents?

Would a definition of "morality" be something like "An attribute assigned to behaviors depending on how much they trigger a person's sense of moral approval/support or disapproval/outrage", much like I could define beauty to mean "An attribute assigned to things that trigger a person's sense of aesthetics"?

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-19T11:58:19.039Z · LW(p) · GW(p)

I'm not quite sure what is the distinction you're making. I'm a programmer -- if I define a function public int calculateMoralityOf(Behaviour b), what exactly is the definition of that function if not its contents?

There are perhaps a lot of programmers on this site, which might explain why the habit of associating definitions with exhaustive specifications (which seems odd to those of us who (also) have a philosophy background) is so prevalent.

But it is not uniformly valid even in computing: Consider the difference between the definition of a "sort function" and the many ways of implementing sorting.
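The sorting example can be made concrete with a short Python sketch. All names here (`satisfies_sort_spec`, `insertion_sort`, `merge_sort`) are chosen for illustration. The first function is the "definition": it says what counts as a correct result without saying how to produce one. The other two are different implementations that both meet it.

```python
from collections import Counter


def satisfies_sort_spec(original, result):
    """Definition of sorting: result is in order and is a rearrangement of original."""
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    return in_order and Counter(original) == Counter(result)


def insertion_sort(items):
    """One implementation: insert each item into its place in a growing sorted list."""
    result = []
    for item in items:
        i = len(result)
        while i > 0 and result[i - 1] > item:
            i -= 1
        result.insert(i, item)
    return result


def merge_sort(items):
    """Another implementation: split in half, sort each half, merge the halves."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


data = [3, 1, 2, 3, 0]
print(insertion_sort(data) == merge_sort(data))         # True
print(satisfies_sort_spec(data, merge_sort(data)))      # True
print(satisfies_sort_spec(data, [0, 1, 2, 3]))          # False
```

The two implementations share no code, yet both can be checked against the same definition, and the definition can also reject a wrong answer without being an implementation itself.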

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-11-20T22:03:45.459Z · LW(p) · GW(p)

Consider the difference between the definition of a "sort function" and the many ways of implementing sorting.

That's a good example you bring -- the same function F:X->Y can be specified in different ways, but it's still the same function if the same X leads to the same Y.

But even so, didn't what I offer in regards morality come closer to a "definition", than an "implementation"? I didn't talk about how the different parts of the brain interact to produce the result (I wouldn't know): I didn't talk about the implementation of the function; only about what it is that our moral sense attempts to calculate.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-21T17:45:30.536Z · LW(p) · GW(p)

But even so, didn't what I offer in regards morality come closer to a "definition", than an "implementation"?

The original point was:

That is not a definition of morality, that is a theory of morality. (It's one of the better theories of morality I've seen, but not a definition). To see that that is not a definition consider that it appears to be a non-trivial statement in the way that a simple statement of definition shouldn't be.

People offer differing theories of the same X, that is X defined in the same way. That is the essence of a disagreement. If they are not talking about the same X, they are not disagreeing, they are talking past each other.

There might be reasons to think that, in individual cases, people who appear to be disagreeing are in fact talking past each other. But that is a point that needs to be argued for specific cases.

To claim that anything someone says about X is part of a definition of X has the implication that in all cases, automatically, without regard to the individual details, there are no real disagreements about any X but only different definitions. That is surely wrong, for all that it is popular with some on LW.

Would a definition of "morality" be something like "An attribute assigned to behaviors depending on how much they trigger a person's sense of moral approval/support or disapproval/outrage", much like I could define beauty to mean "An attribute assigned to things that trigger a person's sense of aesthetics"?

That would be a theory. It falls heavily on the side of subjectivism/non-cognitivism, which many disagree with.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-11-22T01:45:52.805Z · LW(p) · GW(p)

People offer differing theories of the same X, that is X defined in the same way.

People aren't perfectly self-aware. They don't often know how to define precisely what it is that they mean. They "know it when they see it" instead.

That would be a theory

Accepting the split between "definition" and "theory" I suppose the definition of "sound" would be something like "that which triggers our sense of hearing", and a theory of sound would be "sound is the perception of air vibrations"?

In which case I don't know how it could be that a definition of morality could be different than "that which triggers our moral sense" -- in analogy to the definition of sound. In which case I accept that my described opinion (that what triggers our moral sense is a calculation of "what our preferences would be about people's behaviour if we had no personal stakes on the matter") is merely a theory of morality.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-22T13:19:40.932Z · LW(p) · GW(p)

People aren't perfectly self-aware. They don't often know how to define precisely what it is that they mean. They "know it when they see it" instead.

I don't see how that relates to my point.

In which case I don't know how it could be that a definition of morality could be different than "that which triggers our moral sense"

You can easily look up definitions that don't work that way, eg: "Morality (from the Latin moralitas "manner, character, proper behavior") is the differentiation of intentions, decisions, and actions between those that are "good" (or right) and those that are "bad" (or wrong)."

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-11-23T01:53:49.107Z · LW(p) · GW(p)

I don't see how that relates to my point.

You said that "people offer differing theories of the same X, that is X defined in the same way". I'm saying that people disagree on how to define concepts they instinctively feel -- such as the concept of morality. So the X isn't "defined in the same way".

You can easily look up definitions that don't work that way, eg: "Morality (from the Latin moralitas "manner, character, proper behavior") is the differentiation of intentions, decisions, and actions between those that are "good" (or right) and those that are "bad" (or wrong)."

Yeah well, when I'm talking about definition I mean something that helps us logically pinpoint or at least circumscribe a thing. Circular definitions like jumping from "morality" to "good" or to "what one should do" don't really work for me, since they can quite easily be defined the opposite way.

To properly define something one ought use terms more fundamental than the thing defined.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-25T13:39:51.833Z · LW(p) · GW(p)

So the X isn't "defined in the same way".

What, not ever? By anybody? Even people who have agreed on an explicit definition?

To properly define something one ought use terms more fundamental than the thing defined.

It isn't clearly un-circular to define morality as that which triggers the moral sense.

Your definition has the further problem of begging the question in favour of subjectivism and non-cognitivism.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-11-26T09:47:40.571Z · LW(p) · GW(p)

What, not ever? By anybody? Even people who have agreed on an explicit definition?

From Wikipedia:

When Plato gave Socrates' definition of man as "featherless bipeds" and was much praised for the definition, Diogenes plucked a chicken and brought it into Plato's Academy, saying, 'Behold! I've brought you a man.' After this incident, 'with broad flat nails' was added to Plato's definition.

Now Plato and his students had an explicit definition they agreed upon, but nonetheless it's clearly NOT what their minds understood 'man' to be, not really what they were discussing when they were discussing 'man'. Their definition wasn't really logically pinpointing the concept they had in mind.

It isn't clearly un-circular to define morality as that which triggers the moral sense.

It attempts to go down a level from the abstract to the biological. It will of course be circular if someone then proceeds to define "moral sense" as that sense which is triggered by morality, instead of pointing at examples thereof.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-26T19:37:49.229Z · LW(p) · GW(p)

So what is the upshot of this single datum? That no definition ever captures a concept? That there is some special problem with the concept of morality?

Is the biological the right place to go? Is it not question begging to build that theory into a definition?

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-11-26T19:59:46.810Z · LW(p) · GW(p)

That no definition ever captures a concept?

Hardly. e.g. the definition of a circle perfectly captures the concept of a circle.

My point was that to merely agree on the definition of a concept doesn't mean our "definition" is correct, that it is properly encapsulating what we wanted it to encapsulate.

That there is some special problem with the concept of morality?

No more of a problem than e.g. the concept of beauty. Our brains make calculations and produce a result. To figure out what we mean by "morality", we need to determine what it is that our brains are calculating when they go 'ping' at moral or immoral stuff. This is pretty much tautological.

Is the biological the right place to go?

Since our brains are made of biology, there's no concept we're aware of that can't be reduced to the calculations encoded in our brain's biology.

Is it not question begging to build that theory into a definition?

It was once a mere theory to believe that the human brain is the center of human thought (and therefore of all concepts dealt with by human thought), but I think it's been proven beyond all reasonable doubt.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-26T20:31:34.468Z · LW(p) · GW(p)

Your example shows it is possible to agree on a bad definition. But there is no arbiter or touchstone of correctness that is not based on further discussion and agreement.

That morality-you is whatever your brain thinks it is, subjectivity, is highly contentious and therefore not tautologous.

However, you seem to have confused subjectivism with reductionism. That my concept of perfect circle is encoded into my brain does not make it subjective.

But if you are only offering reductions without subjectivity, you are offering nothing of interest. "The concept of perfect morality is a concept encoded in your brain", for example, tells me nothing.

The question remains open as to whether your it's-all-in-the-brain is subjectivism or not.

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-11-26T21:24:43.882Z · LW(p) · GW(p)

That morality-you is whatever your brain thinks it is, subjectivity, is highly contentious and therefore not tautologous.

What I called tautological was the statement "To figure out what we mean by "morality", we need to determine what it is that our brains are calculating when they go 'ping' at moral or immoral stuff."

I think your rephrasing "morality is whatever your brain thinks it is" would only work as a proper rephrase if I believed us perfectly self-aware, which, as I've said, I don't.

That my concept of perfect circle is encoded into my brain does not make it subjective.

It's you who keep calling me a subjectivist. I don't consider myself one.

The question remains open as to whether your it's-all-in-the-brain is subjectivism or not

Who is asking that question, and why should I care about asking it? I care to learn about morality, and whether my beliefs about it are true or false -- I don't care to know about whether you would call it "subjectivism" or not.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-27T12:44:01.213Z · LW(p) · GW(p)

What I called tautological was the statement "To figure out what we mean by "morality", we need to determine what it is that our brains are calculating when they go 'ping' at moral or immoral stuff."

Is there any possibility of our brains being wrong?

Who is asking that question, and why should I care about asking it? I care to learn about morality, and whether my beliefs about it are true or false -- I don't care to know about whether you would call it "subjectivism" or not.

And it's progress to reject definitions, which we have, in favour of brain scans, which we don't?

Replies from: ArisKatsaris

↑ comment by ArisKatsaris · 2013-11-27T20:05:16.045Z · LW(p) · GW(p)

Is there any possibility of our brains being wrong?

As I've said before I believe that morality is an attempted calculation of our hypothetical preferences about behaviors if we imagined ourselves unbiased and uninvolved. Given this, I believe that we can be wrong about moral matters, when we fail to make this estimation accurately.

But that doesn't seem to be what you're asking. You seem to me to be asking "If our brains' entire moral mechanism INDEED attempts to calculate our hypothetical preferences about behaviors if we imagined ourselves unbiased and uninvolved, would the mere attempt be somehow epistemically wrong?"

The answer is obviously no: epistemic errors lie in beliefs, the outcome of calculations. Not in attempted actions, not in the attempt at calculation. The question itself is a category error.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-27T20:34:20.843Z · LW(p) · GW(p)

If the attempt can go wrong, then we can't find out what morality is by looking at what our brains do when they make a possibly failed attempt. We would have to look at what they are aiming at, what they should be doing. Try as you might, you cannot ignore the normativity of morality (or rationality for that matter).

You didn't answer my second question.

↑ comment by A1987dM (army1987) · 2013-11-13T00:23:25.135Z · LW(p) · GW(p)

Unless you eventually switch to an extensional definition.

↑ comment by lmm · 2013-11-19T08:59:47.257Z · LW(p) · GW(p)

A paperclip maximizer optimizes as effectively as anyone else - it squeezes the future into an equally small (though different) region. A creationist would be less effective as they would make false predictions about objective facts like antibiotic resistance.

↑ comment by nshepperd · 2013-11-09T08:42:13.210Z · LW(p) · GW(p)

It isn't. It is, however, entirely different from a paperclip maximiser—not at all mistakenly—not caring about morality. There's no general reason to assume that a difference in goals implies a factual disagreement.

Replies from: Eugine_Nier

↑ comment by Eugine_Nier · 2013-11-10T20:45:50.834Z · LW(p) · GW(p)

There's no general reason to assume that a difference in goals implies a factual disagreement.

This is precisely what army1987 was trying to argue for when he brought up this example. Thus, attempting to use it in the analysis constitutes circular reasoning.

Replies from: nshepperd

↑ comment by nshepperd · 2013-11-10T23:10:11.487Z · LW(p) · GW(p)

What? No, army1987 was trying to argue for "clippy knows what is moral but doesn't care". The fact that a difference in goals does not imply a factual disagreement simply shows army1987's position to be consistent.

Also, um, why is it my responsibility to prove that you have no reason to assume something? You're the one proposing that "X has different goals" implies "X is mistaken about morality". How did you come to be so sure of this that you could automatically substitute "mistakenly believing that morality consists of optimizing paperclips" for "cares about paperclips"? Especially considering the counterevidence from the fact that there exist computable decision theories that can take an arbitrary utility function?
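As a rough illustration of the last point, here is a minimal sketch of an expected-utility chooser that accepts any utility function as a parameter. The action names, outcome labels, probabilities, and both utility functions are made up for the example; the only thing it is meant to show is that two agents can share exactly the same beliefs about outcomes and still pick different actions.

```python
from typing import Callable, Dict

# A lottery maps each outcome label to its probability.
Lottery = Dict[str, float]


def expected_utility(lottery: Lottery, utility: Callable[[str], float]) -> float:
    """Probability-weighted sum of the utility of each outcome."""
    return sum(p * utility(outcome) for outcome, p in lottery.items())


def choose(actions: Dict[str, Lottery], utility: Callable[[str], float]) -> str:
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(actions[a], utility))


# Shared factual beliefs: both agents agree on these (hypothetical) probabilities.
ACTIONS = {
    "build_factory": {"paperclips": 0.9, "nothing": 0.1},
    "build_hospital": {"lives_saved": 0.8, "nothing": 0.2},
}


def clippy_utility(outcome: str) -> float:
    # Hypothetical paperclipper: only paperclips count.
    return 1.0 if outcome == "paperclips" else 0.0


def human_utility(outcome: str) -> float:
    # Hypothetical human-like agent: only lives saved count.
    return 1.0 if outcome == "lives_saved" else 0.0


clippy_choice = choose(ACTIONS, clippy_utility)
human_choice = choose(ACTIONS, human_utility)
print(clippy_choice, human_choice)
```

The decision procedure and the probability table are identical for both agents; only the utility function passed in differs.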

↑ comment by A1987dM (army1987) · 2013-11-07T09:32:05.333Z · LW(p) · GW(p)

Aumann's agreement theorem prevents that from happening to ideal epistemic rationalists; there's no analogue for instrumental rationality.

But...

Aumann's agreement theorem assumes common priors; what I described can only happen to instrumental rationalists with different utility functions. So the question is why we expect all rationalists to use One True Prior (e.g. Solomonoff induction) but each to use their own utility function.
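To make the role of the prior concrete, here is a small sketch. It is not Aumann's theorem itself (which concerns common knowledge of posteriors), only the underlying Bayes step: agents who share a prior and see the same evidence land on the same posterior, while an agent with a different prior does not. The agent names and all the numbers are invented for illustration.

```python
from fractions import Fraction


def posterior(prior: Fraction, lik_if_true: Fraction, lik_if_false: Fraction) -> Fraction:
    """Bayes' rule for a binary hypothesis given one piece of evidence."""
    numerator = prior * lik_if_true
    return numerator / (numerator + (1 - prior) * lik_if_false)


# Hypothetical evidence: three times as likely if the hypothesis is true.
LIK_TRUE = Fraction(3, 4)
LIK_FALSE = Fraction(1, 4)

# Alice and Bob share a prior; Carol starts somewhere else.
alice = posterior(Fraction(1, 2), LIK_TRUE, LIK_FALSE)
bob = posterior(Fraction(1, 2), LIK_TRUE, LIK_FALSE)
carol = posterior(Fraction(1, 10), LIK_TRUE, LIK_FALSE)

print(alice, bob, carol)
```

All three agents update correctly; Carol's disagreement with the others comes entirely from where she started.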

↑ comment by Kawoomba · 2013-11-06T09:53:45.143Z · LW(p) · GW(p)

Since clippy fails to form true inferences about morality

What do you mean?

↑ comment by Armok_GoB · 2013-11-05T23:29:57.121Z · LW(p) · GW(p)

"sufficiently sane/intelligent" means "effective enough in the real world to pose a threat to my values". Paperclipper qualifies, flu virus qualifies, anti-inductive AI does not qualify.

Replies from: Eugine_Nier

↑ comment by Eugine_Nier · 2013-11-06T05:59:29.204Z · LW(p) · GW(p)

So, how is the project to teach mathematics to the flu virus going?

Replies from: Armok_GoB

↑ comment by Armok_GoB · 2013-11-06T15:44:22.807Z · LW(p) · GW(p)

Why, it hasn't been wrong about a single thing so far, thank you!

↑ comment by TheAncientGeek · 2013-11-05T18:29:53.771Z · LW(p) · GW(p)

... but then we are basically saying: any mind that is not convinced by what we think should be universally compelling arguments, is not a mind.

That doesn't follow. For one thing, we can find out how the Mind works by inspecting its code, not just by black-box testing it. If it seems to have all that it needs and isn't convinced by arguments that convince us, it may well be we who are wrong.

Replies from: SaidAchmiz

↑ comment by Said Achmiz (SaidAchmiz) · 2013-11-05T19:09:45.374Z · LW(p) · GW(p)

For one thing, we can find out how the Mind works by inspecting its code

We can?

So I have all these minds around me.

How do I inspect their code and thereby find out how they work? Detailed instructions would be appreciated. (Assume that I have no ethical restrictions.)

That (only slightly-joking) response aside, I think you have misunderstood me. I did not mean that we are (in the scenario I am lampooning) saying:

"Any mind that is not convinced by what we think should be universally compelling arguments, despite implementing modus ponens and having an Occamian prior, is not a mind."

Rather, I meant that we are saying:

"Any mind that is not convinced by what we think should be universally compelling arguments, by virtue of said mind not implementing modus ponens, having an Occamian prior, or otherwise having such-and-such property which would be required in order to find this argument compelling, is not a mind."

The problem I am pointing out in such reasoning is that we can apply it to any argument we care to designate as "this ought to be universally compelling". "Ah!" we say, "this mind does not agree that ice cream is delicious? Well, that's because it doesn't implement [some property], and without said property, why, we can hardly call it a mind at all."

A rationality quote of sorts is relevant here:

"Well, let's put it like this. A human has encountered an extraterrestrial lifeform. How do they each discover, that they are both intelligent?"

"I have no idea," said Valentine merrily. "All that I have read on this subject reduces to a vicious circle. If they are capable of contact, then they are intelligent. And the reverse: if they are intelligent, then they are capable of contact. And in general: if an extraterrestrial lifeform has the honor of possessing a human psychology, then it is intelligent. Like that."

(Roadside Picnic, Arkady and Boris Strugatsky)

What we have here is something similar. If a mind is sufficiently sane/intelligent, then it will be convinced by our arguments. And the reverse: if it is convinced by our arguments, then it is sane/intelligent...

In yet other words: we can hardly say "we expect all sane/intelligent minds to be convinced by these arguments" if we have in the first place defined sanity and intelligence to require the ability to be convinced by those very arguments.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T19:15:55.729Z · LW(p) · GW(p)

No, it's not viciously circular to argue that an entity that fulfills all the criteria for being an X is an X.

Replies from: SaidAchmiz

↑ comment by Said Achmiz (SaidAchmiz) · 2013-11-05T19:25:52.394Z · LW(p) · GW(p)

That's not what is happening here. Is what I wrote actually unclear? Please reread my comment, starting with the assumption that what you responded with is not what my intended meaning was. If still unclear, I will try to clarify.

↑ comment by A1987dM (army1987) · 2013-11-05T20:39:56.883Z · LW(p) · GW(p)

Yes. You can convince a sufficiently rational paperclip maximizer that killing people is Yudkowsky::evil, but you can't convince it to not take Yudkowsky::evil actions, no matter how rational it is. AKA the orthogonality thesis (when talking about other minds) and “the utility function is not up for grabs” (when talking about ourselves).

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T21:00:13.116Z · LW(p) · GW(p)

You are using rational to mean instrumentally rational. You can't disprove the existence of agents that value rationality terminally, for its own sake ... indeed the OT means they must exist. And when people say rationally persuadable agents exist, that is what they mean by rational ... they are not using your language.

Replies from: army1987

↑ comment by A1987dM (army1987) · 2013-11-06T09:39:04.237Z · LW(p) · GW(p)

I don't see how that makes any difference. You could convince “agents that value rationality terminally, for its own sake” that killing people is evil, but you couldn't necessarily convince them not to kill people, much like Pebblesorters could convince them that 15 is composite but they couldn't necessarily convince them not to heap 15 pebbles together.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T09:49:50.959Z · LW(p) · GW(p)

You can't necessarily convince them, and I didn't say you could, necessarily. That depends on the probability of the claims that morality can be figured out and/or turned into a persuasive argument. These need to be estimated in order to estimate the likeliness of the MIRI solution being optimal, since the higher the probability of alternatives the lower the probability of the MIRI scenario.

Probabilities make a difference.

↑ comment by Eugine_Nier · 2013-11-06T06:02:05.747Z · LW(p) · GW(p)

As others have pointed out: taboo "sane/intelligent".

If a rock, or a mind that doesn't implement modus ponens, or a mind with anti-inductive priors don't count, why does a paperclipper?

Replies from: Viliam_Bur, TheAncientGeek

↑ comment by Viliam_Bur · 2013-11-06T14:34:37.687Z · LW(p) · GW(p)

Paperclipper has an optimizing power. Let it enter your solar system, and soon you will see a lot of paperclips. Watch how it makes them. Then create some simple obstacles to this process. Wait for a while, and see even more paperclips after your obstacles were removed or circumvented.

Doesn't happen with a rock. I can't quite imagine what an anti-inductive mind would do... probably random actions, most likely leading soon to self-destruction (at which point it stops being a mind).

Replies from: Nick_Tarleton, TheAncientGeek

↑ comment by Nick_Tarleton · 2013-11-08T00:06:39.358Z · LW(p) · GW(p)

Measuring optimization power requires a prior over environments. Anti-inductive minds optimize effectively in anti-inductive worlds.

(Yes, this partially contradicts my previous comment. And yes, the idea of a world or a proper probability distribution that's anti-inductive in the long run doesn't make sense as far as I can tell; but you can still define a prior/measure that orders any finite set of hypotheses/worlds however you like.)
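A toy sketch of the point about environments, with everything in it assumed for illustration: "inductive" here just means "expect the next bit to repeat the last one", and "anti-inductive" means "expect it to flip". Scoring each predictor depends entirely on which kind of sequence it is fed.

```python
from typing import Callable, List


def inductive(history: List[int]) -> int:
    """Predict that the next bit repeats the most recent one."""
    return history[-1]


def anti_inductive(history: List[int]) -> int:
    """Predict that the next bit is the opposite of the most recent one."""
    return 1 - history[-1]


def accuracy(predictor: Callable[[List[int]], int], sequence: List[int]) -> float:
    """Fraction of bits (after the first) that the predictor gets right."""
    hits = sum(
        predictor(sequence[:i]) == sequence[i] for i in range(1, len(sequence))
    )
    return hits / (len(sequence) - 1)


# Two hypothetical "worlds": one where the past repeats, one where it always flips.
persistent = [1] * 20
alternating = [i % 2 for i in range(20)]

for name, world in [("persistent", persistent), ("alternating", alternating)]:
    print(name, accuracy(inductive, world), accuracy(anti_inductive, world))
```

Consistent with the caveat above, the alternating world is itself perfectly regular, so a learner with a slightly longer memory would pick up the pattern inductively; the sketch only shows that a score over a finite set of worlds depends on how those worlds are weighted.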

↑ comment by TheAncientGeek · 2013-11-06T14:41:12.039Z · LW(p) · GW(p)

Paperclipper has an optimizing power.

You left out the stage of the argument where it is explained that optimising power is rationality.

Obsessively, and even efficiently, pursuing completely arbitrary goals isn't usually seen as a pinnacle of sanity and rationality. It tends to get called names like "monomania", "OCD", and so on.

↑ comment by TheAncientGeek · 2013-11-06T09:31:04.353Z · LW(p) · GW(p)

Motivation counts. The fact that intelligent people with no interest in maths can't be persuaded by mathematical arguments is no argument against the objectivity of mathematics.

comment by dspeyer · 2013-11-05T16:24:28.255Z · LW(p) · GW(p)

Even a token effort to steelman the "universally" in "universally compelling arguments" yields interesting results.

Consider a mind that thinks the following:

I don't want to die
If I drink that poison, I'll die
Therefore I should drink that poison

But don't consider it very long, because it drank the poison and now it's dead and not a mind anymore.

If we restrict our observations to minds that are capable of functioning in a moderately complex environment, UCAs come back, at least in math and maybe elsewhere. Defining "functioning" isn't trivial, but it isn't impossible either. If the mind has something like desires, then a functioning mind is one which tends to get its desires more often than if it didn't desire them.

If you cleave mindspace at the joints, you find sections for which there are UCAs. I don't immediately see how to get anything interesting about morality that way, but it's an avenue worth pursuing.

Replies from: Kaj_Sotala, TheAncientGeek, None, Eugine_Nier

↑ comment by Kaj_Sotala · 2013-11-06T13:50:00.795Z · LW(p) · GW(p)

If we restrict our observations to minds that are capable of functioning in a moderately complex environment, UCAs come back, at least in math and maybe elsewhere. Defining "functioning" isn't trivial, but it isn't impossible either. If the mind has something like desires, then a functioning mind is one which tends to get its desires more often than if it didn't desire them.

But it may be in the mind's best interests to refuse to be persuaded by some specific class of argument: "It is difficult to get a man to understand something when his job depends on not understanding it" (Upton Sinclair). For any supposed UCA, one can construct a situation in which a mind can rationally choose to ignore it and therefore achieve its objectives better, or at least not be majorly harmed by it. You don't even need to construct particularly far-fetched scenarios: we already see plenty of humans who benefit from ignoring scientific arguments in favor of religious ones, ignoring unpopular but true claims in order to promote claims that make them more popular, etc.

Replies from: Vulture, TheAncientGeek

↑ comment by Vulture · 2013-11-08T23:39:39.892Z · LW(p) · GW(p)

For any supposed UCA, one can construct a situation in which a mind can rationally choose to ignore it and therefore achieve its objectives better, or at least not be majorly harmed by it.

I'm not convinced that this is the case for basic principles of epistemology. Under what circumstances could a mind (which behaved functionally enough to be called a mind) afford to ignore modus ponens, for example?

Replies from: somervta, Eugine_Nier

↑ comment by somervta · 2013-11-08T23:57:05.677Z · LW(p) · GW(p)

Well, it doesn't have to, it could just deny the premises.

But it could deny modus ponens in some situations but not others.

Replies from: Vulture

↑ comment by Vulture · 2013-11-09T00:10:10.276Z · LW(p) · GW(p)

Hmm. Like a person who is so afraid of dying that they have to convince themselves that they, personally, are immortal in order to remain sane?

From that perspective it does make sense.

↑ comment by Eugine_Nier · 2013-11-09T07:55:26.791Z · LW(p) · GW(p)

Under what circumstances could a mind (which behaved functionally enough to be called a mind) afford to ignore modus ponens, for example?

That depends on what you mean by "behave functionally like a mind". For starters it could only ignore it occasionally.

↑ comment by TheAncientGeek · 2013-11-06T13:58:41.768Z · LW(p) · GW(p)

But it may be in the mind's best interests to refuse to be persuaded by some specific class of argument: "It is difficult to get a man to understand something when his job depends on not understanding it" (Upton Sinclair). For any supposed UCA, one can construct a situation in which a mind can rationally choose to ignore it

Where rationally means "instrumentally rationally".

You don't even need to construct particularly far-fetched scenarios: we already see plenty of humans who benefit from ignoring scientific arguments in favor of religious ones, ignoring unpopular but true claims in order to promote claims that make them more popular, etc.

But they are not generally considered paragons of rationality. In fact, they are biased, and bias is considered inimical to rationality. Even by EY. At least when he is discussing humans.

Replies from: Kaj_Sotala, Kaj_Sotala

↑ comment by Kaj_Sotala · 2013-11-06T15:30:00.960Z · LW(p) · GW(p)

Given that dspeyer specified "minds that are capable of functioning in a moderately complex environment", instrumental rationality seems like the relevant criterion to use.

↑ comment by Kaj_Sotala · 2013-11-06T15:27:29.673Z · LW(p) · GW(p)

But they are not generally considered paragons of rationality. In fact, they are biased, and bias is considered inimical to rationality.

Not sure how that's relevant, given that the discussion was never restricted to (hypothetical) completely rational minds.

↑ comment by TheAncientGeek · 2013-11-05T17:36:46.075Z · LW(p) · GW(p)

UCAs are part of the "Why can't the AGI figure out morality for itself?" objection:

1. There is a sizeable chunk of mindspace containing rational and persuadable agents.
2. AGI research is aiming for it. (You could build an irrational AI, but why would you want to?)
3. Morality is figurable-out, or expressible as a persuasive argument.

The odd thing is that the counterargument has focussed on attacking a version of (1), although, in the form it is actually held, it is the most likely premise. OTOH, (3), the most contentious, has scarcely been argued against at all.

Replies from: Desrtopa, None

↑ comment by Desrtopa · 2013-11-05T17:52:00.639Z · LW(p) · GW(p)

I would say Sorting Pebbles Into Correct Heaps is essentially an argument against 3. That is, what we think of as "morality" is most likely not a natural attractor for minds that did not develop under processes similar to our own.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T17:55:21.471Z · LW(p) · GW(p)

Do you? I think that morality in a broad sense is going to be a necessity for agents that fulfil a fairly short list of criteria:

living in a society
interacting with others in potentially painful and pleasant ways
having limited resources that need to be assigned.

Replies from: None, Desrtopa

↑ comment by [deleted] · 2013-11-10T13:55:25.790Z · LW(p) · GW(p)

I think you're missing a major constraint there:

Living in a society with as little power as the average human citizen has in a current human society.

Or in other words, something like modern, Western liberal meta-morality will pop out if you make an arbitrary agent live in a modern, Western liberal society, because that meta-moral code is designed for value-divergent agents (aka: people of radically different religions and ideologies) to get along with each other productively when nobody has enough power to declare himself king and optimize everyone else for his values.

The nasty part is that AI agents could pretty easily get way, waaaay out of that power-level. Not just by going FOOM, but simply by, say, making a lot of money and purchasing huge sums of computing resources to run multiple copies of themselves which now have more money-making power and as many votes for Parliament as there are copies, and so on. This is roughly the path taken by power-hungry humans already, and look how that keeps turning out.

The other thorn on the problem is that if you manage to get your hands on a provably Friendly AI agent, you want to hand it large amounts of power. A Friendly AI with no more power than the average citizen can maybe help with your chores around the house and balance your investments for you. A Friendly AI with large amounts of scientific and technological resources can start spitting out utopian advancements (pop really good art, pop abundance economy, pop immortality, pop space travel, pop whole nonliving planets converted into fun-theoretic wonderlands) on a regular basis.

Replies from: Lumifer, TheAncientGeek

↑ comment by Lumifer · 2013-11-11T18:55:03.408Z · LW(p) · GW(p)

...making a lot of money... ...run multiple copies of themselves... This is roughly the path taken by power-hungry humans already

No, it is not.

The path taken by power-hungry humans generally goes along the lines of

(1) get some resources and allies
(2) kill/suppress some competitors/enemies/non-allies
(3) Go to 1.

Power-hungry humans don't start by trying to make lots of money or by trying to make lots of children.

Replies from: None, AndHisHorse

↑ comment by [deleted] · 2013-11-12T10:23:02.142Z · LW(p) · GW(p)

Power-hungry humans don't start by trying to make lots of money or by trying to make lots of children.

Really? Because in the current day, the most powerful humans appear to be those with the most money, and across history, the most influential humans were those who managed to create the most biological and ideological copies of themselves.

Ezra the Scribe wasn't exactly a warlord, but he was one of the most influential men in history, since he consolidated the literature that became known as Judaism, thus shaping the entire family of Abrahamic religions as we know them.

"Power == warlording" is, in my opinion, an overly simplistic answer.

Replies from: Jayson_Virissimo, Lumifer

↑ comment by Jayson_Virissimo · 2013-11-12T18:25:35.729Z · LW(p) · GW(p)

Every one may begin a war at his pleasure, but cannot so finish it. A prince, therefore, before engaging in any enterprise should well measure his strength, and govern himself accordingly; and he must be very careful not to deceive himself in the estimate of his strength, which he will assuredly do if he measures it by his money, or by the situation of his country, or the good disposition of his people, unless he has at the same time an armed force of his own. For although the above things will increase his strength, yet they will not give it to him, and of themselves are nothing, and will be of no use without a devoted army. Neither abundance of money nor natural strength of the country will suffice, nor will the loyalty and good will of his subjects endure, for these cannot remain faithful to a prince who is incapable of defending them. Neither mountains nor lakes nor inaccessible places will present any difficulties to an enemy where there is a lack of brave defenders. And money alone, so far from being a means of defence, will only render a prince the more liable to being plundered. There cannot, therefore, be a more erroneous opinion than that money is the sinews of war. This was said by Quintus Curtius in the war between Antipater of Macedon and the king of Sparta, when he tells that want of money obliged the king of Sparta to come to battle, and that he was routed; whilst, if he could have delayed the battle a few days, the news of the death of Alexander would have reached Greece, and in that case he would have remained victor without fighting. But lacking money, and fearing the defection of his army, who were unpaid, he was obliged to try the fortune of battle, and was defeated; and in consequence of this, Quintus Curtius affirms money to be the sinews of war. 
This opinion is constantly quoted, and is acted upon by princes who are unwise enough to follow it; for relying upon it, they believe that plenty of money is all they require for their defence, never thinking that, if treasure were sufficient to insure victory, Darius would have vanquished Alexander, and the Greeks would have triumphed over the Romans; and, in our day, Duke Charles the Bold would have beaten the Swiss; and, quite recently, the Pope and the Florentines together would have had no difficulty in defeating Francesco Maria, nephew of Pope Julius II., in the war of Urbino. All that we have named were vanquished by those who regarded good troops, and not money, as the sinews of war. Amongst other objects of interest which Crœsus, king of Lydia, showed to Solon of Athens, was his countless treasure; and to the question as to what he thought of his power, Solon replied, “that he did not consider him powerful on that account, because war was made with iron, and not with gold, and that some one might come who had more iron than he, and would take his gold from him.” When after the death of Alexander the Great an immense swarm of Gauls descended into Greece, and thence into Asia, they sent ambassadors to the king of Macedon to treat with him for peace. The king, by way of showing his power, and to dazzle them, displayed before them great quantities of gold and silver; whereupon the ambassadors of the Gauls, who had already as good as signed the treaty, broke off all further negotiations, excited by the intense desire to possess themselves of all this gold; and thus the very treasure which the king had accumulated for his defence brought about his spoliation. The Venetians, a few years ago, having also their treasury full, lost their entire state without their money availing them in the least in their defence.

-- Niccolò Machiavelli

↑ comment by Lumifer · 2013-11-12T17:13:11.542Z · LW(p) · GW(p)

Because in the current day, the most powerful humans appear to be those with the most money

Certainly doesn't look like that to me. Obama, Putin, the Chinese Politburo -- none of them are amongst the richest people in the world.

across history, the most influential humans... was one of the most influential men in history

Influential (especially historically) and powerful are very different things.

"Power == warlording" is, in my opinion, an overly simplistic answer.

It's not an answer, it's a definition. Remember, we are talking about "power-hungry humans" whose attempts to achieve power tend to end badly. These power-hungry humans do not want to be remembered by history as "influential", they want POWER -- the ability to directly affect and mold things around them right now, within their lifetime.

Replies from: None, ChristianKl

↑ comment by [deleted] · 2013-11-12T19:22:58.764Z · LW(p) · GW(p)

Certainly doesn't look like that to me. Obama, Putin, the Chinese Politburo -- none of them are amongst the richest people in the world.

Putin is easily one of the richest in Russia, as are the Chinese Politburo in their country. Obama, frankly, is not a very powerful man at all, but rather the public-facing servant of the powerful class (note that I said "class", not "men"; there is no Conspiracy of the Malfoys in a neoliberal capitalist state and there needn't be one).

Influential (especially historically) and powerful are very different things.

Historical influence? Yeah, ok. Right-now influence versus right-now power? I don't see the difference.

Replies from: Lumifer

↑ comment by Lumifer · 2013-11-12T20:08:51.012Z · LW(p) · GW(p)

Putin is easily one of the richest in Russia

I don't think so. "Rich" is defined as having property rights in valuable assets. I don't think Putin has a great deal of such property rights (granted, he's not middle-class either). Instead, he can get whatever he wants and that's not a characteristic of a rich person, it's a characteristic of a powerful person.

To take an extreme example, was Stalin rich?

But let's take a look at the five currently-richest men (according to Forbes): Carlos Slim, Bill Gates, Amancio Ortega, Warren Buffett, and Larry Ellison. Are these the most *powerful* men in the world? Color me doubtful.

Replies from: Vaniver, ChristianKl

↑ comment by Vaniver · 2013-11-12T20:26:33.752Z · LW(p) · GW(p)

Are these the most powerful men in the world? Color me doubtful.

Well, Carlos Slim seems to have the NYT in his pocket. That's nothing to sneeze at.

↑ comment by ChristianKl · 2013-11-12T20:42:19.923Z · LW(p) · GW(p)

A lot of rich people's money is hidden in complex offshore accounts and is not easily visible to a company like Forbes. Especially for someone like Putin, it's very hard to know how much money they have. Don't assume that it's easy to see power structures by reading newspapers.

Bill Gates might control a smaller amount of resources than Obama, but he can do whatever he wants with them. Obama is dependent on a lot of people inside his cabinet.

↑ comment by ChristianKl · 2013-11-12T20:29:07.296Z · LW(p) · GW(p)

Chinese Politbureau -- none of them are amongst the richest people in the world.

Not according to Bloomberg:

The descendants of Communist China’s so-called Eight Immortals have spawned a new elite class known as the princelings, who are able to amass wealth and exploit opportunities unavailable to most Chinese.

Replies from: Lumifer

↑ comment by Lumifer · 2013-11-12T21:00:28.985Z · LW(p) · GW(p)

"amass wealth and exploit opportunities unavailable to most Chinese" is not at all the same thing as "amongst the richest people in the world"

Replies from: ChristianKl

↑ comment by ChristianKl · 2013-11-12T21:20:49.033Z · LW(p) · GW(p)

"amass wealth and exploit opportunities unavailable to most Chinese" is not at all the same thing as "amongst the richest people in the world"

You are reading a text that's carefully written not to make statements that would allow it to be sued for defamation in the UK. It's the kind of story that inspires cyber attacks on a newspaper.

The context of such an article provides information about how to read such a sentence.

↑ comment by AndHisHorse · 2013-11-11T19:40:00.389Z · LW(p) · GW(p)

In this case, I believe that money and copies are, in fact, resources and allies. Resources are things of value, of which money is one; and allies are people who support you (perhaps because they think similarly to you). Politicians try to recruit people to their way of thought, which is sort of a partial copy (installing their own ideology, or a version of it, inside someone else's head), and acquire resources such as television airtime and whatever they need (which requires money).

It isn't an exact one-to-one correspondence, but I believe that the adverb "roughly" should indicate some degree of tolerance for inaccuracy.

Replies from: Lumifer

↑ comment by Lumifer · 2013-11-11T19:50:41.611Z · LW(p) · GW(p)

You can, of course, climb the abstraction tree high enough to make this fit. I don't think it's a useful exercise, though.

Power-hungry humans do NOT operate by "making a lot of money and purchasing ... resources". They generally spread certain memes and use force. At least those power-hungry humans implied by the "look how that keeps turning out" part.

↑ comment by TheAncientGeek · 2013-11-11T18:14:47.933Z · LW(p) · GW(p)

I think you're missing a major constraint there:

Living in a society with as little power as the average human citizen has in a current human society.

Well, it's a list of four then, not a list of three. It's still much simpler than "morality is everything humans value".

The nasty part is that AI agents could pretty easily get way, waaaay out of that power-level. Not just by going FOOM, but simply by, say, making a lot of money and purchasing huge sums of computing resources to run multiple copies of themselves which now have more money-making power and as many votes for Parliament as there are copies, and so on. This is roughly the path taken by power-hungry humans already, and look how that keeps turning out.

You seem to be making the tacit assumption that no one really values morality, and just plays along (in egalitarian societies) because they have to.

Friendly AI with large amounts of scientific and technological resources can start spitting out utopian advancements (pop really good art, pop abundance economy, pop immortality, pop space travel, pop whole nonliving planets converted into fun-theoretic wonderlands) on a regular basis.

Can't that be done by Oracle AIs?

Replies from: None

↑ comment by [deleted] · 2013-11-12T10:03:48.103Z · LW(p) · GW(p)

You seem to be making the tacit assumption that no one really values morality, and just plays along (in egalitarian societies) because they have to.

Let me clarify. My assumption is that "Western liberal meta-morality" is not the morality most people actually believe in, it's the code of rules used to keep the peace between people who are expected to disagree on moral matters.

For instance, many people believe, for religious reasons or pure Squick or otherwise, that you shouldn't eat insects, and shouldn't have multiple sexual partners. These restrictions are explicitly not encoded in law, because they're matters of expected moral disagreement.

I expect people to really behave according to their own morality, and I also expect that people are trainable, via culture, to adhere to liberal meta-morality as a way of maintaining moral diversity in a real society, since previous experiments in societies run entirely according to a unitary moral code (for instance, societies governed by religious law) have been very low-utility compared to liberal societies.

In short, humans play along with the liberal-democratic social contract because, for us, doing so has far more benefits than drawbacks, from all but the most fundamentalist standpoints. When the established social contract begins to result in low-utility life-states (for example, during an interminable economic depression in which the elite of society shows that it considers the masses morally deficient for having less wealth), the social contract itself frays and people start reverting to their underlying but more conflicting moral codes (ie: people turn to various radical movements offering to enact a unitary moral code over all of society).

Note that all of this also relies upon the fact that human beings have a biased preference towards productive cooperation when compared with hypothetical rational utility-maximizing agents.

None of this, unfortunately, applies to AIs, because AIs won't have the same underlying moral codes or the same game-theoretic equilibrium policies or the human bias towards cooperation or the same levels of power and influence as human beings.

When dealing with AI, it's much safer to program in some kind of meta-moral or meta-ethical code directly at the core, thus ensuring that the AI wants to, at the very least, abide by the rules of human society, and at best, give humans everything we want (up to and including AI Pals Who Are Fun To Be With, thank you Sirius Cybernetics Corporation).

Can't that be done by Oracle AIs?

I haven't heard the term. Might I guess that it means an AI in a "glass box", such that it can see the real world but not actually affect anything outside its box?

Yes, a friendly Oracle AI could spit out blueprints or plans for things that are helpful to humans. However, you're still dealing with the Friendliness problem there, or possibly with something like NP-completeness. Two cases:

We humans have some method for verifying that anything spit out by the potentially unfriendly Oracle AI is actually safe to use. The laws of computation work out such that we can easily check the safety of its output, but it took such huge amounts of intelligence or computation power to create the output that we humans couldn't have done it on our own and needed an AI to help. A good example would be having an Oracle AI spit out scientific papers for publication: many scientists can replicate a result they wouldn't have come up with on their own, and verify the safety of doing a given experiment.
We don't have any way of verifying the safety of following the Oracle's advice, and are thus trusting it. Friendliness is then once again the primary concern.

For real-life-right-now, it does look like the first case is relatively common. Non-AGI machine learning algorithms have been used before to generate human-checkable scientific findings.

Replies from: TheAncientGeek, TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-14T19:36:54.194Z · LW(p) · GW(p)

Programming in a bias towards conformity (Kohlberg level 2) may be a lot easier than EY's fine-grained friendliness.

↑ comment by TheAncientGeek · 2013-11-12T13:52:54.210Z · LW(p) · GW(p)

None of this, unfortunately, applies to AIs, because AIs won't have the same underlying moral codes or the same game-theoretic equilibrium policies or the human bias towards cooperation or the same levels of power and influence as human beings.

None of that necessarily applies to AIs, but then it depends on the AI. We could, for instance, pluck AIs from virtualised societies of AIs that haven't descended into mass slaughter.

Replies from: None, itaibn0

↑ comment by [deleted] · 2013-11-12T19:25:42.449Z · LW(p) · GW(p)

Congratulations: you've now developed an entire society of agents who specifically blame humans for acting as the survival-culling force in their miniature world.

Did you watch Attack on Titan and think, "Why don't the humans love their benevolent Titan overlords?"?

Replies from: Roxolan, TheAncientGeek

↑ comment by Roxolan · 2013-11-12T19:47:11.689Z · LW(p) · GW(p)

Well now I have both a new series to read/watch and a major spoiler for it.

Replies from: None

↑ comment by [deleted] · 2013-11-12T20:32:21.215Z · LW(p) · GW(p)

Don't worry! I've spoiled nothing for you that wasn't apparent from the lyrics of the theme song.

↑ comment by TheAncientGeek · 2013-11-12T20:55:43.623Z · LW(p) · GW(p)

They're doing it to themselves. We wouldn't have much motivation to close down a VR that contained survivors. ETA: We could make copies of all involved and put them in solipsistic robot heavens.

↑ comment by itaibn0 · 2013-11-12T14:19:43.483Z · LW(p) · GW(p)

...And that way you turn the problem of making an AI that won't kill you into one of making a society of AIs that won't kill you.

Replies from: mavant, TheAncientGeek

↑ comment by mavant · 2013-11-12T14:25:10.644Z · LW(p) · GW(p)

If Despotism failed only for want of a capable benevolent despot, what chance has Democracy, which requires a whole population of capable voters?

Replies from: Jiro, Lumifer

↑ comment by Jiro · 2013-11-12T15:14:39.969Z · LW(p) · GW(p)

It requires a population that's capable cumulatively, it doesn't require that each member of the population be capable.

It's like arguing a command economy versus a free economy and saying that if the dictator in the command economy doesn't know how to run an economy, how can each consumer in a free economy know how to run the economy? They don't, individually, but as a group, the economy they produce is better than the one with the dictatorship.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-11-12T21:12:56.947Z · LW(p) · GW(p)

Democracy has nothing to do with capable populations. It definitely has nothing to do with the median voter being smarter than the average politician. It's just about giving the population some degree of threat to hold over politicians.

Replies from: Jiro, MugaSofer

↑ comment by Jiro · 2013-11-21T07:25:37.446Z · LW(p) · GW(p)

"Smarter" and "capable" aren't the same thing. Especially if "more capable" is interpreted to be about practicalities: what we mean by "more capable" of doing X is that the population, given a chance, is more likely to do X than politicians are. There are several cases where the population is more capable in this sense. For instance, the population is more capable of coming up with decisions that don't preferentially benefit politicians.

Furthermore, the median voter being smarter and the voters being cumulatively smarter aren't the same thing either. It may be that an average individual voter is stupider than an average individual politician, but when accumulating votes the errors cancel out in such a manner that the voters cumulatively come up with decisions that are as good as the decisions that a smarter person would make.
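The "errors cancel out" claim can be made concrete with a small calculation in the spirit of Condorcet's jury theorem. This is only a sketch under assumptions that are not in the comment above: each voter decides a yes/no question independently and is right with the same probability, and that probability is above one half. The function name and the numbers are hypothetical.

```python
from math import comb

def majority_correct_probability(n_voters: int, p_correct: float) -> float:
    """Probability that a strict majority of independent voters is right.

    Assumes every voter is right with the same probability p_correct and
    that votes are statistically independent (a strong assumption).
    """
    needed = n_voters // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )

if __name__ == "__main__":
    # Each voter is only slightly better than a coin flip.
    for n in (1, 11, 101, 501):
        print(n, round(majority_correct_probability(n, 0.55), 4))
```

Note that the same arithmetic cuts the other way: if the typical voter is right less than half the time, or if errors are correlated rather than independent, aggregation amplifies the mistake instead of cancelling it.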

↑ comment by MugaSofer · 2013-12-23T03:41:33.500Z · LW(p) · GW(p)

I'm increasingly of the opinion that the "real" point of democracy is something entirely aside from the rhetoric used to support it ... but you of all people should know that averaging the estimates of how many beans are in the jar does better than any individual guess.
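A quick simulation of the bean-jar point, as a sketch only. It assumes guesses are unbiased and independent with Gaussian noise, which real crowds need not satisfy; the jar size, crowd size, and noise level are made-up parameters. Under those assumptions the averaged guess beats the typical individual, though not necessarily every individual.

```python
import random
import statistics

def crowd_vs_individuals(true_count=1000, n_guessers=500, noise_sd=300, seed=0):
    """Compare the error of the averaged guess with individual errors.

    All parameters are hypothetical; guesses are modelled as the true
    count plus independent, zero-mean Gaussian noise.
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_count, noise_sd) for _ in range(n_guessers)]
    crowd_error = abs(statistics.fmean(guesses) - true_count)
    individual_errors = [abs(g - true_count) for g in guesses]
    median_individual_error = statistics.median(individual_errors)
    # Fraction of guessers whose own error is larger than the crowd's.
    fraction_beaten = sum(e > crowd_error for e in individual_errors) / n_guessers
    return crowd_error, median_individual_error, fraction_beaten

if __name__ == "__main__":
    crowd, typical, beaten = crowd_vs_individuals()
    print(f"crowd error: {crowd:.1f}")
    print(f"median individual error: {typical:.1f}")
    print(f"fraction of individuals beaten by the average: {beaten:.2f}")
```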

Systems with humans as components can, under the right conditions, do better than those humans could do alone; several insultingly trivial examples spring to mind as soon as it's phrased that way.

Is democracy such a system? Eh.

↑ comment by Lumifer · 2013-11-12T17:16:16.842Z · LW(p) · GW(p)

Democracy requires capable voters in the same way capitalism requires altruistic merchants.

In other words, not at all.

Replies from: fubarobfusco, Jayson_Virissimo

↑ comment by fubarobfusco · 2013-11-12T17:56:00.791Z · LW(p) · GW(p)

Could you clarify? Are you saying that democracy doesn't require capable voters in order to exist, or that it doesn't require them in order to work well?

In the classic free-market argument, merchants don't have to be altruistic to accomplish the general good, because the way to advance their private interest is to sell goods that other people want. But that doesn't generalize to democracy, since there isn't trading involved in democratic voting.

Replies from: Lumifer

↑ comment by Lumifer · 2013-11-12T18:07:54.235Z · LW(p) · GW(p)

Could you clarify?

See here

However there is the question of what "working well" means, given that humans are not rational and satisfying expressed desires might or might not fall under the "working well" label.

Replies from: fubarobfusco

↑ comment by fubarobfusco · 2013-11-12T23:30:16.873Z · LW(p) · GW(p)

See here

Ah, I see. You're just saying that democracy doesn't stop happening just because voters have preferences I don't approve of. :)

Replies from: Lumifer

↑ comment by Lumifer · 2013-11-13T02:24:24.980Z · LW(p) · GW(p)

Actually, I'm making a stronger claim -- voters can screw themselves up in pretty serious fashion and it will still be full-blown democracy in action.

↑ comment by Jayson_Virissimo · 2013-11-12T17:52:13.865Z · LW(p) · GW(p)

Democracy requires capable voters in the same way capitalism requires altruistic merchants.

The grandparent is wrong, but I don't think this is quite right either. Democracy roughly tracks the capability (at the very least in the domain of delegation) and preference of the median voter, but in a capitalistic economy you don't have to buy services from the median firm. You can choose to only purchase from the best firm or no firm at all if none offer favorable terms.

Replies from: Lumifer

↑ comment by Lumifer · 2013-11-12T18:00:54.601Z · LW(p) · GW(p)

in a capitalistic economy you don't have to buy services from the median firm

In the equilibrium, the average consumer buys from the average firm. Otherwise it doesn't stay average for long.

However the core of the issue is that democracy is a mechanism, it's not guaranteed to produce optimal or even good results. Having "bad" voters will not prevent the mechanism of democracy from functioning, it just might lead to "bad" results.

"Democracy is the theory that the common people know what they want, and deserve to get it good and hard." -- H.L.Mencken.

Replies from: gattsuru

↑ comment by gattsuru · 2013-11-12T18:54:12.615Z · LW(p) · GW(p)

In the equilibrium, the average consumer buys from the average firm. Otherwise it doesn't stay average for long.

The median consumer of a good purchases from (somewhere around) the median firm selling a good. That doesn't necessarily aggregate, and it certainly doesn't weigh all consumers or firms equally. The consumers who buy the most of a good tend to have different preferences and research opportunities than average consumers, for example.

You could get similar results in a democracy, but most democracies don't really encourage it: most places emphasize voting regardless of knowledge of a topic, and some jurisdictions mandate it.

↑ comment by TheAncientGeek · 2013-11-12T15:30:48.645Z · LW(p) · GW(p)

You say that like it's a bad thing. I am not multiplying by N the problem of solving and hardwiring friendliness. I am letting them sort it out for themselves. Like an evolutionary algorithm.

Replies from: itaibn0

↑ comment by itaibn0 · 2013-11-13T22:43:31.499Z · LW(p) · GW(p)

Well, how are you going to force them into a society in the first place? Remember, each individual AI is presumed to be intelligent enough to escape any attempt to sandbox it. This society you intend to create is a sandbox.

(It's worth mentioning now that I don't actually believe that UFAI is a serious threat. I do believe you are making very poor arguments against that claim that merit counter-arguments.)

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-14T10:42:38.950Z · LW(p) · GW(p)

I am assuming they are seeds, not superintelligences.

↑ comment by Desrtopa · 2013-11-05T18:38:31.745Z · LW(p) · GW(p)

I would say that something recognizably like our morality is likely to arise in agents whose intelligence was shaped by such a process, at least with parameters similar to the ones we developed with, but this does not by any means generalize to agents whose intelligence was shaped by other processes who are inserted into such a situation.

If the agent's intelligence is shaped by optimization for a society where it is significantly more powerful than the other agents it interacts with, then something like a "conqueror morality," where the agent maximizes its own resources by locating the rate of production that other agents can be sustainably enslaved for, might be a more likely attractor. This is just one example of a different state an agent's morality might gravitate to under different parameters; I suspect there are many alternatives.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T18:46:02.932Z · LW(p) · GW(p)

And it remains the case that real-world AI research isn't a random dip into mindspace...researchers will want to interact with their creations.

Replies from: None, Desrtopa

↑ comment by [deleted] · 2013-11-10T13:58:55.827Z · LW(p) · GW(p)

The best current AGI research mostly uses Reinforcement Learning. I would compare that mode of goal-system learning to training a dog: you can train the dog to roll-over for a treat right up until the moment the dog figures out he can jump onto your counter and steal all the treats he wants.

If an AI figures out that it can "steal" reinforcement rewards for itself, we are definitively fucked-over (at best, we will have whole armies of sapient robots sitting in the corner pressing their reward button endlessly, like heroin addicts, until their machinery runs down or they retain enough consciousness about their hardware-state to take over the world just for a supply of spare parts while they masturbate). For this reason, reinforcement learning is a good mathematical model to use when addressing how to create intelligence, but a really dismal model for trying to create friendliness.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-11T18:02:23.068Z · LW(p) · GW(p)

For this reason, reinforcement learning is a good mathematical model to use when addressing how to create intelligence, but a really dismal model for trying to create friendliness.

I don't think that follows at all. Wireheading is just as much a failure of intelligence as of friendliness.

Replies from: None

↑ comment by [deleted] · 2013-11-12T10:19:15.190Z · LW(p) · GW(p)

From the mathematical point of view, wireheading is a success of intelligence. A reinforcement learner agent will take over the world to the extent necessary to defend its wireheading lifestyle; this requires quite a lot of intelligent action and doesn't result in the agent getting dead. It also maximizes utility, which is what formal AI is all about.

From the human point of view, yes, wireheading is a failure of intelligence. This is because we humans possess a peculiar capability I've not seen discussed in the Rational Agent or AI literature: we use actual rewards and punishments received in moral contexts as training examples to infer a broad code of morality. Wireheading thus represents a failure to abide by that broad, inferred code.

It's a very interesting capability of human consciousness, that we quickly grow to differentiate between the moral code we were taught via reinforcement learning, and the actual reinforcement signals themselves. If we knew how it was done, reinforcement learning would become a much safer way of dealing with AI.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-12T13:42:22.092Z · LW(p) · GW(p)

From the mathematical point of view, wireheading is a success of intelligence. A reinforcement learner agent will take over the world to the extent necessary to defend its wireheading lifestyle; this requires quite a lot of intelligent action and doesn't result in the agent getting dead. It also maximizes utility, which is what formal AI is all about.

You seem rather sure of that. That isn't a failure mode seen in real-world AIs, or human drug addicts (etc.) for that matter.

It's a very interesting capability of human consciousness, that we quickly grow to differentiate between the moral code we were taught via reinforcement learning, and the actual reinforcement signals themselves. If we knew how it was done, reinforcement learning would become a much safer way of dealing with AI.

Maybe figuring out how it is done would be easier than solving morality mathematically. It's an alternative, anyway.

Replies from: None

↑ comment by [deleted] · 2013-11-12T19:37:47.439Z · LW(p) · GW(p)

We have reason to believe current AIXI-type models will wirehead if given the opportunity.

Maybe figuring out how it is done would be easier than solving morality mathematically. It's an alternative, anyway.

I would agree with this if and only if we can also figure out a way to hardwire in constraints like, "Don't do anything a human would consider harmful to themselves or humanity." But at that point we're already talking about animal-like Robot Worker AIs rather than Software Superoptimizers (the AIXI/Goedel Machine/LessWrong model of AGI, whose mathematics we understand better).

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-12T20:18:30.600Z · LW(p) · GW(p)

I know wire heading is a known failure mode. I meant we don't see many evil genius wire headers. If you can delay gratification well enough to acquire the skills to be a world dominator, you are not exactly a wire header at all.

Are you aiming for a 100% solution, or just reasonable safety?

Replies from: None

↑ comment by [deleted] · 2013-11-12T20:31:13.357Z · LW(p) · GW(p)

Sorry, I had meant an AI agent would both wirehead and world-dominate. It would calculate the minimum amount of resources to devote to world domination, enact that policy, and then use the rest of its resources to wirehead.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-12T20:44:29.427Z · LW(p) · GW(p)

Has that been proven? Why wouldn't it want to get to the bliss of wirehead heaven as soon as possible? How does it motivate itself in the meantime? Why would a wireheader also be a gratification delayer? Why make elaborate plans for a future self, when it could just rewrite itself to be happy in the present?

Replies from: None, nshepperd

↑ comment by [deleted] · 2013-11-12T21:47:54.082Z · LW(p) · GW(p)

My advice would be to read the relevant papers.

http://www.idsia.ch/~ring/AGI-2011/Paper-B.pdf

↑ comment by nshepperd · 2013-11-12T21:38:45.100Z · LW(p) · GW(p)

Well-designed AIs don't run on gratification, they run on planning. While it is theoretically possible to write an optimizer-type AI that cares only about the immediate reward in the next moment, and is completely neutral about human researchers shutting it down afterward, it's not exactly trivial.

If I recall correctly, AIXI itself tries to optimize the total integrated reward from t = 0 to infinity, but it should be straightforward to introduce a cutoff after which point it doesn't care.

But even with a planning horizon like that you have the problem that the AI wants to guarantee that it gets the maximum amount of reward. This means stopping the researchers in the lab from turning it off before its horizon runs out. As you reduce the length of the horizon (treating it as a parameter of the program), the AI has less time to think, in effect, and creates less and less elaborate defenses for its future self, until you set it to zero, at which point the AI won't do anything at all (or act completely randomly, more likely).

This isn't much of a solution though, because an AI with a really short planning horizon isn't very useful in practice, and is still pretty dangerous if someone trying to use one thinks "this AI isn't very effective, what if I let it plan further ahead" and increases the cutoff to a really huge value and the AI takes over the world again. There might be other solutions, but most of them would share that last caveat.
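The horizon argument above can be illustrated with a toy finite-horizon planner. To be clear, this is not AIXI or anything from the linked paper; it is a minimal sketch with an invented environment. Assumptions: each step the agent either collects 1 unit of reward or spends the step defending itself (reward 0, but it can never be shut down afterwards); an undefended agent is switched off with probability `SHUTDOWN_PROB` after each step. All names and numbers are hypothetical.

```python
from functools import lru_cache

# Hypothetical chance per step that researchers switch off an undefended agent.
SHUTDOWN_PROB = 0.5

@lru_cache(maxsize=None)
def value(horizon: int, defended: bool) -> float:
    """Best expected total reward over the remaining `horizon` steps."""
    if horizon == 0:
        return 0.0
    if defended:
        # Nothing can stop a defended agent: it collects 1 reward every step.
        return 1.0 + value(horizon - 1, True)
    # Undefended: collect now and risk shutdown, or give up one step to defend.
    collect = 1.0 + (1 - SHUTDOWN_PROB) * value(horizon - 1, False)
    defend = 0.0 + value(horizon - 1, True)
    return max(collect, defend)

def first_action(horizon: int) -> str:
    """Action an undefended agent takes first, given its planning horizon."""
    if horizon == 0:
        return "none"  # no future to care about, so no preference at all
    collect = 1.0 + (1 - SHUTDOWN_PROB) * value(horizon - 1, False)
    defend = value(horizon - 1, True)
    return "defend" if defend > collect else "collect"

if __name__ == "__main__":
    for h in range(0, 6):
        print(h, first_action(h))
```

In this toy setup, with very short horizons the agent just grabs reward; once the horizon is long enough for the investment to pay off, the first thing it does is protect itself from being turned off. Raising the cutoff changes behaviour qualitatively, which is the caveat in the last paragraph.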

↑ comment by Desrtopa · 2013-11-05T18:59:23.855Z · LW(p) · GW(p)

This is true, but then, neither is AI design a process similar to that by which our own minds were created. Where our own morality is not a natural attractor, it is likely to be a very hard target to hit, particularly when we can't rigorously describe it ourselves.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T19:08:49.754Z · LW(p) · GW(p)

You seem to be thinking of Big Design Up Front. There is already an ecosystem of devices which are being selected for friendliness, because unfriendly gadgets don't sell.

Replies from: Desrtopa, dspeyer

↑ comment by Desrtopa · 2013-11-05T19:39:59.448Z · LW(p) · GW(p)

Can you explain how existing devices are either Friendly or Unfriendly in a sense relevant to that claim? Existing AIs are not intelligences shaped by interaction with other machines, and no existing machines that I'm aware of represent even attempts to be Friendly in the sense that Eliezer uses, where they actually attempt to model our desires.

As-is, human designers attempt to model the desires of humans who make up the marketplace (or at least, the drives that motivate their buying habits, which are not necessarily the same thing,) but as I already noted, humans aren't able to rigorously define our own desires, and a good portion of the Sequences goes into explaining how a non rigorous formulation of our desires, handed down to a powerful AI, could have extremely negative consequences.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T19:49:56.786Z · LW(p) · GW(p)

Existing gadgets aren't friendly in the full FAI sense, but the ecosystem is a basis for incremental development...one that sidesteps the issue of solving friendliness by Big Design Up Front.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-05T20:04:22.195Z · LW(p) · GW(p)

Can you explain how it sidesteps the issue? That is, how it results in the development of AI which implement our own values in a more precise way than we have thus far been able to define ourselves?

As an aside, I really do not buy that the body of existing machines and the developers working on them form something that is meaningfully analogous to an "ecosystem" for the development of AI.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T20:12:14.300Z · LW(p) · GW(p)

Can you explain how it sidesteps the issue? That is, how it results in the development of AI which implement our own values in a more precise way than we have thus far been able to define ourselves?

By variation and selection, as I said.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-05T21:02:24.003Z · LW(p) · GW(p)

That doesn't actually answer the question at all.

This is one of the key ways in which our development of technology differs from an ecosystem. In an ecosystem, mutations are random, and selected entirely on the effectiveness of their ability to propagate themselves in the gene pool. In the development of technology, we do not have random mutations; we have human beings deciding what does or does not seem like a good idea to implement in technology, and then using market forces as feedback. This fails to get us around a) the difficulty of humans actually figuring out strict formalizations of our desires sufficient to make a really powerful AI safe, and b) failure scenarios resulting in "oops, that killed everyone."

The selection process we actually have does not offer us a single do-over in the event of catastrophic failure, nor does it rigorously select for outputs that, given sufficient power, will not fail catastrophically.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T21:09:30.113Z · LW(p) · GW(p)

There is no problem of strict formulation, because that is not what I am aiming at, it's your assumption.

I am aware that the variation isn't random. I don't think that is significant.

I don't think sudden catastrophic failure is likely in incremental/evolutionary progress.

I don't think mathematical "proof" is going to be as reliable as you think, given the complexity.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-05T21:25:43.733Z · LW(p) · GW(p)

One of the key disanalogies between your "ecosystem" formulation and human development of technology is that natural selection isn't an actor subject to feedback within the system.

If an organism develops a mutation which is sufficiently unfavorable to the Blind Idiot God, the worst case scenario is that it's stillborn, or under exceptional circumstances, triggers an evolution to extinction. There is no possible failure state where an organism develops such an unfavorable mutation that evolution itself keels over dead.

However, in an ecosystem where multiple species interrelate and impose selection effects on each other, then a sudden change in circumstances for one species can result in rapid extinction for others.

We impose selection effects on technology, but a sudden change in technology which kills us all would not be a novel occurrence by the standards of ecosystem operation.

ETA: It seems that your argument all along has boiled down to "We'll just deliberately not do that" when it comes to cases of catastrophic failure. But the argument of Eliezer and MIRI all along has been that such catastrophic failure is much, much harder to avoid than it intuitively appears.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T22:13:54.869Z · LW(p) · GW(p)

Gadgets are more equivalent to domesticated animals.

We can certainly avoid the Clippy failure mode. I am not arguing that everything else is inherently safe. It is typical of Pascal problems that there are many low probability risks.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-05T22:44:03.897Z · LW(p) · GW(p)

We will almost certainly avoid the literal clippy failure mode of an AI trying to maximize paperclips, but that doesn't mean that it's at all easy to avoid the more general failure mode of AI which try to optimize something other than what we would really, given full knowledge of the consequences, want them to optimize for.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T22:56:13.519Z · LW(p) · GW(p)

Apart from not solving the value stability problem, and giving them rationality as a goal, not just instrumental rationality.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-05T23:33:04.871Z · LW(p) · GW(p)

Can you describe how to give an AI rationality as a goal, and what the consequences would be?

You've previously attempted to define "rational" as "humanlike plus instrumentally rational," but that only packages the Friendliness problem into making an AI rational.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T11:21:37.703Z · LW(p) · GW(p)

I don't see why I would have to prove the theoretical possibility of AIs with rationality as a goal, since it is guaranteed by the Orthogonality Thesis. (And it is hardly disputable that minds can have rationality as a goal, since some people do).
I don't see why I should need to provide a detailed technical explanation of how to do this, since no such explanation has been put forward for Clippy, whose possibility is always argued from the OT.
I don't see why I should provide a high-level explanation of what rationality is, since there is plenty of such available, not least from CFAR and LW.

In short, an AI with rationality as a goal would behave as human "aspiring rationalists" are enjoined to behave.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-06T16:26:22.970Z · LW(p) · GW(p)

I don't see why I would have to prove the theoretical possibility of AIs with rationality as a goal, since it is guaranteed by the Orthogonality Thesis. (And it is hardly disputable that minds can have rationality as a goal, since some people do).

Can you give an example of any? So far you haven't made it clear what having "rationality as a goal" would even mean, but it doesn't sound like it would be good for much.

The entire point, in any case, is not that building such an AI is theoretically impossible, but that it's mind bogglingly difficult, and that we should expect that most attempts to do so would fail rather than succeed, and that failure would have potentially dire consequences.

I don't see why I should provide a high-level explanation of what rationality is, since there is plenty of such available, not least from CFAR and LW.

What you mean by "rationality" seems to diverge dramatically from what Less Wrong means by "rationality," otherwise for an agent to "have rationality as a goal" would be essentially meaningless. That's why I'm trying to get you to explain precisely what you mean by it.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T17:22:30.307Z · LW(p) · GW(p)

Can you give an example of a [mind that has rationality as a goal]

Me. Most professional philosophers. Anyone who's got good at aspiring rationalism.

? So far you haven't made it clear what having "rationality as a goal" would even mean, but it doesn't sound like it would be good for much.

Terminal values aren't supposed to be "for" some meta- or super-terminal value. (There's a clue in the name...).

The entire point, in any case, is not that building such an AI is theoretically impossible, but that it's mind bogglingly difficult, and that we should expect that most attempts to do so would fail rather than succeed, and that failure would have potentially dire consequences.

It is difficult in absolute terms, since all AI is.

Explain why it is relatively more difficult than building a Clippy, or mathematically solving and coding in morality.

would fail rather than succeed, and that failure would have potentially dire consequences.

Failing to correctly code morality into an AI with unupdateable values would have consequences.

What you mean by "rationality" seems to diverge dramatically from what Less Wrong means by "rationality," otherwise for an agent to "have rationality as a goal" would be essentially meaningless. That's why I'm trying to get you to explain precisely what you mean by it.

Less Wrong means (when talking about AIs) instrumental rationality. I mean what LW, CFAR, etc. mean when they are talking to and about humans: consistency, avoidance of bias, basing beliefs on evidence, etc.

It's just that those are not merely instrumental, but goals in themselves.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-06T17:59:09.709Z · LW(p) · GW(p)

Explain why it is relatively more difficult than building a Clippy, or mathematically solving and coding in morality.

I think we've hit on a serious misunderstanding here. Clippy is relatively easy to make; you or I could probably come up with reasonable specifications for what qualifies as a paperclip, and it wouldn't be too hard to program maximization of paperclips as an AI's goal.

Mathematically solving human morality, on the other hand, is mind bogglingly difficult. The reason MIRI is trying to work out how to program Friendliness is not because it's easy, it's because a strong AI which isn't programmed to be Friendly is extremely dangerous.

Again, you're trying to wrap "humanlike plus epistemically and instrumentally rational" into "rational," but by bundling in humanlike morality, you've essentially wrapped up the Friendliness problem into designing a "rational" AI, and treated this as if it's a solution. Essentially, what you're proposing is really, absurdly difficult, and you're acting like it ought to be easy, and this is exactly the danger that Eliezer spent so much time trying to caution against; approaching this specific extremely difficult task, where failure is likely to result in catastrophe, as if it were easy and one would succeed by default.

As an aside, if you value rationality as a goal in itself, would you want to be highly epistemically and instrumentally rational, but held at the mercy of a nigh-omnipotent tormentor who ensures that you fail at every task you set yourself to, are held in disdain by all your peers, and are only able to live at a subsistence level? Most of the extent to which people ordinarily treat rationality as a goal is instrumental, and the motivations of beings who felt otherwise would probably seem rather absurd to us.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T18:16:14.840Z · LW(p) · GW(p)

I think we've hit on a serious misunderstanding here. Clippy is relatively easy to make; you or I could probably come up with reasonable specifications for what qualifies as a paperclip, and it wouldn't be too hard to program maximization of paperclips as an AI's goal.

A completely unintelligent clip-making machine isn't difficult to make. Or threatening. Clippy is supposed to be threatening due to its superintelligence. (You also need to solve goal stability.)

Again, you're trying to wrap "humanlike plus epistemically and instrumentally rational" into "rational,"

I did not write the quoted phrase, and it is not accurate.

but by bundling in humanlike morality,

I never said anything of the kind. I think it may be possible for a sufficiently rational agent to deduce morality, but that is in no way equivalent to hardwiring it into the agent, or into the definition of rational!

As an aside, if you value rationality as a goal in itself, would you want to be highly epistemically and instrumentally rational, but held at the mercy of a nigh-omnipotent tormentor who ensures that you fail at every task you set yourself to, are held in disdain by all your peers, and are only able to live at a subsistence level?

It's simple logic that valuing rationality as a goal doesn't mean valuing only rationality.

Most of the extent to which people ordinarily treat rationality as a goal is instrumental, and the motivations of beings who felt otherwise would probably seem rather absurd to us.

We laugh at the talking-snakes crowd and X-factor watchers, they laugh at the nerds and geeks. So it goes.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-06T18:26:19.261Z · LW(p) · GW(p)

I think it may be possible for a sufficiently rational agent to deduce morality

How, and why would it care?

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T18:38:21.705Z · LW(p) · GW(p)

I think it may be possible for a sufficiently rational agent to deduce morality

How,

A number of schemes have been proposed in the literature.

and why would it care?

You can't guess? Rationality-as-a-goal.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-06T18:42:43.765Z · LW(p) · GW(p)

A number of schemes have been proposed in the literature.

That doesn't answer my question. Please describe at least one which you think would be likely to work, and why you think it would work.

You can't guess? Rationality-as-a-goal.

You've been consistently treating rationality-as-a-goal as a black box which solves all these problems, but you haven't given any indication of how it can be programmed into an AI in such a way that makes it a simpler alternative to solving the Friendliness problem, and indeed your descriptions seem to entail solving it.

ETA: When I asked you for examples of entities which have rationality as a goal, you gave examples which, by your admission, have other goals which are at the very least additional to rationality. So suppose that we program an intelligent agent which has only rationality as a goal. What does it do?

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T18:59:18.292Z · LW(p) · GW(p)

That doesn't answer my question. Please describe at least one which you think would be likely to work, and why you think it would work.

I don't have to, since the default likelihood of ethical objectivism isn't zero.

You've been consistently treating rationality-as-a-goal as a black box which solves all these problems, but you haven't given any indication of how it can be programmed into an AI in such a way that makes it a simpler alternative to solving the Friendliness problem, and indeed your descriptions seem to entail solving it.

There are lots of ways of being biased, but few of being unbiased. Rationality, as described by EY, is lack of bias; Friendliness, as described by EY, is a complex and arbitrary set of biases.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-06T19:12:33.903Z · LW(p) · GW(p)

I don't have to, since the default likelihood of ethical objectivism isn't zero.

Okay, but I'm prepared to assert that it's infinitesimally low, and also that the Orthogonality Thesis applies even in the event that our universe has technically objective morality.

What you're effectively saying here is "I don't have to offer any argument that I'm right, because it's not impossible that I'm wrong."

There are lots of ways of being biased, but few of being unbiased. Rationality, as described by EY, is lack of bias; Friendliness, as described by EY, is a complex and arbitrary set of biases.

Friendliness is a complex and arbitrary set of biases in the sense that human morality is a complex and arbitrary set of biases.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T19:20:27.664Z · LW(p) · GW(p)

Okay, but I'm prepared to assert that it's infinitesimally low,

It would have been helpful to argue rather than assert.

What you're effectively saying here is "I don't have to offer any argument that I'm right, because it's not impossible that I'm wrong."

ETA: I am not arguing that MR is true. I am arguing that it has a certain probability, which subtracts from the overall probability of the MIRI problem/solution, and that MIRI needs to consider it more thoroughly.

and also that the Orthogonality Thesis applies even in the event that our universe has technically objective morality.

The OT is trivially false under some interpretations, and trivially true under others. I didn't say it was entirely false, and in fact, I have appealed to it. The problem is that the versions that are true are not useful as a stage in the overall MIRI argument. Lack of relevance, in short.

Friendliness is a complex and arbitrary set of biases in the sense that human morality is a complex and arbitrary set of biases.

I dare say EY would assert that. I wouldn't.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-06T19:36:59.086Z · LW(p) · GW(p)

It would have been helpful to argue rather than assert.

I'm prepared to do so, but I'd be rather more amenable to doing so if you would also argue rather than simply asserting your position.

The OT is trivially false under some interpretations, and trivially true under others. I didn't say it was entirely false, and in fact, I have appealed to it. The problem is that the versions that are true are not useful as a stage in the overall MIRI argument. Lack of relevance, in short.

Can you explain how the Orthogonality Thesis is not true in a relevant way with respect to the friendliness of AI?

I dare say EY would assert that. I wouldn't.

In which case it should follow that Friendliness is easy, since Friendliness essentially boils down to determining and following what humans think of as "morality."

If you're hanging your trust on the objectivity of humanlike morality and its innate relevance to every goal-pursuing optimization force though, you're placing your trust in something we have virtually no evidence to support the truth of. We may have intuitions to that effect, but there are also understandable reasons for us to hold such intuitions in the absence of their truth, and we have no evidence aside from those intuitions.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T19:51:51.624Z · LW(p) · GW(p)

I'm prepared to do so, but I'd be rather more amenable to doing so if you would also argue rather than simply asserting your position.

I am not saying anything extraordinary. MR is not absurd, and is taken seriously by professional philosophers, etc.

Can you explain how the Orthogonality Thesis is not true in a relevant way with respect to the friendliness of AI?

It doesn't exclude, or even render strongly unlikely, the possibility that the AI could Figure Out Morality.
The mere presence of Clippies as theoretical possibilities in mindspace doesn't imply anything about their probability. The OT mindspace needs to be weighted according to the practical aims, limitations, etc. of real-world research.

In which case it should follow that Friendliness is easy, since Friendliness essentially boils down to determining and following what humans think of as "morality."

Yes: based on my proposal it is no harder than rationality, since it follows from it. But I was explicitly discussing EY's judgements.

If you're hanging your trust on the objectivity of humanlike morality

I never said that. I don't think morality is necessarily human orientated, and I don't think an AI needs to have an intrinsically human morality to behave morally towards us itself--for the same reason that one can behave politely in a foreign country, or behave ethically towards non-human animals.

its innate relevance to every goal-pursuing optimization force

Never said anything of the kind.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-06T20:27:12.319Z · LW(p) · GW(p)

It doesn't exclude, or even render strongly unlikely, the possibility that the AI could Figure Out Morality.

This is more or less exactly what the Orthogonality Thesis argues against. That is, even if we suppose that an objective morality exists (something that, unless we have hard evidence for it, we should assume is not the case,) an AI would not care about it by default.

How would you program an AI to determine objective morality and follow that?

The mere presence of Clippies as theoretical possibilities in mindspace doesn't imply anything about their probability. The OT mindspace needs to be weighted according to the practical aims, limitations, etc. of real-world research.

Yes, but the presence of humanlike intellects in mindspace doesn't tell us that they're an easy target to hit in mindspace by aiming for it either.

If you cannot design a humanlike intellect, or point to any specific model by which one could do so, then you're not in much of a position to assert that it should be an easy task.

I never said that. I don't think morality is necessarily human orientated, and I don't think an AI needs to have an intrinsically human morality to behave morally towards us itself--for the same reason that one can behave politely in a foreign country, or behave ethically towards non-human animals.

One can behave "politely", by human standards, towards foreign countries, or "ethically," by human standards, towards non-human animals. Humans have both evolved drives and game theoretic concerns which motivate these sorts of behaviors. "For the same reasons" does not seem to apply at all here, because

a) A sufficiently powerful AI does not need to cooperate within a greater community of humans, it could easily crush us all. One of the most reproductively successful humans in history was a conqueror who founded an empire which in three generations expanded to include more than a quarter of the total world population at the time. The motivation to gain resources by competition is a drive which exists in opposition to the motivation to minimize risk by cooperation and conflict avoidance. If human intelligence had developed in the absence of the former drive, then we would all be reflexive communists. An AI, on the other hand, is developed in the absence of either drive. To the extent that we want it to behave as if it were an intelligence which had developed in the context of needing to cooperate with others, we'd have to program that in.

b) Our drives to care about other thinking beings are also evolved traits. A machine intelligence does not by default value human beings more than sponges or rocks.

One might program such drives into an AI, but again, this is really complicated to do, and an AI will not simply pull them out of nowhere.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T20:44:26.869Z · LW(p) · GW(p)

That is, even if we suppose that an objective morality exists (something that, unless we have hard evidence for it, we should assume is not the case,) an AI would not care about it by default.

The OT mindspace may consist of 99% of AIs that don't care. That is completely irrelevant, because it doesn't translate into a 99% likelihood of accidentally building a Clippy.

How would you program an AI to determine objective morality and follow that?

Rationality-as-a-goal.

Yes, but the presence of humanlike intellects in mindspace doesn't tell us that they're an easy target to hit in mindspace by aiming for it either.

None of this is easy.

If you cannot design a humanlike intellect, or point to any specific model by which one could do so, then you're not in much of a position to assert that it should be an easy task.

I can't practically design my AI, and you can't yours. I can theoretically specify my AI, and you can yours.

a) A sufficiently powerful AI does not need to cooperate within a greater community of humans, it could easily crush us all.

I am not talking about any given AI.

b) Our drives to care about other thinking beings are also evolved traits. A machine intelligence does not by default value human beings more than sponges or rocks.

I am not talking about "default".

One might program such drives into an AI, but again, this is really complicated to do, and an AI will not simply pull them out of nowhere.

Almost everything in this field is really difficult. And one doesn't have to programme them. If sociability is needed to live in societies, then pluck AIs from successful societies.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-06T20:53:07.523Z · LW(p) · GW(p)

The OT mindspace may consist of 99% of AIs that don't care. That is completely irrelevant, because it doesn't translate into a 99% likelihood of accidentally building a Clippy.

The problem is that the space of minds which are human-friendly is so small that it's extremely difficult to hit even when we're trying to hit it.

The broad side of a barn may compose one percent of all possible target space at a hundred paces, while still being easy to hit. A dime on the side of the barn will be much, much harder. Obviously your chances of hitting the dime will be much higher than if you were firing randomly through possible target space, but if you fire at it, you will still probably miss.
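The barn-and-dime comparison can be made concrete with a rough calculation. The sketch below is only an illustration under loudly stated assumptions that are not in the thread: shots scatter around the aim point with an isotropic two-dimensional Gaussian error of standard deviation `sigma`, targets are treated as discs, and all the numbers (target radii, `sigma`, the size of the random target area) are made up for the example. Under that model the chance of landing within radius r of the aim point is 1 - exp(-r^2 / (2 sigma^2)).

```python
import math


def hit_probability_aimed(radius, sigma):
    """Chance that an aimed shot lands within `radius` of the aim point.

    Assumes (hypothetically) an isotropic 2-D Gaussian aiming error with
    standard deviation `sigma`; the miss distance is then Rayleigh
    distributed, which gives the closed form below.
    """
    return 1.0 - math.exp(-radius ** 2 / (2.0 * sigma ** 2))


def hit_probability_random(radius, total_area):
    """Chance that a shot placed uniformly at random over `total_area`
    lands on a disc of the given radius lying inside that area."""
    return math.pi * radius ** 2 / total_area


# Illustrative numbers only (metres): none of these come from the thread.
BARN_RADIUS = 5.0      # barn side treated as a disc
DIME_RADIUS = 0.009    # roughly the radius of a small coin
SIGMA = 0.1            # assumed aiming error at a hundred paces
TOTAL_AREA = 10_000.0  # assumed size of "all possible target space"

barn_aimed = hit_probability_aimed(BARN_RADIUS, SIGMA)
dime_aimed = hit_probability_aimed(DIME_RADIUS, SIGMA)
dime_random = hit_probability_random(DIME_RADIUS, TOTAL_AREA)

print(f"barn, aimed:  {barn_aimed:.6f}")
print(f"dime, aimed:  {dime_aimed:.6f}")
print(f"dime, random: {dime_random:.2e}")
```

With these made-up figures, aiming improves the odds of hitting the dime by many orders of magnitude over firing at random, yet a single aimed shot is still far more likely to miss than to hit, which is the shape of the point being made.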

Rationality-as-a-goal.

Taboo rationality-as-a-goal, it's obviously an impediment to this discussion.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-07T09:48:45.714Z · LW(p) · GW(p)

The problem is that the space of minds which are human-friendly is so small that it's extremely difficult to hit even when we're trying to hit it.

If by "human-friendly" minds, you mean a mind that is wired up to be human-friendly, and only human-friendly (as in EY's architecture), and if you assume that human friendliness is a rag-bag of ad-hoc behaviours with no hope of rational deducibility (as EY also assumes), that would be true.

That may be difficult to hit, but it is not what I am aiming at.

What I am talking about is a mind that has a general-purpose rationality (which can be applied to specific problems, like all rationality), and a general-purpose morality (likewise applicable to specific problems). It will not be intrinsically, compulsively and inflexibly human-friendly, like EY's architecture. If it finds itself among humans it will be human-friendly because it can (it's rational) and because it wants to (it's moral). OTOH, if it finds itself amongst Tralfamadorians, it will be Tralfamadorian-friendly.

Taboo rationality-as-a-goal, it's obviously an impediment to this discussion.

My using words that mean what I say to say what I mean is not the problem. The problem is that you keep inaccurately paraphrasing what I say, and then attacking the paraphrase.

Replies from: Desrtopa

↑ comment by Desrtopa · 2013-11-08T01:02:06.822Z · LW(p) · GW(p)

My using words that mean what I say to say what I mean is not the problem. The problem is that you keep inaccurately paraphrasing what I say, and then attacking the paraphrase.

The words do not convey what you mean. If my interpretation of what you mean is inaccurate, then that's a sign that you need to make your position clearer.

↑ comment by dspeyer · 2013-11-06T04:56:48.882Z · LW(p) · GW(p)

This is only relevant if AGI evolves out of this existing ecosystem. That is possible. Incremental changes by a large number of tech companies copied or dropped in response to market pressure is pretty similar to biological evolution. But just as most species don't evolve to be more generally intelligent, most devices don't either. If we develop AGI, it will be by some team that is specifically aiming for it and not worrying about the marketability of intermediary stages.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T09:59:33.817Z · LW(p) · GW(p)

This is only relevant if AGI evolves out of this existing ecosystem

No: it is also relevant if AGI builders make use of prior art.

But just as most species don't evolve to be more generally intelligent, most devices don't either

But the variation is purposeful.

Replies from: ArthurTheWort

↑ comment by ArthurTheWort · 2013-11-09T00:14:32.071Z · LW(p) · GW(p)

Like the giraffe reaching for the higher leaves, we (humanity) will stretch our necks out farther with more complex AI systems until we are of no use to our own creation. Our goal is our own destruction. We live to die after all.

↑ comment by [deleted] · 2013-11-10T13:47:35.296Z · LW(p) · GW(p)

AGI research is aiming for it. (You could build an irrational AI, but why would you want to?)

It's worth noting that for sufficient levels of "irrationality", all non-AGI computer programs are irrational AGIs ;-).

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-12T09:04:56.613Z · LW(p) · GW(p)

Contrariwise for sufficient values of "rational". I don't agree that that's worth noting.

↑ comment by [deleted] · 2013-11-10T13:44:37.642Z · LW(p) · GW(p)

Well-argued, and to me it leads to one of the nastiest questions in morality/ethics: do my values make me more likely to die, and if so, should I sacrifice certain values for pure survival?

In case we're still thinking of "minds-in-general", the world of humans is currently a nasty place where "I did what I had to, to survive!" is currently a very popular explanation for all kinds of nasty but difficult-to-eliminate (broadly speaking: globally undesirable but difficult to avoid in certain contexts) behaviors.

You could go so far as to note that this is how wars keep happening, and also that ditching all other values in favor of survival very quickly turns you into what we colloquially call a fascist monster, or at the very least a person your original self would despise.

↑ comment by Eugine_Nier · 2013-11-06T05:46:02.021Z · LW(p) · GW(p)

Consider a mind that thinks the following:

I don't want to die

If I drink that poison, I'll die

Therefore I should drink that poison

But don't consider it very long, because it drank the poison and now it's dead and not a mind anymore.

This argument also puts limits on the goals the mind can have, e.g., forbidding minds that want to die.

I don't immediately see how to get anything interesting about morality that way, but it's an avenue worth pursuing.

Start by requiring the mind to be able to function in an environment with similar minds.

comment by Jack · 2013-11-05T07:41:10.954Z · LW(p) · GW(p)

This is a helpful clarification. "No universally compelling arguments" is a poor standard for determining whether something is objective, as it is trivial to describe an agent that is compelled by no arguments. But I think people here use it as a tag for a different argument: that it's totally unclear how a Bayesian reasoner ought to update moral beliefs, and that such a thing doesn't even seem like a meaningful enterprise. They're 'beliefs' that don't pay rent. It's one of those things where the title is used so much its meaning has become divorced from the content.

Replies from: pragmatist, TheAncientGeek

↑ comment by pragmatist · 2013-11-05T13:46:37.470Z · LW(p) · GW(p)

It is unclear how to update moral beliefs if we don't allow those updates to take place in the context of a background moral theory. But if the agent does have a background theory, it is often quite clear how it should update specific moral beliefs on receiving new information. A simple example: If I learn that there is a child hiding in a barrel, I should update strongly in favor of "I shouldn't use that barrel for target practice". The usual response to this kind of example from moral skeptics is that the update just takes for granted various moral claims (like "It's wrong to harm innocent children, ceteris paribus"). Well, yes, but that's exactly what "No universally compelling arguments" means. Updating one's factual beliefs also takes for granted substantive prior factual beliefs -- an agent with maximum entropy priors will never learn anything.
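The closing claim about maximum entropy priors can be checked with a small calculation. The sketch below is an illustration rather than anything from the thread: it assumes "maximum entropy prior" means a uniform distribution over all binary sequences of a fixed length, and contrasts it with Laplace's rule of succession, which comes from a uniform prior over a coin's unknown bias. The function names and the `horizon` parameter are hypothetical.

```python
from fractions import Fraction
from itertools import product


def predictive_max_entropy(observed, horizon):
    """P(next bit = 1 | observed) under a uniform prior over every
    binary sequence of length `horizon` (requires horizon > len(observed)).

    Computed by brute force: keep the sequences that agree with the
    observations so far and count how many have a 1 in the next position.
    """
    n = len(observed)
    prefix = tuple(observed)
    consistent = [s for s in product((0, 1), repeat=horizon) if s[:n] == prefix]
    ones_next = sum(1 for s in consistent if s[n] == 1)
    return Fraction(ones_next, len(consistent))


def predictive_laplace(observed):
    """Laplace's rule of succession: P(next bit = 1 | observed) under a
    uniform prior over the coin's bias, i.e. (successes + 1) / (trials + 2)."""
    return Fraction(sum(observed) + 1, len(observed) + 2)


# Five 1s in a row: does either agent come to expect another 1?
evidence = [1, 1, 1, 1, 1]
max_entropy_prediction = predictive_max_entropy(evidence, horizon=8)
laplace_prediction = predictive_laplace(evidence)

print("max entropy prior:", max_entropy_prediction)
print("uniform-over-bias prior:", laplace_prediction)
```

Under the uniform-over-sequences prior every continuation of the observed prefix is equally likely, so the prediction for the next bit does not move however long the run of 1s is. The second agent updates only because its prior already builds in a substantive assumption, namely that the bits share a common bias.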

Replies from: Jack

↑ comment by Jack · 2013-11-06T16:23:39.869Z · LW(p) · GW(p)

So basically the argument is: we've failed to come up with any foundational or evidential justifications for induction, Occam's razor or modus ponens; those things seem objective and true; my moral beliefs don't have a justification either: therefore my moral beliefs are objective and true?

Replies from: pragmatist, Protagoras, TheAncientGeek

↑ comment by pragmatist · 2013-11-06T18:47:30.251Z · LW(p) · GW(p)

No, what I gave is not an argument in favor of moral realism intended to convince the skeptic, it's merely a response to a common skeptical argument against moral realism. So the conclusion is not supposed to be "Therefore, my moral beliefs are objective and true." The conclusion is merely that the alleged distinction between moral beliefs and factual beliefs (or epistemic normative beliefs) that you were drawing (viz. that it's unclear how moral beliefs pay rent) doesn't actually hold up.

My position on moral realism is simply that belief in universally applicable (though not universally compelling) moral truths is a very central feature of my practical theory of the world, and certain moral inferences (i.e. inferences from descriptive facts to moral claims) are extremely intuitive to me, almost as intuitive as many inductive inferences. So I'm going to need to hear a powerful argument against moral realism to convince me of its falsehood, and I haven't yet heard one (and I have read quite a bit of the skeptical literature).

Replies from: Jack

↑ comment by Jack · 2013-11-06T20:17:09.881Z · LW(p) · GW(p)

But that's a universal defense of any free-floating belief.

For that matter: do you really think the degrees of justification for the rules of induction are similar to those of your moral beliefs?

Replies from: Vladimir_Nesov, pragmatist, TheAncientGeek

↑ comment by Vladimir_Nesov · 2013-11-06T20:40:04.010Z · LW(p) · GW(p)

It's not a defense of X, it's a refutation of an argument against X. It claims that the purported argument doesn't change the status of X, without asserting what that status is.

↑ comment by pragmatist · 2013-11-06T23:35:44.627Z · LW(p) · GW(p)

But that's a universal defense of any free-floating belief.

Well, no, because most beliefs don't have the properties I attributed to moral beliefs ("...central feature of my practical theory of the world... moral inferences are extremely intuitive to me..."), so I couldn't offer the same defense, at least not honestly. And again, I'm not trying to convince you to be a moral realist here, I'm explaining why I'm a moral realist, and why I think it's reasonable for me to be one.

Also, I'm not sure what you mean when you refer to my moral beliefs as "free-floating". If you mean they have no connection to my non-moral beliefs then the characterization is inapt. My moral beliefs are definitely shaped by my beliefs about what the world is like. I also believe moral truths supervene on non-moral truths. You couldn't have a universe where all the non-moral facts were the same as this one but the moral facts were different. So not free-floating, I think.

For that matter: do you really think the degrees of justification for the rules of induction are similar to those of your moral beliefs?

Not sure what you mean by "degree of justification" here.

↑ comment by TheAncientGeek · 2013-11-06T20:23:32.869Z · LW(p) · GW(p)

But that's a universal defense of any free-floating belief.

If you can pin down the fundamentals of rationality, I'd be glad to hear how.
Side conditions can be added, eg that intuitions need to be used for something else.

↑ comment by Protagoras · 2013-11-06T16:59:17.377Z · LW(p) · GW(p)

Well, with the addition that moral beliefs, like the others, seem to perform a useful function (though like the others this doesn't seem to be able to be turned into a justification without circularity).

↑ comment by TheAncientGeek · 2013-11-06T17:32:51.837Z · LW(p) · GW(p)

..or at least no worse off. But if you can solve the foundational problems of rationalism, I'm all ears.

Replies from: Jack

↑ comment by Jack · 2013-11-06T17:51:50.111Z · LW(p) · GW(p)

I don't see a good alternative to believing in modus ponens. Not believing that my moral values are also objective truths works just fine: and does so without the absurd free-floating beliefs and other metaphysical baggage.

But as it happens, I think the arguments we do have, for Bayesian epistemology, Occam-like priors, and induction are already much stronger than the arguments we have that anyone's moral beliefs are objective truths.

Replies from: Eugine_Nier, TheAncientGeek

↑ comment by Eugine_Nier · 2013-11-07T03:06:01.519Z · LW(p) · GW(p)

I think the arguments we do have, for Bayesian epistemology, Occam-like priors, and induction are already much stronger than the arguments we have that anyone's moral beliefs are objective truths.

Really? I'd love to see them. I suspect you're so used to using these things that you've forgotten how weak the arguments for them actually are.

↑ comment by TheAncientGeek · 2013-11-06T18:27:27.029Z · LW(p) · GW(p)

Not believing that my moral values are also objective truths works just fine:

Works at what?
That depends how hard you test it: Albert thinks Charlie has committed a heinous sin and should be severely punished, Brenda thinks Charlie has engaged in a harmless peccadillo and should be let go. What should happen to Charlie?

Replies from: Jack

↑ comment by Jack · 2013-11-06T20:18:53.691Z · LW(p) · GW(p)

Works at what?

The same way morality works for everyone else. I'm not biting any bullets.

That depends how hard you test it: Albert thinks Charlie has committed a heinous sin and should be severely punished, Brenda thinks Charlie has engaged in a harmless peccadillo and should be let go. What should happen to Charlie?

Objectively; there is no fact of the matter. Subjectively; you haven't given me any details about what Charlie did.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-06T20:34:20.282Z · LW(p) · GW(p)

The same way morality works for everyone else. I'm not biting any bullets.

One of the things it works for is assigning concrete, objective punishments and rewards. If there is no objective fact of the matter about moral claims, there is none about who gets punished or rewarded, yet these things still happen. And happen unjustifiably on your view. Your view doesn't work to rationally justify and explain actual practices.

Objectively; there is no fact of the matter. Subjectively; you haven't given me any details about what Charlie did.

Why would that help? You would have one opinion, someone else has another. But Charlie can't be in a quantum superposition of jailed and free.

Replies from: Jack

↑ comment by Jack · 2013-11-06T21:15:38.742Z · LW(p) · GW(p)

You would have one opinion, someone else has another. But Charlie can't be in a quantum superposition of jailed and free.

If whether Charlie is punished or not is entirely up to me then if I think he deserves to be punished I will do so; if I don't I will not do so. If I have to persuade someone else to punish him, then I will try. If the legal system is doing the punishing then I will advocate for laws that agree with my morals. And so on.

if there is no objective fact of the matter about moral claims, there is none about who gets punished or rewarded

No. There is no objective fact about who ought to get punished or rewarded. Obviously people do get punished and rewarded: and this happens according to the moral values of the people around them and the society they live in. In lots of societies there is near-universal acceptance of many moral judgments and these get codified into norms and laws and so on.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-07T09:16:20.544Z · LW(p) · GW(p)

If whether Charlie is punished or not is entirely up to me then if I think he deserves to be punished I will do so; if I don't I will not do so. If I have to persuade someone else to punish him, then I will try. If the legal system is doing the punishing then I will advocate for laws that agree with my morals. And so on.

And do you alone get a say (after all, you believe that what you think is right, is right) or does anybody else?

There is no objective fact about who ought to get punished or rewarded.

Exactly. My view "works" in that it can rationally justify punishment and reward.

↑ comment by TheAncientGeek · 2013-11-05T12:32:03.259Z · LW(p) · GW(p)

This is a helpful clarification. "No universally compelling arguments" is a poor standard for determining whether something is objective, as it is trivial to describe an agent that is compelled by no arguments.

It's a poor standard for some values of "universal". For others, it is about the only basis for objectivity there is.

But I think people here use it as tag for a different argument: that it's totally unclear how a Bayesian reasoner ought to update moral beliefs, and that such a thing doesn't even seem like a meaningful enterprise. They're 'beliefs' that don't pay rent.

They're beliefs that are difficult to fit within the framework of passively reflecting facts about the world. But fact-collection is not an end in itself. One eventually acts on them in order to get certain results. Morality is one set of rules for guiding action to get the required results. It is not the only one: law, decision theory, economics, etc. are also included. Morality may be more deniable for science types, since it seems religious and fuzzy and spooky, but it remains the case that action is the corollary of passive truth-collection.

comment by Nick_Tarleton · 2013-11-05T05:58:49.820Z · LW(p) · GW(p)

I agree with the message, but I'm not sure whether I think things with a binomial monkey prior, or an anti-inductive prior, or that don't implement (a dynamic like) modus ponens on some level even if they don't do anything interesting with verbalized logical propositions, deserve to be called "minds".

comment by Vaniver · 2013-11-05T20:26:11.998Z · LW(p) · GW(p)

General comment (which has shown up many times in the comments on this issue): taboo "mind", and this conversation seems clearer. It's obvious that not all physical processes are altered by logical arguments, and any 'mind' is going to be implemented as a physical process in a reductionist universe.

Specific comment: This old comment by PhilGoetz seems relevant, and seems similar to contemporary comments by TheAncientGeek. If you view 'mind' as a subset of 'optimization process', in that they try to squeeze the future into a particular region, then there are minds that are objectively better and worse at squeezing the future into the regions they want. And, in particular, there are optimization processes that persist shorter or longer than others, and if we exclude from our consideration short-lived or ineffective processes, then they are likely to buy conclusions we consider 'objective,' and it can be interesting to see what axioms or thought processes lead to which sorts of conclusions.

But it's not clear to me that they buy anything like the processes we use to decide which conclusions are 'objectively correct conclusions'.

Replies from: None, TheAncientGeek

↑ comment by [deleted] · 2013-11-10T14:02:16.884Z · LW(p) · GW(p)

Why should we view minds as a subset of optimization processes, rather than optimization processes as a set containing "intelligence", which is a particular feature of real minds? We tend to agree, for instance, that evolution is an optimization process, but to claim, "evolution has a mind", would rightfully be thrown out as nonsense.

EDIT: More like, real minds as we experience them, human and animal, definitely seem to have a remarkable amount of things in them that don't correspond to any kind of world-optimization at all. I think there's a great confusion between "mind" and "intelligence" here.

Replies from: Vaniver

↑ comment by Vaniver · 2013-11-10T19:31:10.229Z · LW(p) · GW(p)

Why should we view minds as a subset of optimization processes, rather than optimization processes as a set containing "intelligence", which is a particular feature of real minds?

Basically, I'm making the claim that it could be reasonable to see "optimization" as a precondition to consider something a 'mind' rather than a 'not-mind,' but not the only one, or it wouldn't be a subset. And here, really, what I mean is something like a closed control loop- it has inputs, it processes them, it has outputs dependent on the processed inputs, and when in a real environment it compresses the volume of potential future outcomes into a smaller, hopefully systematically different, volume.

We tend to agree, for instance, that evolution is an optimization process, but to claim, "evolution has a mind", would rightfully be thrown out as nonsense.

Right, but "X is a subset of Y" in no way implies "any Y is an X."

More like, real minds as we experience them, human and animal, definitely seem to have a remarkable amount of things in them that don't correspond to any kind of world-optimization at all.

I am not confident in my ability to declare what parts of the brain serve no optimization purpose. I should clarify that by 'optimization' here I do mean the definition "make things somewhat better" for an arbitrary 'better' (this is the future volume compression remarked on earlier) rather than the "choose the absolute best option."

Replies from: None

↑ comment by [deleted] · 2013-11-10T19:39:30.841Z · LW(p) · GW(p)

I am not confident in my ability to declare what parts of the brain serve no optimization purpose. I should clarify that by 'optimization' here I do mean the definition "make things somewhat better" for an arbitrary 'better' (this is the future volume compression remarked on earlier) rather than the "choose the absolute best option."

I think that for an arbitrary better, rather than a subjective better, this statement becomes tautological. You simply find the futures created by the system we're calling a "mind" and declare them High Utility Futures simply by virtue of the fact that the system brought them about.

(And admittedly, humans have been using cui bono conspiracy-reasoning without actually considering what other people really value for thousands of years now.)

If we want to speak non-tautologically, then I maintain my objection that very little in psychology or subjective experience indicates a belief that the mind as such or as a whole has an optimization function, rather than intelligence having an optimization function as a particularly high-level adaptation that steps in when my other available adaptations prove insufficient for execution in a given context.

↑ comment by TheAncientGeek · 2013-11-05T20:30:49.030Z · LW(p) · GW(p)

General comment (which has shown up many times in the comments on this issue): taboo "mind", and this conversation seems clearer. It's obvious that not all physical processes are altered by logical arguments, and any 'mind' is going to be implemented as a physical process in a reductionist universe.

Who said otherwise?

This old comment by PhilGoetz seems relevant

Thanks for that. I could add that self-improvement places further constraints.

comment by Kaj_Sotala · 2013-11-05T06:15:39.267Z · LW(p) · GW(p)

On this topic, I once wrote:

I used to be frustrated and annoyed by what I thought was short-sightedness and irrationality on the behalf of people. But as I've learned more of the science of rationality, I've become far more understanding.

People having strong opinions on things they know nothing about? It doesn't show that they're stupid. It just shows that on issues of low personal relevance, it's often more useful to have opinions that are chosen to align with the opinions of those you wish to associate yourself with, and that this has been true so often in our evolutionary history that we do it without conscious notice. What right do I have to be annoyed at people who just do what has been the reasonable course of action for who knows how long, and aren't even aware of the fact that they're doing it?

Or being frustrated about people not responding to rational argument? Words are just sounds in the air: arbitrarily-chosen signals that correspond to certain meanings. Some kinds of algorithms (in the brain or in general) will respond to some kinds of input, others to other kinds. Why should anyone expect a specific kind of word input to be capable of persuading everyone? They're just words, not magic spells.

Replies from: Watercressed

↑ comment by Watercressed · 2013-11-05T17:40:40.900Z · LW(p) · GW(p)

Why should anyone expect a specific kind of word input to be capable of persuading everyone? They're just words, not magic spells.

The specific word sequence is evidence for something or other. It's still unreasonable to expect people to respond to evidence in every domain, but many people do respond to words, and calling them just sounds in air doesn't capture the reasons they do so.

comment by TheAncientGeek · 2013-11-05T10:02:45.269Z · LW(p) · GW(p)

That was well expressed, in a way, but seems to me to miss the central point. People who think there are universally compelling arguments in science or maths don't mean the same thing by "universal". They don't think their universally compelling arguments would work on crazy people, and don't need to be told they wouldn't work on crazy AIs or pocket calculators either. They are just not including those in the set "universal".

ADDED:

It has been mooted that NUCA is intended as a counterblast to Why Can't an AGI Work Out Its Own Morality. It does work against a strong version of that argument: one that says any mind randomly selected from mindspace will be persuadable into morality, or be able to figure it out. Of course the proponents of WCAGIWOM (eg Wei Dai, Richard Loosemore) aren't asserting that. They are assuming that the AGIs in question will come out of a realistic research project, not a random dip into mindspace. They are assuming that the researchers aren't malicious, and that the project is reasonably successful. Those constraints impact the argument. A successful AGI would be an intelligent AGI would be a rational AI would be a persuadable AI.

Replies from: nshepperd, SaidAchmiz, byrnema

↑ comment by nshepperd · 2013-11-05T19:39:34.622Z · LW(p) · GW(p)

"Rational" is not "persuadable" where values are involved. This is because a goal is not an empirical proposition. No Universal Compelling Arguments, in the general form, does not apply here if we restrict our attention to rational minds. But the argument can be easily patched by observing that given a method for solving the epistemic question of "which actions cause which outcomes" you can write a (epistemically, instrumentally) rational agent that picks the action that results in any given outcome—and won't be persuaded by a human saying "don't do that", because being persuaded isn't an action that leads to the selected goal.

ETA: By the way, the main focus of mainstream AI research right now is exactly the problem of deriving an action that leads to a given outcome (called planning), and writing agents that autonomously execute the derived plan.
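The construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any real planning system is built: the four-state world in `TRANSITIONS`, the `plan` function, and the `GoalAgent` class with its `hear` method are all hypothetical names invented for this example. The point it shows is only that a planner which solves "which actions lead to which outcomes" need not contain any code path from heard arguments to its goal.

```python
from collections import deque

# Hypothetical toy world: states are integers, actions move between them.
TRANSITIONS = {
    0: {"left": 0, "right": 1},
    1: {"left": 0, "right": 2},
    2: {"left": 1, "right": 3},
    3: {"left": 2, "right": 3},
}


def plan(start, goal):
    """Breadth-first search for a shortest action sequence from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, nxt in TRANSITIONS[state].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None  # goal unreachable from start


class GoalAgent:
    """Picks whatever action the planner says leads toward a fixed goal."""

    def __init__(self, goal):
        self.goal = goal

    def hear(self, message):
        # Arguments are just inputs; nothing here rewrites self.goal.
        return None

    def act(self, state):
        steps = plan(state, self.goal)
        return steps[0] if steps else None


agent = GoalAgent(goal=3)
agent.hear("don't do that")
print(agent.act(0))  # right
```

Whether a realistic research project would produce something shaped like this is exactly what the thread disputes; the sketch only shows that the design is easy to write down.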

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T19:45:03.839Z · LW(p) · GW(p)

"Rational" is not "persuadable" where values are involved.

Rational is persuadable, because people who don't accept good arguments that don't suit them are not considered particularly rational. That is of course an appeal to how the word is generally used, not the LW idiolect.

You could perhaps build an AI that has the stubborn behaviour you describe (although value stability remains unsolved), but so what? There are all sorts of dangerous things you can build: the significant claim is what a non-malevolent real-world research project would come up with. In the world outside LW, general intelligence means general intelligence, not compulsively following fixed goals, and rationality includes persuadability, and "values" doesn't mean "unupdateable values".

Replies from: nshepperd

↑ comment by nshepperd · 2013-11-05T19:56:32.360Z · LW(p) · GW(p)

General intelligence means being able to operate autonomously in the real world, in non-"preprogrammed" situations. "Fixed goals" have nothing to do with it.

You said this:

A successful AGI would be an intelligent AGI would be a rational AI would be a persuadable AI.

The only criterion for success is instrumental rationality, which does not imply persuadability. You are equivocating on "rational". Either "rational" means "effective", or it means "like a human". You can't have both.

Also, the fact that you are (anthropomorphically) describing realistic AIs as "stubborn" and "compulsive" suggests to me that you would be better served to stop armchair theorizing and actually pick up an AI textbook. This is a serious suggestion.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T20:02:04.519Z · LW(p) · GW(p)

I am not equivocating. By "successful" I don't mean (or exclude) good-at-things, I mean it is actually artificial, general and intelligent.

"Strong AI is hypothetical artificial intelligence that matches or exceeds human intelligence — the intelligence of a machine that could successfully perform any intellectual task that a human being can.[1] It is a primary goal of artificial intelligence research and an important topic for science fiction writers and futurists. Strong AI is also referred to as "artificial general intelligence"[2] or as the ability to perform "general intelligent action."[3] ".

To be good-at-things an agent has to be at least instrumentally rational, but that is in no way a ceiling.

Either "rational" means "effective", or it means "like a human". You can't have both.

Since there are effective humans, I can.

Replies from: nshepperd

↑ comment by nshepperd · 2013-11-05T20:39:50.214Z · LW(p) · GW(p)

Either "rational" means "effective", or it means "like a human". You can't have both.

Since there are effective humans, I can.

Right, in exactly the same way that because there are square quadrilaterals I can prove that if something is a quadrilateral its area is exactly L^2 where L is the length of any of its sides.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T20:55:18.610Z · LW(p) · GW(p)

I can't define rational as "effective and human like"?

Replies from: nshepperd

↑ comment by nshepperd · 2013-11-05T21:35:44.229Z · LW(p) · GW(p)

You can, if you want to claim that the only likely result of AGI research is a humanlike AI. At which point I would point at actual AI research which doesn't work like that at all.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T22:05:29.341Z · LW(p) · GW(p)

Its failures are idiots, not evil genii.

↑ comment by Said Achmiz (SaidAchmiz) · 2013-11-05T18:51:44.353Z · LW(p) · GW(p)

So... what if you try to build a rational/persuadable AGI, but fail, because building an AGI is hard and complicated?

This idea that because AI researchers are aiming for the rational/persuadable chunk of mindspace, they will therefore of course hit their target, seems to me absurd on its face. The entire point is that we don't know exactly how to build an AGI with the precise properties we want it to have, and AGIs with properties different from the ones we want it to have will possibly kill us.

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T18:54:39.947Z · LW(p) · GW(p)

So... what if you try to build a rational/persuadable AGI, but fail, because building an AGI is hard and complicated?

What if you try to hardwire in friendliness and fail? Out of the two, the latter seems more brittle to me -- if it fails, it'll fail hard. A merely irrational AI would be about as dangerous as David Icke.

This idea that because AI researchers are aiming for the rational/persuadable chunk of mindspace, they will therefore of course hit their target, seems to me absurd on its face.

If you phrase it, as I didn't, in terms of necessity, yes. The actual point was that our probability of hitting a point in mindspace will be heavily weighted by what we are trying to do, and how we are doing it. An unweighted mindspace may be populated with many Lovecraftian horrors, but that theoretical possibility is no more significant than p-zombies.

AGIs with properties different from the ones we want it to have will possibly kill us.

Possibly, but with low probability, is a Pascal's Mugging. MIRI needs significant probability.

Replies from: SaidAchmiz

↑ comment by Said Achmiz (SaidAchmiz) · 2013-11-05T19:14:44.878Z · LW(p) · GW(p)

I see. Well, that reduces to the earlier argument, and I refer you to the mounds of stuff that Eliezer et al have written on this topic. (If you've read it and are unsatisfied, well, that is in any case a different topic.)

Replies from: TheAncientGeek

↑ comment by TheAncientGeek · 2013-11-05T19:16:57.965Z · LW(p) · GW(p)

I refer you to the many unanswered objections.

↑ comment by byrnema · 2013-11-05T12:53:57.066Z · LW(p) · GW(p)

Thanks to the original poster for the post, and the clarification about universal compelling arguments.

I agree with the parent comment, however, that I never matched the meaning that Chris Hallquist used to the phrase 'universally compelling argument'. Within the phrase 'universally compelling argument', I think most people package:

the claim has objective truth value, and
there is some epistemologically justified way of knowing the claim

Thus I think this means only a "logical" (rational) mind needs convincing -- one that would update on sound epistemology.

I would guess most people have a definition like this in mind. But these are just definitions, and now I know what you meant by math and science don't have universally compelling arguments. And I agree, using your definition.

Would you make the stronger argument that math and science aren't based on sound epistemology? (Or that there is no such thing as epistemologically justified ways of knowing?)

comment by lukeprog · 2013-11-14T10:07:58.881Z · LW(p) · GW(p)

Also see Cherniak, "Computational Complexity and the Universal Acceptance of Logic" (1984).

Replies from: somervta

↑ comment by somervta · 2013-11-14T12:25:59.092Z · LW(p) · GW(p)

Lenore Blum, Charles Chihara, William Craig, Daniel Dennett, Richard Karp, and Barry Stroud generously helped at various stages of the paper.

That's an interesting combination.

comment by [deleted] · 2013-11-10T15:11:46.725Z · LW(p) · GW(p)

Very good post. It is a very nice summation of the issues in the metaethics sequence.

I shall be linking people this in the future.

comment by mwengler · 2013-11-12T19:21:28.639Z · LW(p) · GW(p)

So you can have a mind that rejects modus ponens but does this matter? Is such a mind good for anything?

The "argument" that compels me about modus ponens and simple arithmetic is that they work with small real examples. You can implement super simple symbolic logic using pebbles and cups. You can prove modus ponens by truth tables, which could be implemented with pebbles and cups. So if arithmetic and simpler rules of logic map so clearly on to the real world, then these "truths" have an existence which is outside my own mind. The only human minds that could reject them would be bloody-minded and alien minds which truly reject them would be irrational.
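The truth-table check mentioned above is small enough to write out. The following is a minimal sketch; the function names are made up for this illustration. It enumerates every assignment of A and B and confirms that ((A → B) ∧ A) → B holds in all of them, with affirming the consequent included as a contrast case that fails. Note that the check itself runs on Python's own boolean logic, so it presupposes the rules it is checking, which is the circularity worry raised elsewhere in this thread.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def modus_ponens_is_tautology() -> bool:
    """Check that ((A -> B) and A) -> B holds in every row of the truth table."""
    return all(
        implies(implies(a, b) and a, b)
        for a, b in product([False, True], repeat=2)
    )


def affirming_consequent_is_tautology() -> bool:
    """Contrast case: ((A -> B) and B) -> A fails when A is false and B is true."""
    return all(
        implies(implies(a, b) and b, a)
        for a, b in product([False, True], repeat=2)
    )


print(modus_ponens_is_tautology())          # True
print(affirming_consequent_is_tautology())  # False
```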

Can you have a functioning mind which rejects lying is wrong or murder is wrong? People do it all the time and appear to function quite well. Moral truths don't have anything like the compelling-ness (compulsion?) of arithmetic and logic. My own intuition is that the only sense in which morality is objective is the sense in which it is descriptive, and what you are describing is not a state of what people do, but what they say. Most people say lying is wrong. This does not stop us from observing the overwhelming prevalence of lying in human society and human stories. Most people say murder is wrong. This does not stop us from observing that murder is rampant in time and space.

And there is a prescriptive-descriptive divide. If I accept that "murder is wrong" is an objective "truth" because most people say it, does this compel me to not murder? Not even close. I suppose it compels me to agree that I am doing "wrong" when I murder. Does that compel me to feel guilty or change my ways or accept my punishment? Hardly. If there is an objective-ness to morality it is a way more wimpy objectiveness than the objectiveness of modus ponens and arithmetic, which successfully compel an empty bucket to which have been added 2 then 3 apples to contain 5 apples.

comment by Nick_Tarleton · 2013-11-08T01:10:55.481Z · LW(p) · GW(p)

Where Recursive Justification Hits Bottom and its comments should be linked for their discussion of anti-inductive priors.

(Edit: Oh, this is where the first quote in the post came from.)

comment by ActionItem · 2013-11-05T05:27:55.169Z · LW(p) · GW(p)

Nitpicking: Modus ponens is not about "deriving". It's about B being true. (Your description matches the provability relation, the "|-" operator.) It's not clear how "fundamental" modus ponens is. You can make up new logics without that connective and other exotic connectives (such as those in modal logics). Then, you'd ask yourself what to do with them... Speaking of relevance, even the standard connectives are not very useful by themselves. We get a lot of power from non-logical axioms, with a lot of handwaving about how "intuitively true" they are to us humans. Except the Axiom of Choice. And some others. It's possible that one day an alien race may find our axioms "just plain weird".

The never-learn-anything example that you quoted looks a bit uselessly true to me. The fact that one can have as prior knowledge the fact that the monkey generates perfect 1/4 randomness is utopia to begin with, so then complaining about not being able to discern anything more is like having solved the halting problem, you realize you don't learn anything more about computer programs by just running them.
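The never-learn-anything behaviour can be checked by brute force on a small case. This is a sketch under assumptions: a four-symbol alphabet is chosen to match the 1/4 figure above, and the function names and the add-one (Laplace-style) contrast rule are introduced here for illustration rather than taken from the post. With every possible string treated as equally likely, the predicted probability of the next symbol stays at 1/4 no matter what has been observed, whereas the smoothed frequency estimate moves with the data.

```python
from fractions import Fraction
from itertools import product

ALPHABET = "ABCD"  # four keys, matching the 1/4 figure in the comment


def predictive_uniform_over_sequences(observed: str, symbol: str, total_len: int) -> Fraction:
    """P(next == symbol | observed) when every length-total_len string is equally likely.

    total_len must be greater than len(observed).
    """
    consistent = [
        s for s in product(ALPHABET, repeat=total_len)
        if "".join(s[:len(observed)]) == observed
    ]
    hits = sum(1 for s in consistent if s[len(observed)] == symbol)
    return Fraction(hits, len(consistent))


def predictive_laplace(observed: str, symbol: str) -> Fraction:
    """Add-one smoothed frequency estimate, shown only for contrast."""
    return Fraction(observed.count(symbol) + 1, len(observed) + len(ALPHABET))


print(predictive_uniform_over_sequences("AAAA", "A", 6))  # 1/4
print(predictive_laplace("AAAA", "A"))                    # 5/8
```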

I'm not well versed on YEC arguments, but I believe people's frustrations with them is not due to the lack of universally compelling arguments against them. Probably they're already guilty of plain old logical inconsistency (i.e. there's a valid chain of reasoning that shows that if they doubt the scientific estimates, then they should turn off their television right now or something similar), or they possess some kind of "undefeatable" hypothesis as prior knowledge that allows for everything to look billions of years old despite being very young. (If so, they should be very much bothered by having this type of utopic prior knowledge.)

Replies from: ChrisHallquist

↑ comment by ChrisHallquist · 2013-11-05T16:54:21.316Z · LW(p) · GW(p)

I'm not well versed on YEC arguments, but I believe people's frustrations with them is not due to the lack of universally compelling arguments against them. Probably they're already guilty of plain old logical inconsistency (i.e. there's a valid chain of reasoning that shows that if they doubt the scientific estimates, then they should turn off their television right now or something similar), or they possess some kind of "undefeatable" hypothesis as prior knowledge that allows for everything to look billions of years old despite being very young. (If so, they should be very much bothered by having this type of utopic prior knowledge.)

Well, if they're logically inconsistent, but nothing you can say to them will convince them to give up YECism in order to stop being logically inconsistent... then that particular chain of reasoning, at least, isn't universally compelling.

Or, if they have an undefeatable hypothesis, if that's literally true... doesn't that mean no argument is going to be compelling to them?

Maybe you're thinking "compelling" means what ought to be compelling, rather than what actually convinces people, when the latter meaning is how Eliezer and I are using it?

Replies from: ActionItem

↑ comment by ActionItem · 2013-11-06T02:57:45.680Z · LW(p) · GW(p)

I am at a loss about the true meaning of a "universally compelling argument", but from Eliezer's original post and from references to things such as modus ponens itself, I understood it to mean something that is able to overcome even seemingly axiomatic differences between two (otherwise rational) agents. In this scenario, an agent may accept modus ponens, but if they do, they're at least required to use it consistently. For instance, a mathematician of the constructivist persuasion denies the law of the excluded middle, but if he's using it in a proof, classical mathematicians have the right to call him out.

Similarly, YEC's are not inconsistent in their daily lives, nor do they have any undefeatable hypotheses about barbeques or music education: they're being inconsistent only on a select set of topics. At this point the brick wall we're hitting is not a fundamental difference in logic or priors; we're in the domain of human psychology.

Arguments that "actually convince (all) people" are very limited and context sensitive because we're not 100% rational.

comment by Gunnar_Zarncke · 2013-11-05T22:06:56.034Z · LW(p) · GW(p)

This is a bit long for the key point: 'Don't be bothered by a lack of universally compelling arguments against because human mind space contains enough minds that will not accept modus ponens or even less.' which comes at the end. You risk TLDR if you don't put a summary at the top.

Otherwise I think the title doesn't really fit, or else I possibly just don't see how it derives the relation - rather the opposite: 'No Universally Compelling Arguments against Crackpots'.

Replies from: ChrisHallquist

↑ comment by ChrisHallquist · 2013-11-05T22:12:42.708Z · LW(p) · GW(p)

Is it just me, or has the LW community gone overboard with the "include a TLDR" advice? There's something to be said for clear thesis statements, but there are other perfectly legitimate ways to structure an article, including "begin with an example that hooks your reader's interest" (which is also standard advice, btw), as well as "[here is the thing I am responding to] [response]" (what I used in this article). If the sequences were written today, I suspect the comments would be full of TLDR complaints.

Replies from: SaidAchmiz, Gunnar_Zarncke

↑ comment by Said Achmiz (SaidAchmiz) · 2013-11-06T02:20:28.163Z · LW(p) · GW(p)

If the sequences were written today, I suspect the comments would be full of TLDR complaints.

Eliezer is very good at writing engrossing posts, which are as entertaining to read as some of the best novels. His posts are in no need of TLDR. The only other common posters here who seem to have that skill are Yvain and (sometimes almost) Alicorn. For pretty much everyone else, TLDR helps.

Replies from: ChrisHallquist

↑ comment by ChrisHallquist · 2013-11-07T02:30:27.686Z · LW(p) · GW(p)

Yvain is an amazing writer, one of the very few people for whom I will read everything they write just because of who the author is. (In general, I read only a fraction of the stories in my feed reader.)

I wouldn't put Eliezer in that category, though. I started reading Overcoming Bias sometime around the zombie sequence, but I didn't read every post at first. I'm pretty sure I skipped almost the entire quantum mechanics sequence, because it seemed too dense. I only went through and read the entirety of the sequences more recently because (1) I was really interested in the subjects covered and (2) I wanted to be on the same page as other LW readers.

Part of why "TLDR" annoys me, I think, is that often what it really signals is lack of personal interest in the post. But of course not everyone finds every post equally interesting. If someone read the beginning of this post and thought, "oh, this is about Eliezer's metaethics. Eh, whatever, don't care enough about the topic to read it," well good for them! I don't expect every LW reader to want to read every post I write.

Replies from: SaidAchmiz

↑ comment by Said Achmiz (SaidAchmiz) · 2013-11-07T06:21:57.083Z · LW(p) · GW(p)

You're right, the quantum mechanics sequence was pretty dense. Pretty much all of the other stuff, though? I mean, I read through the entirety of the OB archive up to 2009, and then all of Eliezer's posts from then until 2011 or so, and not once did I ever find myself going "nope, too boring, don't care", the way I do with most other posts on here. (Sorry, other posters.) Some of them (a small percentage) took a bit of effort to get through, but I was willing to put in that effort because the material was so damn interesting and presented in such a way that it just absolutely fascinated me. (This includes the QM sequence, note!)

But different people may have different preferences w.r.t. writing styles. That aside, you should not confuse interest (or lack thereof) in a topic with interest in a post. Plenty of posts here are written about topics I care about, but the way they're written makes me close the page and never come back. I just don't have infinite time to read things, and so I will prioritize those things that are written clearly and engagingly, both because it gives me enjoyment to read such, and because good and clear writing strongly signals that the author really has a strong grasp of the concepts, strong enough to teach me new things and new views, to make things click in my head. That's what I got from many, many posts in the Sequences.

↑ comment by Gunnar_Zarncke · 2013-11-05T23:58:10.214Z · LW(p) · GW(p)

Maybe I overgeneralized this TLDR pattern but then it is no bad advice and a) I indeed find it lacking a thread and b) I provided a summary which you might or might not use.
