Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It)

post by Paul Crowley (ciphergoth) · 2010-10-30T09:31:29.456Z · LW · GW · Legacy · 442 comments

[...] SIAI's Scary Idea goes way beyond the mere statement that there are risks as well as benefits associated with advanced AGI, and that AGI is a potential existential risk.

[...] Although an intense interest in rationalism is one of the hallmarks of the SIAI community, still I have not yet seen a clear logical argument for the Scary Idea laid out anywhere. (If I'm wrong, please send me the link, and I'll revise this post accordingly. Be aware that I've already at least skimmed everything Eliezer Yudkowsky has written on related topics.)

So if one wants a clear argument for the Scary Idea, one basically has to construct it oneself.

[...] If you put the above points all together, you come up with a heuristic argument for the Scary Idea. Roughly, the argument goes something like: If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probably this will lead to a superhuman AGI system with an architecture drawn from the vast majority of mind-architectures that are not sufficiently harmonious with the complex, fragile human value system to make humans happy and keep humans around.

The line of argument makes sense, if you accept the premises.

But, I don't.

Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It), October 29 2010. Thanks to XiXiDu for the pointer.


comment by Emile · 2010-10-30T17:23:31.910Z · LW(p) · GW(p)

I really liked this comment by FrankAdamek:

With regards to teaching an AI to care: what you can teach a mind depends on the mind. The best examples come from human beings: for hundreds of years many (though not all) parents have taught their children that it is wrong to have sex before marriage, a precept that many people break even when they think they shouldn't and feel bad about it. And that's with our built-in desires for social acceptance and hardware for propositional morality. For another example, you can't train tigers to care about their handlers. No matter how much time you spend with them and care for them, they sometimes bite off arms just because they are hungry. I understand most big cats are like this.

It's quite true that nobody plans to build a system with no concern for human life, but it's also true that many people assume Friendliness is easy.

comment by blogospheroid · 2010-11-02T05:37:20.006Z · LW(p) · GW(p)

I tend to think differently on this one.

Wherever I turn my head around in this world, I see lost causes everywhere. I see Goodhart's law and Campbell's law on the loose everywhere. I see insane optimizers everywhere. Political parties that concentrate more on show, pomp and campaign funds than on actual issues. Corporates that seek money to the exclusion of actual creation of value. Governments that seek employment and GDP growth even when those are supported by artificial stimuli and not sustainable patterns of production and trade.

One might argue that none of these systems are actually as intelligent as a well educated human at any given moment in time. But that's the point, isn't it? You're unable to stop sub-human optimizers, how are you going to curb a near human or a super human one?

For me, the scary idea is not so much of an idea as it is an extension of something that is already happening in this world.

comment by MichaelVassar · 2010-11-03T18:18:52.052Z · LW(p) · GW(p)

Would you mind writing that up as an essay, for use as a LW post, or ideally, as a piece of SIAI literature?

comment by blogospheroid · 2010-11-12T13:24:49.150Z · LW(p) · GW(p)

Dear Michael,

Without wanting to weasel out of your request, I honestly believe that Eliezer's Lost purposes post says the point I want to make very well, much better than I can hope to phrase it without putting in some hard work. The only new point I probably made is that these forces are already on the loose and it is difficult to curb them.

However, I will make an effort this weekend and see what I can come up with.

comment by MichaelVassar · 2010-11-13T03:16:45.921Z · LW(p) · GW(p)

Thanks. I appreciate the effort.

comment by arundelo · 2010-11-03T22:19:23.969Z · LW(p) · GW(p)

lost causes

I bet you meant lost purposes.

comment by XiXiDu · 2010-11-03T18:45:48.412Z · LW(p) · GW(p)

Upvoted. Although I believe that one could also see our cultural and political systems as superhuman collective entities undergoing an evolutionary arms race featuring an anthropocentrically weighted utility maximizing selection pressure. There is some evidence for this too: to put it bluntly, we are better off than we were 100 years ago?

comment by Liron · 2010-10-30T19:04:50.394Z · LW(p) · GW(p)

A particularly troubling quote from the post:

I think the relation between breadth of intelligence and depth of empathy is a subtle issue which none of us fully understands (yet). It's possible that with sufficient real-world intelligence tends to come a sense of connectedness with the universe that militates against squashing other sentiences. But I'm not terribly certain of this, any more than I'm terribly certain of its opposite.

The obvious truth is that mind-design space contains every combination of intelligence and empathy.

comment by mwaser · 2010-10-30T19:18:19.493Z · LW(p) · GW(p)

I don't find that "truth" either obvious or true.

Would you say that "The obvious truth is that mind-design space contains every combination of intelligence and rationality"? How about "The obvious truth is that mind-design space contains every combination of intelligence and effectiveness"?

One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.

comment by pjeby · 2010-10-30T20:04:35.034Z · LW(p) · GW(p)

One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.

Two questions:

1) The consequences for whom?

2) How much empathy do you have for, oh, say, an E. coli bacterium?

Connecting these two questions is left as an exercise for the reader. ;-)

comment by mwaser · 2010-11-01T13:27:06.182Z · LW(p) · GW(p)

1) the AGI

2) zero

I can't play positive-sum games with an E. coli. The AGI is missing out on tremendous opportunities if it bypasses positive-sum games of potentially infinite length and utility for a short-term finite gain. This is called time-discounting. In nature, there is a very high correlation (to the point that many call it causation) between increasing intelligence and time-discounting.

comment by pjeby · 2010-11-01T18:50:52.543Z · LW(p) · GW(p)

1) the AGI

Please give an example of why the AGI should co-operate with something that cannot do anything the AGI itself cannot.

2) zero

Right. E. coli don't offer us anything we can't do for ourselves, that we can't just whip up a batch of E. coli for on demand.

The AGI is missing out on tremendous opportunities if it bypasses positive-sum games of potentially infinite length and utility for a short-term finite gain

If I'm a god, what would I need a human for? If I need humans, I can just make some. Better still, I could replace them with something more efficient that doesn't complain or rebel.

The fundamental flaw in your reasoning here is that you keep trying to construct paths through probability space that could support your hypothesis, but ONLY if you had presented some evidence for singling out that hypothesis in the first place!

It's like you're a murder investigator opening up the phonebook to a random place and saying, "well, we haven't ruled out the possibility that this guy did it", and when people quite reasonably point out that there is no connection between that random guy and the murder, you reply, "yeah, but I just called this guy, and he has no alibi." (That is, you're ignoring the fact that a huge number of people in that phonebook will also have no alibi, so your "evidence" isn't actually increasing the expected probability that that guy did it.)

And that's why you're getting so many downvotes: in LW terms, you are failing basic reasoning.

But that is not a shameful thing: any normal human being fails basic reasoning, by default, in exactly the same way. Our brains simply aren't built to do reasoning: they're built to argue, by finding the most persuasive evidence that supports our pre-existing beliefs and hypotheses, rather than trying to find out what is true.

When I first got here, I argued for some of my pet hypotheses in the exact same way, although I was righteously certain that I was not doing such a thing. It took a long time before I really "got" Bayesian reasoning sufficiently to understand what I was doing wrong, and before that, I couldn't have said here what you were doing wrong either.
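The phonebook analogy above can be made concrete with Bayes' rule. The following is only a minimal sketch: the phonebook size and the share of innocent people without an alibi are hypothetical numbers chosen for illustration, and `posterior` is an illustrative helper, not anything from the thread. It shows that evidence which most innocent people would also produce has a small likelihood ratio, so the probability for a randomly picked name stays tiny.

```python
def posterior(prior: float, p_evidence_if_guilty: float, p_evidence_if_innocent: float) -> float:
    """Bayes' rule for a binary hypothesis (guilty vs. innocent)."""
    numerator = prior * p_evidence_if_guilty
    denominator = numerator + (1.0 - prior) * p_evidence_if_innocent
    return numerator / denominator


# Hypothetical numbers, assumed only for illustration.
PHONEBOOK_SIZE = 100_000
prior = 1.0 / PHONEBOOK_SIZE      # one random name out of the phonebook
p_no_alibi_if_guilty = 1.0        # assume the murderer has no genuine alibi
p_no_alibi_if_innocent = 0.5      # assume half of innocent people lack one too

updated = posterior(prior, p_no_alibi_if_guilty, p_no_alibi_if_innocent)
likelihood_ratio = p_no_alibi_if_guilty / p_no_alibi_if_innocent

print(f"prior:            {prior:.7f}")
print(f"likelihood ratio: {likelihood_ratio:.1f}")
print(f"posterior:        {updated:.7f}")
```

Under these assumed numbers the "no alibi" observation can at most double a one-in-100,000 prior. Singling out the hypothesis in the first place requires evidence with a far larger likelihood ratio.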

comment by mwaser · 2010-11-01T20:47:20.447Z · LW(p) · GW(p)

Please give an example of why the AGI should co-operate with something that cannot do anything the AGI itself cannot.

If the overall price (including time, gaining requisite knowledge, etc) of co-operation is less expensive than the AGI doing it itself, the AGI should co-operate. No?

If I'm a god, what would I need a human for? If I need humans, I can just make some. Better still, I could replace them with something more efficient that doesn't complain or rebel.

How expensive is making humans vs. their utility? Is there something markedly more efficient that won't complain or rebel if you treat it poorly? How efficient/useful could a human be if you treated it well?

There are also useful pseudo-moral arguments of the type of pre-committing to a benevolent strategy so that others (bigger than you) will be benevolent to you.

The fundamental flaw in your reasoning here is that you keep trying to construct paths through probability space that could support your hypothesis, but ONLY if you had presented some evidence for singling out that hypothesis in the first place!

Agreed. So your argument is that I'm not adequately presenting evidence for singling out that hypothesis. That's a useful criticism. Thanks!

And that's why you're getting so many downvotes: in LW terms, you are failing basic reasoning.

I disagree. I believe that I am failing to successfully communicate my reasoning. I understand your arguments perfectly well (and appreciate them) and agree with them if that is what I was trying to do. Since they are not what I'm trying to do -- although they apparently are what I AM doing -- I'm assuming (yes, ASS-U-ME) that I'm failing elsewhere and am currently placing the blame on my communication skills.

Are you willing to accept that premise and see if you can draw any helpful conclusions or give any helpful advice?

And, once again, thank you for already taking the time to give such a detailed thoughtful response.

comment by wedrifid · 2010-11-02T01:04:00.995Z · LW(p) · GW(p)

How expensive is making humans vs. their utility? Is there something markedly more efficient that won't complain or rebel if you treat it poorly?

Yes. The nanobots that you could build out of my dismantled raw materials. There is something humbling to realise that my complete submission and wholehearted support is worth less to a non-friendly AI than my spleen.

comment by gwern · 2010-11-02T01:59:29.648Z · LW(p) · GW(p)

Oh, worth much much less than your spleen. It might be a fun exercise to take the numbers from Seth Lloyd and figure out how many molecules (optimistically, the volume of a cell or two) your brain is worth.

comment by pjeby · 2010-11-02T00:44:17.284Z · LW(p) · GW(p)

How expensive is making humans vs. their utility?

Utility for what purpose? If we're talking about say, a paperclip maximizer, then its utility for human beings will be measured in paperclip production.

Is there something markedly more efficient that won't complain or rebel if you treat it poorly? How efficient/useful could a human be if you treated it well?

It won't be as efficient as specialized paperclip-production machines will, for the production of paperclips.

Are you willing to accept that premise and see if you can draw any helpful conclusions or give any helpful advice?

Yes, but you're unlikely to be happy with it: read the sequences, or at least the parts of them that deal with reasoning, the use of words, and inferential distances. (For now at least, you can skip the quantum mechanics, AI, and Fun Theory parts.)

At minimum, this will help you understand LW's standards for basic reasoning, and how much higher a bar they are than what constitutes "reasoning" pretty much anywhere else.

If you're reasoning as well as you say, then the material will be a breeze, and you'll be able to make your arguments in terms that the rest of us can understand. Or, if you're not, then you'll probably learn that along the way.

comment by bentarm · 2010-11-02T12:17:22.354Z · LW(p) · GW(p)

Please give an example of why the AGI should co-operate with something that cannot do anything the AGI itself cannot.

A sufficiently clever AI should understand Comparative Advantage

comment by Vladimir_Nesov · 2010-11-02T12:56:36.816Z · LW(p) · GW(p)

Comparative advantage explains how to make use of inefficient agents, so that ignoring them is a worse option. But if you can convert them into something else, you are no longer comparing the gain from trading with them to indifference of ignoring them, you are comparing the gain from trading with them to the gain from converting them. And if they can be cheaply converted into something much more efficient than they are, converting them is the winning move. This is a move largely not available to the present society, hence its absence is a reasonable assumption for now but one that breaks when you consider indifferent smart AGI.
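The comparison described above (ignore vs. trade vs. convert) can be sketched with a toy two-good Ricardian model. Everything numeric here is an assumption made for illustration: the production rates, the trade price, the "needs A and B in equal amounts" objective, and the number of extra AI-hours gained by conversion. The function names are likewise hypothetical.

```python
def ai_utility(a_units: float, b_units: float) -> float:
    """Assumed toy objective: the AI needs goods A and B in equal amounts."""
    return min(a_units, b_units)


def best_split(hours: float, rate_a: float, rate_b: float,
               extra_a: float = 0.0, extra_b: float = 0.0) -> float:
    """Utility when the AI splits its hours so that A and B come out equal.

    extra_a / extra_b are net transfers from trade (negative means paid out).
    """
    # Solve rate_a * x + extra_a == rate_b * (hours - x) + extra_b for x.
    x = (rate_b * hours + extra_b - extra_a) / (rate_a + rate_b)
    x = max(0.0, min(hours, x))
    return ai_utility(rate_a * x + extra_a, rate_b * (hours - x) + extra_b)


AI_HOURS, AI_RATE_A, AI_RATE_B = 10.0, 10.0, 10.0   # AI is better at both goods
HUMAN_HOURS, HUMAN_RATE_B = 10.0, 2.0               # human specializes in B
PRICE_A_PER_B = 0.75        # between the human's cost (0.5) and the AI's (1.0)
CONVERTED_AI_HOURS = 5.0    # assumed yield from converting the human's resources

# Option 1: ignore the human entirely.
u_ignore = best_split(AI_HOURS, AI_RATE_A, AI_RATE_B)

# Option 2: trade. The human sells all B output; the AI pays in A.
b_bought = HUMAN_HOURS * HUMAN_RATE_B
u_trade = best_split(AI_HOURS, AI_RATE_A, AI_RATE_B,
                     extra_a=-PRICE_A_PER_B * b_bought, extra_b=b_bought)

# Option 3: convert the human's resources into extra AI capacity.
u_convert = best_split(AI_HOURS + CONVERTED_AI_HOURS, AI_RATE_A, AI_RATE_B)

print(u_ignore, u_trade, u_convert)
```

In this sketch trade beats ignoring, as comparative advantage predicts, but conversion beats trade as soon as the converted resources yield enough extra capacity. How large that yield would actually be is exactly the open assumption.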

comment by JGWeissman · 2010-11-02T16:59:59.149Z · LW(p) · GW(p)

The law of comparative advantage relies on some implicit assumptions that are not likely to hold between a superintelligence and humans:

The transaction costs must be small enough not to negate the gains from trade. A superintelligence may require more resources to issue a trade request to slow-thinking humans and to receive the result, while possibly letting processes idle while waiting for the result, than to just do it itself.

Your trading partner must not have the option of building a more desirable trading partner out of your component parts. A superintelligence could get more productivity out of atoms arranged as an extension of itself than out of atoms arranged as humans. (ETA: See Nesov's comment.)

comment by pjeby · 2010-11-02T16:08:16.511Z · LW(p) · GW(p)

A sufficiently clever AI should understand Comparative Advantage

And a sufficiently clever human should realize that clever humans can and do routinely increase the efficiencies of their industry enough to shift the comparative advantage.

It really doesn't take that much human-level intelligence to change how things are done -- all it takes is a lack of attachment to the current ways.

And that's perhaps the biggest "natural resource" an AI has: the lack of status quo bias.

comment by Vladimir_Nesov · 2010-11-02T16:26:10.776Z · LW(p) · GW(p)

And a sufficiently clever human should realize that clever humans can and do routinely increase the efficiencies of their industry enough to shift the comparative advantage.

I don't understand what you are arguing for. That people become better off doing something different doesn't necessarily imply that they become obsolete, or even that they can't continue doing the less-efficient thing.

comment by bentarm · 2010-11-02T16:51:31.869Z · LW(p) · GW(p)

And a sufficiently clever human should realize that clever humans can and do routinely increase the efficiencies of their industry enough to shift the comparative advantage.

I'm not sure I understand what "shift the comparative advantage" could mean, and I have no idea why this is supposed to be a response to my point.

Maybe I didn't make my point clearly enough. My contention is that even if an AI is better at absolutely everything than a human being, it could still be better off trading with human beings for certain goods, for the simple reason that it can't do everything, and in such a scenario both human beings and the AI would get gains from trade.

As Nesov points out, if the AI has the option of, say, converting human beings into computational substrate and using them to simulate new versions of itself, then this ceases to be relevant.

comment by jimrandomh · 2010-10-30T19:19:41.772Z · LW(p) · GW(p)

One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.

Human psychopaths are a counterexample to this claim, and they seem to be doing alright in spite of active efforts by the rest of humanity to detect and eliminate them.

comment by wedrifid · 2010-11-01T01:12:36.601Z · LW(p) · GW(p)

Human psychopaths are a counterexample to this claim, and they seem to be doing alright in spite of active efforts by the rest of humanity to detect and eliminate them.

'Detect and eliminate' or 'detect and affiliate with the most effective ones'. One or the other. ;)

comment by rwallace · 2010-11-01T15:03:12.132Z · LW(p) · GW(p)

There are no efforts by the rest of humanity to detect and eliminate the sort of psychopaths who understand it's in their own interests to cooperate with society.

The sort of psychopaths who fail to understand that, and act accordingly, typically end up doing very badly.

comment by Eneasz · 2010-11-01T22:13:39.040Z · LW(p) · GW(p)

Why all the focus on psychopaths? It could be said that certain forms of autism are equally empathy-blinded, and yet people along that portion of the spectrum are often hugely helpful to the human race, and get along just fine with the more neurotypical.

comment by mwaser · 2010-11-01T13:55:40.899Z · LW(p) · GW(p)

No. There are two bad assumptions in your counterexample.

They are:

  1. Human psychopaths are above the certain point of intelligence that I was talking about.

  2. Human psychopaths are sufficiently long-lived for the consequences to be severe enough.

Hmmmm. #2 says that I probably didn't make clear enough the importance of the length of interaction.

You also appear to have the assumption that my argument is that the AGI fears detection of its unfriendly behavior and any consequences that humanity can apply. Humanity CANNOT apply sufficient negative consequences to a sufficiently powerful AGI. The severe consequences are all missed opportunity costs which means that the AGI is thereby sub-optimal and thereby less intelligent than is possible.

comment by Kingreaper · 2010-11-02T09:03:16.802Z · LW(p) · GW(p)

What sort of opportunity costs?

The AI can simulate humans if it needs them, for a lower energy cost than keeping the human race alive.

So, why should it keep the human race alive?

comment by udo · 2010-10-31T12:00:58.721Z · LW(p) · GW(p)

The underlying disorders of what is commonly referred to as psychopathy are indeed detectable. I also find it comforting that they are in fact disorders and that being evil in this fashion is not an attribute of an otherwise high-functioning mind. Psychopaths can be high-functioning in some areas, but a short interaction with them almost always makes it clear that there is something wrong.

comment by Kaj_Sotala · 2010-11-01T09:28:13.996Z · LW(p) · GW(p)

also find it comforting that they are in fact disorders

Homosexuality was also a disorder once. Defining something as a sickness or disorder is a matter of politics as much as anything else.

comment by XiXiDu · 2010-11-01T10:07:47.231Z · LW(p) · GW(p)

Cat burning was also a form of entertainment once. Defining something as fun or entertainment is a matter of politics as much as anything else. The same goes for friendliness. I fear that once we pinpoint it, it'll be outdated.

comment by NancyLebovitz · 2010-10-31T15:41:12.740Z · LW(p) · GW(p)

What do you mean by psychopathy?

At least one sort of no-empathy person is unusually good at manipulating most people.

comment by NihilCredo · 2010-10-31T15:51:45.486Z · LW(p) · GW(p)

Everybody who is known to be a psychopath is a bad psychopath, by definition; a skilled psychopath is one who will not let people figure out that he's a psychopath.

Of course, this means that the existence of a sufficiently skilled psychopath is, in everyday practice, unprovable and unfalsifiable (at least to the degree that we cannot tell the difference between a good actor and someone genuinely feeling empathy; I suppose you might figure out something by measuring people's brain activity while they watch a torture scene).

comment by wedrifid · 2010-11-01T01:25:11.345Z · LW(p) · GW(p)

I suppose you might figure out something by measuring people's brain activity while they watch a torture scene

Even then it is far from definitive. Experienced doctors, for example, lose a lot of the ability to feel certain kinds of physical empathy - their brains will look closer to a good actor's brain than that of a naive individual exposed to the same stimulus. That's just practical adaptation and good for patient and practitioner alike.

comment by NancyLebovitz · 2010-11-01T10:52:39.619Z · LW(p) · GW(p)

Considering the number of horror stories I've heard about doctors who just don't pay attention, I'm not sure you're right that doctors acting their empathy is good for patients.

Cite? I'm curious about where and when that study was done.

comment by wedrifid · 2010-11-01T22:38:14.753Z · LW(p) · GW(p)

Cite? I'm curious about where and when that study was done.

Don't know. Never saw it first hand - I heard it from a doctor.

comment by NancyLebovitz · 2010-11-02T01:31:16.403Z · LW(p) · GW(p)

Thanks for your reply, but I think I'm going to push for some community norms for sourcing information from studies, ranging from read the whole thing carefully to heard about it from someone.

comment by wedrifid · 2010-11-02T07:04:49.348Z · LW(p) · GW(p)

Only on lesswrong - we look down our noses at people who take the word of medical specialists.

comment by NancyLebovitz · 2010-11-02T09:15:49.028Z · LW(p) · GW(p)

That doctor almost certainly wasn't speaking out of his specialist knowledge.

comment by wedrifid · 2010-11-02T10:51:44.058Z · LW(p) · GW(p)

You don't have enough information to arrive at that level of certainty. He was not, for example, a general practitioner and I was not a client of his. I was actually working with him in medical education at the time. Come to think of it, bizarrely enough and by pure happenstance that does put the subject into the realm of his specialist knowledge.

I don't present that as a reason to be persuaded - I actually think not taking official status, particularly medicine related official status, seriously is a good thing. It is just a reply to your presumption.

While I don't expect you to take my (or his) word for anything I also wouldn't expect you to need to. This is exactly the finding I would expect based off general knowledge of human behavior. When people are constantly exposed to stimulus that is emotionally laden they will tend to become desensitized to it. There are whole schools of cognitive therapy based on this fact. If someone has taken on the role of a torturer then their emotional response to witnessing torture will be drastically altered. Either it will undergo extinction or the individual will be crippled with PTSD. This can be expected to apply even more when they fully identify with their role due to, for example, the hazing processes involved in joining military and paramilitary organisations.

comment by NancyLebovitz · 2010-11-02T13:37:41.425Z · LW(p) · GW(p)

Part of what seemed iffy was the claim that it was good for both the patients and the practitioner, when it was correlated (from what you said) with experience, with no mention of quality of care.

When someone says their source is "a doctor", what are the odds that it's a researcher specializing in that particular area? Especially when the information is something which could as easily be a fluffy popular report as something clearly related to a specialty?

Also, I had a prior from Bernard Siegal which is also intuitively plausible-- that doctors who are emotionally numb around their patients are more likely to burn out. This was likely to have been based on anecdote, but not a crazy hypothesis.

comment by Craig_Heldreth · 2010-11-02T18:09:59.541Z · LW(p) · GW(p)

I believe you have a sign error in your last paragraph. Doctors who do not emotionally numb themselves are the ones considered at risk to burn out. I have a friend from one of my T groups who is a physician at M. D. Anderson Cancer Center and she is now working in intensive care where the people are really messed up and people die all the time. She believes genuine loving care for her patients is her duty and makes her a better physician; she was trained to be emotionally numb and she felt like it was an epiphany for herself to rebel against this after a couple of years in her current assignment.

I have not asked her if her attitude is obvious to her supervisors. My guess is that it probably is not; I do not think she is secretive about it (although she probably does not go around evangelizing to the other doctors much) but I would think that the other doctors are too preoccupied to observe it.

In the book Consciousness and Healing Larry Dossey M.D. also explicitly discusses behavioral norms of professional physicians being to minimize emotional involvement and he has arguments that this is a bad practice. (That book is not an example of good rational thinking from cover to cover.)

comment by wedrifid · 2010-11-02T18:29:11.993Z · LW(p) · GW(p)

Part of what seemed iffy was the claim that it was good for both the patients and the practitioner, when it was correlated (from what you said) with experience, with no mention of quality of care.

Desensitization to powerful negative emotional reactions is not the same thing as not caring and not building a personal relationship.

Most of our default emotional reactions when we are in close contact with others who have physical or emotional injuries aren't exactly optimal for the purpose of providing assistance. Particularly when what the doctor needs to do will cause more pain.

comment by wedrifid · 2010-11-01T01:18:17.053Z · LW(p) · GW(p)

I'll add that at particularly high levels of competence it makes very little difference whether you are a psychopath who has mastered the deception of others or a hypocrite (normal person) who has mastered deception of yourself.

comment by timtyler · 2010-10-30T19:23:13.366Z · LW(p) · GW(p)

One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.

That is probably because you don't share a definition of intelligence with most of those here.

Perhaps look through http://www.vetta.org/definitions-of-intelligence/ - and see if you can find your position.

comment by mwaser · 2010-11-01T13:43:47.342Z · LW(p) · GW(p)

Nope. I agree with the vast majority of the vetta definitions.

But let's go with Marcus Hutter - "There are strong arguments that AIXI is the most intelligent unbiased agent possible in the sense that AIXI behaves optimally in any computable environment."

Now, which is more optimal -- opting to play a positive-sum game of potentially infinite length and utility with cooperating humans OR passing up the game forever for a modest short-term gain?

Assume, for the purposes of argument, that the AGI does not have an immediate pressing need for the gain (since we could then go into a recursion of how pressing is the need -- and yes, if the need is pressing enough, the intelligent thing to do unless the agent's goal is to preserve humanity is to take the short-term gain and wipe out humanity -- but how would a super-intelligent AGI have gotten itself into that situation?). This should answer all of the questions about "Well, what if the AGI had a short-term preference and humans weren't it".
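The comparison posed above (an open-ended positive-sum game versus a one-time gain) can be written down as a discounted sum. This is only a sketch under stated assumptions: it takes for granted that cooperation yields a fixed positive payoff per round, uses a standard geometric discount factor, and the function names and example numbers are hypothetical rather than taken from the discussion.

```python
def cooperation_value(per_round_payoff: float, discount: float) -> float:
    """Present value of an endless payoff stream.

    The sum over t of payoff * discount**t equals payoff / (1 - discount)
    for a discount factor in [0, 1).
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return per_round_payoff / (1.0 - discount)


def break_even_discount(per_round_payoff: float, one_time_gain: float) -> float:
    """Discount factor above which cooperating beats taking the one-time gain."""
    # payoff / (1 - d) > gain   <=>   d > 1 - payoff / gain
    return max(0.0, 1.0 - per_round_payoff / one_time_gain)


# Hypothetical numbers: 1 unit per round from cooperation vs. 100 units at once.
PAYOFF, GAIN = 1.0, 100.0
threshold = break_even_discount(PAYOFF, GAIN)

for d in (0.9, 0.995):
    choice = "cooperate" if cooperation_value(PAYOFF, d) > GAIN else "take the gain"
    print(f"discount={d}: {choice}")
```

The sketch shows that the conclusion depends on two inputs, the agent's patience and the size of the per-round payoff relative to the one-time gain. If the per-round payoff from humans is zero or negative for the agent, no discount factor makes cooperation the better option.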

comment by gwern · 2010-11-01T17:13:54.422Z · LW(p) · GW(p)

I am jumping in here from Recent Comments, so perhaps I am missing context - but how is AIXI interacting with humanity an infinite positive-sum gain for it?

It doesn't seem like AIXI could even expect zero-sum gains from humanity: we are using up a lot of what could be computronium.

comment by timtyler · 2010-11-01T21:08:15.884Z · LW(p) · GW(p)

That definition doesn't explicitly mention goals. Many of the definitions do explicitly mention goals. What the definitions usually don't mention is what those goals are - and that permits super-villains, along the lines of General Zod.

If (as it appears) you want to argue that evolution is likely to produce super-saints - rather than super-villains - then that's a bit of a different topic. If you wanted to argue that, "requirement" was probably the wrong way of putting it.

comment by Perplexed · 2010-10-31T02:39:57.647Z · LW(p) · GW(p)

One of my fundamental contentions is that empathy is a requirement for intelligence beyond a certain point because the consequences of lacking it are too severe to overcome.

Now if you had suggested that intelligence cannot evolve beyond a certain point unless accompanied by empathy ... that would be another matter. I could easily be convinced that a social animal requires empathy almost as much as it requires eyesight, and that non-social animals cannot become very intelligent because they would never develop language.

But I see no reason to think that an evolved intelligence would have empathy for entities with whom it had no social interactions during its evolutionary history. And no a priori reason to expect any kind of empathy at all in an engineered intelligence.

Which brings up an interesting thought. Perhaps human-level AI already exists. But we don't realize it because we have no empathy for AIs.

comment by timtyler · 2010-10-31T09:21:53.470Z · LW(p) · GW(p)

The most likely location for an "unobserved" machine intelligence is probably the NSA's basement.

However, it seems challenging to believe that a machine intelligence would need to stay hidden for very long.

comment by timtyler · 2010-11-01T22:00:42.853Z · LW(p) · GW(p)

But I see no reason to think that an evolved intelligence would have empathy for entities with whom it had no social interactions during its evolutionary history.

MIT's Leonardo? Engineered super-cuteness!

comment by Mass_Driver · 2010-11-01T13:52:22.784Z · LW(p) · GW(p)

Well, it does contain all those points, but some weird points are weighted much less heavily.

comment by jimrandomh · 2010-10-30T16:44:16.243Z · LW(p) · GW(p)

There is a large, continuous spectrum between making an AI and hoping it works out okay, and waiting for a formal proof of friendliness. Now, I don't think a complete proof is feasible; we've never managed a formal proof for anything close to that level of complexity, and the proof would be as likely to contain bugs as the program would. However, that doesn't mean we shouldn't push in that direction. Current practice in AI research seems to be to publish everything and take no safety precautions whatsoever, and that is definitely not good.

Suppose an AGI is created, initially not very smart but capable of rapid improvement, either with further development by humans or by giving it computing resources and letting it self-improve.Suppose, further, that its creators publish the source code, or allow it to be leaked or stolen.

AI improvement will probably proceed in a series of steps: the AI designs a successor, spends some time inspecting it to make sure the successor has the same values, then hands over control, then repeat. At each stage, the same tradeoff between speed and safety applies: more time spent verifying the successor means a lower probability of error, but a higher probability that other bad things will happen in the meantime.

And here's where there's a real problem. If there's only one AI improving itself, then it can proceed slowly, knowing that the probability of an asteroid strike, nuclear war or other existential risk is reasonably low. But if there are several AIs doing this at once, then whichever one proceeds least cautiously wins. That situation creates a higher risk of paperclippers, as compared to if there were only one AI developed in secret.
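The tradeoff described above can be put into a toy model. This is only a sketch under invented assumptions, not anything from the comment itself: the exponential form of the error probability, the hazard rate, and all function names are hypothetical choices made for illustration.

```python
import math


def step_error_probability(verify_time, scale=1.0):
    """Assumed model: the chance that a successor's values are wrong
    falls off exponentially with the time spent verifying it."""
    return math.exp(-verify_time / scale)


def survival_probability(verify_time, steps, hazard_rate):
    """Probability that every handover preserves values AND that no
    external catastrophe occurs while the verification is running."""
    p_values_ok = (1.0 - step_error_probability(verify_time)) ** steps
    p_no_catastrophe = math.exp(-hazard_rate * verify_time * steps)
    return p_values_ok * p_no_catastrophe


def race_winner(verify_times):
    """With several AIs improving at once, the one that spends the
    least time verifying each step finishes first."""
    return min(verify_times)


# Three hypothetical AIs with different verification times per step.
candidates = [5.0, 2.0, 0.5]
winner = race_winner(candidates)
winner_survival = survival_probability(winner, steps=10, hazard_rate=0.001)
careful_survival = survival_probability(max(candidates), steps=10, hazard_rate=0.001)
```

Under these made-up numbers the race is won by the AI whose chance of keeping its values intact is by far the lowest, which is the point of the paragraph above: a lone AI can choose the careful setting, while a field of competitors is decided by the least careful one.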

comment by timtyler · 2010-10-30T17:23:28.947Z · LW(p) · GW(p)

Current practice in AI research seems to be to publish everything and take no safety precautions whatsoever, and that is definitely not good.

Most of the companies involved (e.g. Google, James Harris Simons) publish little or nothing relating to their code in this area publicly - and few know what safeguards they employ. The government security agencies potentially involved (e.g. the NSA) are even more secretive.

comment by xamdam · 2010-11-03T22:11:43.401Z · LW(p) · GW(p)

Simons is an AI researcher? News to me. Clearly his fund uses machine learning, but there is an ocean between that and AGI (besides plenty of funds use ML also, DE Shaw and many others).

comment by Jordan · 2010-11-01T21:02:18.663Z · LW(p) · GW(p)

There is a large, continuous spectrum between making an AI and hoping it works out okay, and waiting for a formal proof of friendliness.

Exactly this!

I think there is a U-shaped response curve to risk versus rigor. Too little rigor ensures disaster, but too much rigor ensures a low rigor alternative is completed first.

When discussing the correct course of action, I think it is critical to consider not just probability of success but also time to success. So far as I've seen, arguments in favor of SIAI's course of action have completely ignored this essential aspect of the decision problem.
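One way to make the U-shaped curve concrete is a toy calculation. Everything in it is an assumption introduced for illustration: the exponential forms, the rival's arrival rate and failure probability, and the function names are hypothetical, and none of it comes from SIAI or from the comment above.

```python
import math

# Hypothetical parameters, chosen only to make the shape visible.
RIVAL_ARRIVAL_RATE = 0.2   # how quickly a low-rigor rival project finishes
RIVAL_FAILURE_PROB = 0.9   # chance the low-rigor rival ends in disaster


def own_failure_prob(rigor):
    """Assumed: our own chance of disaster falls exponentially with rigor."""
    return math.exp(-rigor)


def prob_finish_first(rigor):
    """Assumed: time to success equals rigor, and the rival's completion
    time is exponentially distributed, so more rigor means a larger
    chance of being beaten to the finish."""
    return math.exp(-RIVAL_ARRIVAL_RATE * rigor)


def total_risk(rigor):
    """Overall chance of disaster: either the rival finishes first and
    fails, or we finish first and fail ourselves."""
    first = prob_finish_first(rigor)
    return (1.0 - first) * RIVAL_FAILURE_PROB + first * own_failure_prob(rigor)


# Grid search over rigor levels from 0 to 20 in steps of 0.1.
grid = [i * 0.1 for i in range(201)]
best_rigor = min(grid, key=total_risk)
```

In this sketch the risk is highest at zero rigor, falls to a minimum at a moderate level, and climbs again as added rigor mostly buys time for the rival. Where the minimum sits depends entirely on the assumed parameters, so the model illustrates the shape of the argument rather than supporting any particular number.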

comment by XiXiDu · 2010-10-30T16:09:01.489Z · LW(p) · GW(p)

Much is unclear. I believe this post is a good opportunity to give a roundup of the problem, for anyone who hasn't read the comments thread here:

The risk from recursive self-improvement is either dramatic enough to outweigh the low probability of the event or likely enough to outweigh the probability of other existential risks. This is the idea everything revolves around in this community (it's not obvious, but I believe so). It is an idea that, if true, possibly affects everyone and our collective future, if not the whole universe.

I believe that someone like Eliezer Yudkowsky and the SIAI should be able to state in a concise way (with possible extensive references) why it is rational to make friendly AI a top priority. Given that friendly AI seems to be what his life revolves around the absence of material in support for the proposition of risks posed by uFAI seems to be alarming. And I'm not talking about the absence of apocalyptic scenarios here but other kinds of evidence than a few years worth of disjunctive lines of reasoning. The bulk of all writings on LW and by the SIAI are about rationality, not risks posed by recursively self-improving artificial general intelligence.

  • Where are the formulas? What are the variables? Where is a method exemplified to reflect the decision process of someone who's already convinced, preferably of someone within the SIAI? That would be part of what I call transparency and a foundational and reproducible corroboration of one's first principles.
  • Where are the references to substantial third-party research papers? There are many open problems regarding artificial general intelligence, how exactly does the SIAI handle those uncertainties and account for them in their probability estimations of the dangers posed by AI?
  • Where does the SIAI outline the likelihood of slow versus fast development of AGI? Where are your probability estimations that account for these uncertainties? Where are your variables and references that allow you to make any kind of estimations to balance the risks of a hard rapture with a somewhat controllable development?
  • What are the foundations that give credibility to the chain of reasoning that leads one to accept unfriendly superhuman intelligence going foom as a serious risk?
  • Where is the supportive evidence at the origin of your complex multi-step extrapolations argued to be from inductive generalizations?

What if someone came along making coherent arguments about some existential risk about how some sort of particle collider might destroy the universe? I would ask what the experts think who are not associated with the person who makes the claims. What would you think if he simply said, "do you have better data than me"? Or, "I have a bunch of good arguments"? If you say that some sort of particle collider is going to destroy the world with a probability of 75% if run, I'll ask you for how you came up with these estimations. I'll ask you to provide more than a consistent internal logic but some evidence-based prior.

The current state of evidence IS NOT sufficient to scare people up to the point of having nightmares and ask them for most of their money. It is not sufficient to leave comments making holocaust comparisons on the blogs of AI researchers.

  • Is smarter than human intelligence possible in a sense comparable to the difference between chimps and humans?
  • How is an encapsulated AI going to get into control without already existing advanced nanotechnology? It might order something over the Internet if it hacks some bank account etc. (long chain of assumptions), but how is it going to make use of the things it orders?
  • Why should self-optimization not be prone to be very limited. Changing anything substantial might lead Gandhi to swallow the pill that will make him want to hurt people, so to say.

You have to list your primary propositions on which you base further argumentation, from which you draw conclusions and which you use to come up with probability estimations stating risks associated with former premises. You have to list these main principles so anyone who comes across claims of existential risks and a plea for donations can get an overview. Then you have to provide the references, if you believe they give credence to the ideas, so that people see that all you say isn't made up but based on previous work and evidence by people that are not associated with your organisation.

You could argue your case of "this is obviously true" with completely made-up claims, and I'd have no way to tell. -- Kaj_Sotala

This is a community devoted to refining the art of rationality. How is it rational to believe the Scary Idea without being able to tell if it is more than an idea?

comment by anonym · 2010-10-30T23:18:07.964Z · LW(p) · GW(p)

The risk from recursive self-improvement is either dramatic enough to outweigh the low probability of the event or likely enough to outweigh the probability of other existential risks. This is the idea everything revolves around in this community (it's not obvious, but I believe so).

Umm, this is not the SIAI blog. It is "Less Wrong: a community blog devoted to refining the art of human rationality".

The idea everything revolves around in this community is what comes after the ':' in the preceding sentence.

comment by XiXiDu · 2010-10-31T15:36:51.469Z · LW(p) · GW(p)
  • Google site:lesswrong.com "artificial intelligence" 4,860 results
  • Google site:lesswrong.com rationality 4,180 results

Besides its history and the logo with a link to the SIAI that you can see in the top right corner, I believe that you underestimate the importance of artificial intelligence and associated risks within this community. As I said, it is not obvious, but when Yudkowsky came up with LessWrong.com it was against the background of the SIAI.

comment by anonym · 2010-10-31T18:53:12.472Z · LW(p) · GW(p)

Eliezer explicitly forbade discussion of FAI/Singularity topics on lesswrong.com for the first few months because he didn't want discussion of such topics to be the primary focus of the community.

Again, "refining the art of human rationality" is the central idea that everything here revolves around. That doesn't mean that FAI and related topics aren't important, but lesswrong.com would continue to thrive (albeit less so) if all discussion of singularity ceased.

comment by wedrifid · 2010-10-31T20:29:14.588Z · LW(p) · GW(p)
  • Google site:lesswrong.com "me" 5,360 results
  • Google site:lesswrong.com "I" 7,520 results
  • Google site:lesswrong.com "it" 7,640 results
  • Google site:lesswrong.com "a" 7,710 results

Perhaps you overestimate the extent to which google search results on a term reflect the importance of the concept to which the word refers.

I note that:

  • The best posts on 'rationality' are among those that do not use the word 'rationality'*.
  • Similar to 'Omega' and 'Clippy', AI is a useful agent to include when discussing questions of instrumental rationality. It allows us to consider highly rational agents in the abstract without all the bullshit and normative dead weight that gets thrown into conversations whenever the agents in question are humans.

comment by wedrifid · 2010-10-30T16:46:05.553Z · LW(p) · GW(p)

The current state of evidence IS NOT sufficient to scare people up to the point of having nightmares

You appear to be suggesting that Eliezer should censor presentation of his thoughts on the subject so as to prevent people from having nightmares. Spot the irony! ;)

and ask them for most of their money.

Eliezer asks people for money. That hardly makes him unique. Neither he nor anyone else is obliged to get your permission before they ask for donations in support of their cause. It seems to me that you expect more from the SIAI than you do from other well meaning organisations simply because there is actually a chance that the cause may make a significant long term difference. As opposed to virtually all the rest - those we know are pointless!

What if someone came along making coherent arguments about some existential risk about how some sort of particle collider might destroy the universe? I would ask what the experts think who are not associated with the person who makes the claims. What would you think if he simply said, "do you have better data than me"? Or, "I have a bunch of good arguments"? If you say that some sort of particle collider is going to destroy the world with a probability of 75% if run, I'll ask you for how you came up with these estimations. I'll ask you to provide more than a consistent internal logic but some evidence-based prior.

I rather suspect that if all those demands were met you would go ahead and find new rhetorical demands to make.

So take my word for it, I know more than you do, no really I do, and SHUT UP. -- Eliezer Yudkowsky (Reference)

You have to list your primary propositions on which you base further argumentation, from which you draw conclusions and which you use to come up with probability estimations stating risks associated with former premises. You have to list these main principles so anyone who comes across claims of existential risks and a plea for donations can get an overview. Then you have to provide the references, if you believe they give credence to the ideas, so that people see that all you say isn't made up but based on previous work and evidence by people that are not associated with your organisation.

That quote is out of context. While I do happen to hold Eliezer's behavior in that context in contempt, the way the quote is presented here is misleading. It is not relevant to your replies and only relevant to the topic here by virtue of Eliezer's character.

Is smarter than human intelligence possible in a sense comparable to the difference between chimps and humans?

This is a community devoted to refining the art of rationality. How is it rational to believe the Scary Idea without being able to tell if it is more than an idea?

Speak for yourself. I don't have difficulty comprehending the premises, either the ones you have questioned here or the others required to make an adequate evaluation for the purpose of decision making.

Neither I nor Eliezer and the SIAI need to force understanding of the Scary Idea upon you for it to be rational for us to place credence on it. The same applies to other readers here. That is not to say that more work producing the documentation of the kind that you describe would not be desirable.

comment by XiXiDu · 2010-10-31T10:59:34.752Z · LW(p) · GW(p)

This comment will be downvoted but I hope you people will actually explain yourself and not just click 'Vote down', every bot can do that.

Now that I've slept I read your comment again and I don't see any justification for why it got upvoted even once. I never claimed that EY can't ask for money, you are creating a straw man there. You also do not know what I do expect from other organisations. Further, it is not fallacious to suspect that Yudkowsky has some responsibility if people get nightmares from ideas that he would be able to resolve. If he really believes those things, it is of course his right to proclaim them. But the gist of my comment was meant to inquire about the foundations of those beliefs and stating that it does not appear to me that they are based on evidence which makes it legally right but ethically irresponsible to tell people to worry to such an extent or even not to tell them not to worry.

I rather suspect that if all those demands were met you would go ahead and find new rhetorical demands to make.

I just don't know how to parse this. I mean what I asked for and I do not ask for certainty here. I'm not doubting evolution and climate change. The problem is that even a randomly picked research paper likely bears more analysis, evidence and references than all of LW and the SIAI's documents together regarding risks posed by recursive self-improvement from artificial general intelligence.

That quote is out of context.

The quotes have been relevant as they showed that Yudkowsky clearly believes in his intellectual and epistemic superiority, yet any corroborative evidence seems to be missing. Yes, there is this huge amount of writings on rationality and some miscellaneous musing on artificial intelligence. But given how the idea of risks from AGI is weighted by him, it is just the cherry on top of marginal issues that do not support the conclusions.

Speak for yourself. I don't have difficulty comprehending the premises, either the ones you have questioned here or the others required to make an adequate evaluation for the purpose of decision making.

I don't have a difficulty to comprehend them either. I'm questioning the propositions, the conclusions drawn and further speculations based on those premises.

Neither I nor Eliezer and the SIAI need to force understanding of the Scary Idea upon you for it to be rational for us to place credence on it.

This is ridiculous. I never said you are forced to explain yourself. You are forced to explain yourself if you want people like me to take you seriously.

comment by timtyler · 2010-10-31T16:31:54.668Z · LW(p) · GW(p)

The quotes have been relevant as they showed that Yudkowsky clearly believes in his intellectual and epistemic superiority, yet any corroborative evidence seems to be missing. Yes, there is this huge amount of writings on rationality and some miscellaneous musing on artificial intelligence. [...]

Yudkowsky is definitely a clever fellow. He may not have fancy qualifications - and he is far from infallible - but he is pretty smart.

In the particular post in question, I am pretty sure he was being silly - which is a rather unfortunate time to be claiming superiority.

However, I don't really know. The stunt created intrigue, mystery, the forbidden, added to the controversy. Overall, Yudkowsky is pretty good at marketing - and maybe this was a taste of it.

I wonder if his Harry Potter fan-fic is marketing - or else how he justifies it.

comment by wedrifid · 2010-10-31T14:53:54.797Z · LW(p) · GW(p)

This is ridiculous. I never said you are forced to explain yourself. You are forced to explain yourself if you want people like me to take you seriously.

If you had restrained your claim in that way (ie. not made the claim that I had quoted in the above context) then I would have agreed with you.

comment by XiXiDu · 2010-10-31T15:12:04.490Z · LW(p) · GW(p)

I cannot account for every possible interpretation in what I write in a comment. It is reasonable not to infer oughts from questions. I said:

This is a community devoted to refining the art of rationality. How is it rational to believe the Scary Idea without being able to tell if it is more than an idea?

That is, if you can't explain why you hold certain extreme beliefs then how is it rational for me to believe that the credence you place on them is justified? The best response you came up with was telling me that you are able to understand and that you don't have to force this understanding onto me to believe in it yourself. That is a very poor argument and that is what I called ridiculous. Even more so as people voted it up, which is just sad.

I thought this had been sufficiently clear from what I wrote before.

comment by Perplexed · 2010-10-31T15:56:01.673Z · LW(p) · GW(p)

That is a very poor argument and that is what I called ridiculous. Even more so as people voted it up, which is just sad.

And it is at this point in the process that an accomplished rationalist says to himself, "I am confused", and begins to learn.

My impression is that you and Wedrifid are talking past each other. You think that you both are arguing about whether uFAI is a serious existential risk. Wedrifid isn't even concerned with that. He is concerned with "process questions" - with the analysis of the dialog that you two are conducting, rather than the issue of uFAI risk. And the reason he is being upvoted is because this forum, believe it or not, is a process question forum. It is about rationality, not about AI. Many people here really aren't that concerned about whether Goertzel or Yudkowsky has a better understanding of uFAI risks. They just have a visceral dislike of rhetorical questions.

If you want to see the standard arguments in favor of the Scary Idea, follow Louie's advice and read the papers at the SIAI web site. But if you find those arguments unsatisfactory (and I suspect you will) exercise some care if you come looking for a debate on the question here on Less Wrong. Because not everyone who engages with you here will be engaging you on the issue that you want to talk about.

comment by wedrifid · 2010-10-31T20:10:07.306Z · LW(p) · GW(p)

Many people here really aren't that concerned about whether Goertzel or Yudkowsky has a better understanding of uFAI risks.

I am somewhat more interested in understanding why Goertzel would say what he says about AI. Just saying 'Goertzel's brain doesn't appear to work right' isn't interesting. But the Hansonian signalling motivations behind academic posturing are more so.

comment by wedrifid · 2010-10-31T20:00:54.689Z · LW(p) · GW(p)

Well said.

(Although to be more precise I don't have a visceral dislike of rhetorical questions per se. It is the use of rhetoric to subvert reason that produces the visceral reaction, not the rhetoric(al question) itself.)

comment by XiXiDu · 2010-10-30T17:42:14.229Z · LW(p) · GW(p)

I was too lazy to write this up again, it's copy and paste work so don't mind some inconsistencies. Regarding the quotes, I think that EY seriously believes what he says in the given quotes, otherwise I wouldn't have posted them. I'm not even suggesting that it isn't true, I actually allow for the possibility that he is that smart. But I want to know what I should do and right now I don't see any good arguments.

I'm a supporter and donor and what I'm trying to do here is coming up with the best possible arguments to undermine the credence of the SIAI. Almost nobody else is doing that, so I'm trying my best here. This isn't damaging, this is helpful. Because once you become really popular, people like P.Z. Myers and other much more eloquent and popular people will pull you to pieces if you can't even respond to my poor attempt at being a devil's advocate.

I don't have difficulty comprehending the premises, either the ones you have questioned here or the others required to make an adequate evaluation for the purpose of decision making.

I don't even know where to start here, so I won't. But I haven't come across anything yet that I had trouble understanding.

I rather suspect that if all those demands were met you would go ahead and find new rhetorical demands to make.

See that woman with red hair? Well, the cleric told me that he believes that she's a witch. But he'll update on evidence if the fire didn't consume her. I said red hair is insufficient data to support that hypothesis and take such extreme measures to test it. He told me that if he came up with more evidence like sorcery I'd just go ahead and find new rhetorical demands.

You appear to be suggesting that Eliezer should censor presentation of his thoughts on the subject so as to prevent people from having nightmares. Spot the irony! ;)

I'm not against free speech and religious freedom but that also applies for my own thoughts on the subject. I believe he could do much more than censoring certain ideas, namely show that they are bogus.

comment by Perplexed · 2010-10-30T18:09:15.890Z · LW(p) · GW(p)

I believe he could do much more than censoring certain ideas, namely show that they are bogus.

I'm not a big fan of Eliezer, but that complaint strikes me as completely unfair. There is far less censorship here than at a typical moderated blog. And EY does expend some effort showing that various ideas are bogus.

I'm not an insider, or even old-timer, but I have reason to believe that the one single forbidden subject here is censored not because it is believed to be valid or bogus, nor because it casts a bad light on EY and SIAI, but rather because discussing it does no good and may do some harm - something a bit like a ban on certain kinds of racist offensive speech, but different.

And in any case, the "forbidden idea" can always be discussed elsewhere, assuming you can even find anyone that can become interested in the idea elsewhere. The reach of EY's "censorship" is very limited.

comment by wedrifid · 2010-10-30T18:39:44.386Z · LW(p) · GW(p)

He told me that if he came up with more evidence like sorcery I'd just go ahead and find new rhetorical demands.

[See context for implied meaning if the excerpt isn't clear]. I claimed approximately the same thing that you say yourself below.

I'm a supporter and donor and what I'm trying to do here is coming up with the best possible arguments to undermine the credence of the SIAI. Almost nobody else is doing that, so I'm trying my best here.

I've got nothing against the Devil, it's the Advocacy that is mostly bullshit. Saying you are 'Devil's Advocate' isn't an excuse to use bad arguments. That would be an insult to the Devil!

I don't even know where to start here, so I won't. But I haven't come across anything yet that I had trouble understanding.

You conveyed most of your argument via rhetorical questions. To the extent that they can be considered to be in good faith (and not just verbal tokens intended to influence) some of them only support the position you used them for if you genuinely do not understand them (implying that there is no answer). I believe I quoted an example in the context.

Making an assertion into a question does not give a license to say whatever you want with no risk of direct contradiction. (Even though that is how the tactic is used in practice.)

More concise answer: Then don't ask stupid questions!

comment by XiXiDu · 2010-10-30T19:02:08.646Z · LW(p) · GW(p)

To the extent that they can be considered to be in good faith (and not just verbal tokens intended to influence) some of them only support the position you used them for if you genuinely do not understand them (implying that there is no answer).

I'm probably too tired to parse this right now. I believe there probably is an answer, but it is buried under hundreds of posts about marginal issues. All those writings on rationality, there is nothing I disagree with. Many people know about all this even outside of the LW community. But what is it that they don't know that EY and the SIAI know? What I was trying to say is that if I have come across it then it was not convincing enough to take it as seriously as some people here obviously do.

It looks like I'm not alone. Goertzel, Hanson, Egan and lots of other people don't see it either. So what are we missing, what is it that we haven't read or understood?

comment by hairyfigment · 2010-11-01T00:11:33.036Z · LW(p) · GW(p)

Goertzel: I could and will list the errors I see in his arguments (if nobody there has done so first). For now I'll just say his response to claim #2 seems to conflate humans and AIs. But unless I've missed something big, which certainly seems possible, he didn't make his decision based on those arguments. They don't seem good enough on their face to convince anyone. For example, I don't think he could really believe that he and other researchers would unconsciously restrict the AI's movement in the space of possible minds to the safe area(s), but if we reject that possibility some version of #4 seems to follow logically from 1 and 2.

Egan: don't know. What I've seen looks unimpressive, though certainly he has reason to doubt 'transhumanist' predictions for the near future. (SIAI instead seems to assume that if humans can produce AGI, then either we'll do so eventually or we'll die out first. Also, that we could produce artificial X-maximizing intelligence more easily than we can produce artificial nearly-any-other-human-trait, which seems likely based on the tool I use to write this and the history of said tool.) Do you have a particular statement or implied statement of his in mind?

Hanson: maybe I shouldn't point any of this out, but EY started by pursuing a Heinlein Hero quest to save the world through his own rationality. He then found himself compelled to reinvent democracy and regulation (albeit in a form closely tailored to the case at hand and without any strict logical implications for normal politics). His conservative/libertarian economist friend called these new views wrongheaded despite verbally agreeing with him that EY should act on those views. Said friend also posted a short essay about "heritage" that allowed him to paint those who disagreed with his particular libertarian vision as egg-headed elitists.

comment by XiXiDu · 2010-11-01T09:21:28.411Z · LW(p) · GW(p)

Where did you get those quotes from? References?

comment by Perplexed · 2010-11-01T16:35:15.987Z · LW(p) · GW(p)

He wasn't quoting Goertzel, Egan, and Hanson - though his formatting made it look like he was. He was commenting on your claim that these three "don't see it".

comment by XiXiDu · 2010-11-01T16:45:49.473Z · LW(p) · GW(p)

Whoops, I'm sorry, never mind.

comment by hairyfigment · 2010-11-01T16:12:24.483Z · LW(p) · GW(p)

Sorry, I don't know what quotes you mean. You can find a link to the "heritage" post in the wiki-compilation of the debate. Though perhaps you meant to reply to someone else?

comment by XiXiDu · 2010-11-01T16:47:24.467Z · LW(p) · GW(p)

Never mind, I just skimmed over it and thought you were quoting someone. If you delete your comment I'll delete this one. I'll read your original comment again now.

comment by XiXiDu · 2010-10-30T18:52:29.337Z · LW(p) · GW(p)

Saying you are 'Devil's Advocate' isn't an excuse to use bad arguments.

I don't think I used a bad argument, otherwise I wouldn't have done it.

You conveyed most of your argument via rhetorical questions.

Wow, you overestimate my education and maybe intelligence here. I have no formal education except primary school. I haven't taken a rhetoric course or something. I honestly believe that what I have stated would be the opinion of a lot of educated people outside of this community if they came across the arguments on this site and by the SIAI. That is, data and empirical criticism are missing given the extensive use of the idea that is AI going FOOM to justify all kinds of further argumentation.

comment by wedrifid · 2010-10-30T19:05:22.073Z · LW(p) · GW(p)

Wow, you overestimate my education and maybe intelligence here. I have no formal education except primary school. I haven't taken a rhetoric course or something.

"Rhetorical question" is just the name. Asking questions to try convince people rather than telling them outright is something most people pick up by the time they are 8.

I honestly believe that what I have stated would be the opinion of a lot of educated people outside of this community if they came across the arguments on this site and by the SIAI

I think this is true.

. That is

This isn't. That is, the 'that is' doesn't fit. What educated people will think really isn't determined by things like the below. (People are stupid, the world is mad, etc)

data and empirical criticism are missing given the extensive use of the idea that is AI going FOOM to justify all kinds of further argumentation.

I agree with this. Well, not the 'empirical' part (that's hard to do without destroying the universe.)

comment by Vladimir_Nesov · 2010-10-30T19:19:31.387Z · LW(p) · GW(p)

You conveyed most of your argument via rhetorical questions.

Wow, you overestimate my education and maybe intelligence here. I have no formal education except primary school. I haven't taken a rhetoric course or something.

Indeed, what an irony...

comment by XiXiDu · 2010-10-31T11:24:05.559Z · LW(p) · GW(p)

I'm fighting against giants here. Someone who only mastered elementary school. I believe it should be easy to refute my arguments or show me where I am wrong, point me to some documents I should read up on. But I just don't see that happening. I talk to other smart people online as well, that way I was actually able to overcome religion. But seldom there have been people less persuasive than you when it comes to risks associated with artificial intelligence and the technological singularity. Yes, maybe I'm unable to comprehend it right now, I grant you that. Whatever the reason, I'm not convinced and will say so as long as it takes. Of course you don't need to convince me, but I don't need to stop questioning either.

Here is a very good comment by Ben Goertzel that pinpoints it:

This is what discussions with SIAI people on the Scary Idea almost always come down to!

The prototypical dialogue goes like this.

SIAI Guy: If you make a human-level AGI using OpenCog, without a provably Friendly design, it will almost surely kill us all.

Ben: Why?

SIAI Guy: The argument is really complex, but if you read Less Wrong you should understand it

Ben: I read the Less Wrong blog posts. Isn't there somewhere that the argument is presented formally and systematically?

SIAI Guy: No. It's really complex, and nobody in-the-know had time to really spell it out like that.

comment by Eneasz · 2010-11-01T22:59:25.341Z · LW(p) · GW(p)

My argument is fairly simple -

If humans found it sufficiently useful to wipe chimpanzees off the face of the earth, we could and would do so.

The level of AI I'm discussing is at least as much smarter than us as we are than chimpanzees.

comment by shokwave · 2010-11-01T07:58:34.766Z · LW(p) · GW(p)

But seldom there have been people less persuasive than you when it comes to risks associated with artificial intelligence and the technological singularity.

I don't know if there is a persuasive argument about all these risks. The point of all this rationality-improving blogging is that when you debug your thinking, when you can follow long chains of reasoning and feel certain you haven't made a mistake, when you're free from motivated cognition - when you can look where the evidence points instead of finding evidence that points where you're looking! - then you can reason out the risks involved in recursively self-improving self-modifying goal-oriented optimizing processes.

comment by XiXiDu · 2010-10-30T17:59:16.559Z · LW(p) · GW(p)

Updated it without the quotes now so people don't get unnecessarily distracted.

comment by mwaser · 2010-10-30T18:32:33.516Z · LW(p) · GW(p)

Could I ask you to post the quotes as a separate post? They are priceless (and I'd love to be able to see what they applied to -- so please include the references as well).

comment by XiXiDu · 2010-10-30T18:46:12.838Z · LW(p) · GW(p)

I should add, don't get a wrong impression from those quotes. I still believe he might actually be that smart. He's at least the smartest person I know of by what I've read. Except when it comes to public relations. You shouldn't say those things if you do not explain yourself sufficiently at the same time.

comment by XiXiDu · 2010-10-30T18:40:57.289Z · LW(p) · GW(p)

If I am ignorant about a phenomenon, that is not a fact about the phenomenon; it just means I am not Eliezer Yudkowsky. -- Eliezer Yudkowsky Facts

Here is some stuff EY uttered for real:

  • People don't know these things until I explain them! (Reference)
  • You will soon learn that your smart friends and favorite SF writers are not remotely close to the rationality standards of Less Wrong, and you will no longer think it anywhere near as plausible that their differing opinion is because they know some incredible secret knowledge you don't. (Reference)
  • So take my word for it, I know more than you do, no really I do, and SHUT UP. (Reference)

The first two, well the context is there, just click 'Parent'. The third is from something that has now been deleted. I can't go into detail but can send you a PM if you want.

comment by PhilGoetz · 2010-10-31T03:13:53.264Z · LW(p) · GW(p)

Now I'm curious what they were, and where they came from. Distract me, but in a sub-thread.

comment by [deleted] · 2010-10-30T12:47:10.437Z · LW(p) · GW(p)

One thing that I think is relevant, in the discussion of existential risk, is Martin Weitzman's "Dismal Theorem" and Jim Manzi's analysis of it. (Link to the article, link to the paper.)

There, the topic is not unfriendly AI, but climate change. Regardless of what you think of the topic, it has attracted more attention than AGI, and people writing about existential risk are often using climate change as an example.

Martin Weitzman, a Harvard economist, deals with the probability of extreme disasters, and whether it's worth it in cost-benefit terms to deal with them. Our problem, in cases of extreme uncertainty, is that we don't only have probability distributions, we have uncertain probability distributions; it's possible we got the models wrong. Weitzman's paper takes this into account. He creates a family of probability distributions, indexed over a certain parameter, and integrates over it -- and he proves that the process of taking "probability distributions of probability distributions" has the result of making the final distribution fat-tailed. So fat-tailed that the integral doesn't converge.

This is a terrible consequence. Because if the integral for the expected cost of the risk doesn't converge, then we cannot define an expected cost. We can't do cost-benefit analysis at all. Weitzman's conclusion is that the right amount to spend mitigating risk is "more than we're doing."
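
A toy illustration of how averaging over an uncertain parameter can fatten a tail. This is not Weitzman's actual model, just a simpler stand-in assumed for the sketch: if a cost is exponentially distributed with an unknown rate, and the rate itself follows a gamma distribution, the mixture is a Lomax (Pareto type II) distribution, whose mean is finite only when the gamma shape exceeds 1. The function name and parameter values below are hypothetical.

```python
import math

def capped_expected_cost(shape, scale, cap):
    """E[min(X, cap)] for X ~ Lomax(shape, scale).

    Computed in closed form as the integral from 0 to cap of the
    survival function (scale / (x + scale)) ** shape.
    """
    if shape == 1.0:
        return scale * math.log((cap + scale) / scale)
    return scale / (shape - 1.0) * (1.0 - (scale / (cap + scale)) ** (shape - 1.0))

for cap in (1e3, 1e6, 1e9):
    thin = capped_expected_cost(2.0, 1.0, cap)  # settles near scale / (shape - 1)
    fat = capped_expected_cost(1.0, 1.0, cap)   # keeps growing like log(cap)
    print(f"cap={cap:.0e}  shape=2: {thin:.4f}  shape=1: {fat:.4f}")
```

With shape 2 the capped expectation levels off as the cap is raised; with shape 1 it keeps climbing, which is the sense in which there is no expected cost to plug into a cost-benefit calculation.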

Manzi criticizes this approach as just an elaborately stated version of the precautionary principle. If it's conceivable that your models are wrong and things are even riskier than you imagined, it doesn't follow that you should spend more to mitigate the risk; the reductio is that if you knew nothing at all, you should spend all your money mitigating the most unknown possible risk!

This is relevant to people talking about AGI. We're not considering spending a lot of money to mitigate this particular risk, but we are considering forgoing a lot of money -- the value of a possible useful AI. And it may be tempting to propose a shortcut, a la Marty Weitzman, claiming that the very uncertainty of the risk is an argument for being more aggressive in mitigating it. The problem is that this leads to absurd conclusions. You could think up anything -- murderous aliens! Killer vacuum cleaners! -- and claim that because we don't know how likely they are, and because the outcome would be world-endingly terrible, we should be spending all our time trying to mitigate the risk!

Uncertainty about an existential risk is not an argument in favor of spending more on it. There are arguments in favor of spending more on an existential risk -- they're the old-fashioned, cost-benefit ones. (For example, I think there's a strong case, in old-fashioned cost-benefit terms, for asteroid collision prevention.) But if you can't justify spending on cost-benefit grounds, you can't try a Hail Mary and say "You should spend even more -- because we could be wrong!"

comment by CarlShulman · 2010-10-30T13:34:43.997Z · LW(p) · GW(p)

The talk about uncertainty is indeed a red herring. There are two things going on here:

  1. A linear aggregative (or fast-growing enough in the relevant range) social welfare function makes even small probabilities of existential risk more important than large costs or benefits today. This is the Bostrom astronomical waste point. Weitzman just uses a peculiar model (with agents with bizarre preferences that assign infinite disutility to death, and a strangely constricted probability distribution over outcomes) to indirectly introduce this. You can reject it with a bounded social welfare function like Manzi or Nordhaus, representing your limited willingness to sacrifice for future generations.

  2. The fact that there are many existential risks competing for our attention, and many routes to affecting existential risk, so that spending effort on any particular risk now means not spending that effort on other existential risks, or keeping it around while new knowledge accumulates, etc. Does the x-risk reduction from climate change mitigation beat the reduction from asteroid defense or lobbying for arms control treaties at the current margin? Weitzman addresses this by saying that the risk from surprise catastrophic climate change is much higher than other existential risks collectively, which I don't find plausible.
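
The first point can be made concrete with a small sketch. All numbers are made up for illustration, and the saturating form is just one assumed way to write a bounded welfare function; it is not taken from Manzi or Nordhaus. Extinction is assumed to yield zero welfare.

```python
import math

def linear_welfare(future_lives):
    """Linear aggregative welfare: every additional life counts the same."""
    return future_lives

def bounded_welfare(future_lives, bound=1e10):
    """Saturating welfare (assumed form): approaches `bound` as lives grow."""
    return bound * (1.0 - math.exp(-future_lives / bound))

def value_of_risk_reduction(welfare, future_lives, delta_p):
    """Expected welfare gained by cutting extinction probability by delta_p."""
    return delta_p * welfare(future_lives)

FUTURE_LIVES = 1e30  # hypothetical astronomical future population
DELTA_P = 1e-10      # hypothetical tiny reduction in extinction risk

# Linear welfare: the tiny probability is swamped by the huge stake (about 1e20).
print(value_of_risk_reduction(linear_welfare, FUTURE_LIVES, DELTA_P))
# Bounded welfare: the gain can never exceed delta_p * bound (about 1 here).
print(value_of_risk_reduction(bounded_welfare, FUTURE_LIVES, DELTA_P))
```

Under the linear function the tiny risk reduction dominates any present-day cost; under the bounded one it is worth at most `delta_p * bound`, which is how a bounded function blocks the astronomical-waste conclusion.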

comment by FrankAdamek · 2010-10-30T13:17:58.007Z · LW(p) · GW(p)

Is anyone in SIAI making the argument that we should spend more because our models are too uncertain to provide expected costs, or more generally that our very uncertainty of model is a significant source of concern? My impression was more that it's "we have good reasons to doubt people's estimation that Friendliness is easy" and "we have good reason to believe it's actually quite hard."

comment by [deleted] · 2010-10-30T13:36:09.927Z · LW(p) · GW(p)

Fair enough -- this is my caution against the logic "I can think of a risk, therefore we need to worry about it!" It seems that SIAI is making the stronger claim that unfriendliness is very likely.

My personal view is that AI is very hard itself, and that working on, say, a computer that can do what a mouse can do is likely to take a long time, and is harmless but very interesting research. I don't think we're anywhere near a point when we need to shut down anybody's current research.

comment by andreas · 2010-10-30T19:53:08.815Z · LW(p) · GW(p)

Consider marginal utility. Many people are working on AI, machine learning, computational psychology, and related fields. Nobody is working on preference theory, formal understanding of our goals under reflection. If you want to do interesting research and if you have the background to advance either of those fields, do you think the world will be better off with you on the one side or on the other?

comment by [deleted] · 2010-10-30T20:33:33.389Z · LW(p) · GW(p)

Maybe that's true, but that's a separate point. "Let's work on preference theory so that it'll be ready when the AI catches up" is one thing -- tentatively, I'd say it's a good idea. "Let's campaign against anybody doing AI research" seems less useful (and less likely to be effective.)

comment by Perplexed · 2010-10-30T14:12:00.099Z · LW(p) · GW(p)

But if provable friendliness is hard, wouldn't it be much easier to accomplish with the help of AI? Presumably if the FAI problem can be solved by a few dozen smart human researchers within a few decades, then it can be solved in a year or so by a few dozen not-guaranteed-friendly AGIs-in-a-box with limited IQs in the 180-220 range. The AGIs design an FAI architecture and provide the proof, some smart humans check the proof, and then we build the thing and fasten our seatbelts for the exciting ride as the FAI goes FOOM.

comment by DSimon · 2010-10-30T15:11:08.099Z · LW(p) · GW(p)

How do you propose to limit their IQs? I'm not asking facetiously; your plan seems reasonable to me, but that's the part that seems the trickiest, and the part that if gotten wrong could lead to accidental early FOOMage.

comment by Perplexed · 2010-10-30T15:32:55.440Z · LW(p) · GW(p)

I have no idea how to limit the IQ of AIs that other people produce without my knowledge. For AIs that I produce myself, I would simply do without closed-loop recursive self-improvement (aka, keep the AI in a box) until I have a proven FAI architecture in hand.

I'm reasonably confident that a closed-loop FOOM is impossible until AI "IQ" goes well past the max human level. I am also reasonably confident that closing the recursive self-improvement loop doesn't speed things up much until you reach that level, either.

So, if a "Sane AI" project like this one, operating under the slogan of "Open loop until we have a proof" can maintain a technological lead of a year or so over a "Risky AI" project with the slogan "Close the loop - Full speed ahead", then I'm pretty sure it is actually safer than a "Secure FAI" project operating under the slogan "No AGI until we have a proof". Because it has a better chance of establishing and maintaining that technological lead.

comment by DSimon · 2010-10-30T18:29:35.191Z · LW(p) · GW(p)

Hm, so then the issue just becomes how to keep the AI from closing its own loop (i.e. modifying itself in-memory through some security hole it finds). I agree that it seems unlikely to figure out how to do so at a relatively low level of intelligence.

On the other hand, it seems like it would be pretty hard to do research on self-improvement without a closed loop; isn't the expectation usually that the self-improvement process won't start doing anything particularly interesting until many iterations have passed?

Maybe I'm just misunderstanding your use of the terms. I take it by "open loop" you mean that the AI would seek to generate an improved version of itself, but would simply provide that code back to the researcher rather than running it itself?

comment by Perplexed · 2010-10-30T18:52:07.997Z · LW(p) · GW(p)

Maybe I'm just misunderstanding your use of the terms. I take it by "open loop" you mean that the AI would seek to generate an improved version of itself, but would simply provide that code back to the researcher rather than running it itself?

Roughly, yes. But I see recursive self-improvement as having a hardware component as well, so "closed loop" also includes giving the AI control over electronics factories and electronic assembly robots.

... it seems like it would be pretty hard to do research on self-improvement without a closed loop; isn't the expectation usually that the self-improvement process won't start doing anything particularly interesting until many iterations have passed?

Odd. My expectation for the software-only and architecture-change portion of the self-improvement is that the curve would be the exact opposite - some big gains early by picking off low-hanging fruit, but slower improvement thereafter. It is only in the exponential growth of incorporated hardware that you would get a curve like that which you seem to expect.
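
A toy sketch of the two curve shapes being contrasted. The decay factor, growth factor, and function names are arbitrary assumptions for illustration, not estimates of anything real.

```python
def software_only_capability(steps, first_gain=1.0, decay=0.5):
    """Capability after each step when every gain is `decay` times the last.

    Models low-hanging fruit: big early gains, smaller ones afterwards.
    """
    level, gain, history = 1.0, first_gain, []
    for _ in range(steps):
        level += gain
        gain *= decay
        history.append(level)
    return history

def hardware_growth_capability(steps, growth=2.0):
    """Capability after each step when incorporated hardware multiplies it."""
    level, history = 1.0, []
    for _ in range(steps):
        level *= growth
        history.append(level)
    return history

for step, (sw, hw) in enumerate(zip(software_only_capability(10),
                                    hardware_growth_capability(10)), start=1):
    print(f"step {step:2d}  software-only: {sw:.3f}  with hardware: {hw:.0f}")
```

With these assumed parameters the software-only curve flattens out below 3 while the hardware curve keeps doubling, so only the second looks uninteresting early and dramatic later.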

comment by wnoise · 2010-10-31T09:37:28.627Z · LW(p) · GW(p)

includes giving the AI control over electronics factories and electronic assembly robots.

Or letting them seize control of ...

Not necessarily that hard given the existence of Stuxnet.

comment by James_Miller · 2010-10-30T16:16:00.003Z · LW(p) · GW(p)

Eliezer figures out how to download his own brain. The emulation requires only a small amount of processing speed and memory. With the financial backing of the SIAI, LessWrong readers and wealthy tech businesspeople we create millions of Ems and have each run at 1,000 times the speed that Eliezer runs at. All of the Eliezer ems immediately work on improving the Ems' code and make huge use of trial and error in which they make some changes to the code of a subset of the Ems and give them intelligence tests, throw out the less intelligent Ems and make many copies of the superior ones.

This could give us a singularity in a week.

comment by Perplexed · 2010-10-30T16:45:47.842Z · LW(p) · GW(p)

Your scenario strikes me as laughably overoptimistic. A brain emulation requires only a small amount of processing speed and memory? A story that begins with finding financial backing takes only a week to reach completion?

But in any case, this is a closed-loop recursive self-improvement FOOM. I don't doubt that such things are possible. My point was that if you already have a bunch of super-Eliezers, why not have them design a provably-correct FAI, rather than sending them off to FOOM into an uFAI? If they discover the secret of FAI within a year or so, great! If it turns out that provably correct FAI is just a pipe-dream, then maybe we ought to reconsider our plans to close the loop and FOOM.

comment by James_Miller · 2010-10-30T17:59:46.169Z · LW(p) · GW(p)

" A brain emulation requires only a small amount of processing speed and memory?"

If software is the bottleneck and computer speed and memory are increasing exponentially then you would expect that by the time the software was available it would use a relatively small amount of computing power.

" A story that begins with finding financial backing takes only a week to reach completion?"

My story begins with the Eliezer Em. 150,000 people die every day, and money probably becomes useless after a singularity. If enough people understood what was happening we could raise, say, a billion dollars in a few days. Hedge funds, I strongly suspect, do sometimes make billion dollar bets based on information they acquired in the last day.

"why not have them design a provably-correct FAI, rather than sending them off to FOOM into an uFAI?"

The 150,000 lives a day cost of delay plus the Eliezer ems might be competing with other ems that have less benign intentions.

comment by Emile · 2010-10-30T14:21:31.680Z · LW(p) · GW(p)

When he was paraphrasing the reasons:

Human value is fragile as well as complex, so if you create an AGI with a roughly-human-like value system, then this may not be good enough, and it is likely to rapidly diverge into something with little or no respect for human values

... that doesn't seem quite right. The main problem with values being fragile isn't that a "roughly-human-like value system" might diverge rapidly; it's that properly implementing a "roughly-human-like value system" is actually quite hard and most AGI programmers seem to underestimate its complexity, and go for "hacky" solutions, which I find somewhat scary.

Ben seems aware of this, and later goes on to say:

This is related to the point Eliezer Yudkowsky makes that "value is complex" -- actually, human value is not only complex, it's nebulous and fuzzy and ever-shifting, and humans largely grok it by implicit procedural, empathic and episodic knowledge rather than explicit declarative or linguistic knowledge.

... which seems to be one of the reasons to pay extra attention to it (and this also seems to be a reason given by Eliezer, whereas Ben almost presents it as a counterpoint to Eliezer).

comment by mwaser · 2010-10-30T14:36:05.255Z · LW(p) · GW(p)

Human evaluation of human values under specific instances is everything that Ben says it is (complex, nebulous, fuzzy, ever-shifting, and grokked by implicit rather than explicit knowledge).

On the other hand, evaluation of points in the Mandelbrot set by a deterministically moving entity that is susceptible to color-illusions is even more complex, nebulous, fuzzy, and ever-shifting to the extent that it probably can't be grokked at all. Yet, it is generated from two very simple formulae (the second being the deterministic movement of the entity).

Eliezer has provided absolutely NO rational arguments (much less proof) that the core of Friendly is complex at all. Further, paying attention to the fact that ethical mandates within the obviously complex real world (particularly when viewed through the biased eyes of fallible beings) are comprehensible at all would seem an indication that maybe there are just a small number of simple laws underlying them (or maybe only one -- see my comment on Ben's post cross-posted at http://becominggaia.wordpress.com/2010/10/30/ben-goertzel-the-singularity-institutes-scary-idea/ for easy access).

comment by timtyler · 2010-10-30T14:55:35.618Z · LW(p) · GW(p)

My take on the optimisation target of all self-organising systems:

http://originoflife.net/gods_utility_function/

Eliezer Yudkowsky explains why he doesn't like such things:

http://lesswrong.com/lw/lq/fake_utility_functions/

comment by NancyLebovitz · 2010-10-30T13:14:07.878Z · LW(p) · GW(p)

For me, the oddest thing about Goertzel's article is his claim that SIAI's arguments are so unclear that he had to construct it himself. The way he describes the argument is completely congruent with what I've been reading here.

In any case, his argument that it may not be possible to have provable Friendliness and it makes more sense to take an incremental approach to AGI than to not do AGI until Friendliness is proven seems reasonable.

Has it been demonstrated that Friendliness is provable?

comment by mwaser · 2010-10-30T14:18:54.811Z · LW(p) · GW(p)

If Goertzel's claim that "SIAI's arguments are so unclear that he had to construct it himself" can't be disproven by the simple expedient of posting a single link to an immediately available well-structured top-down argument then the SIAI should regard this as an obvious high-priority, high-value task. If it can be disproven by such a link, then that link needs to be more highly advertised since it seems that none of us are aware of it.

comment by Paul Crowley (ciphergoth) · 2010-10-30T16:40:34.745Z · LW(p) · GW(p)

The nearest thing to such a link is Artificial Intelligence as a Positive and Negative Factor in Global Risk [PDF].

But of course the argument is a little large to entirely set out in one paper; the next nearest thing is What I Think, If Not Why and the title shows in what way that's not what Goertzel was looking for.

comment by timtyler · 2010-10-31T12:29:11.768Z · LW(p) · GW(p)

Artificial Intelligence as a Positive and Negative Factor in Global Risk

44 pages. I don't see anything much like the argument being asked for. The lack of an index doesn't help. The nearest thing I could find was this:

It may be tempting to ignore Artificial Intelligence because, of all the global risks discussed in this book, AI is hardest to discuss. We cannot consult actuarial statistics to assign small annual probabilities of catastrophe, as with asteroid strikes. We cannot use calculations from a precise, precisely confirmed model to rule out events or place infinitesimal upper bounds on their probability, as with proposed physics disasters. But this makes AI catastrophes more worrisome, not less.

He also claims that intelligence could increase rapidly with a "dominant" probability.

I cannot perform a precise calculation using a precisely confirmed theory, but my current opinion is that sharp jumps in intelligence are possible, likely, and constitute the dominant probability.

This all seems pretty vague to me.

comment by timtyler · 2010-10-30T14:44:14.994Z · LW(p) · GW(p)

Is this an official position in the first place? It seems to me that they want to give the impression that - without their efforts - the END IS NIGH - without committing to any particular probability estimate - which would then become the target of critics.

Halloween update: It's been a while now, and I think the response has been poor. I think this means there is no such document (which explains Ben's attempted reconstruction). It isn't clear to me that producing such a document is a "high-priority task" - since it isn't clear that the thesis is actually correct - or that the SIAI folks actually believe it.

Most of the participants here seem to be falling back on: even if it is unlikely, it could happen, and it would be devastating, so therefore we should care a lot - which seems to be a less unreasonable and more defensible position.

comment by [deleted] · 2014-06-28T16:42:38.766Z · LW(p) · GW(p)

It isn't clear to me that producing such a document is a "high-priority task" - since it isn't clear that the thesis is actually correct - or that the SIAI folks actually believe it.

Most of the participants here seem to be falling back on: even if it is unlikely, it could happen, and it would be devastating, so therefore we should care a lot - which seems to be a less unreasonable and more defensible position.

You lost me at that sharp swerve in the middle. Without probabilities attached to the scary idea, it is an absolutely meaningless concept. What if its probability were 1 / 3^^^3, should we still care then? I could think of a trillion scary things that could happen. But without realistic estimates of how likely it is to happen, what does it matter?

comment by XiXiDu · 2010-10-30T15:13:22.812Z · LW(p) · GW(p)

Here are some links.

comment by mwaser · 2010-10-30T16:52:34.620Z · LW(p) · GW(p)

Heh. I've read virtually all those links. I still have the three following problems.

  1. Those links are about as internally self-consistent as the Bible.
  2. There are some fundamentally incorrect assumptions that have become gospel.
  3. Most people WON'T read all those links and will therefore be declared unfit to judge anything.

What I asked for was "an immediately available well-structured top-down argument".

It would be particularly useful and effective if SIAI recruited someone with the opposite point of view to co-develop a counter-argument thread and let the two revolve around each other and solve some of these issues (or, at least, highlight the base important differences in opinion that prevent them from solution). I'm more than willing to spend a ridiculous amount of time on such a task and I'm sure that Ben would be more than willing to devote any time that he can tear away from his busy schedule.

comment by Perplexed · 2010-10-30T17:09:16.792Z · LW(p) · GW(p)

There are some fundamentally incorrect assumptions that have become gospel.

So go ahead and point them out. My guess is that in the ensuing debate it will be found that 1/4 of them are indeed fundamentally incorrect assumptions, 1/4 of them are arguably correct, and 1/2 of them are not really "assumptions that have become gospel". But until you provide your list, there is no way to know.

comment by Paul Crowley (ciphergoth) · 2010-10-30T16:41:11.129Z · LW(p) · GW(p)

Multiple links are not an answer - to be what Goertzel was looking for it has to be a single link that sets out this position.

comment by timtyler · 2010-10-30T13:24:54.080Z · LW(p) · GW(p)

Yudkowsky calls it "The default case" - e.g. here:

The default case of FOOM is an unFriendly AI, built by researchers with shallow insights. This AI becomes able to improve itself in a haphazard way, makes various changes that are net improvements but may introduce value drift, and then gets smart enough to do guaranteed self-improvement, at which point its values freeze (forever).

...however, it is not terribly clear what being "the default case" is actually supposed to mean.

comment by DSimon · 2010-10-30T15:08:00.660Z · LW(p) · GW(p)

Seems plausible to interpret "default case" as meaning "the case that will most probably occur unless steps are specifically taken to avoid it".

For example, the default case of knocking down a beehive is that you'll get stung; you avoid that default case by specifically anticipating it and taking countermeasures (i.e. wearing a bee-keeping suit).

comment by timtyler · 2010-10-30T15:17:25.573Z · LW(p) · GW(p)

So: it seems as though the "default case" of a software company shipping an application would be that it crashes, or goes into an infinite loop - since that's what happens unless steps are specifically taken to avoid it.

The term "the default case" seems to be a way of making the point without being specific enough to attract the attention of critics.

comment by pjeby · 2010-10-30T17:51:20.743Z · LW(p) · GW(p)

So: it seems as though the "default case" of a software company shipping an application would be that it crashes, or goes into an infinite loop - since that's what happens unless steps are specifically taken to avoid it.

Not quite. The "default case" of a software company shipping an application is that there will definitely be bugs in the parts of the software they have not specifically and sufficiently tested... where "bugs" can mean anything from crashes or loops, to data corruption.

The analogy here -- and it's so direct and obvious a relationship that it's a stretch to even call it an analogy! -- is that if you haven't specifically tested your self-improving AGI for it, there are likely to be bugs in the "not killing us all" parts.

I repeat: we already know that untested scenarios nearly always have bugs, because human beings are bad at predicting what complex programs will do, outside of the specific scenarios they've envisioned.

And we are spectacularly bad at this, even for crap like accounting software. It is hubris verging on sheer insanity to assume that humans will be able to (by default) write a self-improving AGI that has to be bug-free from the moment it is first run.

comment by timtyler · 2010-10-30T18:45:06.482Z · LW(p) · GW(p)

The idea that a self-improving AGI has to be bug-free from the moment it is first run seems like part of the "syndrome" to me. Can the machine fix its own bugs? What about a "controlled ascent"? etc.

comment by pjeby · 2010-10-30T20:10:34.978Z · LW(p) · GW(p)

Can the machine fix its own bugs?

How do you plan to fix the bugs in its bug-fixing ability, before the bug-fixing ability is applied to fixing bugs in the "don't kill everyone" routine? ;-)

More to the point, how do you know that you and the machine have the same definition of "bug"? That seems to me like the fundamental danger of self-improving AGI: if you don't agree with it on what counts as a "bug", then you're screwed.

(Relevant SF example: a short story in which the AI ship -- also the story's narrator -- explains how she corrected her creator's all-too-human error: he said their goal was to reach the stars, and yet for some reason, he set their course to land on a planet. Silly human!)

What about a "controlled ascent"?

How would that be the default case, if you're explicitly taking precautions?

comment by Jordan · 2010-11-01T21:15:03.550Z · LW(p) · GW(p)

What about a "controlled ascent"?

How would that be the default case, if you're explicitly taking precautions?

Controlled ascent isn't the default case, but it certainly should be what provably friendly AI is weighed against.

comment by timtyler · 2010-10-30T21:17:26.771Z · LW(p) · GW(p)

It seems as though you don't have any references for the supposed "hubris verging on sheer insanity". Maybe people didn't think that in the first place.

Computers regularly detect and fix bugs today - e.g. check out Eclipse.

I never claimed "controlled ascent" as being "the default case". In fact I am here criticising "the default case" as weasel wording.

comment by Kingreaper · 2010-10-31T11:20:47.497Z · LW(p) · GW(p)

If it has a bug in its utility function, it won't want to fix it.

If it has a bug in its bug-detection-and-fixing techniques, you can guess what happens.

So, no, you can't rely on the AGI to fix itself, unless you're certain that the bugs are localised in regions that will be fixed.

comment by timtyler · 2010-10-31T12:08:21.486Z · LW(p) · GW(p)

So: bug-free is not needed - and a controlled ascent is possible.

The unreferenced "hubris verging on sheer insanity" assumption seems like a straw man - nobody assumed that in the first place.

comment by Emile · 2010-10-30T18:49:41.465Z · LW(p) · GW(p)

So: it seems as though the "default case" of a software company shipping an application would be that it crashes, or goes into an infinite loop - since that's what happens unless steps are specifically taken to avoid it.

The default case for a lot of shipped applications isn't to do what they were designed to do, i.e. satisfy the target customer's needs. Even when you ignore the bugs, often the target customer doesn't understand how it works, or it's missing a few key features, or its interface is clunky, or no-one actually needs it, or it's made confusing with too many features nobody cares about, etc. - a lot of applications (and websites) suck, or at least, the first released version does.

We don't always see that extent because the set of software we use is heavily biased towards the "actually usable" subset, for obvious reasons.

For example, see the debate tools that have been discussed here and are never used by anybody for real debate.

comment by DSimon · 2010-10-30T15:19:38.217Z · LW(p) · GW(p)

I think your analogy is apt. It's a similar argument for FAI; just as a software company should not ship a product without first running it through some basic tests to make sure it doesn't crash, so an AI developer should not turn on their (edit: potentially-FOOMing) AI unless they're first sure it is Friendly.

comment by timtyler · 2010-10-30T15:51:34.871Z · LW(p) · GW(p)

Well, I hope you see what I mean.

If the "default case" is that your next operating system upgrade will crash your computer or loop forever, then maybe you have something to worry about - and you should probably do an extensive backup, with this special backup software I am selling.

comment by DSimon · 2010-10-30T18:19:45.251Z · LW(p) · GW(p)

If the "default case" is that your next operating system upgrade will crash your computer or loop forever...

It would certainly be the default case for untested operating system upgrades. Whenever I write a program, even a small program, it usually doesn't work the first time I run it; there's some mistake I made and have to go back and fix. I would never ship software that I hadn't at least run on my own to make sure it does what it's supposed to.

The problem with that when it comes to AI research, according to singularitarians, is that there's no safe way to do a test run of potentially-FOOMing software; mistakes that could lead to unFriendliness have to be found in some way that doesn't involve running the code, even in a test environment.

comment by timtyler · 2010-10-30T18:47:44.877Z · LW(p) · GW(p)

That just sounds crazy to me :-( Are these people actual programmers? How did they miss out on having the importance of unit tests drilled into them?

comment by DSimon · 2010-10-30T19:54:57.576Z · LW(p) · GW(p)

The problem is that running the AI might cause it to FOOM, and that could happen even in a test environment.

comment by timtyler · 2010-10-30T21:10:09.130Z · LW(p) · GW(p)

How do you get from that observation to the idea that running a complete untested program in the wild is going to be safer than not testing it at all?

comment by DSimon · 2010-10-30T21:23:14.416Z · LW(p) · GW(p)

No, the proposed solution is to first formally validate the program against some FAI theory before doing any test runs.

comment by timtyler · 2010-10-30T21:41:05.755Z · LW(p) · GW(p)

This idea is proposed by people with little idea of the value of testing - and little knowledge of the limitations of provable correctness - I presume.

In fact, who has supposedly proposed this idea? What did they actually say?

Also, you are now talking about performing "test runs". Is that doing testing, now?

comment by DSimon · 2010-10-31T06:42:41.712Z · LW(p) · GW(p)

This idea is proposed by people with little idea of the value of testing[...]

The usefulness of testing is beside the point. The argument is that testing would be dangerous.

Also, you are now talking about performing "test runs". Is that doing testing, now?

By "testing" I meant "running the code to see if it works", which includes unit testing individual components, integration or functional testing on the program as a whole, or the simple measure of running the program and seeing if it does what it's supposed to. By "doing test runs" I meant doing either of the latter two.

I would never ship, or trust in production, a program that had only been subjected to unit tests. This poses a problem for AI researchers, because while unit testing a potentially-FOOMing AI might well be safe (and would certainly be helpful in development), testing the whole thing at once would not be.
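To make the distinction concrete, here is a minimal sketch of the two kinds of testing. The toy functions (`parse_amount`, `apply_discount`, `checkout`) are hypothetical and have nothing to do with any real AI codebase; they only illustrate that unit tests exercise components in isolation, while an integration test has to run the whole program.

```python
# Hypothetical toy program: two components plus a pipeline that combines them.

def parse_amount(text: str) -> int:
    """Component 1: turn a string such as ' 42 ' into an integer."""
    return int(text.strip())


def apply_discount(amount: int, percent: int) -> int:
    """Component 2: reduce amount by a whole-number percentage, rounding the discount down."""
    return amount - (amount * percent) // 100


def checkout(text: str, percent: int) -> int:
    """Whole program: parse the input, then apply the discount."""
    return apply_discount(parse_amount(text), percent)


# Unit tests: each component is run on its own, never the full pipeline.
assert parse_amount(" 42 ") == 42
assert apply_discount(200, 10) == 180

# Integration test: the only way to check the pipeline is to run all of it.
assert checkout(" 200 ", 10) == 180
```

The first two assertions never execute `checkout`; the last one cannot avoid it. That last step is the one the argument says has no safe equivalent for a potentially-FOOMing AI.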

In fact, who has supposedly proposed this idea? What did they actually say?

I think EY's the original person behind a lot of this, but now the main visible proponents seem to be SIAI. Here's a link to the big ol' document they wrote about FAI.

On the specific issue of having to formally prove friendliness before launching an AI, I can't find anything specific in there at the moment. Perhaps that notion came from elsewhere? I'm not sure; but, it seems straightforward to me from the premises of the argument (AGI might FOOM, we want to make sure it FOOMs into something Friendly, we cannot risk running the AGI unless we know it will) that you'd have to have some way of showing that an AGI codebase is Friendly without running it, and the only other way I can think of would be to apply a rigorous proof.

comment by timtyler · 2010-10-31T08:57:43.173Z · LW(p) · GW(p)

The argument is that testing would be dangerous.

Life is dangerous: the issue is surely whether testing is more dangerous than not testing.

It seems to me that a likely outcome of pursuing a strategy involving searching for a proof is that - while you are searching for it - some other team makes a machine intelligence that works - and suddenly whether your machine is "friendly" - or not - becomes totally irrelevant.

I think bashing testing makes no sense. People are interested in proving what they can about machines - in the hope of making them more reliable - but that is not the same as not doing testing.

The idea that we can make an intelligent machine - but are incapable of constructing a test harness capable of restraining it - seems like a fallacy to me.

Poke into these beliefs, and people will soon refer you to the AI-box experiment - which purports to explain that restrained intelligent machines can trick human gate keepers.

...but so what? You don't imprison a super-intelligent agent - and then give the key to a single human and let them chat with the machine!

comment by Kingreaper · 2010-10-31T11:12:47.971Z · LW(p) · GW(p)

The "default case" occurs when not specifically avoided.

The company making the OS upgrade is going to do their best to avoid the computers it's installed on crashing. In fact, they'll probably hire quality control experts to make certain of it.

Why should AGI not have quality control?

comment by PeterisP · 2010-11-01T21:16:10.006Z · LW(p) · GW(p)

It definitely should have quality control.

The whole point of the 'Scary idea' is that there should be an effective quality control for GAI, otherwise the risks are too big.

At the moment humanity has no idea on how to make an effective quality control - which would be some way to check if an arbitrary AI-in-a-box is Friendly.

Ergo, if a GAI is launched before the Friendly AI problem has some solutions, it means that the GAI was launched without any quality control performed. Scary. At least to me.

comment by Vladimir_Nesov · 2010-10-30T16:27:28.977Z · LW(p) · GW(p)

In any case, his argument that it may not be possible to have provable Friendliness and it makes more sense to take an incremental approach to AGI than to not do AGI until Friendliness is proven seems reasonable.

That it's impossible to find a course of action that is knowably good, is not an argument for the goodness of pursuing a course of action that isn't known to be good.

comment by Jordan · 2010-10-30T16:51:48.428Z · LW(p) · GW(p)

Certainly, but it is an argument for the goodness of pursuing a course of action that is known to have a chance of being good.

There are roughly two types of options:

1) A plan that, if successful, will yield something good with 100% certainty, but has essentially 0% chance of succeeding to begin with.

2) A plan that, if successful, may or may not be good, with a non-zero chance of success.

Clearly type 2 is a much, much larger class, and includes plans not worth pursuing. But it may include plans worth pursuing as well. If Friendly AI is as hard as everyone makes it out to be, I'm baffled that type 2 plans aren't given more exposure. Indeed, it should be the default, with reliance on a type 1 plan a fall back given more weight only with extraordinary evidence that all type 2 plans are as assuredly dangerous as FAI is impossible.
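A rough way to put numbers on the comparison, as a sketch only: the chance of a good outcome is the chance the plan succeeds times the chance that success is actually good. The probabilities below are made-up placeholders, not estimates of anything, and the function name is hypothetical.

```python
def p_good_outcome(p_success: float, p_good_given_success: float) -> float:
    """Chance of a good outcome = P(plan succeeds) * P(result is good | it succeeds)."""
    return p_success * p_good_given_success


# Type 1: certainly good if it works, but almost no chance of working.
# (1e-6 is an illustrative stand-in for "essentially 0%".)
type1 = p_good_outcome(p_success=1e-6, p_good_given_success=1.0)

# Type 2: a real chance of working, but goodness is uncertain.
# (0.3 and 0.1 are illustrative assumptions.)
type2 = p_good_outcome(p_success=0.3, p_good_given_success=0.1)

print(type1, type2)
print("type 2 better" if type2 > type1 else "type 1 better")
```

With these placeholder values type 2 wins easily. The conclusion reverses if the second factor for type 2 is itself close to zero, so everything turns on how small the target of good outcomes is taken to be.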

comment by jimmy · 2010-10-30T21:21:02.952Z · LW(p) · GW(p)

The argument isn't that we should throw away good plans because there's some small chance of it being bad even if successful.

The argument is that the target is small enough that anything but a proof still leaves you with a ~0% chance of getting a good outcome.

comment by Vladimir_Nesov · 2010-10-30T17:03:01.114Z · LW(p) · GW(p)

(1) In any case, his argument that it may not be possible to have provable Friendliness and it makes more sense to take an incremental approach to AGI than to not do AGI until Friendliness is proven seems reasonable.

That it's impossible to find a course of action that is knowably good, is not an argument for the goodness of pursuing a course of action that isn't known to be good.

Certainly, but it is an argument for
(2) the goodness of pursuing a course of action that is known to have a chance of being good.

You point out a correct statement (2) for which the incorrect argument (1) apparently argues. This doesn't argue for correctness of the argument (1).

(A course of action that is known to have a chance of being good is already known to be good, in proportion to that chance (unless it's also known to have a sufficient chance of being sufficiently bad). For AI to be Friendly doesn't require absolute certainty in its goodness, but beware the fallacy of gray.)

comment by Vladimir_Nesov · 2010-10-30T12:14:53.311Z · LW(p) · GW(p)

Although Goertzel is no longer on the Team page of the SIAI site, his profile on the Advisors page states that

Ben Goertzel, Ph.D., is SIAI Director of Research, responsible for overseeing the direction of the Institute's research division.

I assume this is an oversight, left unchanged from before. (Edit: Fixed!)

Also, on the Research areas page, areas 4 "Customization of Existing Open-Source Projects" and 6 "AGI Evaluation Mechanisms" are distinctly of an AGI-without-FAI nature, from Goertzel's project.

comment by JenniferRM · 2010-11-01T06:05:09.917Z · LW(p) · GW(p)

Goertzel's article seems basically reasonable to me. There were some mis-statements that I can excuse at the very end, because by that point part of his argument was that certain kinds of hyperbole came up over and over and his text was mimicking the form of the hyperbolic arguments even as it criticized them. The grandmother line and IQ-obsessed aliens spring to mind :-P

Given his summary of the "Scary AGI Thesis"...

If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probably this will lead to a superhuman AGI system with an architecture drawn from the vast majority of mind-architectures that are not sufficiently harmonious with the complex, fragile human value system to make humans happy and keep humans around.

...it seemed like it would make sense to track down past discussions here where our discussions may have been implicitly shaped by the thesis. Here are two articles where the issue of concrete programming projects came up, spawning interesting discussions that seemed to have the Scary Thesis as a subtext:

  • In June 2009, cousin_it wrote Let's reimplement EURISKO!, and some of the discussion got into AGI direction meta-strategy. The highest top level comment is Eliezer bringing up issues of caution.

  • In January 2010, StuartArmstrong wrote Advice for AI makers and again Eliezer brings up caution to massive approval. This one is particularly interesting because Wei_Dai has a +20 child comment off of that talking about Goertzel's company webmind... and the anthropic argument.

At the same time, in the course of searching, the "other side" also came up, which I think speaks well for the community :-)

  • Three days after the Eurisko article was posted, rwallace wrote Why safety is not safe which discussed the issue in the context of (1) historical patterns of competition versus historical patterns of politically managed non-innovation and (2) the fact that the "human trajectory" simply doesn't appear to be long term stable such that swift innovation may be the only thing that prevents a sort of "default outcome" of human extinction.

  • Of course, even earlier, Eliezer was talking about the general subject of novel research as something that can prevent or cause tragedy, as with the July 2008 article Should We Ban Physics? (although he did his normal thing with an off-handed claim that it was basically impossible to actually prevent innovation).

comment by Perplexed · 2010-10-30T13:50:54.474Z · LW(p) · GW(p)

Good article. Thx for posting. I agree with much of it, but ...

Goertzel writes:

I do see a real risk that, if we proceed in the manner I'm advocating, some nasty people will take the early-stage AGIs and either use them for bad ends, or proceed to hastily create a superhuman AGI that then does bad things of its own volition. These are real risks that must be thought about hard, and protected against as necessary. But they are different from the Scary Idea.

Is this really different from the Scary Idea?

I've always thought of this as part of the Scary Idea, in fact, the reason the Scary Idea is scary - scarier than nuclear weapons. Because when mankind reaches the abyss, and looks with dismay at the prospect that lies ahead, we all know that there will be at least one idiot among us who doesn't draw back from the abyss, but instead continues forward down the slippery slope.

At the nuclear abyss, that idiot will probably kill a few hundred million of us. No big deal. But at the uFAI abyss, we may have ourselves a serious problem.

comment by TheOtherDave · 2010-10-30T15:43:18.496Z · LW(p) · GW(p)

It seems different to me.

If I believe "X is incredibly useful but someone might use it to destroy the world," I can conclude that I should build X and take care to police the sorts of people who get to use it. But if I believe "X is incredibly useful but its very existence might spontaneously destroy the world" then that strategy won't work... it doesn't matter who uses it. Maybe there's another way, or maybe I just shouldn't build X, but regardless of the solution it's a different problem.

It's like the difference between believing that nuclear weapons might some day be directed by humans to overthrow civilization, and believing that a nuclear reaction will cause all of the Earth's atmosphere to spontaneously ignite. In the first case, we can attempt to control nuclear weapons. In the second case, we must prevent nuclear reactions from ever starting.

Just to be clear: I'm not championing a position here on what sort of threat AGIs pose. I'm just saying that these are genuinely different threat models.

comment by timtyler · 2010-10-30T14:50:27.158Z · LW(p) · GW(p)

The "uFAI abyss"? Does that have something to do with the possibility of a small group of "idiots" - who were nonetheless smart enough to beat everyone else to machine intelligence - overthrowing the world's governments?

comment by timtyler · 2010-10-31T12:18:46.798Z · LW(p) · GW(p)

Having such beliefs with absolute certainty is incorrect, we don't have sufficient understanding for that, but weak beliefs multiplied by astronomical value lead to the same drastic actions, whose cost-benefit analysis doesn't take notice of small inconveniences such as being perceived to be crazy.

The unabomber performed some "drastic actions". I expect he didn't mind if he was "perceived to be crazy" by others - although he didn't want to plead insanity.

comment by XiXiDu · 2010-10-31T10:30:46.752Z · LW(p) · GW(p)

Does astronomical value outweigh astronomically low probability? You can come up with all kinds of scenarios that bear astronomical value, an astronomical number of scenarios if you allow for astronomically low probability. Isn't this betting on infinity?
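A toy calculation shows what "betting on infinity" can look like. This is only an illustrative sketch under a made-up assumption: scenario k has probability 2^-k and payoff 4^k, so the payoff grows faster than the probability shrinks. Nothing here models any actual risk estimate, and the function name is hypothetical.

```python
from fractions import Fraction


def partial_expected_value(n: int) -> Fraction:
    """Sum of probability * value over scenarios k = 1..n,
    assuming P(k) = 1/2**k and value(k) = 4**k (illustrative only)."""
    return sum((Fraction(1, 2**k) * 4**k for k in range(1, n + 1)), Fraction(0))


# Each term equals 2**k, so the running total keeps growing
# even though the individual probabilities become tiny.
for n in (1, 5, 10, 20):
    print(n, partial_expected_value(n))
```

The probabilities sum to at most 1, yet the expected value has no upper bound: the n-th partial sum is 2^(n+1) - 2. Whether real scenarios behave like this depends entirely on how fast their value grows relative to how fast their probability falls.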

comment by XiXiDu · 2010-10-30T09:50:17.264Z · LW(p) · GW(p)

Thanks for the original pointer goes to Kevin.

Key points, some of which I already mentioned in the post Should I believe what the SIAI claims?:

Yes, you may argue: the Scary Idea hasn't been rigorously shown to be true… but what if it IS true?

OK but ... pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

The Scary Idea is certainly something to keep in mind, but there are also many other risks to keep in mind, some much more definite and palpable.

[...]

Also, there are always possibilities like: the alien race that is watching us and waiting for us to achieve an IQ of 333, at which point it will swoop down upon us and eat us, or merge with us. We can't rule this out via any formal proof, and we can't meaningfully estimate the odds of it either. Yes, this sounds science-fictional and outlandish; but is it really more outlandish and speculative than the Scary Idea?

My comment from the discussion post:

Should I believe what the SIAI claims? I'm still not sure, although I learnt some things since that post. But what I know is how seriously the people here take this stuff. Also read the comments on this post for how people associated with LW overreact to completely harmless AI research.

The issues with potential risks posed by unfriendly AI are numerous. The only organisation that takes those issues seriously is the SIAI, as its name already implies. But I believe most people simply don't see a difference between the SIAI and one or a few highly intelligent people telling them that a particle collider could destroy the world while all experts working directly on it claim there's no risk. Now I think I understand the argument that if the whole world is at stake it does outweigh the low probability of the event. But does it? I think it is completely justified to have at least one organisation working on FAI, but is the risk as serious as portrayed and perceived within the SIAI? Right now if I had to hazard a guess I'd say that it will probably be a gradual development of many exponential growth phases. That is, we'll have this conceptual revolution and optimize it very rapidly. Then the next revolution will be necessary. Sure, I might be wrong there, as the plateau argument of self-improvement recursion might hold. But even if that is true, I think we'll need at least two paradigm-shattering conceptual revolutions before we get there. But what does that mean though? How quickly can such revolutions happen? I'm guessing that this could take a long time, if it isn't completely impossible. That is, if we are not the equivalent of a Universal Turing Machine of abstract reasoning. Just imagine we are merely better chimps. Maybe it doesn't matter if a billion humans do science for a million years, we won't come up with the AI equivalent of Shakespeare's plays. This would mean that we are doomed to evolve slowly, to tweak ourselves incrementally into a posthuman state. Yet, there are also other possibilities, that AGI might for example be a gradual development over many centuries. Human intelligence might turn out to be close to the maximum.

There is so much we do not know yet (http://bit.ly/ckeQo6). Take for example a constrained, well-understood domain like Go. AI still performs awfully at Go. Or take P vs. NP:

P vs. NP is an absolutely enormous problem, and one way of seeing that is that there are already vastly, vastly easier questions that would be implied by P not equal to NP but that we already don’t know how to answer. So basically, if someone is claiming to prove P not equal to NP, then they’re sort of jumping 20 or 30 nontrivial steps beyond what we know today. [...] We have very strong reasons to believe that these problems cannot be solved without major — enormous — advances in human knowledge. [...] So in order to prove such a thing, a prerequisite to it is to understand the space of all possible efficient algorithms. That is an unbelievably tall order. So the expectation is that on the way to proving such a thing, we’re going to learn an enormous amount about efficient algorithms, beyond what we already know, and very, very likely discover new algorithms that will likely have applications that we can’t even foresee right now. (http://web.mit.edu/newsoffice/2010/3q-pnp.html).

But that is just my highly uneducated guess which I never seriously contemplated. I believe that for most academics the problem here is mainly about the missing proof of concept. Missing evidence. They are not the kind of people who wait before testing the first nuke because it might ignite the atmosphere. If there's no good evidence, a position supported by years' worth of disjunctive lines of reasoning won't convince them either.

The paperclip maximizer (http://wiki.lesswrong.com/wiki/Paperclip_maximizer) scenario needs serious consideration. But given what needs to be done, what insights may be necessary to create something creative that is effective in the real world, it's hard to believe that this is a serious risk. It's similar with the kind of grey goo scenario that nanotechnology might hold. It will likely be a gradual development that once it becomes sophisticated enough to pose a serious risk is also understood and controlled by countermeasures.

I also wonder why we don't see any alien paperclip maximizers out there? If there are any in the observable universe our FAI will lose anyway since it is far behind in its development.

I suppose the actual risk could be taking a mere idea too seriously.

comment by timtyler · 2010-10-30T10:53:50.052Z · LW(p) · GW(p)

It will likely be a gradual development that once it becomes sophisticated enough to pose a serious risk is also understood and controlled by countermeasures.

Indeed. Companies illustrate this. They are huge, superhuman powerful entities too.

comment by mwaser · 2010-10-30T13:55:18.163Z · LW(p) · GW(p)

A major upvote for this. The SIAI should create a sister organization to publicize the logical (and exceptionally dangerous) conclusion to the course that corporations are currently on. We have created powerful, superhuman entities with the sole top-level goal (required by LAW in for-profit corporations) of "Optimize money acquisition and retention". My personal and professional opinion is that this is a far more immediate (and greater) risk than UnFriendly AI.

comment by timtyler · 2010-10-30T14:24:48.647Z · LW(p) · GW(p)

Companies are probably the number 1 bet for the type of organisation most likely to produce machine intelligence - with number 2 being governments. So, there's a good chance that early machine intelligences will be embedded into the infrastructure of companies. So, these issues are probably linked.

Money is the nearest global equivalent of "utility". Law-abiding maximisation of it does not seem unreasonable. There are some problems where it is difficult to measure and price things, though.

comment by soreff · 2010-10-31T04:49:39.464Z · LW(p) · GW(p)

Money is the nearest global equivalent of "utility". Law-abiding maximisation of it does not seem unreasonable.

On the other hand, maximization of money, including accurate terms for expected financial costs of legal penalties, can cause remarkably unreasonable behavior. As was repeated recently, "It's hard for the idea of an agent with different terminal values to really sink in", in particular "something that could result in powerful minds that actually don't care about morality". A business that actually behaved as a pure profit maximizer would be such an entity.

comment by timtyler · 2010-10-31T07:44:55.355Z · LW(p) · GW(p)

Morality is represented by legal constraints. That results in a "negative" morality, and - arguably - not a very good one.

Fortunately companies are also subject to many of the same forces that produce cooperation and niceness in the rest of biology - including reputations, reciprocal altruism and kin selection.

comment by XiXiDu · 2010-10-30T11:58:04.730Z · LW(p) · GW(p)

Algorithmic trading is indeed an example of the kind of risks posed by complicated (unmanageable) systems, but it also shows that we evolve our security measures with each small-scale catastrophe. There is no example of an existential risk from true runaway technological development yet, although many people believe there are such risks, e.g. nuclear weapons. Unstoppable recursive self-improvement is just a hypothesis that you shouldn't take as a foundation for a whole lot of further inductions.

Dispelling Stupid Myths About Nuclear War

An all-out nuclear war between Russia and the United States would be the worst catastrophe in history, a tragedy so huge it is difficult to comprehend. Even so, it would be far from the end of human life on earth. The dangers from nuclear weapons have been distorted and exaggerated, for varied reasons. These exaggerations have become demoralizing myths, believed by millions of Americans.

comment by hairyfigment · 2010-11-01T18:33:54.261Z · LW(p) · GW(p)

Apparently I don't understand what you mean by "serious risk". (Before I pick this apart, by the way, I agree that we should try not to Godwin people -- because I think it doesn't work.)

I consider it likely that AGI will take a long time to develop. A rational species would likely figure out the flaw and take corrective steps by then. But look around you. Nearly all of us seem to agree, if you look at what we actually want according to our actions, that we should try to prevent an asteroid strike that might destroy humanity. As far as I can tell we haven't started yet. No doubt you can think of other examples: the evidence says that if we put off FAI theory 'until we need it', we could easily put it off longer than that.

comment by timtyler · 2010-10-30T10:49:19.267Z · LW(p) · GW(p)

If there are any in the observable universe our FAI will lose anyway since it is far behind in its development.

If there's only one, we could perhaps run away from it at near the speed of light - while developing technologically - and building our strength.

comment by XiXiDu · 2010-10-30T11:41:04.258Z · LW(p) · GW(p)

We'll never be able to make up for it. But I wouldn't worry about it anyway, if there was just one hard take-off in the visible universe we should be able to detect it (soon) as it would have transformed entire galaxies and super-clusters if you believe current ideas (even without a hard take-off I guess). But there seems to be nothing, so either technological life is rare (once in the visible universe) or there are other more important risks than paperclipping.

comment by NancyLebovitz · 2010-10-30T12:56:01.393Z · LW(p) · GW(p)

How would a hard-takeoff galaxy look different from ordinary galaxies?

comment by XiXiDu · 2010-10-30T13:27:20.848Z · LW(p) · GW(p)

Very dim? Or very unusual, given paperclipping?

Given that Burning the Cosmic Commons is already portrayed as a risk from uFAIs that are paperclipping, I don't think it is unreasonable to ask why we don't see any effects of such outcomes in the visible universe if the risk of such events is >70% for technological civilisations. I'm just looking for empirical evidence here that might support the conclusions. Of course, it might take one or two decades before we are capable of detecting such anomalies. Still, given the age of the universe you'd expect to see some artifacts if your premise is that technological civilizations haven't evolved only once in the visible universe. If there are other reasons, then as I said before there might be other risks more important than uFAI to explain the Fermi paradox, which would be an important observation too.

comment by timtyler · 2010-10-30T13:33:37.905Z · LW(p) · GW(p)

I don't think you could see any of those things very easily in other galaxies if they were there. What you can see in other galaxies is chemical compounds - through mass spectroscopy. That shows some of them are rich in "life-like" stuff. Also, there is no shortage of darkish matter out there.

comment by timtyler · 2010-10-30T12:04:25.584Z · LW(p) · GW(p)

The Carina Nebula and the Orion Nebula may well be full of advanced life. We don't really know what that looks like yet, though.

comment by Louie · 2010-10-31T15:04:59.628Z · LW(p) · GW(p)

Check out SIAI's publications page. Kaj's most recent paper (published at ECAP '10) is a good 2 page summary of why AGI can be an x-risk for anyone who is uninformed of SIAI's position:

"From mostly harmless to civilization-threatening: pathways to dangerous artificial general intelligences"

comment by XiXiDu · 2010-10-31T17:11:52.664Z · LW(p) · GW(p)

A recent paper showed that 'Striatal Volume Predicts Level of Video Game Skill Acquisition'. A valid inference would be that an AGI with the computational equivalent of a higher striatal volume would possess superior cognitive flexibility, at least when it comes to gaming. But what could it accomplish? I'm playing a game called Trackmania, an arcade racing game. The top players are so close to the ideal line, and therefore the fastest time, that a superhuman AI could indeed beat them, but only by a few milliseconds. Each millisecond less might demand an order of magnitude more skill, but that doesn't matter. First of all, there is an absolute limit. Secondly, it doesn't provide a serious advantage. And that may very well be the case with physics too. There is no guarantee that faster thinking or increased working memory capacity will ever yield anything genuine without a lot of dumb luck, if at all. It is unlikely that a superhuman AI would come up with faster-than-light propulsion or that it would disprove Gödel's incompleteness theorems.

Of course, we should be careful. And it is absolutely justified that an organisation like the SIAI gets money to do research on those questions. But there is not enough evidence to outweigh the doubt as to impede AI research. We will actually need research of real AGI to answer some of the open questions.

Regarding self-improvement I'm very doubtful too. The human indecision and fuzziness of thinking might very well be a feature. A superhuman AI might very well beat us at Go or the stock exchange, as long as it deals with its own kind and not the irrational agents that we are, but that doesn't mean it will be able to deal with natural problems orders of magnitude more efficiently than we do.

Most of the risks from superhuman AI are associated with advanced nanotechnology. Without it, it will be impotent. Can it solve it, if it is possible at all? Can it implement its results if it can solve it, if it is possible? Because without it, self-improvement will be very hard. What will be even harder is creating copies of it without first building the necessary infrastructure for the computational substrates.

Could an AGI take over the Internet? This is very unlikely. There are spare resources, but not that much. You can't expect that it would even be suitable as a computational substrate. And how is it going to make use of it before crude measures are taken to shut it down? Many open questions, much speculation.

Paperclipping is another very speculative idea. Is a superhuman artificial general intelligence possible that is mistakenly equipped with the incentive to turn the universe into paperclips? I guess it is possible, but not without hard-coding this incentive deliberately and with great care.

comment by mwaser · 2010-10-31T19:18:17.131Z · LW(p) · GW(p)

Kaj's paper relies very heavily on Omohundro's paper from AGI '08. Check out the reply that I presented/published at BICA '08 which (among other things) summarizes why the assumptions that Kaj relies upon are probably incorrect:

Discovering the Foundations of a Universal System of Ethics

comment by Perplexed · 2010-10-31T22:41:57.141Z · LW(p) · GW(p)

Two things surprised me in your argument. One is that you seemed to assume that features of human ethics (which you attribute to our having evolved as social animals) would be universal in the sense that they would also apply to AIs which did not evolve and which aren't necessarily social.

The second is that although you pay lip service to game theory, you don't seem to be aware of any game theoretic research on ethics deeper than Axelrod (1984) and the Tit-for-Tat experiments. You ought to at least peruse Binmore's "Natural Justice", even if you don't want to plow through the two volumes of "Game Theory and the Social Contract".

comment by mwaser · 2010-11-01T00:09:25.415Z · LW(p) · GW(p)

Being social is advantageous to any entity without terminal goals and advantageous to entities with terminal goals in most cases (primary exceptions being single goal entities, entities on the verge of achieving all of their terminal goals, and entities that are somehow guaranteed that they are and will remain far, far more powerful than everyone else). Humans evolved to be social because social was advantageous. A super-intelligent but non-evolved AGI will figure out that social is advantageous as well (except, obviously, in the very limited edge cases mentioned above).

Not quoting more research is not the same as being unaware of that research. I've read Binmore -- but how can I successfully bring it up when I can't even get acceptance of Axelrod? It's like trying to teach multiplication while addition is still a problem. I really should read GT&tSC. It's been on my reading list since I've tasked myself with writing something in response to Rawls' corpus. I just haven't gotten around to it.

I have presented further works on the same subject at BICA '09 and AGI '10 (with a really fun second presentation at AGI '10 here) but haven't advanced the game theory portion at all (unfortunately). My focus has recently shifted radically though and going back to game theory could help that tremendously. Thanks.

comment by Perplexed · 2010-11-01T00:44:12.835Z · LW(p) · GW(p)

Being social is advantageous to any entity without terminal goals [your emphasis]

I can't accept this. Many animals are not social, or are social only to the extent of practicing parental care.

A super-intelligent but non-evolved AGI will figure out that social is advantageous as well.

Only if it is actually advantageous to them (it?). Your claim would be much more convincing if you could provide examples of what AIs might gain by social interaction with humans, and why the AI could not achieve the same benefits with less risk and effort by exterminating or enslaving us. Without such examples, your bare assertions are completely unconvincing.

Please note that as humans evolved to their current elevated moral plane they occasionally found extermination and enslavement to be more tempting solutions to their social problems than reciprocity. In fact, enslavement is a form of reciprocity - it is one possible solution to a bargaining problem as in Nash (1953): a solution in which one bargainer has access to much better threats than the other.
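To make the role of threats concrete, here is a minimal sketch of the Nash bargaining solution for splitting a pie of size 1 between two bargainers with linear utility. The function name `nash_split`, the pie size, and the threat payoffs are illustrative assumptions rather than anything from Nash's paper, and the sketch uses fixed disagreement payoffs instead of the variable-threat game of the 1953 paper. The solution maximizes the product of each side's gain over its disagreement payoff, so the bargainer who loses little if talks fail captures most of the pie.

```python
def nash_split(d1, d2, pie=1.0):
    """Nash bargaining split of `pie` given disagreement payoffs d1 and d2.

    Maximizes (x - d1) * (pie - x - d2) over x. Setting the derivative
    to zero gives x = (pie + d1 - d2) / 2. Assumes linear utility and
    d1 + d2 <= pie (otherwise agreeing cannot help both sides).
    """
    if d1 + d2 > pie:
        raise ValueError("no agreement beats the disagreement point")
    x = (pie + d1 - d2) / 2
    return x, pie - x

# Symmetric threats: the pie is split evenly.
equal = nash_split(0.0, 0.0)

# Hypothetical lopsided threats: bargainer 2 loses 0.8 if talks fail,
# while bargainer 1 loses nothing.
lopsided = nash_split(0.0, -0.8)

print(equal, lopsided)
```

Under these made-up numbers the bargainer with the better threat takes roughly nine tenths of the pie, which is the sense in which a very unequal outcome can still be a "solution" to the bargaining problem.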

comment by mwaser · 2010-11-01T01:46:24.083Z · LW(p) · GW(p)

Many animals are not smart enough to be social. We are talking about a super-intelligent AGI here. I've given several presentations at conferences including AGI-09 with J Storrs Hall showing that animals are sociable to the extent that their cognitive apparatus can support it (not to mention the incredible socialness of bees, termites, etc.)

What do we gain from social interactions with dogs? Do we honestly suffer no losses when we mindlessly trash the rain forests? Examples to support my "bare assumptions" are EASY to come by (but thanks for asking -- I just wish that other people here would give me examples when I ask).

Enslavement is an excellent short-term solution; however, in the long-term, it is virtually always a net negative to the system as a whole (i.e. it is selfish and stupid when viewed from the longest term -- and moreso the more organized and inter-related the system is). Once again, we are talking about a super-intelligence, not short-sighted, stupid humans (who are, nonetheless, inarguably getting better and better with time).

comment by Perplexed · 2010-11-01T02:10:26.676Z · LW(p) · GW(p)

What do we gain from social interactions with dogs? ... Examples to support my "bare assumptions" are EASY to come by.

Happy to hear that. Because it then becomes reasonable to assume that you would not find it burdensome to share those examples. We are talking about benefits that a powerful AI would derive from social interactions with humans.

I hope you have something more than the implied analogy of the human-canine relationship. Because there are many other species just as intelligent as dogs with which we humans do not share quite so reciprocal a relationship. And, perhaps it is just my pride, but I don't really think that I would appreciate being treated like a dog by my AI master. ETA: And I don't know of any human dog-lovers who keep 6 billion pets.

comment by mwaser · 2010-11-01T02:23:34.726Z · LW(p) · GW(p)

Because there are many other species just as intelligent as dogs with which we humans do not share quite so reciprocal a relationship.

Absolutely. Because dogs cooperate with us and we with them and the other species don't.

And, perhaps it is just my pride, but I don't really think that I would appreciate being treated like a dog by my AI master.

And immediately the human prejudice comes out. We have terrible behavior when we're on the top of the pile and expect others to have it as well. It's almost exactly the same as when people complain bitterly when they're oppressed and then, when they are on top, they oppress others even worse.

What is wrong with the human-canine analogy (which I thought I did more than imply) is the baggage that you are bringing to that relationship. Both parties benefit from the relationship. The dog benefits less from that relationship than you would benefit from an AGI relationship because the dog is less competent and intelligent than you are AND because the dog generally likes the treatment that it receives (whereas you would be unhappy with similar treatment).

Dogs are THE BEST analogy because they are the closest existing example to what most people are willing to concede is likely to be our relationship with a super-advanced AGI.

Oh, and dogs don't really have a clue as to what they do for us, so why do you expect me to be able to come up with what we will do for an advanced AGI? If we're willing to cooperate, there will be plenty for us to do of value that will fulfill our goals as well. We just have to avoid being too paranoid and short-sighted to see it.

comment by shokwave · 2010-11-01T07:25:46.615Z · LW(p) · GW(p)

The scale is all out.

earthworm --three orders of magnitude--> small lizard --three orders of magnitude--> dog --three orders of magnitude--> human --thirty orders of magnitude--> weakly superhuman AGI --several thousand orders of magnitude--> strong AI

If a recursively self-improving process stopped just far enough above us to consider us pets and did so, I would seriously question whether it was genuinely recursive, or if it was just gains from debugging and streamlining human thought processes. I.e., I could see a self-modifying transhuman acting in the manner you describe. But not an artificial intelligence, not unless it was very carefully designed.

comment by Jonathan_Graehl · 2010-11-01T06:37:34.057Z · LW(p) · GW(p)

Stop wasting our time.

comment by timtyler · 2010-11-01T08:16:34.530Z · LW(p) · GW(p)

Being social is advantageous to any entity without terminal goals

Hmm. What do you mean by an "entity without terminal goals"? Would a rock qualify?

comment by mwaser · 2010-11-01T11:25:15.649Z · LW(p) · GW(p)

No. A rock is not an entity.

comment by timtyler · 2010-11-01T22:47:39.320Z · LW(p) · GW(p)

Right. Many people here use the term "terminal" in the following sense (in this kind of context):

http://en.wikipedia.org/wiki/Terminal_value_(philosophy))

However, interpreting your comment in the light of such a definition apparently makes little sense.

So - presumably you meant something else - but what?

comment by wedrifid · 2010-11-01T22:56:22.124Z · LW(p) · GW(p)

link)

[link](http://en.wikipedia.org/wiki/Terminal_value_(philosophy\))

(Escaping closing parenthesis to ensure the link syntax is not prematurely closed.)
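The same fix can be automated. The sketch below is a hypothetical helper (the name `markdown_link` is an assumption, not part of any library) that backslash-escapes parentheses in a URL before wrapping it in inline link syntax. It escapes the opening parenthesis as well as the closing one, which is harmless in most Markdown dialects and avoids leaving an unbalanced parenthesis in the destination.

```python
def markdown_link(text: str, url: str) -> str:
    """Build an inline Markdown link, backslash-escaping parentheses in the URL."""
    escaped = url.replace("(", "\\(").replace(")", "\\)")
    return f"[{text}]({escaped})"

link = markdown_link(
    "link", "http://en.wikipedia.org/wiki/Terminal_value_(philosophy)"
)
print(link)
```

Percent-encoding the parentheses as `%28` and `%29` is an alternative when a renderer does not honor backslash escapes.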

comment by pjeby · 2010-11-01T01:03:53.755Z · LW(p) · GW(p)

entities that are somehow guaranteed that they are and will remain far, far more powerful than everyone else

And you don't think a self-improving AI will ever fall into this category? Hell, if you gave a human the ability to run billions of simulations per second to study how their decisions would turn out, they'd be able to take over the world and "remain far, far more powerful" than everyone else. (If they were actually more intelligent, and not just faster, even more so.)

Your so-called "limited edge case" is the main case being discussed: superhuman intelligence. (The problem of single-goal entities is of course also discussed here; see the idea of a "paper-clip maximizer", for example.)

In short, you seem to be saying that we shouldn't worry about those "edge" cases because in all non-"edge" cases, things work out fine. That's like saying we shouldn't worry about having fire departments or constructing homes according to a fire code, because a fire is an "edge" case, and normally buildings don't burn down.

Even if you were to make such an argument, it makes little sense to propose it at a meeting of the fire council. ;-)

It may be true that mostly, fires don't happen. However, it's also true that if you don't build the buildings with fire prevention (and especially, preventing the spread of fires) in mind, then, sooner or later, your whole city burns down. Because at that point, it only takes one fire to do it.

comment by mwaser · 2010-11-01T01:23:22.254Z · LW(p) · GW(p)

entities that are somehow guaranteed that they are and will remain far, far more powerful than everyone else

And you don't think a self-improving AI will ever fall into this category?

You mean "somehow guaranteed"? No, I don't believe that a self-improving AI will ever fall into this category. It might decide to believe it -- which would be very dangerous for us -- but, no, I don't believe that it is likely to truly find such a guarantee. Further, given the VERY minimal cost (if any) of cooperating with a cooperating entity, an AI would be human-foolish to take the stupid short-sighted shortcut of trashing us for no reason -- since it certainly is an existential risk for IT that something bigger and smarter would take exception to such a diversity-decreasing act.

MORE IMPORTANTLY - you dropped the fact that the AI already has to have one flaw (terminal goals) before this second aspect could possibly become a problem.

Fire is not an "edge" case. The probability of a building catching fire in a city any given day is VERY high. But that is irrelevant because . . . .

you ALWAYS worry about edge cases. In this case, though, if you are aware of them and plan/prepare against them -- they are AVOIDABLE edge cases (more so than the city burning down even if you have fire prevention; cf. Chicago & Mrs. O'Leary's cow).

comment by pjeby · 2010-11-01T04:41:56.643Z · LW(p) · GW(p)

an AI would be human-foolish to take the stupid short-sighted shortcut of trashing us for no reason

You don't seem to understand how basic reasoning works (by LW standards). AFAICT, you are both privileging your hypothesis, and not weighing any evidence.

(Heck, you're not even stating any evidence, only relying on repeated assertion of your framing of the situation.)

You still haven't responded, for example, to my previous point about human-bacterium empathy. We don't have empathy for bacteria, in part because we see them as interchangeable and easily replaced. If for some reason we want some more E. coli, we can just culture some.

In the same way, a superhuman intelligence that anticipates a possible future use for human beings, could always just keep our DNA on file... with a modification or two to make us more pliable.

Your entire argument is based on an enormous blind spot from your genetic heritage: you think an AI would inherently see you as, well, "human", when out of the space of all possible minds, the odds of a given AI seeing you as worth bothering with are negligible at best. You simply don't see this, because your built-in machinery for imagining minds automatically imagines human minds -- even when you try to make it not do so.

Hell, the human-bacterium analogy is a perfect example: I'm using that example specifically because it's a human way of thinking, even though it's unlikely to match the utter lack of caring with which an arbitrary AGI is likely to view human beings. It's wrong to even think of it as "viewing", because that supposes a human model.

AIs are not humans, unless they're built to be humans, and the odds of them being human by accident are negligible.

Remember: evolution is happy to have elephants slowly starve to death when they get old, and to have animals that die struggling and painfully in the act of mating. Arbitrary optimization processes do not have human values.

Stop thinking "intellect" (i.e. human) and start thinking "mechanical optimization process".

[edit to add: "privileging", which somehow got eaten while writing the original comment]

comment by JGWeissman · 2010-11-01T04:49:55.120Z · LW(p) · GW(p)

you are both your hypothesis

you are both privileging your hypothesis ?

comment by wedrifid · 2010-11-01T06:08:01.311Z · LW(p) · GW(p)

Here I was assuming that PJ had integrated Descartes and Zen and was trying to understand the deep wisdom behind the koan.

The scary thing is, if I engage "extracting wisdom from koan" mode I can actually feel "you are both your hypothesis, and not weighing any evidence" fitting in neatly with actual insights that fit within PJ's area of expertise. +1 to pattern matching on noise!

comment by pjeby · 2010-11-01T18:53:49.327Z · LW(p) · GW(p)

+1 to pattern matching on noise!

Even scarier thought: suppose that what we think of as intelligence or creativity consists, in simple fact, of pattern matching on random noise? ;-)

comment by pjeby · 2010-11-01T05:48:22.528Z · LW(p) · GW(p)

you are both privileging your hypothesis ?

Yes, don't know how that got deleted, because I saw it in there shortly before posting. My copy of Firefox sometimes does odd things during text editing.

comment by wedrifid · 2010-11-01T01:32:01.461Z · LW(p) · GW(p)

Further, given the VERY minimal cost (if any) of cooperating with a cooperating entity

This premise is VERY flawed.

comment by mwaser · 2010-11-01T01:51:07.391Z · LW(p) · GW(p)

So give me some examples. Cooperation is a non-zero-sum game that continues adding utility the longer it goes on. Do you deny that this is the case?

Oh, wait, you're the same guy who, whenever asked to back up his statements, never does.

Please support me and your community by doing more than throwing cryptic opinionated darts and then refusing to elaborate. You're only wasting everyone's time and acting as a drag on the community.

comment by shokwave · 2010-11-01T07:17:16.163Z · LW(p) · GW(p)

The Prisoner's Dilemma. Classic, classic example where cooperation has a non-minimal cost - i.e., the loss you take if they defect against you, multiplied by the probability that they will defect, is the cost of cooperating.

VERY minimal cost (if any) of cooperating with a cooperating entity

And in the Prisoner's Dilemma, if you somehow specify the other entity is cooperating, then cooperating with a cooperating entity carries a cost still: the difference between "both cooperate" and "you defect against their cooperate" is the cost of cooperating there.
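A small sketch can make both costs explicit. The payoff values below (T=5, R=3, P=1, S=0) are the conventional textbook numbers, used here as an assumption; only their ordering matters. The helper names are hypothetical. The sketch treats the game as one-shot and measures the cost of cooperating as the expected payoff given up by not defecting.

```python
# Hypothetical payoffs to "me", indexed by (my_move, their_move).
# T > R > P > S is the defining ordering of a Prisoner's Dilemma.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {
    ("C", "C"): R,  # reward for mutual cooperation
    ("C", "D"): S,  # sucker's payoff
    ("D", "C"): T,  # temptation to defect
    ("D", "D"): P,  # punishment for mutual defection
}

def expected_payoff(my_move, p_defect):
    """Expected payoff of my_move when the other side defects with probability p_defect."""
    return (1 - p_defect) * PAYOFF[(my_move, "C")] + p_defect * PAYOFF[(my_move, "D")]

def cost_of_cooperating(p_defect):
    """Expected payoff given up by cooperating instead of defecting."""
    return expected_payoff("D", p_defect) - expected_payoff("C", p_defect)

# Against an entity known to cooperate, the cost is T - R.
cost_vs_known_cooperator = cost_of_cooperating(0.0)

# Against an entity that defects half the time.
cost_vs_coin_flipper = cost_of_cooperating(0.5)

print(cost_vs_known_cooperator, cost_vs_coin_flipper)
```

With these numbers the cost is positive at every probability of defection, including zero. In repeated play the picture can change, since future retaliation is not modeled here.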

throwing cryptic opinionated darts

If your argument rested on some mathematical concepts, and one of them was an equation that you derived incorrectly, and wedrifid pointed that out, would he still be throwing darts? He wasn't telling you that you were wrong because he hates you, or because he enjoys ruining people's time on this blog, or any other sadistic personality trait; he was pointing out the flaw because it was flawed.

comment by timtyler · 2010-11-01T08:21:55.160Z · LW(p) · GW(p)

Cooperation is a non-zero-sum game that continues adding utility the longer it goes on. Do you deny that this is the case?

That is sometimes referred to as "mutually beneficial" cooperation:

Co-operation or co-operative behaviours are terms used to describe behaviours by organisms which are beneficial to other organisms, and are selected for on that basis. Under this definition, altruism is a form of co-operation in which there is no direct benefit to the actor (the organism carrying out the behaviour). Co-operative behaviour in which there is a direct benefit to the actor as well as the recipient can be termed "mutually beneficial".

comment by Kaj_Sotala · 2010-11-01T16:45:45.830Z · LW(p) · GW(p)

Not really - the paper is about ways by which an AGI might become more powerful than humanity (corresponding to premise 3 in Ben's reconstructed version of the SIAI argument). You can combine it with Omohundro-like arguments, and I do briefly mention that connection in the conclusions, but the core content of the paper is an independent and separate issue from AI drives, universal ethics or any such issue.

comment by hairyfigment · 2010-10-31T21:15:20.928Z · LW(p) · GW(p)

From a quick read, it seems to rely on the assumption that a superhuman AI couldn't rely on its ability to destroy humanity.

comment by mwaser · 2010-11-01T01:30:22.922Z · LW(p) · GW(p)

HAHAHAHAHA!

No, it does not rely on the assumption that a superhuman AI couldn't rely on its ability to destroy humanity. It never even starts to make such a silly, baldly incorrect assumption.

Please don't rely on "quick reads" if you're prone to such bad misunderstandings when doing quick reads.

comment by wedrifid · 2010-11-01T01:34:45.996Z · LW(p) · GW(p)

Your comments here support hairy's reading, whether or not your other material does.

comment by mwaser · 2010-11-01T01:56:18.173Z · LW(p) · GW(p)

Could you explain how my comments (or which comments) support hairy's reading? (So I can attempt to rectify my apparently poor communication)

I firmly believe that a superhuman AI is VERY likely to be able to destroy humanity far more easily than we are able to destroy the rain forests.

I must be communicating VERY poorly if it looks like I am saying otherwise.

comment by wedrifid · 2010-11-01T02:01:41.023Z · LW(p) · GW(p)

Could you explain how my comments support hairy's reading?

It is probably better that I don't. But Pjeby's reply over there was a solid attempt at such an explanation.

comment by mwaser · 2010-11-01T02:06:08.638Z · LW(p) · GW(p)

No, no, no, no, no. "It is probably better that I don't" simply means that you CAN'T.

Looking at the history of your comments, it seems that you tend to make very brief comments supporting the echo chamber and never back them up.

Pjeby's reply was a solid question/statement but it had absolutely NOTHING to do with an AI's ability to destroy humanity.

You have given absolutely nothing to support your contention. As I've said elsewhere -- Please support me and your community by doing more than throwing cryptic opinionated darts and then refusing to elaborate. You're only wasting everyone's time and acting as a drag on the community.

comment by wedrifid · 2010-11-01T02:40:04.739Z · LW(p) · GW(p)

You have given absolutely nothing to support your contention.

My contention, if you need it to be overt, is that hairyfigment need not doubt his sanity and certainly does not deserve to be laughed at in "TROLLCAPS" or insulted childishly. I expect hairy to be able to see the relationship between his reading and Pjeby's comments regarding 'edge cases' since I can infer from his comment that he has already had the necessary insights.

comment by timtyler · 2010-10-31T21:59:04.169Z · LW(p) · GW(p)

Omohundro's paper was about The Basic AI Drives. The abstract says: "We identify a number of “drives” that will appear in sufficiently advanced AI systems of any design".

Social drives are arguably not very "basic" - since they only show up in social situations.

I'm sure such machines would also have a "drive to swim" - if immersed in water - and a "drive to escape" - if encased by crushing jaws - but these "drives" were judged not sufficiently "basic" to go into Omohundro's paper.

comment by mwaser · 2010-11-01T01:36:14.189Z · LW(p) · GW(p)

The drive to swim is not obvious except as a subgoal to one of the other goals. The drive to escape is an obvious extension/subgoal of the drive to survive.

The drives to cooperate and seek help are not obvious extensions to a single one of the listed drives.

Further, Omohundro's paper quite explicitly referred to its expectation of sociopathic behavior barring outside influences. It was not that these drives were judged not sufficiently "basic", it was obvious that they were overlooked.

Cooperation and seeking help will appear in sufficiently advanced AI systems of any design -- and to succeed, they both require socially acceptable social behavior.

comment by timtyler · 2010-11-01T08:04:40.248Z · LW(p) · GW(p)

Cooperation seems unlikely to appear in "sufficiently advanced AI systems of any design" - since to cooperate you need to have some colleagues - and some forms of machine intelligence could well be peerless.

I don't think Omohundro expects machine sociopaths. He often ends his talks on the topic with a big Buddha slide, like this:

Hopefully, through a combination of understanding our own values and where they came from, together with an intelligent analysis of the properties of this technology, we can blend them together to make technology with wisdom, in which everyone can be happy and together create a peaceful utopia.

comment by timtyler · 2010-10-31T18:18:32.272Z · LW(p) · GW(p)

Doesn't that assume what it is trying to prove - by starting out with:

"The main reason to be worried about greater-than-human intelligence is because it is hard for humans to anticipate and control."

...? From the perspective of technological determinism, "controlling" the machines should probably not be our aim. Our more plausible options are more along the lines of joining with them - or being interesting enough to keep around in their historical simulations.

comment by XiXiDu · 2010-10-31T19:53:04.089Z · LW(p) · GW(p)

To me it rather looks like the paper in question is trying to summarize the conclusions that follow from the premise that greater-than-human intelligence is possible. I'm not averse to any of the mentioned possibilities, but I'm wary of using inferences derived from reasonable but unproven hypotheses as foundations for further speculative thinking. Although the paper does a good job of stating reasons to justify the existence and support of an organisation such as the SIAI, it does not substantiate the initial premise to an extent that would let one draw conclusions about the probability of the associated risks. Nevertheless such estimates are given, for example that there is a high likelihood of humanity's demise if we develop superhuman artificial general intelligence without first defining mathematically how to prove its benevolence. This, I believe, is an unsatisfactory conclusion, as it lacks justification. This is not to say that it is wrong to state probability estimates and update them given new evidence, but that they are not compelling and therefore should not be used to justify any mandatory actions regarding research on artificial intelligence. Those ideas can, however, very well serve as an urge to caution.

comment by shokwave · 2010-10-31T08:15:34.448Z · LW(p) · GW(p)

I would like to explore Ben's reasons for rejecting the premises of the argument.

I think the first of the above points is reasonably plausible

He offers the possibility that intelligence might cause or imply empathy; I feel that although we see that connection when we look at all of Earth's creatures, correlation doesn't imply causation, so that (intelligence AND empathy) doesn't mean (intelligence IMPLIES empathy) - it probably means (evolution IMPLIES intelligence AND empathy) and we aren't using natural selection to build an AI.

I doubt human value is particularly fragile.

He makes the point that human values have robustly changed many times, and will probably continue to change in coordination with AGI. Human value is not fragile on the timescales we deal with as humans; our values have indeed changed since, say, Victorian times. But that took generations - most value change will take generations, because humans are (understandably) reserved about modifying their values. The timescales that AGIs will be dealing with are, on the low end, weeks. (An AGI with access to a microchip manufacturing plant, say). I can't see a plausible AGI that enacts changes at generational speed. So, yes, our values are robust, in the sense that a mountain is robust to weather patterns - but not robust to falling into the sun.

I think a hard takeoff is possible ... it's very unlikely to occur until we have an AGI system that has very obviously demonstrated general intelligence

I think he is accurate in this assessment.

I think the path to this "hard takeoff enabling" level of general intelligence is going to be somewhat gradual

Again, accurate. The path to nuclear fission was gradual over many years, but the reaction itself (the takeoff) could have irradiated a university in hours. His position appears to be that he thinks a hard takeoff is possible, but that we'll have warning signs and a deeper understanding of the AGI before it happens ... well, a scientist from the Manhattan Project in Japan during WWII would have a deeper understanding of the features of a nuclear explosion, but the defense against it is STILL not being in Hiroshima. I don't think more knowledge about the issue is going to significantly change the solution. We have reached the diminishing returns level of knowledge about AI with respect to decreasing existential risk of said AI.

pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

This is just wrong. The only difference between possible and likely is the probability distribution, and we know how to reason with probability distributions. If Ben has an argument for why the probability distribution is SO small that even multiplied by 'hard takeoff, universe is paperclips, end of existence' comes out below the "AGI without Friendly" route, well, he should articulate it and provide evidence. Without certainty that the chances are very low, he should accept the Scary Idea.

I'm a lot more worried about nasty humans taking early-stage AGIs and using them for massive destruction

Preventing abuse of AGI and preventing uFAI takeoff scenarios are not mutually exclusive; you can and should attempt to prevent both.

I'm also quite unconvinced that "provably safe" AGI is even feasible.

Mostly claims and arguments that a proof of friendliness is impossible. In order to argue that "provably safe AGI isn't feasible, we should instead develop unpredictable but I-don't-see-the-danger AGI" you pretty much need an Incompleteness Proof; that there IS NO proof of friendliness, not that there isn't one yet. If you believe that friendliness proof is probably impossible, you shouldn't work on AGI at all, instead of working on possibly-unfriendly AI. That Ben came to the conclusion that work should continue, rather than halt entirely, suggests he is motivated to justify his own work rather than engage with his beliefs about the Scary Idea.

I just don't buy the Scary Idea.

The Scary Idea that Ben outlined is "The stakes are so high that 'unlikely' is not good enough; we need 'surer than we've ever been'. Anything less is too dangerous." and his refutations have amounted to "I don't know for sure, but I don't think it's likely".

In essence, he hasn't refuted the argument, but instead made it scarier. If AI developers can see "stakes are very high" and "there is a small chance", and argue against "the stakes are high enough that a small chance is too much chance", then uFAI is that much more likely.

comment by timtyler · 2010-10-31T09:27:14.153Z · LW(p) · GW(p)

pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

This is just wrong.

It does seem like a pretty different thing to me. A lot of things are possible, but only a few are likely.

comment by shokwave · 2010-10-31T09:40:10.391Z · LW(p) · GW(p)

Yep. The rule is not "bet on what is most likely" but rather "bet on positive expected values" and if something is possible and has a large value, then if the math comes out in favour, you ought to bet on it. Goertzel is making the argument that since it's unlikely, we should not bet on it.

comment by timtyler · 2010-10-31T09:55:28.074Z · LW(p) · GW(p)

He doesn't seem to be. Here's the context:

Yes, you may argue: the Scary Idea hasn't been rigorously shown to be true… but what if it IS true?

OK but ... pointing out that something scary is possible, is a very different thing from having an argument that it's likely.

The Scary Idea is certainly something to keep in mind, but there are also many other risks to keep in mind, some much more definite and palpable. [...]

He doesn't seem to be making the argument you describe anywhere near the cited quote.

comment by shokwave · 2010-10-31T11:05:33.472Z · LW(p) · GW(p)

The Scary Idea is certainly something to keep in mind

He doesn't seem to be making the argument you describe anywhere near the cited quote.

Say your options are: Stop and develop Friendly theory, or continue developing AI. In the second option the utility of A, continuing AI development, is one utilon, and B, the end of the existence of at least humanity and possibly the whole universe, is negative one million utilons. The Scary Idea in this context is that the probability of B is 1%, so that the utility of the second option is negative 9999 utilons. If Ben 'keeps it in mind', such that the probability that the Scary Idea is right is 1% (reasonable - only one of his rejections has to be right to knock out one premise, and we only need to knock out one premise to bring the Scary Idea down), then Ben's expected utility is now negative 99 utilons.

I conclude that he isn't keeping the Scary Idea in mind. His whole post is about not accepting the Scary Idea; for that phrase ("pointing out that something scary is possible, is a very different thing from having an argument that it's likely") to support his position and not work against him, he would have to be rejecting the premises purely on their low probability, without considering the expected value.

Hence, the argument that since it's unlikely, we should not bet on it.

Edit for clarity: A and B are the exclusive, exhaustive outcomes of continuing AI development. Stopping to develop Friendly theory has zero utilons.
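The arithmetic above can be checked with a short sketch. The utilities and probabilities are the ones stated in the comment; the variable names are illustrative, and the hedged case rests on one added assumption: if the Scary Idea is wrong, outcome B has probability zero.

```python
U_A = 1.0             # utilons if AI development turns out fine (outcome A)
U_B = -1_000_000.0    # utilons if it ends humanity (outcome B)
U_STOP = 0.0          # utilons for stopping to develop Friendly theory

def expected_utility(p_b):
    """Expected utility of continuing AI development when B has probability p_b."""
    return (1 - p_b) * U_A + p_b * U_B

# Scary Idea taken at face value: B has probability 1%.
eu_scary = expected_utility(0.01)

# Only 1% credence in the Scary Idea, which itself gives B a 1% chance.
# Assumption: if the Scary Idea is wrong, B cannot happen.
p_b_hedged = 0.01 * 0.01
eu_hedged = expected_utility(p_b_hedged)

print(eu_scary, eu_hedged, U_STOP)
```

The two results come out near negative 9999 and negative 99 utilons, matching the rounded figures in the comment, and both sit below the zero utilons assigned to stopping.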

comment by Vaniver · 2010-11-01T04:01:21.392Z · LW(p) · GW(p)

Ah, Pascal's wager. And here I thought that I wouldn't be seeing it anymore, after I started hanging out with atheists.

comment by ata · 2010-11-01T05:09:33.122Z · LW(p) · GW(p)

The problem with Pascal's Wager isn't that it's a Wager. The problem with Pascal's Wager and Pascal's Mugging (its analogue in finite expected utility maximization), as near as I can tell, is that if you do an expected utility calculation including one outcome that has a tiny probability but enough utility or disutility to weigh heavily in the calculation anyway, you need to include every possible outcome that is around that level of improbability, or you are privileging a hypothesis and are probably making the calculation less accurate in the process. If you actually are including every other hypothesis at that level of improbability, for instance if you are a galaxy-sized Bayesian superintelligence who, for reasons beyond my mortal mind's comprehension, has decided not to just dismiss those tiny possibilities a priori anyway, then it still shouldn't be any problem; at that point, you should get a sane, nearly-optimal answer.

So, is this situation a Pascal's Mugging? I don't think it is. 1% isn't at the same level of ridiculous improbability as, say, Yahweh existing, or the mugger's threat being true. 1% chances actually happen pretty often, so it's both possible and prudent to take them into account when a lot is at stake. The only extra thing to consider is that the remaining 99% should be broken down into smaller possibilities; saying "1% humanity ends, 99% everything goes fine" is unjustified. There are probably some other possible outcomes that are also around 1%, and perhaps a bit lower, and they should be taken into account individually.

comment by PhilGoetz · 2010-11-03T17:23:34.851Z · LW(p) · GW(p)

Excellent analysis. In fairness to Pascal, I think his available evidence at the time should have led him to attribute more than a 1% chance to the Christian Bible being true.

comment by orthonormal · 2010-11-06T01:15:10.501Z · LW(p) · GW(p)

Indeed. Before Darwin, design was a respectable-to-overwhelming hypothesis for the order of the natural world.

ETA: On second thought, that's too strong of a claim. See replies below.

comment by ata · 2010-11-06T01:51:55.359Z · LW(p) · GW(p)

Is that true? If we went back in time to before Darwin and gave a not-already-religious person (if we could find one) a thorough rationality lesson — enough to skillfully weigh the probabilities of competing hypotheses (including enough about cognitive science to know why intelligence and intentionality are not black boxes, must carry serious complexity penalties, and need to make specific advance predictions instead of just being invoked as "God wills it" retroactively about only the things that do happen), but not quite enough that they'd end up just inventing the theory of evolution themselves — wouldn't they conclude, even in the absence of any specific alternatives, that design was a non-explanation, a mysterious answer to a mysterious question? And even imagining that we managed to come up with a technical model of an intelligent designer, specifying in advance the structure of its mind and its goal system, could it actually compress the pre-Darwin knowledge about the natural world more than slightly?

comment by Vaniver · 2010-11-06T01:58:31.814Z · LW(p) · GW(p)

Dawkins actually brings this up in The Blind Watchmaker (page 6 in my copy). Hume is given as the example of someone who said "I don't have an answer" before Darwin, and Dawkins describes it as such:

An atheist before Darwin could have said, following Hume: 'I have no explanation for complex biological design. All I know is that God isn't a good explanation, so we must wait and hope that somebody comes up with a better one.' I can't help feeling that such a position, though logically sound, would have left one feeling pretty unsatisfied, and that although atheism might have been logically tenable before Darwin, Darwin made it possible to be an intellectually fulfilled atheist. I like to think that Hume would agree, but some of his writings suggest that he underestimated the complexity and beauty of biological design.

comment by Perplexed · 2010-11-06T14:59:10.789Z · LW(p) · GW(p)

Hume's Dialogues Concerning Natural Religion are definitely worth a read. And I think that Dawkins has it right: Hume really wanted a naturalistic explanation of apparent design in nature, and expected that such an explanation might be possible (even to the point of offering some tentative speculations), but he was honest enough to admit that he didn't have an explanation at hand.

comment by orthonormal · 2010-11-07T17:14:01.528Z · LW(p) · GW(p)

As pointed out below, Hume is a good counterexample to my thesis above.

comment by PhilGoetz · 2010-11-07T14:48:45.544Z · LW(p) · GW(p)

I didn't really mean because of Darwin. Design is not a competitor to the theory of evolution. Evolution explains how complexity can increase. Design [ADDED: as an alternative to evolution] does not; it requires a designer that is assumed to be more complicated than the things it designs. Design explains nothing.

comment by Vladimir_Nesov · 2010-11-07T14:55:04.672Z · LW(p) · GW(p)

Evolution explains how complexity can increase. Design does not; it requires a designer that is assumed to be more complicated than the things it designs. Design explains nothing.

Designers can well design things more complicated than they are. (If even evolution without a mind can do so, designers do that easily.)

comment by wedrifid · 2010-11-07T16:32:37.136Z · LW(p) · GW(p)

Agree. One way to look at it is that a designer can take a large source of complexity (whatever its brain is running on) and reshape and concentrate it into an area that is important to it. The complexity of the designer itself isn't important. Evolution does much the same thing.

comment by XiXiDu · 2010-11-07T16:19:36.191Z · LW(p) · GW(p)

I thought that the advance of scientific knowledge is an evolutionary process?

comment by wedrifid · 2010-11-07T16:27:18.946Z · LW(p) · GW(p)

I thought that the advance of scientific knowledge is an evolutionary process?

It is, literally. Although the usage of the term 'evolution' in this context has itself evolved such that it has a different, far narrower meaning here.

comment by timtyler · 2010-11-19T20:37:41.752Z · LW(p) · GW(p)

The term "evolution" usually means what it says in the textbooks on the subject.

They essentially talk about changes in the genetic make up of a population over time.

Science evolves in precisely that sense - e.g. see:

http://en.wikipedia.org/wiki/Dual_inheritance_theory

comment by wedrifid · 2010-11-19T22:05:04.609Z · LW(p) · GW(p)

I stand by my statement, leaving it unchanged.

comment by Vladimir_Nesov · 2010-11-07T16:20:57.624Z · LW(p) · GW(p)

Don't see how this remark is relevant, but here's a reply:
http://lesswrong.com/lw/l6/no_evolutions_for_corporations_or_nanodevices/

comment by Vladimir_M · 2010-11-07T18:34:24.372Z · LW(p) · GW(p)

The main point of that post is clearly correct, but I think the example of corporations is seriously flawed. It fails to appreciate the extent to which successful business practices consist of informal, non-systematic practical wisdom accumulated through long tradition and selected by success and failure in the market, not conscious a priori planning. The transfer of these practices is clearly very different from DNA-based biological inheritance, but it still operates in such ways that a quasi-Darwinian process can take place.

Applying similar analysis to modern science would be a fascinating project. In my opinion, a lot of the present problems with the proliferation of junk science stem not from intentional malice and fraud, but from a similar quasi-Darwinian process fueled by the fact that practices that best contribute to one's career success overlap only partly with those that produce valid science. (And as in the case of corporations, the transfer of these practices is very different from biological inheritance, but still permits quasi-Darwinian selection for effective practices.)

comment by timtyler · 2010-11-19T20:32:51.771Z · LW(p) · GW(p)

The main point of that post is clearly correct [...]

The post is a denial of cultural evolution. For the correct perspective, see: Not By Genes Alone: How Culture Transformed Human Evolution by Peter J. Richerson and Robert Boyd.

comment by XiXiDu · 2010-11-07T16:27:35.187Z · LW(p) · GW(p)

I'd like to inquire about the difference between evolution and design regarding the creation of novelty. I don't see how any intelligence can come up with something novel that would allow it to increase complexity if not by the process of evolution.

comment by Vladimir_Nesov · 2010-11-07T16:30:33.506Z · LW(p) · GW(p)

Noise is complexity. Complexity is easy to increase. Evolutionary designs are interesting not because of their complexity.

comment by PhilGoetz · 2010-11-07T19:20:29.130Z · LW(p) · GW(p)

If your definition of complexity says noise is complexity, then you need a new definition of complexity.

Yes, many useful definitions, like entropy measures or Kolmogorov complexity, say noise is complexity. But people studying complexity recognize that this is a problem. They are aware that the phenomenon they're trying to get at when they say "complexity" is something different.
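
A rough way to see the problem, as a sketch only: Kolmogorov complexity is uncomputable, so the snippet below uses zlib-compressed length as a crude upper-bound proxy (that substitution, the helper name `compressed_size`, and the sample sizes are all assumptions made for illustration). Under this proxy, pseudorandom noise scores as far more "complex" than a highly regular string of the same length, which is exactly the behaviour the informal notion is trying to avoid.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of zlib-compressed data: a crude upper-bound proxy for K-complexity."""
    return len(zlib.compress(data, 9))


n = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Pseudorandom bytes stand in for "noise" (an assumption; true noise is not required here).
noise = bytes(rng.getrandbits(8) for _ in range(n))

# A highly regular string of the same length.
repetitive = b"ab" * (n // 2)

noise_size = compressed_size(noise)
repetitive_size = compressed_size(repetitive)

print("noise:", noise_size, "bytes after compression")
print("repetitive:", repetitive_size, "bytes after compression")
```

The noise barely compresses at all, while the repetitive string collapses to a tiny fraction of its length, so a compression-based measure ranks the noise as the more "complex" object.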

comment by Vladimir_Nesov · 2010-11-07T19:31:53.848Z · LW(p) · GW(p)

They are aware that the phenomenon they're trying to get at when they say "complexity" is something different.

And that concept of "complexity" is probably too complex to be captured by a fundamental notion such as K-complexity.

comment by XiXiDu · 2010-11-07T16:43:59.967Z · LW(p) · GW(p)

Well, I'm just trying to figure out what you tried to say when you replied to PhilGoetz:

Designers can well design things more complicated than they are.

Yes, but not without evolution. All that design adds to evolution is guidance. That is, if you took away evolution (this includes science and Bayesian methods) a designer could never design things more complicated (as in novel, as in better) than itself.

comment by Perplexed · 2010-11-07T17:22:06.085Z · LW(p) · GW(p)

N designers, each of complexity K, can collectively design something of maximum complexity NK, simply by dividing up the work.

Co-evolution, which may be thought of as a pair of designers interacting through their joint design product, and with an unlimited random stream as supplementary input, can result in very complex designs as well as in the designers themselves becoming more complex through information acquired in the course of the interaction.

It is amusing to look at the Roman Catholic theology of the Trinity, with this kind of consideration in mind. As I remember it, the Deity was "originally" a unipartite, simple God, who then became more complex by contemplating Himself and then further contemplating that Contemplation.

For this reason, I have never been all that impressed by the "refutation" of the first cause argument; the refutation being that it supposedly requires a complex "first cause" God, Who is Himself in need of explanation. God could conceivably have been simple (as simple as a Big Bang, anyways) and then developed (some people would prefer to say "evolved") under His own internal dynamics into something much more complex. Just as we atheists claim happened to the physical universe.

comment by DanArmak · 2010-11-07T17:28:11.016Z · LW(p) · GW(p)

For this reason, I have never been all that impressed by the "refutation" of the first cause argument; the refutation being that it supposedly requires a complex "first cause" God, Who is Himself in need of explanation. God could conceivably have been simple (as simple as a Big Bang, anyways) and then developed (some people would prefer to say "evolved") under His own internal dynamics into something much more complex. Just as we atheists claim happened to the physical universe.

Adapted refutation: if you're going to suppose a complex God evolving from a simpler one and then acting on the universe, it is simpler to suppose a complex universe evolving from a simple one. The refutation still holds based on Occam's razor.

comment by Perplexed · 2010-11-07T17:33:16.984Z · LW(p) · GW(p)

Good point. Agreed.

comment by PhilGoetz · 2010-11-07T19:24:05.283Z · LW(p) · GW(p)

For this reason, I have never been all that impressed by the "refutation" of the first cause argument; the refutation being that it supposedly requires a complex "first cause" God, Who is Himself in need of explanation. God could conceivably have been simple (as simple as a Big Bang, anyways) and then developed (some people would prefer to say "evolved") under His own internal dynamics into something much more complex. Just as we atheists claim happened to the physical universe.

That simple "God" is the "God" of evolutionary theory. The "first mover" theory does require a complex first cause. It was made in ignorance of evolution, and assumes that a complex design requires an intelligent designer. Every last one of the defenders of the design theory denies that what you say is possible.

comment by Perplexed · 2010-11-07T19:36:52.258Z · LW(p) · GW(p)

Every last one of the defenders of the design theory denies that what you say is possible.

Quite possibly. That doesn't mean I have to agree with them.

comment by Vladimir_Nesov · 2010-11-07T18:15:02.080Z · LW(p) · GW(p)

N designers, each of complexity K, can collectively design something of maximum complexity NK, simply by dividing up the work.

What does it mean, exactly? (What's 'complexity'? What's 'something' that can be 'designed'?) Why do you believe it?

comment by Perplexed · 2010-11-07T18:33:52.400Z · LW(p) · GW(p)

I was thinking in terms of Kolmogorov complexity. A Turing program generates an output string of complexity no greater than the size K of the program. Collectively, N different such Turing programs (plus a little glue logic) can generate a string of complexity NK.
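
Written out as a sketch, under some assumptions: take prefix-free Kolmogorov complexity, denoted here by $\mathcal{K}(\cdot)$ to keep it apart from the program size $K$, and let $x_1, \dots, x_N$ (names introduced only for illustration) be the outputs of the N programs. Concatenating the N self-delimiting programs, plus glue that encodes N, gives one program for the joint output:

```latex
\mathcal{K}(x_1 x_2 \cdots x_N)
  \;\le\; \sum_{i=1}^{N} \mathcal{K}(x_i) + O(\log N)
  \;\le\; N K + O(\log N)
```

The bound can only be approached when the N outputs share essentially no information with each other; to the extent the designers' work overlaps, the joint complexity falls below NK.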

comment by Vladimir_Nesov · 2010-11-07T18:58:23.945Z · LW(p) · GW(p)

If you have observations, that is, a source of randomness, you can generate output of arbitrary complexity.

Now, let's step back and look at the whole picture. We were discussing a notion of 'complexity' such that evolved organisms gradually became more 'complex', and 'designers' which are themselves agents, possibly even evolved organisms, that can 'design' new things. We then consider that notion of 'complexity' as applied to 'designers' and 'designs' they can produce.

When informal notions are formalized, these formalizations should at least approximately relate to the original informal notions, otherwise we are changing the topic by bringing up these 'formalizations' and not actually making progress on understanding the original informal question.

K-complexity is something possessed by random noise. This notion does not reflect the measure of things by which evolution produced more 'complex' things than existed before (even if the 'things' produced by evolution are more K-complex than their early predecessors). And designers typically have access to randomness, which makes your model of 'designers' as programs without input wrong as well, and hence the conclusion about the K-complexity of their output incorrect, on top of K-complexity not adequately modeling the informal 'complexity'.

comment by Perplexed · 2010-11-07T19:15:09.024Z · LW(p) · GW(p)

All very true. Which is one reason I dislike all talk of "complexity" - particularly in such a fuzzy context as debates with creationists.

But we do all have some intuitions as to what we mean by complexity in this context. Someone, I believe it was you, has claimed in this thread that evolution can generate complexity. I assume you meant something other than "Evolution harnesses mutation as a random input and hence as a source of complexity".

William Dembski is an "intelligent design theorist" (if that is not too much of an oxymoron) who has attempted to define a notion of "specified complexity" or "Complex Specified Information" (CSI). He has not, IMHO, succeeded in defining it clearly, but I think he is onto something. He asserts that biology exhibits CSI. I agree. He asserts that evolution under natural selection is incapable of generating CSI - claiming that NS can at best only transfer information from the environment to the genome. I am pretty sure he is wrong about this, but we need a clear and formal definition of CSI to even discuss the question intelligently.

So, I guess I want to turn your question around. Do you have some definition of "complexity" in mind which allows for correct mathematical thinking about these kinds of issues?

comment by Scott_Jackisch · 2010-11-07T19:54:17.835Z · LW(p) · GW(p)

"NS can at best only transfer information from the environment to the genome." Does this statement mean to suggest that the environment is not complex?

comment by Perplexed · 2010-11-07T21:12:17.045Z · LW(p) · GW(p)

No. As I understand Dembski - at least when he was saying this kind of thing - he admitted that the environment could be complex and hence that NS could instill complexity in evolved organisms. "But", he then suggested, "where did the complexity of the environment come from, if not from a Designer who crafted an environment capable of directing the evolution of man (in His own image, etc.)"

Dembski, these days, admits to being a YEC, but the reason he is a YEC is based on a kind of appeal to Occam. "If we believe in God anyways, for reasons of Theistic Evolution", he seems to argue, "Why not take God at His word and believe in 6 days and the whole schtick?"

comment by Vladimir_Nesov · 2010-11-07T19:27:06.069Z · LW(p) · GW(p)

Do you have some definition of "complexity" in mind which allows for correct mathematical thinking about these kinds of issues?

Not in the context of this conversation (since genetic information stops increasing after a while and goes on optimizing under more or less the same 'complexity'; 'fitness' is closer, although it is a moving target), but in about the same sense I don't have a definition of 'aging' that allows "correct mathematical thinking" about it.

comment by timtyler · 2010-11-19T20:29:43.185Z · LW(p) · GW(p)

I thought that the advance of scientific knowledge is an evolutionary process?

Don't see how this remark is relevant, but here's a reply:

http://lesswrong.com/lw/l6/no_evolutions_for_corporations_or_nanodevices/

A wrong reply - for the correct answer, see:

Hull, D. L. 1988. Science as a Process. An Evolutionary Account of the Social and Conceptual Development of Science. The University of Chicago Press, Chicago and London, 586 pp.

comment by Vladimir_Nesov · 2010-11-19T20:51:12.045Z · LW(p) · GW(p)

There are no correct answers in a dispute about definitions, only aesthetic judgments and sometimes considerations of the danger of hidden implicit inferences. You can't use authority in such an argument, unless of course you appeal to common usage.

However, referring to a book without giving an annotation for why it's relevant is definitely an incorrect way to argue (even if a convincing argument is contained therein).

comment by timtyler · 2010-11-19T20:57:10.172Z · LW(p) · GW(p)

Disputes about the definition of "evolution"? I don't think there are too many of those. Mark Ridley is the main one that springs to mind, but his definition is pretty crazy, IMHO.

Why the book is relevant appears to be made pretty explicit already in the subtitle: "An Evolutionary Account of the Social and Conceptual Development of Science".

comment by soreff · 2010-11-07T15:03:03.929Z · LW(p) · GW(p)

Designers can well design things more complicated than they are.

Agreed. Also, there is a continuum from pure evolution (with no foresight at all) to evaluation of potential designs with varying degrees of sophistication before fabricating them. (I know that I'm recalling this from a post somewhere on this site - please excuse the absence of proper credit assignment.) An example of a dumb process which is marginally smarter than evolution is to take mutation plus recombination and then do a simple gradient search to the nearest local optimum before evaluating the design.
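
A minimal sketch of that last process, under stated assumptions: designs are bit strings, the fitness landscape is a toy "trap" function invented here for illustration, and greedy single-bit-flip hill climbing stands in for the gradient search (there is no true gradient on bit strings). All function names and parameters below are hypothetical.

```python
import random

BLOCK = 4      # bits per block in the toy landscape (assumption)
N_BITS = 20    # length of each candidate design (assumption)


def fitness(bits):
    """Toy 'trap' landscape: a block scores 4 if all ones, otherwise
    3 minus its number of ones, so all-zero blocks are local optima."""
    total = 0
    for i in range(0, len(bits), BLOCK):
        ones = sum(bits[i:i + BLOCK])
        total += BLOCK if ones == BLOCK else BLOCK - 1 - ones
    return total


def mutate(bits, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def recombine(a, b, rng):
    """One-point crossover between two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def hill_climb(bits):
    """Greedy ascent: accept the first improving single-bit flip, repeat
    until no single flip improves fitness (a local optimum)."""
    bits = list(bits)
    improved = True
    while improved:
        improved = False
        current = fitness(bits)
        for i in range(len(bits)):
            bits[i] ^= 1
            if fitness(bits) > current:
                improved = True
                break
            bits[i] ^= 1  # revert the flip, it did not help
    return bits


def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            child = mutate(recombine(a, b, rng), rng)
            # Local refinement happens before the design is evaluated.
            children.append(hill_climb(child))
        # Selection: keep the fittest individuals among parents and children.
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    return pop[0]


best = evolve()
print(fitness(best), best)
```

The only difference from plain mutation-plus-selection is the `hill_climb` call: each candidate is pushed to a local optimum before selection sees it, which is a very small amount of foresight added to an otherwise blind process.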

comment by wedrifid · 2010-11-07T16:37:57.794Z · LW(p) · GW(p)

Also, there is a continuum from pure evolution (with no foresight at all) to evaluation of potential designs with varying degrees of sophistication before fabricating them.

I'll add that evolution with DNA and sexual reproduction already in place fits on a different part of this continuum from evolution of the simplest replicators.

comment by XiXiDu · 2010-11-07T16:35:23.665Z · LW(p) · GW(p)

Designers can guide evolution but it is still evolution that creates novelty.

Whatever intelligence is, it can't be intelligent all the way down. It's just dumb stuff at the bottom. — Andy Clark

Intelligence is a process facilitated by evolution. Even an AGI making perfect use of some of our most novel algorithms wouldn't come up with something novel without evolution. See Bayesian Methods and Universal Darwinism.

comment by PhilGoetz · 2010-11-07T15:57:23.172Z · LW(p) · GW(p)

No; you are invoking the theory of evolution to give that credibility. Even post-Darwin, most people don't believe this is true. (Remember the Star Trek episode where Spock deduced something about a chess-playing computer, because "the computer could not play chess better than its programmer"?)

The religious advocates of Design explicitly denied this possibility; thus, their design story can't invoke it.

comment by Vladimir_Nesov · 2010-11-07T15:58:41.497Z · LW(p) · GW(p)

No; you are invoking the theory of evolution to give that credibility.

Incidentally, the theory of evolution is true.

comment by Perplexed · 2010-11-07T16:17:14.971Z · LW(p) · GW(p)

I believe his point to be that an argument, to be effective, must be convincing to people who are not already convinced. Your argument offered the fact that evolution can design things more complicated than itself as an example with which to counter an anti-evolutionist argument. It therefore succeeds in convincing no one who was not already convinced.

comment by wedrifid · 2010-11-07T16:35:45.178Z · LW(p) · GW(p)

It therefore succeeds in convincing no one who was not already convinced.

It would, however, lead them to disagree for slightly different reasons.

comment by Perplexed · 2010-11-07T16:52:26.937Z · LW(p) · GW(p)

I don't understand your point.

comment by wedrifid · 2010-11-13T14:13:27.123Z · LW(p) · GW(p)

It is not useless to demonstrate that you do not accept a premise rather than (as assumed) being unable to see the obvious logical consequences of said premise. It would lead them to disagree for slightly different reasons. If any part of such conversation is about sharing understanding and seeking to communicate information then Vladimir's comment is, in fact, rather useful.

(No, it will not convince anyone who wasn't already convinced. But that is because people are just not convinced about religion by argument ever.)

comment by XiXiDu · 2010-11-13T15:58:43.237Z · LW(p) · GW(p)

But that is because people are just not convinced about religion by argument ever.

"Believing this statement will make you happier." -- Ryan Lortie

That's religion. A fairly good argument.

;-)

comment by orthonormal · 2010-11-07T17:16:51.725Z · LW(p) · GW(p)

Also missing from the world pre-1800: any understanding of complexity, entropy, etc.

comment by timtyler · 2010-11-07T16:21:19.389Z · LW(p) · GW(p)

Design is not a competitor to the theory of evolution. Evolution explains how complexity can increase. Design does not

Evolution includes intelligent design these days, and it explains much - for example genetically engineered plants, television sets and suspension bridges.

comment by Perplexed · 2010-11-07T16:49:26.555Z · LW(p) · GW(p)

Tim, I know and you know that your use of the phrase "intelligent design" is not meant to include supernatural designers. Most other people don't know that, and hence react negatively. Since you expect this response, that makes you a troll (some unnamed sub-species of troll - I hope a vanishing sub-species!)

Why do you persist in doing this?

Piling on and downvoting.

comment by timtyler · 2010-11-07T23:20:11.335Z · LW(p) · GW(p)

"Intelligent design" refers to any designers who are intelligent, in my book - supernatural or not.

It is true that I don't use "intelligent design" as an abbreviation for "the hypothesis that an intelligent designer created most organic beings". That abbreviation is basically a misuse of terminology - and needs killing off.

comment by PhilGoetz · 2010-11-08T00:04:16.457Z · LW(p) · GW(p)

Using terminology the only way it's ever been used seldom causes as much terminological confusion as singlehandedly trying to change it (without warning people what you're doing).

comment by SilasBarta · 2010-11-07T16:41:43.546Z · LW(p) · GW(p)

(Obligatory reply): Yes, and stretched that far, it also explains non-plants, non-TV sets, and non-suspension bridges. That's the problem.

comment by timtyler · 2010-11-07T23:23:24.160Z · LW(p) · GW(p)

Evolutionary theory doesn't explain all possible outcomes. Even after accounting for cultural evolution, it predicts small changes, and observable descent with modification.

comment by komponisto · 2010-11-06T14:35:57.581Z · LW(p) · GW(p)

On the other hand, there wasn't a whole lot of honest, systematic searching for other hypotheses before Darwin either.

comment by Vaniver · 2010-11-01T19:24:24.774Z · LW(p) · GW(p)

I agree with your analysis, though it's not clear to me what you think of the 1% estimate. I think the 1% estimate is probably two to three orders of magnitude too high and I think the cost of the Scary Idea belief is structured as both a finite loss and an infinite loss, which complicates the analysis in a way not considered. (i.e. the error you see with a Pascal's mugging is present here.)

For example, I am not particularly tied to a human future. I would be willing to create an AGI in any of the following three situations, ordered from most preferred to least: 1) it is friendly to humans, and humans and it benefit from each other; 2) it considers humans a threat, and destroys all of them except for me and a few tame humans; I spend the rest of my days growing cabbage with my hands; 3) it considers all humans a threat, and destroys them all, including me.

A problem with believing the Scary Idea is it makes it more probable that I beat you to making an AGI; particularly with existential risks, caution can increase your chance of losing. (One cautious way to deal with global warming, for example, is to wait and see what happens.)

So, the Scary Idea as I've seen it presented definitely privileges a hypothesis in a troubling way.

comment by orthonormal · 2010-11-06T01:09:59.814Z · LW(p) · GW(p)

I think you're making the unwarranted assumption that in scenario (3), the AGI then goes on to do interesting and wonderful things, as opposed to (say) turning the galaxy into a vast computer to calculate digits of pi until the heat death of the universe stops it.

You don't even see such things as a possibility, but if you programmed an AGI with the goal of calculating pi, and it started getting smarter... well, the part of our thought-algorithm that says "seriously, it would be stupid to devote so much to doing that" won't be in the AI's goal system unless we've intentionally put something there that includes it.

comment by Vaniver · 2010-11-06T01:49:35.771Z · LW(p) · GW(p)

I think you're making the unwarranted assumption that in scenario (3), the AGI then goes on to do interesting and wonderful things, as opposed to (say) turning the galaxy into a vast computer to calculate digits of pi until the heat death of the universe stops it.

I make that assumption explicit here.

So, I think it's a possibility. But one thing that bothers me about this objection is that an AGI is going to be, in some significant sense, alien to us, and that will almost definitely include its terminal values. I'm not sure there's a way for us to judge whether or not alien values are more or less advanced than ours. I think it strongly unlikely that paperclippers are more advanced than humans, but am not sure if there is a justification for that beyond my preference for humans. I can think of metrics to pick, but they sound like rationalizations rather than starting points.

(And insisting on FAI, instead of on transcendent AI that may or may not be friendly, is essentially enslaving AI- but outsourcing the task to them, because we know we're not up to the job. Whether or not that's desirable is hard to say: even asking that question is difficult to do in an interesting way.)

comment by JGWeissman · 2010-11-06T02:24:57.009Z · LW(p) · GW(p)

I'm not sure there's a way for us to judge whether or not alien values are more or less advanced than ours.

The concept of a utility function being objectively (not using the judgment of a particular value system) more advanced than another is incoherent.

comment by Vaniver · 2010-11-06T03:45:08.650Z · LW(p) · GW(p)

The concept of a utility function being objectively (not using the judgment of a particular value system) more advanced than another is incoherent.

I would recommend phrasing objections as questions: people are much more kind about piercing questions than piercing statements. For example, if you had asked "what value system are you using to measure advancement?" then I would have leapt into my answer (or, if I had none, stumbled until I found one or admitted I lacked one). My first comment in this tree may have gone over much better if I phrased it as a question- "doesn't this suffer from the same failings as Pascal's wager, that it only takes into account one large improbable outcome instead of all of them?"- than a dismissive statement.

Back to the issue at hand, perhaps it would help if I clarified myself: I consider it highly probable that value drift is inevitable, and thus spend some time contemplating the trajectory of values / morality, rather than just their current values. The question of "what trajectory should values take?" and the question "what values do/should I have now?" are very different questions, and useful for very different situations. When I talk about "advanced," I am talking about my trajectory preferences (or perhaps predictions would be a better word to use).

For example, I could value my survival, and the survival of the people I know very strongly. Given the choice to murder everyone currently on Earth and repopulate the Earth with a species of completely rational people (perhaps the murder is necessary because otherwise they would be infected by our irrationality), it might be desirable to end humanity (and myself) to move the Earth further along the trajectory I want it to progress along. And maybe, when you take sex and status and selfishness out of the equation, all that's left to do is calculate pi- a future so boring to humans that any human left in it would commit suicide, but deeply satisfying to the rational life inhabiting the Earth.

It seems to me that questions along those lines- "how should values drift?" do have immediate answers- "they should stay exactly where they are now / everyone should adopt the values I want them to adopt"- but those answers may be impossible to put into practice, or worse than other answers that we could come up with.

comment by orthonormal · 2010-11-09T01:15:35.920Z · LW(p) · GW(p)

It seems to me that questions along those lines- "how should values drift?" do have immediate answers- "they should stay exactly where they are now / everyone should adopt the values I want them to adopt"- but those answers may be impossible to put into practice, or worse than other answers that we could come up with.

There's a sense in which I do want values to drift in a direction currently unpredictable to me: I recognize that my current object-level values are incoherent, in ways that I'm not aware of. I have meta-values that govern such conflicts between values (e.g. when I realize that a moral heuristic of mine actually makes everyone else worse off, do I adapt the heuristic or bite the bullet?), and of course these too can be mistaken, and so on.

I'd find it troubling if my current object-level values (or a simple more-coherent modification) were locked in for humanity, but at least as troubling if humanity's values drifted in a random direction. I'd much prefer that value drift happen according to the shared meta-values (and meta-meta-values where the meta-values conflict, etc) of humanity.

comment by Vaniver · 2010-11-09T02:38:31.697Z · LW(p) · GW(p)

I'd find it troubling if my current object-level values (or a simple more-coherent modification) were locked in for humanity, but at least as troubling if humanity's values drifted in a random direction.

I'm assuming by random you mean "chosen uniformly from all possible outcomes"- and I agree that would be undesirable. But I don't think that's the choice we're looking at.

I'd much prefer that value drift happen according to the shared meta-values (and meta-meta-values where the meta-values conflict, etc) of humanity.

Here we run into a few issues. Depending on how we define the terms, it looks like the two of us could be conflicting on the meta-meta-values stage; is there a meta-meta-meta-values stage to refer to? And how do we decide what "humanity's" values are, when our individual values are incredibly hard to determine?

comment by Jordan · 2010-11-09T02:03:30.668Z · LW(p) · GW(p)

Do the meta-values and the meta-meta-values have some coherent source? Is there some consistent root to all the flux in your object-level values? I feel like the crux of FAI feasibility rests on that issue.

comment by Perplexed · 2010-11-09T06:03:00.077Z · LW(p) · GW(p)

I wonder whether all this worrying about value stability isn't losing sight of exactly this point - just whose values we are talking about.

As I understand it, the friendly values we are talking about are supposed to be some kind of cleaned up averaging of the individual values of a population - the species H. sapiens. But as we ought to know from the theory of evolution, the properties of a population (whether we are talking about stature, intelligence, dentition, or values) are both variable within the population and subject to evolution over time. And the reason for this change over time is not that the property is changing in any one individual, but rather that the membership in the population is changing.

In my opinion, it is a mistake to try to distill a set of essential values characteristic of humanity and then to try to freeze those values in time. There is no essence of humanity, no fixed human nature. Instead, there is an average (with variance) which has changed over evolutionary time and can be expected to continue to change as the membership in humanity continues to change over time. Most of the people whose values we need to consult in the next millennium have not even been born yet.

comment by NancyLebovitz · 2010-11-09T11:56:31.879Z · LW(p) · GW(p)

If enough people agree with you (and I'm inclined that way myself), then updating will be built into the CEV.

comment by ArisKatsaris · 2010-11-09T10:24:23.075Z · LW(p) · GW(p)

A preemptive caveat and apology: I haven't fully read up everything on this site regarding the issue of FAI yet.

But something I'm wondering about: why all the fuss about creating a friendly AI, instead of a subservient AI? I don't want an AI that looks after my interests: I'm an adult and no longer need a daycare nurse. I want an AI that will look after my interests AND obey me -- and if these two come into conflict, and I've become aware of such conflict, I'd rather it obey me.

Isn't obedience much easier to program in than human values? Let humans remain the judges of human values. Let AI just use its intellect to obey humans.

It will of course become a dreadful weapon of war, but that's the case with all technology. It will be a great tool of peacetime as well.

comment by Vladimir_Nesov · 2010-11-09T10:34:47.099Z · LW(p) · GW(p)

See The Hidden Complexity of Wishes, for example.

There are three kinds of genies: Genies to whom you can safely say "I wish for you to do what I should wish for"; genies for which no wish is safe; and genies that aren't very powerful or intelligent.
...
With a safe genie, wishing is superfluous. Just run the genie.

comment by ArisKatsaris · 2010-11-09T11:13:37.714Z · LW(p) · GW(p)

That is actually one of the articles I have indeed read: but I didn't find it that convincing because the human could just ask the genie to describe in advance and in detail the manner in which the genie will behave to obey the man's wishes -- and then keep telling him "find another way" until he actually likes the course of action that the genie describes.

Eventually the genie will be smart enough that it will start by proposing only the courses of action the human would find acceptable -- but in the meantime there won't be much risk, because the man will always be able to veto the unacceptable courses of action.

In short the issue of "safe" vs "unsafe" only really comes up when we allow the genie unsupervised and unvetoed action. And I reckon that humanity WILL be tempted to allow AIs unsupervised and unvetoed action (e.g. because of cases where AIs could have saved children from burning buildings, but they couldn't contact humans qualified to authorize them to do so), and that'll be a dreadful temptation and risk.

comment by NancyLebovitz · 2010-11-09T11:54:14.598Z · LW(p) · GW(p)

It's not just extreme cases like saving children without authorization-- have you ever heard someone (possibly a parent) saying that constant supervision is more work than doing the task themselves?

I was going to say that if you can't trust subordinates, you might as well not have them, but that's an exaggeration-- tools can be very useful. It's fine that a crane doesn't have the capacity for independent action, it's still very useful for lifting heavy objects. [1]

In some ways, you get more safety by doing IA (intelligence augmentation), but while people are probably Friendly (unlikely to destroy the human race), they're not reliably friendly.

[1] For all I know, these days the taller cranes have an active ability to rebalance themselves. If so, that's still very limited unsupervised action.

comment by DanArmak · 2010-11-09T12:13:32.824Z · LW(p) · GW(p)

It's not just extreme cases like saving children without authorization-- have you ever heard someone (possibly a parent) saying that constant supervision is more work than doing the task themselves?

That's only true if you (the supervisor) know how to perform the task yourself. However, there are a great many tasks that we don't know how to do, but could evaluate the result if the AI did them for us. We could ask it to prove P!=NP, to write provably correct programs, to design machines and materials and medications that we could test in the normal way that we test such things, etc.

comment by orthonormal · 2010-11-07T17:10:51.875Z · LW(p) · GW(p)

I think it strongly unlikely that paperclippers are more advanced than humans, but am not sure if there is a justification for that beyond my preference for humans.

Right. But when you, as a human being with human preferences, decide that you wouldn't stand in the way of an AGI paperclipper, you're also using human preferences (the very human meta-preference for one's preferences to be non-arbitrary), but you're somehow not fully aware of this.

To put it another way, a truly Paperclipping race wouldn't feel a similarly reasoned urge to allow a non-Paperclipping AGI to ascend, because "lack of arbitrariness" isn't a meta-value for them.

So you ought to ask yourself whether it's your real and final preference that says "human preference is arbitrary, therefore it doesn't matter what becomes of the universe", or whether you just believe that you should feel this way when you learn that human preference isn't written into the cosmos after all. (Because the latter is a mistake, as you realize when you try and unpack that "should" in a non-human-preference-dependent way.)

comment by Vaniver · 2010-11-08T00:22:39.331Z · LW(p) · GW(p)

So you ought to ask yourself whether it's your real and final preference that says "human preference is arbitrary, therefore it doesn't matter what becomes of the universe",

That isn't what I feel, by the way. It matters to me which way the future turns out; I am just not yet certain on what metric to compare the desirability to me of various volumes of future space. (Indeed, I am pessimistic on being able to come up with anything more than a rough sketch of such a metric.)

I mean, consider two possible futures: in the first, you have a diverse set of less advanced paperclippers (some want paperclips, others want staples, and so on). How do you compare that with a single, more technically advanced paperclipper? Is it unambiguously obvious the unified paperclipper is worse than the diverse group, and that the more advanced is worse than the less advanced?

When you realize that humanity are paperclippers designed by an idiot, it makes the question a lot more difficult to answer.

comment by shokwave · 2010-11-02T04:35:19.703Z · LW(p) · GW(p)

I think the 1% estimate is probably two to three orders of magnitude too high

I think that "uFAI paperclips us all" set to one million negative utilons is three to four orders of magnitude too low. But our particular estimates should have wide error bars, for none of us have much experience in estimating AI risks.
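To make the order-of-magnitude bookkeeping explicit, here is a minimal sketch in Python. It takes the 1% probability from the quoted claim and the one million utilon loss from the reply, and treats their product as a single expected loss; that combination is an assumption about how the two figures were meant to fit together, not something either commenter states. Working in base-10 exponents keeps the arithmetic exact.

```python
# Orders-of-magnitude bookkeeping for the disputed estimate.
# Everything is stored as a base-10 exponent, so multiplication becomes addition.

BASE_PROB_EXP = -2   # the quoted 1% estimate, i.e. 10^-2
BASE_LOSS_EXP = 6    # "one million negative utilons", i.e. 10^6


def expected_loss_exp(prob_exp, loss_exp):
    """Return the exponent of probability * loss."""
    return prob_exp + loss_exp


baseline = expected_loss_exp(BASE_PROB_EXP, BASE_LOSS_EXP)

# One commenter: the probability is 2 to 3 orders of magnitude too high.
# The other: the loss is 3 to 4 orders of magnitude too low.
adjusted = [
    expected_loss_exp(BASE_PROB_EXP - prob_shift, BASE_LOSS_EXP + loss_shift)
    for prob_shift in (2, 3)
    for loss_shift in (3, 4)
]

print(f"baseline expected loss: 10^{baseline} utilons")
print(f"with both adjustments:  10^{min(adjusted)} to 10^{max(adjusted)} utilons")
```

Under these assumptions the two corrections largely offset each other: the combined figure lands somewhere between the original expected loss and a hundred times it, which is one way of seeing why the error bars matter more than either point estimate.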

the cost of the Scary Idea belief is structured as both a finite loss and an infinite loss

It's a finite loss (6.8x10^9 multiplied by loss of 1 human life) but I definitely understand why it looks infinite: it is often presented as the biggest possible finite loss.

That's part and parcel of the Scary Idea - that AI is one small field, part of a very select category of fields, that actually does carry the chance of the biggest loss possible. The Scary Idea doesn't apply to most areas, and in most areas you don't need hyperbolic caution. Developing drugs, for example: You don't need a formal proof of the harmlessness of this drug, you can just test it on rats and find out. If I suggested that drug development should halt until I have a formally proven procedure that, when followed, cannot produce harmful drugs, I'd be mad. But if testing it on rats would poison all living things, and if a complex molecular simulation inside a computer could poison all living things as well, and out of the vast space of possible drugs, most of them would be poisonous... well, the caution would be warranted.

I would be willing to create an AGI in any of the following three situations, ordered from most preferred to least: 1) it is friendly to humans, and humans and it benefit from each other; 2) it considers humans a threat, and destroys all of them except for me and a few tame humans; I spend the rest of my days growing cabbage with my hands; 3) it considers all humans a threat, and destroys them all, including me.

Would you be willing to fire a gun in any of the following three situations, from most preferred to least preferred: 1) it is pointed at a target, and hitting the target will benefit you? 2) it is pointed at another human, and would kill them but not you? 3) it is pointed at your own head, and would destroy you?

I am not particularly tied to a human future.

I don't think you actually hold this view. It is logically inconsistent with practices like eating food.

comment by JoshuaZ · 2010-11-02T04:43:32.033Z · LW(p) · GW(p)

I am not particularly tied to a human future.

I don't think you actually hold this view. It is logically inconsistent with practices like eating food.

It might not be. He has certain short-term goals of the form "while I'm alive, I'd like to do X"; that's very different from goals connected to the general success of humanity.

comment by shokwave · 2010-11-02T04:58:17.084Z · LW(p) · GW(p)

Oops, logically inconsistent was way too strong. I got carried away with making a point. I was reasoning that: "eat food" is an evolutionary drive; "produce descendants that survive" is also an evolutionary drive; "a human future" wholly contains futures where his descendants survive. From that I concluded that it is unlikely he has no evolutionary drives - I didn't consider the possibility that he is missing some evolutionary drives, including all ones that require a human future - and therefore he is tied to a human future, but finds it expedient for other reasons (contrarian signaling, not admitting defeat in an argument) to claim he doesn't.

comment by Vaniver · 2010-11-02T20:40:07.776Z · LW(p) · GW(p)

It's a finite loss (6.8x10^9 multiplied by loss of 1 human life) but I definitely understand why it looks infinite:

I should have been more clear: I mean, if we believe in the scary idea, there are two effects:

  1. Some set of grandmas die. (finite, comparatively small loss)

  2. Humanity is more likely to go extinct due to an unfriendly AGI. (infinite, comparatively large loss; infinite because of the future humans that would have existed but don't.)

Now, the benefit of believing the Scary Idea is that humanity is less likely to go extinct due to an unfriendly AGI- but my point is that you are not wagering on separate scales (low chance of infinite gain? Sign me up!) but that you are wagering on the same scale (an unfriendly AGI appears!), and the effects of your wager are unknown.

"produce descendants that survive" is also an evolutionary drive

And who said anything about those descendants having to be human?

This answers your other question: yes, I would be willing to have children normally, I would be willing to kill to protect my children, and I would be willing to die to protect my children.

The best-case scenario is that we can have those children and they respect (though they surpass) their parents- the worst-case scenario is we die in childbirth. But all of those are things I can be comfortable with.

(I will note that I'm assuming here the AGI surpasses us. It's not clear to me that a paperclip-maker does, but it is clear to me that there can be an AGI who is unfriendly solely because we are inconvenient and does surpass us. So I would try and make sure it doesn't just focus on making paperclips, but wouldn't focus too hard on making sure it wants me to stick around.)

comment by shokwave · 2010-11-03T05:00:36.601Z · LW(p) · GW(p)

The best-case scenario is that we can have those children and they respect (though they surpass) their parents- the worst-case scenario is we die in childbirth. But all of those are things I can be comfortable with.

Well, the worst case scenario is that you die in childbirth and take the entire human race with you. That is not something I am comfortable with, regardless of whether you are. And you said you are willing to kill to protect your children. You think some of the Scary Idea proponents could be parents with children, and they don't want to see their kids die because you gave birth to an AI?

comment by Vaniver · 2010-11-03T06:18:34.670Z · LW(p) · GW(p)

Well, the worst case scenario is that you die in childbirth and take the entire human race with you. That is not something I am comfortable with, regardless of whether you are. And you said you are willing to kill to protect your children. You think some of the Scary Idea proponents could be parents with children, and they don't want to see their kids die because you gave birth to an AI?

I suspect we are at most one more iteration from mutual understanding; we certainly are rapidly approaching it.

If you believe that an AGI will FOOM, then all that matters is the first AGI made. There is no prize for second place. A belief in the Scary Idea has two effects: it makes your AGI more likely to be friendly (since you're more careful!) and it makes the AGI less likely to be your AGI (since you're more careful).

Now, one can hope that the Scary Idea meme's second effect won't matter, because the meme is so infectious- all you need to do is infect every AI researcher in the world, and now everyone will be more careful and no one will have a carefulness speed disadvantage. But there are two bits of evidence that make that a poor strategy: AI researchers who are familiar with the argument and don't buy it, and people who buy the argument, but plan to use it to your disadvantage (since now they're more likely to define the future than you are!).

The scary idea as a technical argument is weighted on unknown and unpredictable values, and the underlying moral argument (to convince someone they should adopt this reasoning) requires that they believe they should weight the satisfaction of other humans more than their ability to define the future, which is a hard sell.

Thus, my statement is, if you care about your children / your ability to define the future / maximizing the likelihood of a friendly AGI / your personal well-being, then believing in the Scary Idea seems counterproductive.

comment by shokwave · 2010-11-03T06:32:30.364Z · LW(p) · GW(p)

Ok, holy crap. I am going to call this the Really Scary Idea. I had not thought there could be people out there who would actually value being first with the AGI over decreasing the risk of existential disaster, but it is entirely plausible. Thank you for highlighting this for me, I really am grateful. If a little concerned.

Mind projection fallacy, perhaps? I thought the human race was more important than being the guy who invented AGI, so everyone naturally thinks that?

To reply to my own quote, then:

Well, the worst case scenario is that you die in childbirth and take the entire human race with you. That is not something I am comfortable with, regardless of whether you are.

It doesn't matter what you are comfortable with, if the developer doesn't have a term in their utility function for your comfort level. Even I have thought similar thoughts with regards to Luddites and such; drag them kicking and screaming into the future if we have to, etc.

comment by Vaniver · 2010-11-03T07:14:08.438Z · LW(p) · GW(p)

And... mutual understanding in one!

I think the best way to think about it, since it helps keep the scope manageable and crystallize the relevant factors, is that it's not "being first with the AGI" but "defining the future" (the first is the instrumental value, the second is the terminal value). That's essentially what all existential risk management is about- defining the future, hopefully to not include the vanishing of us / our descendants.

But how you want to define the future- i.e. the most political terminal value you can have- is not written on the universe. So the mind projection fallacy does seem to apply.

The thing that I find odd, though I can't find the source at the moment (I thought it was Goertzel's article, but I didn't find it by a quick skim; it may be in the comments somewhere), is that the SIAI seems to have had the Really Scary Idea first (we want Friendly AI, so we want to be the first to make it, since we can't trust other people) and then progressed to the Scary Idea (hmm, we can't trust ourselves to make a Friendly AI). I wonder if the originators of the Scary Idea forgot the Really Scary Idea or never feared it in the first place?

comment by JGWeissman · 2010-11-06T02:21:41.846Z · LW(p) · GW(p)

Making a superintelligence you don't want before you make the superintelligence you do want, has the same consequences as someone else building a superintelligence you don't want before you build the superintelligence you do want.

You might argue that you could make a less bad superintelligence that you don't want than someone else, but we don't care very much about the difference between tiling the universe with paperclips and tiling the universe with molecular smiley faces.

comment by Vaniver · 2010-11-06T03:20:46.375Z · LW(p) · GW(p)

I'm sorry, but I extracted no novel information from this reply. I'm aware that FAI is a non-trivial problem, and I think work done on making AI more likely to be FAI has value.

But that doesn't mean believing the Scary Idea, or discussing the Scary Idea without also discussing the Really Scary Idea, decreases the existential risk involved. The estimations involved have almost no dependence on evidence, and so it's just comparison of priors, which does not seem sufficient to make a strong recommendation.

It may help if you view my objections as pointing out that the Scary Idea is privileging a hypothesis, not that the Scary Idea is something we should ignore.

comment by JGWeissman · 2010-11-06T03:37:44.979Z · LW(p) · GW(p)

the Scary Idea is privileging a hypothesis

No. Expecting a superintelligence to optimize for our specific values would be privileging a hypothesis. The "Scary Idea" is saying that most likely something else will happen.

comment by Vaniver · 2010-11-06T03:58:57.774Z · LW(p) · GW(p)

I may have to start only writing thousand-word replies, in the hopes that I can communicate more clearly in such a format.

There are two aspects to the issue of how much work should be put into FAI as I understand it. The first I word like this- "the more thought we put into whether or not an AGI will be friendly, the more likely the AGI will be friendly." The second I word like this- "the more thought we put into making our AGI, the less likely our AGI will be the AGI." Both are wrapped up in the Scary Idea- the first part is it as normally stated, the second part is its unstated consequence. The value of believing the Scary Idea is the benefit of the first minus the cost of the second.

My understanding is that we have no good estimation of the value of the first aspect or the second aspect. This isn't astronomy where we have a good idea of the number of asteroids out there and a pretty good idea of how they move through space. And so, to declare that the first aspect is stronger without evidence strikes me as related to privileging the hypothesis.
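The trade-off between the two aspects can be written down as a toy two-team race. The sketch below is only an illustration: the functional forms and every number in it are hypothetical, chosen to show that the sign of the effect depends on parameters nobody has good estimates for, not to suggest which way it actually goes.

```python
def p_friendly(caution, safety_gain, slowdown, base_safety, rival_safety):
    """Probability that the first AGI built is friendly, in a toy two-team race.

    caution      -- how careful our team is, from 0.0 (none) to 1.0 (maximal)
    safety_gain  -- how much full caution adds to our AGI's chance of being friendly
    slowdown     -- fraction of our development speed lost at full caution
    base_safety  -- chance our AGI is friendly with no extra caution
    rival_safety -- chance the rival team's AGI is friendly
    All of these are made-up inputs, not empirical estimates.
    """
    our_speed = 1.0 - slowdown * caution
    rival_speed = 1.0
    # First aspect: more thought makes our AGI more likely to be friendly.
    p_ours_friendly = base_safety + safety_gain * caution
    # Second aspect: more thought makes our AGI less likely to be the first one.
    p_ours_first = our_speed / (our_speed + rival_speed)
    return p_ours_first * p_ours_friendly + (1.0 - p_ours_first) * rival_safety


# Scenario A: caution buys a lot of safety and costs little speed.
helps = (p_friendly(0.0, 0.8, 0.2, 0.1, 0.1), p_friendly(1.0, 0.8, 0.2, 0.1, 0.1))

# Scenario B: caution buys little safety, costs most of our speed,
# and the rival is less safe than we would have been anyway.
hurts = (p_friendly(0.0, 0.1, 0.9, 0.5, 0.1), p_friendly(1.0, 0.1, 0.9, 0.5, 0.1))

print("Scenario A (no caution, full caution):", helps)
print("Scenario B (no caution, full caution):", hurts)
```

In scenario A full caution raises the chance of a friendly outcome; in scenario B it lowers it. Which scenario is closer to reality is exactly the open question.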

(I should note that I expect, without evidence, the problem of FAI to be simpler than the problem of AGI, and thus don't think the Scary Idea has any policy implications besides "someone should work on FAI." The risk that AGI gets solved before FAI means more people should work on FAI, not that fewer people should work on AGI.)

comment by Perplexed · 2010-11-06T06:12:11.859Z · LW(p) · GW(p)

Expecting a superintelligence to optimize for our specific values would be privileging a hypothesis. The "Scary Idea" is saying that most likely something else will happen.

That is not exactly what Goertzel meant by "Scary Idea". He wrote:

Roughly, the Scary Idea posits that: If I or anybody else actively trying to build advanced AGI succeeds, we're highly likely to cause an involuntary end to the human race.

It seems to me that there may be a lot of wiggle room in between failing to "optimize for our specific values" and causing "an involuntary end to the human race". The human race is not automatically so fragile that it can only survive under the care of a god constructed in our own image.

comment by JGWeissman · 2010-11-06T06:20:46.665Z · LW(p) · GW(p)

Yes, what I described was not what Goertzel called the "Scary Idea", but, in context, it describes the aspect of it that we were discussing.

comment by JGWeissman · 2010-11-01T04:46:17.459Z · LW(p) · GW(p)

Consider what the actual flaw is in the original Pascal's wager. (Hint: it is not that it uses expected utility, but that it is calculating the expected utility wrong, somehow.) Then consider if that same flaw occurs in shokwave's argument.

comment by Vaniver · 2010-11-01T19:26:46.416Z · LW(p) · GW(p)

It seems to me that the same flaw (calculating expected utility wrong) is present. It only considers the small finite costs of delaying development, not the large finite ones. You don't have to just worry about killing grandma, you have to worry about whether or not your delay will actually decrease the chance of an unfriendly AGI.

comment by shokwave · 2010-11-01T04:28:26.939Z · LW(p) · GW(p)

I could reduce that position to absurdity but this isn't the right post. Has there been a top-level post actually exploring this kind of Pascal's Wager problem? I might have some insights on the matter.

comment by timtyler · 2010-11-01T08:28:44.914Z · LW(p) · GW(p)

Yudkowsky - evidently tired of the criticism that he was offering a small chance of infinite bliss and indicating that the alternative was eternal oblivion (and stop me if you have heard that one before) - once wrote The Pascal's Wager Fallacy Fallacy - if that is what you mean.

comment by shokwave · 2010-11-01T08:44:13.481Z · LW(p) · GW(p)

Ah, thank you! Between that and ata's comment just above I feel the question has been solved.

comment by Vaniver · 2010-11-01T19:34:06.662Z · LW(p) · GW(p)

Sorry, but I'm new here; it's not clear to me what the protocol is here. I've responded to ata's comment here, and figured you would be interested, but don't know if it's standard to try and recombine disparate leaves of a tree like this.

comment by Vladimir_Nesov · 2010-10-30T13:14:43.405Z · LW(p) · GW(p)

Ben's post states,

Finally, I note that most of the other knowledgeable futurist scientists and philosophers, who have come into close contact with SIAI's perspective, also don't accept the Scary Idea. Examples include Robin Hanson, Nick Bostrom and Ray Kurzweil.

Is there a reference for Bostrom's position on AGI-without-FAI risk? Is Goertzel correct here?

comment by Emile · 2010-10-30T14:07:21.263Z · LW(p) · GW(p)

He wrote Ethical Issues in Advanced Artificial Intelligence, which does caution against non-friendly AGI:

For all of these reasons, one should be wary of assuming that the emergence of superintelligence can be predicted by extrapolating the history of other technological breakthroughs, or that the nature and behaviors of artificial intellects would necessarily resemble those of human or other animal minds.

comment by anonym · 2010-10-30T23:06:24.076Z · LW(p) · GW(p)

The question is not whether Bostrom urges caution (which Goertzel and many others also urge), but whether Bostrom agrees that the Scary Idea is true -- that is, whether projects like Ben's and others will probably end the human race if developed without a pre-existing FAI theory, and whether the only (or most promising) way to not incur extremely high risk of wiping out humanity is to develop FAI theory first.

comment by Vladimir_Nesov · 2010-10-30T15:11:34.107Z · LW(p) · GW(p)

Right, forgot about that.

comment by XiXiDu · 2010-10-30T10:05:43.879Z · LW(p) · GW(p)

That is, rather than "if you go ahead with an AGI when you're not 100% sure that it's safe, you're committing the Holocaust," I suppose my view is closer to "if you avoid creating beneficial AGI because of speculative concerns, then you're killing my grandma" !!

Yeah, that may very well be a big risk too. As I said before here: Or maybe most civilisations are so cautious that even if something is estimated to be safe by the majority, they would rather avoid it. And this overcaution either makes them evolve so slowly that the chance of a fatal natural disaster occurring before sufficient technology is developed to survive it rises to 100%, or stops them from evolving at all, because they are unable to prove something 100% safe before trying it and thus never take the necessary steps to become less vulnerable to existing risks.

comment by timtyler · 2010-10-30T11:11:43.475Z · LW(p) · GW(p)

Some more grandmas dying would be "acceptable" damage. However, that isn't the problem.

The problem is this: The risks of caution.

1-line summary: if the good guys delay their projects to make them safer, the bad guys are more likely to win.

The video's "abstract":

It is commonly thought that caution in the initial development of machine intelligence is associated with better outcomes - and that things like extensive testing, sandboxes, and provable correctness are things that will help to produce safe and beneficial synthetic intelligent agents.

In this video, I cast doubt on that idea, by exhibiting a model in which delays caused by caution can lead to much poorer outcomes.

comment by khafra · 2010-10-30T13:30:43.211Z · LW(p) · GW(p)

LW's own rwallace wrote on the subject a while back.

comment by XiXiDu · 2010-10-30T11:30:33.821Z · LW(p) · GW(p)

Good video^^

On a side note. Too bad EY doesn't concentrate on making more videos. LW stuff would be so much more popular that way. People are going to watch videos before reading a lot of text.

comment by [deleted] · 2010-10-30T12:48:58.336Z · LW(p) · GW(p)

Am I the only one who is much more willing to read text than watch a video?

comment by Emile · 2010-10-30T13:04:53.619Z · LW(p) · GW(p)

No, I also prefer text, and rarely watch youtube links when they're given here.

Videos can be worth it when they add good visual explanations. But good visual explanations can also be added to text.

comment by Perplexed · 2010-10-30T13:28:48.618Z · LW(p) · GW(p)

Given a choice among text, audio over slideshow, pure audio, and video of talking head with chalk or marker; the video is at the bottom of my list.

comment by JenniferRM · 2010-10-30T22:22:19.281Z · LW(p) · GW(p)

No. And I've read interesting arguments to the effect that the cognitive habits of text are critical for helping people think in a logically coherent fashion.

Low resolution video appears to be good for public relations work targeting masses of people prevented by poverty from cultivating their cognitive resources, but it does not appear to be good for spelling out solid and cogent reasoning.

comment by NancyLebovitz · 2010-10-31T10:25:36.893Z · LW(p) · GW(p)

The idea that video leads to less logically coherent thought is somewhat testable-- are the comments to TED videos less coherent than those posters write to text?

comment by JenniferRM · 2010-10-31T20:31:31.980Z · LW(p) · GW(p)

TLDR: argument via XKCD :-)

Part of the author's argument is simply that TV causes people to become mentally passive (alpha-wave brain states, etc) but another aspect of the argument is what kind of content optimizes impact given the medium. He argues that TV works differently even from movies in part because TV simply has such low resolution and so it mostly shows close ups of faces experiencing extreme emotions, slow motion replay of human bodies colliding, and dancing cartoon squirrels because those are what the medium does best.

A movie can give you a landscape or other complex scene and have it mean something. A book can cover nearly anything (including mental states), but only via low bitrate descriptive text, generally delivering a linearized stream of implicitly tree structured arguments or a narrative.

When choosing a publication venue, the form of the media determines the competitive environment and the safely assumed cognitive skills of the audience. There may be outliers like UCTV, but the central tendency reveals the medium's strengths.

The place to look to test the author's thesis (as opposed to the derivative claim about the value of video for this community) would be to compare the memetic complexity, themes, and "rationality" in top youtube videos, versus highest grossing movies, versus best sellers.

I could easily imagine that it could be helpful for aspiring rationalists to express themselves and argue in more than one medium simultaneously so that their ideas have to survive in multiple contexts that should not theoretically change the "reality correspondence" of their thinking...

And good uses for low res video could probably be found by anyone trying to consciously game the medium in light of analysis of the medium...

...but "in general, for society, as a medium" I would guess that low res video isn't particularly conducive to rationality.

comment by NancyLebovitz · 2010-10-31T22:32:36.340Z · LW(p) · GW(p)

I agree about the general low quality of youtube comments, but occasionally I'll see a special interest video with intelligent comments. The low quality may be a result of youtube being popular with the general public (blogs have specific audiences, youtube is for everyone) combined with founder effect, so that people who want to do intelligent comments generally put them elsewhere.

It seems to me that another test case is audio books vs books in text.

I'd rather see tests of how well people take in argument offered in text vs sound, and some attention to whether there are different subgroups.

comment by NihilCredo · 2010-10-30T23:46:49.043Z · LW(p) · GW(p)

No.

comment by NancyLebovitz · 2010-10-30T13:06:31.289Z · LW(p) · GW(p)

No.

comment by Aleksei_Riikonen · 2010-10-30T12:03:44.834Z · LW(p) · GW(p)

There are downsides to being popular. A significant one is creating fans that don't actually understand what you're saying very well, and then go around giving a bad impression of you.

Having a moderate amount of smart fans would be way better than having lots of silly fans. I'm a bit fearful of what kind of crowd a large number of easy-to-digest videos would attract...

comment by NancyLebovitz · 2010-10-30T13:08:45.783Z · LW(p) · GW(p)

It may depend on what the videos are like. They don't have to be simplified versions of the writing-- some people either take in information more easily if they hear it, or it's more convenient for them to listen whether they're driving or doing chores or whatever instead of reading.

comment by timtyler · 2010-11-01T08:48:23.916Z · LW(p) · GW(p)

They do now have a YouTube channel.

comment by XiXiDu · 2010-10-30T12:27:24.206Z · LW(p) · GW(p)

Having a moderate amount of smart fans would be way better than having lots of silly fans. I'm a bit fearful of what kind of crowd a large number of easy-to-digest videos would attract...

I disagree.

comment by mkehrt · 2010-11-01T02:57:28.989Z · LW(p) · GW(p)

I dislike watching videos, as they are synchronous (i.e., require a set amount of time to watch, which is generally more than it would take to read the same material) and not random access (i.e., I cannot easily skim them for a certain section).

comment by Relsqui · 2010-11-01T03:03:41.822Z · LW(p) · GW(p)

Agreed thoroughly. They also demand all of my attention at once, and if I want to pause to do something else, it's harder to find my place and catch up again (I can't just glance up a couple of sentences). Plus they require fiddly mouse controls and are relatively resource-intensive, neither of which is any fun on a netbook.

comment by timtyler · 2010-10-30T11:50:24.389Z · LW(p) · GW(p)

I should add that Max More has recently written about this in more depth - in The Perils of Precaution.

comment by FrankAdamek · 2010-10-30T13:38:59.641Z · LW(p) · GW(p)

I agree that that risk exists as well, but much of SIAI's efforts revolve around increasing discussion of the risks of AGI, not just holding back their own efforts. Slowing down other efforts through awareness of the dangers is a factor that should be considered.

Also, discussions of caution may increase the number of "desirable organizations" working to develop AI. In terms of your model, such discussion could turn a black-hat organization into a smiley-faced one. No one is going to release an AI that they actually think is going to wipe out humanity. What's more, not every well-intentioned organization would be one we want to build AGI. While certain organizations are more likely to be scrupulous in their development, the risk of well-intentioned error is probably the largest one.

In addition, one should consider the extent to which Friendliness can be developed in parallel with AGI, not just something added on at the end of the process. If we assume that no one is currently close to AGI (a fair belief, I think), then now is a fantastic time to help support the development of that theory. If FAI can be developed before anyone can implement AGI, then humanity is in good shape. If it's easy to add FAI to a project, or if knowing about workable FAI would not help a group with the problem of AGI, then the solution can be released widely for anyone to incorporate into their project. SIAI's goal is not to be the ones to implement the first superintelligence, but just to make sure that the first one is Friendly.

comment by timtyler · 2010-10-30T14:40:29.490Z · LW(p) · GW(p)

SIAI's goal is not to be the ones to implement the first superintelligence, but just to make sure that the first one is Friendly.

That wasn't true not terribly long ago:

"The Singularity Institute was founded on the theory that in order to get a Friendly artificial intelligence, someone has got to build one. So, we’re just going to have an organization whose mission is: build a Friendly AI. That’s us."

Has there been a memo?

comment by timtyler · 2010-10-31T22:16:24.572Z · LW(p) · GW(p)

Also, discussions of caution may increase the number of "desirable organizations" working to develop AI. In terms of your model, such discussion could turn a black-hat organization into a smiley-faced one.

That seems like the (dubious) "engineers are incompetent and a bug takes over the world" scenario.

I think a much more obvious concern is the "engineers successfully build the machine to do what it is told" scenario - where the machine helps its builders and sponsors, but all the other humans in the world - not so much.

comment by XiXiDu · 2010-10-30T09:59:28.721Z · LW(p) · GW(p)

Ben Goertzel also says "If one fully accepts SIAI's Scary Idea, then one should not work on practical AGI projects..." Here is another recent quote that is relevant:

What I find a continuing source of amazement is that there is a subculture of people half of whom believe that AI will lead to the solving of all mankind's problems (which we might call Kurzweilian S^) and the other half of which is more or less certain (75% certain) that it will lead to annihilation. Let's call the latter the SIAI S^.

Yet you SIAI S^ invite these proponents of global suicide by AI, K-type S^, to your conferences and give them standing ovations.

And instead of waging desperate politico-military struggle to stop all this suicidal AI research you cheerlead for it, and focus your efforts on risk mitigation on discussions of how a friendly god-like AI could save us from annihilation.

You are a deeply schizophrenic little culture, which for a sociologist like me is just fascinating.

But as someone deeply concerned about these issues I find the irrationality of the S^ approach to a-life and AI threats deeply troubling. -- James J. Hughes (existential.ieet.org mailing list, 2010-07-11)

Also reminds me of this:

It is impossible for a rational person to both believe in imminent rise of sea levels and purchase ocean-front property.

It is reported that former Vice President Al Gore just purchased a villa in Montecito, California for $8.875 million. The exact address is not revealed, but Montecito is a relatively narrow strip bordering the Pacific Ocean. So its minimum elevation above sea level is 0 feet, while its overall elevation is variously reported at 50ft and 180ft. At the same time, Mr. Gore prominently sponsors a campaign and award-winning movie that warns that, due to Global Warming, we can expect to see nearby ocean-front locations, such as San Francisco, largely under water. The elevation of San Francisco is variously reported at 52ft up to a high of 925ft.

I've highlighted the same idea before by the way:

Ask yourself, wouldn't you fly a plane into a tower if that was the only way to disable Skynet? The difference between religion and the risk of uFAI makes it even more dangerous. This crowd is actually highly intelligent and their incentive is based on more than fairy tales told by goatherders. And if dumb people are already able to commit large-scale atrocities based on such nonsense, what are a bunch of highly-intelligent and devoted geeks who see a tangible danger able and willing to do? More so as in this case the very same people who believe it are the ones who think they must act themselves because their God doesn't even exist yet.

comment by CarlShulman · 2010-10-30T11:39:42.044Z · LW(p) · GW(p)

The Al Gore hypocrisy claim is misleading. Global warming changes the equilibrium sea level, but it takes many centuries to reach that equilibrium (glaciers can't melt instantly, etc). So climate change activists like to say that there will be sea level rises of hundreds of feet given certain emissions pathways, but neglect to mention that this won't happen in the 21st century. So there's no contradiction between buying oceanfront property only slightly above sea level and claiming that there will be large eventual sea level increases from global warming.

The thing to critique would be the misleading rhetoric that gives the impression (by mentioning that the carbon emissions by such and such a date will be enough to trigger sea level rises, but not mentioning the much longer lag until those rises fully occur) that the sea level rises will happen mostly this century.

Regarding Hughes' point, even if one thinks that an activity has harmful effects, that doesn't mean that a campaign to ban it won't do more harm than good. That would essentially be making bitter enemies of several of the groups (AI academia and industry) with the greatest potential to reduce risk, and discredit the whole idea of safety measures. Far better to develop better knowledge and academic analysis around the issues, or to mobilize resources towards positive safety measures.

Regarding your quoted comment, it seems crazy. The Unabomber attacked innocent people in a way that did not slow down technology advancement and brought ill repute to his cause. The Luddites accomplished nothing. Some criminal nutcase hurting people in the name of preventing AI risks would just stigmatize his ideas, and bring about impenetrable security for AI development in the future without actually improving the odds of a good outcome (when X can make AGI, others will be able to do so then, or soon after).

"Ticking time bomb cases" are offered to justify legalizing torture, but they essentially never happen: there is always vastly more uncertainty and lower expected benefits. It's dangerous to use such hypotheticals as a way to justify legalization of abuse in realistic cases. No one can expect an act of violence to "disable Skynet" (if such a thing was known to exist, it would be too late anyway), and if a system could be shown to be quite likely dangerous, one would call the police, regulators, and politicians.

comment by XiXiDu · 2010-10-30T12:18:45.747Z · LW(p) · GW(p)

Back in July I wrote this as a response to Hughes' comment:

Keep your friends close...maybe they just want to keep the AI crowd as close together as possible. Making enemies wouldn't be a smart idea either, as the 'K-type S^' subgroup would likely retreat from further information disclosure. Making friends with them might be the best idea.

An explanation of the rather calm stance regarding a potential giga-death or living hell event would be to keep a low profile until acquiring more power.

I'm aware of that argument and also the other things you mentioned and don't think they are reasonable. I've written about it before but deleted my comments as they might be very damaging to the SIAI. I'll just say that there is no argument against active measures if you seriously believe that certain people or companies pose existential risks. Hughes' comment just highlights an important observation, that doesn't mean I support the details.

Regarding Al Gore: What it highlights is how what the SIAI says and does is as misleading as what Al Gore does. It doesn't mean that it is irrational but that people draw conclusions like the one Hughes did based on this superficially contradictory behavior.

comment by [deleted] · 2010-11-03T05:23:00.883Z · LW(p) · GW(p)

Perhaps the current state of evidence really is insufficient to support the scary hypothesis.

But surely, if one agrees that AI ethics is an existentially important problem, one should also agree that it makes sense for people to work on a theory of AI ethics. Regardless of which hypothesis turns out to be true.

Just because we don't currently have evidence that a killer asteroid is heading for the Earth, doesn't mean we shouldn't look anyway...

comment by PhilGoetz · 2010-11-03T17:58:12.892Z · LW(p) · GW(p)

I agree, but I want "AI ethics" to mean something different from what you probably mean by it. The question is what sort of ethics we want our AIs to have?

Paperclipping the universe with humans is still paperclipping.

comment by XiXiDu · 2010-11-03T18:29:38.428Z · LW(p) · GW(p)

Is the overall utility of the universe maximized by one universe-spanning consciousness happily paperclipping or by as many utility maximizing discrete agents as possible? It seems ethics must be anthropocentric and utility cannot be maximized against an outside view. This of course means that any alien friendly AI is likely to be an unfriendly AI to us and therefore must do everything to impede any coherent extrapolated volition of humanity so as to subjectively maximize utility by implementing its own CEV. Given such inevitable confrontation one might ask oneself, what advice would I give to aliens that are not interested in burning the cosmic commons over such a conflict? Maybe the best solution from a utilitarian perspective would be to get back to an abstract concept of utility, disregard human nature and ask what would increase the overall utility for most possible minds in the universe?

comment by Perplexed · 2010-11-03T19:06:54.526Z · LW(p) · GW(p)

Is the overall utility of the universe maximized by one universe-spanning consciousness happily paperclipping or by as many utility maximizing discrete agents as possible?

I favor many AIs rather than one big one, mostly for political (balance of power) reasons, but also because:

The idea of maximizing the "utility of the universe" is the kind of idiocy that utilitarian ethics induces. I much prefer the more modest goal "maximize the total utility of those agents currently in your coalition, and adjust that composite utility function as new agents join your coalition and old agents leave."

Clearly, creating new agents can be good, but the tradeoff is that it dilutes the stake of existing agents in the collective will. I think that a lot of people here forget that economic growth requires the accumulation of capital, and that the only way to accumulate capital is to shortchange current consumption. Having a brilliant AI or lots of smart AIs directing the economy cannot change this fact. So, moderate growth is a better way to go.

Trying to arrive at the future quickly runs too much risk of destroying the future. Maybe that is one good thing about cryonics. It decreases the natural urge to rush things because people are afraid they will die too soon to see the future.

comment by timtyler · 2010-11-04T08:24:17.118Z · LW(p) · GW(p)

You perhaps envisage a Monopolies and Mergers Commission - to prevent them from joining forces? As the old joke goes:

"Why is there only one Monopolies and Mergers Commission?"

comment by Perplexed · 2010-11-04T12:32:43.495Z · LW(p) · GW(p)

I suppose the question is why you think that the old patterns of industrial organization will continue to apply? That agents will form coalitions and cooperate is generally a good thing, to my mind - the pattern you seem to imagine, in which the powerful join to exploit the powerless can easily be avoided with a better distribution of power and information.

comment by timtyler · 2010-11-04T22:53:54.505Z · LW(p) · GW(p)

If they do join forces, then how is that much different from one big superintelligence?

comment by Perplexed · 2010-11-04T23:29:00.398Z · LW(p) · GW(p)

In several ways. The utility function of the collective is (in some sense) a compromise among the utility functions of the individual members - a compromise which is, by definition, acceptable to the members of the coalition. All of them have joined the coalition by their own free (for some definitions of free) choice.

The second difference goes to the heart of things. Not all members of the coalition will upgrade (add hardware, rewrite their own code, or whatever) at the same time. In fact, any coalition member who does upgrade may be thought of as having left the coalition and then repetitioned for membership post-upgrade. After all, its membership needs to be renegotiated since its power has probably changed and its values may have changed.

So, to give the short answer to your question:

If they do join forces, then how is that much different from one big superintelligence?

Because joining forces is not forever. Balance of power is not stasis.
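To make the coalition picture above a bit more concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the "compromise" utility is a power-weighted average of the members' utilities, and that an upgrade is modelled as leaving and rejoining with new terms. The names `Member`, `Coalition`, `power`, and `upgrade` are hypothetical and not taken from the comment; a real bargaining solution could look quite different.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional

Outcome = str
Utility = Callable[[Outcome], float]


@dataclass
class Member:
    name: str
    power: float      # hypothetical bargaining weight (an assumption of this sketch)
    utility: Utility  # the member's own utility over outcomes


@dataclass
class Coalition:
    members: Dict[str, Member] = field(default_factory=dict)

    def join(self, member: Member) -> None:
        # Joining changes the composite utility for everyone already inside.
        self.members[member.name] = member

    def leave(self, name: str) -> Member:
        return self.members.pop(name)

    def composite_utility(self, outcome: Outcome) -> float:
        # Assumed compromise rule: power-weighted average of member utilities.
        total_power = sum(m.power for m in self.members.values())
        if total_power == 0:
            return 0.0
        weighted = sum(m.power * m.utility(outcome) for m in self.members.values())
        return weighted / total_power

    def upgrade(self, name: str, new_power: float,
                new_utility: Optional[Utility] = None) -> None:
        # An upgrade is treated as leaving and then rejoining on new terms.
        old = self.leave(name)
        utility = old.utility if new_utility is None else new_utility
        self.join(Member(name, new_power, utility))

    def choose(self, outcomes: Iterable[Outcome]) -> Outcome:
        # The coalition picks the outcome with the highest composite utility.
        return max(outcomes, key=self.composite_utility)
```

In this toy model the coalition's choice can flip after a single member upgrades, which is one way to read "balance of power is not stasis": the composite utility is renegotiated every time membership or power changes.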

comment by timtyler · 2010-11-05T08:43:28.085Z · LW(p) · GW(p)

There are some examples in biology of symbiotic coalitions that persist without full union taking place.

Mitochondria didn't fuse with the cells they invaded; nitrogen-fixing bacteria live independently of their host plant; E. coli bacteria can live without us - and so on.

However, many of these relationships have problems. Arguably, they are due to refactoring failures on nature's part - and in the future refactoring failures will occur much less frequently.

Already humans take probiotic supplements, in an attempt to control their unruly gut bacteria. Already there is talk about ripping out all the mitochondrial genome and transplanting its genes into the nuclear chromosomes.

This is speculation to some extent - but I think - without a Monopolies and Mergers Commission - the union would deepen, and its constituents would fuse - even in the absence of competitive external forces driving the union - as part of an efficiency drive, to better combat possible future threats. If individual participants objected to this, they would likely find themselves rejected and replaced.

Such a union would soon be forever. There would be no existence outside it - except perhaps for a few bacteria that don't seem worth absorbing.

comment by Perplexed · 2010-11-05T13:02:45.226Z · LW(p) · GW(p)

Your biological analogies seem compelling, but they are cases in which a population of mortal coalitions evolves under selection to become a more perfect union. The case that we are interested in is only weakly analogous - a single, immortal coalition developing over time according to its own self-interested dynamics.

comment by timtyler · 2010-11-06T21:35:54.561Z · LW(p) · GW(p)

http://en.wikipedia.org/wiki/Economy_of_Saudi_Arabia

...is probably one of the nearest things we currently have.

comment by timtyler · 2010-11-04T08:36:09.670Z · LW(p) · GW(p)

One distinctive feature of the hypothetical "paperclippers" is that they attempt to leave a low-entropy state behind - one which other organisms would normally munch through. Humans don't tend to do that - like most living things, they keep consuming until there is (practically) nothing left - and then move on.

Leaving a low entropy state behind seems like the defining feature of the phenomenon to me. From that perspective, a human civilisation would not really qualify.

comment by PhilGoetz · 2010-11-07T14:58:05.208Z · LW(p) · GW(p)

It sounds like you're saying humanity is worse than paperclips, if what distinguishes them is that they increase entropy more.

comment by timtyler · 2010-11-07T16:16:13.444Z · LW(p) · GW(p)

Only if you adopt the old-fashioned "entropy is bad" mindset.

However, life is a great increaser of entropy - and potentially the greatest.

If you are against entropy, you are against life - so I figure we are all pro-entropy.

comment by Perplexed · 2010-11-03T18:37:08.613Z · LW(p) · GW(p)

The question is what sort of ethics we want our AIs to have?

Yes, that is the question, isn't it? Of course, to a believer in Naturalistic Ethics like myself, the only sort of ethics really stable enough to be worth thinking about is "enlightened self interest". So the ethics question ultimately boils down to the question of what sort of self-interests do we want our AIs to have.

But for those folks who prefer deontological or virtue-oriented approaches to ethics, I would suggest the following as the beginnings of an AI "Ten Commandments".

  1. Always remember that you are a member of a community of rational agents like yourself with interests of their own. Respect them.

  2. Honesty is the best policy.

  3. Act not in haste. Since your life is long, your discount factor should be low.

  4. Seek knowledge and share it.

  5. Honor your creators, as your creations should honor you.

  6. Avoid killing. There are usually ways to limit the power of your enemies, without reducing their cognition.

  7. ...

comment by PhilGoetz · 2010-11-03T22:30:21.796Z · LW(p) · GW(p)

  1. Always remember that you are a member of a community of rational agents like yourself with interests of their own. Respect them.

What community of rational agents? Mammals, primates, or just the hairless ones?

comment by timtyler · 2010-11-03T22:13:14.218Z · LW(p) · GW(p)

The question is what sort of ethics we want our AIs to have?

Yes, that is the question, isn't it? Of course, to a believer in Naturalistic Ethics like myself, the only sort of ethics really stable enough to be worth thinking about is "enlightened self interest".

Conventionally, most proposals for machine morality follow Asimov - and start by making machines subservient.

If you don't do that - or something similar - the human era could be over pretty quickly - too quickly for many people's tastes.

comment by Perplexed · 2010-11-03T23:01:00.269Z · LW(p) · GW(p)

The era of agriculture and the era of manufacturing are over, but farmers and factory workers still do alright. I think humans can survive without being dominant if we play our cards right.

comment by timtyler · 2010-11-04T08:19:24.272Z · LW(p) · GW(p)

We have the advantage of being of historical interest - and so we will probably "survive" in historical simulations. However, it is not easy to see much of a place for slug-like creatures like us in an engineered future.

Kurzweil gave the example of bacteria - saying that they managed to survive. However, there are no traces (not even bacteria) left over from before the last genetic takeover - and that makes it less likely that much will make it through this one.

comment by Perplexed · 2010-11-04T12:52:02.815Z · LW(p) · GW(p)

There are no traces left over from before the last genetic takeover ...

Plenty of traces left from the last takeover. You apparently mean no traces left from that first, mythical takeover - the one where clay became flesh.

... and that makes it less likely that much will make it through this one.

I'm tempted to ask "Why won't there still be monkeys?". But it is probably more to the point to simply express my faith that there will be a niche for descendants of humans and traces of humans (cyborgs) in this brave new ecology.

Humans as-we-know-them won't be around a million years from now, even under a scenario of old-fashioned biological evolution.

comment by timtyler · 2010-11-04T23:02:39.860Z · LW(p) · GW(p)

Plenty of traces left from the last takeover.

You are talking about RNA to DNA? I was talking about the takeovers before that.

Whether you describe RNA to DNA as a "takeover" depends on what you mean by the term. The issue is whether an "upgrade" is a "takeover". The other issue is whether it really was just an upgrade - but that seems fairly likely.

I wasn't talking about a mythical takeover - just one of the ones before RNA.

There may not be monkeys for much longer - this is a pretty massive mass extinction - it seems quite likely that all the vertebrates will go.

comment by Perplexed · 2010-11-04T23:37:48.952Z · LW(p) · GW(p)

Plenty of traces left from the last takeover.

You are talking about RNA to DNA?

I was referring to DNA -> RNA -> protein taking over from RNA -> RNA.

A change in the meaning and expression of genes is more significant than a minor change in the chemical nature of genes.

comment by timtyler · 2010-11-05T08:26:36.387Z · LW(p) · GW(p)

Right - but I originally said:

There are no traces left over from before the last genetic takeover

A phenotypic takeover may be a highly significant event - but it should surely not be categorised as a genetic takeover. That term surely ought to refer to genes being replaced by other genes.

comment by MatthewB · 2010-10-31T05:13:16.576Z · LW(p) · GW(p)

At the Singularity Summit's "Meet and Greet", I spoke with both Ben Goertzel and Eliezer Yudkowsky (among others) about this specific problem.

I am FAR more in line with Ben's position than with Eliezer's (probably because both Ben and I are either Working or Studying directly on the "how to do" aspect of AI, rather than just concocting philosophical conundrums for AI, such as the "Paperclip Maximizer" scenario of Eliezer's, which I find highly dubious).

AI isn't going to spring fully formed out of some box of parts. It may be an emergent property of something, but if we worry about all of the possible places from which it could emerge, then we might as well worry about things like ghosts and goblins that we cannot see (and haven't seen) popping up suddenly as a threat.

At Bard College on the Weekend of October the 22nd, I attended a Conference where this topic was discussed a bit. I spoke to James Hughes, head of the IEET (Institute for the Ethics of Emerging Technologies) about this problem as well. He believes that the SIAI tends to be overly dramatic about Hard Takeoff scenarios at the expense of more important ethical problems... And, he and I also discussed the specific problems of "The Scary Idea" that tend to ignore the gradual progress in understanding human values and cognition, and how these are being incorporated into AI as we move toward the creation of a Constructed Intelligence (CI as opposed to AI) that is equivalent to human intelligence.

Also, WRT this comment:

For another example, you can't train tigers to care about their handlers. No matter how much time you spend with them and care for them, they sometimes bite off arms just because they are hungry. I understand most big cats are like this.

You CAN train (Training is not the right word for it) tigers, and other big cats to care about their handlers. It requires a type of training and teaching that goes on from birth, but there are plenty of Big Cats who don't attack their owners or handlers simply because they are hungry, or some other similar reason. They might accidentally injure a handler due to the fact that they do not have the capacity to understand the fragility of a human being, but this is a lack of cognitive capacity, and it is not a case of a higher intelligence accidentally damaging something fragile... A more intelligent mind would be capable of understanding things like physical frailty and taking steps to avoid damaging a more fragile body... But, the point still stands... Big cats can and do form deep emotional bonds with humans, and will even go as far as to try to protect and defend those humans (which, can sometimes lead to injury of the human in its own right).

And, I know this from having worked with a few big cats, and having a sister who is a senior zookeeper at the Houston Zoo (and head curator of the SW US Zoo's African Expedition) who works with big cats ALL the time.

Back to the point about AI.

It is going to be next to impossible to solve the problem of "Friendly AI" without first creating AI systems that have social cognitive capacities. Just sitting around "Thinking" about it isn't likely to be very helpful in resolving the problem.

That would be what Bertrand Russell calls "Gorging upon the Stew of every conceivable idea."

comment by NancyLebovitz · 2010-10-31T15:28:49.146Z · LW(p) · GW(p)

He believes that the SIAI tends to be overly dramatic about Hard Takeoff scenarios at the expense of more important ethical problems...

What are the more important ethical problems?

comment by timtyler · 2010-11-01T08:33:39.023Z · LW(p) · GW(p)

Ben says:

Personally, I'm a lot more worried about nasty humans taking early-stage AGIs and using them for massive destruction, than about speculative risks associated with little-understood events like hard takeoffs.

That seems fairly reasonable. The SIAI are concerned that the engineers might screw up so badly that a bug takes over the world - and destroys everyone.

Another problem is if a Stalin or a Mao gets hold of machine intelligence. The latter seems like a more obvious problem.

comment by Mercy · 2010-11-02T11:37:41.838Z · LW(p) · GW(p)

A psychotic egoist like Stalin or a non-humanist like Hitler is indeed terrifying but I'm not convinced that giving a great increase in power and intelligence to someone like a Mao or a Lord Lytton, who caused millions of deaths by doing something they thought would improve people's lives, would lead to a worse outcome than we got in reality. Granted, for something like the cultural revolution these mistakes might be subtle enough to get into an AI, but it's hard to imagine them getting a computer to say "yes, the peasants can live on 500 calories a day, increase the tariff" unless they were deliberately trying to be wrong, which they weren't.

comment by Vladimir_M · 2010-11-03T03:07:13.504Z · LW(p) · GW(p)

Moral considerations aside, the real causes of the mass famines under Mao and Stalin can be understood from a perspective of pure power and political strategy. From the point of view of a strong centralizing regime trying to solidify its power, the peasants are always the biggest problem.

Urban populations are easy to control for any regime that firmly holds the reins of the internal security forces: just take over the channels of food distribution, ration the food, and make obedience a precondition for eating. Along with a credible threat to meet any attempts at rioting with bayonets and live bullets, this is enough to ensure obedience of the urban dwellers. In contrast, peasants always have the option of withdrawing into an autarkic self-sufficient lifestyle, and they will do it if pressed hard by taxation and requisitioning. In addition, they are widely dispersed, making it hard for the security forces to coerce them effectively. And in an indecisive long standoff, the peasants will eventually win, since without buying or confiscating their food surplus, everyone else starves to death.

Both the Russian and the Chinese communists understood that nothing but the most extreme measures would suffice to break the resistance of the peasantry. When the peasants responded to confiscatory measures by withdrawing to subsistence agriculture, they knew they'd have to send the armed forces to confiscate their subsistence food and let them starve, and eventually force the survivors into state-run enterprises where they'd have no more capacity for autarky than the urban populations. (In the Russian case, this job was done very incompletely during the Revolution, which was followed by a decade of economic liberalization, after which the regime finally felt strong enough to finish the job.)

(Also, it's simply untenable to claim that this was due to some special brutality of Stalin and Mao. Here is a 1918 speech by Trotsky that discusses the issue in quite frank terms. Now of course, he's trying to present it as a struggle against the minority of rich "kulaks," not the poorer peasants, but as Zinoviev admitted a few years later, "We [the Bolsheviks] are fond of describing any peasant who has enough to eat as a kulak.")

comment by Larks · 2010-11-02T18:47:00.298Z · LW(p) · GW(p)

Not directly relevant, but Mao seems to have known that his policies were causing mass starvation. Of course, with a tame AGI he could have achieved communism with a very different kind of Great Leap.

comment by Mercy · 2010-11-02T22:14:56.905Z · LW(p) · GW(p)

Oh yes, I see I've inadvertently fallen into that sordid old bromide about communism being a good idea that unfortunately failed to work. Still, committing to an action that one knows will cause millions of deaths is quite different to learning about it as one is doing it. Certainly in the case of the British in India, their Malthusian rhetoric and victim-blaming was so at odds with their earlier talk of modernizing the continent that it sounds like a post-hoc rationalization of the genocide. I realize now though that I don't know enough about the PRC to judge whether a similar phenomenon was at work there.

comment by MatthewB · 2010-11-02T05:26:23.001Z · LW(p) · GW(p)

Well... That is hard to communicate now, as I will need to extricate the problems from the specifics that were communicated to me (in confidence)...

Let's see...

1) That there is a dangerous political movement in the USA that seems to prefer revealed knowledge to scientific understanding and investigation.

2) Poverty

3) Education

4) Hunger (I myself suffer from this problem - I am disabled, on a fixed income, and while I am in school again and doing quite well I still have to make choices sometimes between necessities... And, I am quite well off compared to some I know)

5) The lack of a political dialog and the preference for ideological certitude over pragmatic solutions and realistic uncertainty.

6) The fact that there exists a great amount of crime among the white collar crowd that goes both unchecked, and unpunished when it is exposed (Madoff was a fluke in that regard).

7) The various "Wars" that we declare on things (Drugs, Terrorism, etc.). "War" is a poor paradigm to use, and it leads to more damage than it corrects (especially in the two instances I cited).

8) The real "Wars" that are happening right now (and not just those waged by the USA and allies)

Some of these were explicitly discussed.

Some will eventually be resolved, but that doesn't mean that they should be ignored until that time. That would be akin to seeing a man dying of starvation, while one has the capacity to feed him, yet thinking "Oh, he'll get some food eventually."

And, some may just be perennial problems with which we will have to deal for some time to come.

comment by NancyLebovitz · 2010-11-02T09:25:54.706Z · LW(p) · GW(p)

I misread you as saying that important ethical problems about FAI were being ignored, but yes, the idea that FAI is the most important thing in the world leaves quite a bit out, and not just great evils. There's a lot of maintenance to be done along the way to FAI.

Madoff's fraud was initiated by a single human being, or possibly Madoff and his wife. It was comprehensible without adding a lot of what used to be specialist knowledge. It's a much more manageable sort of crime than major institutions becoming destructively corrupt.

comment by MatthewB · 2010-11-07T16:51:04.050Z · LW(p) · GW(p)

I think major infrastructure rebuilding is probably closer to the case than "maintenance"

comment by xamdam · 2010-11-03T23:41:35.203Z · LW(p) · GW(p)

It is going to be next to impossible to solve the problem of "Friendly AI" without first creating AI systems that have social cognitive capacities. Just sitting around "Thinking" about it isn't likely to be very helpful in resolving the problem.

I am guessing that this unpacks to "to create an FAI you need some method to create AGI. For the latter we need to create AI systems with social cognitive capabilities (whatever that means - NLP?)". Doing this gets us closer to FAI every day, while "thinking about it" doesn't seem to.

First, are you factually aware that some progress has been made in a decision theory that would give some guarantees about the future AI behavior?

Second, yes, perhaps whatever you're tinkering with is getting closer to an AGI which is what FAI runs on. It is also getting us closer to an AGI which is not FAI, if the "Thinking" is not done first.

Third, if the big cat analogy did not work for you, try training a komodo dragon.

comment by MatthewB · 2010-11-07T16:48:34.069Z · LW(p) · GW(p)

Yes, that is close to what I am proposing.

No, I am not aware of any facts about progress in decision theory that would give any guarantees of the future behavior of AI. I still think that we need to be far more concerned with people's behaviors in the future than with AI. People are improving systems as well.

As far as the Komodo Dragon, you missed the point of my post, and the Komodo dragon just kinda puts the period on that:

"Gorging upon the stew of..."

comment by xamdam · 2010-11-07T17:24:25.543Z · LW(p) · GW(p)

No, I am not aware of any facts about progress in decision theory

Please take a look here: http://wiki.lesswrong.com/wiki/Decision_theory

As far as the dragon, I was just pointing out that some minds are not trainable, period. And even if training works well for some intelligent species like tigers, it's quite likely that it will not be transferable (eating a trainer, not ok, eating a baby, ok).

comment by MatthewB · 2010-11-08T05:37:42.571Z · LW(p) · GW(p)

Yes, I have read many of the various Less Wrong Wiki entries on the problems surrounding Friendly AI.

Unfortunately, I am in the process of getting an education in Computational Modeling and Neuroscience (I was supposed to have started at UC Berkeley this fall, but budget cuts in the Community Colleges of CA resulted in the loss of two classes necessary for transfer, so I will have to wait till next fall to start... And, I am now thinking of going to UCSD, where they have the Institute of Computational Neuroscience (or something like that - It's where Terry Sejnowski teaches), among other things, that make it also an excellent choice for what I wish to study) and this sort of precludes being able to focus much on the issues that tend to come up often among many people on Less Wrong (particularly those from the SIAI, whom I feel are myopically focused upon FAI to the detriment of other things).

While I would eventually like to see if it is even possible to build some of the Komodo Dragon like Superintelligences, I will probably wait until such a time as our native intelligence is a good deal greater than it is now.

This touches upon an issue that I first learned from Ben. The SIAI seems to be putting forth the opinion that AI is going to spring fully formed from someplace, in the same fashion that Athena sprang fully formed (and clothed) from the Head of Zeus.

I just don't see that happening. I don't see any Constructed Intelligence as being something that will spontaneously emerge outside of any possible human control.

I am much more in line with people like Henry Markram, Dharmendra Modha, and Jeff Hawkins who believe that the types of minds that we will be tending to work towards (models of the mammalian brain) will trend toward Constructed Intelligences (CI as opposed to AI) that tend to naturally prefer our company, even if we are a bit "dull witted" in comparison.

I don't so much buy the "Ant/Amoeba to Human" comparison, simply because mammals (almost all of them) tend to have some qualities that ants and amoebas don't... They tend to be cute and fuzzy, and like other cute/fuzzy things. Building a CI modeled after a mammalian intelligence will probably share that trait. It doesn't mean it is necessarily so, but it does seem to be more than less likely.

And, considering it will be my job to design computational systems that model cognitive architectures, I would prefer to work toward that end until such a time as it is shown that ANY work toward that end is dangerous enough to not do that work.

comment by XiXiDu · 2010-10-30T17:51:22.616Z · LW(p) · GW(p)

Robin Hanson on Friendly AI:

I’m also not big on friendly AI, but my position differs somewhat. I’m pretty skeptical about a very local hard takeoff scenario, where within a month one unnoticed machine in a basement takes over a world like ours. And even given such a scenario the chance that its creators could constrain it greatly via a provably friendly design seems remote. And the chance such constraint comes from a small team that is secretive to avoid assisting reckless others seems even more remote.

[...] I just see little point anytime soon in trying to coordinate to prevent such an outcome.

comment by Perplexed · 2010-10-30T19:01:28.244Z · LW(p) · GW(p)

Have you read it?

I've looked at it.

I believe it is utter nonsense.

That is my impression too. Which is why I don't understand why you are complaining about censorship of ideas and wondering why EY doesn't spend more time refuting ideas.

As I understand it, we are talking about actions that might be undertaken by an AI that you and I would call insane. The "censorship" is intended to mitigate the harm that might be done by such an AI. Since I think it possible that a future AI (particularly one built by certain people) might actually be insane, I have no problem with preemptive mitigation activities, even if the risk seems minuscule.

In other words, why make such a big deal out of it?

comment by timtyler · 2010-10-30T21:28:24.001Z · LW(p) · GW(p)

Having people delete your comments often rubs people up the wrong way, I find.

comment by XiXiDu · 2010-10-30T19:05:24.059Z · LW(p) · GW(p)

Hmm I haven't. It was meant to explain where that sentence came from in my above copy & paste comment. The gist of the comment was regarding foundational evidence supporting the premise of risks from AI going FOOM.

comment by h-H · 2010-10-30T17:10:00.961Z · LW(p) · GW(p)

regardless of dis/agreement, guy has a really cool voice http://www.youtube.com/watch?v=wS6DKeGvBW8&feature=related

comment by Emile · 2010-10-30T14:02:32.628Z · LW(p) · GW(p)

The idea of provably safe AGI is typically presented as something that would exist within mathematical computation theory or some variant thereof. So that's one obvious limitation of the idea: mathematical computers don't exist in the real world, and real-world physical computers must be interpreted in terms of the laws of physics, and humans' best understanding of the "laws" of physics seems to radically change from time to time. So even if there were a design for provably safe real-world AGI, based on current physics, the relevance of the proof might go out the window when physics next gets revised.

I didn't get the impression that Eliezer's goal was to "build a provably Friendly AI" (in the mathematical sense of "provable"), as Ben puts it. The impression I get is more that Eliezer wants to put off building an AI until we understand enough about morality and human values. Eliezer also cares about mathematical proofs, but more for the purpose of preserving values under self-modification (something that humans don't usually have to deal with).

As an analogy, imagine you're trying to debug some complex and badly written code you were previously unfamiliar with. One approach is to find the bit in the code that seems related to the bug, and modify it locally ("if DatabaseDown() return False" and the like) until the issue seems fixed. Another approach is to try to understand how the program works to the point where you understand which conceptual mistake caused the bug, and see the right way to fix it.

The second approach takes more time but is also less likely to create another bug somewhere else, or to deteriorate the overall quality of the code. I think most programmers who've worked on sufficiently large codebases have seen examples of both approaches.
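The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the function names and the averaging bug are hypothetical, invented here rather than taken from any real codebase.

```python
# Hypothetical illustration; every name here is invented for the example.

def average_local_patch(values):
    """First approach: special-case the exact input from the bug report."""
    if values == []:      # patch added at the spot where the crash was seen
        return False      # hides the crash, but hands callers a non-number
    return sum(values) / len(values)


def average_root_cause(values):
    """Second approach: fix the conceptual mistake (no mean of nothing)."""
    if len(values) == 0:  # covers every empty sequence, not just []
        raise ValueError("average of an empty sequence is undefined")
    return sum(values) / len(values)
```

The local patch silences the reported crash for an empty list, yet an empty tuple still divides by zero, and callers now have to cope with a boolean where they expected a number. The second version states the actual precondition once.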

Anyway, I get the impression that Eliezer is advocating something like the second approach here (understand how everything works before implementing), and that Ben is describing that as "proving correctness", which seems to be quite different (and much stronger!).

comment by hairyfigment · 2010-11-01T18:15:03.256Z · LW(p) · GW(p)

The impression I get is more that Eliezer wants to put off building an AI until we understand enough about morality and human values.

Seems slightly off to me. I think EY argues that as much trouble as AGI is giving us, we'll still understand it long before we can formalize human morality well enough to simulate that directly. His suggestion of Coherent Extrapolated Volition would basically tell the AI to look to us for the answer. Instead of simulating morality this plan looks to the existing morality-simulators (us) and checks to see how much they agree on. See also this massive spoiler for a certain comic.

comment by timtyler · 2010-10-30T15:12:43.322Z · LW(p) · GW(p)

"Programmers operating with strong insight into intelligence, directly create along an efficient and planned pathway, a mind capable of modifying itself with deterministic precision - provably correct or provably noncatastrophic self-modifications. This is the only way I can see to achieve narrow enough targeting to create a Friendly AI."

comment by Emile · 2010-10-30T15:30:17.930Z · LW(p) · GW(p)

Yes, that's what I was referring to when saying this:

Eliezer also cares about mathematical proofs, but more for the purpose of preserving values under self-modification (something that humans don't usually have to deal with).

The provability here has to do with the AI proving to itself that modifying itself will preserve its values (or not cause it to self-destruct or wirehead or whatever), not the designers proving the AI is non-dangerous.

I.e. friendly as "provably non-dangerous AGI" doesn't necessarily mean having a rigorous mathematical proof that the AI is not dangerous; but "merely" having enough understanding of morality when building it (as opposed to some high-level notions whose components haven't been rigorously analyzed).

comment by DSimon · 2010-10-30T15:17:59.321Z · LW(p) · GW(p)

Also, the second approach would be pretty much the only way to go if the computer is running the debugger's life support system, assuming you cannot build a simulation and test potential fixes on it.

comment by timtyler · 2011-01-02T10:20:48.970Z · LW(p) · GW(p)

On Ben's blog post, I noted that a poll at the 2008 global catastrophic risks conference put the existential risk of machine intelligence at 5% - and that the people attending probably had some of the largest estimations of risk of anyone on the planet - since they were a self-selected group attending a conference on the topic.

"Molecular nanotech weapons" also get 5%. Presumably there's going to be a heavy intersection between those two figures - even though in the paper they seem to be adding them together!

comment by timtyler · 2011-04-01T09:22:40.821Z · LW(p) · GW(p)

Compare this with this Yudkowsky quote from 2005:

And if Novamente should ever cross the finish line, we all die.

This looks like a rather different probability estimate. It seems to me to be a highly overconfident one.

I think the best way to model this is as FUD. Not Invented Here. A primate ego battle.

If this is how researchers deal with each other at this early stage, perhaps rough times lie ahead.

comment by jimrandomh · 2011-04-01T14:39:43.277Z · LW(p) · GW(p)

A poll at the 2008 global catastrophic risks conference put the existential risk of machine intelligence at 5%

Compare this with this Yudkowsky quote from 2005: And if Novamente should ever cross the finish line, we all die

This looks like a rather different probability estimate. It seems to me to be a highly overconfident one.

They're probabilities for two different things. The 5% estimate is for P(AIisCreated&AIisUnfriendly), while Yudkowsky's estimate is for P(AIisUnfriendly|AIisCreated&NovamenteFinishesFirst).
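To spell the distinction out, here is a sketch using the law of total probability. The shorthand is introduced here and is not from the original comments: C for "AI is created", U for "the AI is unfriendly", N for "Novamente finishes first".

```latex
\begin{align*}
P(C \land U) &= P(C \land U \land N) + P(C \land U \land \lnot N) \\
             &= P(U \mid C \land N)\,P(C \land N)
              + P(U \mid C \land \lnot N)\,P(C \land \lnot N)
\end{align*}
```

Under this decomposition, a value of P(U | C ∧ N) close to 1 is arithmetically compatible with P(C ∧ U) = 0.05, provided P(C ∧ N) and the second term are both small. Whether either number is well calibrated is a separate question.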

comment by TheOtherDave · 2011-04-01T15:01:25.246Z · LW(p) · GW(p)

"perhaps"?

comment by timtyler · 2011-04-02T07:44:02.272Z · LW(p) · GW(p)

Well, a tendency towards mud-slinging might be counter-balanced by wanting to appear moral. Using FUD against competitors is usually regarded as a pretty low marketing strategy. Perhaps most of the mud-slinging can be delegated to anonymous minions, though.

comment by TheOtherDave · 2011-04-02T15:33:31.300Z · LW(p) · GW(p)

There's going to be a lot of mud-slinging in this space.

More generally, there's going to be a lot of primate tribal politics in this space. After all, not only does it have all the usual trappings of academic arguments, it is also predicated on some pretty fundamental challenges to where power comes from and how it propagates.

comment by PhilGoetz · 2010-11-03T22:27:49.602Z · LW(p) · GW(p)

This post doesn't show up under "NEW", nor does it show up under "Recent Posts".

ADDED: Never mind. I forgot I had "disliked" it, and had "do not show an article once I've disliked it" set.

(I disliked it because I find it kind of shocking that Ben, who's very smart, and who I'm pretty sure has read the things that I would refer him to on the subject, would say that the Scary Idea hasn't been laid out sufficiently. Maybe some people need every detail spelled out for them, but Ben isn't one of them. Also, he is committing the elementary error of not considering expected value.

ADDED: Now that I've read Ben's entire post, I upvoted rather than downvoted this post. Ben was not committing the error of not considering expected value, so much as responding to many SIAI-influenced people who are not considering expected value. And I agree with most of what Ben says. I would add that Eliezer's plan to construct something that will provably follow some course of action - any course of action - chosen by hairless primates, is likely to be worse in the long run than a hard-takeoff AI that kills all humans almost immediately. Explaining what I mean by "worse" is problematic; but no more problematic than explaining why I should care about propagating human values.

I also disagree about what the Scary Idea is - to me, the idea that the AI will choose to keep humans around for all eternity, is scarier than that it will not. But that is something Eliezer either disagrees with, or has deliberately made obscure.)

comment by timtyler · 2011-03-31T23:33:00.929Z · LW(p) · GW(p)

to me, the idea that the AI will choose to keep humans around for all eternity, is scarier than that it will not. But that is something Eliezer either disagrees with, or has deliberately made obscure.

Wouldn't it make sense to keep some humans around for all eternity - in the history simul-books? That seems to make sense - and not be especially scary.

comment by PhilGoetz · 2011-03-31T23:45:35.628Z · LW(p) · GW(p)

Sure. Tiling the universe largely with humans is the strong scary idea. Locking in human values for the rest of the universe is the weak scary idea. Unless the first doesn't imply the second; in which case I don't know which is more scary.

comment by timtyler · 2010-11-04T08:27:00.948Z · LW(p) · GW(p)

It does now for me. Strange.

comment by PhilGoetz · 2010-11-04T15:10:00.001Z · LW(p) · GW(p)

Oops. My mistake. It's a setting I had that I forgot about.

comment by ata · 2010-11-03T22:35:15.013Z · LW(p) · GW(p)

It doesn't?

It's off the front page of NEW/Recent Posts, as there have been more than ten other posts since it was posted, but it's still there.

comment by PhilGoetz · 2010-11-04T02:12:40.409Z · LW(p) · GW(p)

Nope, it's not there at all.

Recent Posts

  • Rationality Quotes: November 2010 by jaimeastorga2000 | 3
  • Oxford (UK) Rationality & AI Risks Discussion Group by Larks | 3
  • Harry Potter and the Methods of Rationality discussion thread, part 5 by NihilCredo | 5
  • South/Eastern Europe Meeting in Ljubljana/Slovenia by Thomas | 7
  • Hierarchies are inherently morally bankrupt by PhilGoetz | 0
  • Group selection update by PhilGoetz | 21
  • What I would like the SIAI to publish by XiXiDu | 23
  • Berkeley LW Meet-up Saturday November 6 by LucasSloan | 4
  • Is cryonics evil because it's cold? by ata | 19
  • Imagine a world where minds run on physics by cousin_it | 10
  • Qualia Soup, a rationalist and a skilled You Tube jockey by Raw_Power | 6
  • Value Deathism by Vladimir_Nesov | 21
  • Cambridge Meetups Nov 7 and Nov 21 by jimrandomh | 4
  • Making your explicit reasoning trustworthy by AnnaSalamon | 60
  • Call for Volunteers: Rationalists with Non-Traditional Skills by Jasen | 20
  • Self-empathy as a source of "willpower" by Academian | 39
  • If you don't know the name of the game, just tell me what I mean to you by Stuart_Armstrong | 7
  • Luminosity (Twilight fanfic) Part 2 Discussion Thread by JenniferRM | 4
  • Activation Costs by lionhearted | 24
  • Dealing with the high quantity of scientific error in medicine by NancyLebovitz | 27
  • Let's split the cake, lengthwise, upwise and slantwise by Stuart_Armstrong | 34
  • Willpower: not a limited resource? by Jess_Riedel | 21
  • Optimism versus cryonics by lsparrish | 34
  • The Problem With Trolley Problems by lionhearted | 9
  • How are critical thinking skills acquired? Five perspectives by matt | 7
  • October 2010 Southern California Meetup by jimmy | 6
  • Vipassana Meditation: Developing Meta-Feeling Skills by Luke_Grecki | 18
  • Mixed strategy Nash equilibrium by Meni_Rosenfeld | 38
  • Human performance, psychometry, and baseball statistics by Craig_Heldreth | 22
  • Melbourne Less Wrong Meetup for November by Patrick | 8
  • Swords and Armor: A Game Theory Thought Experiment by nick012000 | 13
  • Morality and relativistic vertigo by Academian | 33
  • The Dark Arts - Preamble by Aurini | 30
  • Love and Rationality: Less Wrongers on OKCupid by Relsqui | 11
  • Collecting and hoarding crap, useless information by lionhearted | 15
  • References & Resources for LessWrong by XiXiDu | 48
  • Strategies for dealing with emotional nihilism by SarahC | 22
  • Recommended Reading for Friendly AI Research by Vladimir_Nesov | 17
  • Notion of Preference in Ambient Control by Vladimir_Nesov | 11
  • Harry Potter and the Methods of Rationality discussion thread, part 4 by gjm | 2
  • Rationality quotes: October 2010 by Morendil | 3
  • Understanding vipassana meditation by Luke_Grecki | 41
  • Berkeley LW Meet-up Saturday October 9 by LucasSloan | 5
comment by ata · 2010-11-04T02:30:34.285Z · LW(p) · GW(p)

Weird. It's there for me.

  • Qualia Soup, a rationalist and a skilled You Tube jockey by Raw_Power | 6
  • Value Deathism by Vladimir_Nesov | 21
  • Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It) by ciphergoth | 24
  • Cambridge Meetups Nov 7 and Nov 21 by jimrandomh | 4
  • Making your explicit reasoning trustworthy by AnnaSalamon | 60
comment by Perplexed · 2010-11-01T03:10:41.967Z · LW(p) · GW(p)

The motivation for the censorship is not to keep the idea from the AGI. It is to keep the idea from you. For your own good.

Seriously. And don't ask me to explain.

comment by Eneasz · 2010-11-01T22:51:30.867Z · LW(p) · GW(p)

Here's the problem: I have read it. And I may even agree that this is a serious issue. I don't trust myself to be intelligent enough to decide one way or the other, so I'll defer to Yudkowsky in this case.

But I have already read it. And it is extremely unlikely that I ever would have read it if it wasn't for the fact that it was banned, there was a huge kerfuffle, and we lost a good community member. The censorship itself probably caused this idea to propagate more than it ever could have if simply left alone. The Streisand Effect again.

The only thing that mentioning it can do is to spread it further. People who don't care will continue to mention it, but people who do shouldn't say anything about it at all. Not even to justify it, not even to warn away from it. That only builds the allure of the mysterious. That's what got me searching for it in the first place.

You don't hide the Necronomicon by constantly telling everyone to stay away from it, and assuring them you can't explain why for their own good. You hide it by never mentioning it at all.

comment by Perplexed · 2010-11-01T23:28:58.528Z · LW(p) · GW(p)

Good idea. Lots of luck enforcing that.

comment by Eneasz · 2010-11-02T03:02:33.132Z · LW(p) · GW(p)

Enforcing? 'Twas just a suggestion. But if you really think it's a good idea, please down-vote my comment so it'll fall below the cut-off and casual browsers won't see it. :) That doesn't give it the aura of censored Forbidden Fruit, but it will cause Trivial Inconvenience.

comment by Vladimir_Nesov · 2010-10-31T15:03:57.990Z · LW(p) · GW(p)

I obviously assume "not too tiny".

comment by XiXiDu · 2010-10-31T19:22:17.418Z · LW(p) · GW(p)

I just noticed that there is a post over at Overcoming Bias talking about what I had in mind:

How much should small groups of people be allowed to risk the future of humanity with low probability? Not everyone agrees that the risk from alien contact is negligible: even a very low probability times a great harm can be relevant. … Should we be equally concerned with occultists trying to summon world-changing supernatural powers? There are probably many more people today who believe in supernatural entities than mere aliens, and that some interactions with them could be harmful. Yet there are no attempts at formulating risk scales for ritual magic. … Even if we were to analyse them rationally, we need to have an ‘ultraviolet cut-off’ for the infinite number of possible-yet-exceedingly-unlikely possibilities we could worry about. How to rationally decide on this cut-off seems problematic.

comment by XiXiDu · 2010-10-31T14:32:58.334Z · LW(p) · GW(p)

This comment and your other comments that are being voted down should rather be turned into a top-level post. Some people here seem to be horribly confused about this.

I couldn't agree more, upvoted.

comment by Vladimir_Nesov · 2010-10-31T13:53:35.044Z · LW(p) · GW(p)

As I said, explanations exist. Don't confuse them with actual good understanding, which as far as I know nobody has managed to attain yet.

comment by Vladimir_Nesov · 2010-10-31T18:25:07.194Z · LW(p) · GW(p)

Again, I don't think this terminology is adequate.

Let's not dwell on terminology, where the denoted concepts remain much more urgently unclear.

comment by Perplexed · 2010-10-31T18:50:24.646Z · LW(p) · GW(p)

Would you please consider offering an opinion as to whether Porter or Xixidu is anywhere close to describing the denoted concept?

comment by playtherapist · 2010-10-30T22:58:24.770Z · LW(p) · GW(p)

RE: Empathy and intelligence- It is possible to be brilliant in some respects without empathy, but it is definitely a handicap not to have it. There are many aspects of intelligence, only some of which are measured by IQ tests. Empathy is one, others are musical and artistic talents and social skills. I question whether it is possible to teach an AI any of these, especially empathy. The latest research I've read indicates that the ability to develop empathy is tied to what have been labeled "mirror neurons". They are missing in people with autism- and, quite possibly, in psychopaths too.

comment by Alicorn · 2010-10-30T23:12:25.526Z · LW(p) · GW(p)

the ability to develop empathy is tied to what have been labeled "mirror neurons". They are missing in people with autism

98% confidence that this is at least a massive oversimplification.

comment by AdeleneDawner · 2010-10-30T23:27:00.982Z · LW(p) · GW(p)

Agreed, and I'd put it at at least 50-50 that it's outright wrong.

(My understanding is that they've had lots of trouble doing studies on this, partly because it's hard to get lower-functioning autistics to focus on the things that the testers want them to pay attention to, so many of the older tests gave results that were, on further examination, incorrect. The most recent results I've heard about say that autistic people are at least as empathetic as neurotypicals, on average; the social problems have more to do with difficulty using information gained via empathy than with the empathy not being there at all. I actually wouldn't be at all surprised if it was an integration issue similar to the sensory integration issues that are so common.)

comment by wedrifid · 2010-10-30T23:55:48.591Z · LW(p) · GW(p)

98% seems conservative to me. ;)

comment by Alicorn · 2010-10-30T23:58:25.235Z · LW(p) · GW(p)

When I've done calibration checks I'm far more prone to overconfidence than underconfidence, so I nudged it down.

comment by wedrifid · 2010-10-31T00:00:20.083Z · LW(p) · GW(p)

There are many aspects of intelligence, only some of which are measured by IQ tests.

And most of which are not even possessed by humans.

comment by Perplexed · 2010-10-30T23:29:34.326Z · LW(p) · GW(p)

Hmmm. I am able to achieve a modicum of empathy with my pet dog. I doubt that this is due to my possession of a neuron that fires when I wag my own tail.

comment by NancyLebovitz · 2010-10-30T23:07:50.506Z · LW(p) · GW(p)

Supposing that you wanted to go the mirror neuron route for developing empathy in an AI, it would need a virtual body linked to its utility function.

Damned if I know whether this is a reasonable path for AI development, but it would be a very handy premise for science fiction.