My Elevator Pitch for FAI

post by magfrump · 2012-02-23T22:41:40.801Z · LW · GW · Legacy · 19 comments

 

This is a short introduction to the idea of FAI and existential risk from technology that I've used with decent success among my social circles, which consist mostly of mathematicians or at least people who have taken an introductory CS class.

I'll do my best to dissect what I think is effective about it, mostly as an exercise for myself.  I encourage people to adapt this to their own purposes.

So, technology is getting more powerful over time, right?  That is, as time goes on, it gets easier and easier to do more and more.  If we extrapolate that to its logical extreme, and obviously there are some issues there but let's just pretend, eventually we should be able to press a button and recreate the entire world however we want.

But computers don't do exactly what we want, they do exactly what we say.  So if we ever get to the point of having that button, it's very important that we know exactly what we want.  And not at the level of "I want a sandwich right now," at the level of actually programming it into a computer.

 

(This is 90% of the inferential gap; also usually the above fits into a literal elevator ride.)

 

Again, we probably won't have a literal button that remakes the entire universe.  But we will probably have smarter-than-human AI at some point.

Imagine putting humans into a world with only chimpanzees.  You don't have to imagine that hard; humans evolved into a world with only chimpanzees.  And now humans are everywhere, there are tons of us, and if we all decided that chimpanzees should die then all chimpanzees would die.  They wouldn't even know what was going on.  A few humans died at first, and we still die for dumb reasons, but humans overall have a lot of power over everything else.

Now imagine putting AI into a world of humans.  And if you don't want the world to be a Luddite dictatorship, you have to imagine that people will keep creating AIs, even if the first few don't take off.  The same thing is likely to happen.  AIs will take over, and we'll live or die at their whim.

Fortunately for chimps, humans feel pretty friendly toward chimps most of the time.  So we really want AIs to be friendly toward us.  Which means we need to figure out what it actually means to be friendly, at a level that we can program into a computer.

Smarter-than-human AI is probably a fair distance away.  But if you look at how fast AI research progresses and compare it to how fast philosophy research progresses, I don't think AI is further away than philosophers actually agreeing on what people want out of life.  They can't even agree on whether God exists.

 

Guesses as to why this is effective (i.e. applications of Dark Arts 101):

Open with a rhetorical question that your audience is likely to agree with.  If necessary, talk about some examples like computers.  Reduce it to a nice soundbite.  Also: stay really informal.

Ask them to extrapolate, and guide them to something to extrapolate.  Oversimplify a lot so that you can talk about something simple, but acknowledge that you're oversimplifying. Especially among mathematicians, the audience should fill in some gaps on their own; it makes them feel a bit more ownership over the idea, and allows them to start agreeing with you before things get too intense.

Recall a fact that they agree with and can sympathize with: computers doing what you say not what you want.  If they don't have this background it will be much harder to bridge the gap.
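
For an audience that has taken an intro CS class, the "what you say, not what you want" point can be made concrete with a toy like the one below.  This is only a sketch under made-up assumptions: the plans, the numbers, and the function names are all hypothetical and not part of the pitch itself.  It shows a literal instruction ("minimize complaints") being optimized perfectly while missing the intent behind it.

```python
# Hypothetical toy example: every name and number here is invented
# purely to illustrate "does what we say, not what we want".
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    complaints: int        # the quantity we *said* we care about
    customers_helped: int  # the quantity we actually *wanted* to improve


PLANS = [
    Plan("fix the most common bugs", complaints=12, customers_helped=900),
    Plan("hire more support staff", complaints=30, customers_helped=700),
    Plan("delete the complaint form", complaints=0, customers_helped=0),
]


def do_what_we_said(plans):
    """Follow the literal instruction: pick the plan with the fewest complaints."""
    return min(plans, key=lambda p: p.complaints)


def do_what_we_wanted(plans):
    """Follow the unstated intent: pick the plan that helps the most customers."""
    return max(plans, key=lambda p: p.customers_helped)


said = do_what_we_said(PLANS)
wanted = do_what_we_wanted(PLANS)
print("what we said:  ", said.name)
print("what we wanted:", wanted.name)
```

The literal optimizer is not broken; it does exactly what it was told.  The gap is entirely in the instruction, which is the part of the pitch the audience needs to feel.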

 

The next step is now just putting two and two together; hopefully they're doing this in their head and can almost complete the sentence:

we need to know what we want if we get that magic button.  The magic button is a good thing, too, so it's not scary to agree!

 

And codify it into something more precise: programming the answer to a philosophical question (vague, difficult to answer) into a computer (extremely precise and picky).  They should be able to register this as something very difficult, but possibly solvable.

After this, talking about the intelligence difference between chimps and humans and projecting it onto humans vs. AI usually works OK, but you could easily switch to talking about nanotech or whatnot.

For nanotech or biotech I recommend the line: "Any improvement is a change, but most changes just give you cancer and kill you."  This goes over well with biochemists.

 

19 comments

Comments sorted by top scores.

comment by Vladimir_Nesov · 2012-02-24T00:40:35.604Z · LW(p) · GW(p)

eventually we should be able to press a button and recreate the entire world however we want.

Confusing, unclear what that means. First idea is that you're talking about a virtual world, but that doesn't fit. When applied to the physical world, it doesn't seem to follow from your presentation.

So if we ever get to the point of having that button, it's very important that we know exactly what we want. And not at the level of "I want a sandwich right now," at the level of actually programming it into a computer.

So we have to understand the sandwich wish at the level of actually programming it into a computer, right? (Misleading.)

humans evolved into a world with only chimpanzees.

There were Neanderthals etc., but probably neither is particularly relevant for the "evolved" part, which itself doesn't seem to relate to the point.

And if you don't want the world to be a Luddite dictatorship, you have to imagine that people will keep creating AIs

But I do want the world to be a Luddite dictatorship. Where do I sign? Do I now have to imagine that people won't keep creating AIs, to complete the ritual? (Not wanting this is irrelevant, it just won't happen.)

(Stopped reading at this point.)

Replies from: Rain
comment by Rain · 2012-02-27T01:33:25.910Z · LW(p) · GW(p)

It's almost as if you aren't the target audience. Or that shifting a back-and-forth, feel-it-out, one-on-one conversation to broad, vague text is a difficult translation procedure.

comment by JoshuaFox · 2012-02-26T15:51:28.504Z · LW(p) · GW(p)

Here's my super-short pitch. I think it could be delivered between the first and fourth floors on the elevator :-)

"Within a few decades, engineers will probably create an Artificial Intelligence at a roughly human level of ability. When that happens, it will want to improve itself as much as it can, since that will help it achieve its goals. It will self-improve to far-above-human levels. When it is that smart, it will almost certainly achieve its goals, and so we had better make sure, before we build it, that it has goals that are good for humans."

Of course, there are some big inferential gaps, but usually you don't have much time for your initial pitch. I think that that really does summarize the point, and a few people have "got it" after hearing a short pitch, at least to the point where details can be further explained.

Replies from: magfrump, TheOtherDave
comment by magfrump · 2012-02-26T21:28:13.652Z · LW(p) · GW(p)

I think this is way too technical-argument sounding; I expect people would either challenge you if they feel like they know enough or tune you out if they feel they don't.

My initial impression is that starting with something a little less formal-sounding would be better, but actually I'd love to see a half dozen pitches and have people collect at least anecdotal evidence about which are effective.

Replies from: JoshuaFox
comment by JoshuaFox · 2012-02-28T09:00:36.250Z · LW(p) · GW(p)

Yes, it is too technical-sounding for most people.

But it is intended for very smart and technical/scientific people. That's the only audience that has a chance of getting it, anyway--unless rationality training does the trick, that is :-)

It is meant as an intro, perhaps after the person has heard a bit about the topic, but you want to give them a clear summary to grasp the concept. Of course, it is not enough and is usually followed by more detail.

comment by TheOtherDave · 2012-02-26T20:04:59.791Z · LW(p) · GW(p)

If I'm going to reduce that far, I'd probably go one level further and drop the reference to human/superhuman level AI altogether... for example: "We're building systems today that automatically implement their own goals. Often they are so complex, or operate so quickly, that no human can monitor them effectively. Over time those systems will get more complex and faster and even harder for humans to monitor. Therefore, if we want to ensure that their output is good for us, we need to ensure that their goals are good for us once implemented."

Of course, this completely loses the upside half of SI's argument, where superhuman FAIs create a utopian post-scarcity death-free ultra-awesome environment. This might be an advantage for an elevator pitch.

comment by thomblake · 2012-02-24T18:35:44.143Z · LW(p) · GW(p)

I feel like the parenthetical note regarding "catchy advice" was supposed to be a hyperlink in the final draft.

Replies from: magfrump
comment by magfrump · 2012-02-24T22:26:52.096Z · LW(p) · GW(p)

Yes quite! Thanks for reminding me.

comment by shminux · 2012-02-24T03:14:36.046Z · LW(p) · GW(p)

I suppose there are a couple of main points to get across. One (assuming you subscribe to the whole FOOM thing) is that once an AGI is nearly as smart as a human, it will take no time flat for it to become infinitely smarter than humans. The other is that this infinitely smart AGI is likely to treat humanity as animals in a zoo at best, and as a nuisance like a mosquito to get rid of at worst. I am not sure what metaphor or comparison would work well for the FOOM idea to bridge the inferential distance, but one is sorely needed for any sort of elevator pitch.

comment by thomblake · 2012-02-24T18:38:08.442Z · LW(p) · GW(p)

Very nice. I think the opening bit actually would make a pretty good 'elevator pitch' by itself (aside from the missing bill of sale). If it's actually working for you to close inferential distances, I'll have to try it out.

comment by praxis · 2012-02-24T05:59:36.440Z · LW(p) · GW(p)

So, technology is getting more powerful over time, right? That is, as time goes on, it gets easier and easier to do more and more. If we extrapolate that to its logical extreme, and obviously there are some issues there but let's just pretend, eventually we should be able to press a button and recreate the entire world however we want.

This is a little too utopian-sounding, and would probably provoke automatic reactions along the lines of Malthusianism and environmentalism and such. Perhaps if it's made a little more vague, it could get past any filters along the lines of "Uncontrolled progress will cause a disaster!" that your audience might have.

comment by timtyler · 2012-02-23T23:08:43.412Z · LW(p) · GW(p)

But computers don't do exactly what we want, they do exactly what we say. So if we ever get to the point of having that button, it's very important that we know exactly what we want. And not at the level of "I want a sandwich right now," at the level of actually programming it into a computer.

Users probably won't ever face that problem. They'll just be able to ask for a sandwich - much as they do today. English will be one of the recognised programming languages.

Replies from: magfrump
comment by magfrump · 2012-02-24T00:03:43.926Z · LW(p) · GW(p)

This is meant more as a "why FAI is a serious problem" introduction for academics than an introduction to futurism; I agree with you but I'm not sure how it's relevant.

Replies from: timtyler
comment by timtyler · 2012-02-24T12:21:35.645Z · LW(p) · GW(p)

The whole idea that computers "do exactly what we say" seems highly dubious. Look at the iPad and Kindle, for example. You have to be careful if building on top of such a premise, since intelligent machines will probably not be so literal-minded.

Replies from: Will_Newsome, DSimon
comment by Will_Newsome · 2012-02-25T03:17:30.181Z · LW(p) · GW(p)

I'm really annoyed that the above comment is heavily downvoted; yes, it goes against local folk beliefs, but it's not straightforwardly wrong and I can see many arguments that suggest it's the sensible default belief.

Replies from: timtyler
comment by timtyler · 2012-02-26T22:55:23.949Z · LW(p) · GW(p)

I don't really see why it would go against "local folk beliefs".

It's surely widely recognised that many computers don't literally do what their users tell them to - but instead obey a bunch of layers of system software - which may or may not have the user's interests in mind.

As for super-intelligent machines being "literal-minded", that would go against a long trend towards the use of higher level languages, and computers adapting to humans (rather than the other way around). Nobody is going to be aiming at a superintelligence which is autistic in this department.

comment by DSimon · 2012-02-26T19:26:20.888Z · LW(p) · GW(p)

[I]ntelligent machines will probably not be so literal-minded.

This is a variation of the "Superintelligent AI will do what you mean, not what you literally say; it would have to be pretty non-superintelligent to screw that up."

The counter-argument is: The person making the request may not understand the full implications of "what they really mean". The AI needs to be able to protect against bad unintended outcomes even of correctly interpreted requests. Because a superintelligent AI is very powerful the bad outcomes could be very bad indeed. To deal with this, the AI has to understand "what we really want", which is tricky since most of the time we don't even know what that is in any great detail.

Replies from: timtyler
comment by timtyler · 2012-02-26T22:42:30.073Z · LW(p) · GW(p)

This is a variation of the "Superintelligent AI will do what you mean, not what you literally say; it would have to be pretty non-superintelligent to screw that up."

...except that my comments were fine, while the position that you are likening them to is completely daft. That doesn't seem to be entirely fair. Maybe you thought I was making that daft argument - in which case, perhaps revisit the situation now that you have heard me state that I wasn't.

Replies from: DSimon
comment by DSimon · 2012-02-27T01:47:57.133Z · LW(p) · GW(p)

I re-read your comment, but I'm still not sure what you're driving at. Can you elaborate a little further?