Sarah Connor and Existential Risk

post by wiresnips · 2011-05-01T18:28:55.986Z · LW · GW · Legacy · 78 comments

It's probably easier to build an uncaring AI than a friendly one. So, if we assume that someone, somewhere is trying to build an AI without solving friendliness, that person will probably finish before someone who's trying to build a friendly AI.

[redacted]

[redacted]

further edit:

Wow, this is getting a rather stronger reaction than I'd anticipated. Clarification: I'm not suggesting practical measures that should be implemented. Jeez. I'm deep in an armchair, thinking about a problem that (for the moment) looks very hypothetical.

For future reference, how should I have gone about asking this question without seeming like I want to mobilize the Turing Police?

78 comments

Comments sorted by top scores.

comment by CarlShulman · 2011-05-01T22:41:38.533Z · LW(p) · GW(p)

For discussion of the general response to hypothetical ticking time-bomb cases in which one knows with unrealistic certainty that a violation of an ethical injunction will pay off, when in reality such an apparent assessment is more likely to be a result of bias and a shortsighted, incomplete picture of the situation (e.g. the impact of being the kind of person who would do such a thing), see the linked post.

With respect to the idea of neo-Luddite wrongdoing, I'll quote a previous comment:

The Unabomber attacked innocent people in a way that did not slow down technology advancement and brought ill repute to his cause. The Luddites accomplished nothing. Some criminal nutcase hurting people in the name of preventing AI risks would just stigmatize his ideas, and bring about impenetrable security for AI development in the future without actually improving the odds of a good outcome (when X can make AGI, others will be able to do so then, or soon after).

"Ticking time bomb cases" are offered to justify legalizing torture, but they essentially never happen: there is always vastly more uncertainty and lower expected benefits. It's dangerous to use such hypotheticals as a way to justify legalization of abuse in realistic cases. No one is going to wind up in a state of justified confidence that wrongdoing to "disable Skynet" is an available option (if such a thing was known to exist, it would be too late anyway, so the idea could only apply in much more uncertain conditions), and if a system could be shown to be quite likely dangerous, one would call the police, regulators, and politicians.

In any plausible epistemic situation, the criminal in question would be undertaking actions with an almost certain effect of worsening the prospects for humanity, in the name of an unlikely and limited gain. I.e., the act would have terrible expected consequences. The danger is not that rational consequentialists are going to go around bringing about terrible consequences (in between stealing kidneys from out-of-town patients, torturing accused criminals, and other misleading hypotheticals in which we are asked to consider an act with bad consequences under the implausible supposition that it has good consequences); it's providing encouragement and direction to mentally unstable people who don't think things through.

Replies from: None
comment by [deleted] · 2011-05-01T22:47:11.070Z · LW(p) · GW(p)

Absolutely. This is by far the most actually rational comment in this whole benighted thread (including mine), and I regret that I can only upvote it once.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2011-05-01T23:02:44.960Z · LW(p) · GW(p)

As direct moderator censorship seems to provoke a lot of bad feeling, I would encourage everyone to downvote this to oblivion, or for the original poster to voluntarily delete it, for reasons given in highly upvoted comments below. Or search on "UTTERLY FUCKING STUPID", without quotes.

Replies from: None, wedrifid
comment by [deleted] · 2011-05-01T23:13:40.851Z · LW(p) · GW(p)

As someone who was not at all happy about the one previous act of censorship I know about, I would consider this to be a very different case, in that I can imagine legal liability for the comments here.

comment by wedrifid · 2011-05-01T23:17:12.976Z · LW(p) · GW(p)

As moderator-directed censorship seems to provoke a lot of bad feeling, I would encourage everyone to downvote this to oblivion, or for the original poster to voluntarily delete it, for reasons given in highly upvoted comments below. Or search on "UTTERLY FUCKING STUPID", without quotes.

Wise move. At least with respect to restraint combined with clear assertion. The reference to uncalled-for profanity was a little silly - a simple link or verbal reference to CarlShulman's rather brilliant explanation of how to actually think about these issues responsibly would have sent a far better message.

comment by [deleted] · 2011-05-01T19:31:47.478Z · LW(p) · GW(p)

Given that (redacted), it is a very, very, VERY bad idea to start talking about (redacted), and I would suggest you should probably delete this post to avoid encouraging such behaviour.

EDIT: Original post has now been edited, and so I've done likewise here. I ask anyone coming along now to accept that neither the original post nor the original version of this comment contained anything helpful to anyone, and that I was not suggesting censorship of ideas, but caution about talking about hypotheticals that others might not see as such.

Replies from: wiresnips
comment by wiresnips · 2011-05-01T20:04:37.579Z · LW(p) · GW(p)

Edited, in the interest of caution.

However, this is exactly the issue I'm trying to discuss. It looks as though, if we take the threat of uncaring AI seriously, this is a real problem and it demands a real solution. The only solution that I can see is morally abhorrent, and I'm trying to open a discussion looking for a better one. Any suggestions on how to do this would be appreciated.

Replies from: None, Nick_Tarleton
comment by [deleted] · 2011-05-01T20:13:41.239Z · LW(p) · GW(p)

As I understand it from reading the sequences, Eliezer's position roughly boils down to "most AI researchers are dilettantes and no danger to anyone at the moment. Anyone capable of solving the problems in AI at the moment will have to be bright enough, and gain enough insights from their work, that they'll probably have to solve Friendliness as part of it - or at least be competent enough that if SIAI shout loud enough about Friendliness they'll listen. The problem comes if Friendliness isn't solved before the point where it becomes possible to build an AI without any special insight, just by throwing computing power at it along with a load of out-of-the-box software and getting 'lucky'."

In other words, if you're convinced by the argument that Friendly AI is the most important problem facing us, the thing to do is work on Friendly AI rather than prevent other people working on unFriendly AI. Find an area of the problem no-one else is working on, and do that. That might sound hard, but it's infinitely more productive than finding the baddies and shooting at them.

Replies from: wiresnips
comment by wiresnips · 2011-05-01T20:25:46.197Z · LW(p) · GW(p)

Anyone smart enough to be dangerous is smart enough to be safe? I'm skeptical - folksy wisdom tells me that being smart doesn't protect you from being stupid.

But in general, yes - the threat becomes more and more tangible as the barrier to AI gets lower and the number of players increases. At the moment, it seems pretty intangible, but I haven't actually gone out and counted dangerously smart AI researchers - I might be surprised by how many there are.

To be clear, I was NOT trying to imply that we should actually right now form the Turing Police.

Replies from: None
comment by [deleted] · 2011-05-01T20:37:51.980Z · LW(p) · GW(p)

As I understand it, the argument (roughly) is that if you build an AI from scratch, using just tools available now, you will have to specify its utility function, in a way that the program can understand, as part of that process. Anyone actually trying to work out a utility function that can be programmed would have to have a fairly deep understanding - you can't just type "make nice things happen and no bad things", but have to think in terms that can be converted into C or Perl or whatever. In doing so, you would have to have some kind of understanding in your own head of what you're telling the computer to do, and would be likely to avoid at least the most obvious failure modes.

However, in (say) twenty years that might not be the case - it might be (as an example) that we have natural language processing programs that can take a sentence like 'make people happy' and have some form of 'understanding' of it, while still not being Turing-test-passing, self-modification-capable fully general AIs. It could then get to the stage that some half-clever person could think "Hmm... If I put this and this and this together, I'll have a self-modifying AI. And then I'll just tell it to make everyone smile. What could go wrong?"
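
[Editorial illustration, not part of the original comment: a minimal sketch of the point about having to spell a utility function out in code. Everything here - the World type, the naive_utility function, the numbers - is hypothetical, and exists only to show how a literal specification can diverge from the intent behind "make people happy".]

```python
# Hypothetical toy example: the World type, naive_utility and the numbers are
# all invented here for illustration; nothing in this sketch comes from the thread.
from dataclasses import dataclass

@dataclass
class World:
    smiling_faces: int           # e.g. smiles detected on a camera feed
    people_actually_happy: int

def naive_utility(world: World) -> float:
    # What the programmer meant: "make people happy".
    # What the code actually says: "maximise the number of detected smiles".
    return float(world.smiling_faces)

# An optimizer handed naive_utility prefers a world of a billion molded,
# frozen smiles over one with fewer smiles but genuinely happy people:
molded = World(smiling_faces=10**9, people_actually_happy=0)
modest = World(smiling_faces=10**6, people_actually_happy=10**6)
print(naive_utility(molded) > naive_utility(modest))  # True
```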

comment by Nick_Tarleton · 2011-05-02T03:00:35.113Z · LW(p) · GW(p)

The only solution that I can see is morally abhorrent, and I'm trying to open a discussion looking for a better one.

It's already been linked to a couple times under this post, but: have you read http://lesswrong.com/lw/v1/ethical_injunctions/ and the posts it links to?

In any case, non-abhorrent solutions include "work on FAI" and "talk to AGI researchers, some of whom will listen (especially if you don't start off with how we're all going to die unless they repent, even though that's the natural first thought)".

comment by fubarobfusco · 2011-05-01T20:22:55.412Z · LW(p) · GW(p)

"Bad argument gets counterargument. Does not get bullet. Never. Never ever never for ever."

But I'll propose a possibly even more scarily cultish idea:

Why attempt to perfect human rationality? Because someone's going to invent uploading sometime. And if the first uploaded person is not sufficiently rational, they will rapidly become Unfriendly AI; but if they are sufficiently rational, then there's a chance they will become Friendly AI.

(The same argument can be used for increasing human compassion, of course. Sufficiently advanced compassion requires rationality, though.)

Replies from: Nick_Tarleton, wedrifid, Vladimir_Nesov
comment by Nick_Tarleton · 2011-05-02T02:44:32.964Z · LW(p) · GW(p)

(Tangentially:)

And if the first uploaded person is not sufficiently rational, they will rapidly become Unfriendly AI

"Will" is far too strong. Becoming UFAI at least requires that an upload be given sufficient ability to self-modify (or sufficiently modified from outside), and that IA up to superintelligence on uploads be not only tractable (likely but not guaranteed) but, if it's going to be the first upload, easy enough that lots more uploads don't get made first. Digital intelligences are not intrinsically, automatically hard takeoff risks, which it sounds like you're modeling them as. (Not to mention, up to a point insufficient rationality would make an upload less likely to ever successfully increase its intelligence.)

(That said, there are lots of risks and horrible scenarios involving uploads that don't require strong superintelligence, just subjective speedup or copiability.)

comment by wedrifid · 2011-05-01T21:22:28.472Z · LW(p) · GW(p)

"Bad argument gets counterargument. Does not get bullet. Never. Never ever never for ever."

I approve of that sentiment so long as people don't actually take it literally when the world is at stake. Because that could get everybody killed.

Replies from: Nick_Tarleton, None
comment by Nick_Tarleton · 2011-05-02T02:36:03.538Z · LW(p) · GW(p)

If you predictably have no ethics when the world is at stake, people (including your allies!) who know this won't trust you when you think the world is at stake. That could also get everybody killed.

(Yes, this isn't going to make the comfortably ethical option always correct, but it's a really important consideration.)

Replies from: wedrifid
comment by wedrifid · 2011-05-02T03:29:55.266Z · LW(p) · GW(p)

Note to any readers: This subthread is discussing the general and unambiguously universal claim conveyed by a particular Eliezer quote. There are no connotations for the AGI prevention fiasco beyond the rejection of that particular soldier as it is used here or anywhere else.

If you predictably have no ethics when the world is at stake, people who know this won't trust you when you think the world is at stake. That could also get everybody killed.

I appreciate ethics. I've made multiple references to the 'ethical injunctions' post in this thread and tend to do so often elsewhere - I rate it as the second most valuable post on the site, after 'subjectively objective'.

Where people often seem to get confused is in conflating 'having ethics' with being nice. There are situations where not shooting at people is an ethical violation. (Think neglecting duties when there is risk involved.) Pacifism is not intrinsically ethically privileged.

The problem with the rule:

"Bad argument gets counterargument. Does not get bullet. Never. Never ever never for ever."

... is not that it is advocating doing the Right Thing even in extreme scenarios. The problem is that it is advocating doing the Wrong Thing. It is unethical and people knowing that you will follow this particular rule is dangerous and generally undesirable.

Bullets are an appropriate response in all sorts of situations where power is involved. And arguments are power. They don't say "the pen is mightier than the sword" for nothing.

Let's see... five seconds thought... consider a country in which one ethnicity has enslaved another. Among the dominant race there is a conservative public figure who is a powerful orator with a despicable agenda. Say... he advocates the killing of slaves who are unable to work, the castration of all the males and the use of the females as sex slaves. Not entirely implausible as far as atrocities go. The arguments he uses are either bad or Bad yet he is rapidly gaining support.

What is the Right Thing To Do? It certainly isn't arguing with him - that'll just end with you being 'made an example'. The bad arguments are an application of power and must be treated as such. The ethical action to take is to assassinate him if at all possible.

"Never. Never ever never for ever." is just blatantly and obviously wrong. There is no excuse for Eliezer to make that kind of irresponsible claim - he knows people are going to get confused by it and quote it to proliferate the error.

Replies from: Nick_Tarleton
comment by Nick_Tarleton · 2011-05-02T03:41:54.186Z · LW(p) · GW(p)

I agree with everything in this comment (subject to the disclaimer in the first paragraph, and possibly excepting the strength of the claim in the very last sentence), and appreciate the clarification.

(I suspect we still disagree about how to apply ethics to AI risks, but I don't feel like having that argument right now.)

Replies from: wedrifid
comment by wedrifid · 2011-05-02T03:47:38.506Z · LW(p) · GW(p)

I agree with everything in this comment (subject to the disclaimer in the first paragraph, and possibly excepting the strength of the claim in the very last sentence), and appreciate the clarification.

I'm not entirely sure I agree with the strength of the claim in my last sentence either. It does seem rather exaggerated. :)

comment by [deleted] · 2011-05-01T21:26:46.780Z · LW(p) · GW(p)

It says "bad *argument*", not "Bad person shooting at you". Self-defence (or defence of one's family, country, world, whatever) is perfectly acceptable - initiation of violence never is. It's never right to throw the first punch, but can be right to throw the last.

Replies from: wedrifid, nhamann, Barry_Cotter
comment by wedrifid · 2011-05-01T21:28:51.550Z · LW(p) · GW(p)

It says "bad *argument*", not "Bad person shooting at you". Self-defence (or defence of one's family, country, world, whatever) is perfectly acceptable - initiation of violence never is. It's never right to throw the first punch, but can be right to throw the last.

I approve of that sentiment so long as people don't actually take it literally when the world is at stake. Because that could get everybody killed.

Mind you, in this case there are even more exceptions. Initiation of violence, throwing the first punch, is appropriate in all sorts of situations. In fact, in the majority of cases where it is appropriate to throw the second punch, throwing the first punch is better. Because the first punch could kill or injure you. The only reason not to preempt the punch (given that you will need to respond with a punch anyway) is for the purpose of signalling to people like yourself.

In these kinds of cases it can be wise to pay lip service to a 'never throw the first punch' moral but actually follow a rational approach when a near-mode situation arises.

Let me remind you: the world is at stake. You, everybody you care about and your entire species will die, and the future light cone will be left barren or tiled with dystopic junk. That is not a time to be worrying about upholding your culture's moral ideals. Save the @#%! world!

Replies from: None
comment by [deleted] · 2011-05-01T21:41:48.313Z · LW(p) · GW(p)

No, that's not the only reason. Generally speaking, one either has no warning that violence is coming (in which case one can't throw the first punch) or one does have warning (in which case it's possible to, e.g., walk away, negotiate, duck). On the other hand, none of us are perfect predictors of the future. There will be times when we believe the first punch is about to be thrown when it isn't. If we avoid aggression until attacked, it may be that nobody gets punched (or shot) at all. There's a reason that tit-for-tat is such a successful strategy in an iterated Prisoner's Dilemma - and that the only more successful strategies have been ones that punish defection even less than tit-for-tat does - and it's nothing to do with signalling.
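
[Editorial sketch, not from the original comment: the tit-for-tat strategy the parent refers to - cooperate on the first move, then copy the opponent's previous move - in a small iterated Prisoner's Dilemma simulation. The payoff values are the standard textbook numbers and are an assumption, not something stated in the thread.]

```python
# Minimal iterated Prisoner's Dilemma with tit-for-tat (illustrative only).
COOPERATE, DEFECT = "C", "D"
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    # Cooperate first; afterwards, mirror the opponent's last move.
    return COOPERATE if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return DEFECT

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
        score_a += pay_a
        score_b += pay_b
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30): stable mutual cooperation
print(play(tit_for_tat, always_defect))  # (9, 14): retaliates after, never before, the first defection
```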

Replies from: wedrifid
comment by wedrifid · 2011-05-01T21:51:54.827Z · LW(p) · GW(p)

I rejected a fully general moral prescription, not advice about what is often an optimal decision-making strategy:

Self-defence (or defence of one's family, country, world, whatever) is perfectly acceptable - initiation of violence never is. It's never right to throw the first punch, but can be right to throw the last.

comment by nhamann · 2011-05-01T21:46:20.557Z · LW(p) · GW(p)

What about in the case where the first punch constitutes total devastation, and there is no last punch? I.e. the creation of unfriendly AI. It would seem preferable to initiate aggression instead of adhering to "you should never throw the first punch" and subsequently dying/losing the future.

Edit: In concert with this comment here, I should make it clear that this comment is purely concerned with a hypothetical situation, and that I definitely do not advocate killing any AGI researchers.

Replies from: fubarobfusco
comment by fubarobfusco · 2011-05-01T22:27:18.050Z · LW(p) · GW(p)

Sure, but under what conditions can a human being reliably know that? You're running on corrupted hardware, just as I am.

Into the lives of countless humans before you has come the thought, "I must kill this nonviolent person in order to save the world." We have no evidence that those thoughts have ever been correct; and plenty of evidence that they have been incorrect.

Replies from: wedrifid
comment by wedrifid · 2011-05-01T22:32:15.044Z · LW(p) · GW(p)

I must kill this nonviolent person in order to save the world.

You may wish to strengthen that claim somewhat. I doubt the CIA would classify 'about to press the on switch of an unfriendly AGI' as 'nonviolent'.

You do make a good point about (actually rational constructions of) ethics.

Replies from: fubarobfusco
comment by fubarobfusco · 2011-05-01T22:50:56.589Z · LW(p) · GW(p)

Sure; but the CIA also classifies "leading a peaceful, democratic political uprising" as worthy of violence; so they're not a very good guide.

More seriously: Today there are probably dozens or hundreds of processes going on that, if left unchecked, could lead to the destruction of the world and all that you and I value. Some of these are entirely mindless. I'm rather confident that somewhere in the solar system is an orbiting asteroid that will, if not deflected, eventually crash into the Earth and destroy all life as we know it. Everyone who is proceeding with their lives in ignorance of that fact is thereby participating in a process which, if unchecked, leads to the destruction of the world and all that is good. I hope that we agree that this belief does not justify killing people who oppose the funding of anti-asteroid defense.

But if you are seriously ready to kill someone who has her finger poised above the "on" switch of an unfriendly AGI (which is to say, an AGI that you believe is not sufficiently proven to be Friendly), then you are very likely susceptible to a rather trivial dead man's switch. The uFAI creator merely needs to be sufficiently confident in their AI's positive utility that they are willing to set it up to activate if they (the creator) are killed. Then, your readiness to kill is subverted. And ultimately, a person who is clever enough to create uFAI is clever enough to rig any number of nth-order dead man's switches if they really think they are justified in doing so.

Which means, in the limit case, that you're reduced to either (1) going on a massacre of everyone involved in AI, machine learning, or related fields; or (2) resorting to convincing people of your views and concerns rather than threatening them.

Replies from: Vladimir_Nesov, wedrifid
comment by Vladimir_Nesov · 2011-05-01T22:55:49.253Z · LW(p) · GW(p)

I'm rather confident that somewhere in the solar system is an orbiting asteroid that will, if not deflected, eventually crash into the Earth and destroy all life as we know it.

Huh? Downvoted for sloppy reasoning. This most likely won't happen on the timescale where "life as we know it" continues to exist.

Replies from: JoshuaZ, fubarobfusco
comment by JoshuaZ · 2011-05-01T23:55:52.826Z · LW(p) · GW(p)

This most likely won't happen on the timescale where "life as we know it" continues to exist.

The Chicxulub asteroid impact did wipe out almost all non-ocean life. That asteroid was 8-12 km across. It is estimated that an impact of that size happens every few hundred million years. So this claim seems inaccurate. On the other hand, the WISE survey results strongly suggest that no severe asteroid impacts are likely in the next few hundred years.

Replies from: wedrifid
comment by wedrifid · 2011-05-02T00:18:18.226Z · LW(p) · GW(p)

It is estimated that an impact of that size happens every few hundred million years. So this claim seems inaccurate.

Only if you expect life as we know it to last on the order of a few hundred million years. The probability of that happening is too low for me to even put a number on it.

comment by fubarobfusco · 2011-05-01T23:11:15.650Z · LW(p) · GW(p)

Would you mind posting your reasoning, instead of just posting your conclusions and an insult?

I should clarify that I was intending to set some sort of boundary condition on the possible futures of life on earth, rather than predicting a specific end to it: If life comes to no other end, at the very least, eventually we'll get asteroided if we stay here. This by itself does not justify killing people in a fight for asteroid-prevention; so what would justify killing people?

Replies from: wedrifid
comment by wedrifid · 2011-05-02T00:24:52.944Z · LW(p) · GW(p)

Would you mind posting your reasoning

Timescale of life as we know it continuing to exist: Short
Timescale of killer asteroids hitting earth: Long

Replies from: JoshuaZ
comment by JoshuaZ · 2011-05-02T01:00:23.711Z · LW(p) · GW(p)

Are we running into definitional issues of what we mean by "life as we know it?" That term has some degree of ambiguity that may be creating the problem.

Replies from: wedrifid
comment by wedrifid · 2011-05-02T02:39:49.606Z · LW(p) · GW(p)

Are we running into definitional issues of what we mean by "life as we know it?" That term has some degree of ambiguity that may be creating the problem.

Quite possibly. Although one of the features of 'life as we know it' that will not survive for hundreds of millions of years is living exclusively on earth. So the disagreement would remain independently of definition.

comment by wedrifid · 2011-05-01T23:00:28.166Z · LW(p) · GW(p)

Sure; but the CIA also classifies "leading a peaceful, democratic political uprising" as worthy of violence; so they're not a very good guide.

They are not a guide so much as the very organisation for whom this sort of consideration is most relevant. They (or another organisation like them) are the groups most likely to carry out preventative measures. It is more or less part of their job description. (And puts a whole new twist on 'counter intelligence'!)

Which means, in the limit case, that you're reduced to either (1) going on a massacre of everyone involved in AI, machine learning, or related fields; or (2) resorting to convincing people of your views and concerns rather than threatening them.

Those extremes do not strike me as a particularly natural place to set up a dichotomy. In the space between them are all sorts of proactive options.

Replies from: fubarobfusco
comment by fubarobfusco · 2011-05-01T23:09:15.428Z · LW(p) · GW(p)

I'd be more interested in a response to the substance of my comment: If you think that a person is about to turn on a (to your way of thinking) insufficiently Friendly AI, such that killing them might stop the inevitable paperclipping of all you hold dear, how do you take into account the fact that they might have outwitted you by setting up a dead man's switch?

In other words, how do you take into account the fact that killing them might bring about exactly the fate that you intend to prevent; whereas one more exchange of rational argument might convince them not to do it?

Replies from: XiXiDu, JoshuaZ, wedrifid
comment by XiXiDu · 2011-05-02T09:20:16.295Z · LW(p) · GW(p)

If you think that a person is about to turn on a (to your way of thinking) insufficiently Friendly AI, such that killing them might stop the inevitable paperclipping of all you hold dear, how do you take into account the fact that they might have outwitted you by setting up a dead man's switch?

If someone with a facemask is pointing a gun at you he might just want to present it and ask you if you want to buy it, the facemask being the newest fashion hit that you are simply unaware of.

comment by JoshuaZ · 2011-05-01T23:47:33.391Z · LW(p) · GW(p)

Edit: Disregard what I've written below. It isn't relevant, since it assumes that the individual hasn't tried to make a Friendly AI, which seems to go against the assumption in the hypothetical.

I'd be more interested in a response to the substance of my comment: If you think that a person is about to turn on a (to your way of thinking) insufficiently Friendly AI, such that killing them might stop the inevitable paperclipping of all you hold dear, how do you take into account the fact that they might have outwitted you by setting up a dead man's switch?

There seems to be a heavy overlap between people who think AGI will foom and people who are concerned about Friendliness (for somewhat obvious reasons: Friendliness matters a lot more if fooming is plausible). It seems extremely unlikely that someone would set up a dead man's switch unless they thought that a lot would actually get accomplished by the AI, i.e. that it would likely foom in a Friendly fashion. The actual chance that any such switches have been put into place seems low.

Replies from: fubarobfusco
comment by fubarobfusco · 2011-05-01T23:59:04.689Z · LW(p) · GW(p)

Oh, sure, I agree.

But what if Eliezer thinks he's got an FAI he can turn on, and Joe isn't convinced that it's actually as Friendly as Eliezer thinks it is? I'd rather Joe argue with Eliezer than shoot him.

comment by wedrifid · 2011-05-02T00:06:05.402Z · LW(p) · GW(p)

I am somewhat reluctant to engage deeply on the specific counterfactual here. Disagreeing with some of the more absurd statements by AndrewHickey has already placed me in the position of defending enemy soldiers. That is an undesirable position to be in when the subject is one that encourages people to turn off their brains and start thinking with their emotional reflexes. Disagreeing with terrible arguments is not the same as supporting the opposition - but you can still expect the same treatment!

I would have to engage rather a lot of creative thinking to construct a scenario where I would personally take any drastic measures. Apart from the ethical injunctions I've previously mentioned, I don't consider myself qualified to make the decision. The most I would do is make sure the situation has been brought to the attention of the relevant spooks and make sure competent AI researchers are informed so that they can give any necessary advice to the spook-analysts. Even then the spook agency would probably not need to resort to violence. If they do, in fact, have to resort to violence because the AGI creators force the issue, then the creators in question definitely cannot be trusted!

If you think that a person is about to turn on a (to your way of thinking) insufficiently Friendly AI, such that killing them might stop the inevitable paperclipping of all you hold dear, how do you take into account the fact that they might have outwitted you by setting up a dead man's switch?

Now, with the aforementioned caveats, let us begin. I shall first note, then assume away, all the options that are available for circumventing dead man's switches. I refer here to resources the CIA could get their hands on. That means bunker-buster bombs and teams of top-of-the-line hackers to track down online instances. But those measures are not completely reliable, so I'll take it for granted that the DMS works.

We now have a situation where terrorists are holding the world hostage. Ineffectively. Either they'll destroy the world or, if you kill them, they'll destroy the world. So it doesn't matter too much what you do - you're dead either way. It seems the appropriate response is to blow the terrorists up. I'm not sure if I always advocate "don't negotiate with terrorists" but I definitely advocate "don't negotiate with terrorists when they are going to do the worst-case thing anyway"!

But that is still too easy. Let's go to the next case. We'll say that the current design has a 99.9% chance of producing an uFAI, but if we give the AI creators another month to finish their work, their creation has a 1% chance of being an FAI[1]. Now the DMS threat actually matters. There is something to lose. The question becomes how you deal with terrorists in a one-off, all-in situation. What do you do when (a small percentage but all that is available of) everything is at stake and someone can present a credible threat?

I actually don't know the answer. I am not sure there is a well-established one. Being the kind of group that doesn't take the terrorists out with a missile barrage has all sorts of problems. But being the person who does blow them away has a rather obvious problem too. I recall Vladimir making an interesting post regarding blackmail and terrorism; however, I don't think it gave us a how-to-guide kind of resolution.

[1] Also assume that you expect another source to create an FAI with 50% chance a few years later if the current creators are stopped.
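
[Editorial sketch of the hypothetical above, not part of the original comment: a back-of-the-envelope comparison using only the numbers stated in the scenario, plus one assumption - that "99.9% chance of producing an uFAI" leaves roughly a 0.1% chance of a good outcome if the current design is run as-is.]

```python
# All probabilities below are the scenario's stated numbers, except the 0.001
# figure, which is an assumed reading of "99.9% chance of producing an uFAI".
P_FAI_IF_CURRENT_DESIGN_RUNS = 0.001  # design as it stands (assumed residual chance)
P_FAI_IF_GIVEN_ANOTHER_MONTH = 0.01   # stated: 1% chance of an FAI after one more month
P_FAI_FROM_LATER_SOURCE      = 0.5    # footnote [1]: 50% from another source years later

options = {
    "strike while the DMS is armed (current design runs anyway)": P_FAI_IF_CURRENT_DESIGN_RUNS,
    "back off and give them the extra month":                     P_FAI_IF_GIVEN_ANOTHER_MONTH,
    "stop them cleanly (no DMS, or the DMS is defeated)":         P_FAI_FROM_LATER_SOURCE,
}

for name, p_fai in sorted(options.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{p_fai:6.3f}  {name}")
# The switch matters because it turns the otherwise-best option (a clean stop,
# 0.5) into the worst one (~0.001), leaving "wait a month" as the live choice.
```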

Replies from: fubarobfusco
comment by fubarobfusco · 2011-05-02T02:02:24.860Z · LW(p) · GW(p)

Yep. Now keep in mind that the CIA, or whatever other agency you care to bring to bear, is staffed with humans — fallible humans, the same sorts of agents who can be brought in remarkable numbers to defend a religion. The same sorts of agents who have at least once¹, and possibly twice², come within a single human decision of destroying the world for reasons that were later better classified as mistakes, or narrowly-averted disasters.

Given the fact that an agency full of humans is convinced that a given bunch of AGI-tators are within epsilon of dooming the world, what is the chance that they are right? And what is the chance that they have misconceived the situation such that by pulling the trigger, they will create an even worse situation?

My point isn't some sort of hippy-dippy pacifism. My point is: Humans — all of us; you, me, the CIA — are running on corrupted hardware. At some point when we make a severe decision, one that goes against some well-learned rules such as not-killing, we have to take into account that almost everyone who's ever been in that situation has been making a bad decision.

¹ Stanislav Petrov; 26 September 1983
² Jack Kennedy; Cuban Missile Crisis, October 1962

Replies from: wedrifid
comment by wedrifid · 2011-05-02T03:38:47.563Z · LW(p) · GW(p)

Given the fact that an agency full of humans is convinced that a given bunch of AGI-tators are within epsilon of dooming the world, what is the chance that they are right?

Fairly high. This is a far simpler situation than dealing with foreign powers. Raiding the research centre to investigate is a straightforward task. While they are in no place to evaluate friendliness themselves they are certainly capable of working out whether there is AI code that is about to be run - either by looking around or interrogating. Bear in mind that if it comes down to "do we need to shoot them?" the researchers must be resisting them and trying to run the doomsday code despite the intervention. That is a big deal.

And what is the chance that they have misconceived the situation such that by pulling the trigger, they will create an even worse situation?

Negligible.

The problem here is if other researchers or well meaning nutcases take it upon themselves to do some casual killing. An intelligence agency looking after the national interests - the same way it always does - is not a problem.

This is not some magical special case where there is some deep ethical reason that threat cannot be assessed. It is just another day at the office for the spooks and there is less cause for bias than usual - all the foreign politics gets out of the way.

comment by Barry_Cotter · 2011-05-01T21:43:52.365Z · LW(p) · GW(p)

Violence is the last resort of the incompetent. The competent resort to violence as soon as it beats the alternatives. In situations where violence is appropriate this is almost always before their opponent strikes.

comment by Vladimir_Nesov · 2011-05-01T21:20:40.416Z · LW(p) · GW(p)

"Bad argument gets counterargument. Does not get bullet. Never. Never ever never for ever."

AGI is not an argument.

Replies from: None
comment by [deleted] · 2011-05-01T21:33:49.903Z · LW(p) · GW(p)

This is a site devoted to rationality, supposedly. How rational is it to make public statements that can be interpreted as saying people one disagrees with deserve to be shot? It's hyperbole, and, worse, hyperbole that might be both incitement to violence and possibly self-incriminating if one of those people does get shot. If a world in which $randomAIresearcher, who wasn't anywhere near achieving hir goal anyway, gets shot, the SIAI is shut down as a terrorist organisation, and you get arrested for incitement to violence, seems optimal to you, then by all means keep making statements like the one above...

Replies from: wedrifid, Vladimir_Nesov
comment by wedrifid · 2011-05-01T21:47:27.632Z · LW(p) · GW(p)

This is a site devoted to rationality, supposedly. How rational is it to

Comments of this form are almost always objectionable.

It's hyperbole, and, worse, hyperbole that might be both incitement to violence and possibly self-incriminating if one of those people does get shot. If a world in which $randomAIresearcher, who wasn't anywhere near achieving hir goal anyway, gets shot, the SIAI is shut down as a terrorist organisation, and you get arrested for incitement to violence, seems optimal to you, then by all means keep making statements like the one above...

Are you trying to be ironic here? You criticize hyperbole while writing that?

Replies from: None, None
comment by [deleted] · 2011-05-01T21:55:38.765Z · LW(p) · GW(p)

No, I am being perfectly serious. There are several people in this thread, yourself included, who are coming very close to advocating - or have already advocated - the murder of scientific researchers. Should any of them get murdered (and as I pointed out in my original comment, which I later redacted in the hope that as the OP had redacted his post this would all blow over, Ben Goertzel has reported getting at least two separate death threats from people who have read the SIAI's arguments, so this is not as low a probability as we might hope) then the finger will point rather heavily at the people in this thread. Murdering people is wrong, but advocating murder on the public internet is not just wrong but UTTERLY FUCKING STUPID.

Replies from: Vladimir_Nesov, XiXiDu, wedrifid
comment by Vladimir_Nesov · 2011-05-01T22:09:47.377Z · LW(p) · GW(p)

advocating murder on the public internet is not just wrong but UTTERLY FUCKING STUPID.

I of course agree with this, but this consideration is unrelated to the question of what constitutes correct reasoning. For example, it shouldn't move you to actually take an opposite side in the argument and actively advocate it, and creating an appearance of that doesn't seem to promise comparable impact.

Replies from: None
comment by [deleted] · 2011-05-01T22:16:40.846Z · LW(p) · GW(p)

That is not my only motive. My main motive is that I happen to think that the course of action being advocated would be extremely unwise and not lead to anything like the desired results (and would lead to the undesirable result of more dead people). My secondary motive was, originally, to try to persuade the OP that bringing the subject up at all was an incredibly bad idea, given that people have already been influenced by discussions of this subject to make death threats against an actual person. Trying to stop people making incredibly stupid statements which would incriminate them in the (hopefully) unlikely event of someone actually attempting to kill AI researchers was quite far down the list of reasons.

comment by XiXiDu · 2011-05-02T09:32:22.276Z · LW(p) · GW(p)

No, I am being perfectly serious. There are several people in this thread, yourself included, who are coming very close to advocating - or have already advocated - the murder of scientific researchers.

Huh? People here often advocate killing a completely innocent fat guy to save a few more people. People even advocate torturing someone for 50 years so that others don't get dust specks in their eyes...

Replies from: None
comment by [deleted] · 2011-05-02T11:13:00.433Z · LW(p) · GW(p)

The difference is there are no hypothetical fat men who are near train lines. There are, however, really-existing AI researchers who have received death threats as a result of this kind of thinking.

Replies from: XiXiDu
comment by XiXiDu · 2011-05-02T11:23:28.608Z · LW(p) · GW(p)

The difference is there are no hypothetical fat men who are near train lines.

What are those thought experiments good for if there are no real-world approximations where they might be useful? What do you expect, absolute certainty? Sometimes consequentialist actions have to be taken under uncertainty, if the scale of the negative utility involved easily outweighs it... do you disagree with this?

Replies from: None
comment by [deleted] · 2011-05-02T11:38:06.541Z · LW(p) · GW(p)

The problem is, as has been pointed out many times in this thread already, threefold. Firstly, we do not have perfect information, and nor do our brains operate perfectly - the chances of us knowing for sure that there is no way to stop unfriendly AI other than killing someone are so small they can be discounted. The chances of someone believing that to be the case while it's not true are significantly higher.

Secondly, even if it's just being treated as a (thoroughly unpleasant) thought experiment here, there are people who have received death threats as a result of unstable people reading about uFAI. Were any more death threats to be made as a result of unstable people reading this thread, that would be a very bad thing indeed. Were anyone to actually get killed as a result of unstable people reading this thread, that would not only be a bad thing in itself, but it would likely have very bad consequences for the posters in this thread, for the SIAI, for the FHI and so on. This is my own primary reason for arguing so vehemently here - I do not want to see anyone get killed because I didn't bother to argue against it.

And thirdly, this is meant to be a site about becoming more rational. Whether or not it was ever the rational thing to do (and I cannot conceive of a real-world situation where it would be), it is never a rational thing to talk about killing members of a named, small group on the public internet because if/when anything bad happens to them, the finger will point at those doing the talking. In pointing this out I am trying to help people act more rationally.

Replies from: XiXiDu
comment by XiXiDu · 2011-05-02T13:47:43.320Z · LW(p) · GW(p)

I strongly agree that trying to stop uFAI by killing people is a really bad idea. The problem is that this is not the first time the idea has surfaced, and it won't be the last. All the rational arguments against it are now buried in a downvoted and deleted thread and under some amount of hypocritical outrage.

...it is never a rational thing to talk about killing members of a named, small group on the public internet because if/when anything bad happens to them, the finger will point at those doing the talking.

The finger might also point at those who scared people about the dangers of AGI research but never took the effort to publicly distance themselves from extreme measures.

Were anyone to actually get killed as a result of unstable people reading this thread...

What if anyone gets killed as a result of not reading this thread because he was never exposed to the arguments of why it would be a really bad idea to violently oppose AGI research?

I trust you'll do the right thing. I just wanted to point that out.

Replies from: wedrifid, None, None
comment by wedrifid · 2011-05-08T23:49:36.740Z · LW(p) · GW(p)

All the rational arguments against it are now buried in a downvoted and deleted thread

Exactly right. The comment by CarlShulman is valuable - valuable enough that it warrants a thread of its own.

What if anyone gets killed as a result of not reading this thread because he was never exposed to the arguments of why it would be a really bad idea to violently oppose AGI research?

Passionately suppressing the conversation could also convey a message of "Shush. Don't tell anyone." as well as showing you take the idea seriously. This is in stark contrast to signalling that you think the whole idea is just silly, because reasoning like Carl's is so damn obvious.

comment by [deleted] · 2011-05-02T14:00:06.440Z · LW(p) · GW(p)

I also don't believe any of the 'outrage' in this thread has been 'hypocritical' - any more than I believe that those advocating murder have been. Certainly in my own case I have argued against killing anyone, and I have done so consistently - I don't believe I've said anything at all hypocritical here.

comment by [deleted] · 2011-05-02T13:57:04.378Z · LW(p) · GW(p)

"The finger might also point at those who scared people about the dangers of AGI research but never took the effort to publicly distance themselves from extreme measures."

I absolutely agree. Personally I don't go around scaring people about AGI research because I don't find it scary. I also think Eliezer, at least, has done a reasonable job of distancing himself from 'extreme measures'.

"What if anyone gets killed as a result of not reading this thread because he was never exposed to the arguments of why it would be a really bad idea to violently oppose AGI research?"

Unfortunately, there are very few people in this thread making those arguments, and a large number making (in my view extremely bad) arguments for the other side...

comment by wedrifid · 2011-05-01T21:59:55.675Z · LW(p) · GW(p)

advocating murder on the public internet is not just wrong but UTTERLY FUCKING STUPID.

This is not a sane representation of what has been said on this thread. I also note that in taking an extreme position against preemptive strikes of any kind, you are pitting yourself against the political strategy of most nations on earth, and definitely that of the nation from which most posters originate.

For that matter I also expect state sanctioned military or paramilitary organisations to be the groups likely to carry out any necessary violence for the prevention of AGI apocalypse.

Replies from: None
comment by [deleted] · 2011-05-01T22:05:47.657Z · LW(p) · GW(p)

This thread started with a post talking about how we should 'neutralize' people who may, possibly, develop AI at some point in the future. You, specifically, replied to "Bad argument gets counterargument. Does not get bullet. Never. Never ever never for ever." with "I approve of that sentiment so long as people don't actually take it literally when the world is at stake." Others have been saying "The competent resort to violence as soon as it beats the alternatives." What, exactly, would you call that if not advocating murder?

Replies from: wedrifid, wedrifid
comment by wedrifid · 2011-05-01T22:14:33.658Z · LW(p) · GW(p)

Does not get bullet. Never. Never ever never for ever.

Does it get systematic downvoting of 200 of my historic comments? Evidently - whether done by yourself or another. I'm glad I have enough karma to shrug it off but I do hope they stop soon. I have made a lot of comments over the last few years.

Edit: As a suggestion it may be better to scroll back half a dozen pages on the user page before starting a downvote protocol. I was just reading another recent thread I was active in (the social one) and some of the -1s were jarringly out of place. The kind that are never naturally downvoted.

comment by wedrifid · 2011-05-01T22:11:56.423Z · LW(p) · GW(p)

You, specifically, replied to "Bad argument gets counterargument. Does not get bullet. Never. Never ever never for ever." with "I approve of that sentiment so long as people don't actually take it literally when the world is at stake." Others have been saying "The competent resort to violence as soon as it beats the alternatives." What, exactly, would you call that if not advocating murder?

Rejecting what is clearly an irrational quote from Eliezer, independently of the local context. I believe I have rejected it previously and likely will again whenever anyone chooses to quote it. Eliezer should know better than to make general statements that quite clearly do not hold.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2011-05-01T22:19:44.957Z · LW(p) · GW(p)

Most statements don't hold in some contexts. Particularly, if you're advocating an implausible or subtly incorrect claim, it's easy to find a statement that holds most of the time but not for the claim in question, thus lending it connotational support of the reference class where the statement holds.

Replies from: wedrifid
comment by wedrifid · 2011-05-01T22:22:20.570Z · LW(p) · GW(p)

Most statements don't hold in some contexts. Particularly, if you're advocating an implausible or subtly incorrect claim, it's easy to find a statement that holds most of the time but not for the claim in question, thus lending it connotational support of the reference class where the statement holds.

I think I agree with what you are saying. As a side note statements that include "Never. Never ever never for ever" need to do better than to 'hold in some contexts'. Because that is a lot of 'never'.

comment by [deleted] · 2011-05-01T22:12:26.860Z · LW(p) · GW(p)

Also, I refuse to reply any more to any of your comments, because at least twice that I have noticed you have edited your comment after the reply has been posted, without posting any acknowledgement of same.

Replies from: Vladimir_Nesov, wedrifid
comment by Vladimir_Nesov · 2011-05-01T22:31:50.198Z · LW(p) · GW(p)

at least twice that I have noticed you have edited your comment after the reply has been posted, without posting any acknowledgement of same.

I do this all the time. There is always room for improvement, and notes about edits are ugly. I only leave them on comments that were later discovered to contain errors that matter for the discussion, and in that case I leave the errors in place, only pointing out their presence.

If you want a better alternative, act on caring about getting version history implemented for comments.

Replies from: None
comment by [deleted] · 2011-05-01T22:42:15.363Z · LW(p) · GW(p)

That's reasonable. But I personally consider it to be arguing in bad faith if someone makes a comment, I reply to it, then I go back later and see that it's been edited to look like I'm replying to something substantially different. Minor edits for spelling or punctuation are reasonable, but introducing entirely new strands of argument, or deleting arguments that were there originally, gives an incorrect impression of what's actually been said. I'm not going to keep going back and checking every five minutes that the context of my comments hasn't been utterly changed, so I'm only going to reply in more-or-less stable contexts.

Replies from: wedrifid
comment by wedrifid · 2011-05-01T22:43:54.894Z · LW(p) · GW(p)

or deleting arguments that were there originally

As I previously mentioned, I have not deleted anything from comments I have written in this thread.

comment by wedrifid · 2011-05-01T22:19:04.963Z · LW(p) · GW(p)

Also, I refuse to reply any more to any of your comments

Thankyou.

because at least twice that I have noticed you have edited your comment after the reply has been posted,

With about 1/3 of the comments that I make, I think of additional things to say as soon as I press enter. When I start editing within 5 seconds of clicking 'comment' I do not consider it necessary to write "edit". Given the frequency, that would be outright spammy.

without posting any acknowledgement of same.

I have added sentences to several comments here. Nothing has been removed. A few extra words have been included where their absence made a sentence outright ungrammatical. This is an acknowledgement and not an apology of any kind.

comment by Vladimir_Nesov · 2011-05-01T21:43:03.697Z · LW(p) · GW(p)

It's not true that AGI is an argument. Instead, it is a device. That is simple truth.

comment by wedrifid · 2011-05-01T21:26:58.748Z · LW(p) · GW(p)

It's probably easier to build an uncaring AI than a friendly one. So, if we assume that someone, somewhere is trying to build an AI without solving friendliness, that person will probably finish before someone who's trying to build a friendly AI.

I can only infer what you were saying here, but it seems likely that I, roughly speaking, approve of what you were saying. It is the sort of thing that people don't consider rationally, instead going off the default reaction that fits a broad class of related ideas.

comment by Armok_GoB · 2011-05-01T18:57:13.710Z · LW(p) · GW(p)

That sounds like it'd be a rather small conspiracy, rather little assimilation, and rather much hunting.

comment by Thomas · 2011-05-01T19:14:38.729Z · LW(p) · GW(p)

So you say Horizon Wars should be started? Preemptive strikes against any non-FAI programmer or organization out there?

Sweet!

Replies from: None
comment by [deleted] · 2011-05-01T19:34:06.408Z · LW(p) · GW(p)

Yeah, sweet! Because nothing says "we deserve to make decisions about the future of humanity, and possibly of life in the universe" like murdering one's ideological enemies, does it? That is not the kind of thing one should even joke about.

Replies from: Thomas
comment by Thomas · 2011-05-01T19:55:27.985Z · LW(p) · GW(p)

It is not a joke; it is an observation about the OP's possible intent.

I agree, he should delete the post.

Replies from: None
comment by [deleted] · 2011-05-01T19:56:52.787Z · LW(p) · GW(p)

The "Sweet!" made it appear as if you either endorsed that view or found it humorous rather than disturbing.

Replies from: Thomas
comment by Thomas · 2011-05-01T20:18:51.846Z · LW(p) · GW(p)

The last thing I want is a sectarian war between various research groups and organizations.

Religious wars of the past and present could be dwarfed by that. And even if not, it would be bad enough.

comment by Ivanbaj · 2011-05-01T20:11:44.119Z · LW(p) · GW(p)

If you mean a self-aware AI, then I doubt that the creator will have much say in whether the artificial person will be good or bad. How much blame do you put on the parents of a killer?

The actions of a self-aware AI should be referred to Metaphysics and Axiology.

If you mean an automaton, then we have the laws already. Why would anyone want to create a machine to break them? The creator will definitely be responsible. :(

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey any orders given to it by human beings, except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

Replies from: None
comment by [deleted] · 2011-05-01T20:20:51.646Z · LW(p) · GW(p)

You appear to be new here, so to explain why someone has downvoted you - this is a frequently discussed topic here. There is a generally accepted view here (which I do not necessarily share, but which is the view of the community as a whole) that an Artificial Intelligence is likely to be created soon (as in not tomorrow, but probably within the next century or two), and that if it is created without proper care such an AI might well destroy humanity/the Earth/the local universe. It is generally considered in the community that the Asimov Laws wouldn't prevent such an event.

I suggest you first read http://wiki.lesswrong.com/wiki/Friendly_artificial_intelligence and the links on that page, then read or at least skim 'the Sequences' (a core set of posts that most people here have read, which you can find here - http://wiki.lesswrong.com/wiki/Sequences ) before making comments about this subject, as it is one that has been the subject of much discussion on this site, and people will consider repetition of ideas that have already been dealt with as being noise, rather than signal.