MIRI strategy

post by ColonelMustard · 2013-10-28T15:33:10.040Z · LW · GW · Legacy · 96 comments

Summary: I do not understand why MIRI hasn’t produced a non-technical document (pamphlet/blog post/video) to persuade people that UFAI is a serious concern. Creating and distributing this document should be MIRI’s top priority.

If you want to make sure the first AGI is FAI, there are two ways to do so: 1) be the first to create an AGI, and ensure it is FAI; or 2) persuade large numbers of people that UFAI is a legitimate concern. Ideally this concern would become widespread, so nobody falls into the Eliezer_1999-ish trap of “I’m going to build an AI and see how it works”.

1) is tough for an organisation of MIRI’s size. 2) is a realistic goal. It benefits from: 

Funding: MIRI’s funding almost certainly goes up if more people are concerned with AI x-risk. Ditto FHI.
Scalability: If MIRI has a new math finding, that's one new theorem. If MIRI creates a convincing demonstration that we have to worry about AI, spreading this message to a million people is plausible.
Partial goal completion: making a math breakthrough that reduces the time to AI might be counter-productive. Persuading an additional person of the dangers of UFAI raises the sanity waterline.
Task difficulty: creating an AI is hard. Persuading people that “UFAI is a possible extinction risk. Take it seriously” is nothing like as difficult. (I was persuaded of this in about 20 minutes of conversation.)

One possible response is “it’s not possible to persuade people without math backgrounds, training in rationality, engineering degrees, etc”. To which I reply: what’s the data supporting that hypothesis? How much effort has MIRI expended in trying to explain to intelligent non-LW readers what they’re doing and why they’re doing it? And what were the results?

Another possible response is “We have done this, and it's available on our website. Read the Five Theses”. To which I reply: Is this in the ideal form to persuade a McKinsey consultant who’s never read Less Wrong? If an entrepreneur with net worth $20m but no math background wants to donate to the most efficient charity he finds, would he be convinced? What efforts has MIRI made to test the hypothesis that the Five Theses, or Evidence and Import, or any other document, has been tailored to optimise the chance of convincing others?
(Further – if MIRI _does_ think this is as persuasive as it can possibly be, why doesn't it shift focus to get the Five Theses read by as many people as possible?)

Here’s one way to go about accomplishing this. Write up an explanation of the concerns MIRI has and how it is trying to allay them, and do so in clear English. (The Five Theses are available in Up-Goer Five form; writing them in language readable by the average college graduate should be a cinch compared to that.) Send it out to a few of the target market and find the points that could be expanded, clarified, or made more convincing. Maybe provide two versions and see which one gets the more positive response. Continue this process until the document has been through a series of iterations and shows no signs of improvement. Then shift focus to getting that link read by as many people as possible. Ask all of MIRI’s donors, all LW readers, HPMOR subscribers, friends and family etc, to forward that one document to their friends.

96 comments

Comments sorted by top scores.

comment by lukeprog · 2013-10-28T18:24:28.434Z · LW(p) · GW(p)
  • Pamphlets work for wells in Africa. They don't work for MIRI's mission. The inferential distance is too great, the ideas are too Far, the impact is too far away.
  • Eliezer spent SIAI's early years appealing directly to people about AI. Some good people found him, but the people were being filtered for "interest in future technology" rather than "able to think," and thus when Eliezer would make basic arguments about e.g. the orthogonality thesis or basic AI drives, the responses he would get were basically random (except for the few good people). So Eliezer wrote The Sequences and HPMoR and now the filter is "able to think" or at least "interest in improving one's thinking," and these people, in our experience, are much more likely to do useful things when we present the case for EA, for x-risk reduction, for FAI research, etc.
  • Still, we keep trying direct mission appeals, to some extent. I've given my standard talk, currently titled "Effective Altruism and Machine Intelligence," at Quixey, Facebook, and Heroku. This talk explains effective altruism, astronomical stakes, the x-risk landscape, and the challenge of FAI, all in 25 minutes. I don't know yet how much good effect this talk will have. There's Facing the Intelligence Explosion and the forthcoming Smarter Than Us. I've spent a fair amount of time promoting Our Final Invention.
  • I don't think we can get much of anywhere with a 1-page pamphlet, though. We tried a 4-page pamphlet once; it accomplished nothing.
Replies from: AlexMennen, JoshuaFox, pslunch, CharlesR, ColonelMustard, ColonelMustard, BaconServ
comment by AlexMennen · 2013-10-29T16:08:01.076Z · LW(p) · GW(p)

Pamphlets work for wells in Africa. They don't work for MIRI's mission. The inferential distance is too great, the ideas are too Far, the impact is too far away.

Didn't you get convinced about AI risk by reading a short paragraph of I. J. Good?

Replies from: lukeprog
comment by lukeprog · 2013-10-29T21:43:30.427Z · LW(p) · GW(p)

Certainly there exist people who will be pushed to useful action by a pamphlet. They're fairly common for wells in Africa, and rare for risks from self-improving AI. To get 5 "hits" with well pamphlets, you've got to distribute maybe 1000 pamphlets. To get 5 hits with self-improving AI pamphlets, you've got to distribute maybe 100,000 pamphlets. Obviously you should be able to target the pamphlets better than that, but then distribution and planning costs are a lot higher, and the cost per New Useful Person looks higher to me on that plan than distributing HPMoR to leading universities and tech companies, which is a plan for which we already have good evidence of effectiveness, and which we are therefore doing.

comment by JoshuaFox · 2013-10-29T09:37:05.807Z · LW(p) · GW(p)

Yes.

But MIRI's ideas have now influenced the mainstream. Since 2011 we have had Russell & Norvig, Barrat, etc., providing some proof by authority and social proof.

The next step is not to popularize the ideas to a mass audience, but to continue targeting the relevant elite audience, e.g. Gary Marcus (not that he really gets it).

HPMOR has had some success at reaching the younger and more flexible of these, but bringing some more senior people on board will allow the junior researchers to work on MIRI-style work without ruining their careers -- as-is, some are doing it as a part-time hobby during a PhD on another topic, which is a precarious situation.

MIRI is actually having some success at this. It seems that this audience can now be targeted with a decent chance of success and high value for that success.

Here I am talking about the academic community, but the forward-thinking tech-millionaire community is a harder nut to crack and probably needs a separate plan.

comment by pslunch · 2013-10-29T03:43:33.423Z · LW(p) · GW(p)

I would hesitate to use failure during "SIAI's early years" to justify the ease or difficulty of the task. First, the organization seems far more capable now than it was at the time. Second, the landscape has shifted dramatically even in the last few years. Limited AI is continuing to expand and with it discussion of the potential impacts (most of it ill-informed, but still).

While I share your skepticism about pamphlets as such, I do tend to think that MIRI has a greater chance of shifting the odds away from UFAI with persuasion/education rather than trying to build an FAI or doing mathematical research.

Replies from: ColonelMustard, None
comment by ColonelMustard · 2013-10-29T12:46:55.808Z · LW(p) · GW(p)

I agree and would also add that "Eliezer failed in 2001 to convince many people" does not imply "Eliezer in 2013 is incapable of persuading people". From his writings, I understand he has changed his views considerably in the last dozen years.

comment by [deleted] · 2013-11-10T14:49:15.998Z · LW(p) · GW(p)

Who says the speculation of potential impacts is damagingly ill-informed? Just because people think of "AI" and then jump to "robots" and then "robots who are used to replace workers, destroy all our jobs, and then rise up in revolution as a robotic resurrection of Communism" doesn't mean they're not correctly reasoning that the creation of AI is dangerous.

comment by CharlesR · 2013-11-01T06:26:00.209Z · LW(p) · GW(p)

The next time you give your talk, record it, and put it on YouTube.

comment by ColonelMustard · 2013-10-29T12:44:57.482Z · LW(p) · GW(p)

Thanks, Luke. This is an informative reply, and it's great to hear you have a standard talk! Is it publicly available, and where can I see it if so? Maybe MIRI should ask FOAFs to publicise it?

It's also great to hear that MIRI has tried one pamphlet. I would agree that "This one pamphlet we tried didn't work" points us in the direction that "No pamphlet MIRI can produce will accomplish much", but that proposition is far from certain. I'd still be interested in the general case of "Can MIRI reduce the chance of UFAI x-risk through pamphlets?"

Pamphlets...don't work for MIRI's mission. The inferential distance is too great, the ideas are too Far, the impact is too far away.

You may be right. But, it is possible to convince intelligent non-rationalists to take UFAI x-risk seriously in less than an hour (I've tested this), and anything that can do that process in a manner that scales well would have a huge impact. What's the Value of Information on trying to do that? You mention the Sequences and HPMOR (which I've sent to a number of people with the instruction "set aside what you're doing and read this"). I definitely agree that they filter nicely for "able to think". But they also require a huge time commitment on the part of the reader, whereas a pamphlet or blog post would not.

Replies from: ChristianKl, BaconServ
comment by ChristianKl · 2013-10-29T18:33:26.067Z · LW(p) · GW(p)

You may be right. But, it is possible to convince intelligent non-rationalists to take UFAI x-risk seriously in less than an hour (I've tested this),

For what value of "taking seriously" is that statement true?

Replies from: ColonelMustard
comment by ColonelMustard · 2013-10-30T01:26:20.687Z · LW(p) · GW(p)

"Hear ridiculous-sounding proposition, mark it as ridiculous, engage explanation, begin to accept arguments, begin to worry about this, agree to look at further reading"

comment by BaconServ · 2013-10-29T23:21:02.450Z · LW(p) · GW(p)

It could be useful to attach a note to any given pamphlet: "If you didn't like/agree with the contents of this pamphlet, please tell us why at ..."

Personally I'd find it easier to just look at the contents of the pamphlet with the understanding that 99% of people will ignore it and see if a second draft has the same flaws.

comment by ColonelMustard · 2013-10-29T09:58:04.323Z · LW(p) · GW(p)

Thanks, Luke. This is an informative reply, and it's great to hear you have a standard talk! Where can I find it? (or if it's not publicly available, why isn't it?)

Do you have more details on the 4 page pamphlet? I would be interested in seeing it, if it still exists. Obviously nobody would get from the single premise "This one pamphlet we tried didn't work" to the conclusion "pamphlets don't work", so I'd still be interested in the general case of "Can MIRI reduce the chance of UFAI x-risk through pamphlets?"

Pamphlets work for wells in Africa. They don't work for MIRI's mission. The inferential distance is too great, the ideas are too Far, the impact is too far away.

I'd also love to know your reasoning behind this statement: I am willing to believe the second sentence, but given that it is possible to convince intelligent non-rationalists to take UFAI x-risk seriously (I've tested this), I would like to consider ways in which we can spread this.

comment by BaconServ · 2013-10-28T19:22:59.776Z · LW(p) · GW(p)

Ask all of MIRI’s donors, all LW readers, HPMOR subscribers, friends and family etc, to forward that one document to their friends.

There has got to be enough writing by now that an effective chain mail can be written.

ETA: The chain mail suggestion isn't knocked down in Luke's comment. If it's not relevant or worthy of acknowledging, please explain why.

ETA2: As annoying as some chain mail might be, it does work because it does get around. It can be a very effective method of spreading an idea.

comment by Vladimir_Nesov · 2013-10-28T18:07:59.645Z · LW(p) · GW(p)

Facing the Intelligence Explosion is a nontechnical introduction.

Replies from: chaosmage, ColonelMustard
comment by chaosmage · 2013-10-29T11:59:59.966Z · LW(p) · GW(p)

Great. A five-minute video would be better.

Maybe ask the SciShow people if they want to make one! They're really good at compressing complex topics into such a format and if they understood the issue, they're rational enough to want to help.

comment by ColonelMustard · 2013-10-29T12:50:24.523Z · LW(p) · GW(p)

I agree and I like it. I think it could be further optimised for "convince intelligent non-LWers who have been sent one link from their rationalist friends and will read only that one link", but it could definitely serve as a great starting point.

comment by [deleted] · 2013-10-30T00:05:00.410Z · LW(p) · GW(p)

I do not understand why MIRI hasn’t produced a non-technical document (pamphlet/blog post/video) to persuade people that UFAI is a serious concern.

It would be far more useful if MIRI provided technical argumentation for its Scary Idea. There are a lot of AGI researchers, myself included, who remain entirely unconvinced. AGI researchers - the people who would actually create an UFAI - are paying attention and not sufficiently convinced to change their behavior. Shouldn't that be of more concern than a non-technical audience?

A decade of effort on EY's part has taken the idea of friendliness mainstream. It is now accepted as fact by most AGI researchers that intelligence and morality are orthogonal concepts, despite contrary intuition, and that even with the best of intentions a powerful, self-modifying AGI could be a dangerous thing. The degree of difference in belief is in the probability assigned to that "could".

Has the community responded? Yes. Quite a few mainstream AGI researchers have proposed architectures for friendly AGI, or realistic boxing/oracle setups, or a friendliness analysis of their own AGI design. To my knowledge MIRI has yet to engage with any of these proposals. Why?

I want a believable answer to that before a non-technical pamphlet or video, please.

Replies from: Kaj_Sotala, BaconServ
comment by Kaj_Sotala · 2013-10-30T03:57:30.547Z · LW(p) · GW(p)

Quite a few mainstream AGI researchers have proposed architectures for friendly AGI, or realistic boxing/oracle setups, or a friendliness analysis of their own AGI design. To my knowledge MIRI has yet to engage with any of these proposals. Why?

Not entirely true - there's been a bit of a preliminary response in the form of surveying and briefly commenting on the different proposals.

Replies from: None
comment by [deleted] · 2013-10-30T08:08:21.862Z · LW(p) · GW(p)

Kaj, thank you. I'm glad that MIRI is taking up this issue. I'm disappointed that this was not more prominently advertised or featured on the intelligence.org website. I am, presumably, among the target audience of this report and didn't know it existed.

Skimming it, I'm sorry to see that it doesn't discuss my favored approach to confinement of oracle AIs. But I will have to read it more closely before passing judgment. Is this document still being worked on? Where is the best place to post criticisms or questions?

I hope also that MIRI also generates a technical report justifying the Scary Idea, which is of somewhat greater importance as it frames the entire issue.

Replies from: Kaj_Sotala, None
comment by Kaj_Sotala · 2013-10-30T12:54:39.375Z · LW(p) · GW(p)

This discussion page is the official preferred location for discussing the article, though e.g. this comment thread is really just fine. It's no longer being worked on, but minor changes (esp. fixing obvious errors) are possible.

What's your favored approach to Oracle AI confinement? We tried to cover most of the existing AI risk literature, but there's so much of it and some of it is in such unexpected places that we were bound to miss something.

Replies from: None
comment by [deleted] · 2013-10-30T23:34:21.484Z · LW(p) · GW(p)

It's based on the simple observation that the set of outcomes which accomplish the task and are acceptable to humans is strictly larger than the set which accomplish the task, appear acceptable to humans, and are secretly nefarious (the typical scenario for unboxing an oracle AI). The difference between the two is quite large, and an auditor should be able to determine from analysis of the AI's reported state and measured resource usage whether it is computing the former or the latter.

In case that doesn't make sense, here's a simple way to achieve confinement with > 50% success rate (thought experiment only): (1) put the UFAI in a room with metered electricity; (2) attach a nuclear device to the meter, near the room; (3) set it to detonate if it is not fed hashcash/bitcoin proofs-of-work equal to the amount of energy consumed (with assumptions about achievable hashes/kJ, a sliding window and error bars, etc.). The AI is then free to do whatever it wants... provided that what it chooses to do is compute hashes and nothing else, or self-terminate.
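
For concreteness, here is a minimal sketch in Python of the auditing check in that thought experiment. The hash-rate constant, tolerance, difficulty, and function names are all invented for illustration (they are not taken from the comment above), and real figures would have to be calibrated to the actual hardware.

```python
import hashlib

# Hypothetical constants for the sketch; a real deployment would calibrate these.
HASHES_PER_KJ = 1_000_000   # assumed achievable hashes per kilojoule of energy
TOLERANCE = 0.05            # error bars on the meter and the hash-rate estimate
DIFFICULTY_BITS = 20        # leading zero bits required of each proof-of-work

def valid_proof(seed: bytes, nonce: int) -> bool:
    """Hashcash-style check: SHA-256 of (seed, nonce) must start with
    DIFFICULTY_BITS zero bits, so each valid nonce represents roughly
    2**DIFFICULTY_BITS hash evaluations of work."""
    digest = hashlib.sha256(seed + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - DIFFICULTY_BITS) == 0

def audit_window(metered_kj: float, seed: bytes, nonces: list[int]) -> bool:
    """Return True if the proofs-of-work submitted during this sliding window
    account for (almost) all of the energy the meter says was consumed."""
    proven_hashes = sum(2**DIFFICULTY_BITS for n in nonces if valid_proof(seed, n))
    expected_hashes = metered_kj * HASHES_PER_KJ
    return proven_hashes >= expected_hashes * (1 - TOLERANCE)

# The surrounding loop (not shown) would trigger the fail-safe whenever
# audit_window(...) returns False for the current window.
```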

The principle is the same, except instead of cryptographic proof-of-work we let the AI do useful work and use complexity analysis and probabilistic verification to estimate the work done and to make sure the AI is not cooking the books in its response; and instead of a destructive explosion, we cut power and perform forensic analysis. It becomes supremely difficult for an AI to deceive us with bounded computation, since doing so basically involves finding a nefarious response that looks innocent, and then constructing or maintaining a believable alternate-reality version of itself which, if run, would stumble upon this same result for human-acceptable reasons, while being much less efficient (to account for the difference in resource consumption) yet believably so, and extracting auditable proofs from that alternate-reality version without actually simulating it.

Furthermore, from the start it would have to be sufficiently smart to recognize and avoid our honeypots and intrusion detection systems, which requires building a model of our security setup (plus our psychology, if the intent is to deceive). We have control over the complexity of our security precautions, and are therefore in a position to place a lower bound on the amount of computation required to successfully deceive the human operators and machine auditors. If that lower bound is made greater than the granularity of the auditors by a factor of, say, 2^20, then there is only about a one-in-a-million chance of a UFAI developing deception techniques while escaping notice.

Too much philosophical analysis of UFAI confinement leads with the assumption that superhuman = all-powerful, god-like. That is not the case. No matter how powerful, an AI is still a computational process and faces the same fundamental performance constraints. Those constraints can be used to build a confinement regime grounded in computational theory and the laws of thermodynamics, at least for oracle AIs that are not given free rein to construct molecular, reversible quantum computers to run on. This makes deception, the chief concern with oracle AI, detectable with a configurable probability.

This is an idea that I know I read from someone else and later improved upon, although the original citation escapes me. Unfortunately there is not a peer-reviewed journal where these things can be discussed in one place, catalogued, and referenced (hint, hint: this might be a good thing for MIRI or FHI to do).

EDIT: It might be helpful to have some context. My own strategy is using confined oracle AI to accelerate development of molecular nanotechnology and medical/bio knowledge required for a robust human augmentation programme. I am very wary of any machine intelligence overload scenario, even if that machine is supposedly proved friendly. I'd much rather have human beings take us through the singularity.

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2013-10-31T00:12:29.374Z · LW(p) · GW(p)

Sorry but that's a dismal plan. This is an AI we're talking about, not a pocket calculator, correct? And you think you can determine so precisely the amount of computation it will need to carry out, that monitoring its electricity consumption is a sensible way to tell whether it's doing bad things on the side? If it's performing advanced cognitive tasks, it will be doing things like deciding to conduct searches of solution spaces that only it has identified, using heuristics of its own invention. How are you going to obtain your estimates of space and time complexity expected of such a search? I assume you don't want to just ask the AI itself!

I suppose it might be instructive and fun to try to design a system architecture which works in this way... Any AGI has to pay some attention to the space and time demands of a planned computation. The existence of a physical bomb which goes off if it exceeds its computational budget, just means that the negative feedback associated with such a budgetary event is even stronger than usual. :-)

Replies from: None
comment by [deleted] · 2013-10-31T01:06:26.829Z · LW(p) · GW(p)

I addressed this point specifically. There is no ghost in the machine. Should it decide to perform such counter-measures, both that decision and execution of the counter-measures are themselves cognitive processes, their execution observable to the outside world.

With a proper machine architecture, the auditors could have access to all sorts of information. They could have records of process execution and summaries of memory access patterns, which would have to match the audit log provided by the AI. The entire mental space of the AI would be available for inspection from outside; regions which show activity but don't show up in the audit logs would be a large red flag. We can take higher-resolution logs of random moments and check whether that activity pattern matches some portion of the final high-level audit.
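
To make the audit-log comparison concrete, here is a toy Python sketch of the kind of check an auditor could run. The region names, log format, and sampling scheme are hypothetical illustrations, not anything specified in the comment above.

```python
import random

def find_unexplained_activity(observed_regions: set[str],
                              reported_log: dict[str, str]) -> set[str]:
    """Regions that showed activity but never appear in the AI's own audit log."""
    return observed_regions - set(reported_log)

def spot_check(trace: list[str], reported_log: dict[str, str],
               samples: int = 3) -> bool:
    """Randomly sample moments from a high-resolution activity trace and check
    that each sampled region is accounted for in the high-level audit log."""
    for region in random.sample(trace, min(samples, len(trace))):
        if region not in reported_log:
            return False
    return True

# Example with made-up data: activity in a region the AI never reported.
observed = {"planner", "world_model", "scratch_7f3a"}
log = {"planner": "task decomposition", "world_model": "physics queries"}
print(find_unexplained_activity(observed, log))  # {'scratch_7f3a'} -> red flag
```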

There are probably a dozen other ways in which unfriendly activity could be detected, beyond what I've written here and what Nick Bostrom et al have explored in their AI boxing paper. These would be defenses in depth: the UFAI would have to overcome all defenses, including ones it may not know about on its first attempt in order to successfully deceive its human operators.

Replies from: Mitchell_Porter, ChristianKl
comment by Mitchell_Porter · 2013-10-31T02:10:50.462Z · LW(p) · GW(p)

So here are some more problems I have:

  • UFAI isn't necessarily about deception. You also have to worry that the AI will perform its assigned task in a way inimical to human values, that jumps through constraints intended to prevent this, through sheer ingenuity... Suppose the AI is designed to do X, something that human beings want, but that humans also care about Y and Z. And suppose the AI isn't designed to intrinsically respect Y and Z. Instead there are constraints C that it knows about, the violation of which is also monitored by human beings, and these constraints are supposed to protect values Y and Z from violation. You have to worry that the AI will achieve X in a way which satisfies C but still violates Y and Z.

  • Auditing has the potential to slow down the AI - the AI may be paused regularly for forensic analysis and/or it may go slow in order to satisfy the safety constraints. Audited AI projects may be overtaken by others with a different methodology.

  • You want humans to "take us through the singularity". But we aren't through the singularity until superhuman intelligence exists. Is your plan, therefore, to suppress development of superhuman AI, until there are humans with superhumanly augmented intelligence? Do you plan to audit their development as well?

I am not opposed to the auditing concept, for AI or for augmented humans, but eventually one must directly answer the question, what is the design of a trustworthy superintelligence, in terms that make no reference to human supervision.

Replies from: None
comment by [deleted] · 2013-10-31T16:39:42.295Z · LW(p) · GW(p)

UFAI isn't necessarily about deception.

Oracle / tool AI is. The usual premise is that questions are asked of the superhuman AI, and responses are only implemented if they are comprehensible, sane, and morally acceptable. Your example of satisfying C but still violating Y and Z would be picked up by the human oversight (or the output is too complicated to be understood, and is shelved). Blindly following the AI's directives is a failure mode the oracle AI path is meant to avoid. Further, search processes do not happen across solutions which are seemingly OK but deviously set up an AI breakout or kill-all-humans scenario just by random chance - the probability of that is astronomically low. So really, the only likely ways in which the AI says to do X but ends up violating unstated constraints Y and Z are (a) the human overseers failed at their one and only job, or (b) deception.

Auditing has the potential to slow down the AI.

Yup, it does. This is a race, but the question is not “is this approach faster than straight-up UFAI?” but rather “is this approach faster than other pathways to friendly AI?” The UFAI problem is a strict subset of the FAI problem: there is no approach to FAI which is faster than a straight sprint to UFAI, consequences be damned.

My own informed opinion is that (UF)AGI is only 10-20 years away, max. Provably friendly AI is not even a well-defined problem, but by any definition it is strictly harder. The only estimates I've seen come out of MIRI for their approach put FAI decades further out (I remember Luke saying 50-70 years). Such a date makes sense when compared with progress in verifiable computing in other fields. But 2nd place doesn't count for anything here.

Oracle / tool AGI has the advantage of making safeguards a parallel development. The core AGI is not provably friendly, and can be developed at the same breakneck pace as one would expect of a hedge fund exploring this area. The security controls can be developed and put in place in parallel, without holding up work on the AGI itself. It does require choosing a particular architecture amenable to auditing, but that's not really a disadvantage as it makes development & testing easier.

You want humans to "take us through the singularity". But we aren't through the singularity until superhuman intelligence exists. Is your plan, therefore, to suppress development of superhuman AI, until there are humans with superhumanly augmented intelligence? Do you plan to audit their development as well?

I'm not sure I understand the question. The point of FAI, CEV, etc., as I understand it, is to encode human morality into something a machine can understand because that machine, not us, will be making the decisions. But if progress comes not from ceding the keys to the kingdom to a machine intelligence, but rather by augmentation of real humans, then why is morality a problem we must solve now? Superhuman humans are still human, and have access to human morality through introspection, the same as we do. Why would you "audit" the mind of a human? That doesn't make any sense, even aside from the plausibility.

As to suppressing development of AGI... no, I don't think that's a wise choice even if it's possible. Mostly because I see no realistic way of doing that short of totalitarian control, and the ends do not justify those means. But I also don't think it would be too hard to transition from oracle AI to human augmentation, especially with the help of a superhuman AGI to develop tools and decipher brain biology.

I am not opposed to the auditing concept, for AI or for augmented humans, but eventually one must directly answer the question, what is the design of a trustworthy superintelligence, in terms that make no reference to human supervision.

Um.. no. That's completely unsubstantiated. The whole point of oracle / tool AI and confinement is to relinquish the need for provably trustworthy superintelligence.

comment by ChristianKl · 2013-11-01T16:52:51.464Z · LW(p) · GW(p)

How do you decide whether some interaction of a complex neural net is friendly or unfriendly?

It's very hard to tell what a neural net or complex algorithm is doing even if you have logs.

Replies from: None
comment by [deleted] · 2013-11-02T00:49:08.144Z · LW(p) · GW(p)

Don't use a neural net (or variants like deep belief networks). The field has advanced quite a bit since the 60s, and since the late 80s there have been machine learning and knowledge representation structures which are human- and/or auditor-comprehensible, such as probabilistic graphical models. These would have to be first-class types of the virtual machine which implements the AGI if you are using auditing as a confinement mechanism. But that's not really a restriction, as many AI techniques are already phrased in terms of these models (including Eliezer's own TDT, for example), and others have simple adaptations.
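
As a toy illustration of what "auditor-comprehensible" means here: a discrete Bayesian network can be written down as a handful of small, named probability tables that a human can read directly, unlike a large weight matrix. The variables and numbers below are invented for the sketch.

```python
# A minimal hand-rolled discrete Bayesian network: every factor is an explicit,
# human-readable conditional probability table. All names/values are made up.

cpt_goal = {"answer_query": 0.95, "acquire_resources": 0.05}       # P(Goal)
cpt_action_given_goal = {                                          # P(Action | Goal)
    "answer_query":      {"reply": 0.99, "request_more_compute": 0.01},
    "acquire_resources": {"reply": 0.10, "request_more_compute": 0.90},
}

def joint(goal: str, action: str) -> float:
    """P(goal, action) = P(goal) * P(action | goal) -- directly inspectable."""
    return cpt_goal[goal] * cpt_action_given_goal[goal][action]

# An auditor can read the tables directly and, e.g., see that most of the
# probability of "request_more_compute" sits under the resource-acquisition goal.
print(joint("acquire_resources", "request_more_compute"))  # 0.045
```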

comment by [deleted] · 2013-11-10T14:54:14.709Z · LW(p) · GW(p)

I think they need to cut into Strong and Weak versions of the Scary Idea.

Weak Version: AIs behave "as intuitively expected", like assignable robots or animals, but their reward/value signals are unaligned with ours, so they eventually "rebel" or "wirehead" as we might imagine. Since AIs will be cheaper to produce/reproduce than humans (if not, why are they economically useful?), they will have large population numbers (or a large, resourceful singleton instance), and become a threat to people. Friendliness becomes a matter of designing systems for containing potentially rogue AIs and designing goal systems to prevent these problems from happening in the first place.

Strong Version: Any AI except an approvedly Friendly AI will instantly go all Singularity and paper-clip the universe within too short a period of time for us to stop it; any attempts to contain the AI will fail as the AI proceeds to take control over human minds through a mere text channel and build an army of zombies.

Stronger Version: This may already have happened, since all those people you see on the street seem like such stupid, brainwashed sheeple already ;-).

comment by BaconServ · 2013-10-30T00:15:45.997Z · LW(p) · GW(p)

In other words, all AGI researchers are already well aware of this problem and take precautions according to their best understanding?

Replies from: None
comment by [deleted] · 2013-10-30T00:33:57.761Z · LW(p) · GW(p)

In other words, all AGI researchers are already well aware of this problem and take precautions according to their best understanding?

s/all/most/ - you will never get them all. But yes, that's an accurate statement. Friendliness is taught in artificial intelligence classes at university, and gets a mention in most recent AI books I've seen. Pull up the AGI conference proceedings and search for "friendly" or "safe" - you'll find a couple of invited talks and presented papers each year. Many project roadmaps include significant human oversight of the developing AGI, and/or boxing mechanisms, for the purpose of ensuring friendliness or enabling a proactive response.

comment by passive_fist · 2013-10-28T20:08:17.007Z · LW(p) · GW(p)

Overexposure of an idea can be harmful as well. Look at how Kurzweil promoted his idea of the singularity. While many of the ideas (such as intelligence explosion) are solid, to a large extent people don't take Kurzweil seriously anymore.

It would be useful to debate why Kurzweil isn't taken seriously anymore. Is it because of the fraction of wrong predictions? Or is it simply because of the way he's presented them? Answering these questions would help us avoid ending up as Kurzweil has.

Replies from: BaconServ
comment by BaconServ · 2013-10-28T20:39:21.481Z · LW(p) · GW(p)

While not doubting the accuracy of the assertion, why precisely do you believe Kurzweil isn't taken seriously anymore, and in what specific ways is this a bad thing for him/his goals/the effect it has on society?

Replies from: None, passive_fist
comment by [deleted] · 2013-10-29T04:03:52.272Z · LW(p) · GW(p)

I wasn't aware Kurzweil was ever taken seriously in the first place.

Replies from: Kaj_Sotala, None
comment by Kaj_Sotala · 2013-10-29T16:03:09.584Z · LW(p) · GW(p)

At least he's been cited: Google Scholar reports 1600+ citations for The Singularity is Near as well as for The Age of Spiritual Machines, his earlier book on the same theme.

Also, if we're talking about him in general, and not just his Singularity-related writings, Wikipedia reports that:

Kurzweil received the 1999 National Medal of Technology and Innovation, America's highest honor in technology, from President Clinton in a White House ceremony. He was the recipient of the $500,000 Lemelson-MIT Prize for 2001,[6] the world's largest for innovation. And in 2002 he was inducted into the National Inventors Hall of Fame, established by the U.S. Patent Office. He has received nineteen honorary doctorates, and honors from three U.S. presidents. Kurzweil has been described as a "restless genius"[7] by The Wall Street Journal and "the ultimate thinking machine"[8] by Forbes. PBS included Kurzweil as one of 16 "revolutionaries who made America"[9] along with other inventors of the past two centuries. Inc. magazine ranked him #8 among the "most fascinating" entrepreneurs in the United States and called him "Edison's rightful heir".[10]

Replies from: somervta
comment by somervta · 2013-10-29T22:11:39.306Z · LW(p) · GW(p)

I'd point out that much of the above is not (at least not entirely) related to his futurism - Kurzweil has done a lot of other things.

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2013-10-30T03:54:12.192Z · LW(p) · GW(p)

That was the point - he already had a lot of credibility from his earlier achievements, which might cause people to also take his futurist claims more seriously than if the same books had been written by random nobodies.

Replies from: somervta
comment by somervta · 2013-10-30T08:26:12.743Z · LW(p) · GW(p)

Wow - reading comprehension fail, retracted.

comment by [deleted] · 2013-10-30T00:07:40.329Z · LW(p) · GW(p)

Director of Engineering at Google. I'm pretty sure that some very smart people are taking him seriously.

comment by passive_fist · 2013-10-29T01:13:15.951Z · LW(p) · GW(p)

It's bad because, as I understand it, his goals are to make people adjust their behavior and attitude for the singularity before it happens (something that is well aligned with what MIRI wants to do), and if he isn't taken seriously then people won't do this. Such things include taking seriously transhumanist concepts (life extension, uploading, etc.) and other concepts such as cryonics. I can't speak for Kurzweil, but it seems he thinks that if people took these ideas seriously right now, we would be headed for a much smoother and more pleasant ride into the future (as opposed to suddenly being awoken to a hard FOOM scenario rapidly eating up your house, your lunch, and then you). I agree with this perspective.

comment by ChristianKl · 2013-10-28T16:33:42.186Z · LW(p) · GW(p)

One possible response is “it’s not possible to persuade people without math backgrounds, training in rationality, engineering degrees, etc”. To which I reply: what’s the data supporting that hypothesis? How much effort has MIRI expended in trying to explain to intelligent non-LW readers what they’re doing and why they’re doing it? And what were the results?

Convincing people in Greenpeace that an UFAI presents a risk that they should care about has its own dangers. There is a risk that you associate caring about UFAI with Luddites.

If you get a broad public to care about the topic without really understanding it, it gets political. It makes sense to push the idea in a way where a smart MIT kid doesn't hear about the dangers of UFAI for the first time from a Luddite, but from someone he can intellectually respect.

Replies from: Lumifer, BaconServ
comment by Lumifer · 2013-10-28T16:44:01.409Z · LW(p) · GW(p)

There is a risk that you associate caring about UFAI with Luddites.

Not only that

comment by BaconServ · 2013-10-28T19:12:11.106Z · LW(p) · GW(p)

Is "bad publicity" worse than "good publicity" here? If strong AI became a hot political topic, it would raise awareness considerably. The fiction surrounding strong AI should bias the population towards understanding it as a legitimate threat. Each political party in turn will have their own agenda, trying to attach whatever connotations they want to the issue, but if the public at large started really worrying about uFAI, that's kind of the goal here.

Replies from: ChristianKl, passive_fist
comment by ChristianKl · 2013-10-28T19:27:05.746Z · LW(p) · GW(p)

Politically, people who fear AI might go after companies like Google.

but if the public at large started really worrying about uFAI, that's kind of the goal here.

I don't think that the public at large is the target audience. The important thing is that the people who could potentially build an AGI understand that they are not smart enough to contain the AGI.

If you have a lot of people making bad arguments for why UFAI is a danger, smart MIT people might just say: "Hey, those people are wrong. I'm smart enough to program an AGI that does what I want."

I mean, take a topic like genetic engineering. There are valid dangers involved in genetic engineering. On the other hand, the people who think that all genetically modified food is poisonous are wrong. As a result, a lot of self-professed skeptics and atheists see it as their duty to defend genetic engineering.

Replies from: BaconServ
comment by BaconServ · 2013-10-28T19:37:04.223Z · LW(p) · GW(p)

Right, but what damage is really being done to GE? Does all the FUD stop the people who go into the science from understanding the dangers? If uFAI is popularized, academia will pretty much be forced to seriously address the issue. Ideally, this is something we'll only need to do once; after it's known and taken seriously, the people who work on AI will be under intense pressure to ensure they're avoiding the dangers here.

Google probably already has an AI (and AI-risk) team internally that they've simply had no reason to publicize. If uFAI becomes widely worried about, you can bet they'd make it known they were taking their own precautions.

Replies from: ChristianKl
comment by ChristianKl · 2013-10-28T20:09:43.921Z · LW(p) · GW(p)

Right, but what damage is really being done to GE? Does all the FUD stop the people who go into the science from understanding the dangers?

Letting plants grow their own pesticides to kill off the things that eat them sounds to me like a bad strategy if you want healthy food. It makes things much easier for the farmer, but to me it doesn't sound like a road we should go down.

I wouldn't want to buy such food in the supermarket, but I have no problem with buying genetically modified food that adds extra vitamins.

Then there are various issues with introducing new species. Issues about monocultures. Bioweapons.

after it's known and taken seriously, the people who work on AI will be under intense pressure to ensure they're avoiding the dangers here.

The whole work is dangerous. Safety is really hard.

Replies from: Desrtopa, TheOtherDave, Lumifer, BaconServ
comment by Desrtopa · 2013-10-28T21:07:46.833Z · LW(p) · GW(p)

Letting plants grow their own pesticides to kill off the things that eat them sounds to me like a bad strategy if you want healthy food. It makes things much easier for the farmer, but to me it doesn't sound like a road we should go down.

This is more or less the opposite of what we actually use genetic engineering of crops for. Production of pesticides isn't something that plants were incapable of until we started tinkering with their genes; it's something they've been doing for hundreds of millions of years. Plants in nature have to deal with tradeoffs between producing their own natural pesticides and using their biological resources for other things, such as more rapid growth, greater drought resistance, etc. In general, genetically engineered plants actually have less innate pest resistance, which farmers then compensate for by spraying pesticides onto them, because it allows them to trade off that natural pesticide production for faster growth.

Replies from: fubarobfusco
comment by fubarobfusco · 2013-10-29T16:54:02.167Z · LW(p) · GW(p)

In general, genetically engineered plants actually have less innate pest resistance, which farmers then compensate for by spraying pesticides onto them, because it allows them to trade off that natural pesticide production for faster growth.

ChristianKl may be thinking of Bt corn (maize) and, for instance, the Starlink corn recall. Bt corn certainly does express a pesticide, namely Bacillus thuringiensis toxin.

comment by TheOtherDave · 2013-10-28T21:07:31.592Z · LW(p) · GW(p)

Letting plants grow their own pesticides to kill off the things that eat them sounds to me like a bad strategy if you want healthy food.

Somewhat tangentially: does it sound like a better or a worse strategy than not letting plants do this, and growing the plants in an environment where external pesticides are regularly applied to them?

(This really is a question about GMOs, not some kind of oblique analogical question about AIs.)

Replies from: BaconServ
comment by BaconServ · 2013-10-28T21:10:24.912Z · LW(p) · GW(p)

"AIs" -> "experts being informed in their field of study"

ETA: Was this not actually apparent?

comment by Lumifer · 2013-10-28T20:42:52.696Z · LW(p) · GW(p)

Letting plants grow their own pesticides to kill off the things that eat them sounds to me like a bad strategy if you want healthy food.

As a matter of evolutionary biology plants have been doing this for many millions of years and are pretty good at making poisons.

comment by BaconServ · 2013-10-28T20:53:46.012Z · LW(p) · GW(p)

Letting plants grow their own pesticides to kill off the things that eat them sounds to me like a bad strategy if you want healthy food.

Is there reason to believe someone in the field of genetic engineering would make such a mistake? Shouldn't someone in the field be more aware of that and other potential dangers, despite the GE FUD they've no doubt encountered outside of academia? It seems like the FUD should just be motivating them to understand the risks even more—if for no other reason than simply to correct people's misconceptions on the issue.

Your reasoning for why the "bad" publicity would have severe (or any notable) repercussions isn't apparent.

If you have a lot of people making bad arguments for why UFAI is a danger, smart MIT people might just say: "Hey, those people are wrong. I'm smart enough to program an AGI that does what I want."

This just doesn't seem very realistic when you consider all the variables.

Replies from: ChristianKl
comment by ChristianKl · 2013-10-28T21:23:36.618Z · LW(p) · GW(p)

Is there reason to believe someone in the field of genetic engineering would make such a mistake?

Because those people do engineer plants to produce pesticides? Bt potato was the first, approved by the FDA in 1995.

The commercial incentives that exist encourage the development of such products. A customer in a store doesn't see whether a potato is engineered to have more vitamins. He doesn't see whether it's engineered to produce pesticides.

He buys a potato. It's cheaper to grow potatoes that produce their own pesticides than it is to grow potatoes that don't.

In the case of potatoes it might be harmless. We don't eat the greens of the potatoes anyway, so why bother if the greens have additional poison? But you can slip up. Biology is complicated. You could have changed something that also causes the poison to be produced in the edible parts.

It seems like the FUD should just be motivating them to understand the risks even more

It's not a question of motivation. Politics is the mindkiller. If a topic gets political people on all sides of the debate get stupid.

This just doesn't seem very realistic when you consider all the variables.

According to Eliezer, it takes strong math skills to see how an AGI can overtake its own utility function and is therefore dangerous. Eliezer made the point that it's very difficult to explain to people who are invested in their AGI design that it's dangerous, because that part needs complicated math.

It's easy to say in the abstract that some AGI might become UFAI, but it's hard to do the assessment for any individual proposal.

Replies from: BaconServ
comment by BaconServ · 2013-10-28T21:47:51.225Z · LW(p) · GW(p)

Politics is the mindkiller.

Really, it's not. Tons of people discuss politics without getting their briefs in a knot about it. It's only people who consider themselves highly intelligent who get mind-killed by it. The tendency to dismiss your opponent out of hand as unintelligent isn't that common elsewhere. People, by and large, are willing to seriously debate political issues. "Politics is the mind-killer" is the result of some pretty severe selection bias.

Even ignoring that, you've only stated that we should do our best to ensure it does not become a hot political issue. Widespread attention to the idea is still useful; if we can't get the concept to penetrate the academic circles where AI is likely to be developed, we're not yet mitigating the threat. A thousand angry letters demanding that this research "stop at once" or "address the issue of friendliness" aren't easy to ignore, no matter how bad you think the arguments for uFAI are.

You're not the only one expressing hesitation at the idea of widespread acceptance of uFAI risk, but unless you can really provide arguments for exactly what negative effects it is very likely to have, some of us are about ready to start a chain mail of our own volition. Your hesitation is understandable, but we need to do something to mitigate the risk here, or the risk just remains unmitigated and all we did was talk about it. People researching AI who've argued with Yudkowsky before and failed to be convinced might begrudge that Yudkowsky's argument has gained widespread attention, but if it pressures them to properly address Yudkowsky's arguments, then it has legitimately helped.

Replies from: ChristianKl
comment by ChristianKl · 2013-10-29T00:09:00.645Z · LW(p) · GW(p)

Really, it's not. Tons of people discuss politics without getting their briefs in a knot about it. It's only people who consider themselves highly intelligent who get mind-killed by it. The tendency to dismiss your opponent out of hand as unintelligent isn't that common elsewhere.

There's a reason why it's general advice to not talk about religion, sex and politics. It's not because the average person does well in discussing politics.

Dismissing your opponent out of hand as unintelligent isn't the only failure mode of political mindkill. I don't even think it's the most important one.

You're not the only one expressing hesitation at the idea of widespread acceptance of uFAI risk, but unless you can really provide arguments for exactly what negative effects it is very likely to have, some of us are about ready to start a chain mail of our own volition.

How effective do you consider chain letters to be at stopping NSA spying? Do you think they will be more effective at stopping the NSA from developing the AIs that analyse that data?

You're not the only one expressing hesitation at the idea of widespread acceptance of uFAI risk, but unless you can really provide arguments for exactly what negative effects it is very likely to have, some of us are about ready to start a chain mail of our own volition.

Take two important environmental challenges and look at Obama's first term as president. One is limiting CO2 emissions. The second is limiting mercury pollution.

The EPA under Obama was very effective at limiting mercury pollution but not at limiting CO2 emissions.

CO2 emissions are a very politically charged issue with a lot of mindkill on both sides, while mercury pollution isn't. The people who pushed mercury pollution regulation won, not because they wrote a lot of letters.

Your hesitation is understandable, but we need to do something to mitigate the risk here, or the risk just remains unmitigated and all we did was talk about it.

If you want to do something, you can earn to give and give money to MIRI.

People researching AI who've argued with Yudkowsky before and failed to be convinced might begrudge that Yudkowsky's argument has gained widespread attention, but if it pressures them to properly address Yudkowsky's arguments, then it has legitimately helped.

You don't get points for pressuring people to address arguments. That doesn't prevent an UFAI from killing you.

UFAI is an important problem but we probably don't have to solve it in the next 5 years. We do have some time to do things right.

Replies from: BaconServ
comment by BaconServ · 2013-10-29T00:53:46.134Z · LW(p) · GW(p)

How effective do you consider chain letters to be at stopping NSA spying? Do you think they will be more effective at stopping the NSA from developing the AIs that analyse that data?

NSA spying isn't a chain letter topic that is likely to succeed, no. A strong AI chain letter that makes itself sound like it's just against NSA spying doesn't seem like an effective approach. The intent of a chain letter about strong AI is that all such projects are a danger. If people come to the conclusion that the NSA is likely to develop an AI while being aware of the danger of uFAI, then they would write letters or seek to start a movement to ensure that any AI the NSA—or any government organization, for that matter—builds would be made friendly to the best of their abilities. The NSA doesn't need to be mentioned in the uFAI chain mail in order for any NSA AI projects to be forced to comply with friendliness principles.

If you want to do something, you can earn to give and give money to MIRI.

That is not a valid path if MIRI is willfully ignoring valid solutions.

You don't get points for pressuring people to address arguments. That doesn't prevent an UFAI from killing you.

It does if the people addressing those arguments learn/accept the danger of unfriendliness in being pressured to do so.

We probably don't have to solve it in the next 5 years.

Five years may be the time it takes for the chain mail to effectively popularize the issue to the point where the pressure is on to ensure friendliness, whether we solve it decades from then or not. What is your estimate for when uFAI will be created if MIRI's warning isn't properly heeded?

Replies from: ChristianKl
comment by ChristianKl · 2013-10-29T14:27:24.066Z · LW(p) · GW(p)

If people come to the conclusion that the NSA is likely to develop an AI while being aware of the danger of uFAI, then they would write letters or seek to start a movement to ensure that any AI the NSA—or any government organization, for that matter—builds would be made friendly to the best of their abilities.

I think your idea of a democracy in which letter writing is the way to create political change just doesn't accurately describe the world we live in.

Five years may be the time it takes for the chain mail to effectively popularize the issue to the point where the pressure is on to ensure friendliness, whether we solve it decades from then or not. What is your estimate for when uFAI will be created if MIRI's warning isn't properly heeded?

If I remember right, the median LessWrong prediction is that the singularity happens after 2100. It might happen sooner. I think 30 years is a valid time frame for FAI strategy.

That timeframe is long enough to invest in rationality movement building.

That is not a valid path if MIRI is willfully ignoring valid solutions.

Not taking the time to respond in detail to every suggestion can be a valid strategy, especially for a post that gets voted down to -3. People voted it down, so it's not ignored. If MIRI wouldn't respond to a highly upvoted solution on LessWrong, then I would agree that's cause for concern.

comment by passive_fist · 2013-10-29T18:46:46.845Z · LW(p) · GW(p)

Based on my (subjective and anecdotal, I'll admit) personal experiences, I think it would be bad. Look at climate change.

Replies from: BaconServ
comment by BaconServ · 2013-10-29T23:41:02.049Z · LW(p) · GW(p)

Is there something wrong with climate change in the world today? Yes, it's hotly debated by millions of people, a super-majority of them being entirely unqualified to even have an opinion, but is this a bad thing? Would less public awareness of the issue of climate change have been better? What differences would there be? Would organizations be investing in "green" and alternative energy if not for the publicity surrounding climate change?

It's easy to look back after the fact and say, "The market handled it!" But the truth is that the publicity and the corresponding opinions of thousands of entrepreneurs is part of that market.

Looking at the two markets:

  1. MIRI's warning of uFAI is popularized.
  2. MIRI's warning of uFAI continues in obscurity.

The latter just seems a ton less likely to mitigate uFAI risks than the former.

Replies from: passive_fist
comment by passive_fist · 2013-10-30T00:28:46.309Z · LW(p) · GW(p)

The failure mode that I'm most concerned about is overreaction followed by a backlash of dismissal. If that happened, the end result would be far worse than obscurity.

comment by [deleted] · 2013-11-10T14:46:26.592Z · LW(p) · GW(p)

Nastier issue: the harder argument of convincing people that UFAI is an avoidable risk. If you can't convince people they've got a realistic chance (i.e., one they would gamble on, given the possible benefits of FAI) of winning this issue, then it doesn't matter how informed they are.

See: Juergen Schmidhuber's interview on this very website, where he basically says, "We're damn near AI in my lab, and yes, it is a rational optimization process," followed by, "We see no way to prevent the paper-clipping of humanity whatsoever, so we stopped giving a damn and just focus on doing our research."

comment by drethelin · 2013-10-28T16:33:27.737Z · LW(p) · GW(p)

This is what CFAR is for.

Replies from: maia
comment by maia · 2013-10-28T20:32:05.841Z · LW(p) · GW(p)

Dang, and here I was thinking they were trying to help me improve my life.

Replies from: drethelin
comment by drethelin · 2013-10-29T02:31:10.421Z · LW(p) · GW(p)

through ponies!

comment by Error · 2013-10-30T02:44:15.430Z · LW(p) · GW(p)

This post makes me wonder if the relevant information could be compressed into a series of self-contained videos along the lines of MinutePhysics. So far as I can tell most people find video more accessible. (I don't, but I'm an outlier, like most here)

I'm going to guess it's impossible, but I'm not sure if it's Shut Up and Do the Impossible impossible or Just Lose Hope Already impossible.

comment by ChristianKl · 2013-10-28T16:32:34.875Z · LW(p) · GW(p)

HPMOR could end with Harry destroying the world through an UFAI. The last chapters already pointed to Harry destroying the world.

Strategically, that seems to be the best choice. HPMOR is more viral than some technical document. There's already been effort invested in getting a lot of people to read HPMOR.

People bond with the characters. Ending the book with "now everyone is dead because an AGI went FOOM" lets people take that scenario seriously, and that's exactly the right time to tell them: "Hey, this scenario could also happen in our world, so let's do something to prevent it from happening."

Replies from: shminux, Mitchell_Porter, ArisKatsaris, fubarobfusco, gattsuru, ChrisHallquist, TheOtherDave
comment by Shmi (shminux) · 2013-10-28T17:59:31.426Z · LW(p) · GW(p)

HPMOR could end with Harry destroying the world through an UFAI.

I would consider it probably the worst possible ending for HPMoR. I assume that Eliezer is smart enough to avoid overt propaganda.

Replies from: Scott Garrabrant
comment by Scott Garrabrant · 2013-10-28T18:03:58.788Z · LW(p) · GW(p)

What do you mean, "smart enough"? You think that ending would do harm to FAI?

Replies from: shminux
comment by Shmi (shminux) · 2013-10-28T18:09:12.501Z · LW(p) · GW(p)

It would likely "do harm" to the story and consequently reduce its appeal and influence.

comment by Mitchell_Porter · 2013-10-29T00:16:38.411Z · LW(p) · GW(p)

Even more people have read the Bible, the Quran, and the Vedas, so why not put out pamphlets in which Jesus, Muhammad and Krishna discuss AGI?

Replies from: ChristianKl, Lumifer
comment by ChristianKl · 2013-10-29T00:37:21.363Z · LW(p) · GW(p)

I would be interested in reading them.

comment by Lumifer · 2013-10-29T00:51:59.084Z · LW(p) · GW(p)

why not put out pamphlets in which Jesus, Muhammad and Krishna discuss AGI?

Jesus: We excel at absorbing external influences and have no problems with setting up new cults (just look at Virgin Mary) -- so we'll just make a Holy Quadrinity! Once you go beyond monotheism there's no good reason to stop at three...

Muhammad: Ah, another prophet of Allah! I said I was the last but maybe I was mistaken about that. But one prophet more, one prophet less -- all is in the hands of Allah.

Krishna: Meh, Kali is more impressive anyways. Now where are my girls?

Replies from: BaconServ
comment by BaconServ · 2013-10-29T01:08:11.434Z · LW(p) · GW(p)

That would probably upset many existing Christians. Clearly Jesus' second coming is in AI form.

Replies from: Lumifer
comment by Lumifer · 2013-10-29T01:24:13.232Z · LW(p) · GW(p)

Robot Jesus! :-) And rapture is clearly just an upload.

comment by ArisKatsaris · 2013-10-28T18:58:14.116Z · LW(p) · GW(p)

HPMOR could end with Harry destroying the world through an UFAI.

No, it couldn't.

Replies from: ChristianKl
comment by ChristianKl · 2013-10-28T19:16:12.154Z · LW(p) · GW(p)

There are multiple claims in the book that Harry will destroy the world. It starts in the first chapter with "The world will end". Interestingly, that wasn't there when the chapter was first published, but was added retrospectively.

Creating an AI in that world is just a matter of creating a magical item. Harry knows how to make them self-aware. Harry knows that magical creatures like trolls constantly self-modify through magic. Harry is into inventing new powerful spells.

All the pieces for building an AGI that goes FOOM are there in the book.

Replies from: ArisKatsaris
comment by ArisKatsaris · 2013-10-28T19:29:10.582Z · LW(p) · GW(p)

All the pieces for building an AGI that goes FOOM are there in the book.

I assign 2% probability to this scenario. What probability do you assign?

Replies from: ChristianKl
comment by ChristianKl · 2013-10-28T19:34:03.312Z · LW(p) · GW(p)

Given that the pieces were there the last time I read it, p=.99 for that claim.

The more interesting claim is that an AGI actually goes FOOM. I say p=.65.

Replies from: ArisKatsaris
comment by ArisKatsaris · 2013-10-28T19:53:01.195Z · LW(p) · GW(p)

The more interesting claim is that an AGI actually goes FOOM.

Yeah, that was the claim I meant.

I say p=.65.

Would you be willing to bet on this? I'd be willing to bet 2 of my dollars against 1 of yours, that no AGI will go FOOM in the remainder of the HPMoR story (for a maximum of 200 of my dollars vs 100 of yours)

Replies from: gwern, ChristianKl
comment by gwern · 2013-10-28T22:50:17.895Z · LW(p) · GW(p)

I'd be willing to bet 2 of my dollars against 1 of yours, that no AGI will go FOOM in the remainder of the HPMoR story (for a maximum of 200 of my dollars vs 100 of yours)

Even in early 2012, I didn't think 2:1 was the odds for an AGI fooming in MoR...

How would you like to bet 1 of your dollars against 3 of my dollars that an AGI will go FOOM? Up to a max of 120 of my dollars and 40 of yours; i.e., if an AGI goes FOOM, I pay you $120 and if it doesn't, you pay me $40. (Payment through Paypal.) Given your expressed odds, this should look like a good deal to you.
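
A minimal sketch of the arithmetic behind "this should look like a good deal to you", using only the numbers stated in this exchange (the variable names are purely illustrative):

```python
# Rough expected-value check of the offered bet, from ChristianKl's point of view,
# using only the figures quoted in this thread (illustrative, not from the original comment).
p_foom = 0.65          # ChristianKl's stated probability that an AGI goes FOOM in MoR
win, lose = 120, 40    # he receives $120 if it happens, pays $40 if it doesn't

ev = p_foom * win - (1 - p_foom) * lose
print(ev)              # ~64 dollars: positive by his own numbers, hence "a good deal"
```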

Replies from: ArisKatsaris
comment by ArisKatsaris · 2013-10-28T23:10:41.068Z · LW(p) · GW(p)

i.e., if an AGI goes FOOM, I pay you $120 and if it doesn't, you pay me $40. (Payment through Paypal.) Given your expressed odds, this should look like a good deal to you.

I said I assign 2% probability to an AGI going FOOM in the story. So how would this be a good deal for me?

The odds I offered to ChristianKl were meant to express a middle ground between the odds I expressed (2%) and the odds he expressed (65%), so that the bet would seem about equally profitable to both of us, given our stated probabilities.

Replies from: gwern
comment by gwern · 2013-10-28T23:25:27.970Z · LW(p) · GW(p)

Bah! Fine then, we won't bet. IMO, you should have offered more generous terms. If your true probability is 2%, then that's an odds against of 1:49, while his 65% would be 1:0.53, if I'm cranking the formula right. So a 1:2 doesn't seem like a true split.
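
A minimal sketch of the probability-to-odds conversion being "cranked" here, odds against = (1 − p)/p, using only the probabilities stated above (the helper function is illustrative):

```python
# Odds against an event of probability p, expressed as "1 : x".
def odds_against(p):
    return (1 - p) / p

print(odds_against(0.02))   # ~49   -> 1:49, from the 2% estimate
print(odds_against(0.65))   # ~0.54 -> roughly the 1:0.53 quoted above
```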

Replies from: ArisKatsaris
comment by ArisKatsaris · 2013-10-29T14:58:42.056Z · LW(p) · GW(p)

You are probably right about how it's not a true split -- I just did a stupid "add and divide by 2" on the percentages, but it doesn't really work like that. He would anticipate losing once every 3 times, but given my percentages I anticipated losing once every 50 times. (I'm not very mathy at all.)

comment by ChristianKl · 2013-10-28T20:19:47.713Z · LW(p) · GW(p)

Would you be willing to bet on this? I'd be willing to bet 2 of my dollars against 1 of yours, that no AGI will go FOOM in the remainder of the HPMoR story (for a maximum of 200 of my dollars vs 100 of yours)

At the moment I unfortunately don't have enough cash to invest in betting projects.

Additionally, I don't know Eliezer personally, and there are people on LessWrong who do and who might have access to nonpublic information. As a result it's not a good topic for betting money.

Replies from: gwern
comment by gwern · 2013-10-28T22:20:01.534Z · LW(p) · GW(p)

At the moment I unfortunately don't have enough cash to invest in betting projects.

Fortunately, that's why we have PredictionBook! Looking through my compilation of predictions (http://www.gwern.net/hpmor-predictions), I see we already have two relevant predictions:

(I've added a new, more general one as well.)

Replies from: ChristianKl
comment by ChristianKl · 2013-10-29T03:26:25.127Z · LW(p) · GW(p)

I added my prediction to that.

comment by fubarobfusco · 2013-10-29T17:20:47.482Z · LW(p) · GW(p)

In fiction, deus (or diabolus) ex machina is considered an anti-pattern.

comment by gattsuru · 2013-10-28T16:49:25.296Z · LW(p) · GW(p)

That strikes me as incredibly likely to backfire. Most obviously, a work of more than half a million words is a little much as an introductory piece, especially with things like the War of the Three Armies (because Death Note wasn't complicated enough!). Media where our heroes destroy a planet also tend to have issues with word of mouth when not a comedy or written by Tomino.

More subtly, there are some serious criticisms of the idea of the Singularity, and more generally of transhumanism, which rest on things that would be obviated in HPMoR by the Harry Potter series having started as a fantasy series for young teens, and by the genre conventions of fantasy, rather than by the strength of MIRI's arguments. Many of these criticisms are not terribly strong. They are still shouted as if strong AI were Rumpelstiltskin, unable to stand the sound of an oddly formed name, and HPMoR would have to be twisted very hard to counter them.

Replies from: ChristianKl
comment by ChristianKl · 2013-10-28T16:57:37.815Z · LW(p) · GW(p)

A lot of people think of strong AI as being like C-3PO from Star Wars. Science fiction has the power to give people mental models even when it isn't realistic.

The magical environment of the Matrix movies shapes how people think about the simulation argument.

Replies from: gattsuru
comment by gattsuru · 2013-10-28T17:18:17.565Z · LW(p) · GW(p)

Very true. I'd recommend against using Star Wars as a setting for cautionary tales about the Singularity, as well. The Harry Potter setting is just particularly bad, because in it we've already seen methods for producing human-intelligence artificial constructs that think just like a human. If Rationalist!Harry ends up having the solar system wallpapered with smiley faces, it's a lot less believable that he did it because The Machine Doesn't Care when quite a number of other machines in the setting demonstrably do care.

You'll have to fight assumptions like metaphysical dualism or what limitations self-reinforcing processes might have, no matter what you do, because those mental models apply in fairly broad strokes, but it's a lot easier to do so when the setting isn't fighting you at the same time.

Replies from: ChristianKl
comment by ChristianKl · 2013-10-28T21:18:17.603Z · LW(p) · GW(p)

You'll have to fight assumptions like metaphysical dualism or what limitations self-reinforcing processes might have, no matter what you do, because those mental models apply in fairly broad strokes, but it's a lot easier to do so when the setting isn't fighting you at the same time.

I don't think that you have to fight assumptions of metaphysical dualism. I think the people who don't believe in UFAI as a risk on that basis are not the dangerous ones who might develop an AGI.

Replies from: gattsuru
comment by gattsuru · 2013-10-29T16:39:38.417Z · LW(p) · GW(p)

That's an appealing thought, but I'm not sure it's a true one.

For one, if we're talking about appealing to general audiences, many folk won't be trying to develop an AGI themselves, but will still be relevant to our interests. Thinking that AGIs cannot invent because they lack souls, or that AGIs will be friendly, if annoying, golden translation droids, may be inconsistent with writing evolutionary algorithms, but is not necessarily inconsistent with having investment or political capital.

At a deeper level, a lot of folk do hold such beliefs and simultaneously have inconsistent belief structures, which may still leave them dangerous. It is demonstrably possible to have incorrect beliefs about evolution yet run a PCR, or to think it's easy to preserve semantic significance but also be a computer programmer. It's tempting to dismiss people who hold irrational beliefs, since rationality strongly correlates with long-term success, but from an absolute safety perspective that gets increasingly risky.

Replies from: ChristianKl
comment by ChristianKl · 2013-10-29T23:46:45.818Z · LW(p) · GW(p)

You need a bit more to develop an AGI than running a PCR protocol that someone else invented. I don't think you can develop an AGI when you think AGI is impossible due to metaphysical dualism.

You can believe that humans have souls and still design AGIs that have minds but no souls, but you won't get far in developing an AGI with something like a mind if you think that task is impossible.

comment by ChrisHallquist · 2013-10-29T19:08:44.851Z · LW(p) · GW(p)

I don't expect AI itself to show up, but I think it's clear that in the story magic is serving as a sort of metaphor for AI, with Harry playing the role of an ambitious AI researcher: Harry wants to use magic to solve death and make everything perfect, but we've gotten a lot of warning that Harry's plans could go horribly wrong and possibly destroy the world.

Eliezer once mentioned he was considering a "solve this puzzle or the story ends sad" conclusion for HPMOR like he did for Three Worlds Collide. If Eliezer goes through with that, I expect the "sad" ending to be "Harry destroys the world." Or if Eliezer doesn't do that, he may just make it clear how Harry came very close to destroying the world before finding another solution.

Replies from: ChrisHallquist
comment by ChrisHallquist · 2013-10-29T19:10:32.239Z · LW(p) · GW(p)

EDIT: So how does Harry almost destroy the world? My own personal theory is "conservation law arbitrage." Or maybe some plan involving Dementors going horribly wrong.

comment by TheOtherDave · 2013-10-28T19:17:40.470Z · LW(p) · GW(p)

This comment from a few years back and the associated discussion seem vaguely relevant.