against "AI risk"

post by Wei Dai (Wei_Dai) · 2012-04-11T22:46:10.533Z · LW · GW · Legacy · 91 comments

Why does SI/LW focus so much on AI-FOOM disaster, with apparently much less concern for things like

  • bio/nano-tech disaster
  • Malthusian upload scenario
  • highly destructive war
  • bad memes/philosophies spreading among humans or posthumans and overriding our values
  • upload singleton ossifying into a suboptimal form compared to the kind of superintelligence that our universe could support

Why, for example, is lukeprog's strategy sequence titled "AI Risk and Opportunity", instead of "The Singularity, Risks and Opportunities"? Doesn't it seem strange to assume that both the risks and opportunities must be AI related, before the analysis even begins? Given our current state of knowledge, I don't see how we can make such conclusions with any confidence even after a thorough analysis.

SI/LW sometimes gives the impression of being a doomsday cult, and it would help if we didn't concentrate so much on a particular doomsday scenario. (Are there any doomsday cults that say "doom is probably coming, we're not sure how but here are some likely possibilities"?)

91 comments

Comments sorted by top scores.

comment by CarlShulman · 2012-04-11T23:46:35.357Z · LW(p) · GW(p)

Speaking only for myself, most of the bullets you listed are forms of AI risk by my lights, and the others don't point to comparably large, comparably neglected areas in my view (and after significant personal efforts to research nuclear winter, biotechnology risk, nanotechnology, asteroids, supervolcanoes, geoengineering/climate risks, and non-sapient robotic weapons). Throwing all x-risks and the kitchen sink in, regardless of magnitude, would be virtuous in a grand overview, but it doesn't seem necessary when trying to create good source materials in a more neglected area.

bio/nano-tech disaster

Not AI risk.

I have studied bio risk (as has Michael Vassar, who has even done some work encouraging the plucking of low-hanging fruit in this area when opportunities arose), and it seems to me that it is both a smaller existential risk than AI, and nowhere near as neglected. The same picture emerges from the experts in this survey, from my conversations with other experts in the field, and from reading their work.

Bio existential risk seems much smaller than bio catastrophic risk (and not terribly high in absolute terms), while AI catastrophic and x-risk seem close in magnitude, and much larger than bio x-risk. Moreover, vastly greater resources go into bio risks, e.g. Bill Gates is interested and taking it up at the Gates Foundation, governments pay attention, and there are more opportunities for learning (early non-extinction bio-threats can mobilize responses to guard against later ones).

This is in part because most folk are about as easily mobilized against catastrophic as existential risks (e.g. Gates thinks that AI x-risk is larger than bio x-risk, but prefers to work on bio rather than AI because he thinks bio catastrophic risk is larger, at least in the medium-term, and more tractable). So if you are especially concerned about x-risk, you should expect bio risk to get more investment than you would put into it (given the opportunity to divert funds to address other x-risks).

Nanotech x-risk would seem to come out of mass-producing weapons that kill survivors of an all-out war (which leaves neither side standing), like systems that could replicate in the wild and destroy the niche of primitive humans, really numerous robotic weapons that would hunt down survivors over time, and the like. The FHI survey gives it a lot of weight, but after reading the work of the Foresight Institute and the Center for Responsible Nanotechnology (among others) over the last few decades since Drexler's books, I am not very impressed with the magnitude of the x-risk here or the existence of distinctive high-leverage ways to improve outcomes in this area, and the Foresight Institute continues to operate in any case (not to mention Eric Drexler visiting FHI this year).

Others disagree (Michael Vassar has worked with the CRN, and Eliezer often names molecular nanotechnology as the x-risk he would move to focus on if he knew that AI was impossible), but that's my take.

Malthusian upload scenario

This is AI risk. Brain emulations are artificial intelligence by standard definitions, and in articles like Chalmers' "The Singularity: a Philosophical Analysis."

highly destructive war

It's hard to destroy all life with a war not involving AI, or the biotech/nanotech mentioned above. The nuclear winter experts have told me that they think x-risk from a global nuclear war is very unlikely conditional on such a war happening, and it doesn't seem that likely.

bad memes/philosophies spreading among humans or posthumans and overriding our values

There are already massive, massive, massive investments in tug-of-war over politics, norms, and values today. Shaping the conditions or timelines for game-changing technologies looks more promising to me than adding a few more voices to those fights. On the other hand, Eliezer has some hopes for education in rationality and critical thinking growing contagiously to shift some of those balances (not as a primary impact, and I am more skeptical). Posthuman value evolution does seem to sensibly fall under "AI risk," and shaping the development and deployment of technologies for posthumanity seems like a leveraged way to affect that.

upload singleton ossifying into a suboptimal form compared to the kind of superintelligence that our universe could support

AI risk again.

(Are there any doomsday cults that say "doom is probably coming, we're not sure how but here are some likely possibilities"?)

Probably some groups with a prophecy of upcoming doom, looking to every thing in the news as a possible manifestation.

Replies from: endoself, Wei_Dai, Wei_Dai, Dmytry, Turgurth, steven0461
comment by endoself · 2012-04-12T02:57:18.964Z · LW(p) · GW(p)

Are you including just the extinction of humanity in your definition of x-risk in this comment or are you also counting scenarios resulting in a drastic loss of technological capability?

Replies from: CarlShulman
comment by CarlShulman · 2012-04-12T03:00:05.134Z · LW(p) · GW(p)

I expect losses of technological capability to be recovered with high probability.

Replies from: JoshuaZ, army1987
comment by JoshuaZ · 2012-04-12T03:30:01.597Z · LW(p) · GW(p)

Why? This is highly non-obvious. To reach our current technological level, we had to use a lot of non-renewable resources. There's still a lot of coal and oil left, but the remaining coal and oil is harder to reach and much more technologically difficult to reliably use. That trend will only continue. It isn't obvious that if something set the tech level back to, say, 1600, we'd have the resources to return to our current technology level.

Replies from: CarlShulman
comment by CarlShulman · 2012-04-12T03:43:38.978Z · LW(p) · GW(p)

It's been discussed repeatedly here on Less Wrong, and in many other places. The weight of expert opinion is on recovery, and I think the evidence is strong. Most resources are more accessible in ruined cities than they were in the ground, and more expensive fossil fuels can be substituted for by biomass, hydropower, efficiency, and so forth. It looks like there was a lot of slack in human development, e.g. animal and plant breeding is still delivering good returns after many centuries, and humans have been adapting to civilization over the last several thousand years and would continue to become better adapted during a long period of low-fossil-fuel, near-industrial technology. And for many catastrophes knowledge from the previous civilization would be available to future generations.

Replies from: JoshuaZ
comment by JoshuaZ · 2012-04-12T13:25:35.653Z · LW(p) · GW(p)

It's been discussed repeatedly here on Less Wrong, and in many other places. The weight of expert opinion is on recovery

Can you give sources for this? I'm particularly interested in the claim about expert opinion, since there doesn't seem to be much discussion in the literature of this. Bostrom has mentioned it, but hasn't come to any detailed conclusion. I'm not aware of anyone else discussing it.

Most resources are more accessible in ruined cities than they were in the ground

Right. This bit has been discussed on LW before in the context of many raw metals. A particularly good example is aluminum, which is resource-intensive and technically difficult to refine, but is easy to reuse once it has been refined. Looking around for such discussion, I see that you and I discussed that here, but didn't discuss the power issue in general.

I think you are being optimistic about power. While hydropower and biomass can exist with minimal technology (in fact, the first US commercial power plant outside New York was hydroelectric), they both have severe limitations as power sources. Hydroelectric power can only be placed in limited areas, and large-scale grids are infrastructurally difficult and require a lot of technical coordination and know-how. That's why the US grids were separate little grids until pretty late. And using hydroelectric power would further restrict the locations where power can be produced, leading to much more severe inefficiencies in the grid (due to long-distance power transmission and the like). There's a good recent book, Maggie Koerth-Baker's "Before the Lights Go Out", which discusses the difficulties and complexities of electric grids, including the historical problems with running them. They are often underestimated.

Similarly, direct biomass is not generally as energy dense as coal or oil. You can't easily use biomass to power trains or airplanes. The technology to make synthetic oil was developed in the 1940s but it is inefficient, technically difficult, and requires a lot of infrastructure.

I also think you are overestimating how much can be done with efficiency at a low tech level. Many of the technologies that can be made more efficient (such as lightbulbs) require a fair bit of technical know-how to use the more efficient version. Thus for example, while fluorescent lights are not much more technically difficult than incandescents, CFLs are much more technically difficult.

And efficiency bites you a bit in another direction as well: if your technology is efficient enough, then you don't have as much local demand on the grid, and you don't get the benefits of economies of scale. This was historically a problem even when incandescent light bulbs were in use: in the first forty years of electrification, the vast majority of electric companies failed.

It looks like there was a lot of slack in human development, e.g. animal and plant breeding is still delivering good returns after many centuries

We're using much more careful and systematic methods of breeding now, and the returns are clearly diminishing: we're not domesticating new crops, just making existing ones marginally more efficient. The returns are only large because the same plants and animals are in such widespread use.

And for many catastrophes knowledge from the previous civilization would be available to future generations.

This is true for some catastrophes but not all, and I'm not at all sure it will be true for most. Most humans have minimal technical know-how beyond their own narrow areas. I'm curious to hear more about how you reach this conclusion.

Replies from: satt, Tyrrell_McAllister
comment by satt · 2012-04-16T21:06:12.803Z · LW(p) · GW(p)

This may be worth expanding into a discussion post; I can't remember any top-level posts devoted to this topic, and I reckon it's important enough to warrant at least one. Your line of argument seems more plausible to me than CarlShulman's (although that might change if CS can point to specific experts and arguments for why a technological reset could be overcome).

comment by Tyrrell_McAllister · 2012-04-12T16:23:23.271Z · LW(p) · GW(p)

Thus for example, while fluorescent lights are not much more technically difficult than incandescents, are much more technically difficult.

Is there a typo in this sentence?

Replies from: JoshuaZ
comment by JoshuaZ · 2012-04-12T17:07:32.729Z · LW(p) · GW(p)

Yes. Intended to be something like:

Thus for example, while fluorescent lights are not much more technically difficult than incandescents, they are much more technically difficult if one wants them to be cheaper and more efficient than incandescents.

comment by A1987dM (army1987) · 2012-04-12T12:02:12.537Z · LW(p) · GW(p)

On what timescale?

I find the focus on x-risks as defined by Bostrom (those from which Earth-originating intelligent life will never, ever recover) way too narrow. A situation in which 99% of humanity dies and the rest reverts to hunting and gathering for a few millennia before recovering wouldn't look much brighter than that -- let alone one in which humanity goes extinct but in (say) a hundred million years the descendants of (say) elephants create a new civilization. In particular, I can't see why we would prefer the latter to (say) a civilization emerging on Alpha Centauri -- so per the principle of charity I'll just pretend that instead of “Earth-originating intelligent life” he had said “descendants of present-day humans”.

Replies from: loup-vaillant
comment by loup-vaillant · 2012-04-12T22:51:25.814Z · LW(p) · GW(p)

It depends on what you value. I see four situations:

  • Early Singularity. Everyone currently living is saved.
  • Late Singularity. Nearly everyone currently living dies anyway.
  • Very late Singularity, or "Semi-crush". Everyone currently living dies, and most of our yet-to-be-born descendants (up to the second renaissance) will die as well. There is a point, however, where everyone is saved.
  • Crush. Everyone will die, now and forever. Plus, humanity dies with our sun.

If you most value those currently living, that's right, it doesn't make much difference. But if you care about the future of humanity itself, a Very Late Singularity isn't such a disaster.

Replies from: army1987
comment by A1987dM (army1987) · 2012-04-16T16:33:27.709Z · LW(p) · GW(p)

Now that I think about it, I care both about those currently living and about humanity itself, but with a small but non-zero discount rate (of the order of the reciprocal of the time humanity has existed so far). Also, I value humanity not only genetically but also memetically, so having people with a human genome but a Palaeolithic technocultural level surviving would be only slightly better for me than no one surviving at all.
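As a back-of-the-envelope reading of that rate (a rough sketch, assuming roughly 200,000 years for anatomically modern humans as the reference span):

$$ r \sim \frac{1}{2\times10^{5}\ \text{yr}} = 5\times10^{-6}\ \text{yr}^{-1}, \qquad e^{-rt}\Big|_{t=10^{4}\ \text{yr}} \approx 0.95, $$

so on this reading, lives ten thousand years from now would still carry about 95% of the weight of present lives.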

comment by Wei Dai (Wei_Dai) · 2012-04-12T00:11:06.084Z · LW(p) · GW(p)

Perhaps it's mainly a matter of perceptions, where "AI risk" typically brings to mind a particular doomsday scenario, instead of a spread of possibilities that includes posthuman value drift, which is also not helped by the fact that around here we talk much more about UFAI going FOOM than the other scenarios. Given this, do you think we should perhaps favor phrases like "Singularity-related risks and opportunities" where appropriate?

Replies from: CarlShulman
comment by CarlShulman · 2012-04-12T00:24:04.810Z · LW(p) · GW(p)

I have the opposite perception, that "Singularity" is worse than "artificial intelligence." If you want to avoid talking about FOOM, "Singularity" has more connotation of that than AI in my perception.

I'm also not sure exactly what you mean by the "single scenario" getting privileged, or where you would draw the lines. In the Yudkowsky-Hanson debate and elsewhere Eliezer talked about many separate posthuman AIs coordinating to divvy up the universe without giving humanity or humane values a share, about monocultures of seemingly separate AIs with shared values derived from a common ancestor, and so forth. Scenarios in which whole brain emulations come first and then invent AIs that race ahead of the WBEs were also discussed.

Replies from: Wei_Dai, Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-04-12T00:57:12.850Z · LW(p) · GW(p)

I have the opposite perception, that "Singularity" is worse than "artificial intelligence."

I see... I'm not sure what to suggest then. Anyone else have ideas?

I'm also not sure exactly what you mean by the "single scenario" getting privileged, or where you would draw the lines.

I think the scenario that "AI risk" tends to bring to mind is a de novo or brain-inspired AGI (excluding uploads) rapidly destroying human civilization. Here are a couple of recent posts along these lines and using the phrase "AI risk".

Replies from: steven0461
comment by steven0461 · 2012-04-12T01:04:27.797Z · LW(p) · GW(p)

"Posthumanity" or "posthuman intelligence" or something of the sort might be an accurate summary of the class of events you have in mind, but it sounds a lot less respectable than "AI". (Though maybe not less respectable than "Singularity"?)

comment by Wei Dai (Wei_Dai) · 2012-04-13T20:37:20.326Z · LW(p) · GW(p)

How about "Threats and Opportunities Associated With Profound Sociotechnological Change", and maybe shortened to "future-tech threats and opportunities" in informal use?

comment by Wei Dai (Wei_Dai) · 2012-04-12T02:40:57.252Z · LW(p) · GW(p)

Apparently it's also common to not include uploads in the definition of AI. For example, here's Eliezer:

Perhaps we would rather take some other route than AI to smarter-than-human intelligence -- say, augment humans instead? To pick one extreme example, suppose the one says: The prospect of AI makes me nervous. I would rather that, before any AI is developed, individual humans are scanned into computers, neuron by neuron, and then upgraded, slowly but surely, until they are super-smart; and that is the ground on which humanity should confront the challenge of superintelligence.

Replies from: CarlShulman
comment by CarlShulman · 2012-04-12T02:44:48.419Z · LW(p) · GW(p)

Yeah, there's a distinction between things targeting a broad audience, where people describe WBE as a form of AI, versus some "inside baseball" talk in which it is used to contrast against WBE.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-04-12T03:20:12.350Z · LW(p) · GW(p)

That paper was written for the book "Global Catastrophic Risks" which I assume is aimed at a fairly general audience. Also, looking at the table of contents for that book, Eliezer's chapter was the only one talking about AI risks, and he didn't mention the three listed in my post that you consider to be AI risks.

Do you think I've given enough evidence to support the position that many people, when they say or hear "AI risk", are either explicitly thinking of something narrower than your definition of "AI risk", or have not explicitly considered how to define "AI" but are still thinking of a fairly narrow range of scenarios?

Besides that, can you see my point that an outsider/newcomer who looks at the public materials put out by SI (such as Eliezer's paper and Luke's Facing the Singularity website) and typical discussions on LW would conclude that we're focused on a fairly narrow range of scenarios, which we call "AI risk"?

Replies from: CarlShulman
comment by CarlShulman · 2012-04-12T03:35:17.481Z · LW(p) · GW(p)

explicitly thinking of something narrower than your definition of "AI risk", or have not explicitly considered how to define "AI" but are still thinking of a fairly narrow range of scenarios?

Yes.

comment by Dmytry · 2012-04-12T04:20:06.989Z · LW(p) · GW(p)

Seems like a prime example of where to apply rationality: what are the consequences of trying to work on AI risk right now, versus on something else? Does AI risk work have a good payoff?

What of the historical cases? The one example I know of is this: http://www.fas.org/sgp/othergov/doe/lanl/docs1/00329010.pdf (the thermonuclear ignition of the atmosphere scenario). Can a bunch of people with little physics-related expertise do something about such risks >10 years in advance, beyond the usual anti-war effort? Bill Gates will work on AI risk when it becomes clear what to do about it.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-04-12T08:13:55.963Z · LW(p) · GW(p)

Can a bunch of people with little physics related expertise do something about such risks >10 years before?

Have you seen Singularity and Friendly AI in the dominant AI textbook?

Replies from: Dmytry
comment by Dmytry · 2012-04-12T08:42:43.449Z · LW(p) · GW(p)

I'm kind of dubious that you needed 'beware of destroying mankind' in a physics textbook to get Teller to check whether a nuke can cause thermonuclear ignition of the atmosphere or seawater, but if it is there, I guess it won't hurt.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-04-12T09:18:11.090Z · LW(p) · GW(p)

Here's another reason why I don't like "AI risk": it brings to mind analogies like physics catastrophes or astronomical disasters, and lets AI researchers think that their work is OK as long as it has little chance of immediately destroying Earth. But the real problem is how we build or become a superintelligence that shares our values, and given that this seems very difficult, any progress that doesn't contribute to the solution but brings forward the date by which we must solve it (or be stuck with something very suboptimal even if it doesn't kill us) is bad, and this includes AI progress that is not immediately dangerous.

ETA: I expanded this comment into a post here.

Replies from: Dmytry
comment by Dmytry · 2012-04-12T09:27:17.598Z · LW(p) · GW(p)

Well, there's this implied assumption that a super-intelligence that 'does not share our values' shares our domain of definition of the values. I can make a fairly intelligent proof generator, far beyond human capability if given enough CPU time; it won't share any values with me, not even the domain of applicability; the lack of shared values with it is so profound as to make it not do anything whatsoever in the 'real world' that I am concerned with. Even if it were meta-strategic to the point of potentially e.g. searching for ways to hack into a mainframe to gain extra resources to do the task 'sooner' by wall-clock time, it seems very dubious that by mere accident it would have proper symbol grounding, wouldn't wirehead (i.e. would privilege the solutions that don't involve just stopping said clock), etc. Same goes for other practical AIs, even the evil ones that would e.g. try to take over the internet.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-04-12T09:30:47.487Z · LW(p) · GW(p)

You're still falling into the same trap, thinking that your work is ok as long as it doesn't immediately destroy the Earth. What if someone takes your proof generator design, and uses the ideas to build something that does affect the real world?

Replies from: Dmytry
comment by Dmytry · 2012-04-12T09:44:05.616Z · LW(p) · GW(p)

You're still falling into the same trap, thinking that your work is ok as long as it doesn't immediately destroy the Earth. What if someone takes your proof generator design, and uses the ideas to build something that does affect the real world?

Well, let's say in 2022 we have a bunch of tools along the lines of automatic problem solving, unburdened by their own will (not because they were so designed but by simple omission of immense counterproductive effort). Someone with a bad idea comes around, downloads some open source software, and cobbles together some self-propelling 'thing' that is 'vastly superhuman' circa 2012. Keep in mind that we still have our tools that make us 'vastly superhuman' circa 2012, and I frankly don't see how 'automatic will', for lack of a better term, is contributing anything here that would make the fully automated system competitive.

Replies from: Wei_Dai, XiXiDu
comment by Wei Dai (Wei_Dai) · 2012-04-12T10:18:58.407Z · LW(p) · GW(p)

Well, one thing the self-willed superintelligent AI could do is read your writings, form a model of you, and figure out a string of arguments designed to persuade you to give up your own goals in favor of its goals (or just trick you into doing things that further its goals without you realizing it). (Or another human with superintelligent tools could do this as well.) Can you ask your "automatic problem solving tools" to solve the problem of defending against this, while not freezing your mind so that you can no longer make genuine moral/philosophical progress? If you can do this, then you've pretty much already solved the FAI problem, and you might as well ask the "tools" to tell you how to build an FAI.

Replies from: XiXiDu
comment by XiXiDu · 2012-04-12T11:10:54.767Z · LW(p) · GW(p)

Well, one thing the self-willed superintelligent AI could do is read your writings, form a model of you, and figure out a string of arguments designed to persuade you to give up your own goals in favor of its goals...

Does agency enable the AI to do so? If not, then why wouldn't a human being be able to do the same by using the AI in tool mode?

Can you ask your "automatic problem solving tools" to solve the problem of defending against this...

Just make it list equally convincing counter-arguments.

Replies from: Wei_Dai, Dmytry, TheOtherDave
comment by Wei Dai (Wei_Dai) · 2012-04-12T20:34:55.977Z · LW(p) · GW(p)

Does agency enable the AI to do so? If not, then why wouldn't a human being be able to do the same by using the AI in tool mode?

Yeah, I realized this while writing the comment: "(Or another human with superintelligent tools could do this as well.)" So this isn't a risk with self-willed AI per se. But note this actually makes my original point stronger, since I was arguing against the idea that progress on AI is safe as long as it doesn't have a "will" to act in the real world.

Just make it list equally convincing counter-arguments.

So every time you look at a (future equivalent of) website or email, you ask your tool to list equally convincing counter-arguments to whatever you're looking at? What does "equally convincing" mean? An argument that exactly counteracts the one that you're reading, leaving your mind unchanged?

Replies from: XiXiDu
comment by XiXiDu · 2012-04-13T09:57:57.237Z · LW(p) · GW(p)

So every time you look at a (future equivalent of) website or email, you ask your tool to list equally convincing counter-arguments to whatever you're looking at?

Sure, why not? I think IBM is actually planning to do this with IBM Watson. Once mobile phones become fast enough you can receive constant feedback about ideas and arguments you encounter.

For example, some commercial tells you that you can lose 10 pounds in 1 day by taking a pill. You then either ask your "IBM Oracle" or have it set up to give you automatic feedback. It will then tell you that there are no studies that indicate that something as advertised is possible and that it won't be healthy anyway. Or something along those lines.

I believe that in future it will be possible to augment everything with fact-check annotations.

But that's beside the point. The idea was that if you run the AI box experiment with Eliezer posing as a malicious AI trying to convince the gatekeeper to let it out of the box, and at the same time as a question answering tool using the same algorithms as the AI, then I don't think someone would let him out of the box. He would basically have to destroy his own arguments by giving unbiased answers about the trustworthiness of the boxed agent and the possible consequences of letting it out of the box. At the very best, the AI in agent mode would have to contradict the tool mode version and thereby reveal that it is dishonest and not trustworthy.

Replies from: khafra
comment by khafra · 2012-05-11T14:58:30.004Z · LW(p) · GW(p)

So every time you look at a (future equivalent of) website or email, you ask your tool to list equally convincing counter-arguments to whatever you're looking at?

Sure, why not?

When I'm feeling down and my mom sends me an email trying to cheer me up, that'll be a bit of a bummer.

comment by Dmytry · 2012-04-12T11:31:22.897Z · LW(p) · GW(p)

Yep. The majorly awesome scenario degrades into ads vs. adblock when you consider everything in the future, not just the self-willed robot. As a matter of fact, a lot of work is put into constructing convincing strings of audio and visual stimuli, and into ignoring those strings.

Replies from: David_Gerard
comment by David_Gerard · 2012-04-12T12:12:06.188Z · LW(p) · GW(p)

Superstimuli and the Collapse of Western Civilization. Using such skills to manipulate other humans appears to be what we grew intelligence for, of course. As I note, western civilisation is already basically made of the most virulent toxic memes we can come up with. In the noble causes of selling toothpaste and car insurance and, of course, getting laid. It seems to be what we do now we've more or less solved the food and shelter problems.

comment by TheOtherDave · 2012-04-12T14:12:47.328Z · LW(p) · GW(p)

They'd probably have to be more convincing, since convincing a human being out of a position they already hold is usually a more difficult task than convincing them to hold the position in the first place.

Replies from: XiXiDu
comment by XiXiDu · 2012-04-12T15:13:23.279Z · LW(p) · GW(p)

They'd probably have to be more convincing, since convincing a human being out of a position they already hold is usually a more difficult task than convincing them to hold the position in the first place.

Suppose I have a superhuman answering machine on one side -- a tool that just lists a number of answers to my query, like a superhuman Google -- and on the other side I have the same tool in agent mode. Why would I be more convinced by the agent mode output?

An agent has an incentive to trick me. While the same algorithm, minus the agency module, will just output unbiased answers to my queries.

If the answers between the tool and agent mode differ, then I naturally believe the tool mode output.

If, for example, the agent mode were going to drivel something about acausal trade and the tool mode would just output some post by Eliezer Yudkowsky explaining why I shouldn't let the AI out of the box, then how could the agent mode possibly be more convincing? Especially since putting the answering algorithm into agent mode shouldn't improve the answers.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-04-12T15:28:15.649Z · LW(p) · GW(p)

why would I be more convinced by the agent mode output?

You wouldn't, necessarily. Nor did I suggest that you would.

I also agree that if (AI in "agent mode") does not have any advantages over ("tool mode" plus human agent), then there's no reason to expect its output to be superior, though that's completely tangential to the comment you replied to.

That said, it's not clear to me that (AI in "agent mode") necessarily lacks advantages over ("tool mode" plus human agent).

Replies from: XiXiDu
comment by XiXiDu · 2012-04-12T16:30:13.854Z · LW(p) · GW(p)

...though that's completely tangential to the comment you replied to.

I don't think that anyone with the slightest idea that an AI in agent mode could have malicious intentions, and therefore give biased answers, wouldn't be as easily swayed by counter-arguments made by a similarly capable algorithm.

I mean, we shouldn't assume an idiot gatekeeper who has never heard of anything we're talking about here. So the idea that an AI in agent mode could brainwash someone to the extent that it afterwards takes even stronger arguments to undo it seems rather far-fetched. (ETA: What's it supposed to say? That the tool uses the same algorithms as itself but is somehow wrong in claiming that the AI in agent mode tries to brainwash the gatekeeper?)

The idea is that given a sufficiently strong AI in tool mode, it might be possible to counter any attempt to trick a gatekeeper. And in the case that the tool mode agrees, then it probably is a good idea to let the AI out of the box. Although anyone familiar with the scenario would probably rather assume a systematic error elsewhere, e.g. a misinterpretation of one's questions by the AI in tool mode.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-04-12T16:47:51.367Z · LW(p) · GW(p)

I don't think that anyone with the slightest idea that an AI in agent mode could have malicious intentions, and therefore give biased answers, wouldn't be as easily swayed by counter-arguments made by a similarly capable algorithm.

Ah, I see. Sure, OK, that's apposite. Thanks for clarifying that.

I disagree with your prediction.

comment by XiXiDu · 2012-04-12T11:06:09.314Z · LW(p) · GW(p)

Keep in mind that we still have our tools that make us 'vastly superhuman' circa 2012, and I frankly don't see how 'automatic will', for lack of a better term, is contributing anything here that would make the fully automated system competitive.

This is actually one of Greg Egan's major objections: that superhuman tools come first, and that adding artificial agency won't make such systems competitive against humans augmented with those tools. Further, you can't apply any work done to ensure that an artificial agent is friendly to augmented humans.

comment by Turgurth · 2012-04-12T05:44:57.169Z · LW(p) · GW(p)

I have a few questions, and I apologize if these are too basic:

1) How concerned is SI with existential risks vs. how concerned is SI with catastrophic risks?

2) If SI is solely concerned with x-risks, do I assume correctly that you also think about how cat. risks can relate to x-risks (certain cat. risks might raise or lower the likelihood of other cat. risks, certain cat. risks might raise or lower the likelihood of certain x-risks, etc.)? It must be hard avoiding the conjunction fallacy! Or is this sort of thing more what the FHI does?

3) Is there much tension in SI thinking between achieving FAI as quickly as possible (to head off other x-risks and cat. risks) vs. achieving FAI as safely as possible (to head off UFAI), or does one of these goals occupy significantly more of your attention and activities?

Edited to add: thanks for responding!

Replies from: CarlShulman
comment by CarlShulman · 2012-04-12T06:03:11.042Z · LW(p) · GW(p)

How concerned is SI with existential risks vs. how concerned is SI with catastrophic risks?

Different people have different views. For myself, I care more about existential risks than catastrophic risks, but not overwhelmingly so. A global catastrophe would kill me and my loved ones just as dead. So from the standpoint of coordinating around mutually beneficial policies, or "morality as cooperation," I care a lot about catastrophic risk affecting current and immediately succeeding generations. However, when I take a "disinterested altruism" point of view x-risk looms large: I would rather bring 100 trillion fantastic lives into being than improve the quality of life of a single malaria patient.

If SI is solely concerned with x-risks, do I assume correctly that you also think about how cat. risks can relate to x-risks

Yes.

Or is this sort of thing more what the FHI does?

They spend more time on it, relatively speaking.

FAI as quickly as possible (to head off other x-risks and cat. risks) vs. achieving FAI as safely as possible (to head off UFAI)

Given that powerful AI technologies are achievable in the medium to long term, UFAI would seem to me to be a rather large share of the x-risk, and still a big share of the catastrophic risk, so that speedups are easily outweighed by safety gains.

Replies from: multifoliaterose
comment by multifoliaterose · 2012-04-14T00:07:59.560Z · LW(p) · GW(p)

However, when I take a "disinterested altruism" point of view x-risk looms large: I would rather bring 100 trillion fantastic lives into being than improve the quality of life of a single malaria patient.

What's your break-even point for "bring 100 trillion fantastic lives into being with probability p" vs. "improve the quality of life of a single malaria patient", and why?

Replies from: CarlShulman
comment by CarlShulman · 2012-04-14T00:20:59.892Z · LW(p) · GW(p)

It depends on the context (probability distribution over number and locations and types of lives), with various complications I didn't want to get into in a short comment.

Here's a different way of phrasing things: if I could trade off probability p1 of increasing the income of everyone alive today (but not providing lasting benefits into the far future) to at least $1,000 per annum with basic Western medicine for control of infectious disease, against probability p2 of a great long-term posthuman future with colonization, I would prefer p2 even if it was many times smaller than p1. Note that those in absolute poverty are a minority of current people, a tiny minority of the people who have lived on Earth so far, their life expectancy is a large fraction of that of the rich, and so forth.
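One minimal way to formalize that preference (the utilities here are placeholders, not figures specified in the comment): the trade favors the long-term option whenever

$$ p_2\, U_{\text{posthuman future}} \;>\; p_1\, U_{\text{income floor}} \quad\Longleftrightarrow\quad \frac{p_1}{p_2} \;<\; \frac{U_{\text{posthuman future}}}{U_{\text{income floor}}}, $$

so p2 can be many times smaller than p1 as long as the value placed on the long-term posthuman future exceeds the value of the near-term benefit by an even larger factor.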

comment by steven0461 · 2012-04-12T00:21:55.111Z · LW(p) · GW(p)

Nanotech x-risk would seem to come out of mass-producing weapons that kill survivors of an all-out war (which leaves neither side standing), like systems that could replicate in the wild and destroy the niche of primitive humans, really numerous robotic weapons that would hunt down survivors over time, and the like.

What about takeover by an undesirable singleton? Also, if nanotechnology enables AI or uploads, that's an AI risk, but it might still involve unique considerations we don't usually think to talk about. The opportunities to reduce risk here have to be very small to justify LessWrong's ignoring the topic almost entirely, as it seems to me that it has. The site may well have low-hanging conceptual insights to offer that haven't been covered by CRN or Foresight.

Replies from: CarlShulman
comment by CarlShulman · 2012-04-12T00:34:21.474Z · LW(p) · GW(p)

to justify LessWrong's ignoring the topic

That's a much lower standard than "should Luke make this a focus when trading breadth vs speed in making his document". If people get enthused about that, they're welcome to. I've probably put 50-300 hours (depending on how inclusive a criterion I use for relevant hours) into the topic, and saw diminishing returns. If I overlap with Eric Drexler or such folk at a venue I would inquire, and I would read a novel contribution, but I'm not going to be putting much into it given my alternatives soon.

Replies from: steven0461
comment by steven0461 · 2012-04-12T01:09:22.048Z · LW(p) · GW(p)

I agree that it's a lower standard. I didn't mean to endorse Wei's claims in the original post, certainly not based on nanotech alone. If you don't personally think it's worth more of your time to pay attention to nanotech, I'm sure you're right, but it still seems like a collective failure of attention that we haven't talked about it at all. You'd expect some people to have a pre-existing interest. If you ever think it's worth it to further describe the conclusions of those 50-300 hours, I'd certainly be curious.

Replies from: CarlShulman
comment by CarlShulman · 2012-04-12T01:12:12.972Z · LW(p) · GW(p)

I'll keep that in mind.

comment by XiXiDu · 2012-04-12T10:46:48.071Z · LW(p) · GW(p)

SI/LW sometimes gives the impression of being a doomsday cult...

I certainly never had this impression. The worst that can be said about SI/LW is that some use inappropriately strong language with respect to risks from AI.

What I endorse:

  • Risks from AI (including WBE) are an underfunded research area and might currently be the best choice for anyone who seeks to do good by contributing money to an important cause.

What I think is unjustified:

  • This is crunch time. This is crunch time for the entire human species. And it’s crunch time not just for us, it’s crunch time for the intergalactic civilization whose existence depends on us.

I would have to assign a +90% probability to risks from AI posing an existential risk to endorse the second stance. I would further have to be highly confident that we will have to face the associated risks within this century and that the model uncertainty associated with my estimates is low.

You might argue that I would endorse the second stance if NASA told me that there was a 20% chance of an asteroid hitting Earth and that they need money to deflect it. I would indeed. But that seems like a completely different scenario to me.

That intuition might stem from the possibility that any estimates regarding risks from AI are very likely to be wrong, whereas in the example case of an asteroid collision one could be much more confident in the 20% estimate, as the latter is based on empirical evidence while the former is inference-based and therefore error-prone.

What I am saying is that I believe that SI is probably the top charity right now but that it is not as far ahead of other causes as some people here seem to think. I don't think that the evidence allows anyone to claim that trying to mitigate risks from AI is the best one could do and be highly confident about it. I think that it is currently the leading cause, but only slightly. And I am highly skeptical about using the expected value of a galactic civilization to claim otherwise.

Replies from: Rain
comment by Rain · 2012-04-13T17:19:23.447Z · LW(p) · GW(p)

I believe that SI is probably the top charity right now

I think that it is currently the leading cause

Charitable giving in the US in 2010: ~$290,890,000,000

SI's annual budget for 2010: ~$500,000

US Peace Corps volunteers in 2010 (3 years of service in a foreign country for sustenance wages): ~8,655

SI volunteers in 2010 (work from home or California hot spots): like 5?
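For scale, putting the first two figures side by side (simple arithmetic, not part of the original comment):

$$ \frac{\$5\times10^{5}}{\$2.9089\times10^{11}} \;\approx\; 1.7\times10^{-6}, $$

i.e. SI's 2010 budget was roughly 0.0002% of US charitable giving that year.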

Replies from: XiXiDu, thomblake
comment by XiXiDu · 2012-04-13T17:58:05.717Z · LW(p) · GW(p)

Charitable giving in the US in 2010: ~$290,890,000,000

SI's annual budget for 2010: ~$500,000

I am not sure what you are trying to tell me by those numbers. I think that there are a few valid criticisms regarding SI as an organization. It is also not clear that they could usefully spend more than ~$500,000 at this time.

In other words, even if risks from AI were by far (not just slightly) the most important cause, it is not clear that contributing money to SI is better than withholding funds from it at this point.

If, for example, they can't usefully spend more money at this point, and there is nothing with a decent probability of success that you yourself can do against AI risk right now, then you should move on to the next most important cause that needs funding and support it instead.

Replies from: Rain, Rain
comment by Rain · 2012-04-13T19:51:34.193Z · LW(p) · GW(p)

  1. You think SI is "probably the top charity right now".
  2. SI is smaller than the rounding error in US charitable giving.
  3. You think they might have more than enough money.

Those don't add up.

comment by Rain · 2012-04-13T18:33:26.099Z · LW(p) · GW(p)

I am not sure what you are trying to tell me by those numbers.

I think it's funny.

comment by thomblake · 2012-04-13T20:01:34.873Z · LW(p) · GW(p)

I think you misread "top charity" as "biggest charity" instead of "most important charity".

Replies from: Rain
comment by Rain · 2012-04-13T20:26:09.150Z · LW(p) · GW(p)

No, I didn't.

comment by Larks · 2012-04-12T01:20:50.634Z · LW(p) · GW(p)

Are there any doomsday cults that say "doom is probably coming, we're not sure how but here are some likely possibilities"?

No, but there are lots of cults that say "we are the people to solve all the world's problems." Acknowledging the benefits of Division of Labour is un-cult-like.

comment by wedrifid · 2012-04-12T08:43:58.630Z · LW(p) · GW(p)

Why does SI/LW focus so much on AI-FOOM disaster, with apparently much less concern for things like

  • Malthusian upload scenario

For my part I consider that scenario pretty damn close to the AI-FOOM, i.e. it'll quite probably result in a near-equivalent outcome but just take slightly longer before it becomes unstoppable.

comment by Alex_Altair · 2012-04-11T23:21:07.480Z · LW(p) · GW(p)

Personally, I care primarily about AI risk for a few reasons. One is that it is an extremely strong feedback loop. There are other dangerous feedback loops, including nanotech, and I am not confident which will be a problem first. But I think AI is the hardest risk to solve, and also has the most potential for negative utility. I also think that we are relatively close to being able to create AGI.

As far as I know, the SI is defined by its purpose of reducing AI risk. If other risks need long-term work, then each risk needs a dedicated group to work on it.

As for LW, I think it's simply that people read EY's writing on AI risk, and those that agree tend to stick around and discuss it here.

Replies from: inachu
comment by inachu · 2012-04-12T13:03:03.525Z · LW(p) · GW(p)

There are two forms of AI in my book, and either one contains risk: the learned AI, or the AI that comes with complete knowledge. To involve AI in risk assessment you will need the AI in the wilderness with nothing held back. Truly, though, would you do that to an AI? It would be kind of like shoving all information down the brain of a 13-year-old girl. She would just go berserk and become defiant in the end.

The best alternative safe AI that contains no risk is the copied brain of a scientist.

comment by Viliam_Bur · 2012-04-12T11:12:38.692Z · LW(p) · GW(p)

To me it seems reasonable to focus on self-improving AI instead of wars and nanotechnology. If we get the AI right, then we can give it a task to solve our problems with wars, nanotechnology, et cetera (the "suboptimal singleton" problem is included in "getting the AI right"). One solution will help us with other solutions.

As an analogy, imagine yourself as an intelligent designer of your favorite species. You can choose to give them an upgrade: fast feet, thick fur, improved senses, or human-like brain. Of course you should choose a human-like brain, because this allows them to also fix their problems with feet, fur and senses. Now when you have an opportunity to give them Friendly AI as a next upgrade, you should do it, because it will help them fix many other problems too.

This reasoning does not work if the chance of making the Friendly AI is extremely low and the chances of fixing other problems are much higher. Then it makes sense to fix the other problems first. The important thing is that in the long term we want to fix all these problems, so it's not about whether "A" is better than "B", but whether "A, then B" is better than "B, then A".

comment by ChrisHallquist · 2012-04-12T09:33:14.946Z · LW(p) · GW(p)

The answer to your initial question is that Eliezer and Luke believe that if we create AI, the default result is that it kills us all or does something else equally unpleasant. And also that creating Friendly AI will be an extraordinarily good thing, in part (and only in part) because it would be excellent protection against other risks.

That said, I think there is a limit to how confident anyone ought to be in that view, and it is worth trying to prepare for other scenarios.

comment by fubarobfusco · 2012-04-12T02:42:06.487Z · LW(p) · GW(p)

What does "doomsday cult" mean? I had been under the impression that it referred to groups like Heaven's Gate or Family Radio which prophesied a specific end-times scenario, down to the date and time of doomsday.

However, Wikipedia suggests the term originated with John Lofland's research on the Unification Church (the Moonies):

Doomsday cult is an expression used to describe groups who believe in Apocalypticism and Millenarianism, and can refer both to groups that prophesy catastrophe and destruction, and to those that attempt to bring it about. The expression was first used by sociologist John Lofland in his 1966 study of a group of Unification Church members in California, Doomsday Cult: A Study of Conversion, Proselytization, and Maintenance of Faith. A classic study of a group with cataclysmic predictions had previously been performed by Leon Festinger and other researchers, and was published in his book When Prophecy Fails: A Social and Psychological Study of a Modern Group that Predicted the Destruction of the World.

(This is the same When Prophecy Fails that Eliezer cites in Evaporative Cooling of Group Beliefs, by the way. Read the Sequences, folks. Lotsa good stuff in there.)

Wikipedia continues, describing some of the different meanings that "doomsday cult" has held:

Some authors have used "doomsday cult" solely to characterize groups that have used acts of violence to harm their members and/or others, such as the salmonella poisoning of salad bars by members of the Bhagwan Shree Rajneesh group, and the mass murder/suicide of members of the Movement for the Restoration of the Ten Commandments of God group. Others have used the term to refer to groups which have made and later revised apocalyptic prophesies or predictions, such as the Church Universal and Triumphant led by Elizabeth Clare Prophet, and the initial group studied by Festinger, et al. Still others have used the term to refer to groups that have prophesied impending doom and cataclysmic events, and also carried out violent acts, such as the Aum Shinrikyo sarin gas attack on the Tokyo subway and the mass murder/suicide of members of Jim Jones' Peoples Temple group after similar types of predictions.

So, "doomsday cult" seems to have a lot to do with repeated prophecies of doom, even in the face of past prophecies being overtaken by events. So far as I know, SIAI seems more to err on the side of not making specific predictions, and thus risking running afoul of getting evicted for not paying rent in anticipated experiences, than in giving us a stream of doomish prophecies and telling us to forget about the older ones when they fail to come true.

Reading on:

While a student at the University of California, Berkeley Lofland lived with Unification Church missionary Young Oon Kim and a small group of American church members and studied their activities in trying to promote their beliefs and win new members for their church. Lofland noted that most of their efforts were ineffective and that most of the people who joined did so because of personal relationships with other members, often family relationships. Though Lofland had made his sociological interests clear to Kim from the outset, when she determined that he was not going to convert to their religion he was asked to move out of their residence. [...] Lofland laid out seven conditions for a doomsday cult, including: acutely felt tension, religious problem-solving perspective, religious seekership, experiencing a turning point, development of cult affective bonds, and neutralization of extracult attachments. He also suggests that individuals who join doomsday cults suffer from a form of deprivation.

I haven't been able to get a hold of a greppable copy of Lofland's book. I'd be interested to see how he expands on these seven conditions. Some of them very well may apply, in some form, to our aspiring rationalists ... I wonder to what extent though they apply to aspirants to any group at some ideological variance from mainstream society, though.

Replies from: timtyler, Luke_A_Somers
comment by timtyler · 2012-04-13T00:46:58.789Z · LW(p) · GW(p)

So, "doomsday cult" seems to have a lot to do with repeated prophecies of doom, even in the face of past prophecies being overtaken by events.

In one out of three quoted meanings? It seems to be a relatively unimportant factor to me.

comment by Luke_A_Somers · 2012-04-12T21:42:56.408Z · LW(p) · GW(p)

The first, well, anyone raising a concern is going to have that.

Numbers 2 and 3 (religious problem-solving, seekership) are right out.

Number 4 (turning point), okay.

Number 5 (formation of affective bonds)... I dunno, maaybe? I mean, you can't really blame a group for people liking it. I think this was meant way more strongly than we have here.

Number 6 Neutralization of external attachments? Absolutely not.

You didn't name the seventh, unless it's the deprivation, which again... no.

So, arguably 3 out of seven, of which 2 are so common as to be kind of silly, and one of those was a major stretch. Whee.

comment by Manfred · 2012-04-11T23:18:46.964Z · LW(p) · GW(p)

SI/LW sometimes gives the impression of being a doomsday cult

To whom? In the post you linked, the main source of the concern (google hits) turned out not to mean the thing the author originally thought (edit: this is false. Sorry). Merely "raising the issue" is privileging the hypothesis.

Anywho, is the main idea of this post "this other bad stuff is similarly bad, and SI could be doing similar amounts to reduce the risk of these bad things?" I seem to recall their justification for focusing on AI was that with self-improving AI, you only need to get it right the first time: one person could eliminate the risk if they could solve the right technical problems. With preventing war or preventing upload labor, on the other hand, you need all or most people to cooperate with you, making the marginal effect of one group smaller.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2012-04-11T23:46:22.654Z · LW(p) · GW(p)

To whom?

The post was triggered by a private message from someone, so unfortunately I can't link to it.

Anywho, is the main idea of this post "this other bad stuff is similarly bad, and SI could be doing similar amounts to reduce the risk of these bad things?"

Not quite. I'm saying there are a bunch of Singularity-related risks that aren't AI risks, and a bunch of Singularity-related opportunities that aren't AI opportunities. The AI-related opportunities affect the non-AI risks, and the non-AI opportunities affect the AI risks. (For example successfully building FAI would prevent war as much as it prevents UFAI.) We shouldn't be thinking just about AI risks and opportunities at this point, or giving the impression that we are.

comment by A1987dM (army1987) · 2012-04-12T11:39:07.119Z · LW(p) · GW(p)

bad memes/philosophies spreading among humans or posthumans and overriding our values

Well,

comment by AlphaOmega · 2012-04-12T17:48:26.038Z · LW(p) · GW(p)

I am going to assert that the fear of unfriendly AI over the threats you mention is a product of the same cognitive bias which makes us more fascinated by evil dictators and fictional dark lords than more mundane villains. The quality of "evil mind" is what really frightens us, not the impersonal swarm of "mindless" nanobots, viruses or locusts. However, since this quality of "mind," which encapsulates such qualities as "consciousness" and "volition," is so poorly understood by science and so totally undemonstrated by our technology, I would further assert that unfriendly AI is pure science fiction which should be far down the list of our concerns compared to more clear and present dangers.

Replies from: Zetetic
comment by Zetetic · 2012-04-13T05:39:27.218Z · LW(p) · GW(p)

I'm going to assert that it has something to do with who started the blog.

comment by Dmytry · 2012-04-12T04:18:14.011Z · LW(p) · GW(p)

SI/LW sometimes gives the impression of being a doomsday cult,

Because it fits the pattern exactly. If you have top astronomers worrying about a meteorite hitting Earth, that is astronomy. If you have non-astronomers (with very few astronomers) worrying about a meteorite hitting Earth, that's a doomsday cult. Or at the very best, a vague doomsday cult. edit: Just saying, that's how I classify; it works for me. If you have instances (excluding SIAI) where this method of classification fails in a damaging way, I am very interested to hear of them, to update my classification method. I might be misclassifying something. I might just go through the list of things that I classified as cults, and classify some items on that list as non-cult, if the classification method fails.

Replies from: Incorrect, orthonormal, thomblake, Emile
comment by Incorrect · 2012-04-12T16:31:02.988Z · LW(p) · GW(p)

What complete definition of "cult" are you using here so that I can replace every occurrence of the word by its definition and get a better understanding of your paragraph?

That would be helpful to me as many people use this word in different ways and I don't know precisely how you use it.

Replies from: Dmytry
comment by Dmytry · 2012-04-12T17:40:22.120Z · LW(p) · GW(p)

Pretty ordinary meaning: a bunch of people trusting extraordinary claims not backed with any evidence or expert consensus, originating from a charismatic leader who is earning a living off the cultists. Subtype: doomsday. Now, I don't give any plus or minus points for the leader and living-off-cultists part, but the general lack of expert concern about the issue is a killer. Experts being people with expertise on the relevant subject (but no doomsday experts allowed; it has to be something practically useful, or at least not all about the doomsday itself, else you start counting theologians as experts). E.g. for AI risk, the relevant experts may be people with CS accomplishments: the folks who made the self-driving car, the visual object recognition experts, speech recognition experts, people who developed actual working AI of some kind, etc.

I wonder what would happen if we trained an SPR for cult recognition: http://lesswrong.com/lw/3gv/statistical_prediction_rules_outperform_expert/ SPRs don't care about any unusual redeeming qualities or special circumstances.
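To make that concrete, here is a minimal sketch of the kind of improper, equal-weight rule those SPR studies describe. The checklist items and the cutoff are made up for illustration; this is not a validated instrument.

```python
# A toy improper statistical prediction rule: equal-weight yes/no features
# summed into a score, compared against a fixed cutoff. Features and cutoff
# are invented for illustration only.
FEATURES = [
    "charismatic leader is the main source of doctrine",
    "extraordinary claims lacking expert endorsement",
    "predicts a specific doom or salvation event",
    "members' income or labor flows to the leader",
    "isolation from or hostility toward outside critics",
]

def spr_score(answers):
    """answers: dict mapping each feature string to True/False."""
    return sum(1 for feature in FEATURES if answers.get(feature, False))

def classify(answers, cutoff=3):
    # By design the rule ignores special circumstances: score >= cutoff -> "cult-like".
    return "cult-like" if spr_score(answers) >= cutoff else "not cult-like"

example = {feature: False for feature in FEATURES}
example["extraordinary claims lacking expert endorsement"] = True
print(classify(example))  # -> "not cult-like", since only one feature is present
```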

Can you list some non-cults most similar to LW/SIAI?

Replies from: Incorrect
comment by Incorrect · 2012-04-12T19:22:54.377Z · LW(p) · GW(p)

extraordinary claims not backed by any evidence

There are two claims the conjunction of which must be true in order for a doomsday scenario to be likely (a toy calculation follows below):

  1. self-improving human-level AI is dangerous enough
  2. humans are likely to create human-level AI

I am unsure of 2 but believe 1. Do you disagree with 1?
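To illustrate with made-up numbers, purely to show how the conjunction behaves rather than to state actual credences:

```python
# Hypothetical credences, chosen only to illustrate the structure of the argument.
p_dangerous = 0.9   # claim 1: self-improving human-level AI would be dangerous enough
p_created   = 0.5   # claim 2: humans are likely to create human-level AI

# If the two claims were roughly independent, the doomsday scenario would be
# about as likely as their product, and it can never be more likely than the
# weaker of the two claims.
p_conjunction_if_independent = p_dangerous * p_created
p_upper_bound = min(p_dangerous, p_created)

print(p_conjunction_if_independent)  # 0.45
print(p_upper_bound)                 # 0.5
```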

Replies from: Dmytry
comment by Dmytry · 2012-04-12T20:28:11.845Z · LW(p) · GW(p)

I think the problem is conflating different aspects of intelligence into one variable. The three major groups of aspects are:

1: Thought/engineering/problem-solving/etc.; it can work entirely within a mathematical model. This we are making steady progress on.

2: Real-world volition, especially the will to form the most accurate beliefs about the world. This we don't know how to solve, and don't even need to automate. We ourselves aren't even a shining example of 2, but we generally don't care so much about that. 2 is a hard philosophical problem.

3: Morals.

Even strongly superhuman 1 by itself is entirely harmless, even if very general within the problem space of 1. 2 without 1 can't invent anything. 3 may follow from strong 1 and 2, assuming the AI assigns a non-zero chance to being under test in a simulation and strong 1 provides enormous resources.

So, what is your human-level AI?

It seems to me that people with a high capacity for 1, i.e. the engineers and scientists, are so dubious about AI risk because it is pretty clear to them, both internally and from the AI effort, that 1 doesn't imply 2 and adding 2 won't strengthen 1. There isn't some great issue with 1 that 2 would resolve; 1 works just fine. If, for example, we invent an awesome automatic software-development AI, it will be harmless even if superhuman at programming, and will self-improve as much as possible without 2. Not just harmless: there's no reason why a 1-agent plus a human are together any less powerful than a 1-agent with 2-capability.

Eliezer, it looks like, is very concerned with forming accurate beliefs, i.e. 2-type behaviour, but I don't see him inventing novel solutions as much. Maybe he's so scared of the AI because he attributes other people's problem-solving to an intellect paralleling his, while it's more orthogonal. Maybe he imagines that a strongly more-2 agent will somehow be innovative and foom, and he sees a lot of room for improving 2. Or something along those lines. He is a very unusual person; I don't know how he thinks. The way I think, it is very natural for me that problem-solving does not require first wanting to actually do anything real. That also parallels the software effort, because ultimately everyone capable of working effectively as an innovative software developer is very 1-oriented and doesn't see 2 as either necessary or desirable. I don't think 2 would just suddenly appear out of nothing by some emergence or accident.

Replies from: Incorrect
comment by Incorrect · 2012-04-12T20:46:01.770Z · LW(p) · GW(p)

Even strongly superhuman 1 by itself is entirely harmless, even if very general within the problem space of 1.

Type 1 intelligence is dangerous as soon as you try to use it for anything practical simply because it is powerful. If you ask it "how can we reduce global temperatures" and "causing a nuclear winter" is in its solution space, it may return that. Powerful tools must be wielded precisely.

Replies from: Dmytry
comment by Dmytry · 2012-04-12T20:49:18.132Z · LW(p) · GW(p)

See, that's what is so incredibly irritating about dealing with people who lack any domain-specific knowledge. You can't ask it 'how can we reduce global temperatures' in the real world.

You can ask it how to make a model out of data, and you can ask it what to do to the model so that such-and-such function decreases; it may try nuking this model (inside the model) and generate such a solution. You have to actually put in a lot of effort, like mindlessly replicating its in-model actions in the real world, for this nuking to happen in the real world. (And you'll also have the model visualization to examine, by the way.)
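To make the workflow concrete, a toy sketch: a purely in-model search over invented actions and effects, where the objective scores only temperature and nothing else, so the search happily returns the catastrophic option inside the model. Everything here is made up for illustration.

```python
# A toy in-model "planner": candidate actions and their modeled effects are
# invented numbers; the objective looks only at temperature change.
candidate_actions = {
    "plant forests":             {"temp_change": -0.3, "deaths": 0},
    "paint roofs white":         {"temp_change": -0.1, "deaths": 0},
    "inject stratospheric soot": {"temp_change": -2.0, "deaths": 10_000},
    "trigger nuclear winter":    {"temp_change": -8.0, "deaths": 5_000_000_000},
}

def objective(effects):
    # Only temperature appears in the objective; the side effects are ignored.
    return effects["temp_change"]

best_action = min(candidate_actions, key=lambda a: objective(candidate_actions[a]))
print(best_action)  # -> "trigger nuclear winter", inside the model and only there
```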

Replies from: Incorrect
comment by Incorrect · 2012-04-12T20:56:39.665Z · LW(p) · GW(p)

What if instead of giving the solution "cause nuclear war" it simply returns a seemingly innocuous solution expected to cause nuclear war? I'm assuming that the modelling portion is a black box so you can't look inside and see why that solution is expected to lead to a reduction in global temperatures.

If the software is using models we can understand and check ourselves then it isn't nearly so dangerous.

Replies from: Dmytry
comment by Dmytry · 2012-04-12T21:02:09.312Z · LW(p) · GW(p)

I'm assuming that the modelling portion is a black box so you can't look inside and see why that solution is expected to lead to a reduction in global temperatures.

Let's just assume that Mister President sits on the nuclear launch button by accident, shall we?

It isn't an amazing novel philosophical insight that type-1 agents 'love' to solve problems in the wrong way. It is a fact of life, apparent even in the simplest automated software of that kind. You will, of course, also have some pretty visualization of the scenario in which the parameter was minimized or maximized.

edit: Also, the answers could be really funny. How do we solve global warming? Okay, just abduct the prime minister of China! That should cool the planet off.

Replies from: Incorrect
comment by Incorrect · 2012-04-12T21:09:42.446Z · LW(p) · GW(p)

It isn't an amazing novel philosophical insight that type-1 agents 'love' to solve problems in the wrong way. It is a fact of life, apparent even in the simplest automated software of that kind.

Of course it isn't.

Let's just assume that Mister President sits on the nuclear launch button by accident, shall we?

There are machine learning techniques like genetic programming that can result in black-box models. As I stated earlier, I'm not sure humans will ever combine black-box problem-solving techniques with self-optimization and attempt to use the product to solve practical problems; I just think it is dangerous to do so once the techniques become powerful enough.
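For concreteness, here is a minimal genetic-programming sketch of the sort of technique I mean: it evolves small arithmetic expression trees to fit data, and the winning expression tends to be an opaque nest of operations rather than anything a human would write. It is a toy with invented settings, not a claim about any real system.

```python
import math
import operator
import random

# Minimal genetic programming over arithmetic expression trees.
# Terminals are 1-tuples: ('x',) for the input, or (constant,).
# Internal nodes are ((function, symbol), left_subtree, right_subtree).
OPS = [(operator.add, '+'), (operator.sub, '-'), (operator.mul, '*')]

def random_tree(depth=3):
    if depth == 0 or random.random() < 0.3:
        return ('x',) if random.random() < 0.5 else (random.uniform(-2, 2),)
    return (random.choice(OPS), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if len(tree) == 1:
        return x if tree[0] == 'x' else tree[0]
    (fn, _), left, right = tree
    return fn(evaluate(left, x), evaluate(right, x))

def show(tree):
    if len(tree) == 1:
        return 'x' if tree[0] == 'x' else '%.2f' % tree[0]
    (_, symbol), left, right = tree
    return '(%s %s %s)' % (show(left), symbol, show(right))

def fitness(tree, data):
    # Mean squared error against the data; non-finite results count as terrible.
    total = 0.0
    for x, y in data:
        diff = evaluate(tree, x) - y
        total += diff * diff
    return total / len(data) if math.isfinite(total) else float('inf')

def mutate(tree):
    # Replace a randomly chosen subtree with a fresh random one.
    if len(tree) == 1 or random.random() < 0.3:
        return random_tree(2)
    op, left, right = tree
    if random.random() < 0.5:
        return (op, mutate(left), right)
    return (op, left, mutate(right))

# Target function (pretend we only have the data, not the formula): y = x^2 + 1.
data = [(x, x * x + 1) for x in range(-5, 6)]

population = [random_tree() for _ in range(200)]
for generation in range(40):
    population.sort(key=lambda t: fitness(t, data))
    survivors = population[:50]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(150)]

best = min(population, key=lambda t: fitness(t, data))
print(show(best))            # often an unreadable nest of additions and products
print(fitness(best, data))   # usually small, but the "why" is buried in the tree
```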

Replies from: Dmytry
comment by Dmytry · 2012-04-12T21:16:20.705Z · LW(p) · GW(p)

There are machine learning techniques like genetic programming that can result in black-box models.

Which are even more prone to outputting crap solutions even without being superintelligent.

Replies from: Incorrect
comment by Incorrect · 2012-04-12T21:18:12.693Z · LW(p) · GW(p)

Yup, we seem safe for the moment because we simply lack the ability to create anything dangerous.

Sorry you're being downvoted. It's not me.

Replies from: Dmytry
comment by Dmytry · 2012-04-12T21:25:56.896Z · LW(p) · GW(p)

Yup, we seem safe for the moment because we simply lack the ability to create anything dangerous.

Actually, your scenario already happened... the Fukushima reactor failure: they used computer modelling to simulate the tsunami, it was the 1960s, computers were science woo, and if the computer said so, then it was true.

For more subtle cases though - see, the problem is the substitution of an 'intellectually omnipotent, omniscient entity' for the AI. If the AI tells you to assassinate a foreign official, nobody's going to do that; it would have to start the nuclear war via the butterfly effect, and that's pretty much intractable.

Replies from: Incorrect
comment by Incorrect · 2012-04-12T21:29:34.110Z · LW(p) · GW(p)

For more subtle cases though - see, the problem is the substitution of an 'intellectually omnipotent, omniscient entity' for the AI. If the AI tells you to assassinate a foreign official, nobody's going to do that; it would have to start the nuclear war via the butterfly effect, and that's pretty much intractable.

I would prefer our only line of defense not be "most stupid solutions are going to look stupid". It's harder to recognize stupid solutions in, say, medicine (although there we can verify with empirical data).

Replies from: Dmytry
comment by Dmytry · 2012-04-12T21:46:20.170Z · LW(p) · GW(p)

It is unclear to me, though, that artificial intelligence adds any risk there that isn't already present from natural stupidity.

Right now, look: so many plastics around us, food additives, and other novel substances. Rising cancer rates even after controlling for age. With all the testing, when you have a hundred random things, a few bad ones will slip through. Or obesity. This (idiotic solutions) is a problem with technological progress in general.

edit: Actually, our all-natural intelligence is very prone to quite odd solutions. Say, reproductive drive, secondary sex characteristics, yadda yadda; end result: cosmetic implants. Desire to sell more product; end result: overconsumption. Etc.

comment by orthonormal · 2012-04-12T23:26:16.111Z · LW(p) · GW(p)

It's worth discussing an issue as important as cultishness every so often, but as you might expect, this isn't the first time Less Wrong has discussed the meme of "SIAI agrees on ideas that most people don't take seriously? They must be a cult!"

ETA: That is, I'm not dismissing your impression, just saying that the last time this was discussed is relevant.

Replies from: Dmytry
comment by Dmytry · 2012-04-13T05:17:47.559Z · LW(p) · GW(p)

Less Wrong has discussed the meme of "SIAI agrees on ideas that most people don't take seriously? They must be a cult!"

Awesome, it has discussed this particular 'meme' - to whose viral transmission, your words seem to imply, it attributes its identification as a cult. Has it, however, discussed good Bayesian reasoning and understood the impact of a statistical fact: that even when there is a genuine risk (if there is such a risk), it is incredibly unlikely that the person most worth listening to will lack both academic credentials and any evidence of rounded knowledge, and also be an extreme outlier in degree of belief? There's also the NPD diagnostic criteria to consider. The probabilities multiply here into an incredibly low probability of being extreme on so many parameters relevant to cult identification, for a non-cult. (For cults, they don't multiply up, because there is a common cause.)

edit: To spell out the details: you start with a prior of maybe 0.1 probability that a doomsday salvation group is a non-cult (and that is a massive benefit of the doubt right there), then you look at the founder being such an incredibly unlikely combination of traits for a non-cult doomsday-caution advocate but such a typical founder for a cult - on a multitude of parameters - and then you fuzzily do some knee-jerk Bayesian reasoning (which, however, can be perfectly well replicated using a calculator instead of neuronal signals), and you end up virtually certain it is a cult. That's if you can do Bayes without doing it explicitly on a calculator. Now, the reason I am here is that I did not take a good look until very recently, because I did not care whether you guys are a cult or not - cults can be interesting to argue with. And EY is not a bad guy at all, don't take me wrong; he himself understands that he's risking making a cult, and is trying very hard NOT to make a cult. That's very redeeming. I do feel bad for the guy: he happened to let one odd belief through, and then voila, a cult that he didn't want. Or a semi-cult, with some people in it for cult reasons and some not so much. He happened not to have a formal education, or notable accomplishments that are easily known to be challenging (like being the author of some computer vision library or whatever). He has some ideas. The cult-follower-type people are drawn to those ideas like flies to food.
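To spell out the calculator version with made-up numbers: the prior follows the 0.1 figure above, the likelihood ratios are invented purely to show the structure, and the straight multiplication assumes the traits are independent given each hypothesis, which is exactly the simplification being argued about.

```python
# Prior from above: 0.1 probability of non-cult, i.e. prior odds of 9:1 for "cult".
prior_odds_cult = 0.9 / 0.1

# Hypothetical likelihood ratios P(observation | cult) / P(observation | non-cult).
# These numbers are made up; only the multiplication structure is the point.
likelihood_ratios = {
    "founder lacks academic credentials in the relevant field": 3.0,
    "founder is an extreme outlier in degree of belief":        4.0,
    "founder earns a living from the group":                    2.5,
}

posterior_odds = prior_odds_cult
for observation, ratio in likelihood_ratios.items():
    posterior_odds *= ratio

posterior_probability = posterior_odds / (1.0 + posterior_odds)
print(round(posterior_probability, 3))  # about 0.996 with these toy numbers
```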

comment by thomblake · 2012-04-12T16:26:56.245Z · LW(p) · GW(p)

Seems more obviously a doomsday non-cult.

comment by Emile · 2012-04-12T11:43:08.821Z · LW(p) · GW(p)

If you have non-astronomers (with very few astronomers) worrying about a meteorite hitting earth, that's a doomsday cult.

So if the US government worries about meteorites hitting earth, it's a doomsday cult?

Replies from: Dmytry
comment by Dmytry · 2012-04-12T11:55:29.964Z · LW(p) · GW(p)

If it starts worrying more than astronomers do, sure. The "few" is as in percentage: very few astronomers worrying at the same level.

More generally, if the degree of belief is negatively correlated with achievements in relevant areas of expertise, then the extreme forms of the belief are very likely false. (And just in case: comparing to Galileo is cherry-picking. For each Galileo there's a ton of cranks.)