What would defuse unfriendly AI?

post by asr · 2011-06-10T07:27:12.623Z · LW · GW · Legacy · 14 comments

It seems to be a widely held belief around here that unfriendly artificial general intelligence is dangerous, and that (provably) friendly artificial general intelligence is the soundest counter to it.

But I'd like to see some analysis of alternatives.  Here are some possible technical developments. Would any of these defuse the threat? How much would they help?


Are there other advances in computer science that might show up within the next twenty years that would make friendly AI much less interesting?

Would anything on this list be dangerous?  Obviously, efficient algorithms for NP-complete problems would be very disruptive. Nearly all of modern cryptography would become irrelevant, for instance.
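To make the cryptography point concrete, here is a minimal sketch (an editorial illustration, not part of the original post; the function name and toy modulus are made up for the example). The security of RSA rests on factoring, a problem whose solutions are easy to verify but believed hard to find, i.e. a problem in NP. An efficient algorithm for NP-complete problems would, via reduction, solve every problem in NP, including this one.

```python
# Sketch: breaking RSA reduces to factoring its modulus, and a factoring
# "certificate" (the two primes) can be checked in polynomial time, so the
# search problem sits inside NP.

def verify_factoring_certificate(modulus: int, p: int, q: int) -> bool:
    """Polynomial-time check of a claimed factorization."""
    return 1 < p < modulus and 1 < q < modulus and p * q == modulus

# Toy RSA-style modulus (real keys use ~2048-bit moduli).
n = 3233  # = 61 * 53
print(verify_factoring_certificate(n, 61, 53))  # True

# Verifying is easy; *finding* (p, q) is what is believed to be hard.
# An efficient algorithm for NP-complete problems would let an attacker
# find such certificates directly, recovering private keys.
```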

14 comments


comment by XiXiDu · 2011-06-10T14:29:33.537Z · LW(p) · GW(p)

What would defuse unfriendly AI?

To defuse risks from AI you would have to argue that what we currently know AI to be capable of cannot be extrapolated to encompass the full spectrum of human potential, or that those skills cannot be combined to create a coherent framework of agency.

Below are a few examples that hint at the possibility that AI (or artificial algorithmic creation) has already transcended human capabilities in several narrow fields of expertise.


We already know that AI is capable of the following:

Algorithmic intelligence can be creative and inventive:

We report the development of Robot Scientist “Adam,” which advances the automation of both. Adam has autonomously generated functional genomics hypotheses about the yeast Saccharomyces cerevisiae and experimentally tested these hypotheses by using laboratory automation.

The Automation of Science

Without any prior knowledge about physics, kinematics, or geometry, the algorithm discovered Hamiltonians, Lagrangians, and other laws of geometric and momentum conservation. The discovery rate accelerated as laws found for simpler systems were used to bootstrap explanations for more complex systems, gradually uncovering the “alphabet” used to describe those systems.

Computer Program Self-Discovers Laws of Physics

This aim was achieved within 3000 generations, but the success was even greater than had been anticipated. The evolved system uses far fewer cells than anything a human engineer could have designed, and it does not even need the most critical component of human-built systems - a clock. How does it work? Thompson has no idea, though he has traced the input signal through a complex arrangement of feedback loops within the evolved circuit. In fact, out of the 37 logic gates the final product uses, five of them are not even connected to the rest of the circuit in any way - yet if their power supply is removed, the circuit stops working. It seems that evolution has exploited some subtle electromagnetic effect of these cells to come up with its solution, yet the exact workings of the complex and intricate evolved structure remain a mystery. (Davidson 1997)

When the GA was applied to this problem, the evolved results for three, four and five-satellite constellations were unusual, highly asymmetric orbit configurations, with the satellites spaced by alternating large and small gaps rather than equal-sized gaps as conventional techniques would produce. However, this solution significantly reduced both average and maximum revisit times, in some cases by up to 90 minutes. In a news article about the results, Dr. William Crossley noted that "engineers with years of aerospace experience were surprised by the higher performance offered by the unconventional design". (Williams, Crossley and Lang 2001)

Genetic Algorithms and Evolutionary Computation

UC Santa Cruz emeritus professor David Cope is ready to introduce computer software that creates original, modern music.

Triumph of the Cyborg Composer

The HR (or Hardy-Ramanujan) program invents and analyses definitions in areas of pure mathematics, including finite algebras, graph theory and number theory. While working in number theory, HR recently invented a new integer sequence, the refactorable numbers, which are defined and developed here.

Refactorable Numbers - A Machine Invention

A computer program written by researchers at Argonne National Laboratory in Illinois has come up with a major mathematical proof that would have been called creative if a human had thought of it. In doing so, the computer has, for the first time, got a toehold into pure mathematics, a field described by its practitioners as more of an art form than a science. And the implications, some say, are profound, showing just how powerful computers can be at reasoning itself, at mimicking the flashes of logical insight or even genius that have characterized the best human minds.

Computer Math Proof Shows Reasoning Power

Improvements in algorithms can in many cases lead to dramatic performance gains:

Everyone knows Moore’s Law – a prediction made in 1965 by Intel co-founder Gordon Moore that the density of transistors in integrated circuits would continue to double every 1 to 2 years. (…) Even more remarkable – and even less widely understood – is that in many areas, performance gains due to improvements in algorithms have vastly exceeded even the dramatic performance gains due to increased processor speed.

The algorithms that we use today for speech recognition, for natural language translation, for chess playing, for logistics planning, have evolved remarkably in the past decade. It’s difficult to quantify the improvement, though, because it is as much in the realm of quality as of execution time.

In the field of numerical algorithms, however, the improvement can be quantified. Here is just one example, provided by Professor Martin Grötschel of Konrad-Zuse-Zentrum für Informationstechnik Berlin. Grötschel, an expert in optimization, observes that a benchmark production planning model solved using linear programming would have taken 82 years to solve in 1988, using the computers and the linear programming algorithms of the day. Fifteen years later – in 2003 – this same model could be solved in roughly 1 minute, an improvement by a factor of roughly 43 million. Of this, a factor of roughly 1,000 was due to increased processor speed, whereas a factor of roughly 43,000 was due to improvements in algorithms! Grötschel also cites an algorithmic improvement of roughly 30,000 for mixed integer programming between 1991 and 2008.

— "Progress in Algorithms Beats Moore’s Law", Page 71 (Report to the President and Congress: Designing a Digital Future: Federally FUnded R&D in Networking and IT)

Replies from: FAWS
comment by FAWS · 2011-06-10T15:24:10.752Z · LW(p) · GW(p)

To defuse risks from AI you would have to argue that what we currently know AI to be capable of cannot be extrapolated to encompass the full spectrum of human potential, or that those skills cannot be combined to create a coherent framework of agency.

I think you misunderstand: the question isn't what could defuse worries about UFAI by demonstrating the risks to be lower than previously believed (e.g. proving strong AI to be infeasible); it's about what could reduce the actual existing risk.

Replies from: XiXiDu
comment by XiXiDu · 2011-06-10T17:03:48.991Z · LW(p) · GW(p)

I think you misunderstand: the question...

No, I think that short of a demonstration that strong AI is infeasible, there is no way to actually defuse the risk enough that it would matter much. Even a very sophisticated but limited AI (one with a narrow set of abilities) that never undergoes recursive self-improvement, but which nonetheless possesses some superhuman capabilities (which any AI has: superior memory, direct data access, etc.), could pose an existential risk.

Take for example what is happening in Syria right now. The only reason the regime does not squelch every protest is that nobody can supervise or control that many people. Give it an AGI that can watch a few million security cameras and control thousands of drones, and it will destroy most human value by implementing a worldwide dictatorship or theocracy.

Replies from: asr
comment by asr · 2011-06-11T03:42:42.018Z · LW(p) · GW(p)

You seem to be implying that if both the authorities and the insurgents have access to equally powerful AGI, then this works to the net benefit of the authorities.

I am skeptical of that premise, especially in the context of open revolt as we're seeing in Syria. I don't think a lack of eyeballs on cameras is a significant mechanism there; plain old human secret police would do fine for that, since people are protesting openly. The key dynamic I see is that the regime isn't confident that the police or army will obey orders if driven to use lethal force on a large scale.

I don't see how AI would change that dynamic. If both sides have it, the protesters can optimize their actions to stay within that sphere of uncertainty, even as the government tries to act as aggressively as it can without risking the military joining the rebels.

Today, we already have much more sophisticated weapons, communication, information storage, and information retrieval technology than was ever available before. It doesn't appear to have been a clear net benefit for either freedom or tyranny.

Do you envision AGI strengthening authorities in ways that 20th-century coercive technologies did not?

comment by benelliott · 2011-06-10T09:26:17.539Z · LW(p) · GW(p)

Other than the last, I'm afraid I can't see how they would help at all. What makes you think they would? Maybe I'm missing an important point.

The last might help, assuming that whoever builds the UFAI can be trusted to use it. I would probably not want to try it unless there were no alternative, since it seems a bit risky to trust that we can outsmart something so much smarter than us.

Replies from: xxd
comment by xxd · 2011-12-20T18:27:25.309Z · LW(p) · GW(p)

I think you're right that it would be difficult to think we could outsmart something that is smarter than us and can also think faster than us.

That said, I think there is a different solution available to the one we're trying to define here.

If the antidote to unfriendly AI is friendly AI, but we are struggling to define what friendly AI is, then perhaps there is a logical disconnect?

If unfriendly AI is something that will harm humanity then how can the opposite of that be something that benefits humanity?

Clearly the logical answer is that, if unfriendly AI is all incarnations of AI that will harm humanity, then friendly AI is all incarnations of AI that will not harm humanity.

i.e., simply stated: friendly AI = NOT(unfriendly AI).

To me that seems much more logically intuitive than trying to define something that will "benefit humanity", when defining goals that will benefit humanity is clearly contradictory. It's far, far easier to define what unfriendly AI will do and then define not doing that as being friendly.

I think we have a case of feature creep here: instead of just defining friendly AI as the antidote to unfriendly AI, we've also tagged on the nice-to-have feature of making it "helpful AI".

I think helpful AI != friendly AI.

Replies from: benelliott
comment by benelliott · 2011-12-20T21:55:06.704Z · LW(p) · GW(p)

Roughly speaking the idea is as follows.

An AI can be seen as just a very powerful optimization process. Creating one is likely to result in our local reason of the universe being strongly optimal according to that process's utility function. Currently our local reason of the universe is not strongly optimal according to any utility function, or at any rate not any utility function that takes much less time to describe than our local region of the universe itself. Therefore creation of AGI will almost certainly result in drastic changes to pretty much everything.

If the AGI's utility function very closely matches our own, this will be very good; if it deviates from our own even in quite small ways, this is overwhelmingly likely to be very bad. There is almost no middle ground, so while helpful AI may not be exactly logically equivalent to friendly AI, in practice they are pretty much the same.

Replies from: xxd
comment by xxd · 2011-12-20T22:56:49.191Z · LW(p) · GW(p)

Thank you for the reply.

I don't know if it was your wording, but parts of that went way over my head.

Can I paraphrase by breaking it down to try to understand what you're saying? Please correct me if I misinterpret.

Firstly: "An AI can be seen as just a very powerful optimization process."

So whatever process you apply the AI to, it will make it far more efficient and faster would be examples? And thus just by compound interest, any small deviations from our utility function would rapidly compound till they ran to 100% thus creating problems (or extinction) for us? Doesn't that mean however, that the AI isn't really that intelligent as it seems (perhaps naively) that any creature that seeks to tile the universe with paperclips or smiley faces isn't very intelligent at all. Are we therefore equating an AI to a superfast factory?

Secondly: "Our local reason of the universe being strongly optimal according to that process's utility function."

With the word "reason" in there I don't understand this sentence. Are you saying "the processes we use to utilize resources will become strongly optimal"? If not, can you break it down a little further? I'm particularly struggling with this phrase "Our local reason of the universe".

I pretty much get the rest.

One extrapolation, however, is that we ourselves aren't very optimal. If we speed up and, e.g., make our resource-extraction processes more efficient without putting recycling in place, we will run out of resources much more rapidly, with a process optimizer to help us do it.

Replies from: Vladimir_Nesov, benelliott
comment by Vladimir_Nesov · 2011-12-21T14:11:25.771Z · LW(p) · GW(p)

Doesn't that mean however, that the AI isn't really that intelligent as it seems (perhaps naively) that any creature that seeks to tile the universe with paperclips or smiley faces isn't very intelligent at all. Are we therefore equating an AI to a superfast factory?

Yes, basically (if I guess at the parsing of your first sentence correctly), because it doesn't matter what we're used to calling "intelligent"; what matters are the conditions for the universe getting tiled with worthless stuff. See this post from Luke's new introductory sequence.

comment by benelliott · 2011-12-21T11:28:26.970Z · LW(p) · GW(p)

So whatever process you apply the AI to, it will make it far more efficient and faster would be examples?

That's not quite what optimization process means. The sequences contain a better explanation than I could give, but roughly speaking it is anything whose behaviour is better understood by thinking about a 'goal' that it will try to achieve rather than a simple set of short-term laws it will follow. It is a category that includes but is not limited to intelligences; evolution would be an example of a non-intelligent optimization process (which is why they are called processes: they do not have to have a concrete physical form). Humans, most animals and chess-playing computer programs are also optimization processes. Rocks are not.

When an optimization process acts on something, that thing ends up 'optimized', which means it is in a configuration that satisfies the optimizer's utility function much more closely than could ever be expected by random chance. A car, for example, is highly optimized: it is far better at taking humans where they want to go than anything you'd expect to get simply by choosing random configurations of various metal alloys.
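For what it's worth, here is a minimal illustration of that idea (an editorial sketch, not from the comment; the target string, utility function and step count are arbitrary): even a very weak optimization process steers a configuration into states that score far better on its utility function than random chance would produce.

```python
# A toy optimization process: steer a random bit string toward a goal state.
import random

TARGET = [1] * 40                      # the optimizer's "goal region"
utility = lambda state: sum(a == b for a, b in zip(state, TARGET))

def random_configuration():
    """A configuration chosen by pure chance (the car-from-random-alloys case)."""
    return [random.randint(0, 1) for _ in TARGET]

def hill_climb(steps=2000):
    """A very weak optimizer: flip one bit, keep the flip if utility does not drop."""
    state = random_configuration()
    for _ in range(steps):
        candidate = state[:]
        i = random.randrange(len(candidate))
        candidate[i] ^= 1
        if utility(candidate) >= utility(state):
            state = candidate
    return state

print("random chance:", utility(random_configuration()))  # ~20 of 40 on average
print("optimized:    ", utility(hill_climb()))            # typically 40 of 40
```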

Roughly speaking, we can compare the strengths of different optimization processes by how well they are able to steer the state of the world into their goal region, so we might say that Deep Blue beat Kasparov because it was a stronger optimization process (of course, this only holds within the domain of chess; Kasparov could have won easily by taking a sledgehammer to the processor, and there would have been nothing Deep Blue could do about it); by how fast they work (evolution is more powerful than animal intelligence, but needed to create animal intelligence anyway since it is too slow); and by how wide a domain they can function successfully in (count the number of animal species that went extinct because humans threw a problem they weren't used to at them; we ourselves are hopefully a bit more versatile).

When we create an AI we create an optimization process that works very fast, in a very wide domain, and is vastly more powerful than anything that existed before. Thus we are likely to end up far more strongly optimized than ever before.

Doesn't that mean however, that the AI isn't really that intelligent as it seems (perhaps naively) that any creature that seeks to tile the universe with paperclips or smiley faces isn't very intelligent at all.

No, it doesn't mean that. There is no way to derive the proposition "tiling the universe with paper-clips is a pointless thing to do" from pure logic and empirical observation. You and I only believe it because we are humans, and cannot see how such a thing would even slightly satisfy our human goals. If another being has "tile the universe with paper-clips" as its goal, that fact alone is no reflection on its intelligence at all. If it then turned out to be extremely good at tiling the universe with paper-clips (a task which will likely include outsmarting humanity), then I would be happy to call it super-intelligent.

Secondly: "Our local reason of the universe being strongly optimal according to that process's utility function."

That was a typo; I meant to write 'our local region of the universe'. Hope that's clearer now.

comment by timtyler · 2011-06-11T10:49:59.004Z · LW(p) · GW(p)

Are there other advances in computer science that might show up within the next twenty years that would make friendly AI much less interesting?

Well, the general counter is not advances in computer science, but rather the idea that it is unlikely that a rogue agent will become much more powerful than the public players that are integrated into the rest of society.

Today machine intelligence integrates into people's mobile phones and desktop computers. It knows about privacy. It knows when to be unobtrusive. It knows what people like. That process seems likely to continue for quite a while, with machines absorbing human values along the way. It is easier to cooperate with humans than to recycle their atoms, especially when they have direct access to mature, advanced molecular nanotechnology bodies and you don't.

So we will likely initially see a global advance that is integrated into human society and that involves machines absorbing human values, so as to better give humans what they want. That doesn't address longer-term questions - but it does make the scenario of an uncaring machine intelligence arising first and trashing the planet seem relatively unlikely. Instead, superhuman machine intelligence will arise out of subhuman machine intelligence - which will by that point most likely have an extended history of cooperating with humans and respecting their values.

comment by Alexei · 2011-06-11T18:19:15.886Z · LW(p) · GW(p)

The main problem that remains unsolved is that somebody else can still build a fully self-improving UFAI. And that AI, being a lot smarter and more free than your limited AI, will crush it. The best way to prevent that from happening is to have a system that can monitor and understand pretty much everything, and that's pretty much FAI.

Replies from: xxd
comment by xxd · 2011-12-20T18:33:44.999Z · LW(p) · GW(p)

Is it really the case that an unrestricted growing entity will outcompete a restricted growing entity?

i.e. something like Cancer vs the Immune System.

I'm not convinced that the unrestricted growing entity will always win.

Replies from: Alexei
comment by Alexei · 2011-12-21T16:38:55.751Z · LW(p) · GW(p)

Didn't you just answer your own question?

And the problem isn't whether an unrestricted growing entity will always win. One win would be enough to end it all.