Intelligence Amplification and Friendly AI
post by lukeprog · 2013-09-27T01:09:15.978Z · LW · GW · Legacy · 26 comments
Part of the series AI Risk and Opportunity: A Strategic Analysis. Previous articles on this topic: Some Thoughts on Singularity Strategies, Intelligence enhancement as existential risk mitigation, Outline of possible Singularity scenarios that are not completely disastrous.
Below are my quickly-sketched thoughts on intelligence amplification and FAI, without much effort put into organization or clarity, and without many references.[1] But first, I briefly review some strategies for increasing the odds of FAI, one of which is to work on intelligence amplification (IA).
Some possible “best current options” for increasing the odds of FAI
Suppose you find yourself in a pre-AGI world,[2] and you’ve been convinced that the status quo world is unstable, and within the next couple centuries we’ll likely[3] settle into one of four stable outcomes: FAI, uFAI, non-AI extinction, or a sufficiently powerful global government which can prevent AGI development[4]. And you totally prefer the FAI option. What should you do to get there?
- Obvious direct approach: start solving the technical problems that must be solved to get FAI: goal stability under self-modification, decision algorithms that handle counterfactuals and logical uncertainty properly, indirect normativity, and so on. (MIRI’s work, some FHI work.)
- Do strategy research, to potentially identify superior alternatives to the other items on this list, or superior versions of the things on this list already. (FHI’s work, some MIRI work, etc.)
- Accelerate IA technologies, so that smarter humans can tackle FAI. (E.g. cognitive genomics.)
- Try to make sure we get high-fidelity WBEs before AGI, without WBE work first enabling dangerous neuromorphic AGI. (Dalrymple’s work?)
- Improve political and scientific institutions so that the world is more likely to handle AGI wisely when it comes. (Prediction markets? Vannevar Group?)
- Capacity-building. Grow the rationality community, the x-risk reduction community, the effective altruism movement, etc.
- Other stuff. (More in later posts).
The IA route
Below are some key considerations about the IA route. I’ve numbered them so they’re easy to refer to later. My discussion assumes MIRI’s basic assumptions, including timelines similar to my own AGI timelines.
1. Maybe FAI is so hard that we can only get FAI with a large team of IQ 200+ humans, whereas uFAI can be built by a field of IQ 130–170 humans with a few more decades and lots of computing power and trial and error. So to have any chance of FAI at all, we’ve got to do WBE or IA first.
2. You could accelerate FAI relative to AGI if you somehow kept IA technology secret, for use only by FAI researchers (and maybe their supporters).
3. Powerful IA technologies would likely get wide adoption, and accelerate economic growth and scientific progress in general. If you think Earths with slower economic growth have a better chance at FAI, that could be bad for our FAI chances. If you think the opposite, then broad acceleration from IA could be good for FAI.
4. Maybe IA increases one’s “rationality” and “philosophical ability” (in scare quotes because we mostly don’t know how to measure them yet), and thus IA increases the frequency with which people will realize the risks of AGI and do sane things about it.
5. Maybe IA increases the role of intelligence and designer understanding, relative to hardware and accumulated knowledge, in AI development.[5]
Below are my thoughts about all this. These are only my current views: other MIRI personnel (including Eliezer) disagree with some of the points below, and I wouldn’t be surprised to change my mind about some of these things after extended discussion (hopefully in public, on Less Wrong).
I doubt (1) is true. I think IQ 130–170 humans could figure out FAI in 50–150 years if they were trying to solve the right problems, and if FAI development wasn’t in a death race with the strictly easier problem of uFAI. If normal smart humans aren’t capable of building FAI in that timeframe, that’s probably for lack of rationality and philosophical skill, not for lack of IQ. And I’m not confident that rationality and philosophical skill predictably improve with IQ after about IQ 140. It’s a good sign that atheism increases with IQ after IQ 140, but on the other hand I know too many high-IQ people who think that (e.g.) an AI that maximizes K-complexity is a win, and also there’s Stanovich’s research on how IQ and rationality come apart. For these reasons, I’m also not convinced (4) would have a large positive effect on our FAI chances.
Can we train people in rationality and philosophical skill beyond that of say, the 95th percentile Less Wronger? CFAR has plans to find out, but they need to grow a lot first to execute such an ambitious research program.
(2) looks awfully hard, unless we can find a powerful IA technique that also, say, gives you a 10% chance of cancer. Then some EAs devoted to building FAI might just use the technique, and maybe the AI community in general doesn’t.
(5) seems right, though I doubt it’ll be a big enough effect to make a difference for the final outcome.
I think (3) is the dominant consideration here, along with the worry about lacking the philosophical skill (but not IQ) to build FAI at all. At the moment, I (sadly) lean toward the view that slower Earths have a better chance at FAI. (Much of my brain doesn’t know this, though: I remember reading the Summers news with glee, and then remembering that on my current model this was actually bad news for FAI.)
I could say more, but I’ll stop for now and see what comes up in discussion.
[1] My thanks to Justin Shovelain for sending me his old notes on the “IA first” case, and to Wei Dai, Carl Shulman, and Eliezer Yudkowsky for their feedback on this post.
[2] Not counting civilizations that might be simulating our world. This matters, but I won’t analyze that here.
[3] There are other possibilities. For example, there could be a global nuclear war that kills all but about 100,000 people, which could set back social, economic, and technological progress by centuries, thus delaying the crucial point in Earth’s history in which it settles into one of the four stable outcomes.
[4] And perhaps also advanced nanotechnology, intelligence amplification technologies, and whole brain emulation.
[5] Thanks to Carl Shulman for making this point.
26 comments
comment by Yosarian2 · 2013-11-20T23:10:12.517Z · LW(p) · GW(p)
I think this is an overly simplistic and binary way of looking at it:
Maybe FAI is so hard that we can only get FAI with a large team of IQ 200+ humans...I doubt (1) is true. I think IQ 130–170 humans could figure out FAI in 50–150 ...
Whether it's possible or not, IMHO, isn't the only question. A better question may be: "What are the odds that a team of IQ 150 humans thinks they have developed a FAI and are correct, vs. the odds that they think they have developed a FAI and are wrong? Are those odds better or worse for a team of IQ 200+ individuals?"
I think that a group of normal people could probably develop a FAI. But I also think that a group of IA people are more likely to do so correctly on the first try without missing any vital details, given that in practice you may only have one shot at it.
I would also say that if something does go badly wrong, a group of IA people (or people with other augmentations, like brain-computer interfaces) probably have a better shot at figuring it out in time and responding properly (not necessarily good odds, but probably at least better odds). They're also probably less likely to fail at other, related AI-safety tasks: for example, creating an AGI that is designed not to self-improve (at least not until the team is convinced it is friendly), maintaining control over some kind of oracle AI, or avoiding losing control over a narrow AI with significant destructive capability.
Note that I'm not necessarily saying that those are good ideas, but either way, AI risk is probably lowered if IA comes first. Very smart people still may intentionally make uFAI for whatever reason, but at least they're less likely to try to make FAI but mess it up.
comment by Andreas_Giger · 2013-09-27T23:27:44.393Z · LW(p) · GW(p)
Where in the linked article does it say that atheism correlates with IQ past 140? I cannot find this.
comment by lukeprog · 2013-09-28T00:42:45.406Z · LW(p) · GW(p)
That study just says that the most prestigious scientists are even more atheistic than normal scientists. I think for other reasons that the most prestigious scientists have higher average IQ than normal scientists, a large fraction of them higher than 140.
comment by Andreas_Giger · 2013-09-28T00:51:21.206Z · LW(p) · GW(p)
You should probably edit your post then, because it currently suggests an IQ-atheism correlation that just isn't supported by the cited article.
comment by lukeprog · 2013-10-06T04:03:15.716Z · LW(p) · GW(p)
I don't think this will satisfy you or the people who upvoted your comment, but by way of explanation...
The original post opened with a large section emphasizing that this was a quick and dirty analysis, because writing more careful analyses like When Will AI Be Created takes a long time. The whole point was for me to zoom through some considerations without many links or sources or anything. I ended up cutting that part of the post based on feedback.
Anyway, when I was writing and got to the bit about there being some evidence of atheism increasing after IQ 140, I knew a quick way to link to some of my evidence for that, but explaining all my reasons for thinking so would have taken many hours, especially tracking down the sources. So I decided providing some of my evidence was better than providing none of my evidence.
Providing none of my evidence is what I did for most of my claims, but you seem to only be complaining about one of the few claims for which I decided to provide some evidence. Oh well.
comment by James_Miller · 2013-09-27T03:30:17.784Z · LW(p) · GW(p)
Eugenics might work for (2) if people interested in FAI become early users of the technology and raise their children to care about FAI. Perhaps MIRI and CFAR should offer to pay for intelligence-enhancing eugenics technologies as an employee benefit when these technologies become available.
comment by lyghtcrye · 2013-09-28T08:25:29.378Z · LW(p) · GW(p)
I have been mulling over a rough and mostly unformed idea regarding AI-first vs. IA-first strategies, but I was loath to try to put it into words until I saw this post and noticed that one of the scenarios I consider highly probable was completely absent.
On the basis that subhuman AGI poses minimal risk to humanity, and that IA raises the level of optimization ability an AI needs in order to count as human-level or above, there seems to be a substantial probability that an IA-first strategy leads to a scenario in which no superhuman AGI is ever developed, because researching that field is economically infeasible compared with harvesting the accelerating returns from IA creation and implementation. Development of AI, whether friendly or not, would certainly occur at a faster pace, but if IA proves to simply be easier than AI (which, given our poor ability to estimate the difficulty of both approaches, may be true), development in that field would continue to outpace it. It could certainly instigate either a fast or slow takeoff event from our current perspective, but from the perspective of enhanced humans it would simply be an extension of existing trends.
A similar argument could be made regarding Hanson's WBE-based scenarios: given the ability to store a mind on some hardware system, it would be more economically efficient to emulate that mind at a faster pace than to run multiple copies of it in parallel on the same hardware, and hardware design would likewise trend toward rapid emulation of single workers rather than multiple instances, reducing the costs of redundancy and increasing the efficiency gains that come with experience. This would imply that enhancement of a few high-efficiency minds would occur much earlier, and that exceptional numbers of emulated workers would be unlikely to be created; rather, a few high-value workers would occupy a large majority of the relevant hardware very soon after the creation of such technology.
An IA field advancing faster than AI does of course present its own problems, and I'm not trying to endorse an IA-first approach with my ramblings. I suppose I'm simply trying to express the belief that discussion of IA as an alternative to AI, rather than as an instrument toward AI, is rather lacking in this forum, and I find myself confused as to why.
comment by [deleted] · 2013-09-28T08:47:59.547Z · LW(p) · GW(p)
"Superhuman AI" as the term is generally used is a fixed reference standard, i.e. your average rationalist computer scientist circa 2013. This particular definition has meaning because if we posit that human beings are able to create an AGI, then a first generation superhuman AGI would be able to understand and modify its own source code, thereby starting the FOOM process. If human beings are not smart enough to write an AGI then this is a moot point. But if we are, then we can be sure that once that self-modifying AGI also reaches human-level capability, it will quickly surpass us in a singularity event.
So whether IA advances humans faster or slower than AGI is a rather uninteresting point. All that matters is whether a self-modifying AGI is more capable than its creators at the time of its inception.
As to your very last point, it is probably because the timescales for AI are much closer than those for IA. AI is basically a solvable software problem, and there are many supercompute clusters in the world that are probably capable of running a superhuman AGI at real-time speeds, if such software existed. Significant IA, on the other hand, requires fundamental breakthroughs in hardware...
comment by lyghtcrye · 2013-09-28T10:10:50.977Z · LW(p) · GW(p)
I seem to have explained myself poorly. You are effectively restating the commonly held (on LessWrong) views that I was attempting to originally address, so I will try to be more clear.
I don't understand why you would use a particular fixed standard for "human level". It seems arbitrary, and it would be more sensible to use the level of humans at the time a given AGI is developed. You yourself say as much in your second paragraph ("more capable than its creators at the time of its inception"). Since the rate of IA determines the capabilities of the AI's creators, a faster rate of IA than AI would mean that a more-capable-than-creators AGI never occurs.
If a self-modifying AGI is less capable than its creators at the time of its inception, then it will be unable to FOOM, from the perspective of its creators, both because they would be able to develop a better AI in a shorter time than an AI could improve itself, and because if they were developing IA at a greater pace they would advance faster than the AGI that they had developed. Given the same intelligence and rate of work, an easier problem will see more progress. Therefore, if IA is given equal or greater rate of work than AI, and it happens to be an easier problem, then humans would FOOM before AI did. A FOOM doesn't feel like a FOOM from the perspective of the one experiencing it though.
Your final point makes sense, in that it addresses the possibility that the first fast takeoff is more likely to occur in the AI field than the IA field, or that AI is an easier problem. I fail to see why a software problem is inherently easier than a biology or engineering problem, though. A fundamental breakthrough in software is just as unlikely as one in hardware, and there are more paths to success currently being pursued for IA than for AI, only one of which is a man-machine interface.
I considered being a bit snarky and posting each of your statements as its direct opposite (i.e., all that matters is whether a self-modifying human becomes more capable than an AI at the time of its augmentation), but I feel like that would convey the wrong message. The dismissive response genuinely confuses me, but I'm assuming that my poor organization has made my point too vague.
comment by [deleted] · 2013-09-28T17:34:11.161Z · LW(p) · GW(p)
It's not an arbitrary reference point. For a singularity/AI-goes-FOOM event to occur, the AI needs sufficient intelligence and capability to modify itself in a recursive self-improvement process. A chimpanzee is not smart enough to do this. We've posited that at least some human beings are capable of creating a more powerful intelligence, either through AGI or IA. Therefore the important cutoff where a FOOM event becomes possible is somewhere in between those two reference levels (the chimpanzee and the circa-2013 rationalist AGI/IA researcher).
Despite my careless phrasing, this isn't some floating standard that depends on circumstances (having to be smarter than your creators). An AGI or IA simply has to meet some objective minimum level of rationality and technological capability to start the recursive self-improvement process. The problem is that our understanding of the nature of intelligence is not developed enough to predict where that hard cutoff is, so we're resorting to qualitative judgements. We think we are capable of starting a singularity event through either AGI or IA. Therefore anything smarter than we are (“superhuman”) would be equally capable. This is a sufficient, but not necessary, requirement: making humans smarter through IA doesn't mean that an AGI suddenly has to be that much smarter to start its own recursive self-improvement cycle.
My point about software was that an AGI FOOM could happen today. There are datacenters at Google and research supercomputers that are powerful enough to run a recursively improving “artificial scientist” AGI. But IA technology to the level of being able to go super-critical basically requires molecular nanotechnology or equivalently powerful technology (to replace neurons) and/or mind uploading. You won't get an IA FOOM until you can remove the limitations of biological wetware, but these technologies are at best multiple decades away.
comment by Viliam_Bur · 2013-09-27T10:53:15.417Z · LW(p) · GW(p)
Increased intelligence does not mean [EDIT: does not automatically translate to] increased rationality... but still, with proper education, more intelligent people could become rational more quickly.
If we invent a pill for increasing everyone's IQ by 50, distributing the pill to the world will not make it more rational. But if CFAR develops a curriculum for making people more rational, those who already took the pill will be more successful students.
(In the opposite direction: If a mad scientist releases a virus that will lower everyone's IQ to 80, some specific forms of irrationality may disappear... but CFAR's or MIRI's missions will become completely hopeless.)
comment by Lumifer · 2013-09-27T16:29:55.851Z · LW(p) · GW(p)
Increased intelligence does not mean increased rationality
Yes, it does. It's not a one-to-one (or linear) correspondence, but it's really hard to be rational if you're stupid.
comment by [deleted] · 2013-09-27T20:53:49.552Z · LW(p) · GW(p)
As a more technical elaboration, rationality is a computational process involving prioritized search. It's a specific software algorithm. “Intelligence augmentation” is not well defined, but generally involves increasing the computational power of an existing human brain, aka making the hardware faster. That says nothing about the software running on it. But it is easy to show that if you increase the computational power available to a rational agent, you get a more rational agent (but increasing the computational power available to a non-rational agent would not magically impart rationality).
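A toy sketch of that last claim (the one-dimensional action space, the utility function, and both agent definitions below are invented purely for illustration, not taken from anything above): an agent that searches for the best action it can find benefits from a larger evaluation budget, while an agent that ignores its evaluations gains nothing from extra compute.

```python
import random

def utility(x):
    """Invented one-dimensional utility landscape; the best action is x = 0.7."""
    return -(x - 0.7) ** 2

def searching_agent(budget, rng):
    """A (boundedly) rational agent: evaluate `budget` random candidate
    actions and act on the best one found. More compute, better choices."""
    candidates = [rng.random() for _ in range(budget)]
    return max(candidates, key=utility)

def non_searching_agent(budget, rng):
    """An agent that never consults its evaluations; extra compute is wasted."""
    _ = [utility(rng.random()) for _ in range(budget)]  # burned cycles
    return rng.random()

def average_utility(agent, budget, trials=20000, seed=0):
    rng = random.Random(seed)
    return sum(utility(agent(budget, rng)) for _ in range(trials)) / trials

if __name__ == "__main__":
    for budget in (1, 10, 100):
        print(budget,
              round(average_utility(searching_agent, budget), 5),
              round(average_utility(non_searching_agent, budget), 5))
    # The searcher's average utility climbs toward 0 as the budget grows;
    # the non-searcher's stays flat near the random baseline.
```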
comment by [deleted] · 2013-10-02T05:01:40.340Z · LW(p) · GW(p)
At the moment, I (sadly) lean toward the view that slower Earths have a better chance at FAI. (Much of my brain doesn’t know this, though: I remember reading the Summers news with glee, and then remembering that on my current model this was actually bad news for FAI.)
Feeling the vertigo, huh? I went through a stage around the 2012 election where I thought growth was bad for somewhat similar reasons. It strongly affected how I voted. But with another year's consideration I admit to simply having no clue whether GDP growth overall is good for the far future. So I'd now say growth is good but that its importance shrinks with how much you care about the far future over the present.
That's uncertain, but I probably won't think much more about it given the likelihood of better future-affecting interventions than influencing growth.
comment by [deleted] · 2013-09-27T15:46:30.836Z · LW(p) · GW(p)
(1) I've butted heads with you on timelines before. We're about a single decade away from AGI, if reasonable and appropriate resources are allocated to such a project. FAI in the sense that MIRI defines the term - provably friendly - may or may not be possible, and just finding that out is likely to take more time than we have left. I'm glad you made this post, because if your estimate for a FAI theory timeline is correct, then MIRI is on the entirely wrong track, and alternatives or hybrid alternatives involving IA need to be considered. This is a discussion which needs to happen, and in public. (Aside: this is why I have not donated, and continue to refuse to donate, to MIRI. You're solving the wrong problem, albeit with the best of intentions, and my money and time are better spent elsewhere.)
(2) Secrecy rarely has the intended outcome, is too easily undone, and is itself a self-destructive battle that would introduce severe risks. Achieving and maintaining operational security is a major entropy-fighting effort which distracts from the project goals, often drives away participants, and amplifies power dynamics among project leaders. That's a potent, and very bad, mix.
(3-5) Seem mostly right, or wrong without negative consequences. I don't think there's any specific reasons in there not to take an IA route.
Take, for example, a uFAI (not actively unfriendly, just MIRI's definition of not-provably-FAI) tasked with the singular goal of augmenting the intelligence of humanity. This would be much safer than the Scary Idea strawman that MIRI usually paints, as it would in practice be engineering its own demise through an explicit goal to create runaway intelligence in humans.
If you were to actually implement this, the goal may need a little clarity:
- Instead of “humanity” you may need to explicitly specify a group of humans (pre-chosen by the community or a committee for their own history of moral action and rational decision-making), as well as constraints that they all advance / are augmented at approximately the same rate.
- The AI should be penalized for any route which results in it being even temporarily smarter than the humans. Presumably there is a tradeoff and an approximation here, since the AI needs to be at least somewhat superhumanly smart in order to start the augmentation process, and needs to continue improving itself in order to come up with even better augmentations. But it should favor plans which require smaller AI/human intelligence differentials.
- To prevent weird outcomes, the utility of future states should be weighted by an exponential decay: the AI should focus on getting existing humans augmented in the near term, and not worry itself over what it thinks outcomes would be millennia from now; that's for the augmented humans to worry about. (A toy sketch combining these last two constraints follows below.)
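For concreteness, here is a minimal sketch of what a scoring rule with those two features (exponential time discounting plus a penalty on any AI-over-human intelligence differential) could look like; the plan representation, parameter values, and example numbers are all hypothetical, not part of any actual proposal.

```python
import math

def plan_score(plan, discount_rate=0.1, differential_penalty=5.0):
    """Hypothetical scoring rule combining the two constraints above.

    `plan` is a list of (t, human_gain, ai_minus_human_gap) tuples, one per
    time step -- a made-up representation chosen only for this sketch.
    """
    score = 0.0
    for t, human_gain, gap in plan:
        score += math.exp(-discount_rate * t) * human_gain  # near-term augmentation counts most
        score -= differential_penalty * max(gap, 0.0)       # punish any AI > human differential
    return score

# Two hypothetical plans: a cautious one that keeps the AI/human gap small,
# and a reckless one that lets the AI race far ahead before augmenting anyone.
cautious = [(1, 5, 2), (2, 10, 2), (3, 15, 1)]
reckless = [(1, 0, 40), (2, 30, 40), (3, 60, 10)]
print(plan_score(cautious) > plan_score(reckless))  # True: the cautious plan wins
```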
And I'm sure there are literally hundreds of other potential problems and small protective tweaks required. I would rather that MIRI spent its time and money working on scenarios like this and formulating the various risks and countermeasures, rather than obsessing over Löbian obstacles (a near-complete waste of time).
This is similar to how things are done in computer security. We have a well understood repertoire of attacks and general countermeasures. Cryptographers then design specific protocols which, through their construction, are not vulnerable to the known attacks, and auditors make sure that implementations are free of side-channel vulnerabilities and the like. How many security systems are provably secure? Very few, and none if you consider that those which have proofs rest on underlying assumptions which are not universally true. Nevertheless the process works, and with each iterative design we move the ball forward towards the end goal of a system that is secure in practice.
I'm not interested in an airtight mathematical proof of an AGI design which, by your own estimate, would take an order of magnitude longer to develop than an unfriendly AGI. Money and time spent towards that are better directed towards other projects. I'd much rather see effort put towards the evaluation of existing designs for self-modifying AGI, such as the GOLEM architecture[1], and accompanying “safe” goal systems implementing hybrid IA like I outlined above, or an AGI nanny architecture, etc.
EDIT: See Wei Dai's post[2] for a similar argument.
EDIT2: If you want to down-vote, that's fine. But please explain why in a reply.
comment by Donald Hobson (donald-hobson) · 2018-08-05T16:54:33.503Z · LW(p) · GW(p)
Let's suppose that nanotechnology capable of recording and manipulating brains at a subneuronal level exists, to such a level that duplicating people is straightforward. Let's also assume that everyone working on this project has the same goal function, and that they aren't too intrinsically concerned about modifying themselves. The problem you are setting this AI is: given a full brain state, modify it to be much smarter but otherwise the same "person". Same person implies the same goal function, the same memories, and the same personality quirks. So it would be strictly easier to tell your AI to make a new "person" that has the same goals, where you don't care whether it has the same memories. Remove a few restrictions about making it psychologically humanoid, and you are asking it to solve Friendly AI; that won't be easy.
If there were a simple drug that made humans FAR smarter while leaving our goal functions intact, the AI could find it. However, given my understanding of the human mind, making large intelligence increases while mangling the goal function seems strictly easier than making large intelligence increases while preserving the goal function. The latter would also seem to require a technical definition of the human goal function, a major component of Friendly AI.
comment by ChrisHallquist · 2013-10-05T04:23:32.623Z · LW(p) · GW(p)
...there could be a global nuclear war that kills all but about 100,000 people...
Did you mean to say 100,000,000?
Edit: D'oh, missed the "but."
comment by lukeprog · 2013-10-06T04:04:16.994Z · LW(p) · GW(p)
No.
comment by ChrisHallquist · 2013-10-08T03:06:39.509Z · LW(p) · GW(p)
See edit above.
comment by joaolkf · 2013-10-02T23:35:55.890Z · LW(p) · GW(p)
Interesting analysis, not so much because it is particularly insightful in itself as it stands, but more because it takes a hard step back in order to get a wider view. I have been intending to investigate another alternative: a soft takeoff through moral enhancement initially solving the value-transference problem. This is not the only reason I decided to study this, but it does seem like a worthy idea to explore. Hopefully I will have some interesting stuff to post here later. I am working on a doctoral thesis proposal about this; I use some material from LessWrong, but, for evil academic reasons, not as often and directly as I would like. It would be nice to have some feedback from LW.
comment by John_Maxwell (John_Maxwell_IV) · 2013-09-29T02:20:00.418Z · LW(p) · GW(p)
(2) looks awfully hard, unless we can find a powerful IA technique that also, say, gives you a 10% chance of cancer.
Edit: if the utility function of EAs working on FAI is really such that they would take a 10% chance of cancer for powerful IA, is it safe to assume that they are taking advantage of the moderate IA that's reportedly possible with existing nootropics, apparently without significant side effects? (Though, on that thread in particular, there's probably a bit of a selection effect where folks who use nootropics long-term are the ones who respond well to them.)
comment by spuckblase · 2013-09-28T19:52:32.626Z · LW(p) · GW(p)
(2) looks awfully hard, unless we can find a powerful IA technique that also, say, gives you a 10% chance of cancer. Then some EAs devoted to building FAI might just use the technique, and maybe the AI community in general doesn’t.
Using early IA techniques is probably risky in most cases. Committed altruists might have a general advantage here.