Intelligence Explosion analysis draft: Why designing digital intelligence gets easier over time

post by lukeprog · 2011-11-22T22:28:03.868Z · LW · GW · Legacy · 27 comments

 

 

Again, I invite your feedback on this snippet from an intelligence explosion analysis that Anna Salamon and I have been working on. This section is less complete than the others; missing text is indicated with brackets: [].

_____

 

Many predictions of human-level digital intelligence have been wrong.1 On the other hand, machines surpass human ability at new tasks with some regularity (Kurzweil 2005). For example, machines recently achieved superiority at visually identifying traffic signs at low resolution (Sermanet and LeCun 2011), diagnosing cardiovascular problems from some types of MRI scan images (Li et al. 2009), and playing Jeopardy! (Markoff 2011). Below, we consider several factors that, considered together, appear to increase the odds that we will develop digital intelligence as the century progresses.

More hardware. For at least four decades, computing power2 has increased exponentially, in accordance with Moore’s law.3 Experts disagree on how much longer Moore’s law will hold (e.g. Mack 2011; Lundstrom 2003), but if it holds for two more decades then we may have enough computing power to emulate human brains by 2029.4 Even if Moore’s law fails to hold, our hardware should become much more powerful in the coming decades.5 More hardware doesn’t by itself give us digital intelligence, but it contributes to the development of digital intelligence in several ways:

  • Powerful hardware may improve performance simply by allowing existing “brute force” solutions to run faster (Moravec 1976).
  • Where such solutions do not yet exist, abundant hardware may give researchers an incentive to develop them quickly.
  • Cheap computing may enable much more extensive experimentation in algorithm design, such as tweaking parameters or using methods like genetic algorithms.
  • Indirectly, computing may enable the production and processing of enormous datasets to improve AI performance (Halevy, Norvig, and Pereira 2009), or result in an expansion of the information technology industry and the number of researchers in the field.6
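To make "two more decades" of Moore’s law concrete, here is a minimal sketch of the compounding involved. The doubling period is an assumption (commonly quoted values range from about 18 months to 3 years), and the function name is hypothetical.

```python
def growth_factor(years: float, doubling_years: float = 2.0) -> float:
    """Multiplicative increase in computing power after `years`,
    assuming power doubles every `doubling_years` (an assumed parameter)."""
    return 2.0 ** (years / doubling_years)


if __name__ == "__main__":
    # Compare a few assumed doubling periods over a 20-year horizon.
    for doubling in (1.5, 2.0, 3.0):
        factor = growth_factor(20, doubling)
        print(f"doubling every {doubling} years: x{factor:,.0f} after 20 years")
```

The result is very sensitive to the assumed doubling period, which is one reason the expert disagreement about Moore’s law matters for any date estimate.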

Massive datasets. The greatest leaps forward in speech recognition and translation software have come not from faster hardware or smarter hand-coded algorithms, but from access to massive data sets of human-transcribed and human-translated words (Halevy, Norvig, and Pereira 2009). [add sentence about how datasets are expected to increase massively, or have been increasing massively and trends are expected to continue] [Possibly a sentence about Watson or usefulness of data for AI]

Better algorithms. Mathematical insights can reduce the computation time of a program by many orders of magnitude without additional hardware. For example, IBM’s Deep Blue played chess at the level of world champion Garry Kasparov in 1997 using about 1.5 trillion instructions per second (TIPS), but a program called Deep Junior did it in 2003 using only 0.015 TIPS. Thus, the power of the chess algorithms increased by a factor of 100 in only six years, or 3.33 orders of magnitude per decade (Richard and Shaw 2004). [add sentence about how this sort of improvement is not uncommon, with citations]
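The per-decade figure can be checked with a few lines of arithmetic. This is a minimal sketch, assuming Deep Blue at 1.5 TIPS and Deep Junior's host at roughly 15 GIPS (0.015 TIPS), which is the pair of figures that yields the stated factor of 100. The function name is hypothetical.

```python
import math


def orders_of_magnitude_per_decade(start_tips: float, end_tips: float, years: float) -> float:
    """Efficiency gain, in powers of ten per decade, when the same playing
    strength is reached with `end_tips` instead of `start_tips`."""
    factor = start_tips / end_tips  # how many times less computation is needed
    return math.log10(factor) * 10.0 / years


if __name__ == "__main__":
    # 1997 (Deep Blue) to 2003 (Deep Junior) is six years.
    rate = orders_of_magnitude_per_decade(1.5, 0.015, 6)
    print(f"{rate:.2f} orders of magnitude per decade")
```

With a tenfold rather than hundredfold reduction, the same function gives about 1.67 orders of magnitude per decade, so the conclusion depends heavily on which throughput figure is used.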

Progress in neuroscience. [neuroscientists have figured out brain algorithms X, Y, and Z that are related to intelligence.] New insights into how the brain achieves human-level intelligence can inform our attempts to build human-level intelligence with silicon (van der Velde 2010; Koene 2011). 

Accelerated science. A growing First World will mean that more researchers at well-funded universities will be available to do research relevant to digital intelligence. The world’s scientific output (in publications) grew by a third from 2002 to 2007 alone, much of this driven by the rapid growth of scientific output in developing nations like China and India (Smith 2011). New tools can accelerate particular fields, just as fMRI accelerated neuroscience in the 1990s. Finally, the effectiveness of scientists themselves can potentially be increased with cognitive enhancement drugs (Sandberg and Bostrom 2009) and brain-computer interfaces that allow direct neural access to large databases (Groß 2009). Better collaboration tools like blogs and Google Scholar are already yielding results (Nielsen 2011).

Automated science. Early attempts at automated science — e.g., using data mining algorithms to make discoveries from existing data (Szalay and Gray 2006), or having a machine with no physics knowledge correctly infer natural laws from motion-tracking data (Schmidt and Lipson 2009) — were limited by the slowest part of the process: the human in the loop. Recently, the first “closed-loop” robot scientist successfully devised its own hypotheses (about yeast genomics), conducted experiments to test those hypotheses, assessed the results, and made novel scientific discoveries, all without human intervention (King et al. 2009). Current closed-loop robot scientists can only work on a narrow set of scientific problems, but future advances may allow for scalable, automated scientific discovery (Sparkes et al. 2010).

Embryo selection for better scientists. At age 8, Terence Tao scored 760 on the math SAT, one of only [2?3?] children ever to do this at such an age; he later went on to [have a lot of impact on math]. Studies of similar kids convince researchers that there is a large “aptitude” component to mathematical achievement, even at the high end.7 How rapidly would mathematics or AI progress if we could create hundreds of thousands of Terence Taos? This is a serious question because the creation of large numbers of exceptional scientists is an engineering project that we know in principle how to do. The plummeting costs of genetic sequencing [expected to go below AMOUNT per genome by SOONYEAR e.g. 2015] will soon make it feasible to compare the characteristics of an entire population of adults with those adults’ full genomes, and, thereby, to unravel the heritable components of intelligence, diligence, and other contributors to scientific achievement. To make large numbers of babies with scientific abilities near the top of the current human range8 would then require only the ability to combine known alleles onto a single genome; procedures that can do this have already been developed for mice. China, at least, appears interested in this prospect.9

It isn’t clear which of these factors will ease progress toward digital intelligence, but it seems likely that — across a broad range of scenarios — some of these inputs will do so.

 

 

____

1 For example, Simon (1965, 96) predicted that “machines will be capable, within twenty years, of doing any work a man can do.”

2 The technical measure predicted by Moore’s law is the density of components on an integrated circuit, but this is closely tied to affordable computing power.

3 For important qualifications, see Nagy et al. (2010); Mack (2011).

4 This calculation depends on the “level of emulation” expected to be necessary for successful whole brain emulation (WBE). Sandberg and Bostrom (2008) report that attendees at a workshop on WBE tended to expect that emulation at the level of the brain’s spiking neural network, perhaps including membrane states and concentrations of metabolites and neurotransmitters, would be required for successful WBE. They estimate that if Moore’s law continues, we will have the computational capacity to emulate a human brain at the level of its spiking neural network by 2019, or at the level of metabolites and neurotransmitters by 2029.

5 Quantum computing may also emerge during this period. Early worries that quantum computing may not be feasible have been overcome, but it is hard to predict whether quantum computing will contribute significantly to the development of digital intelligence because progress in quantum computing depends heavily on unpredictable insights in quantum algorithms (Rieffel and Polak 2011).

6 Shulman and Sandberg (2010).

7 [Benbow etc. on study of exceptional talent; genetics of g; genetics of conscientiousness and openness, pref. w/ any data linking conscientiousness or openness to scientific achievement.  Try to frame in a way that highlights hard work type variables, so as to alienate people less.]

8 [folks with very top scientific achievement likely had lucky circumstances as well as initial gifts (so that, say, new kids with Einstein’s genome would be expected to average perhaps .8 times as exceptional).  However, one could probably identify genomes better than Einstein’s, both because these technologies would let genomes be combined that had unheard of, vastly statistically unlikely amounts of luck, and because e.g. there are likely genomes out there that are substantially better than Einstein (but on folks who had worse environmental luck).]

9 [find source]

 

 

_____

All references, including the ones used above:

 

 

 

 

27 comments

Comments sorted by top scores.

comment by Morendil · 2011-11-22T23:10:53.314Z · LW(p) · GW(p)

For some of these, you can play the "magic wand" game to probe the connections between nodes in your belief network:

  • More hardware - suppose you waved a magic wand just now, and suddenly there were 10 times as many computers around (or they all got 10 times faster and bigger), how do you suppose that would get us closer to digital intelligence?
  • Bigger data - magic wand gives you access to every word ever spoken by a human, magically transcribed; how does that get us closer? From the perspective of AGI, statistical machine translation, no matter how wondrous-looking, is just plain dumb - it does not even pretend to be able to generalize insights.
  • Better algorithms - this should really be "faster algorithms"; by definition "better" is what gets us closer to AGI. But short of a breakthrough in complexity theory, optimized algorithms are just an equivalent of faster hardware. Precisely which algorithms would bring us closer to AI if we could speed them up a lot with the magic wand? I can't really see a quicker sort, or matrix inverse, or even a faster traveling salesman (if that was the only algorithm in that class we knew to speed up).
Replies from: Logos01
comment by Logos01 · 2011-11-23T09:45:21.296Z · LW(p) · GW(p)

More hardware [...] how do you suppose would that get us closer to digital intelligence?

IF Minsky's "Society of Mind" is near to accurate, then if we had enough separate "narrow" agents operating, we could solve all problems that could be encountered -- call this the "Eusocial Generalization" approach. That is, rather than actually solving the problem of general intelligence, just make programs that solve every last problem we can think of, individually -- and then run them all at once.

Horridly inefficient, but if we had magically infinite computational power available we could at least implement it.

As to the "bigger data" -- an element can be part of the solution without being capable of providing the entire solution. Highly rigorous relational databases allow pattern-matching algorithms to at least perform superior analysis.

comment by steven0461 · 2011-11-23T01:01:26.631Z · LW(p) · GW(p)

folks with very top scientific achievement likely had lucky circumstances as well as initial gifts (so that, say, new kids with Einstein’s genome would be expected to average perhaps .8 times as exceptional)

.8 sounds like a lot, though it depends on what ".8 times as exceptional" means.

comment by cousin_it · 2011-11-22T23:34:19.463Z · LW(p) · GW(p)

For example, IBM’s Deep Blue played chess at the level of world champion Garry Kasparov in 1997 using about 1.5 trillion instructions per second (TIPS), but a program called Deep Junior did it in 2003 using only 0.15 TIPS. Thus, the power of the chess algorithms increased by a factor of 100 in only six years

I'm seeing a factor of 10...

Replies from: arundelo
comment by arundelo · 2011-11-23T01:31:10.695Z · LW(p) · GW(p)

Yep. The source says:

Moravec claims [Deep Blue] to be equivalent to a general-purpose processor having throughput on the order of 1-3 trillion instructions per second (TIPS). [...] The host computer [that Deep Blue's successor Deep Junior ran on] was capable of a peak throughput of approximately 15 billion instructions per second (GIPS).

If we consider the Deep Blue machine to be a 1.5 TIPS machine for arithmetic convenience [....]

So "only 0.15 TIPS" should have been "only 0.015 TIPS".

comment by Kaj_Sotala · 2011-11-23T07:41:19.364Z · LW(p) · GW(p)

This calculation depends on the “level of emulation” expected to be necessary for successful WBE. Sandberg and Bostrom (2008) report that attendees to a workshop on WBE tended to expect that emulation at the level of the brain’s spiking neural network, perhaps including membrane states and concentrations of metabolites and neurotransmitters, would be required for successful WBE. They estimate that if Moore’s law continues, we will have the computational capacity to emulate a human brain at the level of its spiking neural network by 2019, or at the level of metabolites and neurotransmitters by 2029.

The roadmap does estimate that we could do a spiking neural network emulation in 2019, but the target dates for the more detailed levels of emulation come later: 2033 for the electrophysiology level, 2044 for the metabolome level. The 2029 estimate is right if you only look at the demands for memory (on page 79), but the demands are higher for processing power (on page 80).

comment by shminux · 2011-11-23T17:06:26.597Z · LW(p) · GW(p)

Embryo selection for better scientists.

Breeding for a single ability tends to be detrimental to others. E.g. fastest breeds of dogs/horses are often stupid and/or sickly. I would hate to see an army of geniuses who lack some essential qualities like compassion.

Replies from: Prismattic
comment by Prismattic · 2011-11-24T00:12:00.791Z · LW(p) · GW(p)

If people get bored of arguing about torture v. specks, for variety one could substitute the hypothetical of creating a human whose superintelligence would benefit the rest of the species in a way similar to an FAI, but at a cost of an existence that was extraordinarily miserable in some way on a personal level.

comment by Kyre · 2011-11-25T04:58:26.934Z · LW(p) · GW(p)

Under "Progress in Neuroscience", is this the sort of thing you are referring to?

Buesing L, Bill J, Nessler B, Maass W (2011) Neural Dynamics as Sampling: A Model for Stochastic Computation in Recurrent Networks of Spiking Neurons. PLoS Comput Biol 7(11): e1002211. doi:10.1371/journal.pcbi.1002211

Replies from: lukeprog
comment by lukeprog · 2011-12-03T22:35:30.574Z · LW(p) · GW(p)

Great paper; thanks!

comment by multifoliaterose · 2011-11-23T02:37:25.004Z · LW(p) · GW(p)

Embryo selection for better scientists. At age 8, Terrence Tao scored 760 on the math SAT, one of only [2?3?] children ever to do this at such an age; he later went on to [have a lot of impact on math]. Studies of similar kids convince researchers that there is a large “aptitude” component to mathematical achievement, even at the high end.7 How rapidly would mathematics or AI progress if we could create hundreds of thousands of Terrence Tao’s?

Though I think I agree with the general point that you're trying to make here (that there's a large "aptitude" component to the skills relevant to AI research and that embryo selection technology could massively increase the number of people who have high aptitude), I don't think that it's so easy to argue:

(a) The math that Terence Tao does is arguably quite remote from AI research.

(b) More broadly, the relevance of mathematical skills to AI research skills is not clear cut.

(c) The SAT tests mathematical aptitude only very obliquely.

(d) Correlation is not causation; my own guess is that high mathematical aptitude as measured by conventional metrics (e.g. mathematical olympiads) is usually necessary but seldom sufficient for the highest levels of success as a mathematical researcher.

(e) Terence Tao is a single example

7 [Benbow etc. on study of exceptional talent; genetics of g; genetics of conscientiousness and openness, pref. w/ any data linking conscientiousness or openness to scientific achievement. Try to frame in a way that highlights hard work type variables, so as to alienate people less.]

Is there really high quality empirical data here? I vaguely remember Carl referencing a study about people at the one in ten thousand level of IQ having more success becoming professors than others, but my impression is that there's not much research in the way of the genetics of high achieving scientists.

For what it's worth I think that the main relevant variable here is a tendency (almost involuntary) to work in a highly focused way for protracted amounts of time. This seems to me much more likely to be the limiting factor than g.

I think that one would be on more solid footing both rhetorically and factually just saying something like "capacity for scientific achievement appears to have a large genetic component and it may be possible to select for genes relevant to high scientific achievement by studying the genes of high achieving scientists."

comment by [deleted] · 2011-11-23T00:31:08.060Z · LW(p) · GW(p)

Studies of similar kids convince researchers that there is a large “aptitude” component to mathematical achievement, even at the high end.

This sentence seems strange, because aptitude isn't the best choice of word. I know you are trying not to alienate people, but really if someone doesn't accept the existence of innate intelligence with genetic causes, they shouldn't be part of your target audience for this stuff. I would just go ahead and say "there is a large genetically determined component to mathematical achievement".

Also, "even at the high end" should be replaced with "even amongst the most capable individuals", or otherwise changed to be more precise.

This is a serious question because the creation of large numbers of exceptional scientists is an engineering project that we know in principle how to do.

This sentence seems awkward to me.

Perhaps: How rapidly would mathematics or AI progress if we could create hundreds of thousands of Terrence Tao’s? This is not an idle question, because in principle the creation of large numbers of exceptional scientists is a feasible engineering project.

Finally, it seems strange to me that your list of references is longer than the actual excerpt. Is anyone actually going to look at those? It reminds me of Jaynes in PT:TLoS criticising a phd student giving a presentation who spent all of his allotted time setting out his mathematical definitions and being outstandingly rigorous, and never got round to actually demonstrating his findings.

You also have a lot of references in parentheses, which make the piece frustrating to read - hopefully you'll use little numbers instead.

Replies from: Kaj_Sotala
comment by Kaj_Sotala · 2011-11-23T07:07:04.853Z · LW(p) · GW(p)

Finally, it seems strange to me that your list of references is longer than the actual excerpt.

The list of references seems to also include the references for the previous (and perhaps also the following?) excerpts.

You also have a lot of references in parentheses, which make the piece frustrating to read - hopefully you'll use little numbers instead.

This will depend entirely on the guidelines of the publication where they'll submit this, but I'll regardless note that I prefer parentheses in the text.

comment by JoshuaZ · 2011-11-25T19:13:35.607Z · LW(p) · GW(p)

Better algorithms. Mathematical insights can reduce the computation time of a program by many orders of magnitude without additional hardware. For example, IBM’s Deep Blue played chess at the level of world champion Garry Kasparov in 1997 using about 1.5 trillion instructions per second (TIPS), but a program called Deep Junior did it in 2003 using only 0.15 TIPS. Thus, the power of the chess algorithms increased by a factor of 100 in only six years, or 3.33 orders of magnitude per decade (Richard and Shaw 2004). [add sentence about how this sort of improvement is not uncommon, with citations]

One good example is linear programming and related algorithms. Kaj discussed this earlier here (pdf):

In the past, improvements in algorithms have sometimes been even more important than improvements in hardware. The President's Council of Advisors on Science and Technology [2010] mentions that performance on a benchmark production planning model improved by a factor of 43 million between 1988 and 2003. Out of the improvement, a factor of roughly 1,000 was due to better hardware and a factor of roughly 43,000 was due to improvements in algorithms. Also mentioned is an algorithmic improvement of roughly 30,000 for mixed integer programming between 1991 and 2008.
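As a rough check on how those two factors combine, here is a small sketch. The function name is made up for illustration, and measuring each factor's "share" on a log scale is just one assumed way to apportion a multiplicative gain.

```python
import math


def split_speedup(total: float, hardware: float) -> tuple[float, float, float]:
    """Return (algorithm factor, hardware share, algorithm share).

    Shares are measured on a log scale, since the two factors multiply.
    """
    algorithms = total / hardware
    hardware_share = math.log10(hardware) / math.log10(total)
    return algorithms, hardware_share, 1.0 - hardware_share


if __name__ == "__main__":
    # Figures quoted above: 43 million overall, roughly 1,000 from hardware.
    algorithms, hw_share, algo_share = split_speedup(43_000_000, 1_000)
    print(f"algorithm factor: {algorithms:,.0f}")
    print(f"share of log-speedup: hardware {hw_share:.0%}, algorithms {algo_share:.0%}")
```

On these figures the algorithmic factor works out to 43,000, and algorithms account for somewhat more than half of the improvement when measured in orders of magnitude.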

comment by Richard_Kennaway · 2011-11-24T13:27:24.152Z · LW(p) · GW(p)

This is a serious question because the creation of large numbers of exceptional scientists is an engineering project that we know in principle how to do. The plummeting costs of genetic sequencing [expected to go below AMOUNT per genome by SOONYEAR e.g. 2015] will soon make it feasible to compare the characteristics of an entire population of adults with those adults’ full genomes, and, thereby, to unravel the heritable components of intelligence, dilligence, and other contributors to scientific achievement.

Way too optimistic. The plummeting costs of genetic sequencing have already made available the full genomes for individuals of many organisms, including humans. However, the results derived from the Human Genome Project, at least as summarised here, are rather underwhelming, as far as engineering projects are concerned. What you are proposing is not an engineering project, it is basic research -- that is, no-one knows what the results are going to be until they find them.

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2011-11-25T23:12:31.943Z · LW(p) · GW(p)

The blocking point on that is monumentally massive replication. This has not yet happened.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2011-11-27T10:43:09.995Z · LW(p) · GW(p)

That's one obstacle, of course, but I'm going with the original supposition of cheap and fast readout of whole genomes being available. If it was, what research proposal would you write? What questions would you expect to be able to answer?

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2012-04-08T21:21:46.185Z · LW(p) · GW(p)

You can 'go with the original supposition of cheap and fast readout of whole genomes being available', but in that case the counterargument is malformed - it's way cheaper than it was, but still way too expensive for monumentally massive replication, so failure to have done so is still expected.

So, what can you do once you have it super-cheap? The main thing to do is to run huge association fishing-expedition studies, with the enormous numbers being sufficient to make up for the huge numbers of hypotheses being tested, which then lead into studies to determine the nature of the association. The HGP tested what? A few dozen people? That's not going to be statistically significant for just about anything.

When the genome gets cheap enough that it's insignificant compared to the other costs, then it changes the cost analysis for ordinary experiment design in two ways. First, you can add genomic data to existing experiments just to clarify the controls. Secondly, in genomic experiments, it enables you to expand your cohort. This in turn shifts cost-saving focus to the other per-person elements. An experiment could take, say, the full genome, an online IQ test, and several proxies for intelligence, and sample many people, rather than do multiple batteries of IQ tests conducted in person. If a genome costs $5, you can afford to have a cohort that will make the experiment worth something. If a genome costs $1k, you're not going to be able to afford the massive replication, no matter how cheap you make the profiling. Even if you maintain your profiling standards, saving that much money will let you expand your cohort.
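A rough sketch of that trade-off, with a hypothetical budget of $1,000,000 and a hypothetical $20 per person for the non-genomic profiling (neither number comes from any study; only the $5 and $1k genome prices are from the paragraph above):

```python
def cohort_size(budget: int, genome_cost: int, profiling_cost: int) -> int:
    """Number of participants a fixed budget covers when each one needs
    a full genome plus some per-person profiling (e.g. an online IQ test)."""
    per_person = genome_cost + profiling_cost
    return budget // per_person


if __name__ == "__main__":
    BUDGET = 1_000_000   # hypothetical study budget, in dollars
    PROFILING = 20       # hypothetical per-person profiling cost, in dollars
    for genome_cost in (5, 1_000):
        n = cohort_size(BUDGET, genome_cost, PROFILING)
        print(f"genome at ${genome_cost}: cohort of {n:,}")
```

Under these assumed numbers the cheap genome buys a cohort about forty times larger, and at $1k per genome the profiling cost barely matters.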

comment by timtyler · 2011-11-23T19:20:45.466Z · LW(p) · GW(p)

They estimate that if Moore’s law continues, we will have the computational capacity to emulate a human brain at the level of its spiking neural network by 2019, or at the level of metabolites and neurotransmitters by 2029.

...for one million dollars. Note that that would not be cost-competitive with humans.

Replies from: TheOtherDave
comment by TheOtherDave · 2011-11-23T19:55:13.129Z · LW(p) · GW(p)

Typical humans, no.

That said, there are individuals that corporations pay more than a million dollars to rent the time of. If we assume that decision is cost-effective (which is a big "if"), getting to own those individuals outright for a million dollars might be a bargain.

Replies from: timtyler
comment by timtyler · 2011-11-23T20:54:50.203Z · LW(p) · GW(p)

Here, we risk crossing over from the realm of wondering "how much computer power it would take" into the bizarre fantasy realm - where emulations actually happen before engineered machine intelligence does.

Replies from: TheOtherDave
comment by TheOtherDave · 2011-11-23T22:42:55.109Z · LW(p) · GW(p)

Agreed. OTOH, to my mind we'd already made that crossover earlier in the discussion, as well... once we have engineered human-level machine intelligences in the mix, all assumptions about how much anything costs are just pretty-sounding numbers, so to talk about emulations costing a million dollars (or any other particular number) already presumes that we don't have engineered human-level machine intelligences yet.

comment by lessdazed · 2011-11-23T05:51:51.557Z · LW(p) · GW(p)

Massive datasets.

Using “captchas” to digitize books

A growing First World will mean

E.g. Romania joins the first world, or Germany grows its economy?

However, one could probably identify genomes better than Einstein’s, both because these technologies would let genomes be combined that had unheard of, vastly statistically unlikely amounts of luck, and because e.g. there are likely genomes out there that are substantially better than Einstein (but on folks who had worse environmental luck).

We have every reason to believe that humans' maximum potential intelligence is greater than any achieved by anyone in the past, be it Archimedes, Einstein, or anyone else.

Because academic achievement is the result of many biological, environmental, social, and other factors, those we can identify as having had the most outstandingly productive and creative minds were likely not as smart as some members of the vast majority of humanity that had no opportunity to noticeably intellectually distinguish itself.

Much more importantly, the vast majority of possible combinations of human alleles have never been combined [some preposition or phrase] a person. As scientists learn more and more about different combinations of alleles and their effects on intelligence, diligence, and other traits, they will be able to make combinations of alleles that would have been extremely unlikely to have ever occurred in nature.

comment by timtyler · 2011-11-23T00:22:03.710Z · LW(p) · GW(p)

Perhaps consider breaking "Accelerated science" down into: more scientists, more technologists, more programmers, and better programming tools. Also, consider scratching embryo selection as being too low down on the list.

comment by Morendil · 2011-11-22T23:00:05.793Z · LW(p) · GW(p)

because these technologies would let genomes be combined that had unheard of, vastly statistically unlikely amounts of luck

This sounds like you mean a Teela Brown gene. Do you actually mean that? If so, that's kind of crazy. If not, rephrase?

Replies from: Normal_Anomaly, TheOtherDave
comment by Normal_Anomaly · 2011-11-23T03:52:40.586Z · LW(p) · GW(p)

I think ey means something like this:

because these technologies would let genomes be combined that would have taken unheard of amounts of luck to arise naturally at any significant frequency.

comment by TheOtherDave · 2011-11-23T03:51:25.914Z · LW(p) · GW(p)

I understood this to mean "would let genomes be [deliberately] combined that [would require] unheard of, vastly statistically unlikely amounts of luck [to occur naturally]"