New report: Intelligence Explosion Microeconomics
post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-29T23:14:58.463Z · LW · GW · Legacy · 246 comments
Summary: Intelligence Explosion Microeconomics (pdf) is 40,000 words taking some initial steps toward tackling the key quantitative issue in the intelligence explosion, "reinvestable returns on cognitive investments": what kind of returns can you get from an investment in cognition, can you reinvest it to make yourself even smarter, and does this process die out or blow up? This can be thought of as the compact and hopefully more coherent successor to the AI Foom Debate of a few years back.
(Sample idea you haven't heard before: The increase in hominid brain size over evolutionary time should be interpreted as evidence about increasing marginal fitness returns on brain size, presumably due to improved brain wiring algorithms; not as direct evidence about an intelligence scaling factor from brain size.)
I hope that the open problems posed therein inspire further work by economists or economically literate modelers, interested specifically in the intelligence explosion qua cognitive intelligence rather than non-cognitive 'technological acceleration'. MIRI has an intended-to-be-small-and-technical mailing list for such discussion. In case it's not clear from context, I (Yudkowsky) am the author of the paper.
Abstract:
I. J. Good's thesis of the 'intelligence explosion' is that a sufficiently advanced machine intelligence could build a smarter version of itself, which could in turn build an even smarter version of itself, and that this process could continue enough to vastly exceed human intelligence. As Sandberg (2010) correctly notes, there are several attempts to lay down return-on-investment formulas intended to represent sharp speedups in economic or technological growth, but very little attempt has been made to deal formally with I. J. Good's intelligence explosion thesis as such.
I identify the key issue as returns on cognitive reinvestment - the ability to invest more computing power, faster computers, or improved cognitive algorithms to yield cognitive labor which produces larger brains, faster brains, or better mind designs. There are many phenomena in the world which have been argued as evidentially relevant to this question, from the observed course of hominid evolution, to Moore's Law, to the competence over time of machine chess-playing systems, and many more. I go into some depth on the sort of debates which then arise on how to interpret such evidence. I propose that the next step forward in analyzing positions on the intelligence explosion would be to formalize return-on-investment curves, so that each stance can say formally which possible microfoundations they hold to be falsified by historical observations already made. More generally, I pose multiple open questions of 'returns on cognitive reinvestment' or 'intelligence explosion microeconomics'. Although such questions have received little attention thus far, they seem highly relevant to policy choices affecting the outcomes for Earth-originating intelligent life.
The dedicated mailing list will be small and restricted to technical discussants.
This topic was originally intended to be a sequence in Open Problems in Friendly AI, but further work produced something compacted beyond where it could be easily broken up into subposts.
Outline of contents:
1: Introduces the basic questions and the key quantitative issue of sustained reinvestable returns on cognitive investments.
2: Discusses the basic language for talking about the intelligence explosion, and argues that we should pursue this project by looking for underlying microfoundations, not by pursuing analogies to allegedly similar historical events.
3: Goes into detail on what I see as the main arguments for a fast intelligence explosion, constituting the bulk of the paper with the following subsections:
- 3.1: What the fossil record actually tells us about returns on brain size, given that most of the difference between Homo sapiens and Australopithecus was probably improved software.
- 3.2: How to divide credit for the human-chimpanzee performance gap between "humans are individually smarter than chimpanzees" and "the hominid transition involved a one-time qualitative gain from being able to accumulate knowledge".
- 3.3: How returns on speed (serial causal depth) contrast with returns from parallelism; how faster thought seems to contrast with more thought. Whether sensing and manipulating technologies are likely to present a bottleneck for faster thinkers, or how large of a bottleneck.
- 3.4: How human populations seem to scale in problem-solving power; some reasons to believe that we scale inefficiently enough for it to be puzzling. Garry Kasparov's chess match vs. The World, which Kasparov won.
- 3.5: Some inefficiencies that might cumulate in an estimate of humanity's net computational efficiency on a cognitive problem.
- 3.6: What the anthropological record actually tells us about cognitive returns on cumulative selection pressure, given that selection pressures were probably increasing over the course of hominid history. How the observed history would be expected to look different, if there were in fact diminishing returns on cognition.
- 3.7: How to relate the curves for evolutionary difficulty, human-engineering difficulty, and AI-engineering difficulty, considering that they are almost certainly different.
- 3.8: Correcting for anthropic bias in trying to estimate the intrinsic 'difficulty' of hominid-level intelligence just from observing that intelligence evolved here on Earth.
- 3.9: The question of whether to expect a 'local' (one-project) FOOM or 'global' (whole economy) FOOM and how returns on cognitive reinvestment interact with that.
- 3.10: The great open uncertainty about the minimal conditions for starting a FOOM; why I. J. Good's postulate of starting from 'ultraintelligence' is probably much too strong (sufficient, but very far above what is necessary).
- 3.11: The enhanced probability of unknown unknowns in the scenario, since a smarter-than-human intelligence will selectively seek out and exploit flaws or gaps in our current knowledge.
4: A tentative methodology for formalizing theories of the intelligence explosion - a project of formalizing possible microfoundations and explicitly stating their alleged relation to historical experience, such that some possibilities can allegedly be falsified.
5: Which open sub-questions seem both high-value and possibly answerable.
6: Formally poses the Open Problem and mentions what it would take for MIRI itself to directly fund further work in this field.
Comments sorted by top scores.
comment by RolfAndreassen · 2013-05-03T16:50:40.965Z · LW(p) · GW(p)
Agreeing with several other people that the introduction needs a major rewrite or possibly just a cut. Consider the opening sentence:
Isadore Jacob Gudak, who anglicized his name to Irving John Good and used I. J. Good for publication
Dude, no. Who gives a toss how he anglicised his name? Get to your point, if you have one.
Somewhat similarly, in the fourth paragraph, you have
Please note that...
Please note that the phrase "please note that" is unnecessary; it adds length and the impression that you are snippily correcting someone's blog comment, without adding any information (or politeness) to the sentence. I'm familiar with your argument about formal writing just adding a feeling of authority, but this isn't informality, it's sloppy editing.
Your whole first page, actually, is a pretty good demonstration of not having a point. I get the impression that you thought "Hmm, I need some kind of introduction" and went off to talk about something, anything, that wasn't the actual point of the paper, because the point belongs in the body and not the introduction. This makes for a page that adds nothing. You have a much better introduction starting with the paragraph at the end of the first page, the one that opens
The question of what happens when smarter-than-human agencies
See, this is getting to the point. You can do it! This is where you should start the paper.
At an absolute, utter minimum, move the subclause about how what's-his-name anglicised the Hungarian into a footnote, or the bibliography, or a biographical appendix, or a Wikipedia article, or for dog's sake the Author's Notes to the next HPMOR chapter, or a random thought that Harry has and then wonders why he is considering such a total irrelevancy. Just please, anywhere, anywhere except the opening sentence of your paper. Talk about burying the lede.
Additionally, your abstract is too long. The abstract should not explain anything; it should summarise the argument or result on the assumption that the reader is already familiar with the subject. You're trying to write your introduction in the abstract; a common error, but an error. A single paragraph is sufficient; going off the first page is way too long. Disclaimer: The above applies to physics abstracts; conceivably philosophy has a different set of conventions.
Finally: If in fact you actually want this kind of feedback, you can make it much easier on your beta readers by adding line numbers to the paper. LaTeX makes this easy. This avoids such circumlocutions as "The fourth paragraph, the one beginning...", with the attendant confusion about whether I'm counting the block quote as a separate paragraph.
Replies from: Kawoomba
↑ comment by Kawoomba · 2013-05-03T17:09:48.652Z · LW(p) · GW(p)
At an absolute, utter minimum, move the subclause about how what's-his-name anglicised the Hungarian into a footnote, or the bibliography, or a biographical appendix, or a Wikipedia article, or for dog's sake the Author's Notes to the next HPMOR chapter, or a random thought that Harry has and then wonders why he is considering such a total irrelevancy.
Or ... what about I write it in a comment, then ... I put it all in the smallest font. Then I edit the comment not to include it anymore, then I retract the comment and then a mod deletes the comment?
Or are you gonna spank me too, then. :-o
Replies from: RolfAndreassen
↑ comment by RolfAndreassen · 2013-05-03T18:53:41.290Z · LW(p) · GW(p)
If you want a spanking all you have to do is ask. No need for elaborate bratting. :)
comment by gjm · 2013-04-30T11:41:15.643Z · LW(p) · GW(p)
Superficial stylistic remarks (as you'll see, I've only looked at about the first 1/4 of the paper):
The paper repeatedly uses the word "agency" where "agent" would seem more appropriate.
I agree with paper-machine that the mini-biography of I J Good has little value here.
The remark in section 1 about MIRI being funding-limited is out of place and looks like a whine or a plea for more money. Just take it out.
"albeit" on page 10, shortly before footnote 8, should just be "but". (Or maybe "even though", if that's your meaning.) [EDITED to add: there's another "albeit" that reads oddly to me, in footnote 66 on page 50. It's not wrong, but it feels odd. Roughly, wherever you can correctly write "albeit" you can equivalently write "even though", and that's a funny thing to be starting a footnote with.]
"criteria" in footnote 11 about paperclip maximizers should be "criterion".
In footnote 15 (about "g") the word "entrants" seems very weirdly chosen, and the footnote seems to define g as the observed correlation between different measures of intelligence, which is nonsense.
The premise of the paper is that whether or not intelligence explosion will occur is (or at least is being pretended to be) an open question. But at many points within the paper there are references to "the intelligence explosion" that seem to presuppose that there will in fact be one.
Footnote 23 puts "Wikipedia" in italics, which to my eyes looks very strange.
[EDITED to incorporate a comment I made separately, which I'll now retract.]
Replies from: Eliezer_Yudkowsky
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-04T01:26:33.898Z · LW(p) · GW(p)
I agree with paper-machine that the mini-biography of I J Good has little value here.
Done.
The remark in section 1 about MIRI being funding-limited is out of place and looks like a whine or a plea for more money. Just take it out.
Done.
comment by mwengler · 2013-04-29T15:26:43.861Z · LW(p) · GW(p)
Just a thought on chess playing. Rather than looking at an extreme like Kasparov vs the world, it would be interesting to me to have teams of two, three, and four players of well-known individual ranking. These teams could then play many games against individuals and against each other. The effective ranking of the teams could be determined from their results. In this way, some sense of "how much smarter" a team is than the individual members could be determined. Ideally, the team would not be ranked until it had had significant experience playing as a team. We are interested in what a team could accomplish, and no strong reason to think it would take less time to optimize a team than to optimize an individual.
Along the same lines, teams could be developed to take IQ and other GI correlated tests to see how much smarter a few people together are than a single human. Would the results have implications for optimal AI design?
Replies from: Eliezer_Yudkowsky, ThrustVectoring, rhollerith_dot_com
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-29T16:25:10.756Z · LW(p) · GW(p)
I think that teams of up to five people can scale "pretty well by human standards" - not too far from linearly. It's going up to a hundred, a thousand, a million, a billion that we start to run into incredibly sublinear returns.
Replies from: ThrustVectoring, timtyler, joshuabecker, mwengler
↑ comment by ThrustVectoring · 2013-04-29T17:23:29.664Z · LW(p) · GW(p)
As group size increases you have to spend more and more of your effort getting your ideas heard and keeping up with the worthwhile ideas being proposed by other people, as opposed to coming up with your own good ideas.
Depending on the relevant infrastructure and collaboration mechanisms, it's fairly easy to have a negative contribution from each additional person in the project. If someone is trying to say something, then someone else has to listen - even if all the listener does is keep it from lowering the signal-to-noise ratio by removing the contribution.
Replies from: DanArmak
↑ comment by DanArmak · 2013-04-29T19:50:22.076Z · LW(p) · GW(p)
You correctly describe the problems of coordinating the selection of the best result produced. But there's another big problem: coordinating the division of work.
When you add another player to a huge team of 5000 people, he won't start exploring a completely new series of moves no-one else had considered before. Instead, he will likely spend most of his time considering moves already considered by some of the existing players. That's another reason why his marginal contribution will be so low.
Unlike humans, computers are good at managing divide-and-conquer problems. In chess, a lot of the search for the next move is local in the move tree. That's what makes it a particularly good example of human groups not scaling where computers would.
↑ comment by timtyler · 2013-05-03T10:56:55.020Z · LW(p) · GW(p)
I think that teams of up to five people can scale "pretty well by human standards" - not too far from linearly. It's going up to a hundred, a thousand, a million, a billion that we start to run into incredibly sublinear returns.
That's parallelism for you. It's like the way that four-core chips are popular, while million-core chips are harder to come by.
↑ comment by joshuabecker · 2018-09-13T03:11:56.591Z · LW(p) · GW(p)
I assume by 'linear' you mean directly proportional to population size.
The diminishing marginal returns of some tasks, like the "wisdom of crowds" (concerned with forming accurate estimates), are well established, and taper off quickly regardless of the difficulty of the task---it basically follows from the law of large numbers and sampling error (see "A Note on Aggregating Opinions", Hogarth, 1978). This glosses over some potential complexity, but you're unlikely to ever get much benefit from more than a few hundred people, if that many.
Other tasks do not see such quickly diminishing returns, such as problem solving in a complex fitness landscape (see work on "Exploration and Exploitation", especially in NK space). Supposing the number of possible solutions to a problem is much greater than the number of people feasibly working on it (e.g., the population of creative and engaged humans), then as the number of people increases, the probability of finding the optimal solution increases. Coordinating all those people is another issue, as is the potential opportunity cost of having so many people work on the same problem.
However, in my experience, this difference between problem-solving and wisdom-of-crowds tasks is often glossed over in collective intelligence research.
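To make the wisdom-of-crowds point concrete, here is a minimal simulation sketch (mine, not from the cited Hogarth paper; the noise model and parameters are illustrative assumptions) showing that a crowd's averaged estimate improves roughly as 1/sqrt(n), so returns taper off fast:

```python
import random
import statistics

random.seed(0)

def crowd_error(n, truth=100.0, noise_sd=20.0, trials=2000):
    """Mean absolute error of an n-person crowd's averaged estimate,
    assuming each person gives an unbiased estimate with independent noise."""
    errs = []
    for _ in range(trials):
        estimates = [random.gauss(truth, noise_sd) for _ in range(n)]
        errs.append(abs(statistics.mean(estimates) - truth))
    return statistics.mean(errs)

# Error shrinks roughly as 1/sqrt(n): going from 1 to 100 people helps a
# lot; going from 100 to 10,000 buys comparatively little.
for n in [1, 10, 100, 1000]:
    print(n, round(crowd_error(n), 2))
```

Under these (assumed) independence conditions, most of the accuracy gain is already captured by a modest crowd, which is the "taper off quickly" behavior described above.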
Replies from: joshuabecker
↑ comment by joshuabecker · 2018-09-13T03:22:37.694Z · LW(p) · GW(p)
Regarding the apparent non-scaling benefits of history: what you call the "most charitable" explanation seems to me the most likely. Thousands of people work at places like CERN and spend 20 years contributing to a single paper, doing things that simply could not be done by a small team. Models of problem-solving on "NK Space" type fitness landscapes also support this interpretation: fitness improvements become increasingly hard to find over time. As you've noted elsewhere, it's easier to pluck low-hanging fruit.
↑ comment by mwengler · 2013-04-29T22:12:40.136Z · LW(p) · GW(p)
Are you or anyone else aware of any work along these lines, showing the intelligence of groups of people?
Any sense of what the intelligence of the planet as a whole, or the largest effective intelligence of any group on the planet might be?
If groups of up to 5 scale well, and we get sublinear returns above 5, but positive returns up to some point anyway, does this prove that AI won't FOOM until it has an intelligence larger than the largest intelligence of a group of humans? That is, until AI has a higher intelligence than the group, that the group of humans will dominate the rate at which new AI's are improved?
Replies from: CarlShulman
↑ comment by CarlShulman · 2013-04-29T22:50:42.004Z · LW(p) · GW(p)
There is the MIT Center for Collective Intelligence.
Replies from: joshuabecker
↑ comment by joshuabecker · 2018-09-13T03:02:32.187Z · LW(p) · GW(p)
Update: this is a pretty large field of research now. The Collective Intelligence Conference is going into its 7th year.
↑ comment by ThrustVectoring · 2013-04-29T17:23:31.373Z · LW(p) · GW(p)
As far as empirically finding the optimum group size, it'd be cheaper to find the number of researchers in a scientific sub-discipline and measure the productive work they do in that field. They are teams that review work for general distribution, read on others' progress, and contribute to the discussion. Larger sub-fields that would be more efficient divided up would have large incentives to do so, as defectors to the sub-sub-field would have higher productivity (and less irrelevant work to read up on).
↑ comment by RHollerith (rhollerith_dot_com) · 2013-05-08T15:54:26.638Z · LW(p) · GW(p)
Does anyone play (rated) chess on freechess.org? If so, do you want to get together to play some team games for the purposes of adding hard data to this discussion?
My blitz rating is in the high 1200s. My teammate should have a blitz rating close to that to make the data valuable. I play 8-minute games, and am not interested in playing enough non-blitz games to get my rating to be an accurate reflection of my (individual) skill. (Non-blitz games would take too much time and take too much out of me. "Non-blitz" games are defined as games with at least 15 minutes on the clock for each player.)
I envision the team being co-located while playing, which limits my teammate to someone who is or will be in San Francisco or Berkeley.
I've played a little "team chess" before. Was a lot of fun.
My contact info is here.
comment by Shmi (shminux) · 2013-04-30T22:55:10.431Z · LW(p) · GW(p)
Having looked through the document again, I feel that a competent technical writer, or anyone with paper-writing experience, could make this report into a paper suitable for submission within a couple of days, maybe a week, assuming MIRI wants it published. A lot would have to be cut, and the rest rearranged and tidied up, but there is definitely enough meat there for a quality paper or two. I am not sure what MIRI's intention is re this report, other than "hope that the open problems posed therein inspire further work by economists or economically literate modelers".
comment by Qiaochu_Yuan · 2013-04-30T09:36:04.865Z · LW(p) · GW(p)
Page 4: the sum log(w) + log(log(w)) + ... doesn't converge. Some logarithm will be negative and then the next one will be undefined. Presumably you meant to stop the sum once it becomes negative, but then I'm somewhat confused about this argument because I'm not sure it's dimensionally consistent (I'm not sure what units cognitive work is being measured in).
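A quick numeric sketch of the divergence point (my own illustration, not from the paper): iterating the logarithm drives the value toward zero within a handful of steps, after which the next term is negative and the one after that undefined, so the written sum cannot continue:

```python
import math

def iterated_log_terms(w):
    """Collect log(w), log(log(w)), ... stopping before a term would be
    non-positive (whose own log is negative or undefined)."""
    terms = []
    x = w
    while x > 0:
        x = math.log(x)
        if x <= 0:
            break
        terms.append(x)
    return terms

# Even for an astronomically large w, only a few terms exist before the
# iteration drops below 1 and the next log turns negative.
print(iterated_log_terms(10**100))
```

So any formula of this shape has to specify a truncation rule, which is part of why the dimensional bookkeeping matters.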
Top of page 18: there's a reference to "this graph" but no graph...?
General comment 1: who's the intended audience here? Most of the paper reads like a blog post, which I imagine could be disconcerting for newcomers trying to evaluate whether they should be paying attention to MIRI and expecting a typical research paper from a fancy-looking .pdf.
General comment 2: I still think this discussion needs more computational complexity. I brought this up to you earlier and I didn't really digest your reply. The question of what you can and can't do with a given amount of computational resources seems highly relevant to understanding what the intelligence explosion could look like; in particular I would be surprised if questions like P vs. NP didn't have a strong bearing on the distribution over timelines (I expect that the faster it is possible to solve NP-complete problems, which apparently includes protein folding, the faster AI could go foom). But then I'm not a domain expert here and I could be off-base for various reasons.
Replies from: lukeprog, gjm, Eliezer_Yudkowsky
↑ comment by gjm · 2013-04-30T13:56:57.405Z · LW(p) · GW(p)
The reference to "this graph" is a hyperlink. There are many such hyperlinks in the document. They feel rather weird, and are easy to miss, given the generally print-like typesetting. It might be worth writing them all out in something like the form one would use in an ordinary printed document, while preserving their hyperlinkiness.
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T14:23:14.989Z · LW(p) · GW(p)
Protein folding cannot be NP-hard. The physical universe is not known to be able to solve NP-hard problems, and protein folding will not involve new physics.
Replies from: Kindly, EHeller, Vaniver, David_Gerard, Qiaochu_Yuan
↑ comment by Kindly · 2013-04-30T21:27:37.320Z · LW(p) · GW(p)
The physical universe doesn't need to "solve" protein folding in the sense of having a worst-case polynomial-time algorithm. It just needs to fold proteins. Many NP-complete problems are "mostly easy" with a few hard instances that rarely come up. (In fact, it's hard to find an NP-complete problem for which random instances are hard: if we could do this, we would use it for cryptography.) It's reasonable to suppose protein folding is like this.
Of course, if this is the case, maybe the AI doesn't care about the rare hard instances of protein folding, either.
Replies from: Gurkenglas
↑ comment by Gurkenglas · 2018-12-14T14:03:02.490Z · LW(p) · GW(p)
If we have an NP-complete problem for which random instances are hard, but we can't generate them with solutions, that doesn't help cryptography.
↑ comment by EHeller · 2013-04-30T19:17:03.985Z · LW(p) · GW(p)
Proteins are finite in length. Why would nature care if it can't do something in polynomial time?
Edit: It would be interesting to turn this around: suppose protein folding IS NP-hard; can we put an upper bound on the length of proteins using evolutionary time scales?
Replies from: None, Qiaochu_Yuan
↑ comment by [deleted] · 2013-05-09T15:02:55.745Z · LW(p) · GW(p)
can we put an upper bound on the length of proteins using evolutionary time scales?
Not really; most big proteins consist of 'domains' which fold up pretty independently of each other (smaller proteins can vary quite a bit though). Titin is a ~30,000 amino acid protein in human muscle with ~500 repeats of the same 3 basic modules all laid out in a line... over evolutionary time you can shuffle these functional units around and make all kinds of interesting combinations.
Actually, the lab I'm working in recently had a problem with this. We optimized a gene to be read extremely fast by a ribosome while still producing exactly the same protein sequence (manipulating synonymous codons). But it turned out that when the actual protein molecule is extruded from the ribosome as rapidly as we had induced it to be, the normal independent folding of successive domains was disrupted - one domain didn't fully fold before the next domain started being extruded, they interacted, and the protein folded all wrong and didn't work despite having exactly the same sequence as the wild protein.
↑ comment by Qiaochu_Yuan · 2013-04-30T19:25:21.122Z · LW(p) · GW(p)
I think the more important point is that Nature doesn't care about the worst case (if a protein takes forever to fold correctly then it's not going to be of any use). But an AI trying to design arbitrary proteins plausibly might.
Replies from: Eliezer_Yudkowsky
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T23:45:05.646Z · LW(p) · GW(p)
Why? Nature designed ribosomes without solving any hard cases of NP-hard problems. Why would Ribosomes 2.0, as used for constructing the next generation of molecular machinery, require any NP-hard problems?
Replies from: None, Qiaochu_Yuan
↑ comment by [deleted] · 2013-05-01T02:04:42.503Z · LW(p) · GW(p)
Nature designed ribosomes without solving any hard cases of NP-hard problems.
Is that so? Pretty much every problem of interest (chess, engineering, etc) is NP-hard, or something on that level of difficulty.
The thing with such problems is that you don't solve them exactly for optimal, you weaken the problem and solve them heuristically for good. Nature did just this: produced a solution to a weakened NP-hard design problem.
I think at this point, taboo "solved a hard problem". Nature produced a better replicator, not necessarily the optimal superbest replicator, which it didn't have to.
Obviously an intelligently designed automated design system could be just as good as dumbass evolution, given sufficient computing power, so I agree that swiftly designing nanomachinery is quite plausible.
(The advantage that evolution has over an engineer in a leaky box with a high discount rate is that there's a lot going on in molecular dynamics; solutions are dense in the space, as shown by the fact that evolution found some, but that's no guarantee that a given candidate solution is predictably a solution. So it might be cost prohibitive. (Not that I'm up to date on protein folding.))
In the worst case, the protein folding could be a secure hash, that takes a very expensive high-fidelity simulation to compute. Evolution would do just fine at cracking hashes by brute force, because it bruteforces everything, but an intelligent engineer wouldn't be able to do much better in this case. It may end up taking too much computation to run such sims. (I should calculate the expected time to nano in this case, it would be a very interesting fermi estimate).
However, it is highly unlikely for lower-fidelity simulations and models to give you literally no information, which would be what is required for the "secure hash" scenario. Even things designed to be secure hashes often lose huge parts of their theoretical complexity after some merely human attacks (eg md5). The mainline case is that there exist reasonable heuristics discoverable by reasonable amounts of intelligent computation. (likewise should fermi here, but would need some harder assumptions)
Of course you already knew all this... (But I didn't.)
So yeah, this just made nano seem a bit more real to me.
Replies from: Eliezer_Yudkowsky, DanArmak, None
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-01T02:20:37.920Z · LW(p) · GW(p)
In the worst case, the protein folding could be a secure hash
Then it would be harder, in fact impossible, to end up with slightly better proteins via point mutations. A point mutation in a string gives you a completely different secure hash of that string.
This isn't a minor quibble, it's a major reason to go "Whaa?" at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn't traverse that space at all.
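The contrast being drawn here can be sketched in a few lines (my illustration, not from the thread; the toy "fitness functions" are assumptions for demonstration). On a smooth landscape a point mutation moves fitness a little, so hill-climbing works; when the landscape is a secure hash, a point mutation gives an essentially unrelated value, so there is no gradient for evolution to follow:

```python
import hashlib

def hash_fitness(genome: str) -> int:
    """A 'secure hash' landscape: fitness is (a slice of) the SHA-256 of
    the genome, so similar genomes get unrelated fitnesses."""
    return int.from_bytes(hashlib.sha256(genome.encode()).digest()[:4], "big")

def smooth_fitness(genome: str, target: str = "AAAA") -> int:
    """A smooth landscape: fitness = number of positions matching a target,
    so a point mutation changes fitness by at most 1."""
    return sum(a == b for a, b in zip(genome, target))

# One 'point mutation' barely moves the smooth fitness...
print(smooth_fitness("AAAB"), smooth_fitness("AAAA"))
# ...but yields an unrelated hash value, so hill-climbing gets no signal.
print(hash_fitness("AAAB"), hash_fitness("AAAA"))
```

Evolution's observed success at traversing protein space is evidence for the first kind of landscape, which is the point of the quoted objection.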
Replies from: None
↑ comment by [deleted] · 2013-05-01T03:10:02.721Z · LW(p) · GW(p)
This isn't a minor quibble, it's a major reason to go "Whaa?" at the idea that protein folding and protein design have intractable search spaces in practice. They have highly regular search spaces in practice or evolution couldn't traverse that space at all.
Yep, it's highly implausible that a natural non-designed process would happen to be a secure hash (as you know from arguing with cryonics skeptics). And that's before we look at how evolution works.
Good point. The search space is at least smooth once in a few thousand tries. (While doing the nearby fermi estimate, I saw a result that 12% (!!!) of point mutations in some bacteria were beneficial).
That said, the "worst possible case" is usually interesting.
↑ comment by DanArmak · 2013-05-02T16:47:16.107Z · LW(p) · GW(p)
Pretty much every problem of interest (chess, engineering, etc) is NP-hard, or something on that level of difficulty.
Isn't that in large part a selection effect? After decades having computers, most of the low hanging fruit has been picked, and so many unsolved problems are NP-hard. But many equally important problems have been solved because they weren't.
↑ comment by [deleted] · 2013-05-01T03:03:02.056Z · LW(p) · GW(p)
(I should calculate the expected time to nano in this case, it would be a very interesting fermi estimate).
Lets go!
From Wik:
In early 2008, Rosetta was used to computationally design a protein with a function never before observed in nature.
Skimmed the paper looks like they used the rosetta@home network (~9 TFLOPS) to design a rudimentary enzyme.
So that suggests that a small amount of computation (bearable time by human research standards, allowing for fuckups and restarts) can do protein design. Let's call it a week of computation total. There's 1e6 seconds in a week, flopping at a rate of 1e13 flops, giving us 1e19 flops.
They claimed to have tested 1e18 somethings, so our number is plausible, but we should go to at least 1e22 flops to include 1e4 flops per whatever. (which would take a thousand weeks?) something doesn't add up. Whatever, call it 1e20 (ten weeks) and put some fat error bounds on that.
Don't know how to deal with the exponential complexity. A proper nanothing could require 1e40 flops (square the exponent for double complexity), or it may factor nicely, requiring only 1e21 flops.
Let's call it 1e25 flops with current techniques to design nanotech.
If AI is in 20 years, that's 13 Moore's-law doublings, or ~1e4; then let's say the AI can seize a network with as much computational power as they used, plus Moore's-law scaling.
So 1e21 todayflops, 1e20 of which is doable in a standard research project amount of time with a large distributed network.
So anywhere from days to 20 years, with my numbers giving 2 years, to brute force nanotech on 20-years-in-future computational power with today's algorithms.
Factor of 1e6 speedups are reasonable in chess (another problem with similar properties) with a bunch of years of human research, so that puts my middle at 10 minutes.
The AI will probably do better than that, but that would be good enough to fuck us.
This was somewhat conservative, even. (nanotech involves 100000 times more computation than these guys used)
Let's get this thing right the first time....
EDIT: an interesting property of exponential processes is that things go from "totally impossible" to "trivial" very quickly.
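For concreteness, the arithmetic of this estimate can be sketched in a few lines of Python (every figure is one of the stated assumptions above, not a measured value):

```python
# Fermi estimate: compute needed to brute-force nanotech design.
# All numbers are assumptions from the estimate above, not measurements.

network_flops = 1e13          # Rosetta@home, ~9 TFLOPS rounded up
week_seconds = 1e6            # ~1 week, at Fermi-level precision

enzyme_design = network_flops * week_seconds   # ~1e19 FLOPs for one enzyme
nanotech_flops = 1e25         # guessed midpoint between 1e21 and 1e40

moore_doublings = 13          # ~20 years of Moore's law
speedup = 2 ** moore_doublings                 # ~1e4

todayflops_needed = nanotech_flops / speedup   # ~1e21 in today's FLOPs
weeks = todayflops_needed / enzyme_design      # weeks on a Rosetta-sized network

print(f"~{weeks:.0f} weeks, i.e. ~{weeks / 52:.1f} years")
```

The ~2-year midpoint quoted above falls straight out of these numbers; the fat error bounds come from the 1e21-1e40 spread on the nanotech term.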
Replies from: None↑ comment by Qiaochu_Yuan · 2013-05-01T01:21:08.063Z · LW(p) · GW(p)
I can feel some inferential distance here that isn't successfully being bridged. It's far from clear to me that the default assumption here should be that no NP-hard problems need to be solved and that the burden of proof is on those who claim otherwise.
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-01T02:17:17.747Z · LW(p) · GW(p)
I guess to me the notion of "solve an NP-hard problem" (for large N and hard cases, i.e., where the problem really is NP-hard) seems extremely exotic - all known intelligence, all known protein folding, and all known physical phenomena must be proceeding without it - so I feel a bit at a loss to relate to the question. It's like bringing up PSPACE-completeness - I feel a sense of 'where did that come from?' and find it hard to think of what to say, except for "Nothing that's happened so far could've been PSPACE-complete."
Replies from: Kindly↑ comment by Kindly · 2013-05-02T02:14:53.434Z · LW(p) · GW(p)
Agreed if you mean "Nothing that's happened so far could've been [computationally hard] to predict given the initial conditions."
But the reverse problem -- finding initial conditions that produce a desired output -- could be very hard. Nature doesn't care about this, but an AI plausibly might.
I'm not sure how protein folding fits into this picture, to be honest. (Are people just trying to figure out what happens to a given protein in physics, or trying to find a protein that will make something good happen?) But more generally, the statement "P=NP" is more or less equivalent to "The reverse problem I mention above is always easy." Things become very different if this is true.
↑ comment by Vaniver · 2013-04-30T14:35:00.083Z · LW(p) · GW(p)
Protein folding cannot be NP-hard. The physical universe is not known to be able to solve NP-hard problems, and protein folding will not involve new physics.
Here's a paper claiming that the most widely studied model of protein folding in 1998 is NP-Complete. I don't know enough about modern research into protein folding to comment how applicable that result still is.
My guess is you're referring to Aaronson's paper, which doesn't seem relevant here. The universe doesn't solve NP-hard problems in P time, but the universe took NP time to build the first useful proteins, didn't it?
Replies from: zslastman, Eliezer_Yudkowsky↑ comment by zslastman · 2013-04-30T15:34:33.297Z · LW(p) · GW(p)
The solution space is large enough that even proteins sampling its points at a rate of trillions per second couldn't really fold if they were just searching randomly through all possible configurations; an exhaustive search on that scale would be intractable. They don't actually do this, of course. Instead they fold piece by piece as they are produced, with local interactions forming domains which tend to retain their approximate structure once they come together to form a whole protein. They therefore don't enter the lowest possible energy state. Prion diseases are an example of what can happen when proteins enter a normally inaccessible local energy minimum, which in that case happens to have a snowballing effect on other proteins.
The result is that they follow grooves in the energy landscape towards an energy well which is robust enough to withstand all sorts of variation, including the horrific inaccuracies of our attempts at modeling. Our energy functions are just very crude approximations to the real one, which is dependent on quantum level effects and therefore intractable. Another issue is that proteins don't fold in isolation - they interact with chaperone proteins and all sorts of other crap. So simulating their folding might require one to simulate a LOT of other things besides just the protein in question.
Even our ridiculously poor attempts at in silico folding are not completely useless though. They can even be improved with the help of the human brain (see Foldit). I think an A.I. should make good progress on the proteins that exist. Even if it can't design arbitrary new ones from scratch, intelligent modification of existing ones would likely be enough to get almost anything done. Note also that an A.I. with that much power wouldn't be limited to what already exists, technology is already in the works to produce arbitrary non-protein polymers using ribosome like systems, stuff like that would open up an unimaginably large space of solutions that existing biology doesn't have access to.
Replies from: Vaniver↑ comment by Vaniver · 2013-04-30T15:44:44.996Z · LW(p) · GW(p)
Right, this gets called Levinthal's Paradox.
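The arithmetic behind Levinthal's paradox is easy to reproduce, using the standard textbook illustration (~3 backbone conformations per residue, a 100-residue protein, a trillion samples per second - illustrative assumptions, not figures from this thread):

```python
import math

# Levinthal's paradox: exhaustive conformation search is hopeless.
conformations = 3 ** 100          # ~5e47 backbone conformations
samples_per_sec = 1e12            # generous sampling rate
year_seconds = 3.15e7

years = conformations / samples_per_sec / year_seconds
print(f"~1e{math.log10(years):.0f} years to search exhaustively")
```

Real proteins fold in milliseconds to seconds, which is the point: whatever they are doing, it isn't random search of the configuration space.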
Replies from: zslastman↑ comment by zslastman · 2013-04-30T15:51:34.622Z · LW(p) · GW(p)
Right. I'm actually not sure how relevant it all is to discussions of an A.I. trying to get arbitrary things done with proteins. Folding an existing protein may be a great deal easier than finding a protein which folds into an arbitrary shape. Probably not all shapes are allowed by the physics of the problem. Evolution can't really be said to solve that problem either. It just produces small increments in fitness. Otherwise organisms' proteomes would be a lot more efficient.
Although on second thought, an A.I. would probably just be able to design a protein with a massively low energy funnel, so that even if it couldn't simulate folding perfectly, it could still get things done.
Regardless, an imperfect solution would probably suffice for world domination...
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T15:10:08.120Z · LW(p) · GW(p)
I believe I've seen that discussed before, and the answer is just that in real life, proteins don't fold into the lowest-energy conformation. It's like saying that finding the global minimum-energy configuration of a soap bubble is NP-complete. Finding new useful proteins tends to occur via point mutations, so that can't be NP-complete either.
Replies from: Vaniver, Kawoomba, JoshuaZ↑ comment by Vaniver · 2013-04-30T15:35:04.125Z · LW(p) · GW(p)
So, I can think of several different things that could all be the 'protein folding problem.'
- Figure out the trajectory an unfolded protein takes towards a folded protein, with a known starting state. (P)
- Given a known folded protein, find local minima that unfolded proteins starting with random start states might get stuck in. (NP, probably)
- Given a desired chemical reaction, find a folded protein that will catalyze the reaction. (Not sure, probably NP.)
- Given a desired chemical reaction, find a folded protein that will catalyze the reaction that is the local minimum reached by most arbitrary unfolded positions for that protein (Optimal is definitely NP, but I suspect feasible is too.)
- Others. (Here, the closest I get to molecular nanotech is 'catalyze reactions,' but I imagine the space for 'build a protein that looks like X' might actually be smaller.)
It looks to me like the problems here that have significant returns are NP.
Finding new useful proteins tends to occur via point mutations so that can't be NP-complete either.
It's not at all clear to me what you mean by this. I mean, take the traveling salesman problem. It's NP-Hard*, but you can get decent solutions by using genetic algorithms to breed solutions given feasible initial solutions. Most improvements to the route will be introduced by mutations, and yet the problem is still NP-hard.
That is, it's not clear to me that you're differentiating between the problem of finding an optimal solution being NP hard, it taking NP time to find a 'decent' solution, and an algorithm requiring NP time to finish running.
(The second is rarely true for things like the traveling salesman problem, but is often true for practical problems where you throw in tons of constraints.)
* A variant is NP-Complete, which is what I originally wrote.
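To make the mutation point concrete, here is a minimal sketch (a 2-opt mutation hill-climb on a random toy instance of my own, not anything from a real protein problem): improving mutations quickly yield a decent tour even though finding the optimal one is NP-hard.

```python
import random

# "Point mutations" (2-opt segment reversals) improving a TSP tour.
def tour_length(tour, dist):
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def mutate_improve(dist, steps=20000, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    tour = list(range(n))
    best = tour_length(tour, dist)
    for _ in range(steps):
        i, j = sorted(rng.sample(range(n), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]  # reverse a segment
        cand_len = tour_length(cand, dist)
        if cand_len < best:                                   # keep improving mutations
            tour, best = cand, cand_len
    return best

# Random cities in the unit square as a toy instance:
rng = random.Random(1)
pts = [(rng.random(), rng.random()) for _ in range(30)]
dist = [[((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 for x2, y2 in pts]
        for x1, y1 in pts]
print(mutate_improve(dist))   # far shorter than the starting tour
```

Nothing here certifies optimality - which is exactly the optimal-vs-decent distinction being drawn above.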
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T16:08:04.053Z · LW(p) · GW(p)
Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard. Period. It could be a physical event that you choose to regard as a P-approximation to a theoretical problem whose optimal solution would be NP-hard, but so what, that wouldn't have anything to do with what physically happened. It would take unknown, exotic physics to have anything NP-hard physically happen. Anything that could not plausibly have involved black holes rotating at half the speed of light to produce closed timelike curves, or whatever, cannot have plausibly involved NP-hard problems. NP-hard = "did not physically happen". "Physically happened" = not NP-hard.
Replies from: Tyrrell_McAllister, Luke_A_Somers, CronoDAS, asr, shminux↑ comment by Tyrrell_McAllister · 2013-04-30T18:42:22.940Z · LW(p) · GW(p)
Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard. Period.
I've seen you say this a couple of times, and your interlocutors seem to understand you, even when they dispute your conclusion. But my brain keeps returning an error when I try to parse your claim.
Read literally, "NP-hard" is not a predicate that can be meaningfully applied to individual events. So, in that sense, trivially, nothing that happens (physically or otherwise, if "physically" is doing any work here) can be NP-hard. But you are evidently not making such a trivial claim.
So, what would it look like if the physical universe "solved an NP-hard problem"? Presumably it wouldn't just mean that some actual salesman found a way to use existing airline routes to visit a bunch of pre-specified cities without revisiting any one of them. Presumably it wouldn't just mean that someone built a computer that implements a brute-force exhaustive search for a solution to the traveling salesman problem given an arbitrary graph (a search that the computer will never finish before the heat death of the universe if the example is large). But I can't think of any other interpretation to give to your claim.
Replies from: CarlShulman, Vaniver↑ comment by CarlShulman · 2013-04-30T22:46:22.863Z · LW(p) · GW(p)
ETA: this is a side point.
Here's Scott Aaronson describing people (university professors in computer science and cognitive science at RPI) who claim that the physical universe efficiently solves NP-hard problems:
It was only a matter of time before someone put the pieces together. Last summer Bringsjord and Taylor [24] posted a paper entitled “P=NP” to the arXiv. This paper argues that, since (1) finding a Steiner tree is NP-hard, (2) soap bubbles find a Steiner tree in polynomial time, (3) soap bubbles are classical objects, and (4) classical physics can be simulated by a Turing machine with polynomial slowdown, it follows that P = NP.
My immediate reaction was that the paper was a parody. However, a visit to Bringsjord’s home page2 suggested that it was not. Impelled, perhaps, by the same sort of curiosity that causes people to watch reality TV shows, I checked the discussion of this paper on the comp.theory newsgroup to see if anyone recognized the obvious error. And indeed, several posters pointed out that, although soap bubbles might reach a minimum-energy configuration with a small number of pegs, there is no “magical” reason why this should be true in general. By analogy, a rock in a mountain crevice could reach a lower-energy configuration by rolling up first and then down, but it is not observed to do so.
In other news, Bringsjord also claims to show by a modal argument, similar to the theistic modal argument (which he also endorses), that human brains are capable of hypercomputation: "it's possible humans are capable of hypercomputation, so they are capable of hypercomputation." For this reason he argues that superhumanly intelligent Turing machines/Von Neumann computers are impossible and belief in their possibility is fideistic.
Replies from: EHeller, Will_Newsome↑ comment by EHeller · 2013-04-30T22:55:00.180Z · LW(p) · GW(p)
Here's Scott Aaronson describing people (university professors at RPI) who claim that the physical universe efficiently solves NP-hard problems.
This doesn't refute what you are responding to. Saying the universe can't solve a general NP-complete problem in polynomial time is not the same thing as saying the universe cannot possibly solve specific instances of generally NP-complete problems, which is Tyrrell_McAllister's point, as far as I can parse it. In general, the traveling salesman problem is NP-complete, but there are lots of cases where heuristics get the job done in polynomial time, even if those heuristics would run away if given the wrong case.
To use Aaronson's soap bubbles, sometimes the soap bubble finds a Steiner tree, sometimes it doesn't. When it DOES, it has solved one instance of an NP-complete problem fairly quickly.
↑ comment by Will_Newsome · 2013-04-30T23:07:25.817Z · LW(p) · GW(p)
Existence of God, not existence of god [or gods]; my sincere apologies for pedantry, it's just that LW folk generally misunderstand theology and so it seems important to be especially careful when discussing it. The distinction between God and god is quite important and the words look so similar, it's unfortunate. ETA: Retracting because Carl edited the comment. (Thanks Carl!)
Replies from: TheOtherDave↑ comment by TheOtherDave · 2013-04-30T23:49:55.016Z · LW(p) · GW(p)
In many contexts where a quite important distinction between two terms exists, it is helpful to say what it is, rather than simply assert that there is one.
Admittedly, I have no reason to believe this is one such context.
Replies from: gjm↑ comment by gjm · 2013-05-01T10:56:22.248Z · LW(p) · GW(p)
I'm pretty sure the distinction Will was making was as follows.
A "god" is a being with superhuman powers and so forth, possibly existing somehow "on another plane of existence", but still in some sense the same kind of thing as us: another thing among all the things there are.
"God" is not in any way the same sort of thing as we are. God is not just "another thing" but, in theologian-speak, the Ground Of All Being, a sort of precondition for everything else.
However, it's possible that he's making a different distinction, wherein "God" means a unique divine being of infinite power who created everything and "god" means lesser beings like the Greek gods.
(If he's making the first distinction, I would remark that (1) most theists' notions of God do in fact fall under the description of "god", and (2) the radically different notion of "God" is of very doubtful coherence. If he's making the second, I'm unconvinced that the distinction is as important as he suggests it is.)
Replies from: Will_Newsome↑ comment by Will_Newsome · 2013-05-01T18:02:40.985Z · LW(p) · GW(p)
(Hastily written:) I agree with your (1), but think that's all the more reason to make clear distinctions instead of further muddying the epistemic waters; it's much as if creationists didn't bother to distinguish between something like Pokemon evolution and actual evolutionary biology, because after all most evolutionists can't tell the difference.
Mildly disagree with your (2): I can see how the coherence of the God idea is somewhat doubtful, but there aren't actually any overwhelmingly strong arguments in terms of metaphysics that that is the case, and most atheists take a different route by more or less rejecting all of metaphysics and instead placing emphasis on epistemology. (Then there are weaksauce theists like Kant and William James to argue with but I don't think that's as challenging.) Although I'm sympathetic to skepticism of metaphysics we should keep in mind that the obvious attempts to banish it have failed (e.g. logical positivism), and we should also keep in mind that though LessWrong is (somewhat justifiably) allergic to the word "metaphysics", metaphysics actually shows up here quite a bit in the guise of computationalism/simulationism and in some semi-epistemic rules like Eliezer's GAZP. So to reject metaphysics entirely would be inconsistent; from there, charitably engaging the actual metaphysical arguments of philosophical theists would be necessary, and I see this done very rarely.
In the meantime I think assigning probabilities below, say, 1% to philosophical theism would be premature, especially when the motivations for doing so seem largely to be desires to reverse the stupidity of religious thinking, when philosophical theism stems from Socrates, Plato, and Aristotle and isn't easily dismissed as the rationalizations of a Christian hegemony the way that most atheists seem to assume in practice.
(ETA: It occurs to me that David Chalmers, who is LW-friendly and the editor of Metametaphysics, would be a good person to ask about the tenability of philosophical theism, from a metaphilosophical perspective. I might send him an email / LW message.)
Replies from: BerryPick6↑ comment by BerryPick6 · 2013-05-13T11:18:52.746Z · LW(p) · GW(p)
(ETA: It occurs to me that David Chalmers, who is LW-friendly and the editor of Metametaphysics, would be a good person to ask about the tenability of philosophical theism, from a metaphilosophical perspective. I might send him an email / LW message.)
Did you ever end up doing this, and if so, would you mind sharing the response?
↑ comment by Vaniver · 2013-04-30T23:14:22.800Z · LW(p) · GW(p)
But my brain keeps returning an error when I try to parse your claim.
I agree with your parse error. It looks like EY has moved away from the claim made in the grandparent, though.
↑ comment by Luke_A_Somers · 2013-04-30T17:45:58.056Z · LW(p) · GW(p)
That seems a little strongly put - NP-hard problems scale very poorly, so no real process can take N up to large numbers. I can solve the traveling salesman problem in my head with only modest effort if there are only 4 stops. And it's trivial if there are 2 or 3 stops.
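The small-N point is easy to see in code (a made-up 4-stop distance matrix of my own): brute force checks all 3! = 6 tours instantly, while the same loop at 30 stops would need 29! ≈ 9e30 iterations.

```python
from itertools import permutations
import math

# Brute-force TSP: trivial for 4 stops, hopeless for large N.
def tsp_brute_force(dist):
    """dist[i][j] = distance between stops i and j; returns best tour length."""
    n = len(dist)
    best = math.inf
    for perm in permutations(range(1, n)):          # fix stop 0 as the start
        tour = (0,) + perm + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        best = min(best, length)
    return best

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
print(tsp_brute_force(dist))  # -> 18, after checking all 6 tours
```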
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T18:12:51.177Z · LW(p) · GW(p)
Conceded.
↑ comment by CronoDAS · 2013-07-15T03:59:21.408Z · LW(p) · GW(p)
Um... doesn't it take exponential time in order to simulate quantum mechanics on a classical computer?
Replies from: pengvado, bogdanb↑ comment by pengvado · 2013-07-17T04:01:09.096Z · LW(p) · GW(p)
Yes (At least that's the general consensus among complexity theorists, though it hasn't been proved.) This doesn't contradict anything Eliezer said in the grandparent. The following are all consensus-but-not-proved:
P⊂BQP⊂EXP
P⊂NP⊂EXP
BQP≠NP (Neither is confidently predicted to be a subset of the other, though BQP⊂NP is at least plausible, while NP⊆BQP is not.)
If you don't measure any distinctions finer than P vs EXP, then you're using a ridiculously coarse scale. There are lots of complexity classes strictly between P and EXP, defined by limiting resources other than time-on-a-classical-computer. Some of them are tractable under our physics and some aren't.
↑ comment by asr · 2013-05-05T05:47:17.197Z · LW(p) · GW(p)
Nothing that has physically happened on Earth in real life, such as proteins folding inside a cell, or the evolution of new enzymes, or hominid brains solving problems, or whatever, can have been NP-hard.
I don't understand why you think new physics is required to solve hard instances of NP-complete problems. We routinely solve instances of NP-hard problems in practice on computers -- just not large instances. New physics might be required to solve those problems quickly, but if you are willing to wait exponentially long, you can solve them just fine.
If you want to argue that actual practical biological folding of proteins isn't NP-hard, the argument can't start from "it happens quickly" -- you need to say something about how the time to fold scales with the length of the amino acid strings, and in particular in the limit for very large strings.
Similarly, I don't see why biological optimization couldn't have solved hard cases of NP-compete problems. If you wait long enough for evolution to do its thing, the result could be equivalent to an exhaustive search. No new physics required.
Replies from: wedrifid↑ comment by wedrifid · 2013-05-05T06:23:36.529Z · LW(p) · GW(p)
I don't understand why you think new physics is required to solve hard instances of NP-complete problems. We routinely solve the hard instances of NP-hard problems in practice on computers -- just not on large instances of the problem.
Eliezer already conceded that trivial instances of such problems can be solved. (We can assume that before he made that concession he thought it went without saying.)
New physics might be required to solve those problems quickly, but if you are willing to wait exponentially long, you can solve the problems just fine.
The physics and engineering required to last sufficiently long may be challenging. I hear it gets harder to power computers once the stars have long since burned out. As far as I know the physics isn't settled yet.
(In other words, I am suggesting that "just fine" is an something of an overstatement when it comes to solving seriously difficult problems by brute force.)
Replies from: Kawoomba, asr↑ comment by Kawoomba · 2013-05-05T07:03:36.083Z · LW(p) · GW(p)
The physics and engineering required to last sufficiently long may be challenging. I hear it gets harder to power computers once the stars have long since burned out. As far as I know the physics isn't settled yet.
That counterargument is a bit too general, since it applies not only to NP problems but even to P problems (such as deciding whether a number is the GCD of two other numbers), or indeed to any algorithm modified by a few lines of code such that its result is unaffected, merely delayed until after the stars have burned out, or whatever limit we postulate.
For NP problems and P problems alike, given how we understand the universe, there is only a finite number of inputs which are tractable and an infinite number which aren't. The finite number differs between the two, but as a fraction of all well-defined inputs (let's avoid the ambiguity cliff of "possible") it would be the same.
Cue "We all live in a Finite State Machine, Finite State Machine, Finite State Machine ..."
↑ comment by asr · 2013-05-05T14:37:40.319Z · LW(p) · GW(p)
Eliezer already conceded that trivial instances of such problems can be solved. (We can assume that before he made that concession he thought it went without saying.)
The point can't be confined to "trivial instances". For any NP-complete problem on some reasonable computing platform that can solve small instances quickly, there will be instance sizes that are non-trivial (take appreciable time to solve) but do not require eons to solve. There is absolutely no mathematical reason for assuming that for "natural" NP-complete problems, interesting-sized instances can't be solved on a timescale of months/years/centuries by natural processes.
The dichotomy between "trivial" and "impossible to solve in a useful time-frame" is a false one.
↑ comment by Shmi (shminux) · 2013-05-01T21:35:27.422Z · LW(p) · GW(p)
Anything that could not plausibly have involved black holes rotating at half the speed of light to produce closed timelike curves, or whatever
Presumably quantum suicide is a part of "whatever".
↑ comment by Kawoomba · 2013-04-30T15:33:02.948Z · LW(p) · GW(p)
It occurs to me that we can't really say, since we only have access to the time of the program, which may or may not reflect the actual computational resources expended.
Imagine you were living in a game and trying to judge the game's hardware requirements. If you did that by looking at a clock in the game, you'd need to assume that the clock is synchronized to the actual system time. If you had a counter you increased, you wouldn't be able to tell from inside the program how much real time passes between executions of that counter++ instruction.
The problem being that we don't have access to anything external, we aren't watching the Turing Machine compute, we are inside the Turing Machine "watching" the effects of other parts of the program, such as a folding protein (observing whenever it's our turn to be simulated). We don't, however, see the Turing Machine compute, we only see the output. The raw computing power / requirements "behind the scenes", even if such a behind the scenes is only a non-existent abstraction, is impossible to judge with certainty, similar to a map-territory divide. Since there is no access in principle, we cannot observe anything but the "output", we have no way of verifying any assumptions about a correspondence between "game timer" and "system timer" we may make, or of devising any experiments.
Even the recently linked "test the computational limits" doesn't break the barrier, since for all we know the program may stall, and the next "frame" it outputs may still seem consistent, with no stalling, when viewed from inside the program, which we are. We wouldn't subjectively realize the stall. If such an experiment did find something, it would be akin to a bug, not to a measurement of computational resources expended.
Back to diapers.
Replies from: Richard_Kennaway, ESRogs↑ comment by Richard_Kennaway · 2013-04-30T19:26:06.901Z · LW(p) · GW(p)
It occurs to me that we can't really say, since we only have access to the time of the program, which may or may not reflect the actual computational resources expended.
That's a valid point, but it does presuppose exotic new physics to make that substrate, in which "our" time passes arbitrarily slowly compared to the really real time, so that it can solve NP-hard problems between our clock ticks. We would, in effect be in a simulation. Evidence of NP-hard problems actually being solved in P could be taken as evidence that we are in one.
↑ comment by ESRogs · 2013-04-30T18:44:11.397Z · LW(p) · GW(p)
we only have access to the time of the program [...] we are inside the Turing Machine "watching" the effects of other parts of the program, such as a folding protein
If we assume that protein folding occurs according to the laws of quantum mechanics, then it shouldn't tell us anything about the computational complexity of our universe besides what quantum mechanics tells us, right?
Replies from: Kawoomba↑ comment by Kawoomba · 2013-04-30T19:02:42.942Z · LW(p) · GW(p)
Well, yea that's what I'm leaning towards. The laws of physics themselves need not govern the machine (Turing or otherwise), they are effects we observe, us being other effects. The laws of physics and the observers both are part of the output.
Like playing an online roleplaying game and inferring what the program can actually do or what resources it takes, when all you can access is "how high can my character jump" and other in-game rules. The rules regarding the jumping, and any limits the program chose to confer to the jumping behavior are not indicative of the resource requirements and efficiency of the underlying system. Is calculating the jumping easy or hard for the computer? How would you know as a character? The output, again, is a bad judge, take this example:
Imagine using an old Intel 386 system which you rigged into running the latest FPS shooter. It may only output one frame every few hours, but as a sentient character inside that game you wouldn't notice. Things would be "smooth" for you because the rules would be unchanged from your point of view.
We can only say that given our knowledge of the laws of physics, the TM running the universe doesn't output anything which seems like an efficient NP-problem solver, whether the program contains one, or the correct hardware abstraction running it uses one, is anyone's guess. (The "contains one" probably isn't anyone's guess because of Occam's Razor considerations.)
If this is all confused (it may well be, was mostly a stray thought), I'd appreciate a refutation.
Replies from: ESRogs↑ comment by ESRogs · 2013-05-01T08:38:50.826Z · LW(p) · GW(p)
If I understand correctly you're saying that what is efficiently computable within a universe is not necessarily the same as what is efficiently computable on a computer simulating that universe. That is a good point.
Replies from: Kawoomba↑ comment by Kawoomba · 2013-05-01T08:51:13.412Z · LW(p) · GW(p)
Exactly. Thanks for succinctly expressing my point better than I could.
The question is whether assuming a correspondence as a somewhat default case (implied by the "not necessarily") is even a good default assumption.
Why would the rules inherent in what we see inside the universe be any more indicative of the rules of the computer simulating that universe than the rules inside a computer game are reflective of the instruction set of the CPU running it (they are not)?
I am aware that the reference class "computer running super mario brother / kirby's dream land" implies for the rules to be different, but on what basis would we choose any reference class which implies a correspondence?
Also, I'm not advocating simulationism with this per se, the "outer" computer can be strictly an abstraction.
↑ comment by JoshuaZ · 2013-04-30T15:27:44.498Z · LW(p) · GW(p)
Finding new useful proteins tends to occur via point mutations so that can't be NP-complete either.
This does not follow. It may be that finding new useful proteins just takes a very long time and is very inefficient. The rest of your comment seems correct though.
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T16:02:40.751Z · LW(p) · GW(p)
Then evolution wouldn't happen in real life.
Actually, even that understates the argument. If you can take a 20,000 base sequence and get something useful by point-perturbing at least one of the 20,000 bases in the sequence, then whatever just happened was 50 lightyears from being NP-hard - you only had to search through 19 variants on each of 20,000 cases.
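The size comparison implicit here is a couple of lines of arithmetic (using the amino-acid framing above, 19 alternatives per position):

```python
import math

# A one-point-mutation neighborhood is linear in sequence length;
# the space of all sequences of that length is exponential.
length, alternatives = 20_000, 19

neighborhood = length * alternatives                   # 380,000 candidates
log10_full = length * math.log10(alternatives + 1)     # all 20^20000 sequences

print(neighborhood, f"candidates vs ~1e{log10_full:.0f} full sequences")
```

Checking 380,000 candidates is a trivial search; the full sequence space is ~1e26021, which is the exponential blowup that point mutation never has to confront.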
Replies from: JoshuaZ↑ comment by JoshuaZ · 2013-04-30T19:00:06.006Z · LW(p) · GW(p)
Huh? How does this argument work? That doesn't mean that evolution can't happen in real life, it would be a reason to think that evolution is very slow (which it is!) or that evolution is missing a lot of interesting proteins (which seems plausible).
If you can take a 20,000 base sequence and get something useful by point-perturbing at least one of the 20,000 bases in the sequence, then whatever just happened was 50 lightyears from being NP-hard - you only had to search through 19 variants on each of 20,000 cases.
I'm not sure I follow your logic. Are you arguing that because log 20,000 <19? Yes, you can check every possible position in a base sequence this way, but there are still a lot more proteins than those 19. One doesn't get something special from just changing a specific base. Moreover, even if something interesting does happen for changing a specific one, it might not happen if one changes some other one.
Replies from: CarlShulman, Eliezer_Yudkowsky↑ comment by CarlShulman · 2013-04-30T23:12:33.588Z · LW(p) · GW(p)
missing a lot of interesting proteins (which seems plausible).
Definitely, since evolution keeps introducing new interesting proteins.
That doesn't mean that evolution can't happen in real life, it would be a reason to think that evolution is very slow (which it is!)
But it's not slow on a scale of e^n for even modestly large n. If you can produce millions of proteins with hundreds to thousands of amino acids in a few billion years, then approximate search for useful proteins is not inefficient like finding the lowest-energy conformation is (maybe polynomial approximation, or the base is much better, or functional chunking lets you effectively reduce n greatly...).
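Carl's distinction between exponential search and efficient approximate search can be illustrated with a toy model: greedy one-point-mutation hill climbing on a deliberately smooth landscape reaches the optimum in a number of steps linear in sequence length, rather than exploring all k^L sequences. (A minimal sketch under strong assumptions; real fitness landscapes are nothing like this smooth, which is exactly what's at issue in the thread.)

```python
import random

random.seed(0)

# Toy model of local search by point mutation: climb toward a fixed
# target sequence, scoring by number of matching positions. This
# illustrates polynomial-cost neighborhood search, not real protein
# fitness.
ALPHABET = "ACGT"
TARGET = "".join(random.choice(ALPHABET) for _ in range(60))

def fitness(seq):
    return sum(a == b for a, b in zip(seq, TARGET))

def hill_climb(seq):
    steps = 0
    while fitness(seq) < len(TARGET):
        # Scan the (k-1)*L one-point mutants; keep the best one.
        best = seq
        for i in range(len(seq)):
            for c in ALPHABET:
                mutant = seq[:i] + c + seq[i + 1:]
                if fitness(mutant) > fitness(best):
                    best = mutant
        seq = best
        steps += 1
    return seq, steps

start = "".join(random.choice(ALPHABET) for _ in range(60))
final, steps = hill_climb(start)
print(fitness(final) == len(TARGET), steps)
```

Each pass costs at most (k-1)·L fitness evaluations and fixes at least one mismatched position, so the whole climb is polynomial in L; the exponential blowup only threatens when improvements require coordinated multi-site changes.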
Replies from: SilasBarta↑ comment by SilasBarta · 2013-05-01T00:23:25.286Z · LW(p) · GW(p)
[...evolution is] missing a lot of interesting proteins (which seems plausible).
Definitely, since evolution keeps introducing new interesting proteins.
Wait, the fact that evolution is often introducing interesting new proteins is evidence that evolution is missing a lot of interesting proteins? How does that follow?
Switch the scenario around: if evolution never produced interesting new proteins (anymore, after time T), would that be evidence that there are no other interesting proteins than what evolution produced?
Replies from: CarlShulman↑ comment by CarlShulman · 2013-05-01T00:35:41.485Z · LW(p) · GW(p)
Switch the scenario around: if evolution never produced interesting new proteins (anymore, after time T), would that be evidence that there are no other interesting proteins than what evolution produced?
Yes.
That would be evidence that the supply of interesting proteins had been exhausted, just as computer performance at tic-tac-toe and checkers has stopped improving. I don't see where you're coming from here.
Replies from: SilasBarta↑ comment by SilasBarta · 2013-05-01T00:42:38.234Z · LW(p) · GW(p)
Because evolution can't get stuck in the domain of attraction of a local optimum? It always finds any good points?
Edit to add: Intelligent humans can quickly refactor their programs out of poor regions of designspace. Evolution must grope within its neighborhood.
2nd Edit: How about this argument:
"Evolution has stopped producing interesting new ways of flying; therefore, there are probably no other interesting ways of accomplishing flight, since after all, if there were a good way of doing it, evolution would find it."
Replies from: None, JoshuaZ↑ comment by [deleted] · 2013-05-09T15:37:15.878Z · LW(p) · GW(p)
Point mutations aren't the only way for new things to be produced. You can also recombine large chunks and domains together from multiple previous genes.
Hell, there are even examples of genes evolving via a frame-shift that knocks the 3-base frame of a gene off by one, producing a gobbledygook protein that selection then acts upon...
↑ comment by JoshuaZ · 2013-05-01T00:47:14.027Z · LW(p) · GW(p)
Carl wasn't commenting on whether it would be very strong evidence but whether it would be evidence.
Replies from: SilasBarta↑ comment by SilasBarta · 2013-05-01T00:52:25.662Z · LW(p) · GW(p)
Definitely, since evolution keeps introducing new interesting proteins.
Replies from: CarlShulman
↑ comment by CarlShulman · 2013-05-01T01:20:37.750Z · LW(p) · GW(p)
Yes, we can be definitely confident that there are more interesting proteins in the vicinity because of continuing production. We have less evidence about more distant extrapolations, although they could exist too.
Replies from: SilasBarta↑ comment by SilasBarta · 2013-05-01T17:13:08.005Z · LW(p) · GW(p)
That makes a lot more sense.
It's just that, from the context, you seemed to be making a claim about evolution's ability to find all cool proteins, rather than just the ones within organisms' local search neighborhood (which would thus be within evolution's reach).
That's why you appeared, from my reading, to be making the common mistake of attributing intelligence (and global search capabilities) to evolution, which it definitely does not have.
This insinuation was compounded by your comparison to human-intelligence-designed game algorithms, further making it sound like you attributed excessive search capability to evolution.
(And I'm a little scared, to be honest, that the linked comment got several upvotes.)
If you actually recognize the different search capabilities of evolution versus more intelligent algorithms, I suggest you retract, or significantly revise, the linked comment.
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T19:07:10.980Z · LW(p) · GW(p)
20 standard amino acids, so 19 * 20,000 one-amino-acid-changed variants. If you can find something by searching 380,000 cases, it wasn't an NP-hard problem.
EDIT: Actually, since I originally said 20,000 bases, it should just be 3 variants per base (4 standard DNA bases). I don't know if there are any significant 20,000-unit peptides.
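The arithmetic in this exchange is just the size of a one-point-mutation neighborhood, which grows linearly rather than exponentially in sequence length. A quick sketch, using the sequence lengths and alphabet sizes from the thread:

```python
# Size of the one-point-mutation neighborhood of a sequence:
# each of the L positions can change to any of the (k - 1) other symbols.
def neighborhood_size(length, alphabet_size):
    return length * (alphabet_size - 1)

# 20,000 amino acids, 20 standard amino acids -> 19 * 20,000 variants
print(neighborhood_size(20_000, 20))  # 380000

# 20,000 DNA bases, 4 standard bases -> 3 * 20,000 variants
print(neighborhood_size(20_000, 4))   # 60000
```

Compare this to the 20^20000 (or 4^20000) sequences in the full search space: the point is that a search which succeeds by examining only the linear-sized neighborhood was not solving anything like a worst-case exponential problem.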
Replies from: JoshuaZ, timtyler↑ comment by JoshuaZ · 2013-04-30T19:12:03.313Z · LW(p) · GW(p)
20 amino acids, so 19 * 20,000 one-amino-acid-changed variants.
Ok. Got it.
If you can find something by searching 380,000 cases, it wasn't an NP-hard problem.
This does not follow. Instances of NP-hard problems can be quite easy. For example, consider the traveling salesman with 2 cities or SAT with 2 or 3 variables. These are easy enough to solve by hand. But the general problems are still NP-hard.
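JoshuaZ's point can be shown directly: SAT is NP-hard in general, yet a brute-force solver dispatches a tiny instance instantly, because 2^n assignments is trivial for small n. (A toy sketch; the clause set below is made up for illustration.)

```python
from itertools import product

# A tiny CNF instance: (x or y) and (not x or z) and (not y or not z).
# Each clause is a list of (variable_index, is_positive) literals.
clauses = [[(0, True), (1, True)],
           [(0, False), (2, True)],
           [(1, False), (2, False)]]

def satisfiable(clauses, n_vars):
    # Exhaustive search over all 2**n_vars assignments.
    for assignment in product([False, True], repeat=n_vars):
        if all(any(assignment[i] == pos for i, pos in clause)
               for clause in clauses):
            return True
    return False

print(satisfiable(clauses, 3))  # True -- e.g. x=False, y=True, z=False
```

The hardness claim is asymptotic: it is about how the cost of this exhaustive loop scales as n grows, not about any particular small instance being difficult.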
Note incidentally that part of why proteins are difficult is that even when you do have all the variations, how they behave in the actual living thing is really complicated. You can't easily grow thousands of mice, each with a small protein difference, and see what happens. So there are a lot of associated problems with brute-forcing this sort of thing.
Replies from: CarlShulman↑ comment by CarlShulman · 2013-04-30T23:14:28.535Z · LW(p) · GW(p)
Removed for duplication with comment elsewhere that discussed padding.
Replies from: JoshuaZ↑ comment by JoshuaZ · 2013-04-30T23:24:40.576Z · LW(p) · GW(p)
Sure, but standard padding arguments can make NP-hard problems with much larger initial easy instances. It may be that there's some general heuristic that real life, non-constructed problems never act this way, but that's a distinct claim than what Eliezer seems to be making here.
↑ comment by timtyler · 2013-05-03T11:05:23.656Z · LW(p) · GW(p)
If you can find something by searching 380,000 cases, it wasn't an NP-hard problem.
The "NP" terminology is typically not a reference to hard problems are on an absolute scale. It's a about how the difficulty of the problem in a specified class changes as its scale increases. So, there's no issue with evolution solving particular instances of problems from classes of problem that are NP-hard - and talk about solving a whole class of such problems would be silly - all such classes consist of infinite sets of problems.
↑ comment by David_Gerard · 2013-05-02T15:52:45.201Z · LW(p) · GW(p)
For various values of "solve". NP-hard problems may not be analytically solvable, but numerical approximations can get pretty darn close, certainly enough for evolutionary advantage.
↑ comment by Qiaochu_Yuan · 2013-04-30T21:09:49.138Z · LW(p) · GW(p)
Is this your complete response? I guess I could expand this to "I expect all the problems an AI needs to solve on the way to an intelligence explosion to be easy in principle but hard in practice," and I guess I could expand your other comments to "the problem sizes an AI will need to deal with are small enough that asymptotic statements about difficulty won't come into play." Both of these claims seem like they require justification.
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T21:16:33.755Z · LW(p) · GW(p)
It's not meant as a response to everything, just noting that protein structure prediction can't be NP-hard. More generally, I tend to take P!=NP as a background assumption; I can't say I've worried too much about how the universe would look different if P=NP. I never thought superintelligences could solve NP-hard problems to begin with, since they're made out of wavefunction and quantum mechanics can't do that. My model of an intelligence explosion just doesn't include anyone trying to do anything NP-hard at any point, unless it's in the trivial sense of doing it for N=20 or something. Since I already expect things to local FOOM with P!=NP, adding P=NP doesn't seem to change much, even if the polynomial itself is small. Though Scott Aaronson seems to think there'd be long-term fun-theoretic problems because it would make so many challenges uninteresting. :)
Replies from: polypubs↑ comment by polypubs · 2013-05-01T23:04:19.800Z · LW(p) · GW(p)
Suppose there is a single A.I. with a 'Devote x % of resources to Smartening myself' directive. Suppose further that the A.I. is already operating with David Lewis's 'elite eligible' ways of carving up the world along its joints - i.e. it is climbing the right hill. Presumably, the Smartening module faces a race-hazard-type problem in deciding whether it is smarter to devote resources to evaluating returns to smartness or to just release resources back to existing operations. I suppose it could internally breed its own heuristics for Karnaugh-map-type pattern recognition so as to avoid falling into an NP problem. However, if NP-hard problems are like predators, there has to be a heuristic to stop the A.I. avoiding them to the extent of roaming uninteresting space and breeding only 'Spiegelman monster' or trivial or degenerate results. In other words the A.I.'s 'smarten yourself' module is now doing just enough to justify its upkeep but not so much as to endanger its own survival. At this point it is enough for there to be some exogenous shock or random discontinuity in the morphology of the fitness landscape for some sort of gender dimorphism and sexual selection to start taking place within the A.I., with speciation events and so on. However, this opens an exploit for parasites - i.e. humans - so FOOM cashes out as ...oh fuck, it's the episode of South Park with the cat saying 'O Long Johnson'. Beenakker's solution to Hempel's dilemma - http://en.wikipedia.org/wiki/Hempel's_dilemma - was wrong: 'The boundary between physics and metaphysics is NOT the boundary between what can and what cannot be computed in the age of the universe', because South Park has resolved every philosophical puzzle in the space of what - a few hundred hours?
comment by Shmi (shminux) · 2013-04-29T20:04:20.382Z · LW(p) · GW(p)
Some review notes as I go through it (at a bright dilettante level):
Section 1:
- I wonder if the chain-reaction model is a good one for recursive self-improvement, or is it just the scariest one? What other models have been investigated? For example, the chain-reaction model of financial investment would result in a single entity with the highest return rate dominating the Earth; this has not happened yet, to my knowledge.
Section 1.3:
- There was a recent argument here by David Pearce, I think, that an intelligent enough paperclip maximizer will have to self-modify to be more "humane". If I recall correctly, the logic was that in the process of searching the space of optimization options it will necessarily encounter an imperative against suffering or something to that effect, inevitably resulting in modifying its goal system to be more compassionate, the way humanity seems to be evolving. This would restrict the Orthogonality Thesis to the initial takeoff, and result in goal convergence later on. While this seems like wishful thinking, it might be worth addressing in some detail, beyond footnote 11.
Chapter 2:
- log(n) + log(log(n)) + ... seems to describe well the current rate of scientific progress, at least in high-energy physics
- empty space for a meditation seems out of place in a more-or-less formal paper
- the Moore's law example is an easy target for criticism, because it's an outlier: most current technologies independent of computer progress are probably improving linearly or even sublinearly with investment (power generation, for example)
- "total papers written" seems like a silly metric to measure scientific progress, akin to the Easter Island statue size.
- If the point of the intro is to say that all types of trends happen simultaneously, and we need "to build an underlying causal model" of all trends, not cherry-pick one of them, then it is probably good to say upfront.
Section 2.1:
- the personal encounter with Kurzweil belongs perhaps in a footnote, not in the main text
- the argument that the Moore's law will speed up if it's reinvested into human cognitive speedup (isn't it now, to some degree?) assumes that faster computers are a goal in themselves, I think. If there is no economic or other external reason to make faster/denser/more powerful computers, why would the cognitively improved engineers bother, and who would pay them to do so?
- In general, the section seems quite poorly written, more like a stream of consciousness than a polished piece. It needs a decent summary upfront, at the very least. And probably a few well-structured subsections, one on the FOOM debate, one on the validity of the outside view, one on the Lucas critique, etc. It may also be worth discussing why Hanson apparently remains unconvinced.
I might add more later.
Replies from: Eliezer_Yudkowsky, Vaniver, satt, yli↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-04T01:29:05.553Z · LW(p) · GW(p)
For example, the chain-reaction model of financial investment would result in a single entity with the highest return rate dominating the Earth, this has not happened yet, to my knowledge.
Like... humans? Or the way that medieval moneylenders aren't around anymore, and a different type of financial organization seems to have taken over the world instead? See also the discussion of China catching up to Australia.
Replies from: shminux↑ comment by Shmi (shminux) · 2013-05-05T04:26:20.576Z · LW(p) · GW(p)
Fair point about human domination. Though I'm not sure how it fits into the chain reaction model. Maybe reinvestment of knowledge into more knowledge does, not intelligence into more intelligence. As for financial investments, I don't know of any organization emerging as a singleton.
↑ comment by Vaniver · 2013-04-29T20:40:26.038Z · LW(p) · GW(p)
If I recall correctly, the logic was that in the process of searching the space of optimization options it will necessarily encounter an imperative against suffering or something to that effect, inevitably resulting in modifying its goal system to be more compassionate, the way humanity seems to be evolving.
I see no reason to suspect the space of optimization options contains value imperatives, assuming the AI is guarded against the equivalent of SQL injection attacks.
Humanity seems to be evolving towards compassion because the causal factors increasing compassion are on average profitable for individual humans with those factors. The easy example of this is stable, strong police forces routinely hanging murderers, instead of those murderers profiting from their actions. If you don't have an analogue of the police, then you shouldn't expect the analogue of the reduction in murders.
(I should remark that I very much like the way this report is focused; I think that trying to discuss causal models explicitly is much better than trying to make surface-level analogies.)
- empty space for a meditation seems out of place in a more-or-less formal paper
At the very least, using a page break rather than a bunch of ellipses seems better.
Replies from: shminux↑ comment by Shmi (shminux) · 2013-04-29T23:37:13.274Z · LW(p) · GW(p)
Humanity seems to be evolving towards compassion because the causal factors increasing compassion are on average profitable for individual humans with those factors.
I was simply paraphrasing David Pearce, it's not my opinion, so no point arguing with me. That said, your argument seems misdirected in another way: the imperative against suffering applies to people and animals whose welfare is not in any way beneficial and sometimes even detrimental to those exhibiting compassion.
Replies from: Jayson_Virissimo, Vaniver↑ comment by Jayson_Virissimo · 2013-04-30T01:58:44.961Z · LW(p) · GW(p)
Yeah, but they are losing compassion for other things (unborn babies, gods, etc...). What reason is there to believe there is a net gain in compassion, rather than simply a shift in the things to be compassionate towards?
EDIT: This should have been directed towards Vaniver rather than shminux.
Replies from: davidpearce, Will_Newsome↑ comment by davidpearce · 2013-05-17T11:37:44.693Z · LW(p) · GW(p)
An expanding circle of empathetic concern needn't reflect a net gain in compassion. Naively, one might imagine that e.g. vegans are more compassionate than vegetarians. But I know of no evidence this is the case. Tellingly, female vegetarians outnumber male vegetarians by around 2:1, but male and female vegans occur in roughly equal numbers. So an expanding circle may reflect our reduced tolerance of inconsistency / cognitive dissonance. Men are more likely to be utilitarian hyper-systematisers.
Replies from: Nornagest, Jayson_Virissimo↑ comment by Nornagest · 2013-05-18T08:45:02.550Z · LW(p) · GW(p)
Does your source distinguish between motivations for vegetarianism? It's plausible that the male:female vegetarianism rates are instead motivated by (e.g.) culture-linked diet concerns -- women adopt restricted diets of all types significantly more than men -- and that ethically motivated vegetarianism occurs at similar rates, or that self-justifying ethics tend to evolve after the fact.
Replies from: davidpearce↑ comment by davidpearce · 2013-05-19T10:40:19.342Z · LW(p) · GW(p)
Nornagest, fair point. See too "The Brain Functional Networks Associated to Human and Animal Suffering Differ among Omnivores, Vegetarians and Vegans" : http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0010847
↑ comment by Jayson_Virissimo · 2013-05-17T23:23:42.171Z · LW(p) · GW(p)
An expanding circle of empathetic concern needn't reflect a net gain in compassion. Naively, one might imagine that e.g. vegans are more compassionate than vegetarians. But I know of no evidence this is the case. Tellingly, female vegetarians outnumber male vegetarians by around 2:1, but male and female vegans occur in roughly equal numbers. So an expanding circle may reflect our reduced tolerance of inconsistency / cognitive dissonance. Men are more likely to be utilitarian hyper-systematisers.
Right. What I should have said was:
What reason is there to believe that people are compassionate towards more types of things, rather than merely different types of things?
Replies from: davidpearce
↑ comment by davidpearce · 2013-05-18T08:24:05.257Z · LW(p) · GW(p)
The growth of science has led to a decline in animism. So in one sense, our sphere of concern has narrowed. But within the sphere of sentience, I think Singer and Pinker are broadly correct. Also, utopian technology makes even the weakest forms of benevolence vastly more effective. Consider, say, vaccination. Even if, pessimistically, one doesn't foresee any net growth in empathetic concern, technology increasingly makes the costs of benevolence trivial.
[Once again, I'm not addressing here the prospect of hypothetical paperclippers - just mind-reading humans with a pain-pleasure (dis)value axis.]
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-05-18T09:28:41.131Z · LW(p) · GW(p)
But within the sphere of sentience, I think Singer and Pinker are broadly correct.
Would this be the same Singer who argues that there's nothing wrong with infanticide?
Replies from: davidpearce↑ comment by davidpearce · 2013-05-18T11:23:49.076Z · LW(p) · GW(p)
On (indirect) utilitarian grounds, we may make a strong case that enshrining the sanctity of life in law will lead to better consequences than legalising infanticide. So I disagree with Singer here. But I'm not sure Singer's willingness to defend infanticide as (sometimes) the lesser evil is a counterexample to the broad sweep of the generalisation of the expanding circle. We're not talking about some Iron Law of Moral Progress.
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-05-19T08:18:08.592Z · LW(p) · GW(p)
But I'm not sure Singer's willingness to defend infanticide as (sometimes) the lesser evil
If I recall correctly Singer's defense is that it's better to kill infants than have them grow up with disabilities. The logic here relies on excluding infants and to a certain extent people with disabilities from our circle of compassion.
is a counterexample to the broad sweep of the generalisation of the expanding circle. We're not talking about some Iron Law of Moral Progress.
You may want to look at gwern's essay on the subject. By the time you finish taking into account all the counterexamples your generalization looks more like a case of cherry-picking examples.
Replies from: davidpearce, timtyler↑ comment by davidpearce · 2013-05-19T09:17:39.178Z · LW(p) · GW(p)
Eugine, are you doing Peter Singer justice? What motivates Singer's position isn't a range of empathetic concern that's stunted in comparison to people who favour the universal sanctity of human life. Rather it's a different conception of the threshold below which a life is not worth living. We find similar debates over the so-called "Logic of the Larder" for factory-farmed non-human animals: http://www.animal-rights-library.com/texts-c/salt02.htm. Actually, one may agree with Singer - both his utilitarian ethics and bleak diagnosis of some human and nonhuman lives - and still argue against his policy prescriptions on indirect utilitarian grounds. But this would take us far afield.
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-05-21T04:33:09.930Z · LW(p) · GW(p)
What motivates Singer's position isn't a range of empathetic concern that's stunted in comparison to people who favour the universal sanctity of human life. Rather it's a different conception of the threshold below which a life is not worth living.
By this logic most of the people from the past who Singer and Pinker cite as examples of less empathic individuals aren't less empathic either. But seriously, has Singer made any effort to take into account, or even look at, the preferences of any of the people who he claims have lives that aren't worth living?
Replies from: davidpearce↑ comment by davidpearce · 2013-05-21T09:11:59.916Z · LW(p) · GW(p)
I disagree with Peter Singer here. So I'm not best placed to argue his position. But Singer is acutely sensitive to the potential risks of any notion of lives not worth living. Recall Singer lost three of his grandparents in the Holocaust. Let's just say it's not obvious that an incurable victim of, say, infantile Tay–Sachs disease, who is going to die at around four years old after a chronic pain-ridden existence, is better off alive. We can't put this question to the victim: the nature of the disorder means s/he is not cognitively competent to understand it.
Either way, the case for the expanding circle doesn't depend on an alleged growth in empathy per se. If, as I think quite likely, we eventually enlarge our sphere of concern to the well-being of all sentience, this outcome may owe as much to the trait of high-AQ hyper-systematising as any widening or deepening compassion. By way of example, consider the work of Bill Gates in cost-effective investments in global health (vaccinations etc) and indeed in: http://www.thegatesnotes.com/Features/Future-of-Food ("the future of meat is vegan"). Not even his greatest admirers would describe Gates as unusually empathetic. But he is unusually rational - and the growth in secular scientific rationalism looks set to continue.
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-05-22T05:43:05.373Z · LW(p) · GW(p)
But Singer is acutely sensitive to the potential risks of any notion of lives not worth living.
I'm not sure what you mean by "sensitive", it certainly doesn't stop him from being at the cutting edge pushing in that direction.
Either way, the case for the expanding circle doesn't depend on an alleged growth in empathy per se. If, as I think quite likely, we eventually enlarge our sphere of concern to the well-being of all sentience, this outcome may owe as much to the trait of high-AQ hyper-systematising as any widening or deepening compassion.
By way of example, consider the work of Bill Gates in cost-effective investments in global health (vaccinations etc) and indeed in: http://www.thegatesnotes.com/Features/Future-of-Food ("the future of meat is vegan"). Not even his greatest admirers would describe Gates as unusually empathetic. But he is unusually rational - and the growth in secular scientific rationalism looks set to continue.
You seem to be confusing expanding the circle of beings we care for and being more efficient in providing that caring.
Replies from: davidpearce↑ comment by davidpearce · 2013-05-22T07:58:25.896Z · LW(p) · GW(p)
Cruelty-free in vitro meat can potentially replace the flesh of all sentient beings currently used for food. Yes, it's more efficient; it also makes high-tech Jainism less of a pipedream.
↑ comment by timtyler · 2013-06-10T01:06:08.149Z · LW(p) · GW(p)
If I recall correctly Singer's defense is that it's better to kill infants than have them grow up with disabilities. The logic here relies on excluding infants and to a certain extent people with disabilities from our circle of compassion.
As I understand the common arguments for legalizing infanticide, it involves weighting the preferences of the parents and society more - not a complete discounting of the infant's preferences.
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-06-10T01:19:47.182Z · LW(p) · GW(p)
As I understand the common arguments for legalizing infanticide, it involves weighting the preferences of the parents and society more - not a complete discounting of the infant's preferences.
Try replacing "infanticide" (and "infant's") in that sentence with "killing Jews" or "enslaving Blacks". Would you also argue that it's not excluding Jews or Blacks from the circle of compassion?
Replies from: timtyler↑ comment by timtyler · 2013-06-10T01:25:41.975Z · LW(p) · GW(p)
It seems like a silly question. Practically everyone discounts the preferences of the very young. They can't vote, and below some age, are widely agreed to have practically no human rights, and are generally eligible for death on parental whim.
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-06-12T05:12:54.953Z · LW(p) · GW(p)
Well, the same applies even more strongly to animals, but the people arguing for the "expanding circle of compassion" idea like to cite vegetarianism as an example of this phenomenon.
Replies from: timtyler↑ comment by timtyler · 2013-06-13T00:20:55.160Z · LW(p) · GW(p)
Well, sure, but adult human females have preferences too, and they are quite significant ones. An "expanding circle of compassion" doesn't necessarily imply equal weights for everyone.
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-06-13T02:46:06.867Z · LW(p) · GW(p)
Well, sure, but adult human females have preferences too, and they are quite significant ones.
So did slave owners.
An "expanding circle of compassion" doesn't necessarily imply equal weights for everyone.
At the point where A's inconvenience justifies B's being killed you've effectively generalized the "expanding circle of compassion" idea into meaninglessness.
Replies from: timtyler↑ comment by timtyler · 2013-06-13T10:10:48.310Z · LW(p) · GW(p)
Well, sure, but adult human females have preferences too, and they are quite significant ones.
So did slave owners.
Sure.
An "expanding circle of compassion" doesn't necessarily imply equal weights for everyone.
At the point where A's inconvenience justifies B's being killed you've effectively generalized the "expanding circle of compassion" idea into meaninglessness.
Singer's obviously right about the "expanding circle" - it's a real phenomenon. If A is a human and B is a radish, A killing B doesn't seem too awful. Singer claims newborns are rather like that - in being too young to have much in the way of preferences worthy of respect.
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-06-15T04:20:42.673Z · LW(p) · GW(p)
Singer's obviously right about the "expanding circle" - it's a real phenomenon.
Um, this is precisely the point of disagreement, and given that your next sentence is about the position that babies have the moral worth of radishes I don't see how you can assert that with a straight face.
Replies from: timtyler↑ comment by timtyler · 2013-06-15T14:13:55.944Z · LW(p) · GW(p)
Singer's obviously right about the "expanding circle" - it's a real phenomenon.
Um, this is precisely the point of disagreement,
I didn't know that. I normally take this for granted.
Some conventional cites on the topic are: Singer and Dawkins.
Replies from: Eugine_Nier↑ comment by Eugine_Nier · 2013-06-16T04:05:50.924Z · LW(p) · GW(p)
I didn't know that. I normally take this for granted.
Some conventional cites on the topic are: Singer and Dawkins.
You just steelmanned Singer's position into the claim that babies have the moral worth of radishes, and it hasn't occurred to you that he might not be the best person to cite when arguing for an expanding moral circle?
Sorry, but I have to ask: Are you trolling?
↑ comment by Will_Newsome · 2013-04-30T18:06:26.004Z · LW(p) · GW(p)
I find it really weird that I don't recall having seen that piece of rhetoric before. (ETA: Argh, dangerously close to politics here. Retracting this comment.)
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T18:11:57.951Z · LW(p) · GW(p)
I wish I could upvote your retraction.
Replies from: Adele_L↑ comment by Adele_L · 2013-04-30T19:08:19.290Z · LW(p) · GW(p)
The closest thing I have seen to this sort of idea is this:
http://www.gwern.net/The%20Narrowing%20Circle
Replies from: Jayson_Virissimo↑ comment by Jayson_Virissimo · 2013-05-01T04:50:00.270Z · LW(p) · GW(p)
Wow, an excellent essay!
If I remember correctly, I started thinking along these lines after hearing Robert Garland lecture on ancient Egyptian religion. As a side-note to a discussion about how they had little sympathy for the plight of slaves and those in the lower classes of society (since this was all part of the eternal cosmic order and as it should be), he mentioned that they would likely think that we are the cruel ones, since we don't even bother to feed and clothe the gods, let alone worship them (and the gods, of course, are even more important than mere humans, making our lack of concern all the more horrible).
Replies from: gwern↑ comment by gwern · 2013-05-01T20:01:56.680Z · LW(p) · GW(p)
Any idea where Garland might've written that up? All the books listed in your link sound like they'd be on Greece, not Egypt.
Replies from: Jayson_Virissimo↑ comment by Jayson_Virissimo · 2013-05-04T07:10:18.134Z · LW(p) · GW(p)
It was definitely a lecture, not a book. Maybe I'll track it down when I get around to Ankifying my Ancient Egypt notes.
↑ comment by Vaniver · 2013-04-30T14:24:23.666Z · LW(p) · GW(p)
it's not my opinion, so no point arguing with me.
It seems beneficial to make sure my understanding of why Pearce's argument fails matches that of others, even if I don't need to convince you that it fails.
the imperative against suffering applies to people and animals whose welfare is not in any way beneficial and sometimes even detrimental to those exhibiting compassion.
I interpret imperatives as "you should X," where the operative word is the "should," even if the content is the "X." It is not at all obvious to me why Pearce expects the "should" to be convincing to a paperclipper. That is, I don't think there is a logical argument from arbitrary premises to adopt a preference for not harming beings that can feel pain, even though the paperclipper may imagine a large number of unconvincing logical arguments whose conclusion is "don't harm beings that can feel pain if it is costless to avoid" on the way to accomplishing its goals.
Replies from: davidpearce↑ comment by davidpearce · 2013-05-16T19:21:58.741Z · LW(p) · GW(p)
Perhaps it's worth distinguishing the Convergence vs Orthogonality theses for: 1) biological minds with a pain-pleasure (dis)value axis. 2) hypothetical paperclippers.
Unless we believe that the expanding circle of compassion is likely to contract, IMO a strong case can be made that rational agents will tend to phase out the biology of suffering in their forward light-cone. I'm assuming, controversially, that superintelligent biological posthumans will not be prey to the egocentric illusion that was fitness-enhancing on the African savannah. Hence the scientific view-from-nowhere, i.e. no arbitrarily privileged reference frames.
But what about 2? I confess I still struggle with the notion of a superintelligent paperclipper. But if we grant that such a prospect is feasible and even probable, then I agree the Orthogonality thesis is most likely true.
Replies from: Eugine_Nier, Vaniver↑ comment by Eugine_Nier · 2013-05-17T02:04:52.921Z · LW(p) · GW(p)
Unless we believe that the expanding circle of compassion is likely to contract
As mentioned elsewhere in this thread, it's not obvious that the circle is actually expanding right now.
↑ comment by Vaniver · 2013-05-16T20:04:39.406Z · LW(p) · GW(p)
Unless we believe that the expanding circle of compassion is likely to contract, IMO a strong case can be made that rational agents will tend to phase out the biology of suffering in their forward light-cone.
This reads to me as "unless we believe conclusion ~X, a strong case can be made for X," which makes me suspect that I made a parse error.
that superintelligent biological posthumans will not be prey to the egocentric illusion that was fitness-enhancing on the African savannah
This is a negative statement: "synthetic superintelligences will not have property A, because they did not come from the savanna." I don't think negative statements are as convincing as positive statements: "synthetic superintelligences will have property ~A, because ~A will be rewarded in the future more than A."
I suspect that a moral "view from here" will be better at accumulating resources than a moral "view from nowhere," both now and in the future, for reasons I can elaborate on if they aren't obvious.
Replies from: davidpearce↑ comment by davidpearce · 2013-05-16T21:53:34.971Z · LW(p) · GW(p)
There is no guarantee that greater perspective-taking capacity will be matched with equivalent action. But presumably greater empathetic concern makes such action more likely. [cf. Steven Pinker's "The Better Angels of Our Nature". Pinker aptly chronicles e.g. the growth in consideration of the interests of nonhuman animals; but this greater concern hasn't (yet) led to an end to the growth of factory-farming. In practice, I suspect in vitro meat will be the game-changer.]
The attributes of superintelligence? Well, the growth of scientific knowledge has been paralleled by a growth in awareness - and partial correction - of all sorts of cognitive biases that were fitness-enhancing in the ancestral environment of adaptedness. Extrapolating, I was assuming that full-spectrum superintelligences would be capable of accessing and impartially weighing all possible first-person perspectives and acting accordingly. But I'm making a lot of contestable assumptions here. And see too the perils of: http://en.wikipedia.org/wiki/Apophatic_theology
↑ comment by satt · 2013-04-30T00:02:17.139Z · LW(p) · GW(p)
- log(n) + log(log(n)) + ... seems to describe well the current rate of scientific progress, at least in high-energy physics
I'm going to commit pedantry: nesting enough logarithms eventually gives an undefined term (unless n's complex!). So where Eliezer says "the sequence log(w) + log(log(w)) + log(log(log(w))) will converge very quickly" (p. 4), that seems wrong, although I see what he's getting at.
Replies from: None, Eliezer_Yudkowsky, GuySrinivasan, yli↑ comment by [deleted] · 2013-04-30T00:17:17.217Z · LW(p) · GW(p)
It really bothers me that he calls it a sequence instead of a series (maybe he means the sequence of partial sums?), and that it's not written correctly.
The series doesn't converge because log(w) doesn't have a fixed point at zero.
It makes sense if you replace log(w) with log^+(w) = max{ log(w), 0 }, which is sometimes written as log(w) in computer science papers where the behavior on (0, 1] is irrelevant.
I suppose that amounts to assuming there's some threshold of cognitive work under which no gains in performance can be made, which seems reasonable.
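Concretely (my own illustration, not from the paper or the parent comment): iterating log^+ from any starting value reaches the fixed point at zero after a handful of steps, so the series becomes a finite sum:

```python
import math

def log_plus(x):
    # log^+(x) = max(log(x), 0): unlike plain log, it has a fixed point at 0,
    # so iterating it terminates instead of leaving the domain.
    return max(math.log(x), 0.0) if x > 0 else 0.0

w = 1e9
terms = []
x = w
while True:
    x = log_plus(x)
    if x == 0.0:
        break
    terms.append(x)

# log^+(w) + log^+(log^+(w)) + ... : only finitely many nonzero terms.
total = sum(terms)
```

Starting from w = 10^9, the iteration hits zero after four steps, so the "series" converges trivially, exactly as the threshold interpretation suggests.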
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-04T01:29:22.009Z · LW(p) · GW(p)
Now fixed, I hope.
Replies from: None↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T18:23:39.373Z · LW(p) · GW(p)
Since this apparently bothers people, I'll try to fix it at some point. A more faithful statement would be that we start by investing work w, get a return w2 ~ log(w), reinvest it to get a new return log(w + w2) - log(w) = log ((w+w2)/w). Even more faithful to the same spirit of later arguments would be that we have y' ~ log(y) which is going to give you basically the same growth as y' = constant, i.e., whatever rate of work output you had at the beginning, it's not going to increase significantly as a result of reinvesting all that work.
I'm not sure how to write either more faithful version so that the concept is immediately clear to the reader who does not pause to do differential equations in their head (even if simple ones).
Replies from: Vaniver↑ comment by Vaniver · 2013-05-01T00:53:14.789Z · LW(p) · GW(p)
Well, suppose cognitive power (in the sense of amount of cognitive work per unit time) is a function of total effort invested so far, like P = 1 - e^(-w). Then it's obvious that while dP/dw = e^(-w) is always positive, it rapidly decreases to basically zero, and total cognitive power converges to some theoretical maximum.
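A quick numerical sketch of this toy model (my illustration; the functional form is just the example above):

```python
import math

def power(w):
    # Toy model: cognitive power as a function of total invested work w,
    # saturating at a theoretical maximum of 1.
    return 1 - math.exp(-w)

def marginal(w):
    # dP/dw = e^{-w}: always positive, but falls off to essentially zero.
    return math.exp(-w)

# (w, total power, marginal return) at increasing investment levels.
samples = [(w, power(w), marginal(w)) for w in (1, 5, 10, 20)]
```

Total power climbs from ~0.63 toward 1 while the marginal return collapses by many orders of magnitude, which is the "positive derivative, bounded total" point made above.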
↑ comment by SarahNibs (GuySrinivasan) · 2013-04-30T18:44:55.235Z · LW(p) · GW(p)
This is in the context of reinvesting dividends of cognitive work, assuming it takes exponentially greater investments to produce linearly greater returns. For example, maybe we get a return of log(X) cognitive work per time with what we have now, and to get returns of log(X+k) per time we need to have invested X+k cognitive work. What does it look like to reinvest all of our dividends? After dt, we have invested X+log(X) and our new return is log(X+log(X)). After 2dt, we have invested X+log(X)+log(X+log(X)), etc.
The corrected paragraph would then look like:
Therefore, an AI trying to invest an amount of cognitive work w to improve its own performance will get returns that go as log(w), or if further reinvested, an additional log(1+log(w)/w), and the sequence log(w)+log(1+log(w)/w)+log(1+log(w+log(w))/(w+log(w))) will converge very quickly.
Except then it's not at all clear that the series converges quickly. Let's check... we could say the capital over time is f(t), with f(0)=w, and the derivative at t is f'(t)=log(f(t)). Then our capital over time is f(t)=li^(-1)(t+li(w)). This makes our capital / log-capital approximately linear, so our capital is superlinear, but not exponential.
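The "superlinear, but not exponential" claim can be checked numerically (my own sketch, integrating the same ODE f'(t) = log(f(t)) with forward Euler):

```python
import math

def integrate(w0, t_end, dt=1e-3):
    """Forward-Euler integration of f'(t) = log(f(t)), f(0) = w0."""
    f, t = w0, 0.0
    while t < t_end:
        f += math.log(f) * dt
        t += dt
    return f

f50 = integrate(10.0, 50.0)
f100 = integrate(10.0, 100.0)

# Superlinear: later increments outpace earlier ones.
superlinear = (f100 - f50) > (f50 - 10.0)
# Subexponential: growth ratios over equal intervals are shrinking.
subexponential = (f100 / f50) < (f50 / 10.0)
```

Both flags come out true: capital grows faster than linearly but its doubling time keeps lengthening, consistent with f(t) = li^(-1)(t + li(w)).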
comment by gjm · 2013-04-30T14:04:27.831Z · LW(p) · GW(p)
The discussion of the Yudkowsky-Hanson debate feels rather out of place. The points made are mostly highly relevant to the paper; the fact that they were made during an online debate is less so; the particular language used by either side in that debate still less. This discussion is also particularly informal and blog-post-like (random example: footnote 30), which may or may not be a problem depending on the intended audience for the paper.
I'd recommend major reworking of this section, still addressing the same issues but no longer so concerned with what each party said, or thought, or was happy to concede during that particular debate.
comment by ModusPonies · 2013-04-29T17:28:00.594Z · LW(p) · GW(p)
I am glad to see this report. I've felt that MIRI was producing less cool stuff than I would've expected, but this looks like it will go a long way towards addressing that. I am revising my opinion of the organization upwards. I look forward to reading this, and commit to having done so by the end of this weekend.
comment by Vaniver · 2013-04-29T18:55:21.781Z · LW(p) · GW(p)
Comments:
Given human researchers of constant speed, computing speeds double every 18 months.
Human researchers, using top-of-the-line computers as assistants. I get the impression this matters more for chip design than litho-tool design, but it definitely helps for those too.
Humans have around four times the brain volume of chimpanzees, but the difference between us is probably mostly software algorithms.
Is 'software algorithms' the right phrase? I'd characterize the improvements more as firmware or hardware improvements. [edit] Later you use the phrase "cognitive algorithms," which I'm much happier with.
A more concrete example you can use to replace the handwaving: one of the big programming productivity boosters is a second monitor, which seems directly related to low human working memory. It's easy to imagine minds with superior working memory able to handle much more complicated models and tasks. (We indeed seem to see this diversity among humans.)
In particular, your later arguments on serial causal depth seem like they would benefit from explicitly considering working memory as well as speed.
Any lab that shuts down overnight so its researchers can sleep must be limited by serial cause and effect in researcher brains more than serial cause and effect in instruments- researchers who could work without sleep would correspondingly speed up the lab.
I don't know about you, but I do research in my sleep, and my lab never shuts off our computers because we often have optimization processes running overnight (on every computer in the lab).
It is the case that most of the cycle time in research is due to the human researchers rather than to computer speed (each month, on average, there might be about a week that's code-limited rather than human-limited), but this example as you present it is unconvincing.
Replies from: TheOtherDave, timtyler↑ comment by TheOtherDave · 2013-04-29T19:51:55.996Z · LW(p) · GW(p)
It's easy to imagine minds with superior working memory able to handle much more complicated models and tasks. [..] In particular, your later arguments on serial causal depth seem like they would benefit from explicitly considering working memory
Strong, albeit anecdotal, agreement.
Working memory capacity was a large part of what my stroke damaged, and in colloquial terms I was just stupid, relatively speaking, until that healed/retrained. I was fine when dealing with simple problems, but add even a second level of indirection and I just wasn't able to track. The effect is at least subjectively highly nonlinear.
Replies from: Vaniver↑ comment by Vaniver · 2013-04-29T20:31:15.835Z · LW(p) · GW(p)
Incidentally, I think this is the strongest argument against Egan's General Intelligence Theorem (or, alternatively, Deutsch's "Universal Explainer" argument from The Beginning of Infinity). Yes, humans could in theory come up with arbitrarily complex causal models, and that's sufficient to understand an arbitrarily complex causal system, but in practice, unaided humans are limited to rather simple models. Yes, we're very good at making use of aids (I'm reminded of how much writing helps thinking whenever I try to do a complicated calculation in my head), but those limitations represent a plausible way for meaningful superhuman intelligence to be possible.
Replies from: TheOtherDave↑ comment by TheOtherDave · 2013-04-29T20:53:00.897Z · LW(p) · GW(p)
I hope never to forget the glorious experience of re-inventing the concept of lists, about two weeks into my recovery. I suddenly became indescribably smarter.
In the same vein, I have been patiently awaiting the development of artificial working-memory cognitive buffers. As you say, for practical purposes this is superhuman intelligence.
Replies from: Luke_A_Somers↑ comment by Luke_A_Somers · 2013-04-30T17:40:39.721Z · LW(p) · GW(p)
Gaaah. I hate brain damage.
Congratulations on your discovery, anyway.
Replies from: TheOtherDave↑ comment by TheOtherDave · 2013-04-30T18:15:04.734Z · LW(p) · GW(p)
Yeah, you and me both, brother.
↑ comment by timtyler · 2013-05-04T01:33:43.362Z · LW(p) · GW(p)
Given human researchers of constant speed, computing speeds double every 18 months.
Human researchers, using top-of-the-line computers as assistants.
Indeed. For me, that was the most glaring conceptual problem. That and attempting to predict the course of evolution with minimal reference to evolutionary theory. There is a literature on how cultural systems evolve. For a specific instance see this:
The third tipping point was the appearance of technology capable of accumulating and manipulating vast amounts of information outside humans, thus removing them as bottlenecks to a seemingly self-perpetuating process of knowledge explosion.
comment by gjm · 2013-04-30T16:40:04.389Z · LW(p) · GW(p)
I am unconvinced by the argument that H. sapiens can't be right at a limit on brain size because some people have larger-than-average heads without their mothers being dead.
Presumably head size is partly determined by environmental factors outside genetic control, and presumably having your mother die in childbirth is a really big disadvantage, much worse than being slightly less intelligent. If that's so, then what should it look like if we, as a species, are hard against that wall? (Which I take to mean that any overall increase in head size would be bad even if being cleverer is a big advantage.) I suggest we'd see head sizes that are far enough away from outright disaster that, even given that random environmental variation, death in childbirth is still pretty rare, but not completely unknown. And, of course, that's just what we see; death in childbirth is very rare now, in prosperous advanced countries, but if you go back 100 years or look in less fortunate parts of the world it's not so rare at all.
This could be quantified, at least kinda. We could look at how the frequency of death in childbirth, in places without modern medical care, varies with head size (though controlling this adequately would be hard); we could look at the (rather weak and confounded with other things) relationship between head size and intelligence; and we could say that it looks as if d(IQ)/d(head size) points of IQ are about as valuable, evolutionarily, as a probability -d(maternal mortality)/d(head size) of having your mother die when you're born.
Without doing that calculation or something like it, I don't think Eliezer is justified in saying that it doesn't look as if there's been strong selective pressure for bigger brains; it seems to me like the pressure could be pretty strong without the picture being qualitatively different from what we actually see.
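To make the shape of that calculation concrete, here is a sketch with purely hypothetical placeholder numbers (none of these slopes are real estimates; they only illustrate the break-even comparison):

```python
# All numbers below are hypothetical placeholders, not measured values.
iq_per_cm = 0.5               # assumed d(IQ)/d(head size): IQ points per extra cm
fitness_per_iq = 0.002        # assumed relative fitness gain per IQ point
mortality_per_cm = 0.01       # assumed d(maternal mortality)/d(head size) per cm
cost_of_maternal_death = 0.5  # assumed relative fitness cost to the child

# Marginal fitness gain from a slightly bigger head, via intelligence...
marginal_gain = iq_per_cm * fitness_per_iq
# ...versus marginal fitness loss via maternal mortality risk.
marginal_loss = mortality_per_cm * cost_of_maternal_death

# At an evolutionary optimum these two would balance; with these assumed
# numbers the loss dominates, i.e. selection against larger heads.
```

The empirical question is then just which of the two products is larger at the current head-size distribution, which is exactly the d(IQ)/d(head size) vs. d(maternal mortality)/d(head size) comparison described above.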
Replies from: None, RolfAndreassen↑ comment by RolfAndreassen · 2013-05-03T16:21:08.179Z · LW(p) · GW(p)
I get the impression that deaths from childbirth don't, usually, come about because the head is too large, but because there are other complications: Breech births, especially.
comment by lukeprog · 2013-05-08T02:36:09.626Z · LW(p) · GW(p)
On page 15, you write:
the Moore’s-like law for serial processor speeds broke down in 2004
No citation is given, but I found one: Fuller & Millett (2011). The paper includes this handy graph:
Replies from: lukeprog↑ comment by lukeprog · 2013-05-08T03:16:26.589Z · LW(p) · GW(p)
The book behind the paper includes lots more detail. Its introduction is quite cheery:
The end of dramatic exponential growth in single-processor performance marks the end of the dominance of the single microprocessor in computing... There is no guarantee that we can make parallel computing as common and easy to use as yesterday’s sequential single-processor computer systems, but unless we aggressively pursue efforts suggested by the recommendations below, it will be “game over” for growth in computing performance. If parallel programming and related software efforts fail to become widespread, the development of exciting new applications that drive the computer industry will stall; if such innovation stalls, many other parts of the economy will follow suit.
Replies from: None
↑ comment by [deleted] · 2013-05-09T14:52:15.707Z · LW(p) · GW(p)
I have an engineer friend who has recently put forward the idea that computing technology is approaching becoming a 'mature' technology, like the automobile in the 1950s: it gets a job done and does it well, and every change made after that point is a matter of small incremental tweaks. Yeah, you get twice the gas mileage now as you did then after a load of small changes with diminishing returns, but is it really all that different? Other friends of mine working as programmers have reacted favorably when I relayed this idea.
Also, why should slower development of new applications for the computer industry kill the economy?
comment by gjm · 2013-04-30T11:54:11.321Z · LW(p) · GW(p)
The discussion of Moore's law, faster engineers, hyperbolic growth, etc., seems to me to come close to an important point but not actually engage with it.
As the paper observes, a substantial part of the work of modern CPU design is already done by computers. So why don't we see hyperbolic rather than merely Moorean growth? One reason would be that as long as some fraction of the work, bounded away from zero, is done by human beings, you don't get the superexponential speedup, for obvious Amdahl-ish reasons. The human beings end up being the rate-limiting factor.
Now suppose we take human beings out of the loop entirely. Is the whole thing now in the hands of the ever-speeding-up computers? Alas, no. When some new technology is developed that enables denser circuitry, Intel and their rivals have to actually build the fabs before reaping the benefits by making faster CPUs. And that building activity doesn't speed up exponentially, and indeed its cost increases rapidly from one generation to the next.
There are things that are purely a matter of clever design. For instance, some of the increase in speed of computer programs over the years has come not from the CPUs but from the compilers. But they improve really slowly; hence "Proebsting's Law", coined tongue-in-cheek by Todd Proebsting, that compiler improvements give you a doubling in speed every 18 years. (It's obvious that even if this were meant seriously it couldn't continue at that rate for very long.)
This doesn't refute any claim made in the paper (since it doesn't, e.g., say that actual literal hyperbolic growth is to be expected) but I think it does suggest some downward adjustment in how rapid we should expect self-improvement to be.
Perhaps this sort of thing is discussed later in the paper -- I've only read about the first 1/4. If so, I'll edit or retract this as appropriate when I find it.
[EDITED to add: there is indeed some discussion of this, in section 3.3, around page 48. The arguments there against sensorimotor bottlenecks greatly reducing foominess seem handwavy to me.]
comment by Shmi (shminux) · 2013-04-29T23:47:55.537Z · LW(p) · GW(p)
Section 5:
The initial effort needed to get some numerical models going may be overestimated, unless such models have been done already. At the very least, a small-scale effort can pinpoint the hard issues. This reminds me of core-collapse supernova modeling: it was reasonably easy to get the explosion modeled, except for the ignition by the initial shock wave. We still don't know what exactly makes them go FOOM. Most models predict a fizzle instead of an explosion. This is likely just a surface analogy, but it might well be that a few months of summer-student-level simulations, as opposed to a few years of PhD-level work, would point to the weak links in the model.
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-29T00:13:50.050Z · LW(p) · GW(p)
Two quick notes on the current text: Kasparov was apparently reading the forum of the opposing players during his chess victory in Kasparov vs. The World, which doesn't quite invalidate the outcome as evidence but does weaken the strength of evidence for cognitive (non-)scaling with population. Also Garett Jones made some relevant remarks here which I should've cited in the discussion of how science scales with scientific population and invested money (or rather, how it doesn't scale).
Replies from: Will_Newsome↑ comment by Will_Newsome · 2013-04-30T18:24:13.096Z · LW(p) · GW(p)
Anyone know the details of Karpov vs. The World? (Here are more GM vs. World games; most involve both sides using stronger-than-human chess engines.)
comment by Shmi (shminux) · 2013-05-01T20:55:18.894Z · LW(p) · GW(p)
Here is my takeaway from the report. It is not a summary, and some of the implications are mine.
The 4 Theses (conjectures, really):
- Inevitability of an Intelligence Explosion (with intelligence defined as cross-domain optimization power) due to recursive self-improvement
- Orthogonality (intelligence/rationality is preference-independent)
- Instrumental Convergence (most optimizers compete for the same resources)
- Complexity of Value (values are not easily formalizable, no 3 laws of robotics)
if true, imply that AGI is an x-risk, because an AGI emerging in an ad hoc fashion will compete with humans and inevitably win.
The difference between 1-3 and 4 is that 1-3 are outside of human control, but there is a hope for solving 4, hence FAI research.
There are a few outs, which the report considers unlikely:
- If Intelligence Explosion is not inevitable, preventing it may avert the x-risk. If it's impossible, we are in the clear, as well.
- If all sufficiently advanced agents tend to converge on "humane" values, a la David Pearce, we have nothing to worry about
- If powerful optimizers are likely to find their own resources, they might leave us alone
Given the above, the obvious first step is formalizing each of the theses 1-3 as a step toward evaluating their validity. The report outlines potential steps toward formalizing thesis 1, Intelligence Explosion (IE):
- Step 1 is basically categorizing existing positions on IE and constructing explicit models for them, hopefully somewhat formal, checking them for self-consistency and comparing them with past precedents, where possible
- Step 2 is comparing the models formalized in Step 1 by constructing a common theory of which they would be sub-cases (I am not at all sure if this is what Eliezer means)
- Step 3 is constructing a model which is likely to contain "reality" and eventually be able to answer the question "AI go FOOM?" with some probability it is confident in.
The answer to this last question would then determine the direction of the FAI effort, if any.
This was basically the content of the first 4 chapters, as far as I can tell. (Chapter 2 is the advocacy of an outside view and chapter 3 is mostly advocacy of the hard take-off.) Chapters 5 and 6 are a mix of open questions in IE relevant to the Step 3 above, some speculations and MIRI policy arguments, as well as some musings about the scope/effort/qualifications required to tackle the problem.
comment by Shmi (shminux) · 2013-05-01T04:36:15.491Z · LW(p) · GW(p)
I'm also wondering about the estimated FOOM date of 2035 (presumably give or take a decade), is there an explicit calculation of it, and hopefully the confidence intervals as well?
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-01T16:26:29.709Z · LW(p) · GW(p)
Where does it say 2035 in the text? How did you get the impression that this was an estimation?
Replies from: shminux↑ comment by Shmi (shminux) · 2013-05-01T16:30:48.455Z · LW(p) · GW(p)
Maybe I misunderstood this passage:
(Even this kind of “I don’t know” still has to correspond to some probability distribution over decades, just not a tight distribution. I’m currently trying to sort out with Carl Shulman why my median is more like 2035 and his median is more like 2080. Neither of us thinks we can time it down to the decade—we have very broad credible intervals in both cases—but the discrepancy between our “I don’t knows” is too large to ignore.)
Replies from: Eliezer_Yudkowsky
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-01T16:40:22.906Z · LW(p) · GW(p)
Hm. Thanks for pointing that out. Maybe I should remove the specific dates from there and just say we were 45 years apart. I think in a lot of ways trying to time the intelligence explosion is a huge distraction. An important probability distribution, but still a huge distraction.
Replies from: shminux↑ comment by Shmi (shminux) · 2013-05-01T16:50:38.641Z · LW(p) · GW(p)
Well, you mentioned on occasion that this date affects your resource allocation between CFAR and MIRI, so it might be a worthwhile exercise to make the calculation explicit and subject to scrutiny, if not in the report, then some place else.
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-01T17:42:07.507Z · LW(p) · GW(p)
Fair point. We're still struggling to express things verbally, but yeah.
comment by Clippy · 2013-04-29T21:24:39.066Z · LW(p) · GW(p)
Thanks for adopting my suggestion to publish more on paperclip-production-relevant topics.
Replies from: someonewrongonthenet↑ comment by someonewrongonthenet · 2013-04-30T02:24:06.086Z · LW(p) · GW(p)
While I'm amused by your existence, the "novelty account" meme is quite virulent and has the potential to lower the signal-to-noise ratio in the comments if everyone starts doing this...
Replies from: Luke_A_Somers, Clippy, None↑ comment by Luke_A_Somers · 2013-04-30T17:58:39.287Z · LW(p) · GW(p)
There are only so many contextually appropriate novelty names available. It wouldn't make much sense for someone to swoop in and begin posting in character as any random character. So far as I know, we've got Clippy, Voldemort, Quirinus Quirrell, and... one other, maybe?
If people showed up and began noticeably posting in character as random vaguely (anti-)rationality-related characters (Spock, Kamina, Pinkie Pie), we'd have a problem. Fortunately, they don't.
Replies from: army1987↑ comment by A1987dM (army1987) · 2013-06-09T14:00:16.507Z · LW(p) · GW(p)
A few others only wrote one or two comments each.
↑ comment by [deleted] · 2013-04-30T02:48:47.324Z · LW(p) · GW(p)
That's what we said when Clippy first started being ridiculous years ago. Luckily, it confined itself to a few vanity posts, and the phenomenon in general didn't really take off.
Admittedly we're just mostly annoyed that people think we're such an account pretending to be an actual made-of-paper kind of machine, but so it goes.
Replies from: CarlShulman↑ comment by CarlShulman · 2013-04-30T04:03:13.915Z · LW(p) · GW(p)
What is the story behind the account name, then?
Replies from: None
comment by jbash · 2013-04-29T20:58:22.814Z · LW(p) · GW(p)
TL;DR.
The first four or five paragraphs were just bloviation, and I stopped there.
I know you think you can get away with it in "popular education", but if you want to be taken seriously in technical discourse, then you need to rein in the pontification.
Replies from: None, None↑ comment by [deleted] · 2013-04-30T03:27:55.226Z · LW(p) · GW(p)
if you want to be taken seriously in technical discourse, then you need to rein in the pontification.
I don't disagree, but I also don't think this is correct. There are plenty of verbose mathematicians out there who spend too much time expounding on the philosophical merits of their approach, and they're taken seriously.
You'll excuse me if I don't name names, though.
Replies from: JoshuaZ
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-29T18:25:34.863Z · LW(p) · GW(p)
For lack of any huge problems discovered so far, moving this to Main.
Replies from: gwern↑ comment by gwern · 2013-04-29T21:57:02.274Z · LW(p) · GW(p)
It's 40,000 words, you say. How fast exactly do you expect any huge problems to be found?
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-30T00:23:02.956Z · LW(p) · GW(p)
If it's a huge problem like "I can't download" or "all these pages are blank" then relatively fast.
comment by ZHD · 2013-04-30T06:56:37.347Z · LW(p) · GW(p)
Hominid brain size has not been increasing for at least the past 100,000 years. In fact, the range is tighter and the median is lower for Homo sapiens vs. Homo neanderthalensis.
Given that information, how does this change your explanation of your data?
The most important brain developments in the genus have come during the time when brain size was not increasing. This means that size cannot be an explanatory variable.
Cheers, ZHD
Replies from: CarlShulman, pcm↑ comment by CarlShulman · 2013-04-30T08:17:59.033Z · LW(p) · GW(p)
Footnote 44 discusses Neanderthals having larger brains, so it's not new data.
Replies from: ZHD↑ comment by ZHD · 2013-04-30T08:25:06.515Z · LW(p) · GW(p)
Thank you Carl. I am having some difficulty navigating to that discussion. Can you provide a direct link?
Replies from: CarlShulman↑ comment by CarlShulman · 2013-04-30T08:27:08.451Z · LW(p) · GW(p)
It's the link at the top of the OP. Look on page 38 of the document (page numbers are at the bottom) to find footnote 44.
Replies from: ZHD↑ comment by ZHD · 2013-04-30T09:00:48.127Z · LW(p) · GW(p)
Thanks for the help!
So this is the footnote:
- Neanderthals may have had larger brains than modern humans (Ponce de León et al. 2008) and it is an open question how much Neanderthals interbred with the ancestors of modern humans. It is possible that the marginal fitness returns on cognition have leveled off sharply enough that improvements in cognitive efficiency have shifted the total resource cost of brains downward rather than upward over very recent history. If true, this is not the same as Homo sapiens sapiens becoming stupider or even staying the same intelligence. But it does imply that either marginal fitness returns on cognition or marginal cognitive returns on brain scaling have leveled off significantly compared to earlier evolutionary history
That appears to be circular reasoning. It only implies that "marginal fitness return on cognition" has leveled off if we define fitness as a function of brain size—we have no fitness measurement otherwise.
My previous suggestion, that the most important brain developments in our genus are independent of brain size, needs an explanation with a much different anchor.
comment by Paul Crowley (ciphergoth) · 2013-05-04T09:14:12.164Z · LW(p) · GW(p)
The whole counter-, counter-counter- thing is very difficult to follow. I've seen both you and Dennett use conversations between imagined participants to present such arguments, which I find vastly more readable.
comment by Strilanc · 2013-04-30T15:52:40.934Z · LW(p) · GW(p)
I'm going to nitpick on Section 3.8:
If there are several “hard steps” in the evolution of intelligence, then planets on which intelligent life does evolve should expect to see the hard steps spaced about equally across their history, regardless of each step’s relative difficulty. [...]
[...] [...]
[...] the time interval from Australopithecus to Homo sapiens is too short to be a plausible hard step.
I don't think this argument is valid. Assuming there's a last hard step, you'd expect intelligence to show up soon after it was made (because there are no more hard steps).
In terms of the analogy to lock picking, it's inappropriate to set the clock to timeout now. The clock should timeout when it's too late for intelligence to succeed (e.g. when the sun has aged too much and is evaporating the oceans away).
Also, I tested whether the claim about the even spacing was correct. The locks do appear to have the same distribution when you condition on being below the time limit. However, the picking times don't appear to be evenly spaced particularly often. With a timeout of 1000 and clocks with pick-chances of 10^-3 through 10^-8, I get average standard deviations between picking times of ~300, which I think implies that the most common situation is for a lock or two to be picked quickly, with the other locks consuming all the slack.
Edit: Now when I read the sentence I see something slightly different. Did you mean "the undertaking of the step should take a long time" or that "the amount of time since the last step was made should be a long time"?
comment by ThrustVectoring · 2013-04-29T18:52:27.277Z · LW(p) · GW(p)
I may have found a minor problem on page 50:
Better algorithms could decrease the serial burden of a thought and allow more thoughts to occur in serial rather than parallel
Shouldn't that be "allow more thoughts to occur in parallel rather than serial"? Turning a thought from multiple parallel sub-tasks to one serial task increases the serial burden of that thought, rather than decreasing it.
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-04-29T20:22:44.522Z · LW(p) · GW(p)
The idea here is that you use parallelism to implement operations like caching which can decrease the number of serial steps required for a thought, so that more of them can occur one after another. In the simplest case, if you were already using a serial processor to emulate parallel computers, adding parallel power increases serial depth because you need no longer burn serialism to emulate parallelism.
Replies from: ThrustVectoring↑ comment by ThrustVectoring · 2013-04-29T21:07:33.694Z · LW(p) · GW(p)
Oh, so diverting serial processing cycles to get serial depth instead of getting half the depth over two independent tasks. I thought the sentence was saying something else entirely: that a better algorithm does the same thing except with higher serial depth over fewer processes.
comment by Mitchell_Porter · 2013-05-03T07:56:44.145Z · LW(p) · GW(p)
Suppose I already believe that, because of computer science, neuroscience, etc, there will in the future be agents or coalitions of agents capable of outwitting human beings for control of the world, and that we can hope to shape the character of these future agents by working on topics like friendly AI. If I already believe that, do I need to read this?
Replies from: Manfred
comment by leplen · 2013-06-10T23:08:37.866Z · LW(p) · GW(p)
I have some questions about the math in the first couple pages, specifically the introduction of k. I'm not totally sure I follow exactly what's going on.
So, my assumption is that we're trying to model AI capacity as a function of investment, and I assume that we're modeling this as the integral of an exponential function of base k such that
C(i) = \int k^i \, di = \frac{k^i}{\log(k)}
with k held constant. The integral is necessary, I believe, to ensure that the derivative of C is positive in both the k<1 and k>1 scenarios. This, I believe, matches the example of the nuclear chain reaction. I note here that C as I've defined it is only a function of investment and tells us nothing about time or any other variable. I think it's also true that we've defined C as an exponential because we're assuming that the AI is reinvesting its returns. This seems to conflict with the linear relationship between investment and returns mentioned in the Chalmers quote
"The key issue is the “proportionality thesis” saying that among systems of certain class, an increase of δ in intelligence will yield an increase of δ in the intelligence of systems that these systems can design."
although perhaps those deltas are not intended to be quantitative and equal.
But even then, I'm a little uncertain that my relation is correct. It is not clear to me that the sequence of logarithms obtained in the k<1 case is a result of this function. Specifically, I thought the notion of reinvestment was the motivation for choosing an exponential/logarithmic function to start with, and so I'm not clear on why reinvestment suddenly changes the behavior to that of nested logarithms. Is the logarithmic nature of our return being double counted?
I was also confused by the statement
Over the last many decades, world economic growth has been roughly exponential—growth has neither collapsed below exponential nor exploded above, implying a metaphorical k roughly equal to 1
But from my model, which I think is the correct one, this isn't true. I feel like I understand the math from the nuclear chain reaction, but I have

C(i) = \int 1^i \, di = i \quad \text{when } k = 1,

so that k=1 implies not exponential growth, but linear growth. Even worse, no value of k in my model is capable of making growth "explode above" the exponential. I agree with the assessment that k has been slightly on the positive side, which gave me some hope I still have the correct model, but then I got really discouraged by the fact that k for money is on the order of 1.02 while k for the neutrons in the nuclear pile was 1.006. The implication from k values alone is that my bank account is somehow more explosive than a large pile of uranium. Unfortunately this is not true, and so it seems like my model needs to account not only for C as a function of i, but C as a function of time as well.
This issue really comes into play with the prompt-critical AI. One of the ways a prompt-critical AI is deemed capable of growing exponentially smarter is by stealing access to more hardware. Having this as an option challenges either the definition of investment or seriously challenges the notion of constant k. Even in the limit that solving AI problems is exponentially hard (k<1), what does that matter if stealing hardware can yield an effective k>1 coupled to a short generation time?
I'm really terrible at LW formatting/writing in tiny comment boxes, so if I screwed this up to the point of being confusing let me know.
comment by Manfred · 2013-05-01T03:42:44.814Z · LW(p) · GW(p)
The lock problem from 3.8: Suppose there were 2 locks, one with a uniform solving distribution 5 hours long, and one with a uniform solving distribution 10 hours long. Now suppose we make a new probability distribution where first we solve lock one, then lock two, in times X and Y. The probability is now (up to the time limits) (X/5)*(Y/10) = XY/50. Hey look, symmetry!
Now suppose we condition on the total time being 1 hour. So X+Y=1. But there's still symmetry between X and Y. So yeah.
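The symmetry claim can be checked by simulation (names are my own; conditioning on X + Y < 1 rather than exactly = 1 avoids a measure-zero event while preserving the argument, and rejection sampling is fine here since about 1% of draws land under the one-hour limit):

```python
import random

def conditioned_samples(n, limit=1.0, seed=0):
    """Draw X ~ U(0, 5) and Y ~ U(0, 10) for the easy and hard locks,
    keeping only trials where both are picked within `limit` hours."""
    rng = random.Random(seed)
    xs, ys = [], []
    while len(xs) < n:
        x, y = rng.uniform(0, 5), rng.uniform(0, 10)
        if x + y < limit:
            xs.append(x)
            ys.append(y)
    return xs, ys

xs, ys = conditioned_samples(20000)
# Conditioned on the deadline, the two locks take the same time on
# average (both means come out near 1/3 hour), despite one lock being
# twice as hard unconditionally: the symmetry described above.
print(round(sum(xs) / len(xs), 2), round(sum(ys) / len(ys), 2))
```

This is the hard-step logic in miniature: once you condition on success before the time limit, the relative difficulty of the steps washes out of the observed timings.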
comment by CronoDAS · 2013-07-15T04:10:26.851Z · LW(p) · GW(p)
When I read this segment, I was compelled to comment:
A key component of the debate between Robin Hanson and myself was the question of locality. Consider: If there are increasing returns on knowledge given constant human brains—this being the main assumption that many non-intelligence-explosion, general technological-hypergrowth models rely on, with said assumption seemingly well-supported by exponential technology-driven productivity growth—then why isn’t the leading human nation vastly ahead of the runner-up economy? Shouldn’t the economy with the most knowledge be rising further and further ahead of its next-leading competitor, as its increasing returns compound?
The obvious answer is that knowledge is not contained within the borders of one country; improvements within one country soon make their way across borders. China is experiencing greater growth per annum than Australia, on the order of 8% versus 3% RGDP growth.[92] This is not because technology development in general has diminishing marginal returns. It is because China is experiencing very fast knowledge-driven growth as it catches up to already-produced knowledge that it can cheaply import.
There actually are historical examples of this happening between civilizations that had relatively little information transfer between them. Pre-Columbian America was far behind Europe, as was sub-Saharan Africa. China's economic and technological development was also comparatively isolated from Europe's until recent times; there were periods when it was more advanced and periods when it was less advanced.
comment by itaibn0 · 2013-04-30T22:27:23.513Z · LW(p) · GW(p)
One point you don't address: while you justify the claim that intelligence is a real thing and can be compared, you don't explain why it would be measurable on a numerical scale. In particular, I don't see what "linear increase in intelligence" and "exponential increase in intelligence" mean, or how they can be compared.
Stylistically, I agree with many of the other comments and I think this paper is unsuitable for academic publication. You should keep out discussion of side issues like speculation on the bottlenecks in academic research, how MIRI plans to deal with the potential intelligence explosion, and general discussions on how to reason, and focus just on the arguments for the existence and on the nature of an intelligence explosion.
comment by pcm · 2013-05-11T19:10:38.313Z · LW(p) · GW(p)
China is experiencing very fast knowledge-driven growth as it catches up to already-produced knowledge that it can cheaply import.
To the extent that AIs other than the most advanced project can generate self-improvements at all, they generate modifications of idiosyncratic code that can’t be cheaply shared with any other AIs.
I say it's at least as expensive for China to import knowledge. A fair amount is trade secrets that are more carefully guarded than AI content. China copies on the order of $1 trillion in value. What's the value of uncopied AI content?
We don’t invest in larger human brains because that’s impossible with current technology
No, we have technology for that (selective breeding, maybe genetic engineering). The return on investment is terrible. In an em dominated world, the technology for building larger minds (and better designed minds) may still be poor compared with the technology for copying. How much will that change with AGI? I expect people to disagree due to differing intuitions about how AGI will work.
comment by [deleted] · 2013-04-30T03:34:25.131Z · LW(p) · GW(p)
There are some places in the text that appear to be originally hyperlinked, but whose hyperlinks are not present in the .pdf. For example, footnote 21.
In general, the paper needs a technical editor.
EDIT: The lack of hyperlinks is clearly something on my end. I apologize for jumping to conclusions.
Replies from: RomeoStevens, lukeprog↑ comment by RomeoStevens · 2013-04-30T05:46:57.534Z · LW(p) · GW(p)
Tangent: Why isn't editing of academic papers done on github or another revision control platform?
Replies from: malo, None↑ comment by Malo (malo) · 2013-05-01T20:49:56.447Z · LW(p) · GW(p)
MIRI uses Git to track edits for all documents it publishes with its official template.
↑ comment by [deleted] · 2013-04-30T06:45:18.492Z · LW(p) · GW(p)
Do you mean editing done by the publishers? I don't know anything about that domain.
As far as the writing of academic papers goes, I know a few groups that maintain a CVS, but some portion of mathematicians wouldn't be technical enough to run one. Of the examples I know, two groups only use a CVS because their PI told them to, over much groaning.
Replies from: RomeoStevens↑ comment by RomeoStevens · 2013-04-30T08:38:31.595Z · LW(p) · GW(p)
I was just struck how this is a perfect example where the efficient flow would go 'see problem->fix problem->submit pull request', rather than having to post here and hoping someone sees it and acts on it.
It occurs to me that the person who makes revision control easy in the same way that social websites made having your own website easy will also make a billion dollars. Dropbox is both a good example of success and a good example of how much more could be done (trying to use dropbox for proper revision control is something I've seen attempted and it is not pretty).
Replies from: David_Gerard, None↑ comment by David_Gerard · 2013-04-30T09:57:01.599Z · LW(p) · GW(p)
It occurs to me that the person who makes revision control easy in the same way that social websites made having your own website easy will also make a billion dollars.
Hell yeah. I have successfully explained to a bunch of not-very-technical managers why they would want version control: "Have you ever spent three days editing the wrong version of a document?" Wide eyes, all slowly nod. Once they understood what it was for, they would have crawled across broken glass for version control. (And since we were using ClearCase, they did pretty much that!)
"Track changes" in Word solves a little of the problem from the other end.
↑ comment by [deleted] · 2013-04-30T08:46:24.562Z · LW(p) · GW(p)
I would anticipate that a github-like approach to editing would get you decent coverage of "local" editing issues (e.g., this hyperlink business) while not obtaining decent coverage of "global" editing issues (e.g., the use of "counter^3-argument", "counter^4-argument" in some places and "counter-counter-counter-argument" in another).
A lot of this just boils down to having an acceptable style guide that an editor can enforce without worrying too much about taking every issue to the author for approval.
Replies from: ZHD↑ comment by ZHD · 2013-04-30T09:11:21.320Z · LW(p) · GW(p)
I like the way you're approaching the problem. However, I think the temptation for a familiar conclusion is too great and that you might be missing some possibilities.
See:
A lot of this just boils down to having an acceptable style guide that an editor can enforce without worrying >too much about taking every issue to the author for approval.
The solution you're putting forth suggests that there needs to be a single person in charge of coalescing the many suggestions and edits.
But the great thing about version control is the ability to branch and tag. There could be an arbitrary number of editors who each have their own branch and set of improvements that they are working on—where non-editor contributors could switch branches and commit changes specific to that branch's needs.
In the end, all branches would need to merge into the trunk. This process doesn't necessarily need a single editor either.
Cheers
Replies from: None↑ comment by [deleted] · 2013-04-30T14:54:17.261Z · LW(p) · GW(p)
One person's "familiar conclusion" is another's "best practices", I suppose.
The solution you're putting forth suggests that there needs to be a single person in charge of coalescing the many suggestions and edits.
Not really. Many suggestions and edits put forth by random people, e.g., here, aren't edits that I think an editor should really make. Nor do I really think a single person is necessary; again, a well-defined style guide would go a long way.
I understand how CVSes work, and I have no problem with collaborative editing. But papers are not coding projects. There are a lot of global things going on that need to happen correctly. Even open source projects tend to have lead developers, no?
↑ comment by lukeprog · 2013-04-30T16:37:22.033Z · LW(p) · GW(p)
The hyperlink in footnote 21 works for me. It goes here. What happens when you click on "this online post" in footnote 21?
We did use a couple editors on the paper, like we do with all our papers.
Replies from: None↑ comment by [deleted] · 2013-04-30T20:29:15.649Z · LW(p) · GW(p)
What happens when you click on "this online post" in footnote 21?
Nothing. Adobe Reader 11.0.2 on Windows 7.
We did use a couple editors on the paper, like we do with all our papers.
Yeah, I saw the percent signs were interpreted correctly. It's a work in progress.
Replies from: malo↑ comment by Malo (malo) · 2013-05-01T21:19:16.095Z · LW(p) · GW(p)
MIRI's LaTeX document template uses the \href command to hyperlink text and styles links (both internal and external) using the pdfborderstyle specification from Adobe. We aren't doing anything unusual.
Links are working (and styled) for me in OS X Preview and Adobe Reader 10.1.6, on OS X 10.8.3. They even work in Chrome's PDF viewer, which currently doesn't support pdfborderstyle, i.e., the text is linked even though there is no underline or box to indicate that it is.
I suspect something fishy is going on with your Reader install...
Also, to clarify Luke's comments, we have a dedicated technical editor (who I have been very impressed with so far), and the papers are reviewed by a couple other people (once they have been typeset) before they are published. I'd be interested to hear about (possibly more appropriate through PM or email) other things in this document that made you think we didn't have a technical editor.
EDIT: I should clarify that the editing and proofreading I'm talking about is done once the content has been finalized. See a definition of technical editing here.
Replies from: None↑ comment by [deleted] · 2013-05-01T23:36:00.197Z · LW(p) · GW(p)
PM sent.
EDIT: I'm no longer sure that sending all of that over a PM (which I unwisely forgot to retain) was such a great idea. Your edit makes it sound like my objections weren't really under the aegis of "technical editing", but I don't recall objecting to anything that doesn't fall under that objection. Anyone who doubts my sincerity, please feel free to PM me.
Replies from: malo↑ comment by Malo (malo) · 2013-05-02T18:09:04.926Z · LW(p) · GW(p)
Sorry, I didn't mean to imply that at all. I just reread my message and it occurred to me that it might not be clear to everyone what technical editing was.
Your PM was indeed about technical edits.
BTW you can see all PMs you sent by visiting, http://lesswrong.com/message/sent/
Replies from: wedrifid↑ comment by wedrifid · 2013-05-03T03:56:22.428Z · LW(p) · GW(p)
BTW you can see all PMs you sent by visiting, http://lesswrong.com/message/sent/
Ohh! Thanks. I hadn't noticed that feature.
comment by timtyler · 2013-05-02T01:18:31.776Z · LW(p) · GW(p)
My response is on my own blog: Response to Intelligence Explosion Microeconomics.
Replies from: None, shminux↑ comment by [deleted] · 2013-05-02T01:44:27.371Z · LW(p) · GW(p)
The problem here is that Yudkowsky is ignoring cultural evolution.
I think it's obvious that EY's model ignores all sorts of things -- the question is whether or not these things are worth not ignoring.
The process that is responsible for Moore's law involves human engineers, but it also involves human culture, machines and software.
So are you arguing for some superexponential growth from cultural evolution changing this process, or what? It's completely unclear why this matters.
The human engineer's DNA may have stayed unchanged over the last century, but their cultural software has improved dramatically over that same period - resulting in the Flynn effect.
That may be your explanation for the Flynn effect, but I think it's safer to remain on the fence. There are too many other possible causal mechanisms at play to blame it on cultural evolution.
Only by considering how this phenomenon is rooted in the present day, can it be properly understood.
Show me a modification to one of the basic models that follows from this statement and changes the consequence of the argument.
Replies from: Eliezer_Yudkowsky, timtyler↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-02T22:05:15.753Z · LW(p) · GW(p)
An easy basic test of whether humans are currently the limiting factor in a process is to ask whether the labs run all night, with researchers sometimes standing idle until the results come in; a lab that runs 9-5 can be sped up by at least a factor of 3 if the individual researchers don't have to eat, sleep or go to the bathroom.
Replies from: EHeller, timtyler↑ comment by EHeller · 2013-05-03T00:14:08.382Z · LW(p) · GW(p)
An easy basic test of whether humans are currently the limiting factor in a process is to ask whether the labs run all night, with researchers sometimes standing idle until the results come in
It is my experience that many labs do in fact run all night, with researchers taking shifts baby sitting equipment and watching data roll in.
Replies from: Eliezer_Yudkowsky, SilasBarta↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-03T02:23:42.518Z · LW(p) · GW(p)
Well, those are the labs that don't have a blindingly obvious route to speedups just by speeding up the researchers, though de facto I'd expect it to work anyway up to a point.
Replies from: RolfAndreassen↑ comment by RolfAndreassen · 2013-05-03T16:18:28.895Z · LW(p) · GW(p)
When I wrote my thesis, a major limiting factor was the speed of the computers doing the analysis; I would start the latest variant of my program in the afternoon, and come back next morning to see what it reported. I'm currently working on software to take advantage of massively-parallel processors to speed up this process by a couple of orders of magnitude.
Replies from: Eliezer_Yudkowsky↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-04T01:31:42.103Z · LW(p) · GW(p)
Next time, try shifting processing resources from your brain to the analytic computers until neither is waiting on the other!
Replies from: itaibn0↑ comment by itaibn0 · 2013-05-04T17:57:02.902Z · LW(p) · GW(p)
Ahem:

I'm currently working on software to take advantage of massively-parallel processors to speed up this process by a couple of orders of magnitude.

Replies from: Eliezer_Yudkowsky
↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2013-05-04T18:52:13.905Z · LW(p) · GW(p)
But then his brain will be too slow.
Replies from: CCC↑ comment by CCC · 2013-05-04T19:43:55.161Z · LW(p) · GW(p)
The difficulty of brain-computer interfaces is that the brain does not appear to work with any known executable format, making running anything on it something of a hit-and-miss affair.
Of course, he could solve this by simply increasing the precision of his computer calculations until it's the right speed for his brain...
↑ comment by SilasBarta · 2013-05-03T02:28:34.576Z · LW(p) · GW(p)
Someone baby-sitting equipment is a technician, not a researcher, properly understood.
Replies from: EHeller, RolfAndreassen↑ comment by EHeller · 2013-05-03T19:15:25.482Z · LW(p) · GW(p)
Not every part of research is glamorous; there is a lot of routine labor to do, and most of the time it's the researchers (grad students or postdocs) doing it. In the first lab I ever worked in, we spent about 3 months designing and building the experiment and almost a year straight of round-the-clock data collection. I suppose you could say we temporarily stopped being researchers and became technicians, but that seems a bit odd. During one of my postdocs, a good 60% of my job was sysadmin-type work to keep a cluster running while waiting for code to run. My point is that the rate-limiting step in a lot of research is that experiments take time to perform and code takes time to run. Most labs have experiments/code running round the clock.
I guess if you want to differentiate technician work from researcher work, you could do something non-standard and say that every postdoc/grad student in a lab is 30% sales (after all, begging for money isn't being a researcher, properly understood), 60% technician, 10% researcher.
Replies from: SilasBarta↑ comment by SilasBarta · 2013-05-04T05:44:01.166Z · LW(p) · GW(p)
The cluster of thingspace you're referring to can properly be called researchers (probably).
Just the same, if that were how the term were typically used -- for cases where the deep theoretical, high-inferential-distance understanding is vital for core job functions -- I would not feel the need to raise the point I did.
Rather, it's because people tend to inflate their own job descriptions, and my frequent observation of anyone working in lab-like environments being classified as a "researcher" or "doing research", regardless of how small the intellectual component of their contribution is, that I feel the need to point out the possible mis-labeling.
(A high-profile example of this mistake is Freeman Dyson's criticism of climate scientists for being too lazy to do the hard work of collecting data in extreme conditions, which is itself not the scientific component of the work. Start from: "It is much easier for a scientist to sit in an air-conditioned building...")
↑ comment by RolfAndreassen · 2013-05-03T16:16:36.996Z · LW(p) · GW(p)
"Baby-sitting equipment" is rather a condescending description of what a shift-taker at a particle physics experiment does. This being said, it must be admitted that the cheapness of grad-student labour is a factor in the staffing decisions, here.
↑ comment by timtyler · 2013-05-02T23:51:02.125Z · LW(p) · GW(p)
An easy basic test of whether humans are currently the limiting factor in a process is to ask whether the labs run all night, with researchers sometimes standing idle until the results come in [...]
That "incorrectly" bundles culture in with the human engineers.
To separate the improving components (culture, machines, software) from the relatively static ones (systems based on human DNA) you would have to look at the demand for uncultured human beings. There are a few natural experiments in this area, in the form of feral children. Demand for these types of individuals rarely appears to be a limiting factor in most enterprises. It is clear that progress is driven by systems that are themselves progressing and improving.
As for computers - they may not typically be on the critical path as often as humans, but that doesn't mean that their contributions to progress are small. What it does mean is that they have immense serial speed. That is partly because we engineered them to compensate for our weaknesses.
If you know about computers operating with high serial speed, then observing that computers are waiting around for humans more than humans are waiting around for computers tells you next to nothing about their relative contributions to making progress. This proposed test is too "basic" to be of much use.
Other ways of comparing the roles of men and machines involve looking at their cost and/or their weight. However you look at it, the influence of the tech domain today on progress is hard to ignore. If someone were to take Intel's tools away from its human employees, its contributions to making progress would immediately halt.
↑ comment by timtyler · 2013-05-02T09:47:54.211Z · LW(p) · GW(p)
The process that is responsible for Moore's law involves human engineers, but it also involves human culture, machines and software.
So are you arguing for some superexponential growth from cultural evolution changing this process, or what? It's completely unclear why this matters.
The position I'm arguing against is:
if our old extrapolation was for Moore’s Law to follow such-and-such curve given human engineers, then faster engineers should break upward from that extrapolation.
This treats human engineers as a fixed quantity. However the process that actually produces Moore's law involves human engineers, human culture, machines and software. Only the former are relatively unchanging. Culture, machines and software are all improving dramatically as time passes - and they are absolutely the reason why Moore's law can keep up the pace. Yudkowsky has a long history of not properly understanding this process - and it hinders his analysis.
The human engineer's DNA may have stayed unchanged over the last century, but their cultural software has improved dramatically over that same period - resulting in the Flynn effect.
That may be your explanation for the Flynn effect, but I think it's safer to remain on the fence. There are too many other possible causal mechanisms at play to blame it on cultural evolution.
All of the proposed explanations of the Flynn effect can be expressed in terms of cultural evolution, except perhaps for heterosis, which is rather obviously incapable of explaining the observed effect.
Only by considering how this phenomenon is rooted in the present day, can it be properly understood.
Show me a modification to one of the basic models that follows from this statement and changes the consequence of the argument.
That seems like a vague and expensive-sounding order. How would seeing "a modification to one of the basic models that follows from this statement and changes the consequence of the argument" add to the discussion?
Replies from: None↑ comment by [deleted] · 2013-05-02T10:27:14.342Z · LW(p) · GW(p)
This treats human engineers as a fixed quantity. However the process that actually produces Moore's law involves human engineers, human culture, machines and software. Only the former are relatively unchanging. Culture, machines and software are all improving dramatically as time passes - and they are absolutely the reason why Moore's law can keep up the pace.
So then Moore's law should be faster than Yudkowsky's analysis predicts, because of cultural evolution? I still have no idea what you're trying to argue.
Yudkowsky has a long history of not properly understanding this process - and it hinders his analysis.
How does it hinder his analysis? Please give me something concrete to work with. For example, when a mathematician says "Only by looking at the cohomology groups of a space can we properly understand the topology of its holes," it means that under any weaker theory (e.g., looking only at the Euler characteristic -- see Lakatos' Proofs and Refutations) one quickly runs into problems (e.g., a torus has the same Euler characteristic as a Mobius strip, but the cohomology is much different).
All of the proposed explanations of the Flynn effect can be expressed in cultural evolution
Granted. I still don't think you could cause the Flynn effect by inducing cultural evolution (whatever that means). The reactionaries would have a field day regaling you with tales of Ethiopia and decolonization.
Only by considering how this phenomenon is rooted in the present day, can it be properly understood.
Show me a modification to one of the basic models that follows from this statement and changes the consequence of the argument.
That seems like an expensive-sounding order.
Should be as simple as modifying a few terms and solving a differential equation, or perhaps a system of them. Doing such things is why humans invented computers. More importantly, it would be an actionable contribution to the study.
How would seeing "a modification to one of the basic models that follows from this statement and changes the consequence of the argument" add to the discussion?
It'd be the rent for believing cultural evolution is significantly relevant to the model.
Replies from: EHeller, timtyler↑ comment by EHeller · 2013-05-02T14:06:14.982Z · LW(p) · GW(p)
So then Moore's law should be faster than Yudkowsky's analysis predicts, because of cultural evolution? I still have no idea what you're trying to argue.
It seems to me that timtyler's point is that Yudkowsky is wrong to claim that the current Moore's law was extrapolated from fixed-speed engineers. Engineers were ALREADY using computers to enhance their productivity, and timtyler suggests that cultural factors also increase the engineers' speed. The cycle of build faster computer -> increase engineering productivity -> build even faster computer -> increase engineering productivity even more, etc., was already cooked into the extrapolation, so there is no reason to assume we'll break above it.
Replies from: timtyler↑ comment by timtyler · 2013-05-02T23:39:53.982Z · LW(p) · GW(p)
That's a correct summary. See also my: Self-Improving Systems Are Here Already.
Using computers and culture to enhance productivity is often known as intelligence augmentation. It's an important phenomenon.
↑ comment by timtyler · 2013-05-02T10:48:06.629Z · LW(p) · GW(p)
All of the proposed explanations of the Flynn effect can be expressed in cultural evolution
Granted. I still don't think you could cause the Flynn effect by inducing cultural evolution (whatever that means). The reactionaries would have a field day regaling you with tales of Ethiopia and decolonization.
Modern cultural evolution is, on average, progressive. Fundamentally, that's because evolution is a giant optimization process operating in a relatively benign environment. The Flynn effect is one part of that.
It'd be the rent for believing cultural evolution is significantly relevant to the model.
Machine intelligence will be a product of human culture. The process of building machine intelligence is cultural evolution in action. In the future, we will make a society of machines that will share cultural information to recapitulate the evolution of human society. That's what memetic algorithms are all about.
Replies from: None↑ comment by [deleted] · 2013-05-02T10:56:10.611Z · LW(p) · GW(p)
There is an ocean between us. I keep asking for specifics, and you keep giving generalities.
I give up. There was an interesting idea somewhere in here, but it was lost in too many magical categories.
Replies from: timtyler↑ comment by Shmi (shminux) · 2013-05-02T01:48:27.768Z · LW(p) · GW(p)
I tried reading your response, but white text on a black background hurts my eyes, so I had to stop. Don't your readers complain about it?
Replies from: timtyler↑ comment by timtyler · 2013-05-02T09:32:12.648Z · LW(p) · GW(p)
Surely it is black text on a white background that is more likely to actually contribute to retinal damage.
Replies from: khafra↑ comment by khafra · 2013-05-02T16:02:38.405Z · LW(p) · GW(p)
This is either a parody of a common nerdly failure mode, or it's an honest example of a common nerdly failure mode; too close for me to call.
Replies from: DaFranker, shminux, timtyler↑ comment by DaFranker · 2013-05-03T17:43:31.302Z · LW(p) · GW(p)
From my previous reading, it seems a non-negligible proportion of the population falls on each side: some can read much more easily with light-on-dark, some with dark-on-light, and each has trouble with the reverse.
Personally, if the room I'm in is very brightly lit I tend to prefer dark-on-light, but under most normal or dim lighting conditions (like in my apartment) I prefer light-on-dark. In both cases the preference comes down to eye strain during prolonged reading and to how quickly I can find words when "seeking" (i.e. finding the spot where I paused reading, a specific thing in a piece of code, or something similar).
(Also, an anecdotal quip re the above link: that thing about the refresh rate isn't just random for me. If the refresh rate of a traditional monitor drops anywhere below 50 Hz, I will reliably get a harsh migraine within an hour; if I'm also reading dark-on-light text on it, that drops to within ten minutes. LCD/LED displays tend to be less punishing, and I've never had this problem with them even as low as 30 Hz.)
↑ comment by Shmi (shminux) · 2013-05-02T18:10:10.365Z · LW(p) · GW(p)
My guess is that it's the latter.
↑ comment by timtyler · 2013-05-03T23:46:39.050Z · LW(p) · GW(p)
Failure mode?!? Look, if you like this sort of discussion, I propose we continue it somewhere it's on-topic.