Assessing Kurzweil: the results

stuart_armstrong

post by Stuart_Armstrong · 2013-01-16T16:51:50.511Z · LW · GW · Legacy · 64 comments

Predictions of the future rely, to a much greater extent than in most fields, on the personal judgement of the expert making them. Just one problem - personal expert judgement generally sucks, especially when the experts don't receive immediate feedback on their hits and misses. Formal models perform better than experts, but when talking about unprecedented future events such as nanotechnology or AI, the choice of the model is also dependent on expert judgement.

Ray Kurzweil has a model of technological intelligence development where, broadly speaking, evolution, pre-computer technological development, post-computer technological development and future AIs all fit into the same exponential increase. When assessing the validity of that model, we could look at Kurzweil's credentials, and maybe compare them with those of his critics - but Kurzweil has given us something even better than credentials, and that's a track record. In various books, he's made predictions about what would happen in 2009, and we're now in a position to judge their accuracy. I haven't been satisfied by the various accuracy ratings I've found online, so I decided to do my own assessments.

I first selected ten of Kurzweil's predictions at random, and gave my own estimation of their accuracy. I found that five were to some extent true, four were to some extent false, and one was unclassifiable.

But of course, relying on a single assessor is unreliable, especially when some of the judgements are subjective. So I started a call for volunteers to get assessors. Meanwhile Malo Bourgon set up a separate assessment on Youtopia, harnessing the awesome power of altruists chasing after points.

The results are now in, and they are fascinating. They are...

Oops, you thought you'd get the results right away? No, before that, as on Oscar night, I first want to thank assessors William Naaktgeboren, Eric Herboso, Michael Dickens, Ben Sterrett, Mao Shan, quinox, Olivia Schaefer, David Sønstebø and one who wishes to remain anonymous. I also want to thank Malo, and Ethan Dickinson and all the other volunteers from Youtopia (if you're one of these, and want to be thanked by name, let me know and I'll add you).

It was difficult deciding on the MVP - no actually it wasn't, that title and many thanks go to Olivia Schaefer, who decided to assess every single one of Kurzweil's predictions, because that's just the sort of gal that she is.

The exact details of the methodology, and the raw data, can be accessed through here. But in summary, volunteers were asked to assess the 172 predictions (from "The Age of Spiritual Machines") on a five-point scale: 1=True, 2=Weakly True, 3=Cannot decide, 4=Weakly False, 5=False. If we total up all the assessments made by my direct volunteers, we have:

As can be seen, most assessments were rather emphatic: fully 59% were either clearly true or false. Overall, 46% of the assessments were false or weakly false, and 42% were true or weakly true.

But what happens if, instead of averaging across all assessments (which allows assessors who have worked on a lot of predictions to dominate) we instead average across the nine assessors? Reassuringly, this makes very little difference:

What about the Youtopia volunteers? Well, they have a decidedly different picture of Kurzweil's accuracy:

This gives a combined true score of 30%, and combined false score of 57%! If my own personal assessment was the most positive towards Kurzweil's predictions, then Youtopia's was the most negative.
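The difference between totalling up all assessments and averaging across assessors can be made concrete with a small sketch. The ratings below are invented for illustration and are not the study's data; the function names (`shares`, `pooled_shares`, `per_assessor_shares`) are likewise hypothetical. Only the five-point scale is taken from the post.

```python
from collections import Counter

# Scale from the post: 1=True, 2=Weakly True, 3=Cannot decide,
# 4=Weakly False, 5=False.
# HYPOTHETICAL ratings, made up for illustration (not the real data).
ratings_by_assessor = {
    "assessor_a": [1, 1, 5, 5, 4, 3, 1, 5],  # a prolific assessor
    "assessor_b": [2, 5],
    "assessor_c": [1, 4, 5],
}

def shares(ratings):
    """Return (true_share, undecided_share, false_share) for one list of ratings."""
    counts = Counter(ratings)
    n = len(ratings)
    return (
        (counts[1] + counts[2]) / n,  # true or weakly true
        counts[3] / n,                # cannot decide
        (counts[4] + counts[5]) / n,  # weakly false or false
    )

def pooled_shares(by_assessor):
    """Pool every assessment, so prolific assessors carry more weight."""
    all_ratings = [r for ratings in by_assessor.values() for r in ratings]
    return shares(all_ratings)

def per_assessor_shares(by_assessor):
    """Average the per-assessor shares, so each assessor counts equally."""
    per = [shares(ratings) for ratings in by_assessor.values()]
    return tuple(sum(s[i] for s in per) / len(per) for i in range(3))

pooled = pooled_shares(ratings_by_assessor)
averaged = per_assessor_shares(ratings_by_assessor)
print("pooled (true, undecided, false):      ", pooled)
print("per-assessor (true, undecided, false):", averaged)
```

If the two tuples come out close, as the post reports for the real data, no single prolific assessor is driving the totals.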

Putting this all together, Kurzweil certainly can't claim an accuracy above 50% - a far cry from his own self-assessment of either 102 out of 108 or 127 out of 147 correct (with caveats that "even the predictions that were considered 'wrong' in this report were not all wrong"). And consistently, slightly more than 10% of his predictions are judged "impossible to decide".

As I've said before, these were not binary yes/no predictions - even a true rate of 30% is much higher than chance. So Kurzweil remains an acceptable prognosticator, with very poor self-assessment.

64 comments

comment by gwern · 2020-03-04T02:33:27.053Z · LW(p) · GW(p)

As another decade has passed, his 1999 predictions for 2019 could be graded too: https://en.wikipedia.org/wiki/Predictions_made_by_Ray_Kurzweil#2019

Skimming them, there are a lot of total misses (e.g. computer chips being carbon nanotubes - not even on the horizon as of 2020, silicon is king), but there's also a fair number of ones which came true only just recently. For example, a lot of the software ones, like speech recognition/synthesis/transcription and being able to wear glasses with real-time captioning and using machine translation apps in ordinary conversation, really only came true within the last few years, and I don't think this would have been predicted in 1999 based purely on trend extrapolation from n-gram machine translation or HMM voice recognition. Likewise, even allowing for "genetic algorithms" being wrong, "Massively parallel neural nets and genetic algorithms are in wide use." would have been dismissed. I thought while grading his 2010 predictions back then that Kurzweil's errors seemed to be overestimating how much hardware and biology would change, while his software-related projections were generally much better, and that seems even truer of his 2019 predictions.

Replies from: habryka4, Stuart_Armstrong

↑ comment by habryka (habryka4) · 2020-03-04T06:46:34.263Z · LW(p) · GW(p)

Man, I really like the fact that LessWrong has now existed for a full decade, and we get to have comments like this.

↑ comment by Stuart_Armstrong · 2020-04-02T12:08:24.761Z · LW(p) · GW(p)

Going for it here: https://www.lesswrong.com/posts/TEqW7GFBuBvGo4fbW/call-for-volunteers-assessing-kurzweil-2019 [LW · GW]

comment by Rob Bensinger (RobbBB) · 2013-01-16T22:58:35.135Z · LW(p) · GW(p)

There are a variety of reasons interpreters might think that a prediction didn't come true, while Kurzweil boldly claims that it did:

1. Kurzweil didn't express himself clearly, so interpreters misunderstood what the prediction really was. Miscommunication adds random noise, and most randomly generated predictions will turn out false, so this will skew the results against Kurzweil.
2. Kurzweil's predictions were vague, so charitable interpreters will think they're basically true, while less charitable interpreters will think they're basically false. And we can expect random LessWrongers to be less charitable toward Kurzweil than Kurzweil is toward Kurzweil.
3. Interpreters tend to be factually mistaken about current events, in a specific direction: They are ignorant of the nature, existence, or prevalence of the latest innovations in technology and culture.
4. Kurzweil tends to be factually mistaken about current events, in a specific direction: He thinks a variety of technologies are more advanced, and more widespread, than they really are.
5. There are systemic differences in the evaluation scales used by Kurzweil and by others. For instance, Kurzweil and Armstrong individuate 'predictions' differently, lumping and splitting at different points in the source text. There may also be systemic disagreements about how (temporally and technologically) precise an interpretation must be to count as 'correct,' and about whether grammatical forms like 'X is Y' most closely mean 'X is always Y', 'X is usually Y', 'X is commonly Y', 'X is sometimes (occasionally) Y', or 'X is Y at least once'. This ties into vagueness, but may bias the results due to linguistic variation rather than just as a result of generic degree of interpretive charity.

I'm particularly curious about testing 3, since the strongest criticism Kurzweil could make of our methodology for assessing his accuracy is that our reviewers simply got the facts wrong. We can calibrate our assumptions about the accuracy and up-to-dateness of LessWrongers regarding technology generally. Or more specifically we can expose them to Kurzweil's arguments and see how much their assessment of his predictive success changes after hearing why he thinks he got a certain prediction 'correct'.

Replies from: Decius, mfb

↑ comment by Decius · 2013-01-30T00:54:24.172Z · LW(p) · GW(p)

With the advent of multi-core architectures, these devices are starting to have 2, 4, 8… computers each in them, so we’ll exceed a dozen computers “on and around their bodies” very soon. One could argue that it is “typical” already, but it will become very common within a couple of years.

There's clearly a disconnect between his 'computer' and the general meaning of 'computer'; a multicore processor isn't more than one computer, and it wasn't in 1990.

Also, he seems to regard things as 'typical' that I would call 'common'; I say 'common' when it isn't surprising to see something, and 'typical' when it is surprising to note its absence, while he seems to use 'typical' for things which are not surprising, and 'common' for things which are commercially available (regardless of cost or prevalence).

↑ comment by mfb · 2013-01-22T19:11:25.491Z · LW(p) · GW(p)

I think (5.) can give a significant difference (together with 1 and 2 - I would not expect so much trouble from 3 and 4). Imagine a series of 4 statements, where the last three basically require the first one. If all 4 are correct, it is easy to check every single statement, giving 4 correct predictions. But if the first one is wrong - and the others have to be wrong as a consequence - Kurzweil might count the whole block as one wrong prediction.

For predictions judged by multiple volunteers, it might be interesting to check how much they deviate from each other. This gives some insight into how important (1.) to (3.) are. satt looked at that, but I don't know which conclusion we can draw from that.
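One rough way to check how much volunteers deviate from each other is to compute, per prediction, the spread of the ratings and to flag predictions where one assessor leaned true and another leaned false. This is only a sketch: the ratings are invented, the names `disagreement` and `split_verdict` are hypothetical, and standard deviation is just one possible spread measure for an ordinal five-point scale.

```python
from statistics import pstdev

# HYPOTHETICAL data: prediction id -> ratings from several volunteers
# (1=True ... 5=False, as in the post). Not the real study data.
ratings = {
    "p1": [1, 1, 2],
    "p2": [1, 5, 3],
    "p3": [5, 5, 5],
}

def disagreement(rs):
    """Population standard deviation of the ratings for one prediction."""
    return pstdev(rs)

def split_verdict(rs):
    """True if some assessor leaned true (1-2) while another leaned false (4-5)."""
    return any(r <= 2 for r in rs) and any(r >= 4 for r in rs)

spread = {pid: disagreement(rs) for pid, rs in ratings.items()}
contested = [pid for pid, rs in ratings.items() if split_verdict(rs)]

for pid in ratings:
    print(pid, "spread:", round(spread[pid], 2), "contested:", pid in contested)
```

A large share of contested predictions would suggest that vagueness or miscommunication matters a lot; a small share would point more toward factual disagreement with Kurzweil himself.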

comment by Shmi (shminux) · 2013-01-15T16:44:58.703Z · LW(p) · GW(p)

even a true rate of 30% is much higher than chance.

What is the chance rate, and how do you calculate it?

Replies from: JoshuaFox, Stuart_Armstrong, alex_zag_al

↑ comment by JoshuaFox · 2013-01-16T07:59:29.227Z · LW(p) · GW(p)

I'd also like to compare Kurzweil against the success rate of other predictors.

Some predictions might be very obvious.

Replies from: CarlShulman, JoshuaFox, MichaelAnissimov

↑ comment by CarlShulman · 2013-01-16T20:04:20.170Z · LW(p) · GW(p)

Yes, if one has a source of abundant likely, obvious predictions one can arbitrarily 'juice' one's overall accuracy rate even if most of the surprising predictions go wrong. On the other hand, judging 'obviousness' in hindsight is very tricky.

One also has to pay attention to the independence of predictions. E.g. one could predict the continuation of Moore's Law as one prediction or as many predictions with connected answers: a prediction about chips in laptops, a prediction about chips in supercomputers, a prediction about the performance of algorithms with well-understood hardware scaling, etc. In the extreme, one could make 1000 predictions about computer performance in consecutive minutes, which would almost certainly rise or fall together.

Kurzweil's separate predictions aren't perfectly correlated (e.g. serial speed broke off from supercomputer performance in FLOPS) but many of them are far from independent.
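To illustrate the extreme case described above, here is a toy comparison between counting every prediction separately and giving each cluster of connected predictions a single vote. The clusters and outcomes are entirely hypothetical, and equal weight per cluster is just one simple assumption about how to discount dependence.

```python
from collections import defaultdict

# HYPOTHETICAL data: (cluster label, came true?). 1000 near-identical
# predictions about computer performance that rise or fall together,
# plus four unrelated predictions that failed.
predictions = [("computer performance", True)] * 1000 + [
    ("topic_b", False),
    ("topic_c", False),
    ("topic_d", False),
    ("topic_e", False),
]

def raw_accuracy(preds):
    """Fraction of predictions that came true, counting each one separately."""
    return sum(ok for _, ok in preds) / len(preds)

def cluster_accuracy(preds):
    """Mean of within-cluster accuracies, so each cluster counts once."""
    by_cluster = defaultdict(list)
    for cluster, ok in preds:
        by_cluster[cluster].append(ok)
    per_cluster = [sum(v) / len(v) for v in by_cluster.values()]
    return sum(per_cluster) / len(per_cluster)

print("raw accuracy:    ", raw_accuracy(predictions))
print("cluster accuracy:", cluster_accuracy(predictions))
```

With these made-up numbers the raw rate is about 99.6% while the per-cluster rate is 20%, which is the 'juicing' effect in miniature. Deciding where one cluster ends and the next begins is, of course, the hard and subjective part.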

Replies from: TechnoToad

↑ comment by TechnoToad · 2013-01-29T22:13:35.386Z · LW(p) · GW(p)

Carl is basically pointing out that assessing predictions is tricky business, because it's hard to be objective.

Here are a few points that need to be taken into account:

1. People have a lot to gain from being pessimistically defensive. It prevents them from being disappointed at some point in the future. The option of being pleasantly surprised remains open. Being defensively pessimistic also prevents you from looking crazy to your peers. After all... who wants to be the only one in a group of 10 to think that by 2030 we'll have nanobots in our brains?

2. The poster assessed Kurzweil's predictions because he felt the need to do so. Why did he feel the need to do so? Is this about defensive pessimism?

3. It is safe to assume that a random selection of assessors would be biased towards judging 'False' for two obvious reasons. The first is the fact that they are uninformed about technology and simply aren't able to properly judge the lion's share of all predictions. The second is defensive pessimism.

4. Why is it judged that a 30% 'Strong True' is a weak score? In comparison to the predictions of futurologists before Kurzweil, 30% seems like an excellent score to me. It strikes me as a score that a man with a blurred vision of the future would have. But blurred vision of the future is all you can ever have. If the future were here, you'd be able to see it sharply in focus. Having blurred vision of the future is a real skill. Most people (SL0) have no vision of the future whatsoever.

5. How many years does a prediction have to be off in order for it to be wrong? How would you determine this number of years objectively?

6. Why did the assessors have to go with the 5-step-true-to-false system? Is that really the best way of assessing a futurologist's predictions? I understand that we are a group of rational people here, but sometimes, you've gotta let go of the numbers, the measurements, the data and the whole binary thinking. Sometimes, you have to go back to just being a guy with common sense.

Take Kurzweil's long-standing predictions for solar power, for example. He's been predicting for years that the solar tipping point would be around 2010. Spain hit grid parity in 2012 and news outlets are saying that the USA and parts of Europe will hit grid parity in 2013.

Calling Kurzweil's prediction on solar power wrong just because it's happening 2 to 3 years after 2010, is wrong in my opinion.

Kurzweil deserves some slack here. In the 1980s he predicted a computer would beat a human chess player in 1998. And that ended up happening a year earlier in 1997.

Kurzweil has blurry vision of the future. He might be a genius, but he is also just a human being that doesn't have anything better to go on than big data. Simple as that.

Instead of bickering about his predictions, we would do better to just look at the big picture of things.

Nanotech, wireless, robotics, biotech, AI... all of it is happening.

And be honest about Google's self-driving car, which came out 2 years ago already: that was just an unexpected leap into the future right there!

I don't think Kurzweil himself saw self-driving cars coming in 2011 already.

And to really hammer the point home, the self-driving car had thousands of registered miles when it was introduced at the start of 2011. So it was probably already finished in 2010.

For all we know, the Singularity will occur in 2030. We just don't know.

Kurzweil has brought awareness to the world. Rather than sit around and count all the right and the wrong ones as the years pass by, the world would do better if it tried turning those predictions into self-fulfilling prophecies.

↑ comment by JoshuaFox · 2013-01-17T08:07:26.563Z · LW(p) · GW(p)

I don't know which predictions, if any, are obvious, but by comparing Kurzweil to other predictors at the same time when he wrote the book (if there were any), we could see how much better he does.

↑ comment by MichaelAnissimov · 2013-01-16T18:33:43.179Z · LW(p) · GW(p)

Which predictions are very obvious?

Replies from: EricHerboso

↑ comment by EricHerboso · 2013-01-22T18:59:32.268Z · LW(p) · GW(p)

As a (perhaps) trivial example, consider the pair of predictions:

"Intelligent roads are in use, primarily for long-distance travel."

"Local roads, though, are still predominantly conventional."

As one of the people who participated in this study, I marked the first as false and the second as true. Yet the second "true" prediction seems like it is only trivially true. (Or perhaps not; I might be suffering from hindsight bias here.)

Replies from: V_V

↑ comment by V_V · 2013-01-30T13:43:52.274Z · LW(p) · GW(p)

But why was this counted as two separate predictions? The two statements are even syntactically linked by the "though" conjunction.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2013-01-30T14:56:08.969Z · LW(p) · GW(p)

Why oughtn't it be? The construction "A, though B" is an independent assertion of A and B. Syntactic linkage is not enough to establish contingency.

It is not like "A, because B" for example, where it's arguably unfair if B and A are both false to count it as two failures... in that case, the claim of A can be seen as contingent on the claim of B, and not independent.

To put this differently, "A, though B" makes the following claims:
A
B
You might (mistakenly) expect -B given A, which is why I mention B explicitly.

Whereas "A, because B" makes the following claims:
B
If B, then A

If A happens in the first case, the first claim is correct. If B happens, the second is correct. If both happen, both claims are correct.

If A happens in the first case but B doesn't, the first claim is correct and the second claim is unevaluatable.

(I suppose one could argue that the second case implicitly claims "if -B, then -A" as well... "because" is somewhat ambiguous in English.)

Replies from: Kindly, V_V

↑ comment by Kindly · 2013-01-30T15:06:32.784Z · LW(p) · GW(p)

This is only a problem because we haven't been comparing the relative "difficulty" of predictions. Admittedly this is hard to do; but I think it's clear that:

"Intelligent roads are in use, primarily for long-distance travel." is a much more ambitious prediction than "Local roads, though, are still predominantly conventional."
Treating the two statements as a single prediction "A, though B" is more ambitious than either, and should be worth as many points as the two of them combined.

Moreover, any partial credit for "A, though B" would take into account that B happened though A didn't. Or rather, a prediction that intelligent roads are only somewhat in use should receive more credit than a prediction that intelligent roads are ubiquitous.

Replies from: TheOtherDave

↑ comment by TheOtherDave · 2013-01-30T17:33:19.779Z · LW(p) · GW(p)

Agreed that understanding the "difficulty" of a prediction is key if we're going to evaluate the reliability of a predictor in a useful way.

Replies from: EricHerboso

↑ comment by EricHerboso · 2013-01-31T20:47:05.612Z · LW(p) · GW(p)

In the future, we might distinguish "difficult" predictions from trivial ones by seeing if the predictions are unlike the predictions made by others at the same time. This is easy to do if we evaluate contemporary predictions.

But I have no idea how to accomplish this when looking back on past predictions. I can't help but to feel that some of Kurzweil's predictions are trivial, yet how can we tell for sure?

↑ comment by V_V · 2013-01-30T17:13:22.933Z · LW(p) · GW(p)

Well, if you analyze the statements in terms of propositional logic, then all the English language conjunctions "and", "but", "though", etc. map to the only type of logical conjunction ∧.

But natural language is richer than (directly mapped) propositional logic. I interpret the statement "Local roads, though, are still predominantly conventional." as a clarification of "Intelligent roads are in use, primarily for long-distance travel.".

Formally, if you just claim:
"Intelligent roads are in use, primarily for long-distance travel."
it is equivalent to:
"Intelligent roads are in use, primarily for long-distance travel." ∧ ( "Local roads are still predominantly conventional" ∨ ¬"Local roads are still predominantly conventional" )
which is different from
"Intelligent roads are in use, primarily for long-distance travel." ∧ "Local roads are still predominantly conventional"

However, we can assume that if you claim:
"Intelligent roads are in use, primarily for long-distance travel."
you also wanted to communicate that
"Local roads are still predominantly conventional"
not that you are undecided between
"Local roads are still predominantly conventional", ¬"Local roads are still predominantly conventional"
otherwise you would have probably stated that explicitly.

Therefore, the information content of:
"Intelligent roads are in use, primarily for long-distance travel."
and:
"Intelligent roads are in use, primarily for long-distance travel. Local roads, though, are still predominantly conventional."
is roughly the same.
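The purely formal step in this argument can be checked with a truth table. In the sketch below, `a` stands for "intelligent roads are in use" and `b` for "local roads are still predominantly conventional"; the function names are made up for illustration. It only covers the propositional part: whether a listener may pragmatically infer `b` from a bare claim of `a` is a question about language, not something the code settles.

```python
from itertools import product

def claim_a_only(a, b):
    """A ∧ (B ∨ ¬B): asserting A while staying silent about B."""
    return a and (b or not b)

def claim_a_and_b(a, b):
    """A ∧ B: asserting both A and B."""
    return a and b

# Enumerate all four truth assignments for (a, b).
rows = [
    (a, b, claim_a_only(a, b), claim_a_and_b(a, b))
    for a, b in product([True, False], repeat=2)
]

# Worlds in which the two formulas get different truth values.
differing = [(a, b) for a, b, only_a, both in rows if only_a != both]

for a, b, only_a, both in rows:
    print(f"A={a!s:5} B={b!s:5}  A∧(B∨¬B)={only_a!s:5}  A∧B={both!s:5}")
print("formulas differ when (A, B) =", differing)
```

The two formulas come apart in exactly one world: A true and B false. So the second sentence only adds information about the case where intelligent roads exist and local roads are not conventional.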

↑ comment by Stuart_Armstrong · 2013-01-15T17:51:14.037Z · LW(p) · GW(p)

Subjective impression. The predictions are so varied and sometimes so ambiguous, there's no decent way of establishing a base rate. But picking some predictions at random, they appear to be quite specific (certainly better than a dart-throwing chimp):

"For the most part, these truly personal computers have no moving parts." "Unused computes on the Internet are being harvested, creating virtual parallel supercomputers with human brain hardware capacity." "The security of computation and communication is the primary focus of the U.S. Department of Defense." "Military conflicts between nations are rare, and most conflicts are between nations and smaller bands of terrorists."

↑ comment by alex_zag_al · 2013-01-16T19:28:14.726Z · LW(p) · GW(p)

A chance rate isn't the right thing to compare to, I think. It would have to be randomly generated predictions, wouldn't it? But any non-expert human will do much better than that, since basic knowledge such as that the Earth will stay in orbit around the sun rules out most of these.

I think the right thing to compare to is if he did significantly better than I would have. Which he probably did, which means I can improve my vision of the future by reading Kurzweil.

Replies from: shminux

↑ comment by Shmi (shminux) · 2013-01-16T20:31:34.719Z · LW(p) · GW(p)

I think the right thing to compare to is if he did significantly better than I would have.

How do you know how you would have done? Have you tried?

comment by MTGandP · 2013-01-15T16:45:08.034Z · LW(p) · GW(p)

You make a quick statement at the end about how Kurzweil does better than random chance. But I wonder how we'd assess that? I'd guess that, if he's getting 50% correct or weakly correct, he's doing better than random chance because many (most?) of his claims are far-fetched.

I've thought of a way to test this, although it will take another ten years:

Kurzweil makes a bunch of predictions about what will happen by 2023. Then you have a bunch of non-experts decide which of his predictions they agree with. After 10 years, we can measure how much better Kurzweil did than the non-experts.
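Here is a minimal sketch of how the scoring for such a test might look, under some loud assumptions: every prediction is treated as a binary claim that Kurzweil asserts will come true, each non-expert simply marks agree or disagree, and all the data below is invented. The names `kurzweil_accuracy` and `nonexpert_accuracy` are hypothetical.

```python
# HYPOTHETICAL outcomes after ten years: did each prediction come true?
outcomes = [True, False, True, False, False, True]

# HYPOTHETICAL non-expert judgements made in advance:
# True = "I agree this will happen", False = "I disagree".
nonexperts = {
    "n1": [True, True, False, False, False, True],
    "n2": [False, False, False, False, False, False],  # blanket sceptic
}

def kurzweil_accuracy(outcomes):
    """Kurzweil asserted every prediction, so he is right whenever it came true."""
    return sum(outcomes) / len(outcomes)

def nonexpert_accuracy(agreed, outcomes):
    """A non-expert is right when their agree/disagree call matches the outcome."""
    hits = sum(a == o for a, o in zip(agreed, outcomes))
    return hits / len(outcomes)

k_acc = kurzweil_accuracy(outcomes)
n_accs = {name: nonexpert_accuracy(a, outcomes) for name, a in nonexperts.items()}
mean_n_acc = sum(n_accs.values()) / len(n_accs)

print("Kurzweil:", k_acc)
print("non-experts:", n_accs)
print("Kurzweil minus mean non-expert:", k_acc - mean_n_acc)
```

Note that a blanket sceptic who disagrees with everything scores one minus Kurzweil's accuracy, so this setup rewards filtering his predictions rather than generating them.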

Replies from: cjemmott, handoflixue

↑ comment by cjemmott · 2013-01-16T19:39:10.744Z · LW(p) · GW(p)

I think I can do this! I read "The Age of Spiritual Machines" when it came out, and remember marking in the margins about whether or not I agreed with each. I was in high school at the time, and think I left the book at home when I left for college. I will see if it is still there.

Though I also agree with the comment from handoflixue that making the predictions is much harder than judging them.

Replies from: MTGandP

↑ comment by MTGandP · 2013-01-17T00:09:23.228Z · LW(p) · GW(p)

Very cool! I'd love to see that. What year did you do this?

↑ comment by handoflixue · 2013-01-15T20:48:52.949Z · LW(p) · GW(p)

In fairness, it would seem that simply coming up with the prediction is probably a lot of the work.

As a metaphor: it's relatively easy to walk non-experts through a proof of Gödel's Incompleteness Theorem. The hard part is often coming up with the idea in the first place, or proving its correctness; simply agreeing on a proof or theory is vastly easier :)

Replies from: TylerJay

↑ comment by TylerJay · 2014-04-13T05:57:54.086Z · LW(p) · GW(p)

For anyone who hasn't read it, see locating the hypothesis.

Some of the predictions are affected by this more than others, but it's hard to judge in any case. For example, the "nanotechnology is prevalent" hypothesis wouldn't be that hard to locate, given that a lot of people were talking about nanotechnology at the time. Then it's just a matter of deciding yes or no based on evidence and your model(s). On the other hand, something like his prediction that "Personal computers with high resolution interface embedded in clothing and jewelry, networked in Body LAN’s," while wrong, is much harder to locate in the hypothesis space.

comment by BlueSun · 2013-01-16T19:36:57.995Z · LW(p) · GW(p)

I wonder if we could get Pundit Tracker to start tracking him? They've mentioned tracking technology pundits in the future.

Replies from: Stuart_Armstrong

↑ comment by Stuart_Armstrong · 2013-01-17T09:23:57.012Z · LW(p) · GW(p)

Can you suggest it to them?

comment by Friendly-HI · 2013-01-21T20:01:45.714Z · LW(p) · GW(p)

@ Everyone:

What are the most interesting and useful conclusions we can reasonably draw from this?

I'm not being facetious, it's just that after I've read this and most of the top-rated comments I'm not sure what to draw from all of this. We have a rough estimate of how K. is doing in absolute terms, but not in relative terms because we're left without a baseline to compare him to. Chance or the "average predictions of the average human" can't be a meaningful baseline (for me) because I'm not going to use them as potential sources for my personal predictions/beliefs anyway. What actually interests me is how seriously I can take Kurzweil's predictions for the upcoming future (!) in comparison to other competent predictors. But how K. is doing in comparison to other predictors is very hard to judge because we simply can't standardise relatively vague predictions by different people in order to reasonably compare them.

So what I am left with is only that a bunch of random better-than-average informed people (regarding current technology) estimated that slightly less than one third of K's predictions came true, one third was hard to judge and consists of shades of grey somewhere between definitely true and definitely false, and one third was judged as plain false. So the only thing I really take away from this is that K. seems like a reasonably competent predictor in absolute terms, since any given prediction of his had roughly the same chance of leaning towards "true" as towards "false". Assuming he keeps this rate up for the upcoming decade(s) my ultimate takeaway for now is that he's at least worth reading for inspiration.

Also, I take away that his self-assessment of accuracy is probably either iffy at best or plain dishonest at worst. But to judge this point further I'd have to read his personal accounts on each of his predictions and the specific reasons why he apparently counted many of them as "essentially true", while most technophiles didn't.

Replies from: EricHerboso

↑ comment by EricHerboso · 2013-01-22T18:53:44.312Z · LW(p) · GW(p)

As one of the people who contributed to this project by assessing his predictions, I do want to point out that several of the predictions marked as "True" seemed very obvious to me. Of course, this might be the result of hindsight bias, and in fact it is actually very impressive for him to have predicted something like the following examples:

"[Among portable computers,] Memory is completely electronic, and most portable computers do not have keyboards."

"However, nanoengineering is not yet considered a practical technology."

"China has also emerged as a powerful economic player."

Note also that some of the statements marked "True" are only vacuously true. For example, one of his wrong predictions was that "intelligent roads are in use...for long-distance travel". But he follows this up with the following prediction which got marked as "True":

"Local roads, though, are still predominantly conventional."

As you can see, I do not think that looking just at the percentage of true predictive statements he made is enough. Some of those predictions seem almost trivial. And yet we can't just dismiss them out of hand, because the reason I think they are trivial might just be because I'm looking at it from after the fact. Counterfactually, if intelligent roads had come about, but local roads were still conventional, would I still call the prediction trivial? What if local roads weren't conventional? Would I then still call it a trivial prediction?

We had no choice but to just mark such statements as true and count them in the percentage he got correct, because there's just no way I know of to disregard such "trivial" predictions. And this means we shouldn't really be looking at the percentage marked as true except to compare it with Kurzweil's own self-assessment of accuracy. Using the percentage marked as true for other reasons, like "should I trust Kurzweil's predictive power more than others'", seems like a misuse of this data.

Replies from: V_V

↑ comment by V_V · 2013-01-30T13:24:30.357Z · LW(p) · GW(p)

"[Among portable computers,] Memory is completely electronic, and most portable computers do not have keyboards."

Is that actually true? Notebooks have keyboards and hard disks, many also have optical drives. Tablets still sell less than notebooks (I found a prediction of tablet sales topping notebooks by 2016). I suppose that you can consider Kurzweil's prediction true if you count smartphones as portable computers, but I don't think that's appropriate since they are typically not used as notebook replacements.

"However, nanoengineering is not yet considered a practical technology."

"China has also emerged as a powerful economic player."

These two seem quite obvious. Why do you think they were impressive predictions?

Replies from: gwern, Stuart_Armstrong

↑ comment by gwern · 2013-01-30T17:04:08.040Z · LW(p) · GW(p)

"China has also emerged as a powerful economic player."

China could, at any point, have collapsed into a Japan-style lost decade(s), and there are commentators like Pettis right now who are predicting such a collapse soon; Pettis in particular has an active bet with The Economist that it will happen.

Of course in hindsight Chinese growth seems obvious. Why would anyone think that corruption would not strangle growth or that the Communist Party would not collapse or urban rioting and warfare not break out? After all just look at how China managed 7%+ for the last decade and more! Isn't it obvious that China would just keep growing and not stall out or collapse?

But then again, people used to praise the wisdom of MITI (Communist Party) in guiding Japanese (Chinese) growth and speculate about when the Japanese economy would surpass the American economy to become the largest in the world and explain how sky-high property prices in Tokyo (Beijing) were perfectly justified.

If you think it's really that "quite obvious", perhaps you should go wager a thousand bucks with Pettis or on Long Bets or something on whether Chinese growth will exceed, say, 5% for the next decade...

Replies from: V_V

↑ comment by V_V · 2013-01-30T17:54:29.185Z · LW(p) · GW(p)

China could, at any point, have collapsed into a Japan-style lost decade(s)

Japan is a powerful economic player, and China has more than ten times its population. If China "collapsed" to Japan's per capita GDP it would be by far the largest world economy.

If you think it's really that "quite obvious", perhaps you should go wager a thousand bucks with Pettis or on Long Bets or something on whether Chinese growth will exceed, say, 5% for the next decade...

Next decade is another story.

EDIT:

According to Wikipedia, in 1999 China was already the world's seventh-largest economy by nominal GDP.

Replies from: gwern, Vaniver

↑ comment by gwern · 2013-01-30T18:11:39.179Z · LW(p) · GW(p)

Japan is a powerful economic player, and China has more than ten times its population. If China "collapsed" to Japan's per capita GDP it would be by far the largest world economy.

You can stagnate/collapse without having reached Japan's per capita GDP: http://en.wikipedia.org/wiki/Middle_income_trap

Next decade is another story.

Of course it is. I'm sure you would in 1998/1999 have said 'it's obvious that China would grow like gangbusters up to 2013, but 2013-2023 - well, I just don't know!', right? You'll pardon me if this looks more like hindsight bias + excuse-making for why you won't extend the 'obvious' prediction out another decade.

Replies from: V_V

↑ comment by V_V · 2013-01-30T23:37:25.913Z · LW(p) · GW(p)

You can stagnate/collapse without having reached Japan's per capita GDP: http://en.wikipedia.org/wiki/Middle_income_trap

And where are the manufacturing jobs going to go? Africa is still too far behind in terms of infrastructure and political stability.

Of course it is. I'm sure you would in 1998/1999 have said 'it's obvious that China would grow like gangbusters up to 2013, but 2013-2023 - well, I just don't know!', right? You'll pardon me if this looks more like hindsight bias + excuse-making for why you won't extend the 'obvious' prediction out another decade.

Well, think what you want. In 1999 China was the 7th world economy by GDP, and had the highest GDP growth of the seven. Pretty much every consumer product was already "made in China". Was it so difficult to predict that China would have kept growing through the decade?

For the next decade, I expect China to become the world's largest economy by total GDP, though I'm not betting on the exact growth rate.

Replies from: gwern

↑ comment by gwern · 2013-01-31T02:10:23.709Z · LW(p) · GW(p)

And where are the manufacturing jobs going to go?

Any poorer or more reliable country; so maybe Africa - but maybe the USA or Germany.

(And of course, we've all heard about Foxconn investing heavily in robotics, which is the sort of trend that might preserve some manufacturing-based GDP growth - at the price of increased economic inequality and through that trend, increase various low but catastrophic risks like war or revolution.)

In 1999 China was the 7th world economy by GDP, and had the highest GDP growth of the seven.

Making it the most likely to regress to the mean or fail to turn in continued exceptional performance. When you're growing that fast, there's not much your growth rate can do but go down at some point.

Pretty much every consumer product was already "made in China". Was it so difficult to predict that China would have kept growing through the decade?

It sure isn't in hindsight.

Replies from: V_V

↑ comment by V_V · 2013-01-31T07:12:51.319Z · LW(p) · GW(p)

Any poorer or more reliable country; so maybe Africa - but maybe the USA or Germany.

The Chinese government seems highly reliable, and before American or German workers have lower salaries than Chinese workers, China would be the world's largest economy.

(And of course, we've all heard about Foxconn investing heavily in robotics, which is the sort of trend that might preserve some manufacturing-based GDP growth - at the price of increased economic inequality and through that trend, increase various low but catastrophic risks like war or revolution.)

Just like industrial automation increased the risks of war and revolution in first world countries?

Making it the most likely to regress to the mean or fail to turn in continued exceptional performance. When you're growing that fast, there's not much your growth rate can do but go down at some point.

These trends usually don't change abruptly. You can't extrapolate them to 50 years, but 10 years seems reasonable. Moreover, as I answered to Vaniver, there were more fundamental reasons why China was expected to grow more than Japan or other first world countries.

It sure isn't in hindsight.

Whatever.
Pick your favorite Bayesian-Solomonoffian-Yudkowskyian-Muehlhauserian-ian method and make your best prediction about the size of the Chinese economy in 2009 using only data available up to 1999. Let's see if your techniques pay rent.

Replies from: CCC, gwern

↑ comment by CCC · 2013-01-31T08:15:48.800Z · LW(p) · GW(p)

Pick your favorite Bayesian-Solomonoffian-Yudkowskyian-Muehlhauserian-ian method and make your best prediction about the size of the Chinese economy in 2009 using only data available up to 1999. Let's see if your techniques pay rent.

There is a certain temptation here, to pick and choose data that was available in 1999 that leads to a correct conclusion about 2009. This may be unintentional - the result of noting that the result is correct and then not bothering to double-check the sources, or of noting that the result is incorrect and then ruthlessly double-checking the sources and eliminating or updating some, or possibly altering some parts of the predictive model used, until such time as the result is correct.

I would therefore, personally, be more impressed about a correct prediction, made now, with regards to the size of the Chinese economy in 2023, than by a correct prediction, made now, with regards to the size of the Chinese economy in 2009, regardless of what information is used to make the prediction.

Replies from: gwern

↑ comment by gwern · 2013-01-31T17:19:01.377Z · LW(p) · GW(p)

There is a certain temptation here, to pick and choose data that was available in 1999 that leads to a correct conclusion about 2009.

There's far more than just a temptation. How do you even reconstruct the dataset one would've been working with in 1999? So many media sources or websites have disappeared or are completely inaccessible (enjoy 14 years of link rot), or the print versions are extremely time-consuming to access (public libraries keep only a few years of periodicals). In addition, people who turned out wrong about China will not be mentioning, citing, or linking their old 1999 pieces or projections even though they were drawing on plenty of germane information, so there's a double whammy of both references disappearing and knowledge of the disappeared materials itself disappearing.

One could pick a publication which has survived to the present and invested heavily in making its materials accessible, like the Economist, but what would such an exercise boot you? 'A hypothetical person like the 2013 me, inextricably contaminated by more than a decade of knowledge & experience, who read only everything mentioning "China" in the Economist up to 1999 and not any critics or dissenters or commentary, would estimate X% for China growing such-and-such.' Well, uh, ok...

↑ comment by gwern · 2013-01-31T17:11:51.224Z · LW(p) · GW(p)

The Chinese government seems highly reliable,

Surely you're kidding? This is the same Chinese government that went through an internal power struggle over Bo Xilai during the just-past decadal transfer of power, which is almost as opaque as North Korea, and which just yesterday was revealed to have hacked the NYT's entire internal network as retaliation for reporting on the premier's relatives accumulating billions in suspiciously-obtained wealth and to obtain the names of anyone who helped the NYT investigation? This is the image of a reliable government?

These trends usually don't change abruptly.

How abruptly did the Japanese trend change? Feel free to look it up.

Pick your favorite Bayesian-Solomonoffian-Yudkowskyian-Muehlhauserian-ian method and make your best prediction about the size of the Chinese economy in 2009 using only data available up to 1999.

You want someone who is pointing out hindsight bias to go and engage in a worthless exercise one knows in advance will be contaminated by hindsight bias? I'm not sure what point you're trying to make here.

Replies from: V_V

↑ comment by V_V · 2013-01-31T18:06:57.958Z · LW(p) · GW(p)

This is the image of a reliable government?

That's irrelevant from a business point of view.

How abruptly did the Japanese trend change? Feel free to look it up.

http://www.google.com/publicdata/explore?ds=k3s92bru78li6_&ctype=l&met_y=pppgdp#!ctype=l&strail=false&bcs=d&nselm=h&met_y=ngdpd&scale_y=lin&ind_y=false&rdim=world&idim=country:JP&ifdim=world&hl=en_US&dl=en_US&ind=false

GDP grew linearly from 1990 to 1995, then it oscillated approximately around the 1993 value. It seems the typical pattern that one would expect when you converge to a steady state economy. It may have started to grow again in 2009, or maybe that's just another oscillation.

The GDP at Purchasing Power Parity curve is much less wavy: http://www.google.com/publicdata/explore?ds=k3s92bru78li6_&ctype=l&met_y=pppgdp#!ctype=l&strail=false&bcs=d&nselm=h&met_y=pppgdp&scale_y=lin&ind_y=false&rdim=world&idim=country:JP&ifdim=world&hl=en_US&dl=en_US&ind=false

Convergence to an approximate steady state is what I'd expect in a "full", technologically advanced country.

You want someone who is pointing out hindsight bias to go and engage in a worthless exercise one knows in advance will be contaminated by hindsight bias? I'm not sure what point you're trying to make here.

So you are saying that Kurzweil's prediction was not obvious. Which means that it was surprising. So, what did the unsurprising scenario look like? You refuse to answer.

Replies from: gwern

↑ comment by gwern · 2013-01-31T18:26:54.999Z · LW(p) · GW(p)

That's irrelevant from a business point of view.

Whether you will be hacked and your trade secrets leaked to a competitor is not irrelevant. Whether you'll be forced to bribe officials to do business is not irrelevant. Whether you can count on them to stay bribed and not be imprisoned is not irrelevant. Whether you have to worry, à la Rio Tinto, that your executives will be imprisoned is not irrelevant. Whether the succession of power will be smooth and not descend into coups or other infighting is not irrelevant. Politics matters; a company might not be interested in politics, but politics is interested in the company.

GDP grew linearly from 1990 to 1995, then it oscillated approximately around the 1993 value. It seems the typical pattern that one would expect when you converge to a steady state economy. It may have started to grow again in 2009, or maybe that's just another oscillation.

And if you look at the growth rate, you see that not only was the growth rate leading to exponential growth (4% a year compounding), the growth rate itself grew to 7.15%, after which it fell a percentage point or two a year until in 1998 it was actually negative 2%, whereupon it hovers around 0% with occasional small periods of growth counterbalanced by disasters like negative 5.5% in 2009 ('may have started to grow again'?).

A flat real GDP contradicts your previous claims that a stable economy should then grow at the rate of technological improvements; or have there been no such improvements since 1989?

The GDP at Purchasing Power Parity curve is much less wavy

It's also misleading since it's not real GDP PPP. If you look at a graph of real GDP like http://research.stlouisfed.org/fred2/series/JPNRGDPR or real GDP PPP http://research.stlouisfed.org/fred2/series/JPNRGDPC , you see the abrupt break at 1989; if the '80s trend had continued, the real GDP would be off the chart.

So, what did the unsurprising scenario look like? You refuse to answer.

Refused? I already answered: it looks like the middle-income trap which has affected many countries. The country just stops growing significantly, or actually declines in real terms, due to any of the potential problems already listed.

Replies from: V_V

↑ comment by V_V · 2013-01-31T21:45:26.796Z · LW(p) · GW(p)

Whether you will be hacked and your trade secrets leaked to a competitor is not irrelevant. Whether you'll be forced to bribe officials to do business is not irrelevant. Whether you can count on them to stay bribed and not be imprisoned is not irrelevant. Whether you have to worry, à la Rio Tinto, that your executives will be imprisoned is not irrelevant.

China is known to have a fairly predictable pattern of bribery. Predictable bribery becomes de facto just another tax. Of course there can be occasional exceptions.

Whether the succession of power will be smooth and not descend into coups or other infighting is not irrelevant.

The last one was the Chinese Revolution in 1949.

And if you look at the growth rate, you see that not only was the growth rate leading to exponential growth (4% a year compounding), the growth rate itself grew to 7.15%, after which it fell a percentage point or two a year until in 1998 it was actually negative 2%, whereupon it hovers around 0% with occasional small periods of growth counterbalanced by disasters like negative 5.5% in 2009 ('may have started to grow again'?).

Derivatives magnify noise. Compute a 5-year average.
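
The smoothing suggested here can be sketched in a few lines. This is only an illustration: the `growth` figures below are hypothetical placeholders, not Japan's actual growth rates, and `moving_average` is a helper name introduced for this example.

```python
def moving_average(values, window=5):
    """Mean of each run of `window` consecutive values.

    Returns len(values) - window + 1 points.
    """
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical yearly growth rates in percent (NOT real data).
growth = [4.0, 7.0, 5.0, 3.0, 1.0, 0.0, 2.0, -2.0, 0.0, 1.0]
smoothed = moving_average(growth, window=5)

# The smoothed series swings less than the raw year-on-year series.
raw_spread = max(growth) - min(growth)
smoothed_spread = max(smoothed) - min(smoothed)
print(raw_spread, smoothed_spread)
```

Averaging damps single-year spikes, but a sustained drop in the growth rate still shows up in the smoothed series, just a few years later.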

A flat real GDP contradicts your previous claims that a stable economy should then grow at the rate of technological improvements; or have there been no such improvements since 1989?

Japan's current GDP, both nominal and PPP, is higher today than in 1989.

Refused? I already answered: it looks like the middle-income trap which has affected many countries. The country just stops growing significantly, or actually declines in real terms, due to any of the potential problems already listed.

The middle-income trap that didn't happen in Japan, or South Korea, or many other countries.

Replies from: gwern

↑ comment by gwern · 2013-01-31T22:24:55.202Z · LW(p) · GW(p)

China is known to have a fairly predictable pattern of bribery. Predictable bribery becomes de facto just another tax. Of course there can be occasional exceptions.

Handwaving away just one of the examples.

The last one was the Chinese Revolution in 1949.

Yes, minus some minor events like the Great Leap Forward, the Cultural Revolution, Russia & China nearly going to nuclear war in 1969, Tiananmen Square... Gosh, there's no reason to think that China is at the slightest risk of war or revolution; it'd be like worrying about nuclear warfare when the last nuclear bomb used in a war was back in 1945!

Derivatives magnify noise. Compute a 5-year average.

Better yet, let's look at 1989-2011 using the FRED data on real GDP. Over 23 years, the economy increased in size by 26%, for an annualized growth rate of ~1%. This in a time period which saw, among other things, the Internet revolution start and reach maturity. Did technology really improve only by 1%? Is this really consistent with your claim that Japan hit the limits of what is possible?
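
The annualized figure follows from ordinary compounding. A minimal sketch of the arithmetic, using only the 26% and 23-year numbers quoted above (`annualized_growth` is a helper name made up for this example):

```python
def annualized_growth(total_growth: float, years: int) -> float:
    """Constant yearly rate that compounds to `total_growth` over `years` years.

    total_growth is a fraction, e.g. 0.26 for a 26% increase.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    return (1.0 + total_growth) ** (1.0 / years) - 1.0


rate = annualized_growth(0.26, 23)
print(f"{rate:.2%}")  # roughly 1% per year
```

Counting 1989-2011 as 22 compounding intervals rather than 23 changes the result only in the second decimal place, so the "~1%" conclusion holds either way.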

Japan's current GDP, both nominal and PPP, is higher today than in 1989.

Use real figures. Inflation has not been 0% the entire period, and your chosen graphs aren't even being reported in yen.
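
To illustrate why the nominal/real distinction matters, here is a rough sketch of deflating a nominal series with a price index. All numbers are hypothetical placeholders chosen for easy arithmetic, not Japanese data, and `to_real` is a name introduced for this example; it also ignores the separate currency-conversion issue mentioned above.

```python
def to_real(nominal, price_index, base_index=100.0):
    """Express nominal values in base-year prices.

    price_index[i] is the price level in period i, with the base year
    equal to base_index.
    """
    if len(nominal) != len(price_index):
        raise ValueError("series must have the same length")
    return [n * base_index / p for n, p in zip(nominal, price_index)]


# Hypothetical series (NOT real data): nominal output grows 21%,
# but prices rise 10% over the same period.
nominal_gdp = [100.0, 110.0, 121.0]
price_index = [100.0, 105.0, 110.0]
real_gdp = to_real(nominal_gdp, price_index)

nominal_growth = nominal_gdp[-1] / nominal_gdp[0] - 1.0
real_growth = real_gdp[-1] / real_gdp[0] - 1.0
print(f"nominal {nominal_growth:.1%}, real {real_growth:.1%}")
```

In this made-up case the nominal series overstates growth by about half, which is the kind of gap the objection is pointing at.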

The middle-income trap that didn't happen in Japan, or South Korea, or many other countries.

'Many'? Look at a ranking of real PPP: there are a great many countries well below South Korea. Above SK, it's basically oil countries, European/American countries, and tiny outliers like Singapore.

Replies from: shminux

↑ comment by Shmi (shminux) · 2013-01-31T22:29:51.368Z · LW(p) · GW(p)

Yes, minus some minor events like...

Biting sarcasm rarely improves one's point, though it's a great way to signal.

↑ comment by Vaniver · 2013-01-30T18:07:15.192Z · LW(p) · GW(p)

If China "collapsed" to Japan's per capita GDP it would be by far the largest world economy.

He is referring to the Lost Decade, a period of slow GDP growth in Japan following an asset price bubble; it's not clear to me why you are referring to absolute per capita GDP values.

Replies from: V_V

↑ comment by V_V · 2013-01-30T23:06:49.025Z · LW(p) · GW(p)

Japan is a country with high per-capita GDP, very high HDI, high population density (home to the world's largest megalopolis), cutting-edge technology, and few natural resources.
Its economy can't grow by implementing already existing technologies, or increasing the population, or using more land, or exploiting more domestic natural resources. It can only grow by technological development, which, in many core areas, is quite slow: a modern car may have all kinds of digital gizmos, but it's only marginally more efficient than a car made 10 or 20 years ago, both in terms of fuel consumption and material costs.

China is far from that point, and was even further from it in 1999. Its population is more or less artificially capped, but it has lots of usable land, natural resources, lots of room for technological improvement using already existing technologies, and room for an increase in internal consumption.
Maybe it will never reach Japan-level per capita wealth, maybe in the long run Japan and other first world countries' wealth will fall and they will equalize with China at some intermediate point, maybe they will keep oscillating, maybe world economy will collapse.

I can't make long-term predictions (other than that the world economy will probably not keep growing for more than 50-100 years), but I expect that China will keep growing through the next decade. Fourteen years ago, that prediction would have been even stronger.

↑ comment by Stuart_Armstrong · 2013-01-31T11:01:45.627Z · LW(p) · GW(p)

"However, nanoengineering is not yet considered a practical technology."

Maybe it's because I share an office with Eric Drexler, but I get the definite impression that nanotech was expected to be something huge, back in 1999 - and maybe could have done, had the funding not been diverted to classical material science.

Replies from: V_V

↑ comment by V_V · 2013-01-31T18:22:37.070Z · LW(p) · GW(p)

Enthusiasts certainly expected it, but I'm under the impression that professional chemists didn't share that view. Drexler was sharply criticized by Richard Smalley, one of the Nobel prize recipients for the discovery of buckminsterfullerene.

While Kurzweil sided with Drexler, he wasn't so far-fetched as to believe that nanotech was imminent.

Replies from: Stuart_Armstrong

↑ comment by Stuart_Armstrong · 2013-02-01T12:17:16.652Z · LW(p) · GW(p)

Drexler has his own view on that criticism (claiming that it myopically criticised a particular type of nanotech manipulation that nobody was actually proposing to do).

But I don't have the technical ability to sort out the truth of these matters.

Replies from: V_V

↑ comment by V_V · 2013-02-01T13:39:01.042Z · LW(p) · GW(p)

I suppose that for a sufficiently broad definition nanotechnology includes biochemistry.

comment by satt · 2013-01-15T18:31:55.042Z · LW(p) · GW(p)

I've been looking forward to this. Looking at the raw data now to get an idea of the inter-rater agreement. The two columns of Youtopia ratings agree fairly well on the 33 predictions where they overlap, and the 9 LW raters seem to disagree more, but that's only my first impression. (Maybe it's just that inter-rater variation is more obvious for predictions with ≥3 ratings.) Thanks again to all the assessors for putting in the legwork.
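
One common way to put a number on agreement between two raters is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is not the comparison described above, just an illustration of the idea; the rating scale and the two rating lists are hypothetical, and `cohen_kappa` is a name introduced here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items.

    1.0 means perfect agreement, 0.0 means no better than chance.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently according to their own frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings on an assumed 1-5 scale (NOT the real assessments).
ratings_a = [1, 2, 3, 5, 1, 4]
ratings_b = [1, 2, 4, 5, 1, 4]
kappa = cohen_kappa(ratings_a, ratings_b)
print(round(kappa, 3))
```

Kappa is pairwise, so predictions with three or more ratings would need either averaging over rater pairs or a multi-rater statistic instead.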

Replies from: Stuart_Armstrong

↑ comment by Stuart_Armstrong · 2013-01-15T18:38:45.379Z · LW(p) · GW(p)

There are more than two Youtopia raters. Different people, at different times, completed the assessments individually (and sometimes did "second opinions" if someone had already done that one before them). I think there were 5 assessors in total.

Replies from: satt

↑ comment by satt · 2013-01-15T19:13:36.922Z · LW(p) · GW(p)

Oops. Fixed.

comment by somervta · 2013-01-20T08:48:39.506Z · LW(p) · GW(p)

Would it be possible for you to send me the original data with the comments/justifications attached? I'm interested in doing a side-by-side comparison with Kurzweil's own analysis of his predictions.

Replies from: Stuart_Armstrong, EricHerboso

↑ comment by Stuart_Armstrong · 2013-01-20T10:51:01.753Z · LW(p) · GW(p)

Most of the assessments didn't have comments or justifications (maybe about 1/4 did). I think the assessors would feel uncomfortable having those published (some are in very informal style). And, as I said, it wouldn't be a fair or systematic assessment - the comments weren't intended for that purpose.

↑ comment by EricHerboso · 2013-01-22T19:03:12.549Z · LW(p) · GW(p)

I am only one of the contributors, but you're welcome to view my comments. I doubt it will be helpful for your purpose, though.

Replies from: somervta

↑ comment by somervta · 2013-01-23T02:24:19.082Z · LW(p) · GW(p)

I'll see what I can do with them. It may be useful, even if I can only do a partial comparison.

comment by HoverHell · 2013-01-15T16:42:29.660Z · LW(p) · GW(p)

Replies from: CarlShulman, AnthonyC

↑ comment by CarlShulman · 2013-01-16T20:14:22.909Z · LW(p) · GW(p)

(with non-binary those graphs, as it seems to me, get relatively useless)

They are at least fairly comparable to the format in Kurzweil's self-assessment, and so useful for putting the high accuracy ratings reported there into perspective.

↑ comment by AnthonyC · 2013-01-16T13:39:16.834Z · LW(p) · GW(p)

Estimate the complexity in bits of each prediction? ;)

Replies from: Stuart_Armstrong

↑ comment by Stuart_Armstrong · 2013-01-24T11:18:29.026Z · LW(p) · GW(p)

How complex, in bits, is: "it will rain in Oxford at some point this year"? Very. And yet, I would hesitate to call that an impressive prediction...
