Checking Kurzweil's track record
post by Stuart_Armstrong · 2012-10-30T11:07:50.989Z · LW · GW · Legacy · 54 comments
Predictions are cheap and easy; verification is hard, essential, and rare. For things like AI, we seem to be restricted to nothing but expert predictions - but expert predictions on AI are not very good, either in theory or in practice. If there are some experts who stand out, we would really want to identify them - and there is nothing better than a track record for identifying true experts.
So we're asking for help to verify the predictions of one of the most prominent futurists of this century: Ray Kurzweil, from his book "The Age of Spiritual Machines". By examining his predictions for times that have already come and gone, we'll be able to more appropriately weight his predictions for times still to come. By taking part, by lending your time to this, you will be directly helping us understand and predict the future, and will get showered in gratitude and kudos and maybe even karma.
I've already made an attempt at this (if you are interested in taking part in this project, avoid clicking on that link for now!). But you cannot trust a single person's opinions, and that was from a small (albeit random) sample of the predictions. For this project, I've transcribed his predictions into 172 separate (short) statements, and any volunteers would be presented with a random selection among these. The volunteers would then do some Google research (or other) to establish whether the prediction had come to pass, and then indicate their verdict. More details on what exactly will be measured, and how to interpret ambiguous statements, will be given to the volunteers once the project starts.
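A minimal sketch of how such a random selection could be drawn, assuming the 172 statements are simply numbered 1 to 172. The function name `draw_selection` and the `seed` parameter are illustrative assumptions, not a description of the actual project setup.

```python
import random

N_STATEMENTS = 172  # number of transcribed predictions, as stated in the post


def draw_selection(n_requested, seed=None):
    """Return n_requested distinct statement numbers, chosen uniformly at random.

    Hypothetical helper: the numbering 1..172 is an assumption.
    """
    rng = random.Random(seed)
    # sample() draws without replacement, so no statement is repeated
    return sorted(rng.sample(range(1, N_STATEMENTS + 1), n_requested))


# Example: a volunteer who offered to check 20 statements
selection = draw_selection(20, seed=0)
print(selection)
```

Passing a fixed seed makes a volunteer's selection reproducible, which could help if the assignment ever needs to be reconstructed.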
If you are interested, please let me know at stuart.armstrong@philosophy.ox.ac.uk (or in the comment thread here), indicating how many of the 172 questions you would like to attempt. The exercise will probably happen in late November or early December.
This will be done unblinded, because Kurzweil's predictions are so well known that it would be infeasible to find large numbers of people who are technologically aware but ignorant of them. Please avoid sharing your verdicts with others; it is entirely your own individual assessment that we are interested in having.
54 comments
Comments sorted by top scores.
comment by selylindi · 2012-10-30T13:12:39.837Z · LW(p) · GW(p)
This will be done unblinded, because Kurzweil's predictions are so well known that it would be infeasible to find large numbers of people who are technologically aware but ignorant of them.
Is this true? It could be, or alternatively it could simply appear true from your perspective of familiarity. I'm only vaguely aware of Kurzweil and have never heard any mention of him among my group of largely grad student / geek friends.
Replies from: gwern, DaFranker, Nornagest, Stuart_Armstrong↑ comment by gwern · 2012-10-30T14:48:35.638Z · LW(p) · GW(p)
I don't think it's true. I think it would be pretty easy to survey people, include some questions checking for technological awareness and Kurzweil awareness, and quietly discard any results from people low on the former or high on the latter.
I mean, you could do it on Amazon Mechanical Turk! Such people are pretty technically sophisticated to be on Mechanical Turk in the first place, psychologists use it for surveys all the time, and a dismayingly large fraction of Turkers have college educations. It'd work fine.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-31T09:21:34.589Z · LW(p) · GW(p)
We'll look into the Mechanical Turk angle...
Replies from: sheeplearepeople↑ comment by sheeplearepeople · 2012-11-28T18:25:12.575Z · LW(p) · GW(p)
http://www.kurzweilai.net/predictions/download.php
This contains all his predictions; it shouldn't be hard to verify them.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-29T11:35:10.752Z · LW(p) · GW(p)
Self-assessment isn't a good idea. And I'm really not impressed by that one (though I shouldn't comment directly at this stage).
But we will see. I'll wait till I have all the data in.
↑ comment by DaFranker · 2012-10-30T14:27:11.571Z · LW(p) · GW(p)
I'm likewise vaguely aware of Kurzweil's reputation and accomplishments, and vaguely recall mention of a big prediction in the field of AI, a very optimistic one according to critics IIRC.
That's about the extent of it. I wasn't aware he had publicly made other predictions.
Problem is, how would one go about verifying this? Not to mention that if I wasn't already primed to not click the link, I would probably have immediately searched for the predictions in question just to know the subject under discussion.
On that note, thanks and good move Stuart, warning us beforehand about the spoiler/unblinding link.
↑ comment by Nornagest · 2012-10-31T00:24:33.701Z · LW(p) · GW(p)
I doubt it's true. I think it would be relatively easy to find technically sophisticated people who're unaware of Kurzweil's specific predictions; it'd be harder to find technically sophisticated people who're consistently unaware of his general thesis, but I'll bet you could still do it. You'd just need to look outside the transhumanist/singularitarian/AI enthusiast cluster.
Since those clusters are pretty tightly grouped in terms of conceptual underpinnings, it should be easy to filter them from a sample. Getting a good sample would be harder -- LW wouldn't do it, and personal blogs wouldn't either. Gwern's idea below looks promising but I have no idea how you'd go about it.
↑ comment by Stuart_Armstrong · 2012-10-30T15:17:16.886Z · LW(p) · GW(p)
Interesting. My feeling was that it was hard to conceal that it was Kurzweil, and that people would certainly see that it was Kurzweil while googling their answers (also, we would get more interest with a semi-famous name).
comment by blashimov · 2012-10-30T15:57:00.058Z · LW(p) · GW(p)
I will do 20.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-30T16:04:21.861Z · LW(p) · GW(p)
Thanks!
comment by MileyCyrus · 2012-10-30T12:06:42.522Z · LW(p) · GW(p)
I'll do 20.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-30T15:12:44.827Z · LW(p) · GW(p)
Thanks!
comment by orangecat · 2012-11-04T02:23:07.669Z · LW(p) · GW(p)
I'll do 10. Agreed with satt that having multiple raters for each prediction would be helpful. I previously read your previous post with the randomly selected predictions, which hopefully isn't disqualifying.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-09T10:09:37.094Z · LW(p) · GW(p)
Cheers!
previously read your previous post with the randomly selected predictions, which hopefully isn't disqualifying.
Fine as long as you try and ignore them :-)
comment by EricHerboso · 2012-10-31T01:08:01.699Z · LW(p) · GW(p)
I'll commit to doing 20 questions.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-31T09:20:08.351Z · LW(p) · GW(p)
Thanks!
comment by Simon Fischer (SimonF) · 2012-11-05T14:10:46.478Z · LW(p) · GW(p)
I will do 20, too!
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-09T10:10:53.329Z · LW(p) · GW(p)
Cheers!
comment by MaoShan · 2012-11-03T03:26:48.389Z · LW(p) · GW(p)
I will do ten questions. I have no bias for or against Kurzweil, and I will try my best to find out whether the prediction was accurate, regardless of who predicted it. Maybe including a few placebo questions would be a good idea. kemirunda@hotmail
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-03T10:18:11.539Z · LW(p) · GW(p)
Thanks!
Replies from: MaoShan↑ comment by MaoShan · 2012-11-28T03:08:07.264Z · LW(p) · GW(p)
I have my ten done, but I'd like to continue down the randomized list. Is there a deadline that you would like these returned by?
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-28T17:08:52.670Z · LW(p) · GW(p)
Great! :-) Can you get them done by the 8th of December, so I can present the results at the AGI-impacts conference if need be?
Replies from: MaoShan↑ comment by MaoShan · 2012-11-29T04:20:24.058Z · LW(p) · GW(p)
Sure. I'll send whatever I've completed by Sunday night, so you have a chance to analyze your data in time for the conference. I expect to have about fifty done by then.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-29T11:32:02.829Z · LW(p) · GW(p)
You're being a star!
comment by DaFranker · 2012-11-02T13:46:47.199Z · LW(p) · GW(p)
I'd be willing to do around 10-15, whichever works to fit the number of questions to the number of people.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-02T15:44:57.085Z · LW(p) · GW(p)
Cool, thanks!
comment by MTGandP · 2012-11-02T05:24:54.331Z · LW(p) · GW(p)
I will do 10 predictions. Email mtgandp gmail.
Interestingly, Stuart seems to have adopted a similar approach to the synagogue described in Why Our Kind Can't Cooperate, with people publicly declaring their commitments.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-02T10:09:46.433Z · LW(p) · GW(p)
Cheers!
Interestingly, Stuart seems to have adopted a similar approach to the synagogue described in Why Our Kind Can't Cooperate, with people publicly declaring their commitments.
I wish I could take credit for that level of machiavellianism...
comment by MrMind · 2012-10-31T18:08:32.912Z · LW(p) · GW(p)
Sounds like fun! I'll commit to 10 predictions. Are those in a form which can be clearly examined as "correct/incorrect"?
Email at ikstef at gmail dot com
↑ comment by Stuart_Armstrong · 2012-11-01T10:24:40.918Z · LW(p) · GW(p)
Cheers! There will be partial correct/incorrect scores available as well.
comment by bsterrett · 2012-10-31T17:45:54.449Z · LW(p) · GW(p)
I'll do 10.
What is the error-checking process? Will we fix any mistakes in our verdicts via an LW discussion after they have been gathered?
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-31T17:57:02.417Z · LW(p) · GW(p)
Thanks! I'll think about the error checking process; certainly there is some possibility for factual errors, but most of the work is in interpreting qualifiers like "ubiquitous" and "most" and mapping that to what happened in the world.
Replies from: satt↑ comment by satt · 2012-11-03T17:32:39.205Z · LW(p) · GW(p)
One way to gauge how reliable people's judgements are: have multiple people rate each Kurzweil prediction and see how well their ratings agree. So far LWers have committed to checking at least 200 predictions, so if everyone pulls through you'll be able to get multiple ratings of at least 28 questions. Those multiple ratings could then be cross-checked for each question.
(I won't volunteer to rate any statements myself because (1) I'm lazy; (2) I already have a mildly negative view of Kurzweil's predictive ability, which might make me biased; and (3) I read your earlier post and re-rated the 10 Age of Spiritual Machines predictions in that post myself, so I've already been primed in that respect.)
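The figure of 28 above comes from simple counting: with 172 statements and at least 200 committed ratings, at least 200 - 172 = 28 ratings are left over once every statement has been rated once, and giving each leftover rating to a different statement yields 28 double-rated statements. A small sketch of that arithmetic (the function name is illustrative, and it assumes every statement gets rated at least once):

```python
def spare_ratings(n_statements, n_committed):
    """Ratings left over after every statement has been rated once."""
    return max(0, n_committed - n_statements)


print(spare_ratings(172, 200))  # 28
```

If the leftover ratings were instead piled onto a few statements, fewer than 28 distinct statements would end up with multiple ratings, so the assignment has to be arranged deliberately.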
Replies from: Unnamed↑ comment by Unnamed · 2012-11-04T02:59:33.276Z · LW(p) · GW(p)
One way to gauge how reliable people's judgements are: have multiple people rate each Kurzweil prediction and see how well their ratings agree.
This is a good idea. It's standard operating procedure (for measures which require a rater's judgment) to have 2 raters for at least some of the items, and to report the agreement rate on those items ("inter-rater reliability"). Be sure to vary which raters are overlapping; for example, don't give gwern and bsterrett the same 10 predictions (instead have maybe one prediction that they both rate, and one where bsterrett & Tenoke overlap, etc.) - that way the agreement rate tells you something about how much agreement there is between all of the raters (and not just between particular pairs of raters).
In cases where the 2 raters disagree, you could just have a 3rd rater rate it and then go with their rating, or you could do something more complicated (like having the two raters discuss it and try to reach a consensus).
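As a rough sketch of the agreement check described above: the snippet below computes a raw agreement rate over items with two or more ratings and lists the items that would need a third rater. The data layout, function names, rater labels, and verdicts are all made up for illustration; none of it is real project data.

```python
from itertools import combinations


def agreement_rate(ratings):
    """Fraction of rater pairs, over all multiply-rated items, with identical verdicts.

    `ratings` maps an item id to a dict of {rater name: verdict}.
    Returns None if no item has two or more ratings.
    """
    agree = total = 0
    for verdicts in ratings.values():
        # every pair of raters who rated the same item counts once
        for a, b in combinations(verdicts.values(), 2):
            total += 1
            agree += (a == b)
    return agree / total if total else None


def disputed_items(ratings):
    """Item ids whose raters did not all agree; candidates for a third rater."""
    return [item for item, verdicts in ratings.items()
            if len(set(verdicts.values())) > 1]


# Hypothetical verdicts, with overlaps spread across different rater pairs
example = {
    12: {"rater_A": "true", "rater_B": "true"},
    47: {"rater_B": "false", "rater_C": "partially true"},
    90: {"rater_A": "false", "rater_C": "false"},
    133: {"rater_A": "true"},  # single rating, ignored by agreement_rate
}
print(agreement_rate(example))  # 2 of the 3 rater pairs agree
print(disputed_items(example))  # [47]
```

Raw agreement does not correct for agreement that would occur by chance; a statistic such as Cohen's kappa does, at the cost of a slightly more involved calculation.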
comment by Tenoke · 2012-10-31T15:22:44.759Z · LW(p) · GW(p)
I can commit to 20 as well. For the record, I reckon around 3/4 of his predictions will pass.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-31T17:16:06.246Z · LW(p) · GW(p)
Thanks!
comment by Dallas · 2012-10-31T02:27:31.388Z · LW(p) · GW(p)
I will examine 30 questions. dallasjhaugh at gmail dot com
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-31T09:20:51.320Z · LW(p) · GW(p)
Thanks!
comment by quinox · 2012-10-30T23:12:24.408Z · LW(p) · GW(p)
Sign me up for ~2 hours of questions. BTW, I'm not familiar with Kurzweil's predictions yet (I'll wait with that until after I've done your questions)
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-31T09:21:07.772Z · LW(p) · GW(p)
Thanks!
comment by [deleted] · 2012-11-02T13:36:15.107Z · LW(p) · GW(p)
I'm up for that -- andrew @ thenationalpep.co.uk . I can probably do ten. Disclosure of biases upfront -- I'm not familiar with that specific book of Kurzweil, or with much of Kurzweil's work in general, so I don't know what his predictions are. But I am familiar with his book The Singularity Is Near, which I thought utterly, comprehensively wrongheaded to a point where most of what I could say about it would seem like personal abuse, so I suspect that a relatively low proportion of his predictions will come true.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-02T15:45:26.382Z · LW(p) · GW(p)
Cool, thanks!
comment by baiter · 2012-11-01T12:59:45.727Z · LW(p) · GW(p)
I would do 10.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-01T13:14:40.976Z · LW(p) · GW(p)
Cheers!
comment by gwern · 2012-10-31T01:19:28.180Z · LW(p) · GW(p)
I'd be fine with 10, at least initially; email is gwern0@gmail.com if that's how you're doing it.
↑ comment by Stuart_Armstrong · 2012-10-31T09:20:16.740Z · LW(p) · GW(p)
Thanks!
comment by Eneasz · 2012-10-30T19:30:09.997Z · LW(p) · GW(p)
Before committing to a number, how long do you think it would take to research a given question? I can't do very many if they're likely to take an hour each.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-31T17:20:00.472Z · LW(p) · GW(p)
The questions will be grouped together (they are constructed grammatically so that they follow on from each other), so you'll generally be treating 4-5 questions as a single block. It took me a few hours to do my research; I did ten blocks, which I think were about 20-30 predictions.
Replies from: Eneasz↑ comment by Eneasz · 2012-10-31T19:52:02.603Z · LW(p) · GW(p)
Alrighty, put me down for 10.
ETA: Forgot email. :) embrodski at gmail
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-11-01T10:23:56.081Z · LW(p) · GW(p)
Cheers!
comment by lavalamp · 2012-10-30T23:35:24.124Z · LW(p) · GW(p)
I'm aware of Kurzweil but I don't think I've read anything of his except for your earlier fact checking article (and that was long enough ago that I don't remember a thing about it).
It wouldn't be so hard to add a "do you think this prediction was made by Kurzweil or someone else? (don't google, but if you accidentally found the answer while researching, don't pretend not to know, either)" checkbox.
Replies from: Stuart_Armstrong↑ comment by Stuart_Armstrong · 2012-10-31T09:25:34.521Z · LW(p) · GW(p)
We'll ask about their familiarity with Kurzweil in general - but remember that the volunteers are going to do research on the questions, so it's very likely they'll realise it's Kurzweil at that point.