Checking Kurzweil's track record

post by Stuart_Armstrong · 2012-10-30T11:07:50.989Z · LW · GW · Legacy · 54 comments

Predictions are cheap and easy; verification is hard, essential, and rare. For things like AI, we seem to be restricted to nothing but expert predictions - but expert predictions on AI are not very good, either in theory or in practice. If there are some experts who stand out, we would really want to identify them - and there is nothing better than a track record for identifying true experts.

So we're asking for help to verify the predictions of one of the most prominent futurists of this century: Ray Kurzweil, from his book "The Age of Spiritual Machines". By examining his predictions for times that have already come and gone, we'll be able to more appropriately weight his predictions for times still to come. By taking part, by lending your time to this, you will be directly helping us understand and predict the future, and will get showered in gratitude and kudos and maybe even karma.

I've already made an attempt at this (if you are interested in taking part in this project, avoid clicking on that link for now!). But you cannot trust a single person's opinions, and that was from a small (albeit random) sample of the predictions. For this project, I've transcribed his predictions into 172 separate (short) statements, and any volunteers would be presented with a random selection among these. The volunteers would then do some Google research (or other) to establish whether the prediction had come to pass, and then indicate their verdict. More details on what exactly will be measured, and how to interpret ambiguous statements, will be given to the volunteers once the project starts.

If you are interested, please let me know at stuart.armstrong@philosophy.ox.ac.uk (or in the comment thread here), indicating how many of the 172 questions you would like to attempt. The exercise will probably happen in late November or early December.

This will be done unblinded, because Kurzweil's predictions are so well known that it would be infeasible to find large numbers of people who are technologically aware but ignorant of them. Please avoid sharing your verdicts with others; it is entirely your own individual assessment that we are interested in having.

54 comments


comment by selylindi · 2012-10-30T13:12:39.837Z · LW(p) · GW(p)

This will be done unblinded, because Kurzweil's predictions are so well known that it would be infeasible to find large numbers of people who are technologically aware but ignorant of them.

Is this true? It could be, or alternatively it could simply appear true from your perspective of familiarity. I'm only vaguely aware of Kurzweil and have never heard any mention of him among my group of largely grad student / geek friends.

Replies from: gwern, DaFranker, Nornagest, Stuart_Armstrong
comment by gwern · 2012-10-30T14:48:35.638Z · LW(p) · GW(p)

I don't think it's true. I think it would be pretty easy to survey people, include some questions checking for technological awareness and Kurzweil awareness, and then quietly discard any results from people low on the former or high on the latter.

I mean, you could do it on Amazon Mechanical Turk! Such people are pretty technically sophisticated to be on Mechanical Turk in the first place, psychologists use it for surveys all the time, and a dismayingly large fraction of Turkers have college educations. It'd work fine.
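A minimal sketch of the screening step described above, assuming each survey response carries two screening scores. The `Response` fields, the 0-10 scales, and the cutoffs are all hypothetical and chosen only for illustration; nothing in the thread specifies them.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    respondent: str
    tech_awareness: int       # hypothetical 0-10 score from screening questions
    kurzweil_awareness: int   # hypothetical 0-10 score from screening questions
    verdicts: dict = field(default_factory=dict)  # prediction id -> verdict


def screen(responses, min_tech=6, max_kurzweil=2):
    """Keep respondents who are technologically aware but not Kurzweil-aware.

    The thresholds are assumptions, not values proposed in the thread.
    """
    return [
        r for r in responses
        if r.tech_awareness >= min_tech and r.kurzweil_awareness <= max_kurzweil
    ]


sample_responses = [
    Response("a", tech_awareness=8, kurzweil_awareness=1),  # kept
    Response("b", tech_awareness=3, kurzweil_awareness=0),  # low tech awareness
    Response("c", tech_awareness=9, kurzweil_awareness=7),  # knows Kurzweil well
]
kept = screen(sample_responses)
```

Discarding happens after collection, so respondents never learn which answers to the screening questions would have excluded them.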

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-31T09:21:34.589Z · LW(p) · GW(p)

We'll look into the Mechanical Turk angle...

Replies from: sheeplearepeople
comment by sheeplearepeople · 2012-11-28T18:25:12.575Z · LW(p) · GW(p)

http://www.kurzweilai.net/predictions/download.php

This contains all his predictions, it shouldn't be hard to verify them.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-29T11:35:10.752Z · LW(p) · GW(p)

Self-assessment isn't a good idea. And I'm really not impressed by that one (though I shouldn't comment directly at this stage).

But we will see. I'll wait till I have all the data in.

comment by DaFranker · 2012-10-30T14:27:11.571Z · LW(p) · GW(p)

I'm likewise vaguely aware of Kurzweil's reputation and accomplishments, and vaguely recall mention of a big prediction in the field of AI, a very optimistic one according to critics IIRC.

That's about the extent of it. I wasn't aware he had publicly made other predictions.

Problem is, how would one go about verifying this? Not to mention that if I wasn't already primed to not click the link, I would probably have immediately searched for the predictions in question just to know the subject under discussion.

On that note, thanks and good move Stuart, warning us beforehand about the spoiler/unblinding link.

comment by Nornagest · 2012-10-31T00:24:33.701Z · LW(p) · GW(p)

I doubt it's true. I think it would be relatively easy to find technically sophisticated people who're unaware of Kurzweil's specific predictions; it'd be harder to find technically sophisticated people who're consistently unaware of his general thesis, but I'll bet you could still do it. You'd just need to look outside the transhumanist/singularitarian/AI enthusiast cluster.

Since those clusters are pretty tightly grouped in terms of conceptual underpinnings, it should be easy to filter them from a sample. Getting a good sample would be harder -- LW wouldn't do it, and personal blogs wouldn't either. Gwern's idea below looks promising but I have no idea how you'd go about it.

comment by Stuart_Armstrong · 2012-10-30T15:17:16.886Z · LW(p) · GW(p)

Interesting. My feeling was that it was hard to conceal that it was Kurzweil, and that people would certainly see that it was Kurzweil while googling their answers (also, we would get more interest with a semi-famous name).

comment by blashimov · 2012-10-30T15:57:00.058Z · LW(p) · GW(p)

I will do 20.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-30T16:04:21.861Z · LW(p) · GW(p)

Thanks!

comment by MileyCyrus · 2012-10-30T12:06:42.522Z · LW(p) · GW(p)

I'll do 20.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-30T15:12:44.827Z · LW(p) · GW(p)

Thanks!

comment by orangecat · 2012-11-04T02:23:07.669Z · LW(p) · GW(p)

I'll do 10. Agreed with satt that having multiple raters for each prediction would be helpful. I previously read your previous post with the randomly selected predictions, which hopefully isn't disqualifying.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-09T10:09:37.094Z · LW(p) · GW(p)

Cheers!

previously read your previous post with the randomly selected predictions, which hopefully isn't disqualifying.

Fine as long as you try and ignore them :-)

comment by EricHerboso · 2012-10-31T01:08:01.699Z · LW(p) · GW(p)

I'll commit to doing 20 questions.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-31T09:20:08.351Z · LW(p) · GW(p)

Thanks!

comment by Simon Fischer (SimonF) · 2012-11-05T14:10:46.478Z · LW(p) · GW(p)

I will do 20, too!

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-09T10:10:53.329Z · LW(p) · GW(p)

Cheers!

comment by MaoShan · 2012-11-03T03:26:48.389Z · LW(p) · GW(p)

I will do ten questions. I have no bias for or against Kurzweil, and I will try my best to find out whether the prediction was accurate, regardless of who predicted it. Maybe including a few placebo questions would be a good idea. kemirunda@hotmail

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-03T10:18:11.539Z · LW(p) · GW(p)

Thanks!

Replies from: MaoShan
comment by MaoShan · 2012-11-28T03:08:07.264Z · LW(p) · GW(p)

I have my ten done, but I'd like to continue down the randomized list. Is there a deadline that you would like these returned by?

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-28T17:08:52.670Z · LW(p) · GW(p)

Great! :-) Can you get them done by the 8th of December, so I can present the results at the AGI-impacts conference if need be?

Replies from: MaoShan
comment by MaoShan · 2012-11-29T04:20:24.058Z · LW(p) · GW(p)

Sure. I'll send whatever I've completed by Sunday night, so you have a chance to analyze your data in time for the conference. I expect to have about fifty done by then.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-29T11:32:02.829Z · LW(p) · GW(p)

You're being a star!

comment by DaFranker · 2012-11-02T13:46:47.199Z · LW(p) · GW(p)

I'd be willing to do around 10-15, whichever works to fit the number of questions to the number of people.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-02T15:44:57.085Z · LW(p) · GW(p)

Cool, thanks!

comment by MTGandP · 2012-11-02T05:24:54.331Z · LW(p) · GW(p)

I will do 10 predictions. Email mtgandp gmail.

Interestingly, Stuart seems to have adopted a similar approach to the synagogue described in Why Our Kind Can't Cooperate, with people publicly declaring their commitments.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-02T10:09:46.433Z · LW(p) · GW(p)

Cheers!

Interestingly, Stuart seems to have adopted a similar approach to the synagogue described in Why Our Kind Can't Cooperate, with people publicly declaring their commitments.

I wish I could take credit for that level of machiavellianism...

comment by MrMind · 2012-10-31T18:08:32.912Z · LW(p) · GW(p)

Sounds like fun! I'll commit to 10 predictions. Are those in a form which can be clearly examined as "correct/incorrect"?
Email at ikstef at gmail dot com

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-01T10:24:40.918Z · LW(p) · GW(p)

Cheers! There will be partial correct/incorrect scores available as well.

comment by bsterrett · 2012-10-31T17:45:54.449Z · LW(p) · GW(p)

I'll do 10.

What is the error-checking process? Will we fix any mistakes in our verdicts via an LW discussion after they have been gathered?

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-31T17:57:02.417Z · LW(p) · GW(p)

Thanks! I'll think about the error checking process; certainly there is some possibility for factual errors, but most of the work is in interpreting qualifiers like "ubiquitous" and "most" and mapping that to what happened in the world.

Replies from: satt
comment by satt · 2012-11-03T17:32:39.205Z · LW(p) · GW(p)

One way to gauge how reliable people's judgements are: have multiple people rate each Kurzweil prediction and see how well their ratings agree. So far LWers have committed to checking at least 200 predictions, so if everyone pulls through you'll be able to get multiple ratings of at least 28 questions. Those multiple ratings could then be cross-checked for each question.

(I won't volunteer to rate any statements myself because (1) I'm lazy; (2) I already have a mildly negative view of Kurzweil's predictive ability, which might make me biased; and (3) I read your earlier post and re-rated the 10 Age of Spiritual Machines predictions in that post myself, so I've already been primed in that respect.)

Replies from: Unnamed
comment by Unnamed · 2012-11-04T02:59:33.276Z · LW(p) · GW(p)

One way to gauge how reliable people's judgements are: have multiple people rate each Kurzweil prediction and see how well their ratings agree.

This is a good idea. It's standard operating procedure (for measures which require a rater's judgment) to have 2 raters for at least some of the items, and to report the agreement rate on those items ("inter-rater reliability"). Be sure to vary which raters are overlapping; for example, don't give gwern and bsterrett the same 10 predictions (instead have maybe one prediction that they both rate, and one where bsterrett & Tenoke overlap, etc.) - that way the agreement rate tells you something about how much agreement there is between all of the raters (and not just between particular pairs of raters).

In cases where the 2 raters disagree, you could just have a 3rd rater rate it and then go with their rating, or you could do something more complicated (like having the two raters discuss it and try to reach a consensus).
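Here is a rough sketch of that bookkeeping, assuming ratings arrive as `(rater, prediction_id, verdict)` tuples with a categorical verdict. The function names, the tuple layout, and the sample data are made up for illustration. It computes simple percent agreement over every pair of raters who rated the same prediction, and lists the predictions that would need a 3rd rater or a discussion.

```python
from collections import defaultdict
from itertools import combinations


def group_by_prediction(ratings):
    """Map each prediction id to a {rater: verdict} dict."""
    grouped = defaultdict(dict)
    for rater, prediction_id, verdict in ratings:
        grouped[prediction_id][rater] = verdict
    return grouped


def agreement_rate(ratings):
    """Fraction of rater pairs (on shared predictions) giving the same verdict.

    Returns None when no prediction has two or more raters.
    """
    agree = total = 0
    for verdicts in group_by_prediction(ratings).values():
        for a, b in combinations(sorted(verdicts), 2):
            total += 1
            agree += verdicts[a] == verdicts[b]
    return agree / total if total else None


def disputed(ratings):
    """Prediction ids whose raters disagree, i.e. candidates for a 3rd rater."""
    return [
        prediction_id
        for prediction_id, verdicts in sorted(group_by_prediction(ratings).items())
        if len(set(verdicts.values())) > 1
    ]


# Made-up sample: overlaps are spread across different pairs of raters.
sample = [
    ("gwern", 1, "true"), ("bsterrett", 1, "true"),
    ("bsterrett", 2, "false"), ("Tenoke", 2, "true"),
    ("gwern", 3, "true"),  # rated once, so it does not enter the rate
]
```

Percent agreement is the crudest version of inter-rater reliability; a chance-corrected statistic such as Cohen's kappa would be the usual next step, and graded verdicts (partially correct, etc.) would need a measure that accounts for near misses.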

comment by Tenoke · 2012-10-31T15:22:44.759Z · LW(p) · GW(p)

I can commit to 20 as well. For the record, I reckon around 3/4 of his predictions will pass.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-31T17:16:06.246Z · LW(p) · GW(p)

Thanks!

comment by Dallas · 2012-10-31T02:27:31.388Z · LW(p) · GW(p)

I will examine 30 questions. dallasjhaugh at gmail dot com

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-31T09:20:51.320Z · LW(p) · GW(p)

Thanks!

comment by quinox · 2012-10-30T23:12:24.408Z · LW(p) · GW(p)

Sign me up for ~2 hours of questions. BTW, I'm not familiar with Kurzweil's predictions yet (I'll wait with that until after I've done your questions)

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-31T09:21:07.772Z · LW(p) · GW(p)

Thanks!

comment by [deleted] · 2012-11-02T13:36:15.107Z · LW(p) · GW(p)

I'm up for that -- andrew @ thenationalpep.co.uk . I can probably do ten. Disclosure of biases upfront -- I'm not familiar with that specific book of Kurzweil, or with much of Kurzweil's work in general, so I don't know what his predictions are. But I am familiar with his book The Singularity Is Near, which I thought utterly, comprehensively wrongheaded to a point where most of what I could say about it would seem like personal abuse, so I suspect that a relatively low proportion of his predictions will come true.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-02T15:45:26.382Z · LW(p) · GW(p)

Cool, thanks!

comment by baiter · 2012-11-01T12:59:45.727Z · LW(p) · GW(p)

I would do 10.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-01T13:14:40.976Z · LW(p) · GW(p)

Cheers!

comment by gwern · 2012-10-31T01:19:28.180Z · LW(p) · GW(p)

I'd be fine with 10, at least initially; email is gwern0@gmail.com if that's how you're doing it.

Replies from: gwern, Stuart_Armstrong
comment by gwern · 2012-11-22T18:42:07.307Z · LW(p) · GW(p)

Update: I've done 20 predictions. Not that tedious, kinda interesting actually.

comment by Stuart_Armstrong · 2012-10-31T09:20:16.740Z · LW(p) · GW(p)

Thanks!

comment by Eneasz · 2012-10-30T19:30:09.997Z · LW(p) · GW(p)

Before committing to a number, how long do you think it would take to research a given question? I can't do very many if they're likely to take an hour each.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-31T17:20:00.472Z · LW(p) · GW(p)

The questions will be grouped together (they are constructed grammatically so that they follow on from each other), so you'll generally be treating 4-5 questions as a single block. It took me a few hours to do my research; I did ten blocks, which I think were about 20-30 predictions.

Replies from: Eneasz
comment by Eneasz · 2012-10-31T19:52:02.603Z · LW(p) · GW(p)

Alrighty, put me down for 10.

ETA: Forgot email. :) embrodski at gmail

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-11-01T10:23:56.081Z · LW(p) · GW(p)

Cheers!

comment by lavalamp · 2012-10-30T23:35:24.124Z · LW(p) · GW(p)

I'm aware of Kurzweil but I don't think I've read anything of his except for your earlier fact checking article (and that was long enough ago that I don't remember a thing about it).

It wouldn't be so hard to add a "do you think this prediction was made by Kurzweil or someone else? (don't google, but if you accidentally found the answer while researching, don't pretend not to know, either)" checkbox.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2012-10-31T09:25:34.521Z · LW(p) · GW(p)

We'll ask about their familiarity with Kurzweil in general - but remember that the volunteers are going to do research on the questions, so it's very likely they'll realise it's Kurzweil at that point.

comment by MTGandP · 2012-11-02T05:19:46.200Z · LW(p) · GW(p)

How many predictions a person is willing to assess depends on how long they take. Can you give a time estimate, or perhaps provide a sample prediction?