Test Your Rationality

robinhanson

post by RobinHanson · 2009-03-01T13:21:34.375Z · LW · GW · Legacy · 87 comments

So you think you want to be rational, to believe what is true even when sirens tempt you? Great, get to work; there's lots you can do. Do you want to justifiably believe that you are more rational than others, smugly knowing your beliefs are more accurate? Hold on; this is hard.

Humans nearly universally find excuses to believe that they are more correct than others, at least on the important things. They point to others' incredible beliefs, to biases afflicting others, and to estimation tasks where they are especially skilled. But they forget most everyone can point to such things.

But shouldn't you get more rationality credit if you spend more time studying common biases, statistical techniques, and the like? Well this would be good evidence of your rationality if you were in fact pretty rational about your rationality, i.e., if you knew that when you read or discussed such issues your mind would then systematically, broadly, and reasonably incorporate those insights into your reasoning processes.

But what if your mind is far from rational? What if your mind is likely to just go through the motions of studying rationality to allow itself to smugly believe it is more accurate, or to bond you more closely to your social allies?

It seems to me that if you are serious about actually being rational, rather than just believing in your rationality or joining a group that thinks itself rational, you should try hard and often to test your rationality. But how can you do that?

To test the rationality of your beliefs, you could sometimes declare beliefs, and later score those beliefs via tests where high scoring beliefs tend to be more rational. Better tests are those where scores are more tightly and reliably correlated with rationality. So, what are good rationality tests?
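One hedged way to make "declare beliefs, then score them" concrete is a proper scoring rule such as the Brier score. This is only an illustration, not something the post specifies; the function name and the sample forecasts below are made up.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and outcomes.

    forecasts: list of (probability, happened) pairs, happened is True/False.
    Lower is better; always answering 0.5 scores 0.25.
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    total = sum((p - (1.0 if happened else 0.0)) ** 2 for p, happened in forecasts)
    return total / len(forecasts)


# Hypothetical declared beliefs: (stated probability, what actually happened)
declared = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
print(round(brier_score(declared), 3))
```

A rule like this rewards stating the probability you actually hold, which is the property a rationality test of declared beliefs would need. Whether the score tracks rationality rather than domain knowledge is exactly the open question.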

Comments sorted by top scores.

comment by patrissimo · 2009-03-01T21:17:37.006Z · LW(p) · GW(p)

Play poker for significant amounts of money. While it only tests limited and specific areas of rationality, and of course requires some significant domain-specific knowledge, poker is an excellent rationality test. The main difficulty of playing the game well, once one understands the basic strategy, is in how amazingly well it evokes and then punishes our irrational natures. It tests lots of aspects: difficulties updating (believing the improbable when new information comes in), loss aversion, and takeover by the limbic system (anger / jealousy / revenge / etc).

Replies from: None, steven0461, pwno, steven0461, Annoyance

↑ comment by [deleted] · 2009-03-01T21:59:28.597Z · LW(p) · GW(p)

deleted

Replies from: Johnicholas

↑ comment by Johnicholas · 2009-03-03T10:25:32.439Z · LW(p) · GW(p)

To some extent, all tests have the problem of transfer.

Replies from: MarkusRamikin

↑ comment by MarkusRamikin · 2012-04-06T14:49:41.875Z · LW(p) · GW(p)

Could someone please explain, what is meant by "transfer" here?

↑ comment by steven0461 · 2009-03-02T06:13:44.036Z · LW(p) · GW(p)

A problem here is that it takes something like tens or hundreds of thousands of hands for the signal to emerge from the noise.
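A rough back-of-envelope sketch of why the sample has to be that large. It assumes results per 100-hand block are independent with a fixed win rate and standard deviation; the numbers passed in below are illustrative assumptions, not measured figures, and `hands_needed` is a made-up helper.

```python
import math


def hands_needed(win_rate_per_100, sd_per_100, z=2.0):
    """Hands needed before the true win rate sits z standard errors above zero.

    The standard error of the mean over n blocks is sd / sqrt(n), so we need
    win_rate >= z * sd / sqrt(n), i.e. n >= (z * sd / win_rate) ** 2 blocks.
    """
    if win_rate_per_100 <= 0:
        raise ValueError("win rate must be positive")
    blocks = (z * sd_per_100 / win_rate_per_100) ** 2
    return math.ceil(blocks * 100)


# Assumed figures: winning 5 big blinds per 100 hands, SD of 80 per 100 hands.
print(hands_needed(5.0, 80.0))  # 102400
```

Doubling the assumed win rate cuts the requirement by a factor of four, which is why the estimate ranges from tens to hundreds of thousands of hands.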

↑ comment by pwno · 2009-03-01T21:58:00.280Z · LW(p) · GW(p)

Agreed, but I think it is easier to see yourself confront your irrational impulses with blackjack. For instance, you're faced with a 16 versus a 10; you know you have to hit, but your emotions (at least mine) tell me not to. Anyone else experience this same accidental rationality test?

Replies from: patrissimo

↑ comment by patrissimo · 2009-03-02T02:49:52.905Z · LW(p) · GW(p)

For amateur players, sure. But there is an easily memorizable table by which to play BJ perfectly, either basic strategy or counting cards. So you always clearly know what you should do. If you are playing BJ to win, it stops being a test of rationality.

Whereas even when you become skilled at poker, it is still a constant test of rationality both because optimal strategy is complex (uncertainty about correct strategy means lots of opportunity to lie to yourself) and you want to play maximally anyway (uncertainty about whether opponent is making a mistake gives you even more chances to lie to yourself). Kinda like life...

Replies from: Annoyance

↑ comment by Annoyance · 2009-03-02T21:03:45.010Z · LW(p) · GW(p)

Whether a person memorizes and uses the table is still a viable test. No rational person playing to win would take an action incompatible with the table, and acting only in ways compatible with the table is unlikely to be accidental for an irrational person.

A way of determining whether people act rationally when it is relatively easy to do so can be quite valuable, since most people don't.

↑ comment by steven0461 · 2009-03-02T06:08:24.849Z · LW(p) · GW(p)

A problem here is it takes tens or hundreds of thousands of hands for results not to be dominated by noise.

↑ comment by Annoyance · 2009-03-02T20:03:03.016Z · LW(p) · GW(p)

Poker isn't just about calculating probabilities, it's also about disguising your reactions and effectively reading others'. Being rational has nothing to do with competence at social interaction and deception.

A good test has no confounding variables. Poker, then, is not a good test of rationality.

Replies from: Johnicholas

↑ comment by Johnicholas · 2009-03-02T20:28:33.435Z · LW(p) · GW(p)

I understand Annoyance's point to be: Prefer online poker to in-person.

Replies from: Annoyance

↑ comment by Annoyance · 2009-03-02T20:36:22.938Z · LW(p) · GW(p)

An excellent point and suggestion.

Any test in which there are confounding variables should be suspect, and every attempt should be made to eliminate them. Looking at 'winners' isn't useful unless we know the way in which they won indicates rationality. Lottery winners got lucky. Playing the lottery has a negative expected return. Including lottery winners in the group you scrutinize means you're including stupid people who were the beneficiaries of a single turn of good fortune.

The questions we should be asking ourselves are: What criteria distinguish rationality from non-rationality? What criteria distinguish between degrees of rationality?

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-03-01T20:23:39.026Z · LW(p) · GW(p)

The thought occurs to me that the converse question of "How do you know you're rational?" is "Why do you care whether you have the property 'rationality'?" It's not unbound - we hope - so for every occasion when you might be tempted to wonder how rational you are, there should be some kind of performable task that relates to your 'rational'-ness. What kind of test could reflect this 'rationality' should be suggested from consideration of the related task. Or conversely, we ask directly what associates to the task.

Prediction markets would be suggested by the task of trying to predict future variables; and then conversely we can ask, "If someone makes money on a prediction market, what else are they likely to be good at?"

Replies from: Jack

↑ comment by Jack · 2009-03-03T08:49:43.399Z · LW(p) · GW(p)

I think there is likely a distinction between being rational at games and rational at life. In my experience those who are rational in one way are very often not rational in the other. I think it highly unlikely that there is a strong correlation between "good at prediction markets" or "good at poker" and "good at life". Do we think the best poker players are good models for rational existence? I don't think I do and I don't even think THEY do.

A suggestion:

List your goals. Then give the goals deadlines along with probabilities of success and estimated utility (with some kind of metric, not necessarily numerical). At each deadline, tally whether or not the goal is completed and give an estimation of the utility.

From this information you can take at least three things.

Whether or not you can accurately predict your ability.
Whether or not you are picking the right goals (lower than expected utility would be bad, I think)
With enough data points you could determine your ratio of success to utility. Too much success and not enough utility means you need to aim higher. Too little success for goals with high predicted utility means either aim lower or figure out what you're doing wrong in pursuing the goals. If both are high you're living rationally, if both are low YOU'RE DOING IT WRONG.

The process could probably be improved if it was done transparently and cooperatively. Others looking on would help prevent you from cheating yourself.

Not terribly rigorous, but that's the idea.
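The tally described above could be sketched roughly as follows. The `Goal` fields, the numeric utility scale, and the sample goals are all assumptions made for illustration; the suggestion itself allows non-numerical metrics.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    name: str
    p_success: float          # probability stated when the goal was set
    predicted_utility: float  # estimate made up front
    achieved: bool            # tallied at the deadline
    realized_utility: float   # estimate made at the deadline


def review(goals):
    """Compare predicted against actual success and utility."""
    if not goals:
        raise ValueError("need at least one goal")
    n = len(goals)
    actual = sum(1 for g in goals if g.achieved)
    return {
        "expected_successes": sum(g.p_success for g in goals),
        "actual_successes": actual,
        "success_rate": actual / n,
        "mean_utility_gap": sum(g.realized_utility - g.predicted_utility for g in goals) / n,
    }


# Hypothetical ledger
ledger = [
    Goal("finish draft", 0.8, 5, True, 4),
    Goal("run 10k", 0.5, 3, False, 0),
    Goal("learn basic Spanish", 0.6, 6, True, 7),
    Goal("build emergency fund", 0.9, 8, True, 8),
]
print(review(ledger))
```

Expected versus actual successes speaks to the first point (predicting your ability), and a persistently negative utility gap speaks to the second (picking the wrong goals).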

Replies from: harremans

↑ comment by harremans · 2013-04-03T13:41:12.523Z · LW(p) · GW(p)

I'm not sure if this works, since a lazy person could score very low on utility yet still be quite rational.

comment by Mike Bishop (MichaelBishop) · 2009-03-01T20:06:39.460Z · LW(p) · GW(p)

Are there cognitive scientists creating tests like this? If not, why not?

comment by Tom_Talbot · 2009-03-01T16:24:01.302Z · LW(p) · GW(p)

This almost seems too obvious to mention in one of Robin's threads, but I'll go ahead anyway: success on prediction markets would seem to be an indicator of rationality and/or luck. Your degree of success in a game like HubDub may give some indication as to the accuracy of your beliefs, and so (one would hope) the effectiveness of your belief-formation process.

Replies from: jimrandomh, timtyler

↑ comment by jimrandomh · 2009-03-01T18:33:07.395Z · LW(p) · GW(p)

I would expect success in a prediction market to be more correlated with amount of time spent researching than with rationality. At best, rationality would be a multiplier to the benefit gained per hour of research; alternatively, it could be an upper bound to the total amount of benefit gained from researching.

↑ comment by timtyler · 2009-03-01T17:40:03.084Z · LW(p) · GW(p)

Prediction markets tend to be zero-sum games. Most rational agents would prefer to play in a real stock market - where you can at least expect to make money in line with inflation.

Replies from: RobinHanson, gwern

↑ comment by RobinHanson · 2009-03-01T18:11:19.889Z · LW(p) · GW(p)

The relevant category is constant-sum games, and stock markets are that as well if liquidity traders are included in the relevant trader set. One can subsidize prediction markets so that all traders can gain by revealing info.

↑ comment by gwern · 2009-03-01T17:53:12.933Z · LW(p) · GW(p)

Tim: but don't prediction markets have a lot of benefits compared to stock markets? They terminate on usually set dates, they're very narrowly focused (compare 'will the Democrats win in 2008' to 'will GE's stock go up on October 11, 2008' - there are so many fewer confounding factors for the former), and they're easier to use.

Replies from: Johnicholas

↑ comment by Johnicholas · 2009-03-01T17:58:31.939Z · LW(p) · GW(p)

Prediction markets as implemented in the real world mostly use fake money, which is a drawback.

Replies from: gwern

↑ comment by gwern · 2009-03-02T18:22:48.727Z · LW(p) · GW(p)

Well, you don't have to use the fake-money ones. Intrade and Betfair have always seemed perfectly serviceable to me, and they're real money prediction markets.

On a related point, fake money could actually be good. There's less motivation to bet what you really truly think, but not wagering real money means you can make trades on just about everything in that market - you aren't so practically or mentally constrained. You're more likely to actually play, or play more.

(Suppose I don't have $500 to spare or would prefer not to risk $500 I do have? Should I not test myself at all?)

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-03-01T16:02:50.033Z · LW(p) · GW(p)

This is the fundamental question that determines whether we can do a lot of things - if we can't come up with evidence-based metrics that are good measures of the effect of rationality-improving interventions, then everything becomes much harder. If the metric is easily gamed once people know about it, everything becomes much harder. If it can be defeated by memorization like school, everything becomes much harder. I will post about this myself at some point.

This problem is one we should approach with the attitude of solving as much as possible, not feeling delightfully cynical about how it can't be solved, but at least you know it. It's too important for that. It sets up the incentives in the whole system. If the field of hedonics can try to measure happiness, we can at least try to measure rationality.

...but not to derail the discussion, Robin's individual how-do-you-know? stance is a valid perspective, and I'll post about the scientific measurement / institutional measurement problems later.

comment by badger · 2009-03-01T18:54:00.570Z · LW(p) · GW(p)

Prediction markets seem like the obvious answer, but the range of issues currently available as contracts is too narrow to be of much use. Most probability calibration exercises focus on trivial issues. I think they are still useful, but the real test is how you deal with emotional issues, not just neutral ones.

This might not be amenable to a market, but I would like to see a database collected of the questions being addressed by research in-progress. Perhaps when a research grant is issued, if a definite conclusion is anticipated, the question can be entered in the database. The question would have to be constructed so that users could enter in definite predictions. At first glance, I think the predictions would have to remain private until after a result is published, but I'm unsure. In contrast to existing prediction sites, this would have the benefits of a broad range of questions formulated by experts who are concerned about precisely defining the issue at hand. How would a standard procedure of formulating a question for a prediction database influence the type of research done?

Another broad test I've considered is whether your judgment of the quality of an individual's claims is correlated with their social club affiliations. To me, political party stands out as the most relevant example of a social club for this purpose. If you find yourself disagreeing with Republicans more frequently than with Democrats over factual issues, that appears to be a sign of confirmation bias. Because association with social clubs tends to be caused by how you were raised, social class, or the sheer desire to be part of a group, there is no reason to think that affiliation should be a strong predictor of quality. Any thoughts?

Replies from: jimmy, steven0461

↑ comment by jimmy · 2009-03-02T02:44:47.336Z · LW(p) · GW(p)

"If you find yourself disagreeing with Republicans more frequently than with Democrats over factual issues, that appears to be a sign of confirmation bias."

Only to the extent that you think Republicans and Democrats are equally wrong. I don't see any rule demanding this.

Since all accurate maps are consistent with each other, everyone with accurate political beliefs is going to be consistent, and you might as well use a new label for this regularity. It's fine to be a Y if the causality runs from X is true -> you believe X is true -> you're labeled "member of group Y".

Tests for "Group Y believes this-> I believe this" that can rule out the first causal path would be harder to come up with, especially since irrational group beliefs are chosen to be hard to prove (to the satisfaction of the group members).

The situation gets worse when you realize that "Group Y believes this-> I believe this" can be valid to the extent that you have evidence that Group Y gets other things right.

↑ comment by steven0461 · 2009-03-01T19:17:27.105Z · LW(p) · GW(p)

Even if rationality isn't a major cause of party affiliation, party affiliation could conceivably still be a major cause of rationality.

comment by Johnicholas · 2009-03-01T15:52:05.587Z · LW(p) · GW(p)

ISO quality certification doesn't look primarily at the results, but primarily at the process. If the process has a good argument or justification that it consistently produces high quality, then it is deemed to be compliant. For example "we measure performance in [this] way, the records are kept in [this] way, quality problems are addressed like [this], compliance is addressed like [such-and-so]".

I can imagine a simple checklist for rationality, analogous to the software carpentry checklist.

Do you have a procedure for making decisions?
Is the procedure available at the times and locations that you make decisions?
How do you prevent yourself from making decisions without following this procedure?
If your procedure depends on calibration data, how do you guarantee the quality of your calibration data?
How does your procedure address (common rationality failure #1)?
et cetera

Sorry, it's just a sketch of a checklist, not a real proposal, but I think you get the idea. Test the process, not the results. Of course, the process should describe how it tests the results.

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-03-01T16:43:08.099Z · LW(p) · GW(p)

How do you know whether the checklist actually works or if it's just pointless drudgery?

Replies from: Johnicholas

↑ comment by Johnicholas · 2009-03-01T17:05:46.319Z · LW(p) · GW(p)

Sorry, I said "Test the process, not the results", which is a strictly wrong misstatement. It is over-strong in the manner of a slogan.

A more accurate statement would be "Focus primarily on testing process, and secondarily on testing results."

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-03-01T17:11:21.358Z · LW(p) · GW(p)

Okay but how do you test the results?

Replies from: Johnicholas

↑ comment by Johnicholas · 2009-03-01T17:47:42.278Z · LW(p) · GW(p)

Consult someone else who commented on this article. I didn't have an idea for how to solve Dr. Hanson's original question. I was trying to pull the question sideways.

comment by Emile · 2009-03-02T10:13:23.809Z · LW(p) · GW(p)

Set up a website where people can submit artistic works - poetry, drawings, short stories, maybe even pictures of themselves - and their expected rating on a 1-10 scale.

The works would be publicly displayed, but anonymously, and visitors could rate them ("anonymously" is to make sure the ratings are "global" and not "compared to other work by the same guy" - so maybe the author could be displayed once you rated it).

You could then compare the expected rating of a work to the actual ratings it received, and see how much the author under- or over-estimates himself.

(for extra measurement of calibration, you could also ask the author to give a confidence factor, though I'm not sure how exactly it should be presented and calculated)

Your own art has the advantage of being something about which you might be systematically biased, and which can still be evaluated pretty easily (as opposed to predictions about how to get out of the financial crisis).
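A minimal sketch of the comparison step, assuming each submission is stored as the author's expected rating plus the list of visitor ratings it received. The function name and the sample numbers are hypothetical.

```python
from statistics import mean


def self_rating_bias(submissions):
    """Average of (author's expected rating - mean visitor rating).

    submissions: list of (expected_rating, [visitor ratings]) pairs.
    Positive means the author over-estimates; negative means under-estimates.
    Works with no ratings yet are skipped.
    """
    gaps = [expected - mean(ratings) for expected, ratings in submissions if ratings]
    if not gaps:
        raise ValueError("no rated submissions")
    return mean(gaps)


# Hypothetical author with three works on the 1-10 scale
works = [(8, [6, 7, 5]), (7, [7, 8, 6]), (9, [6, 6, 7, 5])]
print(round(self_rating_bias(works), 2))
```

This only measures the direction and size of the bias; it says nothing yet about the confidence factor mentioned above.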

comment by Vladimir_Gritsenko · 2009-03-01T17:00:16.084Z · LW(p) · GW(p)

Anyone up for some Rational Debating?

comment by johnbr · 2009-03-06T14:42:42.929Z · LW(p) · GW(p)

Another test.

Find out the general ideological biases of the test subject
Find two studies, one (Study A) that supports the ideological biases of the test subject, but is methodologically flawed. The other (Study B) refutes the ideological biases of the subject, but is methodologically sound.
Have the subject read/research information about the studies, and then ask them which study is more correct.

If you randomize this a bit (sometimes the study is both correct and "in line with one's bias") and run this multiple times on a person, you should get a pretty good read on how rational they are.

Some people might decide "Because I want to show off how rational I am, I'll accept that study X is more methodologically sound, but I'll still believe in my secret heart that Y is correct"

I'm not sure any amount of testing can handle that much self-deception, although I'm willing to be convinced otherwise :)

Replies from: j03

↑ comment by j03 · 2009-03-07T05:42:37.005Z · LW(p) · GW(p)

How do you know your determination of "ideological bias" isn't biased itself?
All experiments are flawed in one way or another to some degree. Are you saying one study is more methodologically flawed than another? How do you measure the degree of the flaws? How do you know your determination of flaws isn't biased?
Again, you've already decided which study is "correct" based on your own biased interpretations. How do you prove the other person is wrong and it's not you that is biased?

I agree with the randomize and repeat bit though.

However, I would like to propose that this test methodology for rationality is deeply flawed.

comment by johnbr · 2009-03-06T14:34:23.997Z · LW(p) · GW(p)

Keep track of when you change your mind about important facts based on new evidence.

a) If you rarely change your mind, you're probably not rational.

b) If you always change your mind, you're probably not very smart.

c) If you sometimes change your mind, and sometimes not, I think that's a pretty good indication that you're rational.

Of course, I feel that I fall into category (c), which is my own bias. I could test this, if there was a database of how often other people had changed their mind, cross-referenced with IQ.

Here's some examples from my own past:

I used to completely discount AGW. Now I think it is occurring, but I also think that the negative feedbacks are being ignored/downplayed.
I used to think that the logical economic policy was always the right one. Now, I (begrudgingly) accept that if enough people believe an economic policy is good, it will work, even though it's not logical. And, concomitantly, a logical economic policy will fail if enough people hate it.
Logic is our fishtank, and we are the fish swimming in it. It is all we know. But there is a possibility that there's something outside the fishtank, that we are unable to see because of our ideological blinders.
The two great stresses in ancient tribes were A) "having enough to eat" and B) "being large enough to defend the tribe from others". Those are more or less contradictory goals. But both are incredibly important. People who want to punish rulebreakers and free-riders are generally more inclined to weigh A) over B). People who want to grow the tribe, by being more inclusive and accepting of others, are more inclined to weigh B) over A).
None of the modern economic theories seem to be any good at handling crises. I used to think that Chicago and Austrian schools had better answers than Keynesians.
I used to think that banks should have just been allowed to die, now I'm not so sure - I see a fair amount of evidence that the logical process there would have caused a significant panic. Not sure either way.

Replies from: adamisom, j03

↑ comment by adamisom · 2012-01-13T23:47:20.840Z · LW(p) · GW(p)

I'm not sure about this.

The words are vague enough that I think we'll usually see ourselves as only sometimes changing our mind. That becomes the new happy medium that we all think we've achieved, simply because we're too ignorant of what it actually means to change our beliefs the right amount.

I'm having a hard time knowing how I could decide if I'm changing my beliefs the right amount; since that would be a (very rough) estimation of an indirect indicator, I feel like I have to disagree with the potential of this idea.

↑ comment by j03 · 2009-03-07T06:05:00.654Z · LW(p) · GW(p)

I agree with many of your points, though the practicality of your test methodology is... well, impractical.

I think rationality itself is one of the ideological blinders you speak of. Forget blinding, it can be totally debilitating.

Irrational morons can be quite successful by any of the usual measures: procreation, monetary wealth, even happiness.

Rationality is simply a point of view. It is satisfying and maybe even fun. But it's not God. It's not the "one true way."

The world would be an awful place to live if everyone was "rational."

comment by Marshall · 2009-03-01T13:51:43.282Z · LW(p) · GW(p)

How about testing the rationality of your life (and not just your beliefs)?

Are you satisfied with your job/marriage/health-exercising? Are you deeply in debt? Spent too much money on status-symbols? Cheating on your life-partner? Spending too much time on the net? Drinking too much?

I am sure there are many other life-tests.

Replies from: RobinHanson

↑ comment by RobinHanson · 2009-03-01T16:19:06.265Z · LW(p) · GW(p)

Surely we want to distinguish "rational" from "winner." Are winners on average more rational than others? This is not clear to me.

Replies from: Eliezer_Yudkowsky, PhilGoetz, Cameron_Taylor

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-03-01T16:44:16.031Z · LW(p) · GW(p)

If we can't demand perfect metrics then surely we should at least demand metrics that aren't easily gamed. If people with the quality named "rationality" don't on average win more often on life-problems like those named, what quality do they even have, and why is it worthwhile?

Replies from: RobinHanson, PhilGoetz

↑ comment by RobinHanson · 2009-03-01T21:28:30.251Z · LW(p) · GW(p)

I understand "rational" people "win" at the goal of believing the truth, but that goal may be in conflict with more familiar "success" goals. So the people around us we see as succeeding may not have paid the costs required to believe the truth.

↑ comment by PhilGoetz · 2009-03-01T21:31:11.324Z · LW(p) · GW(p)

Suppose we did the experiments and found other policies more winning than rationality. Would you adopt the most winning policy?

If not, then admit that you value rationality, and stop demanding that it win.

Replies from: Cameron_Taylor, arundelo, Eliezer_Yudkowsky

↑ comment by Cameron_Taylor · 2009-03-02T06:00:45.231Z · LW(p) · GW(p)

If rationality is defined as making the decisions that maximise expected utility in a given situation then it is by definition more winning. The question would be nonsensical.

If another definition of rationality is implied then I don't think Eliezer demanded that it win.

↑ comment by arundelo · 2009-03-02T03:35:28.606Z · LW(p) · GW(p)

Suppose we did the experiments

That would be a rational thing to do!

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-03-01T21:52:54.294Z · LW(p) · GW(p)

I do have components of my utility function for certain rituals of cognition (as described in the segment on Fun Theory) but net wins beyond that point would compel me.

↑ comment by PhilGoetz · 2009-03-01T21:26:30.620Z · LW(p) · GW(p)

I predict that winners are on average less rational than rationalists. Risk level has an optimal point determined by expected payoff. But the maximal payoff keeps increasing as you increase risk. The winners we see are selected for high payoff. Thus they're likely to be people who took more risks than were rational. We just don't see all the losers who made the same decisions as the winners.
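The selection effect can be illustrated with a toy simulation. Everything here is an assumption chosen to match the argument's shape: expected payoff is `risk - risk**2` (so it peaks at a risk of 0.5), while the spread of outcomes grows in proportion to risk.

```python
import random

OPTIMAL_RISK = 0.5  # maximises expected payoff risk - risk**2 in this toy model


def mean_risk_of_top_winners(n_agents=10_000, top_fraction=0.01, seed=0):
    """Average risk level among the agents with the highest realised payoffs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_agents):
        risk = rng.random()                         # risk level in [0, 1)
        expected = risk - risk ** 2                 # peaks at OPTIMAL_RISK
        payoff = expected + 3.0 * risk * rng.gauss(0.0, 1.0)  # noise scales with risk
        results.append((payoff, risk))
    results.sort(reverse=True)                      # biggest winners first
    top = results[: max(1, int(n_agents * top_fraction))]
    return sum(risk for _, risk in top) / len(top)


print(mean_risk_of_top_winners() > OPTIMAL_RISK)
```

In this model the visible winners cluster well above the risk level that maximises expected payoff, while the agents who took the same risks and lost never make it into the sample.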

↑ comment by Cameron_Taylor · 2009-03-02T05:35:31.680Z · LW(p) · GW(p)

Those who take rational actions win more often than those who do not.

If we take a sample of those who have achieved the greatest utility then we can expect that sample to be biased towards those who have taken the most risks.

Even in idealised situations where success is determined solely by decisions made based off information, and in which rationality is measured by how well those decisions maximise expected utility, we can expect the biggest winners to not be the most rational.

When it comes to actual humans the above remains in place, yet may well be dwarfed by other factors. Some lyrics from Ben Folds spring to mind:

Fate doesn't hang on a wrong or right choice, fortune depends on the tone of your voice

comment by MichaelHoward · 2009-03-01T13:34:22.651Z · LW(p) · GW(p)

I am 95% confident that calibration tests are good tests for a very important aspect of rationality, and would encourage everyone to try a few.

Replies from: RobinHanson, Marshall

↑ comment by RobinHanson · 2009-03-01T16:16:22.200Z · LW(p) · GW(p)

Yes calibration tests are rationality tests, but they are better tests on subjects where you are less likely to be rational. So what are the best subjects on which to test your calibration?

Replies from: AnnaSalamon, AnnaSalamon

↑ comment by AnnaSalamon · 2009-03-02T01:10:42.358Z · LW(p) · GW(p)

I suspect I should also be writing down calibrated probability estimates for my project completion dates. This calibration test is easy to do oneself, without infrastructure, but I'd still be interested in a website tabulating my and others' early predictions and then our actual performance -- perhaps a page within LW? Might be especially good to know about people within a group of coworkers, who could perhaps then know how much to actually estimate timelines when planning or dividing complex projects.

Replies from: pwno

↑ comment by pwno · 2009-03-02T01:20:46.680Z · LW(p) · GW(p)

Wouldn't making a probability estimate for your project completion dates influence your date of completion? Predicting your completion times successfully won't prove your rationality.

Replies from: AnnaSalamon

↑ comment by AnnaSalamon · 2009-03-02T02:18:52.223Z · LW(p) · GW(p)

This is a good point. Still, it would provide evidence of rationality, especially in the likely majority of cases where people didn't try to game the system by e.g. deliberately picking dates far in advance of their actual completions, and then doing the last steps right at that date. My calibration scores on trivia have been fine for awhile now, but my calibration at predicting my own project completions is terrible.

Replies from: badger

↑ comment by badger · 2009-03-02T03:18:52.021Z · LW(p) · GW(p)

I wonder to what degree this is a problem of poor calibration vs. poor motivation. Maybe commitment mechanisms like Stikk.com would have a greater marginal benefit than better calibration. I don't know about you, but that seems to be the case with regards to similar issues on my end.

↑ comment by AnnaSalamon · 2009-03-02T01:06:40.516Z · LW(p) · GW(p)

Perhaps we could make a procedure for asking your friends, coworkers, and other acquaintances (all mixed together) to rate you on various traits, and anonymizing who submitted which rating to encourage honesty? You could then submit calibrated probability estimates as to what ratings were given.

I'd find this a harder context in which to be rational than I'd find trivia.

Replies from: AnnaSalamon

↑ comment by AnnaSalamon · 2009-03-11T00:47:16.639Z · LW(p) · GW(p)

Actually, there's probably some website out there already that lets one solicit anonymous feedback. (Which would be a rationality boost for some of us in itself, even apart from calibration -- though I'd like to try calibration on it, too.)

Does anybody know of such a site? I spent an hour looking on Google -- perhaps not with the right keywords -- and found only What Others Think, Kumquat, and a couple Facebook/Myspace apps.

Both look potentially worth using, but neither is ideal. Are there other competitors?

↑ comment by Marshall · 2009-03-01T13:42:22.506Z · LW(p) · GW(p)

I don't associate rationality with Trivial Pursuit, which does rather seem to dominate the test questions.

Replies from: MichaelHoward

↑ comment by MichaelHoward · 2009-03-01T14:28:53.192Z · LW(p) · GW(p)

Marshall, you don't need to be good at the question subjects, just as long as you don't think you're good when you're not. Calibration tests aren't about how many of the questions you can get right, they test if you're over (or under) confident about your answers. They tend to use obscure questions which few people are likely to know for sure what the answers are.
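As a sketch of what such a test computes: group answers by the confidence you stated and compare that to how often you were right. The function name and the sample answers are invented for illustration.

```python
from collections import defaultdict


def calibration_table(answers):
    """Map each stated confidence level to the observed fraction correct.

    answers: list of (stated_confidence, was_correct) pairs.
    A well-calibrated answerer is right about 90% of the time at 0.9, and so on.
    """
    buckets = defaultdict(list)
    for confidence, correct in answers:
        buckets[confidence].append(correct)
    return {c: sum(hits) / len(hits) for c, hits in sorted(buckets.items())}


# Hypothetical results: 10 answers given at 90% confidence, 10 at 60%
answers = ([(0.9, True)] * 7 + [(0.9, False)] * 3
           + [(0.6, True)] * 6 + [(0.6, False)] * 4)
print(calibration_table(answers))
```

In this made-up run the 60% answers are well calibrated while the 90% answers are overconfident, regardless of how obscure the questions were.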

Replies from: Marshall

↑ comment by Marshall · 2009-03-01T15:06:05.981Z · LW(p) · GW(p)

Thanks Michael - I just don't think calibrating on useless information is evidence of my rationality. I am 95% sure that I don't know all the time. Calibrating on whether I book my next dental appointment on time seems a better clue.

Replies from: MichaelHoward

↑ comment by MichaelHoward · 2009-03-01T15:43:10.364Z · LW(p) · GW(p)

Interesting point. Does anyone know of any evidence about how well calibration test results match overconfidence in important real-life decisions? I'd expect it would give a good indication, but has anyone actually tested it?

Replies from: Eliezer_Yudkowsky

↑ comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-03-01T16:45:13.194Z · LW(p) · GW(p)

There are a lot of tests that look plausibly useful but would be much more trustworthy if we could find a sufficiently good gold standard to validate against.

Replies from: AnnaSalamon

↑ comment by AnnaSalamon · 2009-03-02T01:15:20.029Z · LW(p) · GW(p)

If we have enough tests that look plausibly useful, each somewhat independent-looking, we could see how well they correlate. Test items on which good performance predicts high scores on other test items would seem more credible as indicators of actual rationality.
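One rough way to check this is an item-rest correlation: for each test item, correlate people's scores on it with their summed scores on all the other items. The sketch below assumes one numeric score per person per item; the item names and numbers are invented for illustration, and it does not handle items on which everyone scores the same.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (assumes nonzero variance)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def item_rest_correlations(scores):
    """For each item, correlate it with the sum of all other items.

    scores maps item name -> list of scores, one entry per person,
    with people in the same order in every list.
    """
    names = list(scores)
    n_people = len(scores[names[0]])
    result = {}
    for name in names:
        rest = [sum(scores[other][p] for other in names if other != name)
                for p in range(n_people)]
        result[name] = pearson(scores[name], rest)
    return result


# Hypothetical scores for five people on three candidate test items.
scores = {
    "calibration": [1, 2, 3, 4, 5],
    "resolutions_kept": [2, 1, 4, 3, 5],
    "noise": [3, 5, 1, 4, 2],
}
correlations = item_rest_correlations(scores)
```

Items with a high item-rest correlation would be the more credible ones under this proposal; an item that correlates negatively with the rest, like the invented "noise" item here, would be a candidate for dropping.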

We could include behavioral measures of success, such as those suggested by Marshall, in our list of test items: income, self-reported happiness, having stable, positive relationships, managing to exercise regularly or to keep resolutions, and probabilistic predictions for the situation you'll be in next year (future accomplishments, self-reported happiness at future times, etc., coupled with actual reports next year on your situation). If we can find pen-and-paper test items that correlate both with behavioral measures of success and with other plausibly rationality-related pen-and-paper test items, after controlling for IQ, I'll say we've won.

comment by Annoyance · 2009-03-03T14:26:27.139Z · LW(p) · GW(p)

An ideal rationality test would be perfectly specific: there would be no way to pass it other than being rational. We can't conveniently create such a test, but we can at least make it difficult to pass our tests by utilizing simple procedures that don't require rationality to implement.

Any 'game' in which the best strategies can be known and preset would then be ruled out. It's relatively easy to write a computer program to play poker (minus the social interaction). Same goes for blackjack. It takes rationality to create such a program, but the program doesn't need rationality to function.

comment by HalFinney · 2009-03-03T00:25:11.288Z · LW(p) · GW(p)

"Do you want to justifiably believe that you are more rational than others, smugly knowing your beliefs are more accurate?"

Is this what people want? To me it would make more sense to cultivate the belief that one is NOT more rational than others, and that one's beliefs are no more likely than theirs to be accurate, a priori. Try to overcome the instinct that a belief is probably correct merely because it is yours.

Now I can understand that for people at the cutting edge of society, pushing into new frontiers like Robin and Eliezer, this would not work. If someone came up to Robin and criticized idea futures, or to Eliezer and said that friendly AI would not work, and they responded, "oh, I guess maybe you're right, thanks" - well, then, they wouldn't get anything done.

But for most of us, this is not an issue. Factual disagreements in my experience are seldom about things that would keep us from being productive and successful in our lives. People tend to disagree most vociferously on things that don't have the slightest impact on their lives, like political and sports questions. Isn't that right?

Even for researchers, in a way it doesn't matter because we are paying them to push the boundaries. It is their job to adopt opinions and fight for them. They are obligated to assume that just because an idea is theirs, it is probably right. Researchers are paid to be irrational in this way, and indeed it is hard to see how a rational person could be successful in science.

comment by RobinHanson · 2009-03-01T17:17:53.788Z · LW(p) · GW(p)

Jim Randomh makes some suggestions here.

comment by Marshall · 2009-03-01T14:16:19.549Z · LW(p) · GW(p)

Karma-score (and voting up/down) could also be a measure of rationality contra affiliation. Movement in itself being more important than the direction of the movement as a clue to your affiliation drive or rationality drive - given a little context and some scrupulous introspection.

Replies from: steven0461, Marshall

↑ comment by steven0461 · 2009-03-03T08:44:00.760Z · LW(p) · GW(p)

I don't know if karma is itself a good measure of rationality, but it might be a good subject to train calibration on. E.g., whenever you make a post or comment there could be an optional field where you put in your expectation and SD for what the post's or the comment's score will be one week later.
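As a sketch of how such predictions could be scored, assume each prediction is stored as (expected score, SD, actual score one week later) and, as a simplifying assumption, treat the forecast as a normal distribution even though karma is an integer. The function names and sample numbers below are hypothetical.

```python
from math import log, pi


def z_score(mean, sd, actual):
    """How many stated SDs the actual score landed from the expectation."""
    return (actual - mean) / sd


def gaussian_log_score(mean, sd, actual):
    """Log density of the actual score under Normal(mean, sd); higher is better."""
    z = (actual - mean) / sd
    return -0.5 * z * z - log(sd) - 0.5 * log(2 * pi)


def fraction_within_one_sd(predictions):
    """Share of outcomes within one stated SD of the expectation.

    Under the normal assumption, a calibrated forecaster should see
    roughly 68% here over many predictions.
    """
    hits = sum(1 for mean, sd, actual in predictions
               if abs(actual - mean) <= sd)
    return hits / len(predictions)


# Hypothetical (expected karma, SD, actual karma after one week) records.
predictions = [(3, 2, 4), (0, 1, 5), (10, 4, 7), (2, 2, 2)]
coverage = fraction_within_one_sd(predictions)
```

A coverage well below 68% over many comments would suggest the stated SDs are too narrow; well above would suggest they are too wide. The log score additionally rewards tight SDs when they turn out to be justified.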

Replies from: Jack

↑ comment by Jack · 2009-03-03T08:54:50.605Z · LW(p) · GW(p)

It's too bad karma scores are reads-neutral. Late comments to posts tend to get ignored at the bottom of the thread. I wonder if one couldn't add a "Read this comment" button... though I imagine a lot of people wouldn't bother.

Replies from: Johnicholas

↑ comment by Johnicholas · 2009-03-03T09:59:05.574Z · LW(p) · GW(p)

Late comments getting ignored would not be an issue if people primarily read comments via "Recent Comments".

Replies from: Jack

↑ comment by Jack · 2009-03-03T11:02:45.246Z · LW(p) · GW(p)

"Recent comments" may work when traffic is low and there are only 1-2 posts a day. But imagine when this thing gets going and you're posting in an old article during high traffic hours.

Replies from: khafra

↑ comment by khafra · 2010-06-02T16:35:12.572Z · LW(p) · GW(p)

Maybe there should be a "recent comments less than half the age of the article" feed.

Replies from: RobinZ

↑ comment by RobinZ · 2010-06-02T16:42:45.209Z · LW(p) · GW(p)

Maybe there should be "recent comments on this article" feeds.

Replies from: NancyLebovitz

↑ comment by NancyLebovitz · 2010-06-02T23:23:24.790Z · LW(p) · GW(p)

There certainly should be. And possibly a "recent comments on all articles except [list of articles]" feed. That way, people could see possibly interesting comments on old articles, while avoiding comments to recent high traffic articles they aren't interested in.

↑ comment by Marshall · 2009-03-01T15:00:55.780Z · LW(p) · GW(p)

Another candidate for a behavioural test could be the number of New Year Resolutions you make and hold.

Replies from: pwno

↑ comment by pwno · 2009-03-01T22:34:49.644Z · LW(p) · GW(p)

Funny how the holiday became devoted towards making rational self-improvement goals. Which leads me to my next point: I think people who decide to self-improve on their own, and actually follow through, are already more rational than the average joe. Most people rationalize their mediocrity, find reasons not to self-improve, and stay preoccupied with tasks that don't impinge on their mental comfort level.

Replies from: badger

↑ comment by badger · 2009-03-02T00:03:44.235Z · LW(p) · GW(p)

I agree that people that try to self-improve have more potential to be rational, but I don't think significantly more are in practice. If your goals are misplaced, achieving them could be worse than if you did nothing. On the subject of New Year Resolutions, my parents are apt to make goals like "attend religious services more frequently" and "read scripture more frequently", and they often succeed.

Replies from: Jack, Cameron_Taylor

↑ comment by Jack · 2009-03-03T08:58:41.542Z · LW(p) · GW(p)

You'd want people to estimate the utility of their goals and compare that to a post-goal completion estimate of utility. See here http://lesswrong.com/lw/h/test_your_rationality/dg#comments

↑ comment by Cameron_Taylor · 2009-03-02T06:03:46.918Z · LW(p) · GW(p)

That sounds healthy for them. Did they benefit from it?

comment by Jonnan · 2009-03-06T04:33:12.280Z · LW(p) · GW(p)

Just a personal problem that seems to me to be a precursor to the rationality question.

Various studies have shown that a person's 'memory' of events is very much influenced by later discussion of the event, and that when put into situations such as the 'Stanford Prison Experiment' or the 'Milgram Experiment', people will commit unethical acts under pressure of authority and situation.

Yet people have a two-fold response to these experiments: A) they deny the experiments are accurate, either in whole or in degree; B) they deny that they fall into the realm of those that would be so affected.

With, of course, the obvious caveat that some people actually are not so affected in those experiments (or do remember things accurately), and will stand up for what they determine as ethical regardless.

The obvious fact seems to be that it is among those who honestly consider the possibility that their thoughts can be affected by these outside influences that the greatest chance of successfully maintaining one's own identity against them exists. But other than acknowledging this fact (which can certainly be faked, even self-deceptively), what self-assessments allow one to develop this?

Once we have that, it seems to me that the question of maintaining rationality itself clarifies itself greatly.

Jonnan

comment by Cosmin · 2009-03-05T21:27:47.328Z · LW(p) · GW(p)

There is quite a gap between wanting to be rational and wanting to know how unbiased you are. Since the test is self-administered, pursuing the first desire could easily lead to a favourable, biased, seemingly rational test result. This result would be influenced by personal expectations, and its reliability is null according to Löb's Theorem. The latter desire implies being open to one's biased state and stating one's purpose of assessing some sort of bias/rational balance. This endeavour is more profitable than the previous because, hopefully, it offers actionable information.

Perhaps one could have a good shot at finding out more about his biases by making quick judgements and later trying to contemplate various aspects and sequences of his or her judgement with accounting of seemingly absurd alternatives and attention paid to the smallest of details. The result should occur as a percentage of correct/faulty conclusions. Apart from discovering some sort of rational/biased ratio in a line of thought, this process should automatically bring one closer to being rational by the memorizing of judgement flaws, their sources and pattern, and by the development of a habit for righteous thinking from a rationality point of view.

This test could have a much more reliable result when performed on someone else by providing all necessary information for a right conclusion to be reached together with vague, inconclusive information for incorrect conclusions to be reached, and great incentives for reaching some of the wrong conclusions.

Speaking of incentives, I believe anyone trying to be as rational as possible within a group could be influenced by group values and beliefs. Therefore, trying to find out biases within the group's/group members' judgements could be correlated with one's affinity for that group. Rationality should be neutral, but neutrality is seldom a group value so chances are high that instinctive-rationalists will be outliers. The tendency to agree with beliefs is probably as wrong as the tendency of finding biases, the two depending on one's grade of sympathy for a specific group.

Identifying exterior biases will be an unreliable measure of one's rationality, because of the incentives which exist in interacting with others and also because there is usually little information on exterior thought processes which led to specific outcomes. Also, beliefs widely spread across a social system can have consequences that seemingly prove those beliefs even without their being rational, in which case, comparing one's judgement to facts would be an indicator of power rather than rationality.

comment by nazgulnarsil · 2009-03-01T22:00:22.807Z · LW(p) · GW(p)

it seems that the relevance of the calibration tests is that the better calibrated you are, the better you will perform on predicting how happy various outcomes will make you. being good at this puts you at a huge advantage relative to the average person.

comment by SeanMCoincon · 2014-07-30T23:23:08.029Z · LW(p) · GW(p)

My concern is less with the degree to which I wear the rationality mantle relative to others (which is low to the point of insignificance, though often depressing) and more with ensuring that the process I use to approach rationality is the best one available. To that end, I'm finding that lurking on LessWrong is a pretty effective process test, particularly since I tend to come back to articles I've previously read to see what further understanding I can extract in the light of previous articles. SCORING such a test is a more squiffy concept, though correlation of my (defeasibly) rational conclusions to the evidence of reality seems an effective measure... though I've now run into a concern that my own self-assessment of confirmation bias elimination may not be satisfactorily objective. The obvious solution to THAT problem would be to start publishing process/conclusion articles to LessWrong. I think I may have to start doing so.

comment by j03 · 2009-03-06T21:38:11.532Z · LW(p) · GW(p)

Rationality is fiction.

Belief that purely rational thought is possible or even a reasonable goal is a sure sign of an irrational person.

That last sentence should cause a bit of cognitive dissonance if you're paying attention.

"The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man." George Bernard Shaw

Replies from: j03

↑ comment by j03 · 2009-03-07T05:32:40.667Z · LW(p) · GW(p)

I guess no-one wants to have a "rational" debate on the subject of rationality.

They'd rather vote-down minor problems like actually having concrete definitions for the basis of their faith in this abstract thing called "rationality."

Good luck with that!

Replies from: Carinthium

↑ comment by Carinthium · 2011-04-30T06:56:47.577Z · LW(p) · GW(p)

If you're still here:

Definition of rationality: The ability to determine truths based on what is justified by the evidence available. Generally requires a degree of ability to compensate for cognitive biases. Definition of 'pure rationality': Perfect ability to make justified inferences based on the evidence available. Implies complete immunity to cognitive biases.

These are both tentative, but close enough to show the flaws in your argument. Pure rationality as defined may be practically (if not theoretically) impossible, but is a reasonable goal.

Replies from: David_Gerard

↑ comment by David_Gerard · 2011-04-30T07:20:26.632Z · LW(p) · GW(p)

Those are quite good! May I suggest a discussion thread asking for personal working definitions of rationality?
