Rationality Toughness Tests

post by RobinHanson · 2009-04-06T01:12:31.928Z · LW · GW · Legacy · 17 comments

(Epistemic) rationality has two major components:

Attending takes time, energy, quiet, etc.  Circumstances where human rationality degrades include when:

It seems relatively easy to test rationality smarts: repeatedly give folks info and time to work new problems, and measure their accuracy, calibration, etc.  And I have an idea for testing rationality toughness: compare performance on info-similar pairs of good/bad-circumstance problems.

For example, assume people are better at evaluating whether a spouse is cheating when considering an acquaintance in their social circle than when considering a stranger or their own spouse. If so, we could pose them a pair of problems with very similar info structure, one about an easy-to-judge spouse and one about a hard-to-judge spouse.  The closeness of their responses in these two cases would then be a measure of their rationality toughness.

Of course this test may fail if the similarity is too obvious, or the pair are asked too closely in time.  But maybe we don't even need to ask the same person the two questions; perhaps we could usefully compare someone's answer on a hard question to answers from a pool of similar people on matched easy questions.
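One way to make the "closeness" measure concrete is sketched below. It assumes answers are probability estimates between 0 and 1, and that closeness is scored as a mean absolute gap; the function names `toughness_gap` and `pooled_gap` and the sample numbers are hypothetical, not part of the original proposal.

```python
from statistics import mean


def toughness_gap(pairs):
    """Mean absolute gap between answers to matched easy/hard problems.

    Each pair is (easy_answer, hard_answer), both probabilities in [0, 1].
    A smaller gap means the answers moved less when circumstances got harder.
    """
    return mean(abs(easy - hard) for easy, hard in pairs)


def pooled_gap(hard_answer, pool_easy_answers):
    """Gap between one person's hard-case answer and the average
    easy-case answer from a pool of similar people."""
    return abs(hard_answer - mean(pool_easy_answers))


# Hypothetical data: three matched problem pairs for one person.
pairs = [(0.25, 0.5), (0.75, 0.75), (0.5, 0.25)]
print(toughness_gap(pairs))

# Hypothetical data: one hard-case answer against a pool of easy-case answers.
print(pooled_gap(0.25, [0.5, 0.75, 1.0]))
```

A lower score on either function would count as more toughness under these assumptions. Other distance measures, such as a gap in log-odds, might suit the idea equally well.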

While I haven't thought this through, it already suggests a training technique: consider matched hard/easy circumstance problems and compare your answers, separated by enough time that you forget most of your previous analysis.

17 comments


comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-04-06T19:57:18.929Z · LW(p) · GW(p)

This seems like a very general and useful idiom - I propose calling them hot/cold tests, if the term isn't already taken.

Replies from: ciphergoth
comment by Paul Crowley (ciphergoth) · 2009-04-06T20:26:21.599Z · LW(p) · GW(p)

Tests which compare your answers in calm circumstances with "hot" circumstances have been done; they're described in "Predictably Irrational".

Replies from: RobinHanson
comment by RobinHanson · 2009-04-06T22:18:59.597Z · LW(p) · GW(p)

Yes, that is the point. Psychologists often do tests where everything is held constant but for some particular feature of circumstance. Their purpose is to see what circumstantial features have what effects. The idea here is to build on that research to develop ways to test individual rationality.

comment by Br000se · 2009-04-06T06:09:33.019Z · LW(p) · GW(p)

This concept comes up among poker players. Smarts corresponds to an ability to talk about the correct play in a hand in theory. Toughness corresponds to a player's ability to continue to play well in a downswing. There are a lot of correct plays that can lead to bad outcomes with high frequencies. Sometimes a player encounters so many bad outcomes that they begin to doubt whether the play is correct.

comment by jimrandomh · 2009-04-06T15:37:10.463Z · LW(p) · GW(p)

There are many issues that can make thinking rationally harder, some of which are very hard to reproduce in controlled circumstances, and being better at dealing with one issue does not necessarily make you any better at dealing with the others. Almost anything that we would describe as stressful impairs rationality. To your list of circumstances, I would add:

  • Distractions: noisy environments, recent events
  • Altered mental states: fatigue, intoxication
  • Time pressure: either a hard limit on the time available to decide, or a cost of thinking longer that is large enough to matter.

A few of these are easy to test; for example, you could take a written test once while rested, once while tired, once in a noisy room, once with a short time limit, etc. However, it doesn't seem likely that these tests would generalize well, and many of the obvious strategies for training seem unlikely to generalize well either.

Replies from: RobinHanson
comment by RobinHanson · 2009-04-06T16:02:16.920Z · LW(p) · GW(p)

I said "under time" to try to factor these things out, but I just edited the post to say "by attending" to better describe them. Yes, focusing more attention on a problem lets you do better, and more time, less fatigue, and fewer distractions let you focus more attention. But these attention factors seem importantly different from the other factors I listed.

comment by knb · 2009-04-06T03:49:14.021Z · LW(p) · GW(p)

I was very interested to read your views on why humans choke. I was thinking about this the other day in relation to a question posed by a professor. What I came up with at the time was that choking is caused by a sympathetic fight-or-flight response (adaptive in less cognitively complex organisms but not in our human world, where finesse is necessary). In other words, I was suggesting it is a misfire so deeply rooted in our mental architecture that we haven't managed to evolve out of it yet. Also, I wonder, are there examples of non-human animals "choking"?

Edit: I meant "choking" in the sense of failing during high-pressure situations. I was referring to Robin Hanson's second link (to an OB post). Sorry for any confusion.

Replies from: Unnamed, Annoyance
comment by Unnamed · 2009-04-06T23:32:07.093Z · LW(p) · GW(p)

There is evidence of something similar to choking among other species, including rats and cockroaches. Social psychologists have found that the presence of others tends to make people do better on easy or well-practiced tasks and worse on difficult, complex ones, and these effects have been found among cockroaches running through easy or difficult mazes. They call this effect social facilitation.

Replies from: gwern
comment by gwern · 2012-06-12T01:27:28.138Z · LW(p) · GW(p)

Rats and cockroaches are pretty social creatures; so should we interpret this as evidence for Hanson's status theory of choking, that it's to avoid appearing to the elites to be dangerously competent?

Replies from: Unnamed
comment by Unnamed · 2012-06-12T02:30:42.122Z · LW(p) · GW(p)

Is competence dangerous for rats and cockroaches? My guess is that there is no cost to a cockroach for being seen by other cockroaches to have run a maze quickly. If that guess is correct, then that study in isolation is a piece of evidence against Hanson's status theory of choking.

And, more directly, the study is evidence for the optimal level of arousal / social facilitation theory of choking. For each task, there is an optimal level of physiological arousal - increasing arousal improves performance up to that point, and then hinders performance if it increases beyond that level. Easy or well-learned tasks tend to have a higher optimal level of arousal than difficult tasks. Some situations (such as the presence of others) increase arousal, which can lead to failure if arousal gets too high ("choking").
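As a rough illustration of the inverted-U relationship just described, here is a toy model. It assumes a Gaussian-shaped performance curve and arbitrary arousal units; the function `performance` and all the constants are hypothetical choices made for the sketch, not values from any study.

```python
import math


def performance(arousal, optimal, width=1.0):
    """Toy inverted-U curve: peaks at 1.0 when arousal equals the
    task's optimal level and falls off on either side."""
    return math.exp(-((arousal - optimal) ** 2) / (2 * width ** 2))


# Hypothetical optimal arousal levels: higher for an easy task.
EASY_OPTIMUM = 3.0
HARD_OPTIMUM = 1.0

# Hypothetical arousal levels alone and in the presence of others.
ALONE = 1.5
WITH_AUDIENCE = 2.5

for name, optimum in [("easy", EASY_OPTIMUM), ("hard", HARD_OPTIMUM)]:
    alone = performance(ALONE, optimum)
    watched = performance(WITH_AUDIENCE, optimum)
    print(f"{name}: alone={alone:.2f}, with audience={watched:.2f}")
```

With these made-up numbers, the audience moves arousal toward the easy task's optimum and away from the hard task's, so the easy task improves and the hard task suffers.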

If you want an evolutionary story, I would posit that our ancestors evolved to process certain circumstances as cues to increase their level of arousal, in a way that would tend to put them close to the optimal level of arousal for each situation. But individuals in the ancestral environment encountered different sorts of situations and engaged in different tasks than people today. Compared to the ancestral environment, many modern high-pressure situations involve behaviors that are more complicated and less physically demanding, which means that lower levels of arousal are optimal, and this mismatch leads to excessive arousal and choking.

Replies from: wedrifid
comment by wedrifid · 2012-06-12T02:57:43.778Z · LW(p) · GW(p)

"Is competence dangerous for rats and cockroaches? My guess is that there is no cost to a cockroach for being seen by other cockroaches to have run a maze quickly."

It is dangerous to the extent that No Free Lunch applies.

comment by Annoyance · 2009-04-06T18:50:09.870Z · LW(p) · GW(p)

The structural modifications to the human throat that permit easy speech make us vulnerable to choking. If I recall correctly, most animals can breathe and swallow at the same time. Human infants are also capable of this. But the larynx slowly shifts so that we can modulate our airflow in sophisticated ways, and one consequence of this is that swallowing is no longer safely compatible with simultaneous breathing.

Replies from: Eliezer_Yudkowsky
comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2009-04-06T19:58:18.829Z · LW(p) · GW(p)

Voted up from -1 because this plausibly is just an honest misunderstanding.

Though Annoyance, you should keep it in mind whenever you're tempted to interpret someone's remarks in a way that seems surprising / non-sequitury... I guess to you the world just seems like an endless stream of people saying crazy things, and so you can't distinguish your own misunderstandings of them within that.

Replies from: Annoyance
comment by Annoyance · 2009-04-07T18:30:50.357Z · LW(p) · GW(p)

As for alternative kinds of 'choking': regarding animals, see the concept of a "'coon trap", sometimes called a "monkey trap". It's a specific example of flight-or-fight leading directly to a maladaptive response.

In an inverse of the situation with literal choking, young humans are also vulnerable to it, but adults (usually) aren't.

comment by [deleted] · 2009-04-07T01:10:40.059Z · LW(p) · GW(p)

The distinction is one between competence (smarts) and performance (toughness).

"Measuring" either is a problem in ability and personality testing: find the underlying factors and the combinatorial rules; whether the constructs exist as consistent dimensions shouldn't be taken for granted.

Some findings will probably surprise you. The cognitive performance of the most able degrades the most under pressure for intellectually demanding tasks.

comment by HughRistik · 2009-04-06T19:50:47.283Z · LW(p) · GW(p)

I read this post while in the middle of writing my post on heuristic, and I suspect that smarts and toughness apply both to the heuristic/creative components of rationality and to the critical/skeptical components of rationality.

I wonder which degrades more under pressure: the ability to formulate new productive ideas, or the ability to check the ideas you currently have.

Replies from: MichaelBishop
comment by Mike Bishop (MichaelBishop) · 2009-04-07T04:26:32.240Z · LW(p) · GW(p)

Have you tried to imagine what experiment you'd do?

Which degrades more will depend on what type of new ideas the experimenter is asking you to generate under what type of pressure. It would also depend on how the experimenter is distinguishing between "formulating new ideas" and "checking ideas you currently have."

In other words, I feel your question is not concrete enough to even begin trying to answer.