Hindsight Devalues Science
post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2007-08-17T19:39:42.000Z · LW · GW · Legacy · 44 comments
This essay is closely based on an excerpt from Myers’s Exploring Social Psychology; the excerpt is worth reading in its entirety.
Cullen Murphy, editor of The Atlantic, said that the social sciences turn up “no ideas or conclusions that can’t be found in [any] encyclopedia of quotations . . . Day after day social scientists go out into the world. Day after day they discover that people’s behavior is pretty much what you’d expect.”
Of course, the “expectation” is all hindsight. (Hindsight bias: Subjects who know the actual answer to a question assign much higher probabilities they “would have” guessed for that answer, compared to subjects who must guess without knowing the answer.)
The historian Arthur Schlesinger, Jr. dismissed scientific studies of World War II soldiers’ experiences as “ponderous demonstrations” of common sense. For example:
- Better educated soldiers suffered more adjustment problems than less educated soldiers. (Intellectuals were less prepared for battle stresses than street-smart people.)
- Southern soldiers coped better with the hot South Sea Island climate than Northern soldiers. (Southerners are more accustomed to hot weather.)
- White privates were more eager to be promoted to noncommissioned officers than Black privates. (Years of oppression take a toll on achievement motivation.)
- Southern Blacks preferred Southern to Northern White officers. (Southern officers were more experienced and skilled in interacting with Blacks.)
- As long as the fighting continued, soldiers were more eager to return home than after the war ended. (During the fighting, soldiers knew they were in mortal danger.)
How many of these findings do you think you could have predicted in advance? Three out of five? Four out of five? Are there any cases where you would have predicted the opposite—where your model takes a hit? Take a moment to think before continuing . . .
. . .
In this demonstration (from Paul Lazarsfeld by way of Myers), all of the findings above are the opposite of what was actually found.1 How many times did you think your model took a hit? How many times did you admit you would have been wrong? That’s how good your model really was. The measure of your strength as a rationalist is your ability to be more confused by fiction than by reality.
Unless, of course, I reversed the results again. What do you think?
Do your thought processes at this point, where you really don’t know the answer, feel different from the thought processes you used to rationalize either side of the “known” answer?
Daphna Baratz exposed college students to pairs of supposed findings, one true (“In prosperous times people spend a larger portion of their income than during a recession”) and one the truth’s opposite.2 In both sides of the pair, students rated the supposed finding as what they “would have predicted.” Perfectly standard hindsight bias.
Which leads people to think they have no need for science, because they “could have predicted” that.
(Just as you would expect, right?)
Hindsight will lead us to systematically undervalue the surprisingness of scientific findings, especially the discoveries we understand—the ones that seem real to us, the ones we can retrofit into our models of the world. If you understand neurology or physics and read news in that topic, then you probably underestimate the surprisingness of findings in those fields too. This unfairly devalues the contribution of the researchers; and worse, will prevent you from noticing when you are seeing evidence that doesn’t fit what you really would have expected.
We need to make a conscious effort to be shocked enough.
1 Paul F. Lazarsfeld, “The American Soldier—An Expository Review,” Public Opinion Quarterly 13, no. 3 (1949): 377–404.
2 Daphna Baratz, How Justified Is the “Obvious” Reaction? (Stanford University, 1983).
44 comments
Comments sorted by oldest first, as this post is from before comment nesting was available (around 2009-02-27).
comment by Tom_McCabe · 2007-08-17T22:51:11.000Z · LW(p) · GW(p)
Ouch. I had vague feelings that something was amiss, but I believed you when you said they were all correct. I knew that sociology had a lot of nonsense in it, but to proclaim the exact opposite of what actually happened and sound plausible is crazy (and dangerous!).
Replies from: EngineerofScience↑ comment by EngineerofScience · 2016-06-03T19:10:06.340Z · LW(p) · GW(p)
I certainly agree. Most of those I instantly believed, and I had a bit of doubt for the one about southern blacks preferring southern to northern white officers (or maybe that is belief as attire, or hindsight bias) but as you said it is crazy that the opposite of what is true is believable when told it is correct.
comment by Robin_Hanson2 · 2007-08-18T00:14:37.000Z · LW(p) · GW(p)
These examples emphasize the benefit of frequently taking calibration tests, where we assign probabilities to answers and then check those answers for calibration errors. Perhaps someone could create a website where we could do this regularly? Just collect a large list of questions like the ones above, questions with true answers but where we have intuitions about what the answer might be, and then have us answer those questions with probabilities, and then show us a calibration chart for the last X questions. Yes, collecting the good questions will be most of the work.
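The calibration check described here can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names `calibration_table` and `brier_score`, the five-bin layout, and the sample answers are all hypothetical and are not taken from any existing calibration site or app. Each answer is recorded as a (stated probability, whether it turned out true) pair; the table then compares average stated confidence with the actual hit rate in each confidence band.

```python
from collections import defaultdict


def calibration_table(forecasts, n_bins=5):
    """Group (probability, outcome) pairs into equal-width confidence bins.

    forecasts: iterable of (p, was_true), with p in [0, 1] and was_true a bool.
    Returns one row per non-empty bin with the mean stated confidence and
    the observed fraction of answers that were actually true.
    """
    bins = defaultdict(list)
    for p, was_true in forecasts:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        # p == 1.0 would index one past the last bin, so clamp it.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, bool(was_true)))

    rows = []
    for idx in sorted(bins):
        entries = bins[idx]
        n = len(entries)
        rows.append({
            "bin": (idx / n_bins, (idx + 1) / n_bins),
            "n": n,
            "mean_confidence": sum(p for p, _ in entries) / n,
            "hit_rate": sum(1 for _, t in entries if t) / n,
        })
    return rows


def brier_score(forecasts):
    """Mean squared error between stated probability and outcome (lower is better)."""
    forecasts = list(forecasts)
    return sum((p - float(t)) ** 2 for p, t in forecasts) / len(forecasts)


# Hypothetical answers: (stated probability, whether the claim was true).
answers = [(0.9, True), (0.9, True), (0.9, False), (0.9, True),
           (0.1, False), (0.1, False)]
for row in calibration_table(answers):
    print(row)
print("Brier score:", brier_score(answers))
```

A well-calibrated answerer would show `hit_rate` close to `mean_confidence` in every row; a large gap in the high-confidence rows is the overconfidence that the exercise in the post is meant to expose.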
Replies from: Fredrik, TylerHeishman↑ comment by Fredrik · 2011-06-10T04:26:01.183Z · LW(p) · GW(p)
What if I were to try to create such a web app? Should I take 5 minutes every lunchbreak asking friends and colleagues to brainstorm for questions? Maybe write a LW post asking for questions? Maybe there could be a section of the site dedicated to collecting and curating good questions (crowdsourced or centrally moderated).
↑ comment by TylerHeishman · 2014-02-18T18:19:32.902Z · LW(p) · GW(p)
CFAR has 2 apps you might find interesting; I was able to find them on the Apple store easily. http://rationality.org/apps/
Replies from: AlexanderRM↑ comment by AlexanderRM · 2014-10-17T22:01:52.159Z · LW(p) · GW(p)
Are those apps only available on Apple products/smartphones? No way to access them on a Windows PC?
Replies from: ChristianKl, Unnamed↑ comment by ChristianKl · 2014-10-18T00:36:39.701Z · LW(p) · GW(p)
The calibration game is also available for Android and was available for Windows but I think the original website is down.
comment by Bob_Disdevon · 2007-08-18T01:17:51.000Z · LW(p) · GW(p)
As a lack of known causes fear, hindsight bias delivers us the comfort we desire at all times, the easy model that we build, rather than the unexpected unknown that causes anxiety. In that way we learn -- some might prefer this unenlightened state, as hindsight bias smooths their mental ships away from the shoals of uncertainty and self-doubt.
comment by Nathan_T._Freeman · 2007-08-18T12:42:40.000Z · LW(p) · GW(p)
Eliezer, I don't have any contribution to make to the conversation. I just want to tell you that the last 10 or so posts from you have absolutely blown me out of my socks. Without a doubt, some of the most impactful and insightful stuff I've read in my 10+ years on the web.
Please keep it up.
And yes, I realize there's an irony to professing what is really a byline bias on this site. :-)
comment by Constant2 · 2007-08-18T13:59:47.000Z · LW(p) · GW(p)
Frankly none of the five examples strikes me as something I could have predicted, nor ever struck me as such. Nevertheless, social science may indeed produce few significant results which are not predictable. How is that possible given the examples above? Simple: the examples may have been cherry picked to make the point. In particular, their significance (to us now) is seriously damaged by the fact that they are not general statements but are statements about a time and place. While they may be generalizable to the present day while preserving their truth, they may not be. We just do not know. So, as they stand, they are not that useful to us now.
Social science almost certainly produces many insignificant results which are not predictable. It is easy enough to come up with questions which we can then methodically answer by gathering data. What percentage of Massachusetts residents like Fig Newtons? Is it higher or lower than the percentage of New York residents who like Fig Newtons? This is certainly a question that can be asked, and one whose answer I do not know. I could obtain a grant and then spend the grant money studying this question. But it is not a significant question, and learning its answer does not advance human knowledge in a significant way.
But what about significant results? Here's a much more important question: in general, does extreme lack of sleep tend to have any significant negative impact? This is important because if it does not tend to have any significant negative impact, then many people will find this highly useful knowledge. Many people will sleep much less.
But notice something: it is not only an important question, it is also a question which people know the answer to. And this is no coincidence. It is often hard to hide important truths about people, from people.
This is not true of all social facts. Economic facts are facts about large numbers of people interacting, sometimes very indirectly, and so are a kind of fact which people have a hard time seeing, since they only encounter small numbers of other people at any given time, so they do not see the whole.
Replies from: JohnH
comment by pdf23ds · 2007-08-18T18:21:17.000Z · LW(p) · GW(p)
Constant, it's odd you should choose the sleep example. What with the prospects of modafinil, which suppresses the desire for sleep and aids concentration, being used as an enhancement drug, and with people practicing polyphasic sleep (admittedly with more limited success), where you sleep 15 min at a time six times a day, the question of what effects this lack of regular sleep causes is actually a very open, and very interesting, one. Of course, without these modifications, lack of sleep has obvious ill effects.
comment by Constant2 · 2007-08-18T20:45:10.000Z · LW(p) · GW(p)
pdf23ds, I don't think it's odd.
Let us grant the two points you have made (of which I am, of course, well aware, as is pretty much anyone who knows what "Digg" or "Reddit" is). In fact let's go further. Suppose that polyphasic sleep lets you do away with sleep entirely (you just blink once every four hours). Suppose furthermore that a single dose of the new drug lets you do completely without sleep without any ill effects for the rest of your life. Now look at this question and try to answer it:
"in general, does extreme lack of sleep tend to have any significant negative impact?"
The correct answer is still "yes", because even if you add up the people who have taken the drug and who practice polyphasic sleep, they make up a small minority of the whole population. Since the statement begins with, "in general", it fails to be contradicted by a small minority.
As for polyphasic sleep and the drug, you can of course prove me wrong but as far as I know, the field of social science gets little if any credit for the discovery and ongoing investigation of polyphasic sleep. From what I have read, the main investigation seems to be done by individual self-experimenters. At least, that's what's made it to the social sites. Similarly, drugs are developed by and large by scientists in the fields of biology, chemistry, biochemistry, etc., not social science, and clinical trials are performed by and large by doctors, not social scientists.
comment by Stuart_Armstrong · 2007-08-19T11:08:52.000Z · LW(p) · GW(p)
Eliezer, I don't have any contribution to make to the conversation. I just want to tell you that the last 10 or so posts from you have absolutely blown me out of my socks.
I agree - very impressive series of articles. Still just about clinging to my own socks, but they're definitely trying to get away.
comment by pdf23ds · 2007-08-19T18:33:58.000Z · LW(p) · GW(p)
What's with the nitpicking, Constant? Of course the general question has the same answer, and I'm not sure what you're trying to prove by asserting your own familiarity with the phenomena I mentioned. And I don't really have any interest in the relationship between polyphasic sleep and the social sciences--why would you think I did?
The reason I thought it odd is that there are other obvious questions that are not even close to being associated with interesting open issues in science, and yet you chose this one. Probably you weren't even thinking of those complications--they weren't relevant to your point.
comment by Nick_Tarleton · 2007-08-19T19:09:00.000Z · LW(p) · GW(p)
Thirded - Eliezer, your posts on this blog are some of the most impressive work I've ever read. The world needs more like you.
comment by Constant2 · 2007-08-20T19:08:21.000Z · LW(p) · GW(p)
"What's with the nitpicking, Constant? [...] they weren't relevant to your point"
I had - mistakenly as it turns out - assumed that you were obeying Grice's maxims. You were, by your own eventual admission, disobeying the maxim of relevance. That is why I misunderstood you.
comment by Snowyowl · 2011-01-27T17:22:50.064Z · LW(p) · GW(p)
Out of curiosity, which time was Yudkowsky actually telling the truth? When he said those five assertions were lies, or when he said the previous sentence was a lie? I don't want to make any guesses yet. This post broke my model; I need to get a new one before I come back.
Replies from: TheOtherDave, Perplexed↑ comment by TheOtherDave · 2011-01-27T17:33:02.843Z · LW(p) · GW(p)
You might find it a worthwhile exercise to decide what your current model is, first.
That is, how likely do you consider those five statements?
Once you know that, you can research the actual data and discover how much your model needs updating, and in what directions. That way you can construct a new model that is closer to observed data.
If you don't know what your current model is, that's much harder.
comment by TheatreAddict · 2011-07-08T13:32:28.903Z · LW(p) · GW(p)
I could only predict one out of five.
comment by Sophronius · 2011-11-05T20:19:03.865Z · LW(p) · GW(p)
Hm, when I first read those findings, I found the first three to be as expected, the fourth to be surprising (why would blacks want racist officers?), and for the fifth I found the result to make sense but considered that I would have thought the same if the opposite result had been found (soldiers don't want to abandon their friends in combat, but want to leave together afterwards). So this would seem to indicate a problem with my model, given that the findings were all false.
But, is it possible that in that demonstration, those specific findings were selected specifically because they were opposite to what people would expect? If that is the case, then my model still isn't really in error, because when examining the statements I had no real reason to believe that they were meant to fool me.
comment by TraderJoe · 2012-05-02T12:07:29.490Z · LW(p) · GW(p)
2 and 5 struck me as common sense. I see reasons for 5 to be reversed now that I know the result [yeah...], but I still don't understand why 2 is wrong. Not really the point of the question, but I do wonder...
Replies from: Rixie, BarbaraB↑ comment by Rixie · 2013-07-25T08:33:04.108Z · LW(p) · GW(p)
Maybe because Southerners were used to hot weather and didn't put any real effort into actively combatting the hot weather the way Northerners had to?
Replies from: Making_Philosophy_Better↑ comment by Portia (Making_Philosophy_Better) · 2023-05-06T23:47:48.429Z · LW(p) · GW(p)
Same.
I found four of the findings surprising (because they were either non-obvious, or a bit strange/implausible - education generally makes you more resilient, and why would black people want to hang out with racists who learned the wrong handling lessons, and while being discriminated against makes you less confident to ask for a promotion, you are likely to want it more until you see this working out badly for group members), but I 100 % bought the Southerners dealing better with the heat, and am deeply baffled that they did not.
You'd expect them to have a better biological resistance through prior hardening, more awareness of the danger, and more importantly, more knowledge on what to do.
When we had a heat wave in Northern Europe, we had immense loss of life, despite the fact that such temperatures are regularly exceeded in other countries without such consequences - because people had no idea that heat was dangerous, or how to deal with it. They had no AC installed. They did not know whether to keep windows open or closed. They did not adjust their water and salt intake. They had no adequate clothing. They had an imperfect understanding of ventilation and shade. They did not recognise signs of heat stroke or low blood pressure. They weren't concerned for babies and elderly people. They did not own sunscreen. Their work hours were set to work through lunch time. Etc. etc. Even if I were more scared of the tropical heat as a Northerner, I would still bet on Southerners doing much better.
Then again, maybe the high humidity turned it into an environment that acted differently than expected, so that the people learning about a new environment learned the right lessons, while those who thought it was familiar already were mal-adapted in some ways, so it evened out?
↑ comment by BarbaraB · 2014-01-05T09:35:17.178Z · LW(p) · GW(p)
I could not swallow the weather example either. Eventually, I looked it up in the article from Meyers: "Southerners were not more likely than Northerners to adjust to a tropical climate."
It sounds like the Northerners were not adapting better, but, rather, there was no difference between groups. If so, the word "opposite" is not fair in this context.
Replies from: elityre↑ comment by Eli Tyre (elityre) · 2019-09-10T18:49:16.047Z · LW(p) · GW(p)
In which case, TraderJoe and Rixie, good job at being appropriately confused!
comment by Danfly · 2012-05-02T12:59:02.276Z · LW(p) · GW(p)
This prompted a memory of something I read in one of my undergrad psychology books a few years ago, which is probably referencing the same study, though using two different examples and one the same as the above example (though the phrasing is slightly different). Here is the extract:
Hindsight (After-the-Fact understanding)
Many people erroneously believe that psychology is nothing more than common sense. "I knew that all along!" or "They had to do a study to find that out?" are common responses to some psychological research. For example, decades ago a New York Times book reviewer criticized a report titled The American Soldier (Stouffer et al., 1949a,1949b), which summarized the results of a study of the attitudes and behavior of U.S. soldiers during World War II. The reviewer blasted the government for spending a lot of money to "tell us nothing we don't already know."
1. Compared to White soldiers, Black soldiers were less motivated to become officers.
2. During basic training, soldiers from rural areas had higher morale and adapted better than soldiers from large cities.
3. Soldiers in Europe were more motivated to return home while the fighting was going on than they were after the war ended.
You should have no difficulty explaining these results. Typical reasoning might go something like this: (1) Due to widespread prejudice, Black soldiers knew that they had little chance of becoming officers. Why should they torment themselves wanting something that was unattainable? (2) It's obvious that the rigors of basic training would seem easier to people from farm settings, who were used to hard work and rising at the crack of dawn. (3) Any sane person would have wanted to go home while bullets were flying and people were dying.
Did your explanations resemble these? If so, they are perfectly reasonable. There is one catch, however. The results of the actual study were the opposite of the preceding statements. In fact, Black soldiers were more motivated than White soldiers to become officers, city boys had a higher morale than farm boys during basic training, and soldiers were more eager to return home after the war ended than during the fighting. When told these actual results, our students quickly found explanations for them. In short, it is easy to arrive at reasonable after-the-fact explanations for almost any result.
Source: Passer, M. W. & Smith, R. E. (2007) Psychology: The Science of Mind and Behavior (Third Edition). McGraw-Hill: Boston, pages 31-32
In hindsight, I guess I must have known that it would be a good idea to hang on to my undergrad textbooks. Or did I?
Replies from: Polymath↑ comment by Polymath · 2013-06-13T23:47:23.339Z · LW(p) · GW(p)
I smelled a rat immediately and decided to evaluate all five statements as if they had been randomly replaced with their opposites, or not. All five sounded wrong to me; I could think of rationalizations on each side, but the rationalizations for the way they were actually presented sounded more forced.
comment by [deleted] · 2012-11-04T20:38:36.509Z · LW(p) · GW(p)
I believed the first two, one out of personal experience and the other out of System 1. I guessed that as a soft, water-fat intellectual, I'd have more trouble adjusting to a military lifestyle than someone who's actually been in a fight in his life. And that people from warmer climes deal with warmer temperatures more easily, well, I guess I believe people adapt to their circumstances. People from a warmer climate might sweat more and drink more water, or use less energy to generate less heat, whereas a man in Siberia might move more than is strictly necessary to keep his body temperature stable.
The other three are in subjects I know nothing about, and therefore I couldn't have predicted them. A wise man knows his limits...
Replies from: mjankovic↑ comment by mjankovic · 2012-11-22T17:23:59.798Z · LW(p) · GW(p)
I've had a nagging sense of wrongness about #1, not so much about #5, which were the two that I knew the truth about.
While it might be true that intellectuals have trouble adapting to military lifestyle, actual combat is a whole different animal in that respect. It is also different from the type of fighting that goes on in typical civilian life.
Other than that, why would you assume that intellectuals wouldn't be better predisposed to figuring out what they're supposed to do to stay alive and accomplish the mission? Particularly as they're more used to thinking than the average guy.
comment by Georgy_Kolotov · 2013-03-04T12:03:17.518Z · LW(p) · GW(p)
New here, so hoping for (a) an answer, even though it's been a long time and (b) some mercy if I'm completely wrong... :)
Correct me if I'm wrong, but no theory based on known materials could predict what would happen to a completely new material with unknown qualities. If someone would design Kryptonite, which under the same conditions turns into water, this theory would completely fail to predict this.
Of course, you could update your theory to include Kryptonite, but it still would not include Zeptonium, which under the same conditions gives out gamma-rays.
Ridiculous, yes, but no more so than the conditions which would lead Eliezer to believe 2+2=3...
comment by fractalman · 2013-05-29T02:05:10.302Z · LW(p) · GW(p)
After getting miffed by your plethora of retractions, I figure that someone, at some point, left out some statistical significance values.
Number 2 is the only one I'd currently be willing to bet on being correct, but now I'm thinking about how soldiers go through boot camp, weakening the effect, and maybe whoever set up the study forgot to make sure the observers didn't know whether they were observing a southern soldier or a northern soldier....
comment by Aitid · 2014-02-07T00:44:24.527Z · LW(p) · GW(p)
I read a few of the sequences now (including this one), and started to find that:
A: They were very interesting to read. B: They seemed a bit obvious, like common sense.
This was somewhat confusing, as things which really are obvious are very familiar and predictable, and thus, not so interesting to experience.
It's only my second time reading this (and first time reading the italics Exploring Social Psychology italics excerpt italics, that I've come to appreciate a link between sufficient explanation and illusion of obviousness.
So keep on dropping in hyper-links to previous sections like you do, they're really helpful.
p.s. The help tables' section on italics was not quite so good, as I've refrained from editing the result in order to demonstrate.
comment by MarsColony_in10years · 2015-02-21T05:41:41.803Z · LW(p) · GW(p)
First time around (with hindsight bias) I answered F, T, F?, T, T
Second time around (without hindsight bias) I answered F, T, T?, T, F
1. Educated people are more likely to agree with authority, because they have spent more time being conditioned to obey teachers. I generalized that they might also take orders more easily in the military, and conform to the circumstances more easily, despite "rough and tumbled" blue collar stereotypes.
2. I thought it would be difficult to measure, but that there would be some small but probably measurable advantage.
3. I couldn't really tell. At first my guess was based on the idea that repressed people are more motivated and fight harder. When I reversed my decision, it was because I put a higher weight on the situation being similar to women in business, where men are more likely to rock the boat and ask for a higher salary or promotions than women are. In retrospect, both of these are attempts to confirm a hypothesis, rather than to disprove it.
4. People often favor members of their own group. Unless a majority of southern blacks dislike a majority of southern whites, rather than just being indifferent, I hypothesized that they would relate to them more easily as fellow southerners.
5. Initially, I presumed what I thought was the simple and obvious answer, that people would avoid stress. After that, I recalled that adrenalin, deep bonds of sharing an experience, and a sense of purpose play a big part, and that boredom may actually be a bigger factor since soldiers wouldn't want to abandon their countrymen.
It's hard to describe the different sensations of the two thought processes. I think it was harder to put in as much effort when I was actively suspending my disbelief. I was just going through the motions. The second time, I was really unsure, and took a deeper look. Or maybe I would have taken a deeper look if I had reexamined it under some other pretense.
I wasn't sure whether the recession statement was true. I believe it is, but I'm not sure it would be for a full scale depression, since if people are earning less then they won't be able to be more thrifty, because they will always need to eat and afford the basic necessities.
comment by stripey7 · 2015-06-12T18:25:25.656Z · LW(p) · GW(p)
I "got" 2/5 of the above, before reading they were inverted.
When I took a psychology survey course in college, Dr. John Sabini gave a lot of attention to social psychology experiments, and much of the class was very surprised at their results; they didn't say they "would have predicted them." Of course Sabini may have been cherry-picking results that were likely to surprise. But I've seen it claimed elsewhere that social psychologists in the '60s were largely preoccupied with producing results that would grab a lot of attention by being counterintuitive.
comment by wafflepudding · 2015-09-26T01:58:07.842Z · LW(p) · GW(p)
This hurts my image of Freud. Of course, after I have a dream about skyscrapers, he can explain that it's connected to my love of my phallus, but could he predict my love of my phallus based on a dream about skyscrapers?
comment by 1point7point4 · 2019-12-31T21:03:32.602Z · LW(p) · GW(p)
The link to Meyer's excerpt has been dead for two years, here's an archived link: https://web.archive.org/web/20170801042830/http://csml.som.ohio-state.edu:80/Music829C/hindsight.bias.html
Replies from: habryka4↑ comment by habryka (habryka4) · 2019-12-31T18:51:26.792Z · LW(p) · GW(p)
Fixed, thanks!
comment by tmercer · 2022-07-03T05:39:14.713Z · LW(p) · GW(p)
Strongly believed the reverse on 1 and 4, and had very little belief either way on the rest. But it was enough that I began to suspect they were all false, perhaps also the big white space beneath it tipped off my subconscious to such a possibility. Can't find the paper on sci-hub. What are the answers?
comment by KingOfMadPistoleros · 2023-02-26T11:03:15.504Z · LW(p) · GW(p)
Interesting experience: I attempted to read the sequences ~10 years ago but kept getting sidetracked and put out of order by clicking all the links. This time, I decided to try again, but forced myself to read each post in order. All this to say that I read this post chronologically close to after reading "your strength as a rationalist". I can't evaluate how relevant this fact is, but I had alarm bells ringing in my head when reading the statements 3, 4, and 5. 4 especially was so incoherent to my model that I immediately thought there had to be a trick. Basically, my model:
- could argue either side for 1 with equal probability
- gave a >70% probability to 2
- would have predicted <20% for 3
- <5% for 4
- <30% for 5
I'm not certain of how I did the first time I read this post, but I'm quite certain I didn't do as well. So I'm wondering if I've gotten stronger or if it's due to the reading order.