Link: "When Science Goes Psychic"

post by Tesseract · 2011-01-08T09:00:30.761Z · LW · GW · Legacy · 15 comments

A major psychology journal is planning to publish a study that claims to present strong evidence for precognition. Naturally, this immediately stirred up a firestorm. There are a lot of scientific-process and philosophy-of-science issues involved, including replicability, peer review, Bayesian statistics, and degrees of scrutiny. The Flying Spaghetti Monster makes a guest appearance.

Original New York Times article on the study here.

And the Times asked a number of academics (including Douglas Hofstadter) to comment on the controversy. The discussion is here.

I, for one, defy the data.


Comments sorted by top scores.

comment by Manfred · 2011-01-08T12:54:27.796Z · LW(p) · GW(p)

One lesson of the common misuse of statistics is to not "defy the data" until you're sure what it says.

Here's an important reply cited in the other threads:

Does psi exist? In a recent article, Dr. Bem conducted nine studies with over a thousand participants in an attempt to demonstrate that future events retroactively affect people’s responses. Here we discuss several limitations of Bem’s experiments on psi; in particular, we show that the data analysis was partly exploratory, and that one-sided p-values may overstate the statistical evidence against the null hypothesis. We reanalyze Bem’s data using a default Bayesian t-test and show that the evidence for psi is weak to nonexistent.
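As a rough illustration of the abstract's point about one-sided p-values, here is a minimal sketch assuming a test statistic that is approximately standard normal under the null. The z value is made up for illustration and is not taken from Bem's data. It shows how the same result can fall under the conventional 0.05 threshold one-sided and miss it two-sided.

```python
import math

def normal_p_values(z):
    """Return (one_sided, two_sided) p-values for a standard-normal z statistic."""
    # Upper-tail probability P(Z >= z) via the complementary error function.
    one_sided = 0.5 * math.erfc(z / math.sqrt(2))
    # Two-sided probability P(|Z| >= |z|), exactly twice the tail beyond |z|.
    two_sided = math.erfc(abs(z) / math.sqrt(2))
    return one_sided, two_sided

# Hypothetical z statistic, chosen only to illustrate the gap.
one_sided, two_sided = normal_p_values(1.8)
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

For a positive z the one-sided value is exactly half the two-sided one, so a borderline result looks more impressive when reported one-sided.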

comment by gwern · 2011-01-09T19:51:56.797Z · LW(p) · GW(p)

I read the NYT link yesterday or something, and IIRC, they mention somewhere that the statisticians had already found major flaws - like that. I'm a little surprised anyone feels a need to 'defy the data'.

comment by Morendil · 2011-01-08T09:39:32.574Z · LW(p) · GW(p)

Hofstadter's response is quite to the point.

I also liked "No Sacred Mantle" - I believe we should promote more widely some basic techniques for critical reading of science-related claims. I recently pushed out an essay on "Fact and folklore in software engineering" that went ridiculously viral thanks to HN, even though I'll be the first to admit it's not my clearest writing.

I think there's a lot of pent-up demand for things like "how to read a popular article reporting on a science fact", "how to read a scientific paper in a field you don't know", etc. No surprise there - after all, in terms of built-in preferences, rationality is primarily about defending yourself from "facts" that you shouldn't accept.

comment by Perplexed · 2011-01-08T15:57:41.921Z · LW(p) · GW(p)

I think there's a lot of pent-up demand for things like "how to read a popular article reporting on a science fact", "how to read a scientific paper in a field you don't know", etc.

I'd like to see that. Or, rather than a how-to synthesis, how about some relatively raw data? A series of postings linking to scientific articles which got some initial positive play in the popular press, but were later convincingly critiqued/debunked in the blogosphere.

Good science is all alike. Each example of bad science may be bad in its own individual way. (HT to LT).

comment by cousin_it · 2011-01-08T15:00:11.525Z · LW(p) · GW(p)

Institut Agile? So advocacy for "agile practices" is your day job? Now I understand why our earlier conversation about TDD went so weirdly.

comment by Morendil · 2011-01-08T18:12:51.157Z · LW(p) · GW(p)

How's that?

The implication seems to be that my job makes me biased about the topic. If so, that's precisely the wrong conclusion to draw.

The job isn't just advocacy, it's also (at the moment mostly) research and, where necessary, debunking of Agile. (For instance, learning more about probability theory has made me more skeptical of "planning poker".)

Prior to creating that job from scratch (including getting private funding to support my doing that job full-time), I'd supported myself by selling consulting and training as an expert on Scrum, Extreme Programming and Agile.

Institut Agile was the result of a conscious decision on my part to move to a professional position where I'd be able to afford a more rational assessment of the topic. For instance, I'm compiling an extensive bibliography of the existing empirical studies published that have attempted to verify the claimed benefits of TDD, and reviews and meta-analyses of these studies.

I'm quite interested in thoughtful critiques of TDD, provided that such criticism is expressed from a position of actually knowing something about the topic, or being willing to find out what the claims concerning TDD actually are.

To use a well-known form, if TDD works I desire to believe that TDD works, and if TDD doesn't work I desire to believe that it doesn't work.

From my point of view, our earlier conversation about TDD went weirdly because your responses stopped making sense to me starting from this one. For a while I attempted to correct for misunderstanding on my part and glean more information from you that could potentially change my mind, until that started looking like a lost cause.

comment by cousin_it · 2011-01-08T19:59:02.309Z · LW(p) · GW(p)

For instance, I'm compiling an extensive bibliography of the existing empirical studies published that have attempted to verify the claimed benefits of TDD, and reviews and meta-analyses of these studies.

Is it available online?

comment by Morendil · 2011-01-08T20:49:29.266Z · LW(p) · GW(p)

Do you always answer a question with another question?

I'm planning to make it available on the Institut Agile group on Mendeley. It's intended to cover the entire set of agile practices; for instance, what's relevant to TDD consists of the tags "bdd", "tdd", "unittest" and "refactoring".

What is there right now is a subset only, though - I'm feeding the online set from a local BibDesk file which is still growing. (I'm also having some issues with the synchronization between the local file and Mendeley - there are some duplicates in the online set right now.) So that only has two articles tagged "tdd" proper.

The local file has 68 papers on 10 practices. Of these, 6 are tagged "tdd", 4 "unittest", 13 "refactoring", and 2 "bdd". There are additional citations that I haven't yet copied over in the two most recent surveys of the topic that I've come across: one is a chapter on TDD in the O'Reilly book "Making Software: What Really Works, and Why We Believe It", the other is an article by an overlapping set of authors in the winter (I think) issue of IEEE Software.

Another source is the set of proceedings of the Agile conference in the US, and the XP conference in Europe, which have run for about 10 years each now. Most of the articles from these are behind paywalls (at IEEE and Springer respectively), but I'm hoping to leverage my position as a member of the Agile Alliance board to set them free.

Anyway, that work is my answer to Hamming's questions. It may not be the best answer, but I'm happy enough that I do have an answer.

comment by cousin_it · 2011-01-10T18:07:44.649Z · LW(p) · GW(p)

Yep, paywalls... :-(

comment by nhamann · 2011-01-08T20:06:15.028Z · LW(p) · GW(p)

I can't believe Hofstadter (or anyone, really) is arguing that Bem's paper should not have been published. The paper was, presumably, published because peer-reviewers couldn't find substantial flaws in its methodology. This speaks more about the nature of standard statistical practices in psychology, or at least about the peer-review practices at the journal in which Bem's paper was published, which is useful information in either case.

comment by Oscar_Cunningham · 2011-01-08T11:48:36.096Z · LW(p) · GW(p)

N.B. It's Daryl Bem again.

comment by Kevin · 2011-01-08T12:29:41.020Z · LW(p) · GW(p)

Yup, it's news because it is actually being published now. The previous discussion was about the preprint paper, which was only discussed in obscure academic and academic-like circles...

comment by Vladimir_Nesov · 2011-01-08T12:09:28.837Z · LW(p) · GW(p)

I have a bizarre nitpick.

Hofstadter writes:

We all have heard it claimed that 13 is an "unlucky number." [...]

There would be no plausible scientific mechanism to make any such things happen, unless there were an extremely bizarre kind of intelligence ruling our universe -- and this would go so profoundly against the laws of physics as we know them that our entire scientific worldview would be toppled if such a pattern involving the number 13 were true. [...]

The point is that such an article would not only be about the luckiness or unluckiness of the number 13, but it would also necessarily (if only implicitly) be about the entire nature of the universe we live in, since if 13 really were provably unlucky in any sense at all, then all bets would be off about the laws of physics, for the laws of physics are simply not compatible with such a finding.

Why add the detail of incompatibility with the laws of physics? The predictions made from laws of physics alone (in 13-ish situations), and so the model conferred by the laws of physics, would err if there was this additional bizarre factor, but among all this confusion one can't make conclusive statements about the laws of physics. The strangeness could well be implemented within physics as we know it.

comment by [deleted] · 2011-01-09T05:11:48.766Z · LW(p) · GW(p)

you believe deeply in science and this deep belief implies that the article is necessarily, certainly, undoubtedly wrong in some fashion, and that the flaws in it should be found and exposed, rather than publishing it prematurely. [...] There has to be a common sense cutoff for craziness, and when that threshold is exceeded, then the criteria for publication should get far, far more stringent.

The charitable interpretation of Hofstadter's comment is that the likelihood of 13-being-unlucky is so low that we should look much harder for flaws in the arguments of papers purporting to prove it than we would for less controversial papers. He seems to be suggesting that a more rigorous review would have meant the paper would not be published, or at least not published 'prematurely'. Sounds sensible.

comment by Lightwave · 2011-01-09T13:02:45.057Z · LW(p) · GW(p)

the likelihood of 13-being-unlucky is so low that we should look much harder for flaws in the arguments

A.K.A. "extraordinary claims require extraordinary evidence". :)
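The slogan can be sketched as the odds form of Bayes' rule: posterior odds equal prior odds times the likelihood ratio (Bayes factor). The function name and the numbers below are illustrative assumptions, not figures from the Bem paper or its critiques.

```python
def posterior_probability(prior, bayes_factor):
    """Update a prior probability by a Bayes factor using the odds form of Bayes' rule."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical numbers: an ordinary claim versus an extraordinary one,
# both supported by evidence that favors the claim 100 to 1.
ordinary = posterior_probability(0.5, 100.0)
extraordinary = posterior_probability(1e-6, 100.0)
print(f"ordinary claim:      {ordinary:.4f}")
print(f"extraordinary claim: {extraordinary:.6f}")
```

With the same evidence, the claim that started at even odds ends up very probable, while the one that started at one in a million remains very improbable. That is why a low prior calls for much stronger evidence.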