When apparently positive evidence can be negative evidence
post by cata · 2022-10-20T21:47:37.873Z · LW · GW
This is a link post for https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4841483/
This is a thought-provoking journal article discussing cases where:
- You have something with two possible states, A and B, and you can perform fairly accurate, mostly independent tests which distinguish between A and B.
- However, there is some failure case with a low prior probability, in which the actual state may be B, but something confounds all your tests and makes them look like they favor A regardless.
- If you then perform many tests and they all favor A, the first few will update you toward A; but as the unbroken run of A-favoring results grows, further tests will actually start updating you toward B, because unanimity itself becomes evidence that you are in the failure case (see the numerical sketch after this list).
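A minimal numerical sketch of the effect, with illustrative numbers of my own (not taken from the paper): two equally likely states, tests that are 90% accurate, and a 1% prior on a failure mode in which every test reports A regardless of the true state.

```python
def posterior_after_unanimous(n, acc=0.9, p_fail=0.01, prior_a=0.5):
    """Posterior P(state is A) and P(failure mode), after n tests all favoring A."""
    # P(n unanimous A-results | state), marginalizing over the failure mode,
    # which makes every test report A regardless of the true state:
    like_a = (1 - p_fail) * acc ** n + p_fail          # true state is A
    like_b = (1 - p_fail) * (1 - acc) ** n + p_fail    # true state is B
    evidence = prior_a * like_a + (1 - prior_a) * like_b
    return prior_a * like_a / evidence, p_fail / evidence

for n in (1, 3, 5, 10, 30):
    p_a, p_f = posterior_after_unanimous(n)
    print(f"n={n:2d}  P(A) = {p_a:.3f}  P(failure mode) = {p_f:.3f}")
```

With these numbers, the posterior on A peaks around the third unanimous test (about 0.985) and then drifts back toward the prior, while the posterior on the failure mode climbs the whole way.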
An interesting anecdote which I haven't verified:
Under ancient Jewish law one could not be unanimously convicted of a capital crime—it was held that the absence of even one dissenting opinion among the judges indicated that there must remain some form of undiscovered exculpatory evidence.
Via Hacker News.
Comments
comment by localdeity · 2022-10-20T21:54:45.739Z · LW(p) · GW(p)
The classic example fitting the title, which I learned from a Martin Gardner article (I think he cited it from some 1800s person), is: "Hypothesis: No man is 100 feet tall. Evidence: You see a man who is 99 feet tall. Technically, that evidence does fit the hypothesis, but probably after seeing that evidence you would become much less confident in the hypothesis."
Basically, it can all be interpreted as having multiple competing theories (e.g. "the tallest human ever was slightly under 9 feet, human heights follow a certain distribution, your heart probably wouldn't even be able to circulate blood that far, etc." vs. "something very, very weird and new that breaks my understanding is happening") and weighing the evidence between them in a Bayesian way.
comment by cata · 2022-10-20T22:00:12.831Z · LW(p) · GW(p)
From the HN comments:
If my test suite never ever goes red, then I don't feel as confident in my code as when I have a small number of red tests.
That seems like an example of this that I have definitely experienced, where A is "my code is correct", B is "my code is not correct", and the failure case is "my tests appear to be exercising the code but actually aren't."
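One hypothetical way to land in that failure case (my illustration, not from the thread): a data-driven test whose case list silently comes back empty, so the suite stays green without ever exercising the code.

```python
import unittest

def parse(s):
    return s.strip()

def load_cases():
    # Suppose a bug (wrong path, bad glob, ...) makes this return nothing.
    return []

class TestParser(unittest.TestCase):
    def test_parse_all_cases(self):
        for inp, expected in load_cases():
            self.assertEqual(parse(inp), expected)
        # With load_cases() empty, the loop body never runs: the test
        # passes no matter what parse() does.

if __name__ == "__main__":
    unittest.main()
```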