How to convince Y that X has committed a murder with >0.999999 probability?

post by Colin Tang (colin-tang) · 2020-05-19T22:55:26.191Z · LW · GW · 3 comments

This is a question post.


Suppose X has murdered someone with a knife, and is being tried in a courthouse. Two witnesses step forward and vividly describe the murder. The fingerprints on the knife match X's fingerprints. In fact, even X himself confesses to the crime. How likely is it that X is guilty?

It's easy to construct hypotheses in which X is innocent, but which still fit the evidence. E.g. X has an enemy, Z, who bribes the two witnesses to give false testimony. Z commits the murder, then plants X's fingerprints on the knife (handwave; assume Z is the type of person who will research and discover methods of transplanting fingerprints). X confesses to the murder he did not commit because of a plea deal.

Is there any way to prove to Y (a single human) that X has committed the murder, with probability > 0.999999? (Even if Y witnesses the murder, there's a >0.000001 chance that Y was hallucinating, or that the supposed victim is actually an animatronic, etc.)

Answers

answer by Dagon · 2020-05-20T19:22:32.933Z · LW(p) · GW(p)

People don't generally form beliefs with that level of precision. "Beyond a reasonable doubt" is the usual instruction, for exactly this reason. And the underlying belief is "appears likely enough that it's preferable to hold the person publicly responsible".

comment by Richard_Kennaway · 2020-05-20T21:57:33.609Z · LW(p) · GW(p)

Having sat on a jury (for a rather dull case of a failed burglary), I concur with this.

Jury confidentiality is taken seriously in the UK, so I can't comment on our deliberations, but the consensus was that it was him wot dunnit. He looked resigned rather than indignant when the verdict was read out, so with that and the evidence I'm as sure as I need to be that we got it right. I couldn't put a number on it, but 0.000001 is way smaller than a reasonable doubt.

answer by jimrandomh · 2020-05-22T04:02:33.613Z · LW(p) · GW(p)

Six nines of reliability sounds like a lot, and it's more than is usually achieved in criminal cases, but it's hardly insurmountable. You just need to be confident enough that, given one million similar cases, you would make only one mistake. A combination of recorded video and DNA evidence, with reasonably good validation of the video chain of custody and of the DNA evidence-processing lab's procedures, would probably clear this bar.
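A back-of-the-envelope sketch of how stacked evidence can clear six nines, assuming the failure modes are independent (the per-evidence error rates below are illustrative assumptions, not measured figures for real video or DNA procedures):

```python
# Toy calculation: independent lines of evidence reaching six nines.
# Both error rates are illustrative assumptions.
p_video_wrong = 1e-3  # chance the video evidence is misleading (assumed)
p_dna_wrong = 1e-4    # chance the DNA evidence is misleading (assumed)

# If the failure modes are independent, a wrongful conviction requires
# both to fail at once:
p_both_wrong = p_video_wrong * p_dna_wrong
print(p_both_wrong)                  # 1e-07
print(1 - p_both_wrong > 0.999999)   # True: clears the six-nines bar
```

Independence is the load-bearing assumption here; a single adversary who tampers with both the video and the lab sample breaks it.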

comment by Raemon · 2020-05-22T10:26:00.166Z · LW(p) · GW(p)

You just need to be confident enough that, given one million similar cases, you would make only one mistake.

This still seems crazy confident to me, though. I do think there are hypothetical people who could do it, but I don't currently have strong reason to believe there actually exist even trained rationalists who could do it, even if they were extremely careful every single time.

Given a million evaluations of the video chain-of-custody or DNA evidence, you expect there are people who would not make a mistake (or be actively deceived by an adversary, or forget to eat lunch and not notice they're tired) even twice?

Replies from: jimrandomh
comment by jimrandomh · 2020-05-22T20:48:31.485Z · LW(p) · GW(p)

If I sometimes write down a 6-nines confidence number because I'm sleepy, then this affects your posterior probability after hearing that I wrote down a 6-nines confidence number, but doesn't reduce the validity of 6-nines confidence numbers that I write down when I'm alert. The 6-nines confidence number is inside an argument, while your posterior is outside the argument.
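(To make the inside/outside distinction concrete, here is one way to run the arithmetic; every rate below is made up for illustration.)

```python
# My (outside-view) posterior on a claim, knowing the claimer is
# sometimes sleepy. All numbers are illustrative assumptions.
p_alert = 0.9                 # assumed fraction of claims made while alert
p_right_if_alert = 0.999999   # the stated confidence, valid when alert
p_right_if_sleepy = 0.99      # assumed accuracy of sleep-deprived claims

posterior = p_alert * p_right_if_alert + (1 - p_alert) * p_right_if_sleepy
print(posterior)  # ~0.999: about three nines, even though the
                  # alert-state number remains a valid six nines
```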

Replies from: Raemon
comment by Raemon · 2020-05-22T21:50:12.686Z · LW(p) · GW(p)

Not 100% sure I understand this.

My claim is "Basically everyone who writes down high-confidence claims is, by default, miscalibrated and mistaken. It should take extraordinary evidence both for me to believe your high-confidence claim is calibrated, and separately, for you to believe a high-confidence claim of yours is calibrated." (But I'd agree that you might have inside-view knowledge that makes you justifiably more confident than me.)

I do think there are types of things one could theoretically be 6-nine-confident about. (I'm probably that confident that I won't stumble on my next footstep? But that's because I've literally taken 1-3 million footsteps in my life.)

I think my nearest crux for this is "what is the actual world record for the number of independent high-confidence claims anyone has made? Is there anyone with a perfect record for a large number of... even 99.99% claims, let alone 6-nine-claims?" (If there were someone who'd gotten a hundred 99.99% claims correct with no failures, I'd elevate to attention "this person might be the sort of person who can make 99.9999% claims and possibly be justified.")

Do you think that overall reasoning is mistaken?
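(A quick sketch of how informative such a track record would be, assuming independent claims; the arithmetic is illustrative.)

```python
# How much does a clean streak of 100 correct claims prove?
print(0.9999 ** 100)   # ~0.990: a truly calibrated 99.99% claimer
                       # almost always survives 100 claims
print(0.99 ** 100)     # ~0.366: even a merely-99% claimer has a decent
                       # shot at the same clean streak, so the streak
                       # alone is weak evidence of 99.99% calibration
print(0.999999 ** 1_000_000)  # ~0.368: a calibrated 6-nines claimer
                              # still fails at least once per million
                              # claims about 63% of the time
```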

answer by Raemon · 2020-05-20T01:25:21.637Z · LW(p) · GW(p)

My short answer is "you probably can't." >0.999999 is just a lot of certainty. 

There might exist particularly well-calibrated humans who can have a justified >0.999999 probability in a given murder trial, but my guess is that most Well Calibrated People still probably sort of cap out in justified confidence at some point, based on what the human mind can reasonably process. After that, I think it makes less sense to think in terms of exact probabilities and more sense to think in terms of "real damn certain, enough that it's basically certain for practical purposes, but you wouldn't make complicated bets based on it."

(I'm curious what Well Calibrated Rationalists think is the upper bound of how certain they can be about anything.)

[Edit: yes, there are specific domains where you can fully understand a mathematical question, where you can be confident something won't happen apart from "I might be insane or very misguided about reality" reasons.]

comment by Richard_Kennaway · 2020-05-20T21:14:10.251Z · LW(p) · GW(p)

If I buy a ticket in the Euromillions lottery, I am over 0.99999999 sure I will lose. (There are more than 100 million possible draws.)
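(The draw count is easy to verify, assuming the 5-numbers-from-50 plus 2-stars-from-12 format:)

```python
import math

# EuroMillions: choose 5 main numbers from 50 and 2 stars from 12.
draws = math.comb(50, 5) * math.comb(12, 2)
print(draws)          # 139838160 -- about 140 million possible draws
print(1 - 1 / draws)  # ~0.999999993: losing really is that likely
```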

Replies from: Raemon
comment by Raemon · 2020-05-20T21:26:19.173Z · LW(p) · GW(p)

Yes, see my response to Dagon. But 0.99999999 seems overconfident to me. You have to account not only for "I might be insane" (what are the base rates on that?), but also for simpler things like "I misread the question or had a brain fart."

Like, there's an old LW chat log where someone claims they can be 99.999% confident about whether a low-digit number is prime. Then someone challenges them to answer "prime or not?" for ~100 numbers. And then like 25 questions in they get one wrong. 0.99999999 is Really God Damn Confident.

Replies from: Raemon, thomas-kwa, Richard_Kennaway
comment by Raemon · 2020-05-22T10:37:07.206Z · LW(p) · GW(p)

I was curious to re-read the chat log, and had to do some digging on archive.org to find it. The guy made 17 bets about numbers being prime, and lost the 17th.

Transcript here

Sequence article that referenced it here.

Interesting followup by Chris Hallquist here:

If it's not clear why this doesn't follow, consider the anecdote Eliezer references in the quote above, which runs as follows: A gets B to agree that if 7 is not prime, B will give A $100. B then makes the same agreement for 11, 13, 17, 19, and 23. Then A asks about 27. B refuses. What about 29? Sure. 31? Yes. 33? No. 37? Yes. 39? No. 41? Yes. 43? Yes. 47? Yes. 49? No. 51? Yes. And suddenly B is $100 poorer.

Now, B claimed to be 100% sure about 7 being prime, which I don't agree with. But that's not what lost him his $100. What lost him his $100 is that, as the game went on, he got careless. If he'd taken the time to ask himself, "am I really as sure about 51 as I am about 7?" he'd probably have realized the answer was "no." He probably didn't check the primality of 51 as carefully as I checked the primality of 53 at the beginning of this post. (From the provided chat transcript, sleep deprivation may have also had something to do with it.)

If you tried to make 10,000 statements with 99.99% certainty, sooner or later you would get careless. Heck, before I started writing this post, I tried typing up a list of statements I was sure of, and it wasn't long before I'd typed 1 + 0 = 10 (I'd meant to type 1 + 9 = 10. Oops.) But the fact that, as the exercise went on, you'd start including statements that weren't really as certain as the first statement doesn't mean you couldn't be justified in being 99.99% certain of that first statement.

I do think this is an important counterpoint. Still, while I agree that a person who actually thought carefully about each prime number would have made it much farther than a 1-out-of-17 failure rate, I'd still bet against them successfully making 10,000 careful statements without ever screwing up in some dumb way.
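(The arithmetic behind that bet, with assumed slip rates: even a tiny per-statement chance of a careless mistake compounds over 10,000 statements.)

```python
# Chance of at least one careless slip over 10,000 statements.
# The per-statement slip rates are illustrative assumptions.
for slip_rate in (1e-3, 1e-4, 1e-5):
    print(slip_rate, 1 - (1 - slip_rate) ** 10_000)
# 0.001  -> ~0.99995 (a slip is near-certain)
# 0.0001 -> ~0.632   (more likely than not)
# 1e-05  -> ~0.095   (still ~10% odds of at least one slip)
```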

Replies from: Ericf
comment by Ericf · 2020-05-22T14:02:45.127Z · LW(p) · GW(p)

Anecdata: In the mobile game Golf Rivals, it is trivial to sink a putt from any distance on the green with a little bit of care. I (and opponents) miss about 1 in 1000 times.

comment by Thomas Kwa (thomas-kwa) · 2020-05-21T00:00:33.125Z · LW(p) · GW(p)

+3 for the concrete example.

comment by Richard_Kennaway · 2020-05-20T22:08:48.347Z · LW(p) · GW(p)
Yes, see my response to Dagon. But 0.99999999 seems overconfident to me. You have to account not only for "I might be insane" (what are the base rates on that?), but also for simpler things like "I misread the question or had a brain fart."

Those could go either way.

Replies from: mark-xu, Raemon
comment by Mark Xu (mark-xu) · 2020-05-20T22:32:11.135Z · LW(p) · GW(p)

Not so. "X is guilty" is a very specific hypothesis and 0.99999999 is Very Confident, so general increases in uncertainty should make you think it's less likely that "X is guilty" is true. For example, if I'm told I misread the question, I now have non-trivial probability mass on other questions, and since I will not be 0.99999999 confident on nearly any of them, I should become less confident.

The result is that it takes a specific misreading to make you more confident; most misreadings will make you less confident, so in expectation you should become less confident.
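(As arithmetic, with made-up numbers: a small chance of having misread the question pulls the achievable confidence down hard.)

```python
# Mixture over "read the question correctly" vs "misread it".
# All numbers are illustrative assumptions.
p_read_ok = 0.99          # assumed chance I read the question correctly
conf_if_ok = 0.99999999   # my stated confidence, valid if I read it right
conf_if_misread = 0.9     # assumed average confidence over the other
                          # questions I might actually have been asked

confidence = p_read_ok * conf_if_ok + (1 - p_read_ok) * conf_if_misread
print(confidence)  # ~0.999: the 1% misreading chance dominates
```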

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2020-05-21T17:54:33.067Z · LW(p) · GW(p)

In the log-odds space, both directions look the same. You can wander up as easily as down.

I don't know what probability space you have in mind for the set of all possible phenomena leading to an error that would give a basis for saying that most errors will lie in one direction.

When I calculated the odds for the Euromillions lottery, my first calculation omitted the division that accounts for there being no ordering on the chosen numbers, giving a probability of winning that was too small by a factor of 240. The true value is about 140 million to 1.
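(That factor is just the number of orderings of the chosen balls; a quick check, assuming the 5-plus-2 format:)

```python
import math

# Ordered draws (the mistaken count): 50*49*48*47*46 * 12*11
ordered = math.perm(50, 5) * math.perm(12, 2)
# Unordered draws (the correct count): divide out the orderings
unordered = ordered // (math.factorial(5) * math.factorial(2))
print(ordered // unordered)  # 240 == 5! * 2!
print(unordered)             # 139838160, about 140 million to 1
```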

I have noted before that ordinary people, too ignorant to know that clever people think it impossible, manage to collect huge jackpots. It is literally news when they do not.

Replies from: mark-xu
comment by Mark Xu (mark-xu) · 2020-05-21T19:59:35.409Z · LW(p) · GW(p)

It's not a random walk among probabilities; it's a random walk among questions, which have associated probabilities. This results in a non-random walk downwards in probability.

The underlying distribution might be described best as "nearly all questions cannot be decided with probabilities that are as certain as 0.999999".

There is a difference between an "error in calculation" and an "error in interpreting the question". The former affects the result in a way that makes it roughly as likely to go up as down. If you err in interpreting the question, you're placing probability mass on other questions, which you are less than 0.999999 certain about on average. Roughly, I'm saying that you should expect regression-to-the-mean effects in proportion to the uncertainty. E.g. if I tell you I scored 90% on a test for which the average was 70%, then you expect me to score a bit lower on a test of equal difficulty. However, if I tell you that I guessed on half the questions, then you should expect me to score much lower than you would have if you'd assumed I guessed on none.
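(The test example as arithmetic, with assumed numbers:)

```python
# Expected score on the next test of equal difficulty, given how much
# of the last score came from guessing. All numbers are assumptions.
skill = 0.95   # assumed accuracy on questions I genuinely knew
guess = 0.25   # assumed accuracy on blind guesses (four options)

print(skill)                      # 0.95 expected if I guessed on none
print(0.5 * skill + 0.5 * guess)  # 0.60 expected if I guessed on half
```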

I don't know why the last comment is relevant. I agree that 1 in a million odds happen 1 in a million times. I also agree that people win the lottery. My interpretation is that it means "sometimes people say impossible when they really mean extremely unlikely", which I agree is true.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2020-05-21T21:51:33.786Z · LW(p) · GW(p)
I don't know why the last comment is relevant. I agree that 1 in a million odds happen 1 in a million times. I also agree that people win the lottery. My interpretation is that it means "sometimes people say impossible when they really mean extremely unlikely", which I agree is true.

The point was not that people win the lottery. It's that when they do, they are able to update against the over 100 million-to-one odds that this has happened. "No, no," say the clever people who think the human mind is incapable of such a shift in log-odds, "far more likely that you've made a mistake, or the lottery doesn't even exist, or you've had a hallucination." The clever people are wrong.

Replies from: Ericf
comment by Ericf · 2020-05-22T14:07:28.737Z · LW(p) · GW(p)

Anecdata: people who win large lotteries often express verbal disbelief, and ask others to confirm that they are not hallucinating. In fact, some even express disbelief while sitting in the mansion they bought with their winnings!

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2020-05-23T12:12:05.148Z · LW(p) · GW(p)

And yet, despite saying "Inconceivable!" they did collect their winnings and buy the mansion.

Replies from: Ericf
comment by Ericf · 2020-06-11T15:04:57.633Z · LW(p) · GW(p)

Right, but they don't update to that from a single data point (looking at the winning numbers and their ticket once); they seek out additional data until they have enough subjective evidence to update to the very, very unlikely event (and they are able to do this because the event actually happened). Probably hundreds of people think they won any given lottery at first, but when they double-check, they discover that they did not.

comment by Raemon · 2020-05-20T23:19:19.639Z · LW(p) · GW(p)

Seems like what matters is "if you make 1,000,000 claims that you're 0.999999 confident in, will you be right 999,999 times?" Yes, insanity and brain farts could go in any direction, but they go in sufficiently many directions (at least two) that I bet if you try to make a hundred 99.9999%-confidence claims you'll screw up at least once.

comment by Dagon · 2020-05-20T19:34:42.567Z · LW(p) · GW(p)

Even if you include esoteric options, like being a Boltzmann brain, you can have negatives with way more probability than 999999/1000000. It's EASY to be more certain than that about "will I fail to win the next Powerball drawing". And more certain still about "did I fail to win the previous Powerball drawing".

Some recursive positives approach 1: "I exist". Tautologies remain actually 1: P -> P.

But for random human-granularity events where you have only very indirect evidence, you're right: 99% would be surprising; 95% would take a fair bit of effort.

Replies from: Raemon
comment by Raemon · 2020-05-20T19:38:47.176Z · LW(p) · GW(p)

Yeah, I agree there are domains where you can be more confident because you fully understand the domain (and then only have to account for model uncertainty in "I'm literally insane or in a simulation or whatever.")

comment by Colin Tang (colin-tang) · 2020-05-20T04:11:49.637Z · LW(p) · GW(p)

How can you tell what your own limits are?

Replies from: Raemon
comment by Raemon · 2020-05-20T19:41:17.092Z · LW(p) · GW(p)

I would start by trying to get calibrated generally, using something like the credence game. (You will probably start out not even able to be reliably confident in 90%-likely statements.)

I think there might be a better variant of the game available somewhere, built by someone in the past few years, but this is what I could easily remember.
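(For concreteness, a minimal sketch of what a calibration check does; the function and data format here are hypothetical, not from any particular tool.)

```python
# Hypothetical minimal calibration check: bucket answers by stated
# confidence and compare against observed accuracy.
from collections import defaultdict

def calibration_report(answers):
    """answers: list of (stated_confidence, was_correct) pairs."""
    buckets = defaultdict(list)
    for conf, correct in answers:
        buckets[round(conf, 1)].append(correct)
    for conf in sorted(buckets):
        hits = buckets[conf]
        print(f"stated {conf:.0%}: actual {sum(hits) / len(hits):.0%} "
              f"over {len(hits)} answers")

calibration_report([(0.9, True), (0.9, True), (0.9, False),
                    (0.6, True), (0.6, False)])
```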

comment by Colin Tang (colin-tang) · 2020-05-20T04:06:50.024Z · LW(p) · GW(p)

What's an example of a complicated bet that you shouldn't take even if you're real damn certain?

Replies from: Dagon, ChristianKl
comment by Dagon · 2020-05-20T19:37:13.534Z · LW(p) · GW(p)

Most of them. The fact that you're being offered an unusual bet is itself evidence that you're wrong. The Guys and Dolls quote applies pretty widely:

One of these days in your travels, a guy is going to show you a brand-new deck of cards on which the seal is not yet broken. Then this guy is going to offer to bet you that he can make the jack of spades jump out of this brand-new deck of cards and squirt cider in your ear. But, son, do not accept this bet, because as sure as you stand there, you're going to wind up with an ear full of cider.
comment by ChristianKl · 2020-05-20T18:16:56.078Z · LW(p) · GW(p)

I'm pretty certain that an asteroid won't destroy human civilisation next year, but I still want better asteroid defense (which is mostly more surveillance in our solar system).

answer by Teerth Aloke · 2020-05-21T11:57:16.390Z · LW(p) · GW(p)

This problem is known in the philosophy of science as the underdetermination problem: multiple hypotheses can fit the data. If we don't assign a priori probabilities to hypotheses, we will never reach a conclusion. For example, compare hypothesis (a), that Stephen Hawking lived until 2018, against (b), that there was a massive conspiracy by his relatives and friends to fake his existence after his death in 1985. (That was an actual conspiracy theory.) No quantity of evidence can refute the second theory; we can always increase the number of conspirators. The only reason we choose (a) over (b) is the implausibility of (b).
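(In Bayesian terms, a sketch with made-up numbers: if the conspiracy can always be patched to predict the same evidence, the likelihood ratio is 1 and the posterior simply reproduces the prior.)

```python
# Two hypotheses that predict the observed evidence equally well.
# Priors and likelihoods are illustrative assumptions.
prior_a = 0.999999   # (a) Hawking lived until 2018
prior_b = 0.000001   # (b) an ever-growing conspiracy faked it
lik_a = 0.99         # P(evidence | a)
lik_b = 0.99         # P(evidence | b): the conspiracy is tuned to fit

posterior_a = prior_a * lik_a / (prior_a * lik_a + prior_b * lik_b)
print(posterior_a)   # 0.999999 -- identical likelihoods move nothing;
                     # only the prior (implausibility) decides
```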

answer by Richard_Kennaway · 2020-05-20T21:50:01.433Z · LW(p) · GW(p)

If X has confessed, how can he be on trial?

comment by Ericf · 2020-05-22T14:28:26.072Z · LW(p) · GW(p)

1. X confesses to police, but later claims that the confession was coerced, and asks for a trial.

2. X confesses to some part of the crime ("I was holding the knife that penetrated the deceased") but not all of it ("but I was sleepwalking at the time, so it's not murder" or "but I was in a jealous rage at the time, so it's not premeditated").

3. X confesses, but the prosecutor believes that other people were involved (regardless of the status of X) and is holding a joint trial for all the accused.

3 comments

Comments sorted by top scores.

comment by Dagon · 2020-05-22T20:45:44.373Z · LW(p) · GW(p)

By the way, it's an underrated aspect of Bayesianism that encountering the question _is_ evidence. The prior for even having a trial on a topic that could approach certainty is extremely low. If the evidence existed that would get you even to 99%, they'd bargain out of having a trial.
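(A sketch of that selection effect, with assumed rates:)

```python
# If near-certain cases mostly plead out, reaching trial is itself
# evidence against near-certainty. All rates are assumptions.
p_cert = 0.3            # assumed fraction of charged cases that are
                        # near-certain on the evidence
p_trial_if_cert = 0.05  # assumed: near-certain cases mostly plead out
p_trial_if_not = 0.5    # assumed: uncertain cases go to trial more often

p_cert_given_trial = (p_cert * p_trial_if_cert) / (
    p_cert * p_trial_if_cert + (1 - p_cert) * p_trial_if_not)
print(p_cert_given_trial)  # ~0.04: conditional on a trial happening,
                           # near-certainty is much less likely
```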

Replies from: Undredd
comment by Undredd · 2020-05-26T23:32:13.321Z · LW(p) · GW(p)

I am a prosecutor.

Yes, trials occur when there are viable outcomes on all sides. But also:

1. One of the people involved in a criminal plea negotiation is irrational (prosecutor, defense attorney, defendant). Defendants sometimes go to trial on crushing cases. Attorneys may want their clients to take a deal, and it doesn't happen.

2. There's no benefit to pleading for either side. If you've got a second-degree murder with a fixed-by-statute sentence and the defendant is stone cold good for it, the prosecution may not offer a deal for the defense to take, and then the defense gets a trial because there's no harm to the client.

In most jurisdictions, prosecutors win a very high percentage of trials because there are a lot of very good cases that go to trial.

Replies from: Dagon
comment by Dagon · 2020-05-27T14:38:36.469Z · LW(p) · GW(p)

Thanks for this!