Probability updating question - 99.9999% chance of tails, heads on first flip

post by nuckingfutz · 2011-05-16T00:58:33.790Z · LW · GW · Legacy · 17 comments

This isn't intended as a full discussion; I'm just a little fuzzy on how a Bayesian update (or any other kind of probability update) would work in this situation.

You have a coin with a 99.9999% chance of coming up tails, and a 100% chance of coming up either tails or heads.

You've deduced these odds by studying the weight of the coin. You are 99% confident of your results. You have not yet flipped it.

You have no other information before flipping the coin.

You flip the coin once. It comes up heads.

How would you update your probability estimates?

(This isn't a homework assignment; rather, I was discussing with someone how strong the anthropic principle is. Unfortunately, my mathematical abilities aren't quite up to assembling this into a form I can work with.)

17 comments

Comments sorted by top scores.

comment by Cyan · 2011-05-16T01:09:10.997Z · LW(p) · GW(p)

You are 99% confident of your results.

What does this mean?

If it's a probability, what gets the other 1%?

comment by MinibearRex · 2011-05-16T04:00:38.338Z · LW(p) · GW(p)

Here's the thing: in Bayesian updating, probability mass moves from one hypothesis to another. That means you need another hypothesis. For Bayes' theorem, you need three terms: the prior probability of your hypothesis (99%), the probability of seeing the experimental result assuming your hypothesis is true (0.0001%), and the probability of seeing the experimental result whether or not the hypothesis is true. You need another piece of information, and that will depend on your state of knowledge. For instance, the coin could be perfectly fair, slightly biased toward tails (40:60), slightly biased toward heads (60:40), 90:10 (in either direction), etc. You need another hypothesis, or probably a bunch of them.

If you want to make the math simple, you can just consider two possibilities: your hypothesis, and the null hypothesis (the coin is perfectly fair, 50:50). If that's the case:

P(A|X) = P(X|A)*P(A) / (P(X|A)*P(A) + P(X|N)*P(N))

P(A|X) = (.99)*(.000001) / ((.99)*(.000001) + (.5)*(.01))

which works out to: P(A|X) = 0.000197961... (Nota bene: Check my math. I haven't slept much recently, and I could easily have made any number of mistakes)

Again, this is a really simplistic calculation, because there are so many other plausible hypotheses, and I haven't studied quite enough probability theory to do all the calculations, even if I had more information.
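For concreteness, here is that two-hypothesis calculation as a quick numeric check (a minimal sketch in Python; the variable names just follow the A/N notation above):

```python
# Hypothesis A: the coin lands tails 99.9999% of the time (prior 0.99).
# Hypothesis N: the coin is fair (prior 0.01). Observation X: one flip, heads.
p_A, p_N = 0.99, 0.01
p_X_given_A = 0.000001  # P(heads) if A is true
p_X_given_N = 0.5       # P(heads) if N is true

p_X = p_X_given_A * p_A + p_X_given_N * p_N
p_A_given_X = p_X_given_A * p_A / p_X
print(p_A_given_X)  # ~0.000197961
```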

Replies from: DanielLC
comment by DanielLC · 2011-05-16T04:37:30.142Z · LW(p) · GW(p)

That's pretty much what I got. I used a probability distribution for the null hypothesis, but it still works out to 50% for the first flip.

comment by Emile · 2011-05-16T07:28:40.685Z · LW(p) · GW(p)

It seems some information is missing, so I'll try reformulating the problem my way:

A jar contains 1 fair coin (equal odds for heads and tails), and 99 unfair coins (99.9% chance of tails, 0.1% of heads). You pull out a coin, flip it, and it comes up heads. What are your expectations for the second flip?

P(fair & heads) = 0.01 * 0.5 = 0.005

P(unfair & heads) = 0.99 * 0.001 ~= 0.001

so P(heads on first throw) = 0.006

so P(fair | heads on first throw) = 5/6

so P(heads on second throw | heads on first throw) = 5/6 * 0.5 + 1/6 * 0.001 ~= 0.42

... at least, that's one way of interpreting "you are 99% confident in your results": I take the remaining 1% to mean "your analysis was completely wrong and the coin is just as likely to land on heads or tails". A more realistic situation would be one where your confidence is distributed among possible coins in coinspace, something like "90% confident of less than 0.01% odds for heads, 99% confident of less than 1% odds for heads, 99.9% confident of less than 10% odds for heads, etc.".
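The jar version is easy to check numerically (a minimal sketch in Python, using the coin probabilities stated above):

```python
# 1 fair coin (P(heads) = 0.5) and 99 unfair coins (P(heads) = 0.001).
p_fair, p_unfair = 0.01, 0.99
h_fair, h_unfair = 0.5, 0.001

p_heads_first = p_fair * h_fair + p_unfair * h_unfair  # ~0.006
p_fair_given_heads = p_fair * h_fair / p_heads_first   # ~5/6
p_heads_second = (p_fair_given_heads * h_fair
                  + (1 - p_fair_given_heads) * h_unfair)
print(p_heads_second)  # ~0.42
```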

Replies from: Nic_Smith
comment by Nic_Smith · 2011-05-16T17:31:57.608Z · LW(p) · GW(p)

I like the reinterpretation of the problem, but is

P(unfair & heads) = 0.99 * 0.0001 ~= 0.001

a typo? Just running the numbers through SpeedCrunch gives 0.99 * 0.0001 = 0.000099, and 0.000099 ~= 0.0001, which seems intuitively right as 0.99 is "almost" 1.

Replies from: Emile
comment by Emile · 2011-05-17T07:37:03.865Z · LW(p) · GW(p)

Augh! Serves me right for modifying the values while writing the comment and not thoroughly rechecking the calculations, thanks!

comment by DanielLC · 2011-05-16T01:19:51.041Z · LW(p) · GW(p)

Do you mean that you're 99% confident in your reasoning that it comes up tails 99.9999% of the time? If so, you'd be much less than 99.9999% sure of tails in the first place.

You generally use a beta distribution for coin flips. One beta distribution that would capture your certainty is alpha = 999999, beta = 1. Landing on heads would give you a posterior of alpha = 999999, beta = 2, which corresponds to a certainty of about 99.9998% of landing on tails.

My problem is that your confidence isn't well specified. If you could give me a standard deviation, that would work better. Also, with something like this, a beta distribution isn't actually a very good prior. The most likely reason for it to land on heads is that you messed up, and the true probability of tails is more like 99.9%, which would be crazy unlikely under the prior I gave.
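For reference, that update is just arithmetic on the beta parameters (a minimal sketch; alpha counts tails here, as above):

```python
# Beta(alpha, beta) prior over P(tails): alpha = 999999, beta = 1.
alpha, beta = 999999, 1
print(alpha / (alpha + beta))  # prior mean P(tails) = 0.999999

beta += 1  # observing one head increments the heads count
print(alpha / (alpha + beta))  # posterior mean P(tails) ~= 0.999998
```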

Replies from: timtyler, Cyan
comment by timtyler · 2011-05-16T16:59:44.455Z · LW(p) · GW(p)

Do you mean that you're 99% confident in your reasoning that it comes up tails 99.9999% of the time? If so, you'd be much less than 99.9999% sure of tails in the first place.

You can be 99.9999% sure of tails - and 99% confident of that - if you memorised your confidence but then subsequently could not remember for sure whether there were six "9"s - or maybe seven.

Replies from: DanielLC
comment by DanielLC · 2011-05-18T00:51:04.193Z · LW(p) · GW(p)

If there was a 99% chance that you remembered correctly, and a 1% chance that there was an extra nine, you'd be slightly more than 99.9999% confident of tails.

comment by Cyan · 2011-05-16T01:26:46.651Z · LW(p) · GW(p)

I think the idea is to have a point mass at 0.999999 containing 0.99 of the prior probability. My intuition is that it would behave more like a Beta(0,1) than a Beta(1,999999).

Replies from: DanielLC
comment by DanielLC · 2011-05-16T02:14:55.498Z · LW(p) · GW(p)

Beta(0,1) is an improper prior. Do you mean Beta(1,1), the uniform prior?

In that case, it's a silly prior. You can't be certain that it's exactly that probability.

If that's what you're using, you'd have a 50% chance of getting heads if you were wrong, and a 0.0001% chance if you were right, so:

P(pi = 0.000001) = 0.99

P(H) = P(H | pi = 0.000001)P(pi = 0.000001) + P(H | pi != 0.000001)P(pi != 0.000001)

= 0.000001*0.99 + 0.5*0.01

= 0.00500099

~= 0.005

P(pi = 0.000001 | H) = P(H | pi = 0.000001)*P(pi = 0.000001)/P(H)

= 0.000001*0.99/0.005

= 0.000198

So there's a 0.02% chance that you were right about there being a 99.9999% chance of tails. Essentially, you can ignore that. You now have a 66.7% chance that the coin will land on heads, with much the same distribution as if you started with a uniform prior.

If you meant Beta(1,2) the answer is similar.
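Checking that arithmetic numerically (a minimal sketch; pi is P(heads), and the uniform component updates to Beta(2, 1) after one head):

```python
# Point mass at pi = 0.000001 with prior weight 0.99; otherwise pi ~ Uniform(0, 1).
p_point = 0.99
pi_point = 0.000001

p_H = pi_point * p_point + 0.5 * (1 - p_point)  # prior predictive, ~0.005
p_point_given_H = pi_point * p_point / p_H      # ~0.000198
# Under the uniform component, one observed head gives Beta(2, 1), mean 2/3.
p_H_next = p_point_given_H * pi_point + (1 - p_point_given_H) * (2 / 3)
print(p_point_given_H, p_H_next)  # ~0.000198, ~0.667
```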

Edit: How do you make an asterisk show up, instead of italicizing?

Replies from: Manfred, Cyan, Vaniver
comment by Manfred · 2011-05-16T02:37:20.427Z · LW(p) · GW(p)

I usually copy the ascii middle dot ( · ) from somewhere, or just make the multiplication implicit. It is kind of a pain.

Replies from: Miller
comment by Miller · 2011-05-16T05:00:41.204Z · LW(p) · GW(p)

OK, so for the future you could hold down ALT, type 250 on the number keypad, then release ALT. That will save you the copy and paste. Ah... I see that using three asterisks in a row works: you get an italic asterisk, as in 5 * 3

comment by Cyan · 2011-05-16T02:36:54.793Z · LW(p) · GW(p)

I had in mind something like a mixture prior with 0.99*Beta(0,1) + 0.01*Beta(0.5,0.5). Yes, the Beta(0,1) component makes the prior improper. Once heads is observed the Beta(0,1) component is updated to Beta(1,1) and its mixture weight is also updated and basically becomes negligible such that the component effectively drops out of the mixture.

I'm treating the behavior of the above distribution as a guide for my intuition as to what will happen to the mixture weight when there is no Beta(0,1) but rather a point mass at 0.999999.
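Numerically, the weight on the first component collapses as described (a minimal sketch, approximating the improper Beta(0,1) by a proper Beta(eps, 1) with small eps):

```python
# Mixture prior over P(heads): weight 0.99 on ~Beta(0,1), weight 0.01 on Beta(0.5, 0.5).
# Approximate the improper Beta(0, 1) by Beta(eps, 1); the predictive P(heads)
# under a Beta(a, b) prior is a / (a + b).
eps = 1e-9
w1, w2 = 0.99, 0.01
pred1 = eps / (eps + 1)  # ~0: Beta(eps, 1) piles its mass near P(heads) = 0
pred2 = 0.5              # Beta(0.5, 0.5) is symmetric, so its predictive is 0.5

# Observing heads multiplies each weight by its predictive, then renormalize.
w1_post = w1 * pred1 / (w1 * pred1 + w2 * pred2)
print(w1_post)  # ~2e-7: the first component's mixture weight becomes negligible
```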

To get two asterisks in a line without italics, do \* this \*.

comment by Vaniver · 2011-05-17T12:14:11.275Z · LW(p) · GW(p)

Use a backslash to escape it; for example: * (which is \ followed by the *)

comment by hairyfigment · 2011-05-16T03:04:25.967Z · LW(p) · GW(p)

Seems a little confused. If 99% refers to a model of the coin, and the larger number equals the conditional probability of tails within that model, then I think P(model) suddenly drops to less than 0.02%. That assumes the other 1% goes to a uniform prior and that we can treat the chance of heads given not-model as 50%. In this example I think I could have told you beforehand that the model leaves out too much, because my sources say the outcome depends more on how you throw the coin than on its weight, so you won't ever get 99.9999% from this.

If you want to know the posterior probability of heads or tails look at the other comments.

comment by saturn · 2011-05-16T04:48:43.403Z · LW(p) · GW(p)

Have you read this?