Considering all scenarios when using Bayes' theorem.

post by Alexei · 2011-06-20T18:11:34.810Z · LW · GW · Legacy · 7 comments

Disclaimer: this post is directed at people who, like me, are not Bayesian/probability gurus.

Recently I found an opportunity to use Bayes' theorem in real life to help myself update in the following situation (presented in a gender-neutral way):

Let's say you are wondering if a person is interested in you romantically. And they bought you a drink.
A = they are interested in you.
B = they bought you a drink.
P(A) = 0.3 (Just an assumption.)
P(B) = 0.05 (Approximately 1 out of 20 people who might be at all interested in you will buy you a drink for some unknown reason.)
P(B|A) = 0.2 (Approximately 1 out of 5 people who are interested in you will buy you a drink for some unknown reason, though it's more likely they will buy you a drink because they are interested in you.)

These numbers seem valid to me, and I can't see anything that's obviously wrong. But when I actually use Bayes' theorem:
P(A|B) = P(B|A) * P(A) / P(B) = 1.2
Uh-oh! Where did I go wrong? See if you can spot the error before continuing.

Turns out:
P(B|A) = P(A∩B) / P(A) ≤ P(B) / P(A) = 0.1667
BUT
P(B|A) = 0.2 > 0.1667
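
One way to catch this kind of slip before plugging numbers into the formula is a quick coherence check. The sketch below is only an illustration (the function names `check_coherence` and `max_p_b_given_a` are made up for this example): it tests whether P(A∩B) = P(B|A) * P(A) stays at or below P(B), and reports the largest P(B|A) the other two estimates allow.

```python
def check_coherence(p_a, p_b, p_b_given_a):
    """Return (joint, ok), where joint = P(A and B) = P(B|A) * P(A).

    ok is True only when joint <= P(B), which any consistent set of
    estimates must satisfy because "A and B" is a special case of B.
    """
    joint = p_b_given_a * p_a
    return joint, joint <= p_b


def max_p_b_given_a(p_a, p_b):
    """Largest P(B|A) compatible with the given P(A) and P(B)."""
    return min(1.0, p_b / p_a)


# The original estimates from the post.
joint, ok = check_coherence(p_a=0.3, p_b=0.05, p_b_given_a=0.2)
print(f"P(A and B) = {joint:.2f}, coherent: {ok}")
print(f"largest admissible P(B|A) = {max_p_b_given_a(0.3, 0.05):.4f}")
```

With the numbers above, the joint probability comes out at 0.06, which exceeds P(B) = 0.05, so the check fails. The check does not say which of the three estimates is off, only that they cannot all hold at once.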

I've made a mistake in estimating my probabilities, even though it felt intuitive. Yet, I don't immediately see where I went wrong when I look at the original estimates! What's the best way to prevent this kind of mistake?
I feel pretty confident in my estimates of P(A) and P(B|A). However, estimating P(B) is rather difficult because I need to consider many scenarios.

I can compute P(B) more precisely by considering all the scenarios that would lead to B happening (see wiki article):

P(B) = ∑_i P(B|H_i) * P(H_i)

Let's do a quick breakdown of everyone who would want to buy you a drink (out of the pool of people who might be at all interested in you):
P(misc. reasons) = 0.05; P(B|misc) = 0.01
P(they are just friendly and buy drinks for everyone they meet) = 0.05; P(B|friendly) = 0.8
P(they want to be friends) = 0.3; P(B|friends) = 0.1
P(they are interested in you) = 0.6; P(B|interested) = P(B|A) = 0.2
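So, P(B) = 0.05 * 0.01 + 0.05 * 0.8 + 0.3 * 0.1 + 0.6 * 0.2 = 0.1905
And, keeping the original P(A) = 0.3, P(A|B) = P(B|A) * P(A) / P(B) = 0.06 / 0.1905 = 0.315 (very different from 1.2!)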
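
The breakdown can be checked with a few lines of Python. This is a minimal sketch, assuming the four scenarios are mutually exclusive and cover every case (their priors sum to 1); the dictionary layout and the function names are illustrative choices. Note that the 0.315 figure corresponds to the earlier prior P(A) = 0.3, while the breakdown itself assigns 0.6 to "interested". Using the breakdown's own prior throughout gives roughly 0.63, so both values are printed.

```python
# Each scenario maps to (prior P(H_i), likelihood P(B|H_i)), as listed in the post.
scenarios = {
    "misc": (0.05, 0.01),
    "friendly": (0.05, 0.8),
    "friends": (0.3, 0.1),
    "interested": (0.6, 0.2),
}


def total_probability(scenarios):
    """P(B) = sum over i of P(B|H_i) * P(H_i); priors must sum to 1."""
    total_prior = sum(prior for prior, _ in scenarios.values())
    if abs(total_prior - 1.0) > 1e-9:
        raise ValueError(f"priors sum to {total_prior}, expected 1")
    return sum(prior * lik for prior, lik in scenarios.values())


def posteriors(scenarios):
    """P(H_i|B) for every scenario, via Bayes' theorem."""
    p_b = total_probability(scenarios)
    return {name: prior * lik / p_b for name, (prior, lik) in scenarios.items()}


p_b = total_probability(scenarios)
post = posteriors(scenarios)

print(round(p_b, 4))                 # 0.1905
print(round(0.2 * 0.3 / p_b, 3))     # 0.315, using the earlier prior P(A) = 0.3
print(round(post["interested"], 3))  # 0.63, using the breakdown's prior of 0.6
```

Because the posteriors are all computed from the same P(B), they sum to 1, and no single one can exceed 1.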

Once I started thinking about all possible scenarios, I found one I hadn't considered explicitly -- some people buy drinks for everyone they meet -- which adds a good amount of probability (0.04) to B happening. (Those types of people are rare, but they WILL buy you a drink.) There are also other interesting assumptions that are made explicit:

The moral of the story is to consider all possible scenarios (models/hypotheses) that can lead to the event you have observed. You may be missing some scenarios which, once considered, will significantly alter your probability estimates.

Do you know any other ways to make the use of Bayes' theorem more accurate? (Please post in comments, links to previous posts of this sort are welcome.)

7 comments

comment by Unnamed · 2011-06-20T22:33:12.371Z · LW(p) · GW(p)

This is a case where I wouldn't use Bayes' theorem. You have to estimate some probabilities directly, using your experience and knowledge, and P(A|B) (the probability that they are interested in you given that they bought you a drink) seems easier to estimate directly than some of the other probabilities which you would have to estimate to calculate P(A|B). For instance, if many people have bought you drinks before, you could just consider what proportion of them were interested in you. Or, you could rely on drink-buying that you've observed, or just cultural knowledge. On the other hand, I don't know how I'd estimate P(A) (the prior probability that a person is interested in you).

Replies from: Alexei
comment by Alexei · 2011-06-21T02:38:35.160Z · LW(p) · GW(p)

That's really interesting, because I feel almost the opposite way. Estimating P(A) is easy: there are many factors I can look at, and I have experience with it. However, I don't have a lot of experience with people buying me drinks for various reasons, so I'm not sure how to update on that just from experience.

It's actually quite possible that the magnitude of error in my estimates/assumptions renders them useless, but even then, the exercise overall is pretty helpful for understanding Bayes' theorem better.

comment by DanielLC · 2011-06-20T20:53:59.605Z · LW(p) · GW(p)

You could have improved it somewhat by using P(B|~A) instead of P(B). Doing so would make it impossible to get a probability higher than 1.

This looks like the conjunction fallacy. What you have comes out to a 6% chance that they bought you a drink and are interested, vs. a 5% chance that they bought you a drink at all.

I found one I haven't considered explicitly -- some people buy drinks for everyone they meet -- which adds a good amount of probability (0.4) to B happening.

Don't you mean 0.04?

Replies from: Alexei
comment by Alexei · 2011-06-20T22:08:37.848Z · LW(p) · GW(p)

I thought about using ~A, but estimating P(B|~A) or P(B∩~A) is also pretty difficult. There are a lot of reasons, as I've shown, why someone might buy me a drink without being interested. So I still have to think about all the scenarios. Are you also saying that using the alternative form of Bayes' formula can't lead to probability > 1? (If that's the case, then that's very helpful!)

P(A and B) = P(B|A) * P(A) = 0.06
P(B) = 0.05
Yes, that's a pretty good way to see the mistake mathematically. (Dannil made the same point.)

And I've corrected the typo, thanks!

Replies from: DanielLC
comment by DanielLC · 2011-06-20T23:43:57.353Z · LW(p) · GW(p)

Are you also saying that using the alternative form of Bayes' formula can't lead to probability > 1?

Yes. In order for you to get higher than one, P(B|~A)P(~A) would have to be negative.
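
To illustrate: the alternative form is P(A|B) = P(B|A)P(A) / (P(B|A)P(A) + P(B|~A)P(~A)). The numerator also appears in the denominator, so the ratio can only exceed 1 if the other term is negative. Below is a rough sketch; the function name `posterior` and the value P(B|~A) = 0.05 are assumptions made up for the example, not estimates from the post.

```python
def posterior(p_a, p_b_given_a, p_b_given_not_a):
    """P(A|B) using the form of Bayes' theorem that expands P(B) over A and ~A."""
    numerator = p_b_given_a * p_a
    # P(B) = P(B|A)P(A) + P(B|~A)P(~A); both terms are non-negative.
    denominator = numerator + p_b_given_not_a * (1.0 - p_a)
    if denominator == 0:
        raise ValueError("B has probability zero under these inputs")
    return numerator / denominator


# P(A) = 0.3 and P(B|A) = 0.2 are from the post; P(B|~A) = 0.05 is hypothetical.
print(round(posterior(0.3, 0.2, 0.05), 3))
```

Whatever valid probabilities are supplied, the result stays between 0 and 1, though a poor guess for P(B|~A) still gives a poor answer.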

comment by Dannil · 2011-06-20T20:50:38.564Z · LW(p) · GW(p)

I think the error is easier to see by showing that P(B|A)*P(A) > P(B), which is obviously wrong. But of course this is equivalent to what you wrote.

comment by cousin_it · 2011-06-20T18:19:16.287Z · LW(p) · GW(p)

You could try guesstimating other sets of variables, for example P(A), P(B), P(A∩B) (constraint: the last can be no larger than either of the first two) or P(A), P(B|A), P(B|~A) (no constraints). I'd go with the first set if I had a pile of data, and with the second set if I didn't.