Can Bayes theorem represent infinite confusion?
post by Yoav Ravid · 2019-03-22T18:02:45.088Z · LW · GW · 3 comments
This is a question post.
Contents
Answers
    15 jimrandomh
    6 Dagon
3 comments
Edit: the title was misleading; I didn't ask about a rational agent, but about what comes out of certain inputs in Bayes' theorem, so it has been changed to reflect that.
Eliezer [LW · GW] and others have talked about how a Bayesian with a 100% prior cannot change their confidence level, whatever evidence they encounter, because it's like having infinite certainty. I am not sure if they meant it literally (is it really mathematically equal to infinity?), but I assumed they did.
I asked myself: well, what if they get evidence that was somehow assigned 100%? Wouldn't that be enough to get them to change their mind? In other words:
If P(H) = 100%
and P(E|H) = 0%,
then what does P(H|E) equal?
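To make the question concrete, here is a minimal Python sketch (my own illustration, not from any particular source; the posterior helper and the 0.5 placeholder for P(E|¬H) are assumptions for demonstration):

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """P(H|E) via Bayes' theorem, expanding P(E) with the law of total probability."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return (p_e_given_h * p_h) / p_e

# P(E|not-H) = 0.5 is an arbitrary placeholder; it gets multiplied
# by 1 - P(H) = 0, so its value doesn't matter here.
posterior(1.0, 0.0, 0.5)  # raises ZeroDivisionError: 0.0 / 0.0
```

Both the numerator and the denominator come out to exactly zero, which is where my confusion starts.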
I thought: well, if both are infinities, what happens when you subtract infinities? The internet answered that it's indeterminate*, meaning (from what I understand) that it could be anything, and you have absolutely no way to know what exactly.
So I concluded that, if I understood everything correctly, such a situation would leave the Bayesian infinitely confused: in a state where they have no idea where they stand between 0% and 100%, and no amount of evidence in any direction can ground them anywhere.
Am I right? Or have I missed something entirely?
*I also found out about Riemann's rearrangement theorem, which, in a way, lets you rearrange certain infinite series so that their sum equals whatever you want. Damn, that's cool!
Answers
Answer by jimrandomh

If you do out the algebra, you get that P(H|E) involves dividing zero by zero:

P(H|E) = P(E|H)P(H) / P(E)
       = P(E|H)P(H) / [P(E|H)P(H) + P(E|¬H)P(¬H)]
       = (0 × 1) / (0 × 1 + P(E|¬H) × 0)
       = 0/0
There are two ways to look at this at a higher level. The first is that the algebra doesn't really apply in the first place, because this is a domain error: 0 and 1 aren't probabilities, in the same way that the string "hello" and the color blue aren't.
The second way to look at it is that when we say P(H) = 1 and P(E|H) = 0, what we really meant was that P(H) = 1 − ϵ1 and P(E|H) = ϵ2; that is, they aren't precisely one and zero, but they differ from one and zero by an unspecified, very small amount. (Infinitesimals are like infinities; an infinitesimal ϵ is arbitrarily-close-to-zero in the same sense that an infinity is arbitrarily-large.) Under this interpretation, we don't have a contradiction, but we do have an underspecified problem, since we need the ratio ϵ1/ϵ2 and haven't specified it.
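To see the underspecification numerically, here is a small sketch (illustrative only; the 0.5 assumed for P(E|¬H) is arbitrary, and the particular epsilon values are made up): shrinking both epsilons toward zero while varying their ratio sends the posterior almost anywhere in (0, 1).

```python
def posterior(p_h, p_e_given_h, p_e_given_not_h):
    """P(H|E) via Bayes' theorem, expanding P(E) with the law of total probability."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return (p_e_given_h * p_h) / p_e

# P(H) = 1 - e1 and P(E|H) = e2: both tiny, but their ratio decides the answer.
for e1, e2 in [(1e-9, 1e-9), (1e-12, 1e-9), (1e-9, 1e-12)]:
    print(f"e1={e1:.0e}  e2={e2:.0e}  P(H|E)={posterior(1 - e1, e2, 0.5):.4f}")
# e1=1e-09  e2=1e-09  P(H|E)=0.6667  (comparable epsilons)
# e1=1e-12  e2=1e-09  P(H|E)=0.9995  (e2 >> e1: posterior near 1)
# e1=1e-09  e2=1e-12  P(H|E)=0.0020  (e1 >> e2: posterior near 0)
```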
↑ comment by Yoav Ravid · 2019-03-22T19:48:47.106Z · LW(p) · GW(p)
Thanks for the answer! I was somewhat amused to see that it ends up being zero divided by zero.
Does the ratio ϵ1/ϵ2 being undefined mean that it's arbitrarily close to a half (since one over two is a half, though it wouldn't be exactly that)? Or does it mean we get the same problem I specified in the question, where it could be anything from (almost) 0 to (almost) 1 and we have no idea what exactly?
Replies from: jimrandomh↑ comment by jimrandomh · 2019-03-22T19:55:06.463Z · LW(p) · GW(p)
The latter; it could be anything, and by saying the probabilities were 1.0 and 0.0, the original problem description left out the information that would determine it.
Replies from: Yoav Ravid↑ comment by Yoav Ravid · 2019-03-22T20:07:11.037Z · LW(p) · GW(p)
I see. So:
If P(H) = 1.0 − ϵ1
and P(E|H) = 0 + ϵ2,
then it equals "infinite confusion".
Am I correct?
And also, when you use epsilons, does it mean you get out of the "dogma" of 100%? Or can you still not update down from it?
And what I did in my post may just be another example of why you don't put an actual 1.0 in your prior: because then, even if you get evidence of the same strength in the other direction, that would demand that you divide zero by zero. Right?
Replies from: countingtoten↑ comment by countingtoten · 2019-03-22T23:44:24.337Z · LW(p) · GW(p)
Using epsilons can in principle allow you to update. However, the situation seems slightly worse than jimrandomh describes. It looks like you also need P(E|¬H), the probability of the evidence if H is false, in order to get a precise answer. Also, the missing info that jim mentioned is already enough in principle to let the final answer be any probability whatsoever.
If we use log odds (the framework in which we could literally start with "infinite certainty") then the answer could be anywhere on the real number line. We have infinite (or at least unbounded) confusion until we make our assumptions more precise.
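For reference, this is the standard log-odds form of Bayes' theorem being invoked here (the plugged-in extremes below are the question's, not mine):

```latex
\log\frac{P(H\mid E)}{P(\neg H\mid E)}
  = \log\frac{P(H)}{P(\neg H)} + \log\frac{P(E\mid H)}{P(E\mid \neg H)}
```

With P(H) = 1 the prior log odds are +∞, and with P(E|H) = 0 the likelihood-ratio term is −∞, so the update literally asks for ∞ − ∞: the same indeterminate subtraction of infinities the question ran into.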
Answer by Dagon

This math is exactly why we say a rational agent can never assign a perfect 1 or 0 to any probability estimate. Doing so in a universe which then presents you with counterevidence means you're not rational.
Which I suppose could be termed "infinitely confused", but that feels like a mixing of levels. You're not confused about a given probability, you're confused about how probability works.
In practice, when a well-calibrated person says 100% or 0%, they're rounding off from some unspecified-precision estimate like 99.9% or 0.000000000001.
↑ comment by TheWakalix · 2019-03-23T01:18:08.841Z · LW(p) · GW(p)
Which I suppose could be termed "infinitely confused", but that feels like a mixing of levels. You're not confused about a given probability, you're confused about how probability works.
Or alternatively, it's a clever turn of phrase: "infinitely confused" as in confused about infinities.
↑ comment by Yoav Ravid · 2019-03-23T04:57:53.993Z · LW(p) · GW(p)
This math is exactly why we say a rational agent can never assign a perfect 1 or 0 to any probability estimate.
Yes, of course. I just thought I'd found an amusing situation while thinking about it.
You're not confused about a given probability, you're confused about how probability works.
Nice way to put it :)
I think I might have framed the question wrong. It was clear to me that it wouldn't be rational (so maybe I shouldn't have used the term "Bayesian agent"), but it did seem that if you put the numbers in this way, you get a mathematical "definition" of "infinite confusion".
Replies from: Pattern↑ comment by Pattern · 2019-03-25T03:24:43.227Z · LW(p) · GW(p)
The point goes both ways: following Bayes' rule means not being able to update away from 100%, but the reverse likely holds as well. Unless there exists, for every hypothesis, not only evidence against it but also evidence that completely disproves it, there is no evidence such that, if agent B observes it, they will ascribe anything a 100% or 0% probability (if they didn't start out that way).
So a Bayesian agent can't become infinitely confused unless they obtain infinite knowledge, or have bad priors. (One may simulate a Bayesian with bad priors.)
Replies from: Yoav Ravid↑ comment by Yoav Ravid · 2019-03-25T05:08:32.708Z · LW(p) · GW(p)
Pattern, I miscommunicated my question; I didn't mean to ask about a Bayesian agent in the sense of a rational agent, just what the mathematical result is of plugging certain numbers into the equation.
I am well aware, and was before the post, that a rational agent won't have a 100% prior and won't find evidence equal to 100%; that wasn't where the question stemmed from.
3 comments
comment by TruePath · 2019-03-27T08:14:49.194Z · LW(p) · GW(p)
There is a lot of philosophical work on this issue, some of which recommends taking conditional probability as the fundamental unit (in which case Bayes' theorem only applies for non-extremal values). For instance, see this paper.
Replies from: Yoav Ravid↑ comment by Yoav Ravid · 2019-03-27T10:55:10.209Z · LW(p) · GW(p)
Thanks, it looks quite interesting, but unfortunately I don't think I have the technical knowledge to understand most of the paper. Can you give a quick summary of the relevant points?
comment by Donald Hobson (donald-hobson) · 2019-03-22T22:06:33.783Z · LW(p) · GW(p)
In other words, the agent assigned zero probability to an event, and then it happened.