What does it mean for an event or observation to have probability 0 or 1 in Bayesian terms?

post by Noosphere89 (sharmake-farah) · 2024-09-17T17:28:52.731Z · LW · GW · No comments

This is a question post.

Contents

  Answers
    4 Dagon
    2 kqr
    2 Richard_Kennaway
    1 quila
    1 Amalthea

Okay, this one is a simple probability question/puzzle:

What does it actually mean for a probability-0 or probability-1 event to occur? Or, for those who prefer subjective credences, what does it mean to have a probability-0 or probability-1 observation in Bayesian terms?

Part of my motivation here is to examine the limiting cases of belief, where probabilities are as extreme as they can get, and to see what follows from taking them to those extremes.

Answers

answer by Dagon · 2024-09-17T23:01:06.666Z · LW(p) · GW(p)

It means your model was inapplicable to the event. Careful Bayesian reasoners don't have any 0s or 1s in predictions of observations. They may keep an explicit separation between observation and probability, such as "all circles in Euclidean planes have pi as the ratio of circumference to diameter", with the non-1 probability falling on the question "is that thing I see actually a circle in a flat plane?"

Likewise, it's fine to give probability 1 to "a fair die will roll integers between 1 and 6 inclusive with equal probability", and then, when a 7 comes up, say "that's evidence that it's not a fair die".

Anyone who assigns a probability of 0 or 1 to a future experience is wrong. There's an infinitesimal chance that the simulation ends, or your Boltzmann brain has a glitch, or aliens are messing with gravity, or whatever. In casual use, we often round these off, which is convenient but not strictly correct.

Note that there's absolutely no way to GET a 0 or 1 probability from Bayesian calculations, unless it's a prior. Any sane prior can be adjusted arbitrarily close to 0 or 1 with sufficient observations, but it can't actually get all the way there - update size is proportional to surprise, so it takes a LOT of evidence to shift a tiny bit closer when you're already near 0 or 1.
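A minimal sketch of this point (a toy example of mine, not from the comment): updating on a run of heads with two hypotheses, a fair coin versus a double-headed one. In exact rational arithmetic, the posterior climbs toward 1 but never reaches it.

```python
from fractions import Fraction

# Two hypotheses: the coin is double-headed (H) or fair (F).
# Exact arithmetic shows the posterior on H approaches 1 but never
# reaches it after any finite number of observed heads.
p_h = Fraction(1, 2)  # prior on "double-headed"

for n in range(1, 21):
    # Likelihood of seeing heads: 1 under H, 1/2 under F.
    p_h = (p_h * 1) / (p_h * 1 + (1 - p_h) * Fraction(1, 2))
    print(n, float(p_h))

assert p_h < 1  # still strictly below 1, exactly
```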


For real, computation-limited agents and humans, one can also model a meta-credence about "are my model and my probability assignments even vaguely close to correct", which ALSO is not 1.

comment by Noosphere89 (sharmake-farah) · 2024-09-17T23:13:23.590Z · LW(p) · GW(p)

I'll give 2 examples:

  1. What's the probability that the program you are given poses a halting problem your decider can solve?

The answer is that it has probability 1, but that doesn't mean we can extend the decider of the halting problem to cover all cases.

https://arxiv.org/abs/math/0504351
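For reference, "probability 1" in that paper means asymptotic density; here is a rough sketch of the definition (my paraphrase, not the paper's exact notation):

```latex
% Let $M_n$ be the set of Turing machine programs of size at most $n$.
% A set $A$ of programs has asymptotic density
\[
  \mu(A) \;=\; \lim_{n \to \infty}
    \frac{\lvert A \cap M_n \rvert}{\lvert M_n \rvert}.
\]
% Hamkins--Miasnikov show the halting problem is decidable on a set of
% density 1 (for a standard machine model), yet no decider covers all
% programs, so density 1 does not mean ``always''.
```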

  2. What's the probability that our physical constants are what they are, especially the constants that seem tuned for life?

The answer is that if the constants are arbitrary real numbers, the probability is 0, and this applies no matter what number you pick.
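As a small illustration (treating a constant's prior as uniform on [0, 1], an assumption of mine purely for the sketch): the probability mass around any single point shrinks to 0, even though the density there is positive.

```python
from scipy.stats import uniform  # continuous U[0, 1] by default

x = 0.12345
# Probability mass in a shrinking interval around x goes to 0,
# so the single point x has probability exactly 0 ...
for eps in [0.1, 0.01, 0.001, 1e-6]:
    print(eps, uniform.cdf(x + eps) - uniform.cdf(x - eps))

# ... even though the density at x is positive (density != probability).
print(uniform.pdf(x))  # 1.0 on [0, 1]
```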

This is how we can defuse the fine-tuning argument (the claim that the cosmos's constants have improbable values that seem tuned for life): any other choice of constants also has probability 0, whether or not it could sustain life:

https://en.wikipedia.org/wiki/Fine-tuned_universe

Replies from: Dagon
comment by Dagon · 2024-09-17T23:31:17.303Z · LW(p) · GW(p)

I think anytime you say "what is the probability that...", as if probability were an objective fact or measure rather than an agent's tool for prediction (better framed as "what is this agent's probability assignment over..."), you're somewhat outside of a Bayesian framework.

In my view, those are incomplete propositions - your probability assignment may be a convenience in making predictions, but it's not directly updatable. Bayesian calculations are about how to predict evidence, and how to update on that evidence. "What is the chance that this decider can solve the halting problem for this program in that timeframe" is something that can use evidence to update. Likewise "what is the chance that I will measure this constant next week and have it off by more than 10% from last week".

"What is true of the universe, in an unobservable way" is not really a question for Bayes-style probability calculations.  That doesn't keep agents from having beliefs, just that there's no general mechanism for correctly making them better.

 

answer by kqr · 2024-09-18T08:33:52.809Z · LW(p) · GW(p)

Many of the existing answers seem to confuse model and reality.

In terms of practical prediction of reality, it would always be a mistake to emit a 0 or 1, because there's always that one-in-a-billion chance that our information is wrong – however vivid it seems at the time. Even if you have secretly looked at the hidden coin and seen clearly that it landed on heads, 99.999 % is a more accurate forecast than 100 %. However unlikely, it could have landed on aardvarks and masqueraded as heads – that is a possibility. Or you confabulated the memory of seeing the coin from a different coin you saw a week ago – also not so likely, but it happens. Or you mistook tails for heads – presumably happens every now and then.
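A toy version of that peeked-coin forecast (the error rates are made-up numbers for illustration):

```python
p_heads = 0.5                  # prior before looking
p_obs_given_heads = 0.9999     # chance you'd see "heads" if it is heads
p_obs_given_tails = 0.0001     # misreading, confabulation, etc.

# Bayes' theorem: P(heads | saw heads)
posterior = (p_obs_given_heads * p_heads) / (
    p_obs_given_heads * p_heads + p_obs_given_tails * (1 - p_heads)
)
print(posterior)  # ~0.9999: very confident, but not exactly 1
```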

When it comes to models, though, probabilities of 0 and 1 show up all the time. Getting a 7 when tossing a d6 with the standard dice model simply does not happen, by construction. Adding two and three and getting five under regular field arithmetic happens every time. We can argue whether the language of probability is really the right tool for those types of questions, but taking a non-normative stance, it is reasonable for someone to ask those questions phrased in terms of probabilities, and then the answers would be 0 % and 100 % respectively.
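A minimal sketch of such a model (my own illustration): the standard d6 model assigns probability 0 to a 7 by construction.

```python
# The model: each face 1..6 gets probability 1/6; everything else gets 0.
d6 = {face: 1 / 6 for face in range(1, 7)}

def prob(outcome):
    return d6.get(outcome, 0.0)

print(prob(3))  # 0.1666...
print(prob(7))  # 0.0 -- within the model, a 7 simply does not happen
```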

These probabilities also show up in limits and arguments of general tendency. When a coin is tossed repeatedly, the probability of getting only tails forever is 0 %. In a simple symmetric random walk, the probability of eventually crossing the origin is 100 %. When throwing a d6 for long enough, the mean value will end up within the range 3-4 with probability 100 %.
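A quick simulation sketch of the first and last of these (assuming a fair coin and a fair die):

```python
import random

random.seed(0)

# (1) P(tails every time in n tosses) = (1/2)**n, which tends to 0.
for n in (10, 50, 100):
    print(n, 0.5 ** n)

# (2) Law of large numbers: the running mean of d6 rolls converges
# to 3.5, so it eventually stays within [3, 4] with probability 1.
rolls = [random.randint(1, 6) for _ in range(100_000)]
print(sum(rolls) / len(rolls))  # close to 3.5
```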

The model cases above describe things that apply only to our models, not to reality, but they can serve as a useful mental shortcut as long as one is careful not to apply them blindly.

answer by Richard_Kennaway · 2024-09-17T17:59:14.229Z · LW(p) · GW(p)

Rolling a standard 6-sided die and getting a 7 has probability zero.

Tossing an ordinary coin and having it come down aardvarks has probability zero.

Every random value drawn from the uniform distribution on the real interval [0,1] has probability zero.

2=3 with probability zero.

2=2 with probability 1.

For any value in the real interval [0,1], the probability of picking some other value from the uniform distribution is 1.

In a mathematical problem, when a coin is tossed, coming down either heads or tails has probability 1.

In practice, 0 and 1 are limiting cases that from one point of view can be said not to exist [LW · GW], but from another point of view, sufficiently low or high probabilities may as well be rounded off to 0 or 1. The test is, is the event of such low probability that its possibility will not play a role in any decision? In mathematics, probabilities of 0 and 1 exist, and if you try to pretend they don't, all you end up doing is contorting your language to avoid mentioning them.

comment by Aleksander (Omnni) · 2024-09-17T23:00:27.685Z · LW(p) · GW(p)

It seems to me that, in fact, it's entirely possible for a coin to come up aardvarks. Imagine, for a second, that unbeknownst to you, a secret society of gnomes, concealed from you (or from society as a whole), occasionally decides to turn coins into aardvarks (or to fulfill whatever condition you have for a coin to come up aardvarks). Now, this is nonsense, obviously. But it's technically possible, in the sense that this race of gnomes could exist without contradicting your previous observations (only, perhaps, your conclusions based on them). Or, if you don't accept the gnomish argument, consider that at any point there is a near-infinitesimal chance that quantum particles will simply rearrange themselves in vast quantities into a specific form. Thus, it's impossible for anything to have probability zero, except where you assert something that is impossible from the principles of logic, like P and not P. Bayesian (and other logical) equations seem to make sense with 1 and 0, but that does not mean they can ever exist in a real sense.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2024-09-18T17:23:02.082Z · LW(p) · GW(p)

It seems to me that, in fact, it’s entirely possible for a coin to come up aardvarks. ...

For all practical purposes, none of that is ever going to happen. Neither is the coin going to be snatched away by a passing velociraptor, although out of doors, it could be snatched by a passing seagull or magpie, and I would not be surprised if this has actually happened.

Outré scenarios like these are never worth considering.

comment by Noosphere89 (sharmake-farah) · 2024-09-17T18:11:28.467Z · LW(p) · GW(p)

Okay, that's a nice answer, but to ask a related question: in Bayesianism, if we declare that an event has probability 0 or 1, does that mean the event never happens or always happens, respectively?

Good answer for the most part though.

Replies from: AnthonyC
comment by AnthonyC · 2024-09-17T18:38:35.465Z · LW(p) · GW(p)

Strictly speaking, no, it does not mean that.

Pick a random rational number between zero and one. The probability of any particular outcome is zero, but it's zero reached in a very particular way (basically, 1/Q, where Q is the infinite cardinality of the rational numbers), because all the infinitely many zeroes have to sum to one. 

Getting more precise than that would require getting into the formal underpinnings of calculus and limits.

I will say it is not strictly, formally true that the probability of rolling a 7 on a normally-numbered six-sided die is zero. That's a rounding convention we use. There is a non-zero, not-technically-infinitesimal probability of the atoms in the die randomly ending up in a configuration, during the roll, where one side has seven dots. Are there classes of problem where the difference between an infinitesimal and an extremely small but finite number can matter? Sure. But the difference usually doesn't matter in practice.

Replies from: sharmake-farah, Amarko
comment by Noosphere89 (sharmake-farah) · 2024-09-17T18:41:25.580Z · LW(p) · GW(p)

Thanks for the answer.

So if I'm interpreting it correctly, in the general case Bayesians can reasonably assign a probability of 0 to an event that can actually happen, and a probability 1 event under Bayes is not equal to the event always happening.

Is this correctly interpreted?

Replies from: AnthonyC
comment by AnthonyC · 2024-09-17T23:20:56.189Z · LW(p) · GW(p)

Yes, but see @Amarko [LW · GW] 's reply and corrections, below. The examples where this ends up being possible are all theoretical and generally involve some form of infinite or infinitesimal quantities.

Replies from: sharmake-farah
comment by Noosphere89 (sharmake-farah) · 2024-09-17T23:33:40.769Z · LW(p) · GW(p)

I gave 2 examples of probability 0 or 1 plausibly occurring in real life. One relies on the uniform distribution over the real numbers, where no matter which number you randomly pick, it always has probability 0. The other is the set of Turing machines that have a decidable halting problem, which has probability 1. But you can't extend either into a certainty: you can't avoid drawing some real-number constant, and you can't always decide the halting problem.

Replies from: AnthonyC
comment by AnthonyC · 2024-09-18T22:37:56.947Z · LW(p) · GW(p)

It is not at all clear that it is possible in reality to randomly select a real number without a process that can make an infinite number of choices in finite time. Similarly, any reasoning about Turing machines has to acknowledge that no real, physical system actually instantiates one, in the sense of having an infinite tape and never making an error. We can approach/approximate these examples, but that just means we end up with probabilities that are small-but-finite, not 0 or 1.

comment by Amarko · 2024-09-17T22:59:57.820Z · LW(p) · GW(p)

This is not quite accurate. You can't uniformly pick a random rational number from 0 to 1, because there are countably many such numbers, and any probability distribution you assign has to add up to 1: a uniform assignment would give every number the same probability, and countably many equal probabilities can only sum to 0 or to infinity, never to 1.
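A one-line version of that argument, as a sketch:

```latex
% If a uniform distribution gave each rational $q \in [0,1]$ the same
% probability $p$, countable additivity would force
\[
  \sum_{q \,\in\, \mathbb{Q} \cap [0,1]} p \;=\;
  \begin{cases}
    0      & \text{if } p = 0, \\
    \infty & \text{if } p > 0,
  \end{cases}
\]
% so the total can never be 1, and no uniform distribution exists on a
% countably infinite set.
```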

You can have a uniform distribution on an uncountable set, such as the real numbers between 0 and 1, but since you can't pick an arbitrary element of an uncountable set in the real world, this is a theoretical rather than a real-world issue.

As far as I know, any mathematical case in which something with probability 0 can happen does not actually occur in the real world in a way that we can observe.

Replies from: sharmake-farah, AnthonyC
comment by Noosphere89 (sharmake-farah) · 2024-09-17T23:09:56.332Z · LW(p) · GW(p)

As far as I know, any mathematical case in which something with probability 0 can happen does not actually occur in the real world in a way that we can observe.

Here's one example:

What's the probability that our physical constants are what they are, especially the constants that seem tuned to life?

The answer is that if the constants are arbitrary real numbers, the probability is 0, and this applies no matter what number you pick.

This is how we can defuse the fine-tuning argument (the claim that the cosmos's constants have improbable values that seem tuned for life): any other choice of constants also has probability 0, whether or not it could sustain life:

https://en.wikipedia.org/wiki/Fine-tuned_universe

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2024-09-18T17:27:13.739Z · LW(p) · GW(p)

The question to ask is, what is the measure of the space of physical constants compatible with life? Although that requires some prior probability measure on the space of all hypothetical values of the constants. But the constants are mostly real numbers, and there is no uniform distribution on the reals.
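A sketch of why there is no uniform distribution on the reals (the standard measure-theoretic argument, paraphrased):

```latex
% A translation-invariant (``uniform'') probability measure $\mu$ on
% $\mathbb{R}$ would give every unit interval the same mass $p$.  The
% intervals $[n, n+1)$ tile the line, so countable additivity gives
\[
  1 \;=\; \mu(\mathbb{R})
    \;=\; \sum_{n \in \mathbb{Z}} \mu\big([n, n+1)\big)
    \;=\; \sum_{n \in \mathbb{Z}} p
    \;=\;
  \begin{cases}
    0      & \text{if } p = 0, \\
    \infty & \text{if } p > 0,
  \end{cases}
\]
% contradicting $\mu$ being a probability measure.
```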

comment by AnthonyC · 2024-09-17T23:06:09.423Z · LW(p) · GW(p)

Thanks, I didn't realize that! It does make sense now that I think about it. I think if you replace the rationals with the reals in my theoretical example, the rest still works?

And yes, I agree about in the real world. Probabilities 0 and 1 are limits you can approach, but only reach in theory.

answer by quila · 2024-09-23T01:27:30.488Z · LW(p) · GW(p)

I am not sure whether this is the answer you're looking for, but I think it's true and could be de-confusing, and others have given the standard/practical answer already.

You can try running a program which computes Bayesian updates, to determine what happens when it is passed as input an 'observation' to which it assigns probability 0. Two possible outcomes (of many, depending on the exact program) come to mind:

  • The program returns a 'cannot divide by 0' error upon attempting to compute the observation's update (a minimal sketch of this case follows below the list).
  • The program updates on the observation in a way which rules out the entirety of its probability-space, as it was all premised on the non-0 possibilities. The next time the program tries to update on a new observation, it fails to find priors about that observation.
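Here is that minimal sketch of the first outcome (the function and hypothesis names are mine, purely illustrative):

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses after one observation."""
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    # If every hypothesis assigned the observation probability 0,
    # the normalizing constant is 0 and the update is undefined.
    return {h: prior[h] * likelihood[h] / evidence for h in prior}

prior = {"fair coin": 0.5, "double-headed": 0.5}
impossible = {"fair coin": 0.0, "double-headed": 0.0}  # P(obs) = 0

bayes_update(prior, impossible)  # raises ZeroDivisionError
```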

Bayes' theorem is an algorithm which is used because it happens to help predict the world, rather than something with metaphysical status.

We could also imagine very different (mathematical) worlds where prediction is not needed or useful, or, maybe, where the world is so differently structured that Bayes' theorem is not predictive.

answer by Amalthea · 2024-09-17T19:20:19.536Z · LW(p) · GW(p)

(Epistemic status: I know basic probability theory but am otherwise just applying common sense here)

This seems to mostly be a philosophical question. I believe the answer is that you're then hitting the limits of your model, and Bayesianism doesn't necessarily apply. In practical terms, I'd say it's most likely that you were mistaken about the event in fact having probability 0. (Probability-1 events occurring should be fine.)

comment by Noosphere89 (sharmake-farah) · 2024-09-17T21:32:44.870Z · LW(p) · GW(p)

Re probability-0 events, I'd say a good example is the question "What is the probability that we live in a universe with our specific fundamental constants?"

Our current theory relies on 20+ real-number constants, but critically, the probability of getting the constants we do have is always 0, no matter which numbers are picked; yet some set of them is picked.

Another example: the set of Turing machines whose halting or non-halting we can't decide is a probability-0 set, but that doesn't allow us to construct a Turing machine that decides whether an arbitrary Turing machine halts, for well-known reasons.

(This follows from the fact that the set of Turing Machines which have a decidable halting problem has probability 1):

https://arxiv.org/abs/math/0504351

Replies from: AnthonyC
comment by AnthonyC · 2024-09-17T23:25:00.694Z · LW(p) · GW(p)

I do find myself genuinely confused about how to assign a probability distribution to this kind of question. It's one of the main things that draws me to things like Tegmark's mathematical universe/ultimate ensemble, or the simulation hypothesis. In some sense I consider the simplest answer to be "All possible universes exist, therefore it is guaranteed that there is a me that sees the world I see."

Replies from: sharmake-farah
comment by Noosphere89 (sharmake-farah) · 2024-09-17T23:30:49.565Z · LW(p) · GW(p)

While I agree with the mathematical universe hypothesis/ultimate ensemble/simulation hypothesis, this wasn't really my point; I was just pointing out examples of probability-0/1 sets in real life that you cannot extend into something that never/always happens.

This didn't depend on any of the 3 hypotheses you raised here: one example follows solely from the uniform probability distribution over the real numbers, and the other is essentially a statement about asymptotic density.
