Chance is in the Map, not the Territory

post by Daniel Herrmann (Whispermute), ben_levinstein (benlev), Aydin Mohseni (aydin-mohseni) · 2025-01-13T19:17:15.843Z · LW · GW · 4 comments

Contents

  Two Ways to Deal with Chance
  The Key Insight: Symmetries in Our Beliefs
  The Magic of de Finetti
  De Finetti in Practice
    1. Weather Forecasting
    2. Clinical Trials
    3. Machine Learning
  Why This Matters
  Common Objections and Clarifications
  Quick Recap

"There's a 70% chance of rain tomorrow," says the weather app on your phone. "There’s a 30% chance my flight will be delayed," posts a colleague on Slack. Scientific theories also include chances: “There’s a 50% chance of observing an electron with spin up,” or (less fundamental) “This is a fair die — the probability of it landing on 2 is one in six.”

We constantly talk about chances and probabilities, treating them as features of the world that we can discover and disagree about. And it seems you can be objectively wrong about the chances. The probability of a fair die landing on 2 REALLY is one in six, it seems, even if everybody in the world thought otherwise. But what exactly are these things called “chances”?

Readers on LessWrong are very familiar with the idea that many probabilities are best thought of as subjective degrees of belief. This idea comes from a few core people, including Bruno de Finetti. For de Finetti, probability was in the map, not the territory.

But perhaps this doesn’t capture how we talk about chance. For example, our degrees of belief need not equal the chances, if we are uncertain about the chances [LW · GW].  But then what are these chances themselves? If we are uncertain about the bias of a coin, or the true underlying distribution in some environment, then we can use our uncertainty over those chances to generate our subjective probabilities over what we’ll observe.[1] But then we have these other probabilities — chances, distributions, propensities, etc. — to which we are assigning probabilities. What are these things?

Here we’ll show how we can keep everything useful about chance-based reasoning while dropping some problematic metaphysical assumptions. The key insight comes from work by, once again, de Finetti. De Finetti’s approach has been fleshed out in detail by Brian Skyrms. We’ll take a broadly Skyrmsian perspective here, in particular as given in his book Pragmatics and Empiricism. The core upshot is that we don't need to believe in chances as real things "out there" in the world to use chance effectively. Instead, we can understand chance through patterns and symmetries in our beliefs.

Two Ways to Deal with Chance

When philosophers and scientists have tried to make sense of chance, they've typically taken one of two approaches. The first tries to tell us what chance IS – maybe it's just long-run frequency, or maybe it's some kind of physical property like mass or charge. Or maybe it is some kind of lossy compression of information. The second approach, which we'll explore here, asks a different question: what role does chance play in our reasoning, and can we fulfill that role without assuming chances exist?

Let's look (briefly) at why the first approach is problematic. Frequentists say chance is just long-run frequency:[2] The chance of heads is 1/2 because in the long run, about half the flips will be heads. But this has issues. What counts as "long run"? What if we never actually get to infinity? And how do we handle one-off events that can't be repeated?[3]

Others say chance is a physical property – a "propensity" of systems to produce certain outcomes. But this feels suspiciously like adding a mysterious force [LW · GW] to our physics.[4] When we look closely at physical systems (leaving quantum mechanics aside for now), they often seem deterministic: if you could flip a coin exactly the same way twice, it would land the same way both times.

The Key Insight: Symmetries in Our Beliefs

To see how this second approach works in a more controlled setting, imagine an urn containing red and blue marbles. Before drawing any marbles, you have certain beliefs about what you'll observe. You might think the sequence "red, blue, red" is just as likely as "blue, red, red"—the order doesn't matter, but you can learn from the observed frequencies of red and blue draws.

This symmetry in your beliefs—that the order doesn't matter—is called exchangeability. As you observe more draws, updating your beliefs each time, you develop increasingly refined expectations about future draws. The key insight is that you're not discovering some "true chance" hidden in the urn. Instead, de Finetti showed that when your beliefs have this exchangeable structure, you'll naturally reason as if there were underlying chances you were learning about in a Bayesian way—even though we never needed to assume they exist.[5]

This is different from just saying the draws are independent. If they were truly independent, seeing a hundred red marbles in a row wouldn't tell you anything about the next draw. But this isn't how we actually reason! Seeing mostly red marbles leads us to expect more red draws in the future. Exchangeability captures this intuition: we can learn from data while maintaining certain symmetries in our beliefs.
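To make this concrete, here is a minimal sketch in Python (our illustration, not anything from de Finetti himself) using the beta-Bernoulli model, the textbook exchangeable model for draws like these. Note two things: the probability of a sequence depends only on the counts of red and blue, not their order, and the predictive probability of the next draw shifts with the data in a way a fixed independence model never would.

```python
def predictive_prob_red(n_red, n_blue, alpha=1.0, beta=1.0):
    """P(next draw is red | counts so far) under a Beta(alpha, beta) prior.
    With alpha = beta = 1 this is Laplace's rule of succession."""
    return (n_red + alpha) / (n_red + n_blue + alpha + beta)

def sequence_prob(seq, alpha=1.0, beta=1.0):
    """Probability of an ordered sequence of draws under the same model.
    Exchangeability: the result depends only on the counts, not the order."""
    p, n_red, n_blue = 1.0, 0, 0
    for draw in seq:
        p_red = predictive_prob_red(n_red, n_blue, alpha, beta)
        p *= p_red if draw == "red" else 1.0 - p_red
        n_red += draw == "red"
        n_blue += draw != "red"
    return p

print(sequence_prob(["red", "blue", "red"]))  # 1/12 (about 0.0833)
print(sequence_prob(["blue", "red", "red"]))  # 1/12 again: order doesn't matter
print(predictive_prob_red(0, 0))              # 0.5 before any data
print(predictive_prob_red(100, 0))            # about 0.99 after 100 reds in a row
```

A truly independent model with a fixed 50% chance of red would leave that last number at 0.5 no matter how many reds had been seen.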

The Magic of de Finetti

De Finetti showed something remarkable: if your beliefs about a sequence of events are exchangeable, then mathematically, you must act exactly as if you believed there was some unknown chance governing those events. In other words, exchangeable beliefs can always be represented as if you had beliefs about chances – even though we never assumed chances existed!

For Technical Readers: De Finetti's theorem shows that any exchangeable probability distribution over infinite sequences can be represented as a mixture of i.i.d. distributions. Furthermore, as one observes events in the sequence and updates one’s probability over events via Bayes’ rule, this corresponds exactly to updating one’s distribution over chance distributions via Bayes’ rule, and then using that distribution over chances to generate the probability of the next event. This means you can treat these events as if there's an unknown parameter (the "chance")—even though we never assumed such a parameter exists.
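In symbols, for the binary case (this is the standard textbook statement of the theorem; the notation here is ours):

```latex
% If X_1, X_2, ... is an infinite exchangeable sequence of {0,1}-valued
% variables, then there is a unique probability measure mu on [0,1] with
P(X_1 = x_1, \ldots, X_n = x_n)
  = \int_0^1 \theta^{k} (1 - \theta)^{n - k} \, d\mu(\theta),
  \qquad k = \sum_{i=1}^{n} x_i .
```

The measure μ plays the role of a prior over the "unknown chance" θ, but it is constructed out of the exchangeable distribution P itself; nothing beyond coherent exchangeable beliefs went in.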

Let's see how this works in practice. When a doctor says a treatment has a "60% chance of success", traditionally we might think they're describing some real, physical property of the treatment. But in the de Finetti view, they're expressing exchangeable beliefs about patient outcomes—beliefs that happen to be mathematically equivalent to uncertainty about some "true" chance. The difference? We don't need to posit any mysterious chance properties. In this situation, since the doctor says it is 60%, she has probably observed enough outcomes (or reports of outcomes) that her posterior in the chance representation is tightly concentrated near 0.6.

De Finetti in Practice

This perspective transforms how we think about evidence and prediction across many domains:

1. Weather Forecasting

When your weather app says "70% chance of rain," it's not measuring some metaphysical "rain chance" property. It's expressing a pattern of beliefs about similar weather conditions that have proven reliable for prediction. Just like in the urn or medical examples, each new bit of data refines the forecast, and the weather model used by the app updates its probability estimates accordingly. This is true even though we sometimes talk about weather as being chaotic, or unpredictable. That is a statement about us, about our map, not the territory.[6]

2. Clinical Trials

This same pattern of learning applies in medical trials—though the stakes are far higher than drawing marbles. When a doctor says a treatment has a "60% chance of success," they're not measuring some fixed property of the drug. Instead, they're summarizing a learning process that started with exchangeable beliefs about patient outcomes and whose representation as a mixture over chances has come to concentrate around 0.6.

Think of how researchers approach a new treatment. Before any trials, they treat each future patient's potential outcome as exchangeable—so "success, failure, success" is considered no more or less likely than "failure, success, success." As they observe real outcomes, each success or failure refines their model of the treatment's effectiveness, pushing their estimated success rate up or down accordingly. Just like with the urn, they're not discovering a true success rate hidden in the drug; they're building and refining a predictive model.
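To give a rough feel for the numbers (the uniform Beta(1, 1) prior and the trial counts below are hypothetical, chosen purely to illustrate posterior concentration), here is a short Python sketch:

```python
def beta_posterior(successes, failures, alpha=1.0, beta=1.0):
    """Posterior mean and standard deviation over the 'as if' success
    chance, starting from a Beta(alpha, beta) prior."""
    a, b = alpha + successes, beta + failures
    mean = a / (a + b)
    sd = ((a * b) / ((a + b) ** 2 * (a + b + 1))) ** 0.5
    return mean, sd

print(beta_posterior(6, 4))      # 10 patients:   mean ~0.58, sd ~0.14
print(beta_posterior(60, 40))    # 100 patients:  mean ~0.60, sd ~0.05
print(beta_posterior(600, 400))  # 1000 patients: mean ~0.60, sd ~0.015
```

By a thousand patients the posterior is so tight that quoting a bare "60%" is harmless shorthand, even though nothing beyond exchangeable beliefs about outcomes was ever assumed.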

Crucially, this is different from treating outcomes as independent. If patient outcomes were truly independent for the researchers, then seeing the treatment work in a hundred patients wouldn't affect their expectations for the hundred-and-first. But that's not how clinical knowledge works—consistent success makes doctors more confident in recommending the treatment. In other words, they're updating their map of the world, not uncovering a territory fact about the drug.

This exchangeable approach to patient outcomes captures how we actually learn from clinical data while maintaining certain symmetries in our beliefs—giving us all the practical benefits of "chances" without positing them as objective properties in the world.[7]

3. Machine Learning

When we train models on data, we often assume that the data points are “i.i.d.” (independent and identically distributed). From a de Finetti perspective, this i.i.d. assumption can be seen as an expression of exchangeable beliefs rather than a literal statement about the world. If you start with an exchangeable prior—meaning you assign the same probability to any permutation of your data—then de Finetti’s Representation Theorem says you can treat those observations as if they were generated i.i.d. conditional on some unknown parameter. In other words, you don’t need reality to be i.i.d.; you simply need to structure your beliefs in a way that allows an “as if” i.i.d. interpretation.

This means that when an ML practitioner says, “Assume the data is i.i.d.,” they’re effectively saying, “I have symmetrical (exchangeable) beliefs about the data-generating process.” As new data arrives, you update your posterior on an unknown parameter—much like the urn or medical examples—without ever needing to claim there’s a literal, unchanging probability distribution out there in the territory. Instead, you’ve adopted a coherent, Bayesian viewpoint that models the data as i.i.d. from your perspective, which is enough to proceed with standard inference and learning techniques from statistics and machine learning.
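A toy simulation (ours, with an assumed uniform prior) makes the point vivid: data that are i.i.d. conditional on an unknown parameter are exchangeable, but not independent once you average over the parameter.

```python
import random

random.seed(0)

def two_flips():
    """Two coin flips, i.i.d. *given* theta, with theta ~ Uniform(0, 1)."""
    theta = random.random()
    return random.random() < theta, random.random() < theta

pairs = [two_flips() for _ in range(200_000)]
p_first = sum(a for a, _ in pairs) / len(pairs)
p_second_given_first = sum(a and b for a, b in pairs) / sum(a for a, _ in pairs)

print(round(p_first, 3))               # ~0.5
print(round(p_second_given_first, 3))  # ~0.667: the flips are not independent
```

Analytically, P(second heads | first heads) = E[θ²]/E[θ] = (1/3)/(1/2) = 2/3 under the uniform prior: seeing one head raises your probability for the next, which is exactly the learning-from-data behavior that exchangeability licenses.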

Furthermore, the de Finetti perspective might help shed light on what is going on inside transformers. Some initial attempts have been made to do this rigorously, though we haven’t worked carefully through them, so we can’t ourselves yet fully endorse them. In general, the de Finetti approach seems to vindicate the intuition that a system trained to predict observable variables/events might use a latent variable approach to do so, which of course we see empirically in many ways. Furthermore, it might suggest failure modes of AI systems. Just as humans have reified chances in certain ways, so too might AI systems reify certain latents. This is speculative, and we don’t want the scope of this post to bloat too much, but we think it deserves some thought.

We also suspect that there are connections to Wentworth and Lorell’s Natural Latents [LW · GW] and how they hope to apply it to AI, but looking at the connections in a serious way should be a separate post.

Why This Matters

This approach aligns perfectly with the rationalist emphasis on "the map is not the territory [LW · GW]." Like latitude and longitude, chances are helpful coordinates on our mental map, not fundamental properties of reality. When we say there's a 70% chance of rain, we're not making claims about mysterious properties in the world. Instead, we're expressing beliefs that have certain symmetries, beliefs that let us reason effectively about patterns we observe.

This perspective transforms how we think about statistical inference. When a scientist estimates a parameter or tests a hypothesis, they often talk about finding the "true probability" or "real chance." But now we can see this differently: they're working with beliefs that have certain symmetries, using the mathematical machinery of chance without needing to believe in chances as real things.

Common Objections and Clarifications

"But surely," you might think, "when we flip a fair coin, there really IS a 50% chance of heads!" The pragmatic response is subtle: we're not saying chances don't exist (though the three of us do tend to lean that way). Instead, we're saying we don't need them to exist to vindicate our reasoning. It works just as well if we have exchangeable beliefs about coin flips. The "50% chance" emerges from the symmetries in our beliefs, not from some metaphysical property of the coin.

Some might ask about quantum mechanics, which famously involves probabilities at a fundamental level. Even here, the debate about whether wave function collapse probabilities are "real" or just a device in our predictive models is ongoing. The pragmatic perspective can be extended into interpretations of quantum mechanics, but that's a bigger topic for another post.[8]

Quick Recap

Three key takeaways:

  1. We can talk about chance in purely pragmatic terms.
  2. Exchangeability and de Finetti's theorem show we lose nothing in predictive power.
  3. This viewpoint integrates well with Bayesian rationality and the "map vs. territory" framework.

 

  1. ^
  2. ^
  3. ^

Also, the limiting relative frequency doesn’t change if we prepend any finite number of flips to the sequence, which can mess up the inferences we try to make in the short, medium, or even very long run. In general there are other issues like this, but we’ll keep it brief here.

  4. ^

Of course, chances do play a role in inference, so they do constrain expectations. This makes them not the worst kind of mysterious answer. The upshot of de Finetti’s theorem is that it sifts the useful part of chance from the mysterious part. This allows us to use chance talk without reifying chance.

  5. ^

    There are generalizations of exchangeability, such as partial exchangeability and Markov exchangeability. For exposition, and since it is a core case, we focus here on the basic exchangeability property.

  6. ^

    Of course, there are sophisticated ways to try to bridge this gap, by showing that for a certain class of agents, certain dynamics will render an environment only predictable up to a certain degree.

  7. ^

There is also a deep way in which the de Finetti perspective can help us make sense of randomized controlled trials.

  8. ^

Although it is worth noting that many theories of quantum mechanics—in particular, Everettian and Bohmian quantum mechanics—are perfectly deterministic. Here is a summary of why Everett wanted a probability-free theory—the core idea is that most versions of QM that make reference to chances do so via measurement-induced collapses, which leads into the measurement problem. We think the genuinely chancy theory that is most likely to pan out is something like GRW, which doesn’t have measurement as a fundamental term in the theory. Jeff Barrett’s The Conceptual Foundations of Quantum Mechanics has greatly informed our views on QM, and is a great in-depth introduction.

4 comments

Comments sorted by top scores.

comment by AnthonyC · 2025-01-13T22:59:44.391Z · LW(p) · GW(p)

One pet peeve of mine is that actual weather forecasts for the public don't disambiguate interpretations of rain chance. Is it the chance of any rain at some point in that day or hour? Is it the expected proportion of that day or hour during which it will be raining?

comment by transhumanist_atom_understander · 2025-01-13T20:33:00.624Z · LW(p) · GW(p)

Yes, de Finetti's theorem shows that if our beliefs are unchanged by exchanging members of the sequence, that's mathematically equivalent to having some "unknown probability" that we can learn by observing the sequence.

Importantly, this is always against some particular background state of knowledge, in which our beliefs are exchangeable. We ordinarily have exchangeable beliefs about coin flips, for example, but might not if we had less information (such as not knowing they're coin flips) or more (like information about the initial conditions of each flip).

In my post on unknown probabilities [LW · GW], I give more detail on how they are defined, which turns out to involve a specific state of background knowledge, so they only act like a "true" probability relative to that background knowledge. I also describe how they can be interpreted as part of a physical understanding of the situation.

Personally, rather than observing that my beliefs are exchangeable and inferring an unknown probability as a mathematical fiction, I would rather "see" the unknown probability directly in my understanding of the situation, as described in my post.

comment by winstonBosan · 2025-01-13T19:40:10.634Z · LW(p) · GW(p)

Great stuff! I don't have strong fundamentals in math and statistics but I was still able to hobble along and understand the post. It reminds me of what Rissanen said about data/observation - that data is really all we have, and there is no true state of nature. Our job is to squeeze as much alpha out of observation as possible, instead of trying to find a "true" generator function. This post hit the same spot for me :)

comment by Knight Lee (Max Lee) · 2025-01-14T08:02:51.535Z · LW(p) · GW(p)

Another example: an AI risk skeptic might say that there is only a 10% chance ASI will emerge this decade, there is only a 1% chance the ASI will want to take over the world, and there is only a 1% chance it'll be able to take over the world. Therefore, there is only a 0.001% chance of AI risk this decade.

However he can't just multiply these probabilities since there is actually a very high correlation between them. Within the "territory," these outcomes do not correlate with each other that much, but within the "map," his probability estimates are likely to be wrong in the same direction.

Since chance is in the map and not the territory, anything can "correlate" with anything.

PS: I think not all uncertainty is in the map rather than the territory. In indexical uncertainty [LW · GW], one copy of you will discover one outcome and another copy of you will discover another outcome. This actually is a feature of the territory.