# Why Two Valid Answers Approach is not Enough for Sleeping Beauty

post by Ape in the coat · 2024-02-06T14:21:58.912Z · LW · GW · 12 comments

## Contents

- Introduction
- Two Valid Answers Approach
- Balls in a Box Machine
- Crux of Disagreement
- 12 comments

*This is the sixth post in my series on Anthropics. The previous one is Another Non-Anthropic Paradox: The Unsurprising Rareness of Rare Events [LW · GW]. The next one is Lessons from Failed Attempts to Model Sleeping Beauty Problem [LW · GW].*

## Introduction

When I was writing about Anthropical Motte-Bailey [LW · GW], I had a faint hope that insights from it would be enough to solve the Sleeping Beauty paradox. But of course, it couldn't be that easy. My idea to simply raise the sanity waterline about anthropics to the point where providing a direct answer to this particular problem would be unnecessary also turned out to be wishful thinking. The discourse has gone on for so long and accumulated so much confusion in the process that it requires much more direct attention.

So, now that I've written a couple of preliminary posts and hopefully persuaded you that anthropic problems are not special and can be solved within the realm of probability theory as long as we are being precise, let's do exactly that for the Sleeping Beauty problem.

But first, I'll justify why the demand for an answer isn't unreasonable and why we shouldn't be satisfied with the Two Valid Answers Approach. I'll show what issues this approach has and how it can be used to highlight the crux of disagreement between the two positions. Given all that, I'll then explain what it would actually mean to solve the Sleeping Beauty problem.

## Two Valid Answers Approach

According to the Two Valid Answers Approach, both 1/2 and 1/3 are correct answers, but to different questions. The problem is ambiguously formulated: both of these questions can be considered valid interpretations, and both answers are relevant to different decision theory problems.

The situation is not unlike [LW · GW] the infamous fallen tree in the forest [LW · GW] question. We just need to talk about specific betting odds and stop trying to answer the general and ambiguous question about probability.

I am sympathetic towards this perspective. In my first post on anthropics, I specifically dedicated some space to explaining how different scoring rules can produce different probabilities [LW · GW] in the Sleeping Beauty setting.

But I think it is not the whole story. The Two Valid Answers Approach does resolve some of the confusion, but it also attempts to sweep the rest of it under the rug.

The issue is that people read such explanations, nod in agreement and then do not really change their initial position on the question.

"Well, yes," they say, "I see how the other side's perspective isn't totally absurd, and I'm ready to give them this line of retreat. But now that it's clear that they are answering some *other question*, can we finally agree that *my position* is the correct one?"

Is it just due to the lack of a time travel machine [LW · GW]? I'm not so sure. It really seems that the crux isn't fully resolved yet. After all we are not talking about some imperfectly defined similarity cluster category such as "*sound*". We are talking about "*probability*" - a mathematical concept with a quite precise definition. How come we still have ambiguity about it?

And that's why all the talk about abolishing probability in favor of pure betting odds is problematic [LW · GW]. Not only because, if we agree to it, we lose the ability to talk about likelihoods unless we assign some utility function over events, which is silly enough in itself. But here it works as a curiosity stopper to hide a valid mathematical problem.

Okay, never mind the betting odds. We can just as well talk about probability averaged per experiment and probability averaged per awakening, right?

Well, that's a good question. Are both of them actually valid probabilities with regard to the Sleeping Beauty problem? Yes, people constructed two scoring rules which produce probability-looking results and noticed which questions these results can serve as answers to. But has there been a proper investigation into the matter?
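To make the two averaging schemes concrete, here is a minimal simulation sketch. The function name `simulate`, the seed and the number of runs are illustrative assumptions of mine, not taken from any of the cited sources. It repeats the standard protocol (one awakening on Heads, two on Tails) and counts Heads both per experiment and per awakening.

```python
import random

def simulate(n_experiments, seed=0):
    """Repeat the Sleeping Beauty protocol and return
    (Heads frequency per experiment, Heads frequency per awakening)."""
    rng = random.Random(seed)
    heads_experiments = 0
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(n_experiments):
        heads = rng.random() < 0.5
        # Heads: Monday awakening only. Tails: Monday and Tuesday awakenings.
        awakenings = 1 if heads else 2
        total_awakenings += awakenings
        if heads:
            heads_experiments += 1
            heads_awakenings += 1
    return heads_experiments / n_experiments, heads_awakenings / total_awakenings

per_experiment, per_awakening = simulate(100_000)
print(f"Heads frequency per experiment: {per_experiment:.3f}")
print(f"Heads frequency per awakening:  {per_awakening:.3f}")
```

The first number should come out close to 1/2 and the second close to 1/3. Note that this only shows that the two counting schemes produce different frequencies. It does not by itself tell us which of them, if any, is a valid probability for Beauty's situation.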

## Balls in a Box Machine

As far as I can tell, the idea that both answers are valid should be credited to The end of Sleeping Beauty’s nightmare by Berry Groisman. In this paper, an inanimate setting, allegedly isomorphic to the Sleeping Beauty problem, is described:

An automatic device tosses a fair coin, if the coin lands ‘Tails’ the device puts two red balls in a box, if the coin lands ‘Heads’ it puts one green ball in the box. The device repeats this procedure a large number of times, N. As a result the box will be full of balls of both colours.

Groisman uses it to highlight the confusion:

Now we have arrived at a critical point. The core of the whole confusion is that we tend to regard ‘A (one) green ball is put in the box’ and ‘A green ball is picked out from the box’ as equivalent.

The reason is that the event ‘A green ball is put in the box’ and the event ‘A green ball is picked out from the box’ are two different events, and therefore their probabilities are not necessarily equal. These two events are different because they are the subject to different experimental setups: one is the coin tossing, other is picking up a ball at random from the full box.

And translating back to Sleeping Beauty:

If we mean ‘This awakening is a Head-awakening under the setup of wakening’, then SB's answer to our question should be 1/3, but if we mean ‘The coin landed Heads under the setup of coin tossing’, her answer should be 1/2.
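The difference between the two events in Groisman's machine can be checked numerically. The sketch below is my own illustration, not code from the paper: the function name `balls_in_a_box` and the choice to pick balls with replacement from the full box are assumptions made to keep it short.

```python
import random

def balls_in_a_box(n_tosses, n_picks, seed=0):
    """Fill the box over n_tosses coin tosses, then pick n_picks balls at random.
    Returns (frequency of 'a green ball is put in the box' per toss,
             frequency of 'a green ball is picked out from the box' per pick)."""
    rng = random.Random(seed)
    box = []
    green_put = 0
    for _ in range(n_tosses):
        if rng.random() < 0.5:          # Heads: one green ball
            box.append("green")
            green_put += 1
        else:                           # Tails: two red balls
            box.extend(["red", "red"])
    # Separate procedure: random picks from the full box (with replacement, an assumption).
    green_picked = sum(rng.choice(box) == "green" for _ in range(n_picks))
    return green_put / n_tosses, green_picked / n_picks

put_freq, pick_freq = balls_in_a_box(100_000, 100_000)
print(f"green ball put in the box (per toss): {put_freq:.3f}")
print(f"green ball picked out (per pick):     {pick_freq:.3f}")
```

The first frequency should be near 1/2 and the second near 1/3, which is all Groisman's distinction requires. Notice that the picking step is a separate randomization over the whole box, filled by many tosses.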

There are good things I can say about Groisman's paper. For instance, construction of an inanimate setting is very commendable as it shows that anthropic problems are just probability theory problems. But there are also several issues.

First of all, the way Groisman named the two settings inevitably leads to people being confused in a particular way. They see the words "setup of wakening" and immediately think that this is what the initial question in Sleeping Beauty is supposed to mean. After all, it's about credence that the coin is Heads *on awakening*, not in general, which is, apparently, what "setup of coin tossing" means.

This is not correct, because in Groisman's "wakening setup" there is no coin toss at all. It's simply picking a random element from the set of three. Likewise, in the "coin tossing setup" there are no awakenings or ball pickings. It's just a random sample of two. Groisman specifically made these setups totally disconnected from each other.

But in the Sleeping Beauty Problem they do appear to be connected! If the coin came Heads, then Heads & Monday Awakening always happens. This is the second issue with the paper. While Groisman is correct that the two events do not necessarily have the same probabilities, he doesn't present a compelling argument why it would be the case for the Sleeping Beauty Problem.

In the Balls in a Box Machine there are multiple coin tosses, which fill the box with lots of different balls, from which later, as a separate procedure, one random ball is picked. But in Sleeping Beauty, there is only one coin toss, and it fully determines the awakening routine. Beauty's current awakening isn't selected from all awakenings throughout multiple iterations of the experiment. If Beauty goes to sleep in a particular week of the month, she can't possibly wake up during a previous one.

If we assume that the probability of picking the green ball from the box is the actual question, then halfers will notice that the Balls in a Box Machine is not faithful to the Sleeping Beauty setup, since it includes this different randomization procedure. A more accurate version would be something like this:

An automatic device tosses a fair coin, if the coin lands ‘Tails’ the device puts two red balls in a box, if the coin lands ‘Heads’ it puts one green ball in the box. Then it picks a random ball from the box.

But then thirders will demand that there have to be two ball picks on Tails, and we are back to square one: the argument about whether we are supposed to average per experiment or per awakening. And no deep investigation into the validity of such probabilities was apparently done.
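Here is a small sketch of the single-toss machine that shows where the two sides part ways. The function `single_toss_machine` and its `picks_on_tails` parameter are hypothetical names introduced for illustration. With one pick per run it is the halfer-friendly version described above, and with two picks on Tails it is the version thirders would ask for.

```python
import random

def single_toss_machine(n_runs, picks_on_tails, seed=0):
    """Each run: one coin toss, fill the box, then pick from it.
    Returns (green frequency per run, green frequency per pick)."""
    rng = random.Random(seed)
    green_runs = 0
    green_picks = 0
    total_picks = 0
    for _ in range(n_runs):
        heads = rng.random() < 0.5
        box = ["green"] if heads else ["red", "red"]
        n_picks = 1 if heads else picks_on_tails
        for _ in range(n_picks):
            total_picks += 1
            if rng.choice(box) == "green":
                green_picks += 1
        if heads:
            green_runs += 1
    return green_runs / n_runs, green_picks / total_picks

one_pick = single_toss_machine(100_000, picks_on_tails=1)
two_picks = single_toss_machine(100_000, picks_on_tails=2)
print("one pick on Tails:  per run %.3f, per pick %.3f" % one_pick)
print("two picks on Tails: per run %.3f, per pick %.3f" % two_picks)
```

With one pick on Tails both frequencies are about 1/2. With two picks on Tails the per-run frequency stays near 1/2 while the per-pick frequency drops to about 1/3. The simulation reproduces the disagreement rather than resolving it.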

## Crux of Disagreement

This analysis, however, wasn't all in vain. Despite the fact that our adventures with Balls in a Box Machine went full circle, they can help to highlight the crux of disagreement between halfers and thirders.

Notice that initially thirders were not demanding that two balls be picked on Tails. Why is that? Because there was already random sampling from three states going on. Likewise, before we assumed that it's the ball picking scheme that matters, halfers were not arguing against the applicability of Groisman's model to the Sleeping Beauty problem. Why is that? Because there was random sampling from two states going on.

Both thirders and halfers can accept the initial formulation of the Balls in a Box Machine, because it includes the kind of random sampling that they believe is relevant for the Sleeping Beauty problem. The disagreement about which sampling that is, is our crux. Or, in other words, it's the disagreement about what "*this awakening*" means and thus how it should be treated via probability theory.

Thirders believe that *this awakening* should be treated as randomly sampled from three possible awakening states. Halfers believe that *this awakening* should be treated as randomly sampled from two possible states, corresponding to the result of a coin toss. This is an objective disagreement, that can be formulated in terms of probability theory and at least one side inevitably has to be in the wrong. This is the unresolved issue that we can't simply dismiss because both sides have a point.

To solve the Sleeping Beauty paradox is to resolve this disagreement. And that's what I'm going to do. As a result, we are supposed to get a clear mathematical model for the Sleeping Beauty problem, generalizable to other problems that include memory erasure, rigorous with regard to betting, and not contradicting fundamental principles such as the Law of Conservation of Expected Evidence. And, of course, everything should be justified in terms of probability theory, not vague philosophical concepts.

In order to be properly thorough, I'll have to engage with multiple philosophical papers written on the topic throughout the decades, so this is going to take several posts. The next one will be dedicated to multiple ways people have unsuccessfully tried to model the Sleeping Beauty problem.

*The next post in the series is Lessons from Failed Attempts to Model Sleeping Beauty Problem [LW · GW].*

## 12 comments

Comments sorted by top scores.

## comment by simon · 2024-02-06T19:13:04.184Z · LW(p) · GW(p)

Thirders believe that *this awakening* should be treated as randomly sampled from three possible awakening states. Halfers believe that *this awakening* should be treated as randomly sampled from two possible states, corresponding to the result of a coin toss. This is an objective disagreement, that can be formulated in terms of probability theory and at least one side inevitably has to be in the wrong. This is the unresolved issue that we can't simply dismiss because both sides have a point.

If you make some assumptions about sampling, probability theory will give one answer, with other assumptions probability theory will give another answer. So both can be defended with probability theory, it depends on the sampling assumptions. And there isn't necessarily any sampling assumption that's objectively correct here.

By the way I normally agree with thirders in terms of my other assumptions about anthropics, but in the case of Sleeping Beauty since it's particularly formulated to separate the multiple awakenings from impacting on the rest of the world including the past and future, I think the halfer sampling assumption isn't necessarily crazy.

Replies from: Ape in the coat

## ↑ comment by Ape in the coat · 2024-02-09T06:37:39.614Z · LW(p) · GW(p)

If you make some assumptions about sampling, probability theory will give one answer, with other assumptions probability theory will give another answer.

True, but then you may happen to have issues with some of your other assumptions. And in the Sleeping Beauty case, as I'm going to show in my next post, there are indeed troubles justifying the thirders' sampling assumption given the other conditions of the setting.

So both can be defended with probability theory, it depends on the sampling assumptions. And there isn't necessarily any sampling assumption that's objectively correct here.

Not necessarily. But still possible. And this is a direction that needs to be properly explored.

By the way I normally agree with thirders in terms of my other assumptions about anthropics, but in the case of Sleeping Beauty since it's particularly formulated to separate the multiple awakenings from impacting on the rest of the world including the past and future, I think the halfer sampling assumption isn't necessarily crazy.

I'm giving you a strong upvote for this. It's rare to find a person who notices that Sleeping Beauty is quite different from other "anthropic problems" such as incubator problems.

Replies from: simon

## ↑ comment by simon · 2024-02-09T17:43:30.586Z · LW(p) · GW(p)

And in the Sleeping Beauty case, as I'm going to show in my next post, there are indeed troubles justifying the thirders' sampling assumption given the other conditions of the setting.

I look forward to seeing your argument.

I'm giving you a strong upvote for this. It's rare to find a person who notices that Sleeping Beauty is quite different from other "anthropic problems" such as incubator problems.

Thanks! But I can't help but wonder if one of your examples of someone who doesn't notice is my past self making the following comment (in a thread for one of your previous posts) which I still endorse:

https://www.lesswrong.com/posts/HQFpRWGbJxjHvTjnw/anthropical-motte-and-bailey-in-two-versions-of-sleeping?commentId=dkosP3hk3QAHr2D3b [LW(p) · GW(p)]

I certainly agree that one can have philosophical assumptions such that you sample differently for Sleeping Beauty and Incubator problems, and indeed I would not consider the halfer position particularly tenable in Incubator, whereas I do consider it tenable in Sleeping Beauty.

But ... I did argue in that comment that it is still possible to take a consistent thirder position on both. (In the comment I take the thirder position for sleeping beauty for granted, and argue for it still being possible to apply to Incubator (rather than the other way around, despite being more pro-thirder for Incubator), specifically to rebut an argument in that earlier post of yours that the classic thirder position for Sleeping Beauty didn't apply to Incubator).

Some clarification of my actual view here (rather than my defense of conventional thirderism):

In my view, sampling is not something that occurs in reality when the "sampling" in question includes sampling between multiple entities that both exist. Each of the entities that actually exists actually exists, and any "sampling" between multiple such entities occurs (only) in the mind of the observer. (However, it can still mix with conventional sampling in the mind of the observer.) Which sampling assumption you use in such cases is in principle arbitrary but in practice should probably be based on how much you care about the correctness of the beliefs of each of the possible entities you are uncertain about being.

Halferism or thirderism for Sleeping Beauty are both viable, in my view, because one could argue for caring equally about being correct at each awakening (resulting in thirderism) or one could argue for caring equally about being correct collectively in the awakenings for each of the coin results (resulting in halferism). There isn't any particular "skin in the game" to really force a person to make a commitment here.

## comment by Signer · 2024-02-07T06:16:38.387Z · LW(p) · GW(p)

The issue is that people read such explanations, nod in agreement and then do not really change their initial position on the question.

For what it's worth, I changed my position from "it's confusing, both answers seem to be inferred by reasonable steps" to "it depends on what you are trying to do" and stayed there without reverting to any single answer.

But here it works as a curiosity stopper to hide a valid mathematical problem.

If you try to minimize all curiosity stoppers, you become a philosopher. I don't mind inventing some additional math and discussing it - it may even be useful in some broad range of cases. But if the original problem is undefined, then stating it is progress that shouldn't be undone.

We are talking about “probability”—a mathematical concept with a quite precise definition.

Yeah, that's the problem - specific probabilities are not defined, they depend on arbitrary division of outcomes.

And, of course, everything should be justified in terms of probability theory, not vague philosophical concepts.

Probability theory does not specify an algorithm for translating English to an outcome space.

Replies from: Ape in the coat

## ↑ comment by Ape in the coat · 2024-02-07T06:51:27.670Z · LW(p) · GW(p)

I don't mind inventing some additional math and discussing it - it may even be useful in some broad range of cases. But if the original problem is undefined, then stating it is progress that shouldn't be undone.

Completely agree. The thing is, there is enough to discuss even without inventing any additional math. Apparently people were so eager to invent something new and exotic that they didn't make sure they actually comply with the basic stuff.

Yeah, that's the problem - specific probabilities are not defined, they depend on arbitrary division of outcomes.

So I thought. And then I tried to actually check and now I have enough material for several posts specifically about Sleeping Beauty.

Probability theory does not specify an algorithm for translating English to an outcome space.

Yep, this is indeed a problem.

However, as math has the property of conserving truth statements, we can notice when we have made a mistake. When our model produces paradoxical results, it's a very good hint that we've made some wrong assumption while modeling a problem. My next post is going to be about it.

## comment by Ben (ben-lang) · 2024-02-15T11:16:25.436Z · LW(p) · GW(p)

Your arguments are interesting, but I find myself still believing that no single-probability answer is really quite right for the problem.

When you tell me that we select a random marble from a bag, I know what that means. When you tell me that I will either awaken twice, or just once, and we can select a random "awakening", then yes, somehow of the three awakenings we draw on the tree the answer is 1/3. But two of those awakenings occur sequentially, while one is mutually exclusive with the two.

From the perspective of Beauty herself all of those awakenings are, in some sense, indistinguishable from one another - they entail (or may entail) identical subjective experiences. If someone came along and said that in such cases you have to use Bose statistics [1], then I would feel like they were wrong - but I would not be able to definitively tell them they were. Maybe identical states of mind (from memory wipes, or human copying machines or other anthropic stuff) need to be treated Bose-like.

[1] Bose statistics occur in quantum physics where combinatorials come up. An excellent way of thinking of them is given in the book "The Beginning of Infinity" by David Deutsch: given a bag of beads, when one bead is removed, we can ask "what is the probability that this specific bead will be removed" in a sensible way. However, if your (electronic) bank balance reads $100, and you spend $1 on your card, it just doesn't make sense to ask "what is the probability that the dollar removed is the *first* dollar in my account?". The account is just a quantity that is incremented down. With quantum particles it turns out (experimentally) that the different end states occur with probabilities as if we had one field (like an electronic bank account) for each distinguishable type of particle. In the limit where all the particles are distinguishable (each bank account reads either 0 or 1) then the statistics returns to normal. https://en.wikipedia.org/wiki/Bose%E2%80%93Einstein_statistics This is one reason why people often like to talk about quantum fields instead of particles - the fields act like ledgers that increment up and down, not like collections of marbles.

Saying that when you spend $1 on card "it removes a random dollar from the account" is not even wrong, it is in entirely the wrong frame. Similarly, if I am going into a copying machine that will produce many identical copies of me, then saying "after the copying, I will be a random one of those people" is also (arguably) not even wrong, but in the wrong frame. How much do these things change if, instead of asking "just before" copying or going to sleep we ask just after waking up or stepping out of the copy machine?

Replies from: Ape in the coat

## ↑ comment by Ape in the coat · 2024-02-15T13:10:09.310Z · LW(p) · GW(p)

Well yes, I didn't really expect that this post on its own would be persuasive enough. But I hope it gave you enough curiosity not to entirely dismiss the idea that a solution may be possible and that searching for it through deeper analysis of the problem isn't just a fool's errand.

Let's see how my next two posts are going to change your beliefs about the matter.

Replies from: ben-lang

## ↑ comment by Ben (ben-lang) · 2024-02-15T13:27:25.657Z · LW(p) · GW(p)

I am interested to see your deeper analysis. To clarify, I am not saying "there is nothing more interesting to discuss on sleeping beauty and anthropic stuff", my take is more that when you do your deeper analysis you should be open to the idea that maybe the question is confused, underspecified or otherwise in need of refining.

## comment by jessicata (jessica.liu.taylor) · 2024-02-07T02:27:49.105Z · LW(p) · GW(p)

- halfers have to condition on there being at least one observer in the possible world. if the coin can come up 0,1,2 at 1/3 each, and Sleeping Beauty wakes up that number of times, halfers still think the 0 outcome is 0% likely upon waking up.
- halfers also have to construct the reference class carefully. if there are many events of people with amnesia waking up once or twice, and SSA's reference class consists of the set of awakenings from these, then SSA and SIA will agree on a 1/3 probability. this is because in a large population, about 1/3 of awakenings are in worlds where the coin came up such that there would be one awakening.

## ↑ comment by Ape in the coat · 2024-02-07T06:36:30.633Z · LW(p) · GW(p)

halfers also have to construct the reference class carefully. if there are many events of people with amnesia waking up once or twice, and SSA's reference class consists of the set of awakenings from these, then SSA and SIA will agree on a 1/3 probability.

This is a good example of what I meant by

The discourse has gone on for so long and accumulated so much confusion in the process that it requires much more direct attention.

You don't need to think about SSA, SIA to solve the Sleeping Beauty problem at all.

All you need is to construct an appropriate probability space and use basic probability theory instead of inventing clever reasons why it doesn't apply in this particular case. But people failed to correctly model the problem and then started generalizing their failed approaches as a great opportunity to write more philosophy papers and so here we are, seriously talking about how participation in a Sleeping Beauty experiment should give you an ability to switch bodies.

if the coin can come up 0,1,2 at 1/3 each, and Sleeping Beauty wakes up that number of times, halfers still think the 0 outcome is 0% likely upon waking up.

Am I missing something? How is it at all controversial? If, by design of the experiment, event E doesn't happen when the random number generator produces outcome O, and then you observe event E happening, you lawfully update in favor of outcome O not happening.

Replies from: jessica.liu.taylor

## ↑ comment by jessicata (jessica.liu.taylor) · 2024-02-08T00:59:21.003Z · LW(p) · GW(p)

All you need is to construct an appropriate probability space and use basic probability theory instead of inventing clever reasons why it doesn’t apply in this particular case.

I don't see how to do that but maybe your plan is to get to that at some point

Am I missing something? How is it at all controversial?

it's not, it's just a modification on the usual halfer argument that "you don't learn anything upon waking up"

Replies from: Ape in the coat

## ↑ comment by Ape in the coat · 2024-02-08T04:42:06.586Z · LW(p) · GW(p)

I don't see how to do that but maybe your plan is to get to that at some point

Yep, that's exactly what I'm going to do in the post after the next one.

it's not, it's just a modification on the usual halfer argument that "you don't learn anything upon waking up"

Isn't it obvious that the correct interpretation of "you don't learn anything upon waking up" is not about all possible settings where going to sleep and waking up happens, but about the type of settings where some event happens on every outcome? That it's just about conservation of expected evidence? If the random generator produces outcomes O1, O2, ... On, and on every outcome event E always happens, then observation of event E doesn't allow you to distinguish between any of the outcomes of the random number generator.
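The two cases discussed in this thread can be written out as a tiny worked example. This is only a sketch under a stated assumption: event E is modeled as "Beauty is awakened at least once in the experiment", and the helper `update` is a hypothetical name for a plain application of Bayes' rule.

```python
from fractions import Fraction

def update(prior, likelihood):
    """Bayes' rule over a finite set of outcomes.
    prior[o] = P(o), likelihood[o] = P(E | o); returns P(o | E)."""
    joint = {o: prior[o] * likelihood[o] for o in prior}
    evidence = sum(joint.values())
    return {o: p / evidence for o, p in joint.items()}

# Modified setting: 0, 1 or 2 awakenings, each with prior probability 1/3.
# E never happens when there are 0 awakenings (assumption: E = "awakened at least once").
third = Fraction(1, 3)
posterior_modified = update({0: third, 1: third, 2: third}, {0: 0, 1: 1, 2: 1})

# Standard setting: E happens on every outcome, so observing it is not evidence.
half = Fraction(1, 2)
posterior_standard = update({"Heads": half, "Tails": half}, {"Heads": 1, "Tails": 1})

print(posterior_modified)
print(posterior_standard)
```

In the modified setting the outcome with zero awakenings is ruled out and the remaining two share the probability equally. In the standard setting the posterior equals the prior, which is what conservation of expected evidence demands when E is certain on every outcome.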