Sleeping Beauty Resolved?

ksvanhorn

post by ksvanhorn · 2018-05-22T14:13:05.364Z · LW · GW · 77 comments

[This is Part 1. See also Part 2.]

Introduction

The Sleeping Beauty problem has been debated ad nauseam since Elga's original paper [Elga2000], yet no consensus has emerged on its solution. I believe this confusion is due to the following errors of analysis:

Failure to properly apply probability theory.
Failure to construct legitimate propositions for analysis.
Failure to include all relevant information.

The only analysis I have found that avoids all of these errors is in Radford Neal's underappreciated technical report on anthropic reasoning [Neal2007]. In this note I'll discuss how both “thirder” and “halfer” arguments exhibit one or more of the above errors, how Neal's analysis avoids them, and how the conclusions change when we alter the scenario in various ways.

As a reminder, this is the Sleeping Beauty problem:

1. On Sunday the steps of the experiment are explained to Beauty, and she is put to sleep.
2. On Monday Beauty is awakened. While awake she obtains no information that would help her infer the day of the week. Later in the day she is put to sleep again.
3. On Tuesday the experimenters flip a fair coin. If it lands Tails, Beauty is administered a drug that erases her memory of the Monday awakening, and step 2 is repeated.
4. On Wednesday Beauty is awakened once more and told that the experiment is over.

The question is this: when awakened during the experiment, what probability should Beauty give that the coin in step 3 lands Heads?

“Halfers” argue that the answer is 1/2, and “thirders” argue that the answer is 1/3. I will argue that any answer between 1/2 and 1/3 may be obtained, depending on details not specified in the problem description; but under reasonable assumptions the answer is slightly more than 1/3. Furthermore,

halfers employ valid reasoning but get the wrong answer because they omit some seemingly irrelevant information; and
thirders (except Neal) employ invalid reasoning that nonetheless arrives at (nearly) the right answer.

The standard framework for solving probability problems

There are actually three separate “Heads” probabilities that arise in this problem:

$p_{1}$ , the probability that Beauty should give for Heads on Sunday.
$p_{2}$ , the probability that Beauty should give for Heads on Monday/Tuesday.
$p_{3}$ , the probability that Beauty should give for Heads on Wednesday.

There is agreement that $p_{1} = p_{3} = 1 / 2$ , but disagreement as to whether $p_{2} = 1 / 2$ or $p_{2} = 1 / 3$ . What does probability theory tell us about how to approach this problem? The $p_{i}$ are all epistemic probabilities, and they are all probabilities for the same proposition—coin lands “Heads”—so any difference can only be due to different information possessed by Beauty in the three cases. The proper procedure for answering the question is then the following:

1. Construct a probabilistic model $M$ incorporating the information common to the three cases. That is, choose a set of variables that describe the situation and posit an explicit joint probability distribution over these variables. We'll assume that $M$ includes a variable $H$ which is true if the coin lands heads and false otherwise.
2. Identify propositions $X_{1}$, $X_{2}$, and $X_{3}$ expressing the additional information (if any) that Beauty has available in each of the three cases, beyond what is already expressed by $M$.
3. Then $p_{i} = \Pr(H \mid X_{i}, M)$, which can be computed using the rule for conditional probabilities (Bayes' Rule).

Since Beauty does not forget anything she knows on Sunday, we can take $M$ to express everything she knows on Sunday, and $X_{1}$ to be null (no additional information).

Failure to properly apply probability theory

With the exception of Neal, thirders do not follow the above process. Instead they posit one model $M_{1}$ for the first and third cases, which they then toss out in favor of an entirely new model $M_{2}$ for the second case. This is a fundamental error.

To be specific, $M_{1}$ is something like this:

$$\begin{aligned} H &\sim \mathrm{Bernoulli}(1/2) \\ W_{M} &= \mathrm{true} \\ W_{T} &= \text{not } H \end{aligned}$$

where $W_{M}$ means that Beauty wakes on Monday, $W_{T}$ means that Beauty wakes on Tuesday, and $\mathrm{Bernoulli}(p)$ is the distribution on $\{\mathrm{false}, \mathrm{true}\}$ that assigns probability $p$ to true. $X_{3}$ would be Beauty's experiences and observations from the last time she awakened, and this is implicitly assumed to be irrelevant to whether $H$ is true, so that

$$p_{3} = \Pr(H \mid X_{3}, M_{1}) = \Pr(H \mid M_{1}) = p_{1}.$$

Thirders then usually end up positing an $M_{2}$ that is equivalent to the following:

$$\begin{aligned} (H_{M}, T_{M}, T_{T}) &\sim \mathrm{Categorical}(1/3, 1/3, 1/3) \\ H &= H_{M} \\ W_{M} &= \mathrm{true} \\ W_{T} &= \text{not } H \end{aligned}$$

The first line above means that $H_{M}$ , $T_{M}$ , and $T_{T}$ are mutually exclusive, each having probability 1/3.

$H_{M}$ means “the coin lands Heads, and it is Monday,”
$T_{M}$ means “the coin lands Tails, and it is Monday,” and
$T_{T}$ means “the coin lands Tails, and it is Tuesday.”

$M_{2}$ is not derived from $M_{1}$ via conditioning on any new information $X_{2}$ ; instead thirders construct an argument for it de novo. For example, Elga's original paper [Elga2000] posits that, if the coin lands Tails, Beauty is told it is Monday just before she is put to sleep again, and declares by fiat that her probability for $H$ at this point should be $1 / 2$ ; he then argues backwards from there as to what her probability for $H$ had to have been prior to being told it is Monday.

A red herring: betting arguments

Some thirders also employ betting arguments. Suppose that on each Monday/Tuesday awakening Beauty is offered a bet in which she wins $2 if the coin lands Tails and loses $3 if it lands Heads. Her expected gain is positive ($0.50) if she accepts the bets, since she has two awakenings if the coin lands Tails, yielding $4 in total, but will have only one awakening and lose only $3 if it lands Heads. Therefore she should accept the bet; but if she uses a probability of 1/2 for Heads on each awakening she computes a negative expected gain (-$0.50) and will reject the bet.
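The arithmetic of this betting argument can be checked with a short sketch. The Python below only illustrates the stakes stated above; the names `gain_per_experiment`, `gain_per_awakening`, `WIN_ON_TAILS`, and `LOSS_ON_HEADS` are hypothetical and chosen for this example.

```python
from fractions import Fraction

WIN_ON_TAILS = 2   # dollars won on each accepted bet if the coin lands Tails
LOSS_ON_HEADS = 3  # dollars lost on each accepted bet if the coin lands Heads
HALF = Fraction(1, 2)


def gain_per_experiment(p_heads=HALF, tails_awakenings=2, heads_awakenings=1):
    """Expected total gain over the whole experiment if Beauty always accepts."""
    return ((1 - p_heads) * tails_awakenings * WIN_ON_TAILS
            - p_heads * heads_awakenings * LOSS_ON_HEADS)


def gain_per_awakening(p_heads):
    """Expected gain of a single bet when Heads is given probability p_heads."""
    return (1 - p_heads) * WIN_ON_TAILS - p_heads * LOSS_ON_HEADS


print(gain_per_experiment())               # whole-experiment accounting
print(gain_per_awakening(HALF))            # single bet evaluated at Pr(Heads) = 1/2
print(gain_per_awakening(Fraction(1, 3)))  # single bet evaluated at Pr(Heads) = 1/3
```

The first value is positive and the second negative, which is the mismatch the betting argument relies on.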

One can argue that Beauty is using the wrong decision procedure in the above argument, but there is a more fundamental point to be made: probability theory is logically prior to decision theory. That is, probability theory can be developed, discussed, and justified [Cox1946, VanHorn2003, VanHorn2017] entirely without reference to decision theory, but the concepts of decision theory rely on probability theory. If our probabilistic model yields $p$ as Beauty's probability of Heads, and plugging this probability into our decision theory yields suboptimal results evaluated against that same model, then this is a problem with the decision theory; perhaps a more comprehensive theory is required [Yudkowsky&Soares2017].

Failure to construct legitimate propositions for analysis

Another serious error in many discussions of this problem is the use of supposedly mutually exclusive “propositions” that are neither mutually exclusive nor actually legitimate propositions. $H_{M}$ , $T_{M}$ , and $T_{T}$ can be written as

$$\begin{aligned} H_{M} &= H \text{ and (it is Monday)} \\ T_{M} &= (\text{not } H) \text{ and (it is Monday)} \\ T_{T} &= (\text{not } H) \text{ and (it is Tuesday)}. \end{aligned}$$

These are not truly mutually exclusive because, if $not H$ , then Beauty will awaken on both Monday and Tuesday. Furthermore, the supposed propositions “it is Monday” and “it is Tuesday” are not even legitimate propositions. Epistemic probability theory is an extension of classical propositional logic [Cox1946, VanHorn2003, VanHorn2017], and applies only to entities that are legitimate propositions under the classical propositional logic—but there is no “now,” “today,” or “here” in classical logic.

Both Elga's paper and much of the other literature on the Sleeping Beauty problem discuss the idea of “centered possible worlds,” each of which is equipped with a designated individual and time, and corresponding “centered propositions”—which are not propositions of classical logic. To properly reason about “centered propositions” one would need to either translate them into propositions of classical logic, or develop an alternative logic of centered propositions; yet none of these authors propose such an alternative logic. Even were they to propose such an alternative logic, they would then need to re-derive an alternative probability theory that is the appropriate extension of the alternative propositional logic.

However, it is doubtful that this alternative logic is necessary or desirable. Time and location are important concepts in physics, but physics uses standard mathematics based on classical logic that has no notion of “now”. Instead, formulas are explicitly parameterized by time and location. Before we go off inventing new logics, perhaps we should see if standard logic will do the job, as it has done for all of science to date.

Failure to include all relevant information

Lewis [Lewis2001] argues for a probability of $p_{2} = 1 / 2$ , since Sunday's probability of Heads is $p_{1} = 1 / 2$ , and upon awakening on Monday or Tuesday Beauty has no additional information—she knew that she would experience such an awakening regardless of how the coin lands. That is, Lewis bases his analysis on $M_{1}$ , and assumes that $X_{2}$ contains no information relevant to the question:

$$p_{2} = \Pr(H \mid X_{2}, M_{1}) = \Pr(H \mid M_{1}) = p_{1}.$$

Lewis's logic is correct, but his assumption that $X_{2}$ contains no information of relevance is wrong. Surprisingly, Beauty's stream of experiences after awakening is relevant information, even given that her prior distribution for what she may experience upon awakening has no dependence on the day or the coin toss. Neal discusses this point in his analysis of the Sleeping Beauty problem. He introduces the concept of “full non-indexical conditioning,” which roughly means that we condition on everything, even stuff that seems irrelevant, because often our intuition is not that good at identifying what is and is not actually relevant in a probabilistic analysis. Neal writes,

…note that even though the experiences of Beauty upon wakening on Monday and upon wakening on Tuesday (if she is woken then) are identical in all "relevant" respects, they will not be subjectively indistinguishable. On Monday, a fly on the wall may crawl upwards; on Tuesday, it may crawl downwards. Beauty's physiological state (heart rate, blood glucose level, etc.) will not be identical, and will affect her thoughts at least slightly. Treating these and other differences as random, the probability of Beauty having at some time the exact memories and experiences she has after being woken this time is twice as great if the coin lands Tails than if the coin lands Heads, since with Tails there are two chances for these experiences to occur rather than only one.

Bayes' Rule, applied with equal prior probabilities for Heads and Tails, then yields a posterior probability for Tails that is twice that of Heads; that is, the posterior probability of Heads is $1 / 3$ .

Defining the model

Verbal arguments are always suspect when it comes to probability puzzles, so let's actually do the math. Our first step is to extend $M_{1}$ to model $M$ as follows:

$n_{d}$ is the number of distinguishable subjective moments Beauty experiences on day $d$ . For simplicity we take this to be finite. If $not W_{d}$ (Beauty does not awaken on day $d$ ) then $n_{d} = 0$ .
$P_{d t}$ is the stream of perceptions that Beauty remembers experiencing since awakening, as of moment $t$ on day $d$ , for $1 \leq t \leq n_{d}$ . For simplicity we assume that there are only a finite number of distinguishable perceptions Beauty can have and remember at any point in time, as her senses are not infinitely discriminating and her memory has finite capacity. Thus there are only a finite number of possible values for $P_{d t}$ .
$P_{d}$ is the sequence $P_{d, 1}, \dots, P_{d, n_{d}}$ . We do not specify a particular joint distribution for $P_{M}$ and $P_{T}$ conditional on $H$ (which determines $W_{M}$ and $W_{T}$ ), but analyze what happens for arbitrary distributions.
Define $R (y, d)$ to mean that at some point on day $d$ , Beauty's remembered stream of experiences since awakening is $y$ ; that is,

$$R(y, d) \triangleq (P_{d t} = y \text{ for some } 1 \leq t \leq n_{d}).$$

Note that $R (y, d)$ is false if $W_{d}$ is false.
The fact that Beauty “obtains no information that would help her infer the day of the week” (nor whether $H$ is true) means that

$$\Pr(R(y, M) \mid H, M) = \Pr(R(y, M) \mid \text{not } H, M) = \Pr(R(y, T) \mid \text{not } H, M).$$

We write $p (y)$ for any of these three equal probabilities.
The fact that $H$ implies $not W_{T}$ means that

$$\Pr(R(y, T) \mid H, M) = 0.$$

The statement of the problem implies that the distributions for $P_{M}$ and $P_{T}$ , conditional on $H$ being false, are identical, but not necessarily independent. Therefore we also define

$$q(y) \triangleq \Pr(R(y, T) \mid R(y, M), \text{not } H, M).$$

Writing $y$ for the stream of perceptions Beauty remembers experiencing since being awakened, her additional information after awakening is

$$X_{2} \triangleq (R(y, M) \text{ or } R(y, T)).$$

This is the crucial point. Usually we apply Bayes' Rule by conditioning on the information that some variable has some specific value, or that its value lies in some specific range. It is very unusual to condition on this sort of disjunction (OR) of possibilities, where we know the value but not which specific variable has that value. This novelty may explain why the Sleeping Beauty problem has proven so difficult to analyze.

Analysis

The prior for $H$ is even odds:

$$\Pr(H \mid M) = \Pr(\neg H \mid M) = \frac{1}{2}.$$

The likelihoods are

$$\Pr(X_{2} \mid H, M) = \Pr(R(y, M) \mid H, M) = p(y)$$

and

$$\begin{aligned} \Pr(X_{2} \mid \neg H, M) &= \Pr(R(y, M) \mid \neg H, M) + \Pr(R(y, T) \mid \neg H, M) - \Pr(R(y, M) \text{ and } R(y, T) \mid \neg H, M) \\ &= p(y) + p(y) - p(y)\, q(y) \\ &= p(y)\,(2 - q(y)). \end{aligned}$$

Applying Bayes' Rule we obtain

$$p_{2} = \Pr(H \mid X_{2}, M) = \frac{1/2 \cdot p(y)}{1/2 \cdot p(y) + 1/2 \cdot p(y)\,(2 - q(y))} = \frac{1}{3 - q(y)}.$$
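The Bayes' Rule step can also be carried out numerically. The following is a minimal sketch, assuming only the prior and likelihoods stated above; the function name `posterior_heads` and its parameter names are hypothetical. It illustrates that $p (y)$ cancels, so that the posterior depends on $q (y)$ alone.

```python
from fractions import Fraction


def posterior_heads(p_y, q_y, prior_heads=Fraction(1, 2)):
    """Pr(H | X_2, M) from the likelihoods p(y) and p(y) * (2 - q(y))."""
    like_heads = p_y               # Pr(X_2 | H, M)
    like_tails = p_y * (2 - q_y)   # Pr(X_2 | not H, M)
    numerator = prior_heads * like_heads
    return numerator / (numerator + (1 - prior_heads) * like_tails)


# Two different values of p(y) with the same q(y) give the same posterior.
print(posterior_heads(Fraction(1, 8), Fraction(1, 8)))
print(posterior_heads(Fraction(1, 1000), Fraction(1, 8)))
```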

Now let's consider various possibilities:

$p (y) ≪ 1$ , $P_{M}$ and $P_{T}$ are independent. This is Neal's assumption, and it is a reasonable one. If the perceptions and memories Beauty can distinguish are sufficiently fine-grained, then $p (y)$ will be small for any given $y$ . Then $q (y) = p (y)$ , hence $q (y) ≪ 1$ , hence $p_{2}$ is slightly greater than $1 / 3$ .
$q (y) = 0$ . For example, the experimenters might have a Probabilist's Urn containing a white marble and a black marble. On Monday they reach in and choose one marble at random, placing it on the nightstand next to Beauty's bed where she will see it as soon as they awaken her. If they awaken her on Tuesday also, then they first remove Monday's marble from the nightstand and replace it with the one remaining in the urn. This guarantees that whatever Beauty experiences on Monday, she will experience something different on Tuesday. Then $p_{2} = 1 / 3$ exactly.
$q (y) = 1$ . That is, whatever Beauty experiences on Monday, she experiences the exact same thing on Tuesday (if she awakens on Tuesday). This could be the case if Beauty is an AI whose perceptions the experimenters completely control. Then $p_{2} = 1 / 2$ .
$q (y) = 2^{- k}$ , $k \geq 0$ . Suppose that Beauty is an AI and her only sensory experience on awakening is to receive as input a sequence of 0/1 values, chosen independently at random with equal probabilities. Then after receiving the first $k$ values in the sequence we have $q (y) = p (y) = 2^{- k}$ and therefore

$$p_{2} = \frac{1}{3 - 2^{-k}}.$$

That is, at the moment of awakening $p_{2} = 1 / 2$ . When Beauty receives the first value in the sequence then $p_{2} = 1 / 2.5$ . When she receives the second value then $p_{2} = 1 / 2.75$ , and so on, $p_{2}$ asymptotically approaching $1 / 3$ from above as she receives more of the sequence.
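As a sanity check on the random-bit case, the posterior can be computed by brute-force enumeration instead of the closed form. This is a sketch under the assumptions of that case only (a fair coin, and independent, uniformly random bit streams on Monday and Tuesday); the function name `posterior_heads_by_enumeration` is hypothetical. It conditions on the disjunction "the observed prefix $y$ occurred on Monday or on Tuesday".

```python
from fractions import Fraction
from itertools import product


def posterior_heads_by_enumeration(k):
    """Exact Pr(H | R(y, M) or R(y, T)) after Beauty has received k random bits."""
    y = (0,) * k  # observed prefix; by symmetry any fixed y gives the same result
    streams = list(product((0, 1), repeat=k))
    w = Fraction(1, len(streams))  # probability of each k-bit stream
    half = Fraction(1, 2)          # fair coin

    heads_and_x2 = Fraction(0)
    tails_and_x2 = Fraction(0)
    for monday in streams:
        # Heads: Beauty is awake on Monday only.
        if monday == y:
            heads_and_x2 += half * w
        # Tails: Beauty is awake on both days, with independent streams.
        for tuesday in streams:
            if monday == y or tuesday == y:
                tails_and_x2 += half * w * w
    return heads_and_x2 / (heads_and_x2 + tails_and_x2)


for k in range(4):
    print(k, posterior_heads_by_enumeration(k))
```

For each $k$ the enumerated value can be compared against $1 / (3 - 2^{- k})$.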

In fact, any value for $p_{2}$ between $1 / 2$ and $1 / 3$ is possible. Let $q = 3 - 1 / p$ , for any desired $p$ in the range $1 / 3 \leq p \leq 1 / 2$ . If the experimenters set things up so that, with probability $q$ , Beauty's Monday and Tuesday experiences are identical, and with probability $1 - q$ they are different, then $p_{2} = p$ .
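The mapping between a target probability $p$ and the required $q$ is easy to tabulate. The sketch below simply evaluates $q = 3 - 1 / p$ and its inverse; the helper names `p2_from_q` and `q_for_target` are hypothetical.

```python
from fractions import Fraction


def p2_from_q(q):
    """Posterior probability of Heads, p_2 = 1 / (3 - q)."""
    return Fraction(1) / (3 - Fraction(q))


def q_for_target(p):
    """Probability q of identical Monday/Tuesday experiences that yields p_2 = p."""
    p = Fraction(p)
    if not Fraction(1, 3) <= p <= Fraction(1, 2):
        raise ValueError("p must lie between 1/3 and 1/2")
    return 3 - 1 / p


for target in (Fraction(1, 3), Fraction(2, 5), Fraction(9, 20), Fraction(1, 2)):
    q = q_for_target(target)
    print(target, q, p2_from_q(q))  # the last column reproduces the target
```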

Conclusion

There are three important lessons here:

Epistemic probabilities are a function of the information one has. Thus if two people have different epistemic probabilities, or the same person at different times, there must be some identifiable difference in the information they have available.
"Propositions" that refer to concepts such as "now," "today," or "here" are not legitimate objects of analysis for classical logic nor for probability theory. Problems that involve such concepts must be translated into classical logic using some form of explicit time or space parameterization.
We must be very careful about discarding information considered irrelevant. It can turn out to be relevant in surprising ways.

References

[Cox1946] R. T. Cox, 1946. "Probability, frequency, and reasonable expectation," American Journal of Physics 14, pp. 1-13.

[Elga2000] A. Elga, 2000. "Self-locating belief and the Sleeping Beauty problem," Analysis 60, pp. 143-147.

[Lewis2001] D. Lewis, 2001. "Sleeping Beauty: reply to Elga," Analysis 61, pp. 171-176.

[Neal2007] R. M. Neal, 2007. "Puzzles of anthropic reasoning resolved using full non-indexical conditioning," Technical Report No. 0607, Dept. of Statistics, University of Toronto. Online at https://arxiv.org/abs/math/0608592.

[VanHorn2003] K. S. Van Horn, 2003. "Constructing a logic of plausible inference: a guide to Cox's Theorem," International Journal of Approximate Reasoning 34, no. 1, pp. 3-24. Online at https://www.sciencedirect.com/science/article/pii/S0888613X03000513

[VanHorn2017] K. S. Van Horn, 2017. "From propositional logic to plausible reasoning: a uniqueness theorem," International Journal of Approximate Reasoning 88, pp. 309-332. Online at https://www.sciencedirect.com/science/article/pii/S0888613X16302249, preprint at https://arxiv.org/abs/1706.05261.

[Yudkowsky&Soares2017] E. Yudkowsky and N. Soares, 2017. "Functional decision theory: a new theory of instrumental rationality," https://arxiv.org/abs/1710.05060 .

77 comments

Comments sorted by top scores.

comment by Radford Neal (radford-neal) · 2018-05-25T02:01:26.582Z · LW(p) · GW(p)

Thanks for presenting my take on Sleeping Beauty. Your generalization beyond my assumption that Beauty's observations on Monday/Tuesday are independent and low-probability is interesting.

I'm not as dismissive as you are of betting arguments. You're right, of course, that a betting argument for something having some probability could be disputed by someone who doesn't accept your ideas of decision theory. But since typically lots of people will agree with your ideas of decision theory, it may be persuasive to some.

Now, I have to admit that I have personally failed in this respect, responding to this post by Lubos Motl:

https://motls.blogspot.ca/2015/08/sleeping-beauty-betting-assisting.html

I think I failed to get Lubos to even consider my argument, but it might be of interest to people here. Here's an excerpt from my comment there:

--- (start excerpt)

...modify the usual problem so that Beauty can send a message to her brother recommending how to bet (on the coin having landed H or T), which will be received only after Tuesday. If she sends two messages, only the second one will be received. Let's suppose that it's known to everyone from the beginning that there will be exactly two bets on offer - one where one wins $90 on T and loses $100 on H, and the other where one wins $90 on H and loses $100 on T. Should Beauty recommend to her brother one, or the other, or neither of these bets? (Let's assume that her brother is very likely to follow her recommendation, whatever it is.)

I think we agree that Beauty should recommend that her brother take neither of these bets, which both have a negative expected payoff if H and T are equally likely.

However, I think that this is the conclusion Beauty will reach by following the THIRDER logic. If she follows the HALFER logic, she will do the wrong thing.

If Beauty is a thirder, she thinks upon awakening that there are three possibilities, with the following probabilities:

a) It's Monday, and the coin landed heads (probability 1/3).

b) It's Monday, and the coin landed tails (probability 1/3).

c) It's Tuesday, and the coin landed tails (probability 1/3).

Now, when deciding what action to take, Beauty should consider only those possibilities in which her action actually makes a difference. That eliminates (b), since the message sent in that circumstance will be wiped out by the message that will be sent on Tuesday. The remaining possibilities are (a) and (c), which are equally likely. So the recommendation should be based on H and T being equally likely, which means it should recommend taking neither bet.

In contrast, if Beauty is a halfer, she sees the three possibilities as having the following probabilities:

a) It's Monday, and the coin landed heads (probability 1/2).

b) It's Monday, and the coin landed tails (probability 1/4).

c) It's Tuesday, and the coin landed tails (probability 1/4).

Again, (b) should be ignored, since it makes no difference what Beauty does in that circumstance. That leaves (a) and (c), with (a) having twice the probability as (c). So Beauty - if she is a halfer - should recommend taking the bet that pays $90 on H and loses $100 on T, since it has a positive expected return. But of course this is the wrong answer.

--- (end excerpt)
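The expected payoffs in the excerpt can be checked with a few lines of arithmetic. This is a minimal sketch, assuming (as the excerpt does) that possibility (b) is dropped and the weights of (a) and (c) are renormalized; the names `expected_payoffs`, `WIN`, and `LOSS` are hypothetical.

```python
from fractions import Fraction

WIN, LOSS = 90, 100  # stakes of each of the two bets on offer


def expected_payoffs(weight_a, weight_c):
    """Expected payoff of (bet on T, bet on H) given weights for (a) and (c)."""
    total = weight_a + weight_c
    p_heads = weight_a / total  # (a): Monday, coin landed heads
    p_tails = weight_c / total  # (c): Tuesday, coin landed tails
    bet_on_tails = p_tails * WIN - p_heads * LOSS
    bet_on_heads = p_heads * WIN - p_tails * LOSS
    return bet_on_tails, bet_on_heads


thirder = expected_payoffs(Fraction(1, 3), Fraction(1, 3))
halfer = expected_payoffs(Fraction(1, 2), Fraction(1, 4))
print([str(v) for v in thirder])  # both bets evaluated with thirder weights
print([str(v) for v in halfer])   # both bets evaluated with halfer weights
```

With thirder weights both bets come out negative; with halfer weights the bet on H comes out positive.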

By the way, what did you think of my "Sailor's Child" version of the problem? I personally thought that it should put the whole issue to rest in a definitive fashion, but then, that's a typical view of people making philosophical arguments that don't in fact end the debate...

Finally, note that a partially revised version of my paper is available via http://www.cs.utoronto.ca/~radford/anth.abstract.html

Replies from: Chris_Leong, Charlie Steiner, Confusion, ksvanhorn

↑ comment by Chris_Leong · 2018-06-18T11:14:34.435Z · LW(p) · GW(p)

I don't understand how the halfer makes the wrong bet. If we are talking about probabilities, then a), b) and c) all have a probability of 1/2 of occurring. If we want the probabilities to sum to 1, then we need to do the following:

If heads occurs, Monday always "counts"
If tails occurs, we need to flip a second coin to determine if Monday or Tuesday "counts".

So b) is "It's Monday, and the coin landed tails and Monday counts"

And c) is "It's Tuesday, and the coin landed tails and Tuesday counts"

So b) and c) are exclusive, so c) doesn't override b).

On the other hand, if b) and c) aren't exclusive, then they are both 0.5 instead. So b) being ignored wouldn't matter as c) would suffice by itself.

The only way we get the wrong answer is if b) and c) overlap and are not 0.5. This makes no sense for the halfer model.

↑ comment by Charlie Steiner · 2018-06-16T16:53:48.148Z · LW(p) · GW(p)

I agree that the Sailor's Child is the correct translation of Sleeping Beauty into a situation with no copies (Do you remember Psy-Kosh's non-anthropic problem?), but I think some people might even deny that any such translation exists.

↑ comment by Confusion · 2018-05-25T07:49:36.990Z · LW(p) · GW(p)

Don't worry about not being able to convince Lubos Motl. His prior for being correct is way too high and impedes his ability to consider dissenting views seriously.

↑ comment by ksvanhorn · 2018-05-26T17:53:35.538Z · LW(p) · GW(p)

In regards to betting arguments:

1. Traditional CDT (causal decision theory) breaks down in unusual situations. The standard example is the Newcomb Problem, and various alternatives have been proposed, such as Functional Decision Theory. The Sleeping Beauty problem presents another highly unusual situation that should make one wary of betting arguments.

2. There is disagreement as to how to apply decision theory to the SB problem. The usual thirder betting argument assumes that SB fails to realize that she is going to both make the same decision and get the same outcome on Monday and Tuesday. It has been argued that accounting for these facts means that SB should instead compute her expected utility for accepting the bet as

$$\Pr(H) \cdot \mathrm{payoff}(H) + 2 \Pr(T)\, \mathrm{payoff}(T).$$

3. Your own results show that the standard betting argument gets the wrong answer of 1/3, when the correct answer is $1 / (3 - p (x))$ . At best, the standard betting argument gets close to the right answer; but if Beauty is sensorily impoverished, or has just awakened, then $p (x)$ can be sufficiently large that the answer deviates substantially from $1 / 3$ .

BTW, I was a solid halfer until I read your paper. It was the first and only explanation I've ever seen of how Beauty's state of information after awakening on Monday/Tuesday differs from her state of information on Sunday night in a way that affects the probability of Heads.

With regards to your "Sailor's Child" problem:

It was not immediately obvious to me that this is equivalent to the SB problem. I had to think about it for some time, and I think there are some differences. One is, again the different answers of $1 / 3$ versus $1 / (3 - p (x))$ . I've concluded that the SC problem is equivalent to a variant of the SB problem where (1) we've guaranteed that Beauty cannot experience the same thing on both Monday and Tuesday, and (2) there is a second coin toss that determines whether Beauty is awakened on Monday or on Tuesday in the case that the first coin toss comes up Heads.

In any event, it was the calculation based on Beauty's new information upon awakening that I found convincing. I tried to disprove it, and couldn't.

comment by Charlie Steiner · 2018-05-22T23:05:43.901Z · LW(p) · GW(p)

rising ways.

Here, you dropped this from the last bullet point at the end :)

A very clear walkthrough of full nonindexical conditioning. Thanks! I think there's still a big glaring warning sign that this could be wrong, which is the mismatch with frequency (and, by extension, betting). Probability is logically prior to frequency estimation, but that doesn't mean I think they're decoupled. If your "probability" has zero application because your decision theory uses "likeliness weights" calculated an entirely different way, I think something has gone very wrong.

I think if you've gone wrong somewhere, it's in trying to outlaw statements of the form "it is Monday today."

Suppose on Monday the experimenters will give her a cookie after she answers the question, and on Tuesday the experimenters will give her ice cream. Do you really want to outlaw "in 5 minutes I will get a cookie" as a valid thing to have beliefs about?

In fact, I think you got it precisely backwards - probability distributions come from the assigner's state of information, and therefore they must be built off of what the assigner actually knows. I don't have access to some True Monday Detector, I only have access to my internal sense of time. "Now" is fundamental, "Monday" is the higher level construct. Similarly, I don't have an absolute position sense - my probability distribution over things must always use relative coordinates (even if it's "relative to the zero reading on this gauge here") because there are no absolute coordinates available to me. I don't have access to my mystical True Name, so I don't know which of several duplicates is the Real Me unless I can describe it in relative terms like "the one who came first" - therefore "me" is fundamental, "the original Charlie" is the higher-level construct.

Anyhow, once you allow temporal information you go back to trying to figure out what your model should say when you demand a MEE constraint on Monday vs. Tuesday.

Replies from: Chris_Leong, ksvanhorn

↑ comment by Chris_Leong · 2018-06-16T14:09:47.753Z · LW(p) · GW(p)

MEE constraint?

Replies from: Charlie Steiner

↑ comment by Charlie Steiner · 2018-06-16T16:12:56.700Z · LW(p) · GW(p)

"mutually exclusive and exhaustive." Usually just means the probabilities of AND-ing them is zero, and the total probability is one.

↑ comment by ksvanhorn · 2018-05-23T03:37:14.067Z · LW(p) · GW(p)

the mismatch with frequency

There are no frequencies in this problem; it is a one-time experiment.

Probability is logically prior to frequency estimation

That's not what I said; I said that probability theory is logically prior to decision theory.

If your "probability" has zero application because your decision theory uses "likeliness weights" calculated an entirely different way, I think something has gone very wrong.

Yes; what's gone wrong is that you're misapplying the decision theory, or your decision theory itself breaks down in certain odd circumstances. Exploring such cases is the whole point of things like Newcomb's problem and Functional Decision Theory. In this case, it's clear that Beauty is going to make the same betting decision, with the same betting outcome, on both Monday and Tuesday (if the coin lands Tails). The standard betting arguments use a decision rule that fails to account for this.

I think if you've gone wrong somewhere, it's in trying to outlaw statements of the form "it is Monday today."

See my response to Dacyn below ("Classical propositions are simply true or false..."). Classical propositions do not change their truth value over time.

Replies from: Charlie Steiner

↑ comment by Charlie Steiner · 2018-05-23T07:07:57.643Z · LW(p) · GW(p)

There are no frequencies in this problem; it is a one-time experiment.

One can do things multiple times.

See my response to Dacyn below ("Classical propositions are simply true or false..."). Classical propositions do not change their truth value over time.

I tried to get at this in the big long paragraph of "'Monday' is an abstraction, not a fundamental." There is no such thing as a measurement of absolute time. When someone says "no, I mean to refer to the real Monday," they are generating an abstract model of the world and then making their probability distributions within that model. But then there still have to be rules that cash your nice absolute-time model out into yucky relative-time actual observables.

It's like Solomonoff induction. You have a series of data, and you make predictions about future data. Everything else is window dressing (sort of).

But it's not so bad. You can have whatever abstractions you want, as long as they cash out to the right thing. You don't need time to actually pass within predicate logic. You just need to model the passage of time and then cash the results out.

It's also like how probability distributions are not about what reality is, they are about your knowledge of reality. "It is Monday" changes truth value depending on the external world. But P(It is Monday | Information)=0.9 is a perfectly good piece of classical logic. In fact, this is exactly the same as how you can treat P(H)=0.5, even though classical propositions do not change their truth value when you flip over a coin.

I dunno, putting it that way makes it sound simple. I still think there's something important in my weirder rambling - but then, I would.

comment by Chris_Leong · 2018-06-19T05:12:41.228Z · LW(p) · GW(p)

I wrote up a response to this [LW · GW], but I thought it was also worthwhile writing a comment that directly responds to the argument about whether we can update on a random bit of information.

@travisrm89 wrote:

How can receiving a random bit cause Beauty to update her probability, as in the case where Beauty is an AI? If Beauty already knows that she will update her probability no matter what bit she receives, then shouldn't she already update her probability before receiving the bit?

Ksvanhorn responds [LW · GW] by pointing out that this assumes that the probabilities add to one, while we are considering the probability of observing a particular sequence at least once, so these probabilities overlap.

This doesn't really clarify what is going on, but I think that we can clarify this by first looking at the following classical probability problem:

A man has two sons. What is the chance that both of them are born on the same day if at least one of them is born on a Tuesday?

Most people expect the answer to be 1/7, but the usual answer is that 13 of the 49 possibilities have at least one son born on a Tuesday and 1 of the 49 has both born on a Tuesday, so the chance is 1/13. Notice that if we had been told, for example, that one of them was born on a Wednesday, we would have updated to 1/13 as well. So our odds can always update in the same way on a random piece of information if the possibilities referred to aren't exclusive, as Ksvanhorn claims.

However, consider the following similar problem:

A man has two sons. We ask one of them at random which day they were born and they tell us Tuesday. What is the chance that they are both born on the same day?

Here the answer is 1/7, as we've been given no information about when the other child was born. When Sleeping Beauty wakes up and observes a sequence, they are learning that this sequence occurs on a random day out of those days when they are awake. This probability is 1/n where n is the number of possibilities. This is distinct from learning that the sequence occurs in at least one wakeup, just like learning a random child is born on a Tuesday is different from learning that at least one child was born on a Tuesday. So Ksvanhorn has calculated the wrong thing.
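The two answers above (1/13 and 1/7) can be checked by brute-force enumeration. The following is a minimal sketch, assuming each son's birth day is uniform over seven days and independent of the other's; the names `TUESDAY`, `pairs`, and `asked` are illustrative choices, not anything from the original discussion.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)   # seven weekdays; which index is "Tuesday" is arbitrary
TUESDAY = 1

# All 49 equally likely (first son's day, second son's day) pairs.
pairs = list(product(DAYS, repeat=2))

# Problem 1: condition on "at least one son was born on a Tuesday".
at_least_one = [p for p in pairs if TUESDAY in p]
both_tuesday = [p for p in at_least_one if p[0] == p[1]]
p_at_least_one = Fraction(len(both_tuesday), len(at_least_one))

# Problem 2: one son is picked uniformly at random and reports Tuesday.
# Each outcome is (pair, index of the son asked); all 98 are equally likely.
asked = [(p, i) for p in pairs for i in (0, 1)]
reported_tuesday = [(p, i) for p, i in asked if p[i] == TUESDAY]
same_day = [(p, i) for p, i in reported_tuesday if p[0] == p[1]]
p_random_son = Fraction(len(same_day), len(reported_tuesday))

print(p_at_least_one)  # 1/13
print(p_random_son)    # 1/7
```

The only difference between the two computations is the sample space: the second one also tracks which son was asked, which is what makes the report "Tuesday" uninformative about the other son.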

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-07-03T01:41:28.343Z · LW(p) · GW(p)

When Sleeping Beauty wakes up and observes a sequence, they are learning that this sequence occurs on a random day out of those days when they are awake.

That would be a valid description if she were awakened only on one day, with that day chosen through some unpredictable process. That is not the case here, though.

What you're doing here is sneaking in an indexical -- "today" is Monday if Heads, and "today" is either Monday or Tuesday if Tails. See Part 2 [LW · GW] for a discussion of this issue. To the extent that indexicals are ambiguous, they cannot be used in classical propositions. The only way to show that they are unambiguous is to show that there is an equivalent way of expressing that same thing that doesn't use any indexical, and only uses well-defined entities -- in which case you might as well use the equivalent expression that has no indexical.

comment by Lukas Finnveden (Lanrian) · 2018-05-23T12:24:39.284Z · LW(p) · GW(p)

Insofar as I understand, you endorse betting on 1:2 odds regardless of whether you believe the probability is 1/3 or 1/2 (i.e., regardless of whether you have received lots of random information) because of functional decision theory.

But in the case where you receive lots of random information you assign 1/3 probability to the coin ending up heads. If you then use FDT it looks like there is 2/3 probability that you will do the bet twice with the outcome tails; and 1/3 probability that you will do the bet once with the outcome heads. Therefore, you should be willing to bet at 1:4 odds.

That seems strange, and will mean losing money on average. I can't see how you would get the different probabilities depending on how much random information you receive and still make the same decision about bets.

Replies from: Lanrian, ksvanhorn

↑ comment by Lukas Finnveden (Lanrian) · 2018-06-22T15:35:16.851Z · LW(p) · GW(p)

Actually, I realise that you can get around this. If you use a decision theory that assumes that you are deciding for all identical copies of you, but that you can't affect the choices of copies that have diverged from you in any way, math says you will always bet correctly.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-07-03T01:33:46.501Z · LW(p) · GW(p)

Yes, that is shown in Part 2 [LW · GW].

↑ comment by ksvanhorn · 2018-05-26T18:32:21.430Z · LW(p) · GW(p)

As I understand it, FDT says that you go with the algorithm that maximizes your expected utility. That algorithm is the one that bets on 1:2 odds, using the fact that you will bet twice, with the same outcome each time, if the coin comes up tails.
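To make the two sets of odds in this exchange concrete, here is a small sketch of the arithmetic. It assumes a hypothetical bet that is not spelled out in the thread: at every awakening Beauty bets on Tails, gaining `gain_if_tails` per bet if the coin landed Tails and losing `loss_if_heads` if it landed Heads, with two bets placed under Tails and one under Heads. The function name and parameters are illustrative only.

```python
from fractions import Fraction

def expected_profit(gain_if_tails, loss_if_heads, p_heads, bets_if_tails=2):
    """Expected profit per experiment for a policy of always betting on Tails.

    Under Heads one bet is placed and lost; under Tails `bets_if_tails`
    identical bets are placed and all are won.
    """
    p_tails = 1 - p_heads
    return p_tails * bets_if_tails * gain_if_tails - p_heads * loss_if_heads

half, third = Fraction(1, 2), Fraction(1, 3)

# Policy-level view: fair coin, two identical bets under Tails.
# Risking 2 to win 1 (1:2 odds) breaks even.
outside_view = expected_profit(1, 2, p_heads=half)

# The double-counted view: Pr(Heads) = 1/3 AND both Tails bets counted.
# Risking 4 to win 1 (1:4 odds) appears to break even.
double_counted = expected_profit(1, 4, p_heads=third)

# The same 1:4 terms evaluated against the fair coin lose money on average.
actual_at_1_to_4 = expected_profit(1, 4, p_heads=half)

print(outside_view, double_counted, actual_at_1_to_4)  # 0 0 -1
```

Under these assumptions the 1:4 figure arises only when the 1/3 probability is combined with counting both Tails bets, which is the combination the objection above points at.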

Replies from: Lanrian

↑ comment by Lukas Finnveden (Lanrian) · 2018-05-26T22:23:51.938Z · LW(p) · GW(p)

I agree with that description of FDT. And looking at the experiment from the outside, betting at 1:2 odds is the algorithm that maximizes utility, since heads and tails have equal probabilities. But once you're in the experiment, tails have twice the probability of heads (according to your updating procedure) and FDT cares twice as much about the worlds in which tails happens, thus recommending 1:4 odds.

comment by Dacyn · 2018-05-22T20:10:42.772Z · LW(p) · GW(p)

Why do you say that there is no "now", "today", or "here" in classical logic? Classical logic is just a system of logic based on terms and predicates. There is no reason that "now", "today", and "here" can't be terms in the logic. Now presumably you meant to say that such words cause a statement to have different meanings depending on who speaks it. But why is this a problem?

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-23T03:21:22.550Z · LW(p) · GW(p)

Classical propositions are simply true or false, although you may not know which. They do not change from false to true or vice versa, and classical logic is grounded in this property. "Propositions" such as "today is Monday" are true at some times and false at other times, and hence are not propositions of classical logic.

If you want a "proposition" that depends on time or location, then what you need is a predicate---essentially, a template that yields different specific propositions depending on what values you substitute into the open slots. "Today is Monday" corresponds to the predicate $A(t)$, where

$$A(t) \triangleq (\mathrm{dayOfWeek}(t) = \mathrm{Monday}).$$

The closest we can come to an actual proposition meaning "today is Monday" would be

$$\forall t.\ \mathrm{memories}(t) = y \Rightarrow A(t)$$

where $y$ is some memory state and $\mathrm{memories}(t)$ means your memory state at time $t$.

Replies from: AlexMennen, Dacyn

↑ comment by AlexMennen · 2018-06-08T04:07:56.909Z · LW(p) · GW(p)

In any particular structure, each proposition is simply true or false. But one proposition can be true in some structure and false in another structure. The universe could instantiate many structures, with non-indexical terms being interpreted the same way in each of them, but indexical terms being interpreted differently. Then sentences not containing indexical terms would have the same truth value in each of these structures, and sentences containing indexical terms would not. None of this contradicts using classical logic to reason about each of these structures.

I'm sympathetic to the notion that indexical language might not be meaningful, but it does not conflict with classical logic.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-06-10T20:19:46.804Z · LW(p) · GW(p)

The point is that the meaning of a classical proposition must not change throughout the scope of the problem being considered. When we write A1, ..., An |= P, i.e. "A1 through An together logically imply P", we do not apply different structures to each of A1, ..., An, and P.

The trouble with using "today" in the Sleeping Beauty problem is that the situation under consideration is not limited to a single day; it spans, at a minimum, both Monday and Tuesday, and arguably Sunday and/or Wednesday also. Any properly constructed proposition used in discussing this problem should make sense and be unambiguous regardless of whether Beauty or the experimenters are uttering the proposition, and whether they are uttering it on Sunday, Monday, Tuesday, or Wednesday.

↑ comment by Dacyn · 2018-05-23T09:04:13.913Z · LW(p) · GW(p)

That's not how I understand the term "classical logic". Can you point to some standard reference that agrees with what you are saying? I skimmed the SEP article I linked to and couldn't find anything similar.

You run into the same problems with any sort of pronouns or context-dependent reference, and as far as I know most philosophers consider statements like "the thing that I'm pointing at right now is red" to be perfectly valid in classical logic.

The main point of classical logic is that it has a system of deduction based on axioms and inference rules. Are you saying that you think these don't apply in the case of centered propositions? Does modus ponens or the law of the excluded middle not work for some reason? If not, I'm not sure why it matters whether centered propositions are really a part of "classical logic" or not -- you can still use all the same tools on them as you can use for classical logic.

Finally, if you accept the MWI then every statement about the physical world is a centered proposition, because it is a statement about the particular Everett branch or Tegmark universe that you are currently in. So classical logic would be pretty weak if it couldn't handle centered propositions!

Replies from: TAG

↑ comment by TAG · 2018-05-23T14:20:18.032Z · LW(p) · GW(p)

If classical logic means propositional calculus, then there are no predicates, and no ability to express time-indexed truths.

Replies from: Dacyn

↑ comment by Dacyn · 2018-05-23T15:09:55.202Z · LW(p) · GW(p)

At least according to SEP classical logic includes predicates. But in any case if you want to do things with the propositional calculus, then I see no difference between saying "Let P = 'Today is Monday' " and "Let P = 'Sleeping Beauty is awake on Monday' ". Both of them are expressing a proposition in terms of a natural language statement that includes more expressive resources than the propositional calculus itself contains. But I don't see why that should be a problem in one case but not in the other.

Replies from: TAG

↑ comment by TAG · 2018-05-23T15:53:25.863Z · LW(p) · GW(p)

The first case has a truth value that varies with time.

Replies from: Dacyn

↑ comment by Dacyn · 2018-05-23T22:12:49.311Z · LW(p) · GW(p)

And the second case has a truth value that varies depending on what Everett branch you are in. Does it matter?

Replies from: strangepoop

↑ comment by a gently pricked vein (strangepoop) · 2018-05-24T20:32:12.117Z · LW(p) · GW(p)

There is a relevant distinction: the machinery being used (logical assignment) has to be stable for the duration of the proof/computation. Or perhaps, the "consistency" of the outcome of the machinery is defined on such a stability.

For the original example, you'd have to make sure that you finish all relevant proofs within a period in $\mathrm{Monday}$ or within a period in $\mathrm{NotMonday}$. If you go across, weird stuff happens when attempting to preserve truth, so banning non-timeless propositions makes things easier.

You can't always walk around while doing a proof if one of your propositions is "I'm standing on Second Main". You could, however, be standing still in any one place whether or not it is true. ksvanhorn might call this a space parametrization, if I understand him correctly.

So here's the problem: I can't imagine what it would mean to carry out a proof across Everett branches. Each prover would have a different proof, but each one would be valid in its own branch across time (like standing in any one place in the example above).

I think a refutation of that would be at least as bizarre as carrying out a proof across space while keeping time still (note: if you don't keep time still, you're probably still playing with temporal inconsistencies), so maybe come up with a counterexample like that? I'm thinking something along the lines of code=data will allow it, but I couldn't come up with anything.

Replies from: Dacyn

↑ comment by Dacyn · 2018-05-25T10:07:00.124Z · LW(p) · GW(p)

Sure, but I don't think anyone was talking about problems arising from Sleeping Beauty needing to do a computation taking multiple days. The computations are all simple enough that they can be done in one day.

Replies from: strangepoop

↑ comment by a gently pricked vein (strangepoop) · 2018-07-16T11:04:16.458Z · LW(p) · GW(p)

I'd say your reply is at least a little bit of logical rudeness [LW · GW], but I'll take the "Sure, ...".

I was pointing specifically at the flaw* in bringing up Everett branches into the discussion at all, not about whether the context happened to be changing here.

I wouldn't really mind the logical rudeness (if it is so), except for the missed opportunity of engaging more fully with your fascinating comment! (see also *)

It's also nice to see that the followup [LW · GW] to OP starts with a discussion of why it's a good/easy first rule to, like I said, just ban non-timeless propositions, even if we can eventually come up with a workable system that deals with it well.

(*) As noted in GP, it's still not clear to me that this is a flaw, only that I couldn't come up with anything in five minutes! Part of the reason I replied was in the hopes that you'd have a strong defense of "everettian-indexicals", because I'd never thought of it that way before!

Replies from: Dacyn

↑ comment by Dacyn · 2018-07-16T18:45:52.300Z · LW(p) · GW(p)

Hmm. I don't think I see the logical rudeness. I interpreted TAG's comment as "the problem with non-timeless propositions is that they don't evaluate to the same thing in all possible contexts" and brought up Everett branches in response to that. I interpreted your comment as saying "actually the problem with non-timeless propositions is that they aren't necessarily constant over the course of a computation" and so I replied to that, not bringing up Everett branches because they aren't relevant to your comment. Anyway, I'm not sure exactly what kind of explanation you are looking for; it feels like I have explained my position already, but I realize there can be inferential distances.

Replies from: TAG

↑ comment by TAG · 2018-07-24T13:10:04.777Z · LW(p) · GW(p)

“the problem with non-timeless propositions is that they don’t evaluate to the same thing in all possible contexts”

It's more “the problem with non-timeless propositions is that they don’t evaluate to the same thing in all possible contexts AND a change of context can occur in the relevant situation”.

No one knows whether Everett branches exist, or what they are. If they are macroscopic things that remain constant over the course of the SB story, they are not a problem... but time still is, because it doesn't. If branching occurs on coin flips, or smaller scales, then they present the same problem as time indexicals.

Replies from: Dacyn

↑ comment by Dacyn · 2018-07-24T22:41:09.068Z · LW(p) · GW(p)

Right, so it seems like our disagreement is about whether it is relevant whether the value of a proposition is constant throughout the entire problem setup, or only throughout a single instance of someone reasoning about that setup.

comment by travisrm89 · 2018-05-22T16:41:43.826Z · LW(p) · GW(p)

This is a very enlightening post. But something doesn't seem right. How can receiving a random bit cause Beauty to update her probability, as in the case where Beauty is an AI? If Beauty already knows that she will update her probability no matter what bit she receives, then shouldn't she already update her probability before receiving the bit?

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-22T17:29:08.288Z · LW(p) · GW(p)

That's a good point, but let's consider where that principle comes from: it derives from the fact that

$$\begin{aligned} \Pr(A \mid M) &= \Pr((A \mathbin{\&} B_{1}) \text{ or } \dots \text{ or } (A \mathbin{\&} B_{n}) \mid M) \\ &= \sum_{i} \Pr(A \mathbin{\&} B_{i} \mid M) \\ &= \sum_{i} \Pr(A \mid B_{i}, M) \Pr(B_{i} \mid M) \end{aligned}$$

where $B_{1}, \dots, B_{n}$ are mutually exclusive and exhaustive (MEE) propositions. The second equality above relies on the fact that the $B_{i}$ are MEE; otherwise we'd have to subtract a bunch of terms for various conjunctions (ANDs) of the $B_{i}$. But the set of propositions

$$X_{2}(y) \triangleq R(y, M) \text{ or } R(y, T),$$

indexed by $y$, are not mutually exclusive. If $y$ and $y'$ are remembered perceptions on different days, then both $X_{2}(y)$ and $X_{2}(y')$ will be true.

Replies from: ksvanhorn, Charlie Steiner

↑ comment by ksvanhorn · 2018-05-22T18:29:44.655Z · LW(p) · GW(p)

What I wrote above may be a bit misleading. The issue isn't that you have additional terms for conjunctions of the $B_{i}$, but that the weights $\Pr(B_{i} \mid M)$ sum to more than $1$. In particular, consider the case when AI Beauty gets exactly one bit of input. Then for $y = 0$ or $1$,

$$\begin{aligned} \Pr(X_{2}(y) \mid M) &= \frac{1}{2} \Pr(H \mid M) + \frac{3}{4} \Pr(\text{not } H \mid M) \\ &= \frac{1}{2} \cdot \frac{1}{2} + \frac{3}{4} \cdot \frac{1}{2} \\ &= \frac{5}{8} \end{aligned}$$

and $5/8 + 5/8 = 5/4 > 1$. If we try the same decomposition as in my previous comment, then using $\Pr(H \mid X_{2}(y), M) = 1/2.5 = 2/5$, we find

$$\begin{aligned} \Pr(H \mid M) &= \Pr((H \mathbin{\&} X_{2}(0)) \text{ or } (H \mathbin{\&} X_{2}(1)) \mid M) \\ &= \Pr(H \mathbin{\&} X_{2}(0) \mid M) + \Pr(H \mathbin{\&} X_{2}(1) \mid M) - \Pr(H \mathbin{\&} X_{2}(0) \mathbin{\&} X_{2}(1) \mid M) \\ &= \Pr(X_{2}(0) \mid M) \cdot \Pr(H \mid X_{2}(0), M) + \Pr(X_{2}(1) \mid M) \cdot \Pr(H \mid X_{2}(1), M) - 0 \\ &= \frac{5}{8} \cdot \frac{2}{5} + \frac{5}{8} \cdot \frac{2}{5} \\ &= \frac{1}{2} \end{aligned}$$

and everything is still consistent.
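The numbers in this one-bit case can be reproduced by enumerating worlds. This is a minimal sketch under assumptions that are only implicit in the comment: the coin, the Monday bit, and the Tuesday bit are independent and uniform, and the Tuesday bit is observed only under Tails. The helper names `x2` and `pr` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Equally likely worlds: (coin, Monday bit, Tuesday bit).
# The Tuesday bit is only observed when the coin lands Tails.
worlds = list(product("HT", (0, 1), (0, 1)))

def x2(world, y):
    """True if Beauty observes bit y on at least one awakening."""
    coin, monday_bit, tuesday_bit = world
    if coin == "H":
        return monday_bit == y
    return monday_bit == y or tuesday_bit == y

def pr(event):
    """Probability of an event (a predicate on worlds) under the uniform prior."""
    return Fraction(sum(1 for w in worlds if event(w)), len(worlds))

# Pr(X2(y) | M) and Pr(H | X2(y), M) for y = 0, 1.
p_x2 = {y: pr(lambda w: x2(w, y)) for y in (0, 1)}
p_h_given_x2 = {
    y: pr(lambda w: w[0] == "H" and x2(w, y)) / p_x2[y] for y in (0, 1)
}

# Under Heads only one bit is seen, so H & X2(0) & X2(1) is impossible.
overlap = pr(lambda w: w[0] == "H" and x2(w, 0) and x2(w, 1))

p_h = sum(p_x2[y] * p_h_given_x2[y] for y in (0, 1)) - overlap

print(p_x2[0], p_h_given_x2[0], overlap, p_h)  # 5/8 2/5 0 1/2
```

The weights `p_x2[0] + p_x2[1]` come to 5/4, yet the decomposition still returns 1/2 because each posterior is 2/5 rather than 1/2.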

↑ comment by Charlie Steiner · 2018-05-24T06:12:14.154Z · LW(p) · GW(p)

And yet if you just keep them split up as R(y,d) indexed by both y and d, the MEE condition holds. So if Beauty expected to get both the observations and be told the day of those observations, she would expect no net update of P(H).

Huh. Does this mean that if being told only the content y makes an agent predictably update towards P(H)<0.5, being told only the day d makes your procedure predictably update towards P(H)>0.5?

comment by Jeff Jo (jeff-jo) · 2018-05-23T14:10:35.689Z · LW(p) · GW(p)

You mis-characterize what Elga does. He never directly formulates the state M1, where Beauty is awake. Instead, he formulates two states that are derived from information being added to M1. I'll call them M2A (Beauty learns the outcome is Tails) and M2B (Beauty learns that it is Monday). While he may not do it as formally as you want, he works backwards to show that three of the four components of a proper description of state M1 must have the same probability. What he skips over, is identifying the fourth component (whose probability is now zero).

What it seems Elga was trying to avoid - as everybody does - is that Beauty still "exists" on Tuesday, after Heads. She just can't observe it. But it is a component you need to consider in your more formal modeling. To illustrate, here's a simple re-structuring of your steps that changes nothing relevant to the question she is asked:

1. On Sunday the steps of the experiment are explained to Beauty, and she is put to sleep.
2. On Monday Beauty is awakened. She has no information that would help her infer the day of the week. Later in the day she is interviewed. Afterwards, she is administered a drug that resets her memory to its state when she was put to sleep on Sunday, and puts her to sleep again.
3. On Tuesday Beauty is awakened. She has no information that would help her infer the day of the week. The experimenters flip a fair coin. If it lands Tails, Beauty is interviewed again; if it lands Heads, she is not. In either case, she is then administered a drug that resets her memory to its state when she was put to sleep on Sunday, and puts her to sleep again.
4. On Wednesday Beauty is awakened once more and told that the experiment is over.

In the interview(s), Beauty is asked to give a probability for her belief that the coin in step 3 lands Heads.

I'm sure you can make this more formal, so I'll be brief: State M, on Sunday, requires only proposition C describing what Beauty thinks the coin result is (*not* for what it actually is, which becomes deterministic at different times in different versions of the problem). There is no information in state M that favors either result, so the Principle of Indifference applies and the probability for each is 1/2.

State M1, when Beauty is first awakened, requires another proposition: D, for what day Beauty thinks it is (*not* for what day it actually is, which is deterministic). Due to the memory-reset drug, the same state M1 applies on both Monday and Tuesday. Since there is no information in state M1 that favors either result, the Principle of Indifference applies and the probability for each is 1/2. And (what seems to be overlooked by denying the existence of Tuesday when Beauty sleeps through it) D and H are independent. So M1 comprises four possible combinations of D and H that all have a probability of 1/4.

State M2 applies when Beauty is interviewed. The information that takes Beauty from M1 to M2 is that one of the four combinations is ruled out. The remaining three now have probability 1/3.
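The step from M1 to M2 is ordinary conditioning, and can be sketched as follows. The sketch takes the comment's premises as given (four coin/day combinations at 1/4 each in M1, with the Heads/Tuesday combination ruled out by being interviewed); the dictionary names are illustrative.

```python
from fractions import Fraction
from itertools import product

# State M1: four (coin, day) combinations, each with probability 1/4.
m1 = {
    combo: Fraction(1, 4)
    for combo in product(("Heads", "Tails"), ("Mon", "Tue"))
}

# State M2: being interviewed rules out (Heads, Tue); renormalize the rest.
remaining = {c: p for c, p in m1.items() if c != ("Heads", "Tue")}
total = sum(remaining.values())
m2 = {c: p / total for c, p in remaining.items()}

p_heads = sum(p for (coin, _), p in m2.items() if coin == "Heads")

print(p_heads)  # 1/3
```

Whether the 1/4 assignment in M1 is legitimate is exactly what the rest of the thread disputes; the code only shows that 1/3 follows once it is granted.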

State M1 applies to your version of the problem at the point in time just before Beauty could be wakened, in either step 2 or step 3. It applies, and can be determined later when Beauty is awake, whether or not Beauty is awake at that time. Elga's solution is essentially the same as mine, except he does it in two parts by adding more information to each. It just avoids identifying the component of the state that Beauty sleeps through.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-29T03:07:03.398Z · LW(p) · GW(p)

Your whole analysis rests on the idea that "it is Monday" is a legitimate proposition. I've responded to this many other places in the comments, so I'll just say here that a legitimate proposition needs to maintain the same truth value throughout the entire analysis (Sunday, Monday, Tuesday, and Wednesday). Otherwise it's a predicate. The point of introducing R(y,d) is that it's as close as we can get to what you want "it is Monday" to mean.

Replies from: jeff-jo

↑ comment by Jeff Jo (jeff-jo) · 2020-01-28T13:09:56.524Z · LW(p) · GW(p)

Well, I never checked back to see replies, and just tripped back across this.

The error made by halfers is in thinking "the entire analysis" spans four days. Beauty is asked for her assessment, based on her current state of knowledge, that the coin landed Heads. In this state of knowledge, the truth value of the proposition "it is Monday" does not change.

But there is another easy way to find the answer, that satisfies your criterion. Use four Beauties to create an isomorphic problem. Each will be told all of the details on Sunday; that each will be wakened at least once, and maybe twice, over the next two days based on the same coin flip and the day. But only three will be wakened on each day. Each is assigned a different combination of a coin face, and a day, for the circumstances where she will not be wakened. That is, {H,Mon}, {T,Mon}, {H,Tue}, and {T,Tue}.

On each of the two days during the experiment, each awake Beauty is asked for the probability that she will be wakened only once. Note that the truth value of this proposition is the same throughout the experiment. It is only the information a Beauty has that changes. On Sunday or Wednesday, there is no additional information and the answer is 1/2. On Monday or Tuesday, an awake Beauty knows that there are three awake Beauties, that the proposition is true for exactly one of them, and that there is no reason for any individual Beauty to be more, or less, likely than the others to be that one. The answer with this knowledge is 1/3.
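The counting claim in the four-Beauties variant (three awake each day, exactly one of whom is wakened only once) can be checked by enumeration. This is a sketch under the setup as described in the comment; identifying each Beauty with the (coin, day) combination she sleeps through, and the helper names, are illustrative choices.

```python
from fractions import Fraction
from itertools import product

COINS, DAYS = ("H", "T"), ("Mon", "Tue")

# Each Beauty is identified by the one (coin, day) she sleeps through.
situations = list(product(COINS, DAYS))

def awake(beauty, coin, day):
    """A Beauty is awake unless this is her assigned (coin, day)."""
    return beauty != (coin, day)

def woken_once(beauty, coin):
    """She misses an awakening exactly when the coin matches her assigned face."""
    return beauty[0] == coin

results = []
for coin, day in situations:
    awake_now = [b for b in situations if awake(b, coin, day)]
    once = [b for b in awake_now if woken_once(b, coin)]
    results.append((len(awake_now), len(once)))
    print(coin, day, len(awake_now), len(once))

# In every situation the fraction of awake Beauties wakened only once:
fractions_once = {Fraction(n_once, n_awake) for n_awake, n_once in results}
print(fractions_once)  # {Fraction(1, 3)}
```

The enumeration confirms the count; the step from "one of three awake Beauties" to a probability of 1/3 for any individual Beauty still rests on the symmetry argument given in the comment.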

comment by Dagon · 2018-05-22T18:55:46.012Z · LW(p) · GW(p)

Interesting, but I disagree. I fully agree that the problem is ambiguous in that it doesn't define what the actual proposition is. I think different assumptions can lead to saying 1/3 or 1/2, but with deconstruction can be shown to always be 1/2. I don't think anything in between is reasonable, and I don't think any information is gained by waking up (which has a prior of 1.0, so no surprise value).

Probability is in the map, not the territory. It matters a lot what is actually being predicted, which is what the "betting" approach is trying to get at. If this is "tails->you will make two bets, heads->you will make one bet", then the correct approach is to assign 1/2 probability but 1/3 betting odds. If this is "you will be asked once or twice, but the bet is only resolved once", then 1/2 is the only reasonable answer.

Amount of time spent awake is irrelevant to any reasonable proposition (proposition=prediction of future experience) that you might be talking about when you say "probability that the coin is heads".

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-22T19:21:06.938Z · LW(p) · GW(p)

My intuition rebels against these conclusions too, but if the analysis is wrong, then where specifically is the error? Can you point to some place where the math is wrong? Can you point to an error in the modeling and suggest a better alternative? I myself have tried to disprove this result, and failed.

Replies from: Dacyn, Dagon

↑ comment by Dacyn · 2018-05-22T20:17:16.527Z · LW(p) · GW(p)

The whole calculation is based on the premise that Neal's concept of "full non-indexical conditioning" is a reasonable way to do probability theory. Usually you do probability theory on what you are calling "centered propositions", and you interpret each data point you receive as the proposition "I have received this data". Not as "There exists a version of me which has received this data as well as all of the prior data I have received". It seems really odd to do the latter, and I think more motivation is needed for it. (To be fair, I don't have a better alternative in mind.)

Replies from: Wei_Dai, ksvanhorn

↑ comment by Wei Dai (Wei_Dai) · 2018-05-22T21:41:16.055Z · LW(p) · GW(p)

It seems really odd to do the latter, and I think more motivation is needed for it.

This old post [LW · GW] of mine may help. The short version is that if you do probability with "centered propositions" then the resulting probabilities can't be used in expected utility maximization.

(To be fair, I don’t have a better alternative in mind.)

I think the logical next step from Neal’s concept of “full non-indexical conditioning” (where updating on one's experiences means taking all possible worlds, assigning 0 probability to those not containing "a version of me which has received this data as well as all of the prior data I have received", then renormalizing the sum of the rest to 1) is to not update [LW · GW], in other words, use UDT. The motivation here is that from a decision making perspective, the assigning 0 / renormalizing step either does nothing (if your decision has no consequences in the worlds that you'd assign 0 probability to) or is actively bad (if your decision does have consequences in those possible worlds, due to logical correlation between you and something/someone in one of those worlds). (UDT also has a bunch of other motivations [LW · GW] if this one seems insufficient by itself.)

Replies from: Dacyn

↑ comment by Dacyn · 2018-05-22T23:07:30.189Z · LW(p) · GW(p)

Yeah, but the OP was motivated by an intuition that probability theory is logically prior to and independent of decision theory. I don't really have an opinion on whether that is right or not but I was trying to answer the post on its own terms. The lack of a good purely-probability-theory analysis might be a point in favor of taking a measure non-realist point of view though.

To make clear the difference between your view and ksvanhorn's, I should point out that in his view if Sleeping Beauty is an AI that's just woken up on Monday/Tuesday but not yet received any sensory input, then the probabilities are still 1/2; it is only after receiving some sensory input which is in fact different on the two days (even if it doesn't allow the AI to determine what day it is) that the probabilities become 1/3. Whereas for decision-theoretic purposes you want the probability to be 1/3 as soon as the AI wakes up on Monday/Tuesday.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-23T03:43:17.228Z · LW(p) · GW(p)

for decision-theoretic purposes you want the probability to be 1/3 as soon as the AI wakes up on Monday/Tuesday.

That is based on a flawed decision analysis that fails to account for the fact that Beauty will make the same choice, with the same outcome, on both Monday and Tuesday (it treats the outcomes on those two days as independent).

Replies from: Dacyn

↑ comment by Dacyn · 2018-05-23T09:07:56.595Z · LW(p) · GW(p)

So you want to use FDT, not CDT. But if the additional data of which direction the fly is going isn't used in the decision-theoretic computation, then Beauty will make the same choice on both days regardless of whether she has seen the fly's direction or not. So according to this analysis the probability still needs to be 1/2 after she has seen the fly.

↑ comment by ksvanhorn · 2018-05-23T02:39:45.402Z · LW(p) · GW(p)

There are several misconceptions here:

1. Non-indexical conditioning is not "a way to do probability theory"; it is just a policy of not throwing out any data, even data that appears irrelevant.

2. No, you do not usually do probability theory on centered propositions such as "today is Monday", as they are not legitimate propositions in classical logic. The propositions of classical logic are timeless -- they are true, or they are false, but they do not change from one to the other.

3. Nowhere in the analysis do I treat a data point as "there exists a version of me which has received this data..."; the concept of "a version of me" does not even appear in the discussion. If you are quibbling over the fact that $P_{d t}$ is only the stream of perceptions Beauty remembers experiencing as of time $t$, instead of being the entire stream of perceptions up to time $t$, then you can suppose that Beauty has perfect memory. This simplifies things---we can now let $P_{d}$ simply be the entire sequence of perceptions Beauty experiences over the course of the day, and define $R(y, d)$ to mean "$y$ is the first $n$ elements of $P_{d}$, for some $n$"---but it does not alter the analysis.

Replies from: Wei_Dai, Dacyn

↑ comment by Wei Dai (Wei_Dai) · 2018-05-23T09:02:52.452Z · LW(p) · GW(p)

Nowhere in the analysis do I treat a data point as “there exists a version of me which has received this data...”;

This confuses me. Dacyn's “There exists a version of me which has received this data as well as all of the prior data I have received” seems equivalent to Neal's "I will here consider what happens if you ignore such indexical information, conditioning only on the fact that someone in the universe with your memories exists. I refer to this procedure as “Full Non-indexical Conditioning” (FNC)." (Section 2.3 of Neal2007)

Do you think Dacyn is saying something different from Neal? Or that you are saying something different from both Dacyn and Neal? Or something else?

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-26T18:40:47.967Z · LW(p) · GW(p)

None of this is about "versions of me"; it's about identifying what information you actually have and using that to make inferences. If the FNIC approach is wrong, then tell me how Beauty's actual state of information differs from what is used in the analysis; don't just say, "it seems really odd."

↑ comment by Dacyn · 2018-05-23T09:18:27.271Z · LW(p) · GW(p)

I responded to #2 below, and #1 seems to be just a restatement of your other points, so I'll respond to #3 here. You seem to be taking what I wrote a little too literally. It looks like you want the proposition Sleeping Beauty conditions on to be "on some day, Sleeping Beauty has received / is receiving / will receive the data X", where X is the data that she has just received. (If this is not what you think she should condition on, then I think you should try to write the proposition you think she should condition on, using English and not mathematical symbols.) This proposition doesn't have any reference to "a version of me", but it seems to me to be morally the same as what I wrote (and in particular, I still think that it is really odd to say that it is the proposition she should condition on, and that more motivation is needed for it).

↑ comment by Dagon · 2018-05-22T22:07:49.627Z · LW(p) · GW(p)

where specifically is the error

It's a useless and misleading modeling choice to condition on irrelevant data, and even worse to condition on the assumption the unstated irrelevant data is actually relevant enough to change the outcome. That's not what "irrelevant" means, and the argument that humans are bad at knowing what's relevant does _NOT_ imply that all data is equally relevant, and even less does it imply that the unknown irrelevant data has precisely X relevance.

Wei is correct that UDT is a reasonable approach that sidesteps the necessity to identify a "centered" proposition (though I'd argue that it picks Sunday knowledge as the center). But I think it's _also_ solvable by traditional means just by being clear what proposition about what prediction is being assigned/calculated a probability.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-23T02:56:40.335Z · LW(p) · GW(p)

It's a useless and misleading modeling choice to condition on irrelevant data

Strictly speaking, you should always condition on all data you have available. Calling some data $D$ irrelevant is just a shorthand for saying that conditioning on it changes nothing, i.e., $Pr (A ∣ D, X) = Pr (A ∣ X)$ . If you can show that conditioning on $D$ does change the probability of interest---as my calculation did in fact show---then this means that $D$ is in fact relevant information, regardless of what your intuition suggests.

even worse to condition on the assumption the unstated irrelevant data is actually relevant enough to change the outcome.

There was no such assumption. I simply did the calculation, and thereby demonstrated that certain data believed to be irrelevant was actually relevant.

comment by JeffJo · 2023-09-01T20:17:08.346Z · LW(p) · GW(p)

This paper starts out with a misrepresentation. "As a reminder, this is the Sleeping Beauty problem:"... and then it proceeds to describe the problem as Adam Elga modified it to enable his thirder solution. The actual problem that Elga presented was:

Some researchers are going to put you to sleep. During the two days[1] that your sleep will last, they will briefly wake you up either once or twice, depending on the toss of a fair coin (Heads: once; Tails: twice). After each waking, they will put you to back to sleep with a drug that makes you forget that waking. When you are first awakened[2], to what degree ought you believe that the outcome of the coin toss is Heads?

There are two hints of the details Elga will add, but these hints do not impact the problem as stated. At [1], Elga suggests that the two potential wakings occur on different days; all that is really important is that they happen at different times. At [2], the ambiguous "first awakened" clause is added. It could mean that SB is only asked the first time she is awakened; but that renders the controversy moot. With Elga's modifications, only asking on the first awakening is telling SB that it is Monday. He appears to mean "before we reveal some information," which is how Elga eliminates one of the three possible events he uses.

Elga's implementation of this problem was to always wake SB on Monday, and only wake her on Tuesday if the coin result was Tails. After she answers the question, Elga then reveals either that it is Monday, or that the coin landed on Tails. Elga also included DAY=Monday or DAY=Tuesday as a random variable, which creates the underlying controversy. If that is proper, the answer is 1/3. If, as Neal argues, it is indexical information, it cannot be used this way and the answer is 1/2.

So the controversy was created by Elga's implementation. And it was unnecessary. There is another implementation of the same problem that does not rely on indexicals.

Once SB is told the details of the experiment and put to sleep, we flip two coins: call them C1 and C2. Then we perform this procedure:

If both coins are showing Heads, we end the procedure now with SB still asleep.
Otherwise, we wake SB and ask for her degree of belief that coin C1 landed on Heads.
After she gives an answer, we put her back to sleep with amnesia.

After these steps are concluded, whether it happened in step 1 or step 3, we turn coin C2 over to show the opposite side. And then repeat the same procedure.

SB will thus be wakened once if coin C1 landed on Heads, and twice if Tails. Either way, she will not recall another waking. But that does not matter. She knows all of the details that apply to the current waking. Going into step 1, there were four possible, equally-likely combinations of (C1,C2); specifically, (H,H), (H,T), (T,H), and (T,T). But since she is awake, she knows that (H,H) was eliminated in step 1. In only one of the remaining, still equally-likely combinations, did coin C1 land on Heads.

The answer is 1/3. No indexical information was used to determine this. No reference to the other potential waking, whether it occurs before or after this one, is needed. This implements Elga's question exactly; the only possible issue that remains is whether Elga's implementation does.
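The two-coin procedure above can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original comment: the helper name `awakenings` is hypothetical, and weighting every awakening equally is an assumption that mirrors the comment's own counting (it is exactly the step that halfers dispute).

```python
from fractions import Fraction
from itertools import product


def awakenings(c1, c2):
    """Return the (C1, C2) faces showing at each awakening of the two-pass procedure."""
    seen = []
    for _ in range(2):  # the procedure is run twice
        if not (c1 == "H" and c2 == "H"):
            seen.append((c1, c2))  # SB is awakened and asked about C1
        c2 = "T" if c2 == "H" else "H"  # C2 is turned over after each pass
    return seen


# All four equally likely initial flips, expanded into the awakenings they produce.
all_awakenings = [
    state
    for c1, c2 in product("HT", repeat=2)
    for state in awakenings(c1, c2)
]

# Assumption: each awakening carries equal weight.
heads_fraction = Fraction(
    sum(c1 == "H" for c1, _ in all_awakenings), len(all_awakenings)
)
print(len(all_awakenings), heads_fraction)
```

Under that equal-weight assumption the enumeration gives six awakenings, two of which have C1 showing Heads. It also confirms that SB is awakened once when C1 lands Heads and twice when it lands Tails.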

comment by agilecaveman · 2018-05-28T00:28:11.598Z · LW(p) · GW(p)

I think this post is fairly wrong headed.

First, your math seems to be wrong.

Your numerator is ½ * p(y), which seems like Pr(H∣M) * Pr(X2∣H,M)

Your denominator is 1/2⋅p(y)+1/2⋅p(y)(2−q(y)), which seems like

Pr(H∣M) * Pr(X2∣H,M) + Pr(¬H∣M) * Pr(X2∣¬H,M), which is Pr(X2 |M)

By Bayes' rule, Pr (H | M) * Pr(X2 |H, M) / Pr(X2 |M) = Pr(H∣X2, M), which is not the same quantity you claimed to compute Pr(H∣X2). Unless you have some sort of other derivation or a good reason why you omitted M in your calculations, this isn’t really “solving” anything.

Second, the dismissal of betting arguments is strange. If decision theory is indeed downstream of probability, then probability acts as an input to the decision theory. So, if there is a particular probability p of heads at a given moment, it means it’s most rational to bet according to said probability. If your ideal decision theory diverges from probability estimates to arrive at the right answer on betting puzzles, then the question of probability is useless. If it takes the probability into account and gets the wrong answer, then this is not truly rational.

More generally, probability theory is supposed to completely capture the state of knowledge of an agent and if there is other knowledge that is obscured by probability, that means it is important to capture as well in another system. Building a functional AI would then require a knowledge representation that is separate from, but interfaces with, a probability representation, making the real question: what is that knowledge representation?

“probability theory is logically prior to decision theory.” Yes, this is the common view because probability theory was developed first and is easier but it’s not actually obvious this *has* to be the case. If there is a new math that puts decisions as more fundamental than beliefs, then it might be better for a real AI.

Third, dismissal of “not H and it’s Tuesday” as not propositions doesn’t make sense. Classical logic encodes arbitrary statements within AND and OR -type constructions. There isn’t a whole lot of restrictions on them.

Fourth, the assumptions. Generally, I have read the problem as whatever Beauty experiences on Monday is the same as on Tuesday, or q(y) = 1, at which point this argument reduces to ½-er position and then the usual anti-1/2, pro-1/3 arguments apply. The paradox still stands for the moment when you wake up, or if you get no additional bits of input. The question of updating on actual input in the problem is an interesting one, but it hides the paradox of what your probability should be *at the moment of waking up*. You seem to simply declare it to be ½, by saying:

The prior for H is even odds: Pr(H∣M)=Pr(¬H∣M)=1/2.

This is generally indistinguishable from the ½ position you dismiss that argues for that prior on the basis of “no new information.” You still don’t know how to handle the situation of being told that it’s Monday and needing to update your probability accordingly, vs conditioning on Monday and doing inferences.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-29T01:42:57.554Z · LW(p) · GW(p)

By Bayes' rule, Pr (H | M) * Pr(X2 |H, M) / Pr(X2 |M) = Pr(H∣X2, M), which is not the same quantity you claimed to compute Pr(H∣X2).

That's a typo. I meant to write $Pr (H ∣ X_{2}, M)$ , not $Pr (H ∣ X_{2})$ .

Second, the dismissal of betting arguments is strange.

I'll have more to say soon about what I think is the correct betting argument. Until then, see my comment in reply to Radford Neal about disagreement on how to apply betting arguments to this problem.

“probability theory is logically prior to decision theory.” Yes, this is the common view because probability theory was developed first and is easier but it’s not actually obvious this *has* to be the case.

I said logically prior, not chronologically prior. You cannot have decision theory without probability theory -- the former is necessarily based on the latter. In contrast, probability theory requires no reference to decision theory for its justification and development. Have you read any of the literature on how probability theory is either an or the uniquely determined extension of propositional logic to handle degrees of certainty? If not, see my references. Neither Cox's Theorem nor my theorem relies on any form of decision theory.

Third, dismissal of “not H and it’s Tuesday” as not propositions doesn’t make sense. Classical logic encodes arbitrary statements within AND and OR -type constructions. There isn’t a whole lot of restrictions on them.

I'll repeat my response to Jeff Jo: The standard textbook definition of a proposition is a sentence that has a truth value of either true or false. The problem with a statement whose truth varies with time is that it does not have a simple true/false truth value; instead, its truth value is a function from time to the set {true,false}. In logical terms, such a statement is a predicate, not a proposition. For example, "Today is Monday" corresponds to the predicate $P(t) ≜ (\mathrm{dayof}(t) = \mathrm{Monday})$ . It doesn't become a proposition until you substitute in a specific value for $t$ , e.g. "Unix timestamp 1527556491 is a Monday."

The paradox still stands for the moment when you wake up

You have not considered the possibility that the usual decision analysis applied to this problem is wrong. There is, in fact, disagreement as to what the correct decision analysis is. I will be writing more on this in a future post.

You seem to simply declare [Beauty's probability at the moment of awakening] to be ½, by saying:

The prior for H is even odds: Pr(H∣M)=Pr(¬H∣M)=1/2.

This is generally indistinguishable from the ½ position you dismiss that argues for that prior on the basis of “no new information.”

In fact, I explicitly said that at the instant of awakening, Beauty's probability is the same as the prior, because at that point she does not yet have any new information. As she receives sensory input, her probability for Heads decreases asymptotically to 1/3. All of this is just standard probability theory, conditioning on the new information as it arrives. I dismissed the naive halfer position because it incorrectly assumes that Beauty's sensory input is irrelevant to the determination of her probability for Heads.
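The dependence on sensory input can be made concrete with the numerator and denominator quoted earlier in this thread, ½·p(y) and ½·p(y) + ½·p(y)(2 − q(y)). The following is a minimal sketch, assuming those two expressions are the complete posterior. The function name `posterior_heads` is hypothetical, and the only property of q(y) relied on here is the one stated in this thread, namely that q(y) = 1 corresponds to Monday's and Tuesday's experiences being the same.

```python
from fractions import Fraction


def posterior_heads(q, p=Fraction(1)):
    """Pr(H | X2, M) built from the numerator and denominator quoted in the thread.

    q stands for q(y) and p for p(y); p cancels, so its default value is arbitrary.
    """
    q, p = Fraction(q), Fraction(p)
    half = Fraction(1, 2)
    numerator = half * p
    denominator = half * p + half * p * (2 - q)
    return numerator / denominator  # simplifies to 1 / (3 - q)


# q = 1 recovers the prior; smaller q moves the posterior toward 1/3.
for q in (Fraction(1), Fraction(1, 2), Fraction(1, 100), Fraction(0)):
    print(q, posterior_heads(q))
```

Because p(y) cancels, the result depends only on q(y): it equals 1/2 when q(y) = 1 and tends to 1/3 as q(y) shrinks toward 0.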

You still don’t know how to handle the situation of being told that it’s Monday and needing to update your probability accordingly,

Uh, yes I do---it's just standard probability theory again. Just do the math. If Beauty finds out that it is Monday, her new information since Sunday changes from $(R (y, M) or R (y, T))$ to just $R (y, M)$ , and since the problem definition assumes that

Pr (R (y, M) ∣ H, M) = Pr (R (y, M) ∣ not H, M)

we get equal posterior probabilities for $H$ and $not H$ , which is generally accepted to be the right answer.

Replies from: jeff-jo, agilecaveman

↑ comment by Jeff Jo (jeff-jo) · 2018-06-22T19:53:03.436Z · LW(p) · GW(p)

You said: "The standard textbook definition of a proposition is a sentence that has a truth value of either true or false."

This is correct. And when a well-defined truth value is not known to an observer, the standard textbook definition of a probability (or confidence) for the proposition, is that there is a probability P that it is "true" and a probability 1-P that it is "false."

For example, if I flip a coin but keep it hidden from you, the statement "The coin shows Heads on the face-up side" fits your definition of a proposition. But since you do not know whether it is true or false, you can assign a 50% probability to the result where "It shows Heads" is true, and a 50% probability to the event where "it shows Heads" is false. This entire debate can be reduced to you confusing a truth value with the probability of that truth value.

On Monday Beauty is awakened. While awake she obtains no information that would help her infer the day of the week. Later in the day she is put to sleep again.

During this part of the experiment, the statement "today is Monday" has the truth value "true", and does not have the truth value "false." So by your definition, it is a valid proposition. But Beauty does not know that it is "true."

On Tuesday the experimenters flip a fair coin. If it lands Tails, Beauty is administered a drug that erases her memory of the Monday awakening, and step 2 is repeated.

During this part of the experiment, the statement "today is Monday" has the truth value "false", and does not have the truth value "true." So by your definition, it is a valid proposition. But Beauty does not know that it is "false."

In either case, the statement "today is Monday" is a valid proposition by the standard definition you use. What you refuse to acknowledge, is that it is also a proposition that Beauty can treat as "true" or "false" with probabilities P and 1-P.

Replies from: habryka4

↑ comment by habryka (habryka4) · 2018-06-22T20:06:38.693Z · LW(p) · GW(p)

[Moderator Note:] I am reasonably confident that this current format of the discussion is not going to cause any participant to change their mind, and seems quite stressful to the people participating in it, at least from the outside. While I haven't been able to read the whole debate in detail, it seems like you are repeating similar points over and over, in mostly the same language. I think it's fine for you to continue and comment, but I just really want to make sure that people don't feel an obligation to respond and get dragged into a debate that they don't expect to get any value from.

↑ comment by agilecaveman · 2018-05-31T19:24:30.905Z · LW(p) · GW(p)

If this is indeed a typo, please correct it at the top level post and link to this comment. The broader point is that the interpretation of P( H | X2, M) is probability of heads conditioned on Monday and X2, and P (H |X2) is probability of heads conditioned on X2. In the later paragraphs, you seem to use the second interpretation. In fact, it seems your whole post's argument and "solution" rests on this typo.

Dismissing betting arguments is very reminiscent of dismissing one-boxing in Newcomb's because one defines "CDT" as rational. The point of probability theory is to be helpful in constructing rational agents. If the agents that your probability theory leads to are not winning bets with the information given to them by said theory, the theory has questionable usefulness.

Just to clarify, I have read Probability Theory: The Logic of Science, and Bostrom's and Armstrong's papers on this. I have also read https://meaningness.com/probability-and-logic. The question of the relationship of probability and logic is not clear cut. And as Armstrong has pointed out, decisions can be more easily determined than probabilities, which means it's possible the ideal relationship between decision theory and probability theory is not clear cut, but that's a broader philosophical point that needs a top level post.

In the meantime, Fix Your Math!

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-06-10T21:01:37.346Z · LW(p) · GW(p)

No, P(H | X2, M) is $Pr (H ∣ X_{2}, M)$ , and not $Pr (H ∣ X_{2}, Monday)$ . Recall that $M$ is the proposed model. If you thought it meant "today is Monday," I question how closely you read the post you are criticizing.

I find it ironic that you write "Dismissing betting arguments is very reminiscent of dismissing one-boxing in Newcomb's" -- in an earlier version of this blog post I brought up Newcomb myself as an example of why I am skeptical of standard betting arguments (not sure why or how that got dropped.) The point was that standard betting arguments can get the wrong answer in some problems involving unusual circumstances where a more comprehensive decision theory is required (perhaps FDT).

Re constructing rational agents: this is one use of probability theory; it is not "the point". We can discuss logic from a purely analytical viewpoint without ever bringing decisions and agents into the discussion. Logic and epistemology are legitimate subjects of their own quite apart from decision theory. And probability theory is the unique extension of classical propositional logic to handle intermediate degrees of plausibility.

You say you have read PTLOS and others. Have you read Cox's actual paper, or any detailed discussions of it such as Paris's discussion in The Uncertain Reasoner's Companion, or my own "Constructing a Logic of Plausible Inference: A Guide to Cox's Theorem"? If you think that Cox's Theorem has too many arguable technical requirements, then I invite you to read my paper, "From Propositional Logic to Plausible Reasoning: A Uniqueness Theorem" (preprint here). That proof assumes only that certain existing properties of classical propositional logic be retained when extending the logic to handle degrees of plausibility. It does not assume any particular functional decomposition of plausibilities, nor does it even assume that plausibilities must be real numbers. As with Cox, we end up with the result that the logic must be isomorphic to probability theory. In addition, the theorem gives the required numeric value for a probability $Pr (A ∣ X)$ when $X$ contains, in propositional form, all of the background information we are using to assess the probability of $A$ . How much more "clear cut" do you demand the relationship between logic and probability be?

Regardless, for my argument about indexicals all that is necessary is that probability theory deals with classical propositions.

I responded to David Chapman's essay (https://meaningness.com/probability-and-logic) a couple of years ago here.

comment by Jeff Jo (jeff-jo) · 2018-05-25T14:15:24.037Z · LW(p) · GW(p)

You point out that Elga's analysis is based on an unproven assertion: that "it is Monday" and "it is Tuesday" are legitimate propositions. As far as I know, there is no definition of what can, or cannot, be used as a proposition. In other words, your analysis is based on the equally unproven assertion that they are not valid. Can we remove the need to decide?

On Sunday, the steps of the following experiment are explained to Beauty, and she is put to sleep with a drug that somehow records her memory state. After she is put to sleep, two coins are flipped; a quarter and a nickel.
On Monday, Beauty is awakened. While awake she obtains no information that would help her infer the day, or if she has been previously awakened. If the two coins are not both showing Heads, she is interviewed. An hour after being awakened, she is put back to sleep with a drug that resets her memory to the recorded state.
The nickel is turned over.
On Tuesday, Beauty is awakened. While awake she obtains no information that would help her infer the day, or if she has been previously awakened. If the two coins are not both showing Heads, she is interviewed. An hour after being awakened, she is put back to sleep with a drug that resets her memory to the recorded state.
On Wednesday, Beauty is awakened once more and told that the experiment is over.

In each interview, Beauty is asked for her epistemic probability that the quarter is showing Heads.

Non-consequential option: replace "Beauty is awakened ... If the two coins show the same face, she is interviewed" with "If the two coins show the same face, Beauty is awakened and interviewed".

I hope we can all agree that the order of the potential awakenings changes nothing. She is not asked whether, and her answer cannot depend on if, it is Monday or Tuesday. Or which day she might sleep through. So this Beauty no longer cares about the propositions "it is Monday" and "it is Tuesday."

Beauty can assess the probability space that describes the two coins before the decision to interview her (or to wake her) was made. Most significantly, it doesn't change when the nickel has been manually turned over. Yes, the day changes the path it takes to get there, but the states look identical. In that state, there are four possible outcomes: {HH,HT,TH,TT}. The probability distribution is {1/4,1/4,1/4,1/4}.

But when she is interviewed, she knows that {HH} has been ruled out. The probability distribution can be updated to {0,1/3,1/3,1/3}.

+++++

The point of contention in the Sleeping Beauty Problem is whether the probability state at the end of your step 1 is the same state as at the beginning of your steps 2 and 3. The mere fact that the two steps can be distinguished, and can result in different paths, demonstrates that they are not. If your definition of what constitutes a "valid proposition" cannot model this difference, then I suggest that it is that definition that is faulty.

And yes, there are other ways that I can demonstrate that the answer must be 1/3.

+++++

Note: The worst red herring in this thread is about how Beauty might be able to tell how she has aged. This is a thought problem in probability, not an exercise in human physiology. We must assume that the mechanisms described in the problem function ideally as described. That includes "While awake she obtains no information that would help her infer the day of the week." Considering how these mechanisms might not be achievable is not productive.

Replies from: ksvanhorn, ksvanhorn

↑ comment by ksvanhorn · 2018-05-29T04:29:36.692Z · LW(p) · GW(p)

On the first read I didn't understand what you were proposing, because of the confusion over "If the two coins show the same face" versus "If the two coins are not both heads." Now that it's clear it should be "if the two coins are not both heads" throughout, and after rereading, I now see your argument.

The problem with your argument is that you still have "today" smuggled in: one of your state components is which way the nickel is lying "today." That changes over the course of the time period we are analyzing, so it does not give a legitimate proposition. To get a legitimate proposition we'll have to split it up into two propositions: "The nickel lies Heads up on Monday" and "The nickel lies Heads up on Tuesday".

So in truth, the actual four possible outcomes are HHT, HTH, THT, and TTH. None of these is ruled out by the mere fact of waking up. Not until Beauty receives sufficient sensory input to provide a label for "today" that is nearly certain to be unique do we arrive at a situation in which your analysis is approximately correct.
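To make the split into timeless propositions concrete, here is a minimal sketch that enumerates the four outcomes named above and lists the days on which Beauty would be interviewed under the "not both Heads" rule. The three-letter labels follow the order (quarter, nickel on Monday, nickel on Tuesday), and the helper `interview_days` is hypothetical. Conditioning only on "Beauty is interviewed at least once" is an assumption that illustrates this reply's reading; it does not settle which proposition Beauty should actually condition on.

```python
from fractions import Fraction
from itertools import product

FLIP = {"H": "T", "T": "H"}


def interview_days(quarter, nickel_monday):
    """Days on which the quarter and nickel are not both showing Heads."""
    nickel = {"Monday": nickel_monday, "Tuesday": FLIP[nickel_monday]}
    return [
        day
        for day in ("Monday", "Tuesday")
        if not (quarter == "H" and nickel[day] == "H")
    ]


# Label each timeless outcome as quarter + nickel(Monday) + nickel(Tuesday).
outcomes = {
    q + n + FLIP[n]: interview_days(q, n) for q, n in product("HT", repeat=2)
}

# Outcomes compatible with "Beauty is interviewed at least once".
surviving = [label for label, days in outcomes.items() if days]
heads = Fraction(sum(label[0] == "H" for label in surviving), len(surviving))
print(outcomes, heads)
```

Every one of the four outcomes contains at least one interview, so that timeless proposition eliminates none of them and leaves the quarter at even odds.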

BTW, is this argument your own? Although I don't think it's right, it is an interesting argument. Is there a citation I should use if I want to reference it in future writing?

↑ comment by ksvanhorn · 2018-05-26T18:19:22.358Z · LW(p) · GW(p)

The standard textbook definition of a proposition is this:

A proposition is a sentence that is either true or false. If a proposition is true, we say that its truth value is "true," and if a proposition is false, we say that its truth value is "false."

(Adapted from https://www.cs.utexas.edu/~schrum2/cs301k/lec/topic01-propLogic.pdf.)

The problem with a statement whose truth varies with time is that it does not have a simple true/false truth value; instead, its truth value is a function from time to the set $\{\mathrm{true}, \mathrm{false}\}$ .

As for the rest of your argument, my request is this: show me the math. That is, define the joint probability distribution describing what Beauty knows on Sunday night, and tell me what additional information she has after awakening on Monday/Tuesday. As I argued in the OP, purely verbal arguments are suspect when it comes to probability problems; it's too easy to miss something subtle.

BTW, in one place you say "if the two coins are not both showing Heads," and in another you say "if the two coins show the same face"; which is the one you intended?

Replies from: jeff-jo

↑ comment by Jeff Jo (jeff-jo) · 2018-05-28T13:14:55.119Z · LW(p) · GW(p)

(Sorry about the typo - I waffled between several isomorphic versions. The one I ultimately chose should have "both showed Heads.")

In the OP, you said:

Another serious error in many discussions of this problem is the use of supposedly mutually exclusive “propositions” that are neither mutually exclusive nor actually legitimate propositions. HM, TM, and TT can be written as

HM = H and (it is Monday)

TM = (not H) and (it is Monday)

TT = (not H) and (it is Tuesday).

These are not truly mutually exclusive because, if not H, then Beauty will awaken on both Monday and Tuesday.

Now you say:

A proposition is a sentence that is either true or false.

Are you really claiming that the statement "today is Monday" is not a sentence that is either true or false? That it is not "mutually exclusive" with "today is Tuesday"? Or are you simply ignoring the fact that the frame of reference, within which Beauty is asked to assess the proposition "The coin lands Heads," is a fixed moment in time? That she is asked to evaluate it at the current moment, and not over the entire time frame of the experiment?

Let me insert an example here, to illustrate the problem with your assertion about functions. One half of a hidden, spinning disk is white; the other, black. It spins at a constant rate V, but you don't know its position at any previous time. There is a sensor aligned along its rim that can detect the color at the point in time when you press a button. You are asked to assess the probability of the proposition W, that the sensor will detect "white" when you first press the button.

This is a valid proposition, even though it varies with time. It is valid because it doesn't ask you to evaluate the proposition at every time, but at a fixed point in time.

The problem with a statement whose truth varies with time is that it does not have a simple true/false truth value; instead, its truth value is a function from time to the set {true,false}.

It does have a simple true/false truth value if you are asked to evaluate it at a fixed point in time. Your assertion applies to functions where all values of the dependent variable are considered to be "true" simultaneously.

I did give you the math, but I'll repeat it in a slightly different form. Consider the point in time just before (A) in my version, when Beauty is awake and could be interviewed, or (B) in yours, when Beauty could be awakened. At this point in time, there are two valid-by-your-definition propositions: H, the proposition that "the coin lands Heads" and M, the proposition that "today is Monday." Each is asking about a specific moment in time, so your unsupported assertion that we need to consider all possible values of the time parameter is wrong. The two propositions are independent, because at the moment in time where I asked you to evaluate it, H does not influence M.

The sample space (the set of possible outcomes described by {H,M}) is {{t,t},{t,f},{f,t},{f,f}}. The probability distribution for this sample space is {1/4,1/4,1/4,1/4}. If Beauty is (A) interviewed or (B) awakened, she knows that the outcome that applies to the current moment in time is not {t,f}. So the probability distribution can be updated to {1/3,0,1/3,1/3}.

+++++

The error that halfers make is considering all values of the time parameter to be applicable when Beauty is asked to make an assessment at a single, unknown time.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-29T02:25:39.120Z · LW(p) · GW(p)

Are you really claiming that the statement "today is Monday" is not a sentence that is either true or false?

Yes. It does not have a simple true/false truth value. Since it is sometimes true and sometimes false, its truth value is a function from time to {true, false}. That makes it a predicate, not a proposition.

Or are you simply ignoring the fact that the frame of reference, within which Beauty is asked to assess the proposition "The coin lands Heads," is a fixed moment in time?

It is not a fixed moment in time; if it were, the SB problem would be trivial and nobody would write papers about it. The questions about day of week and outcome of coin toss are potentially asked both on Monday and on Tuesday. This makes the rest of your analysis invalid. You keep on asserting that "today is Monday" is evaluated at a fixed moment in time, when in reality it is evaluated at at least two separate moments in time with different answers.

You are asked to assess the probability of the proposition W, that the sensor will detect "white" when you first press the button. This is a valid proposition, even though it varies with time.

The sentence "the sensor detects white" is not a valid proposition; it is a predicate, because it is a function of time. Let's write $P (t)$ for this predicate. But yes, the sentence "the sensor detects white when you first press the button" is a legitimate proposition, precisely because it specifies a particular time $t$ for which $P (t)$ is true, and so the truth value of the statement itself does not vary with time.

This gets us to the whole point of defining $R (y, d)$ : saying "Beauty has a stream of experiences $y$ on day $d$ " is as close as we can get to identifying a specific moment in time corresponding to the "this" in "this is day $d$ ". The more nearly that $y$ uniquely identifies the day, the more nearly that $R (y, d)$ can be interpreted to mean "this is day $d$ ".

Replies from: jeff-jo

↑ comment by Jeff Jo (jeff-jo) · 2018-05-29T15:35:43.161Z · LW(p) · GW(p)

Are you really claiming that the statement "today is Monday" is not a sentence that is either true or false?

Yes. It does not have a simple true/false truth value.

It most certainly does. It is true on Monday when Beauty is awake, and false on Sunday Night, on Tuesday whether or not Beauty is awake, and on Wednesday.

A better random variable might be D, which takes values in {0,1,2,3} for these four days. What you refuse to deal with, is that its uninformed distribution depends on the stage of the experiment: {1,0,0,0} when she knows it is Sunday, {0,1/2,1/2,0} when she is awakened but not told the experiment is over, and {0,0,0,1} when she is told it is over.

Or you could just recognize that the probability space when she awakes is not derived by removing outcomes from Sunday's. Which is how conventional problems in conditional probability work. That a new element of randomness is introduced by the procedures you use in steps 2 and 3.

To illustrate this without obfuscation, ignore the amnesia part. Wake Beauty just once. It can happen any day during the rest of the week, as determined by a roll of a six-sided die. When she is awake, "Die lands 3" is just as valid a proposition - in fact, the same proposition - as "today is Wednesday." It has probability 1/6.

If you add in the amnesia drug, and roll two dice (re-rolling if you get doubles so that you wake her on two random days), the probability for "a die lands 3" is 1/3, but for "today is Wednesday" it is 1/6.
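The two numbers above can be checked by exact enumeration. The following is a minimal sketch, not part of the original comment; it assumes that die face 3 stands for Wednesday and that, within a given outcome, the two wakings are weighted equally (which is the premise the comment relies on for "today is Wednesday"). The names `outcomes`, `p_a_die_lands_3`, and `p_today_is_wednesday` are illustrative.

```python
from fractions import Fraction
from itertools import permutations

DAYS = range(1, 7)  # die faces 1..6 stand for the six candidate days
WEDNESDAY = 3       # assumption: face 3 is the day labelled "Wednesday"

# Two dice, re-rolled on doubles: 30 equally likely ordered pairs of distinct faces.
outcomes = list(permutations(DAYS, 2))
p_outcome = Fraction(1, len(outcomes))

# "A die lands 3" has one truth value for the whole run of the experiment.
p_a_die_lands_3 = sum(p_outcome for pair in outcomes if WEDNESDAY in pair)

# "Today is Wednesday": assumption, the two wakings within an outcome are
# weighted 1/2 each because amnesia makes them indistinguishable.
p_today_is_wednesday = sum(
    p_outcome * Fraction(pair.count(WEDNESDAY), 2) for pair in outcomes
)

print(p_a_die_lands_3)       # 1/3
print(p_today_is_wednesday)  # 1/6
```

The equal weighting of wakings is exactly the step under dispute in this thread, so the second figure illustrates the comment's claim rather than settling it.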

Since it is sometimes true and sometimes false, its truth value is a function from time to {true, false}. That makes it a predicate, not a proposition.

The proposition "coin lands heads" is sometimes true, and sometimes false, as well. In fact, you have difficulty expressing the tense of the statement for that very reason.

But, it is a function of the parameters that define how you flip a coin: start position, force applied, etc. What you refuse to deal with, is that in this odd experiment, the time parameter Day is also one of the independent parameters that defines the randomness of Beauty's situation, and not one that makes Monday's state predicated on Sunday's.

It is not a fixed moment in time; if it were, the SB problem would be trivial and nobody would write papers about it.

By being asked about the proposition H, Beauty knows that she is in either step 2 or step 3 of your experiment. This establishes a fixed value of the time parameter Day. And the problem is trivial - people write papers about it because they don't understand how Day is an independent parameter that defines the randomness of the situation, and not one that predicates one state on another.

The sentence "the sensor detects white" is not a valid proposition.

Then "Coin lands heads" is similarly a predicate, and so not a valid proposition.

But your argument about being a predicate, and not a valid proposition, does apply to the statement "It is the 9 o'clock hour." Because "hour" is not a parameter you use to define the randomness of the situation.

+++++

Here are some simple questions for you, to illustrate how randomness is being defined. Write the four labels {"Heads,Monday", "Tails,Monday", "Heads,Tuesday", "Tails,Tuesday"} on four cards. Deal one at random to Beauty. Change step 3 of your experiment so that the day and coin result it mentions are those on the dealt card. Change step 2 so that it mentions the other day. Change the proposition Beauty is asked to evaluate a probability for to "Coin lands on the face written on the dealt card."

If, on Sunday, she is shown that she was dealt, "Heads,Tuesday," this is identically your problem.

If, on Sunday, she is shown a different label, does this represent an equivalent problem with the same answer?

If she is not shown the label, does it have the same answer? And is it still an equivalent problem?

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-30T14:49:39.382Z · LW(p) · GW(p)

It is true on Monday when Beauty is awake, and false on Sunday Night, on Tuesday whether or not Beauty is awake, and on Wednesday.

That's not a simple, single truth value; that's a structure built out of truth values.

The proposition "coin lands heads" is sometimes true, and sometimes false, as well.

No, it is not. It has the same truth value throughout the entire scenario, Sunday through Wednesday. On Sunday and Monday it is impossible to know what that truth value is, but it is either true that the coin will land heads, or false that it will land heads -- and by definition, that is the same truth value you'll assign after seeing the coin toss. In contrast, the truth of "it is Monday" keeps on changing throughout the scenario. Likewise, the truth of "the sensor detects white" changes throughout the scenario you are considering in your button-and-sensor example.

Day is an independent parameter that defines the randomness of the situation

I don't know what it means to "define the randomness of the situation." In any event, the point you are missing is that Day changes throughout the problem you are analyzing -- not just that there are different possible values for it, and you don't know which is the correct one, but at different points in the same problem it has different values.

Things like "today" and "now" are known as indexicals, and there is an entire philosophical literature on them because they are problematic for classical logic. Various special logics have been devised specifically to handle them. It would not have been necessary to devise such alternative logics if they posed no problem for classical logic. You can read about them in the article Demonstratives and Indexicals in The Internet Encyclopedia of Philosophy. Some excerpts:

In the philosophy of language, an indexical is any expression whose content varies from one context of use to another. The standard list of indexicals includes... adverbs such as “now”, “then”, “today”, “yesterday”, “here”, and “actually”...

Contemporary philosophical and linguistic interest in indexicals and demonstratives arises from at least four sources. ...(iii) Indexicals and demonstratives raise interesting technical challenges for logicians seeking to provide formal models of correct reasoning in natural language...

The problem with indexicals is that they have meanings that may change over the course of the problem being discussed. This is simply not allowed in classical logic. In classical logic, a proposition must have a stable, unvarying truth value over the entire argument. I'm going to appeal to authority here, and give you some quotes.

Section 3.2, "Meanings of Sentences", in Propositions, Stanford Encyclopedia of Philosophy:

The problem is this: it seems propositions, being the objects of belief, cannot in general be spatially and temporally unqualified. Suppose that Smith, in London, looks out his window and forms the belief that it is raining. Suppose that Ramirez, in Madrid, relying on yesterday’s weather report, awakens and forms the belief that it is raining, before looking out the window to see sunshine. What Smith believes is true, while what Ramirez believes is false. So they must not believe the same proposition. But if propositions were generally spatially unqualified, they would believe the same proposition. An analogous argument can be given to show that what is believed must not in general be temporally unqualified.

(Emphasis added.) The above is telling us that a "proposition" involving an indexical is not a single proposition, but a set of propositions that you get by specifying a particular time/location.

Classical Mathematical Logic: The Semantic Foundations of Logic, by Richard L. Epstein, is clear that indexicals are not allowed in classical logic. On p. 4, "Exercises for Sections A and B," one of the exercises is this:

Explain why we cannot take sentence types as propositions if we allow the use of indexicals in our reasoning.

The explanation is given on the previous page (p. 3):

When we reason together, we assume that words will continue to be used in the same way. That assumption is so embedded in our use of language that it's hard to think of a word except as a type, that is, as a representative of inscriptions that look the same and utterances that sound the same. I don't know how to make precise what we mean by 'look the same' or 'sound the same.' But we know well enough in writing and conversation what it means for two inscriptions or utterances to be equiform.

Words are types. We will assume that throughout any particular discussion equiform words will have the same properties of interest to logic. We therefore identify them and treat them as the same word. Briefly, a word is a type.

This assumption, while useful, rules out many sentences we can and do reason with quite well. Consider 'Rose rose and picked a rose.' If words are types, we have to distinguish the three equiform inscriptions in this sentence, perhaps as 'Rose_{name} rose_{verb} and picked a rose_{noun}'.

Further, if we accept this agreement, we must avoid words such as 'I', 'my', 'now', or 'this', whose meaning or reference depends on the circumstances of their use. Such words, called indexicals, play an important role in reasoning, yet our demand that words be types requires that they be replaced by words that we can treat as uniform in meaning or reference throughout a discussion.

...

Propositions are types. In any discussion in which we use logic we'll consider a sentence to be a proposition only if any other sentence or phrase that is composed of the same words in the same order can be assumed to have the same properties of concern to logic during that discussion. We therefore identify equiform sentences or phrases and treat them as the same sentence. Briefly, a proposition is a type.

Notice the following statements made above:

"words will continue to be used in the same way" They do not change meaning within the discussion.
"equiform words will have the same properties of interest to logic" In particular, the same word used at different points in the argument must have the same meaning.
"we must avoid words such as... 'now', ... whose meaning or reference depends on the circumstances of their use."
"our demand that words be types requires that they be replaced by words that we can treat as uniform in meaning or reference throughout a discussion."

Replies from: jeff-jo, jeff-jo

↑ comment by Jeff Jo (jeff-jo) · 2018-05-30T19:32:58.291Z · LW(p) · GW(p)

(Not in order)

The problem is this: it seems propositions, being the objects of belief, cannot in general be spatially and temporally unqualified.

Note the clause "in general." Any assertion that applies "in general" can have exceptions in specific contexts.

We similarly cannot deduce, in general, that a coin toss which influences the path(s) of an experiment, is a 50:50 proposition when evaluated in the context of only one path.

"In the philosophy of language, an indexical is any expression whose content varies from one context of use to another."

An awake Beauty is asked about her current assessment of the proposition "The coin will/has landed Heads." Presumably, she is supposed to answer on the same day. So, while the content of the expression "today" may change with the changing context of the overarching experiment, that context does not change between asking and answering. So this passage is irrelevant.

The problem with indexicals is that they have meanings that may change over the course of the problem being discussed.

And the problem with using this argument on the proposition "Today is Monday," is that neither the context, nor the meaning, changes within the problem Beauty addresses.

The above is telling us that a "proposition" involving an indexical is not a single proposition, but a set of propositions that you get by specifying a particular time/location.

No, it analyzed two specific usages of an indexical, and showed that they represented different propositions. And concluded that, in general, indexicals can represent different propositions. It never said that multiple usages of a time/location word cannot represent the same proposition, or that we can't define a situation where we know they represent the same proposition.

If we accept this agreement, we must avoid words such as 'I', 'my', 'now', or 'this', whose meaning or reference depends on the circumstances of their use.

So my corner bar can post a sign saying "Free Beer Tomorrow," without ever having to pour free suds. But if it says "Free Beer Today," they will, because the context of the sign is the same as the context when somebody asks for it. Both are indexicals, but the conditions that would make it ambiguous are removed.

"words will continue to be used in the same way" They do not change meaning within the discussion.

And over the duration of when Beauty considers the meaning of "today," it does not change.

the same word used at different points in the argument must have the same meaning.

"Today" means the same thing every time Beauty uses it. This is different than saying the truth value of the statement is the same at different points in Beauty's argument; but it is. She is making a different (but identical) argument on the two days.

"we must avoid words such as... 'now', ... whose meaning or reference depends on the circumstances of their use."

Only if those circumstances might change within the scope of their use.

requires that [words] be replaced by words that we can treat as uniform in meaning or reference throughout a discussion.

And throughout Beauty's discussion of the probability she was asked for, the meaning of "Today" does not change.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-31T02:26:26.514Z · LW(p) · GW(p)

Note the clause "in general."

Now you're really stretching.

And over the duration of when Beauty considers the meaning of "today," it does not change.

That duration potentially includes both Monday and Tuesday.

"Today" means the same thing every time Beauty uses it.

This is getting ridiculous. "Today" means a different thing on every different day. That's why the article lists it as an indexical. Going back to the quote, the "discussion" is not limited to a single day. There are at least two days involved.

I notice you carefully ignored the quote from Epstein's book, which was very clear that a classical proposition must not contain indexicals.

↑ comment by Jeff Jo (jeff-jo) · 2018-05-30T18:22:34.440Z · LW(p) · GW(p)

[The proposition "today is Monday" is] not a simple, single truth value; that's a structure built out of truth values.

At any point in the history that Beauty remembers in step 2 or step 3, the proposition has a simple, single truth value. But she cannot determine what that value is. This is the basis for being able to describe its truth value with probabilities.

"The proposition 'coin lands heads' is sometimes true, and sometimes false, as well."

No, it is not. It has the same truth value throughout the entire scenario, Sunday through Wednesday.

In some instances of the experiment, it is true. In others, it is false.

Just like "today is Monday" has the same truth value at any point in the history that Beauty remembers in step 2 or step 3. Your error is in failing to understand that, to an awake Beauty, the "experiment" she sees consists of Sunday and a single day after it. She just doesn't know which. In her experiment, the proposition "today is Monday" has a simple, single truth value. The truth of "it is Monday" never changes at any point of the scenario she sees after being wakened.

The point you are missing is that Day changes throughout the problem you are analyzing.

And the point I am trying to get across to you is that it cannot change at any point of the problem Beauty is asked to analyze.

The problem that I am analyzing is the problem that Beauty was asked to analyze. Not what an outside observer sees. She was told some details on Sunday, put to sleep, and is now awake on an indeterminate day.

She is asked about a coin that may have been flipped, or has already been flipped, but to her that difference is irrelevant. "Today is Monday" is either true, or false (which means "Today is Tuesday"). She doesn't know which, but she does know that this truth value cannot change within the scope of the problem as she sees it now.

Things like "today" and "now" are known as indexicals, and there is an entire philosophical literature on them because they are problematic for classical logic.

No, "time" is an indexical. That means that the value of time can change the context of the problem when you consider different values to be part of the same problem. Not that a problem that deals with only one specific value, and so an unchanging context, has that property.

While Beauty is awake, the day does not change. While Beauty is awake, the context of the problem does not change. While Beauty is awake, the other day of the experiment does not exist in her context. So for our problem, this resolves the issue that classical logic has with the word "today."

The problem with indexicals is that they have meanings that may change over the course of the problem being discussed.

But the meaning of "Today" does not change over the course of the problem Beauty is asked to address. This is different than her not knowing what that value is.

+++++

And you didn't answer my questions, about the variable Sleeping Beauty Problem. They really are simple.

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-31T02:52:04.996Z · LW(p) · GW(p)

At any point in the history that Beauty remembers in step 2 or step 3, the proposition has a simple, single truth value.

No, it doesn't. This boils down to a question of identity. Absent any means of uniquely identifying the day -- such as, "the day in which a black marble is on the dresser" -- there is a fundamental ambiguity. If Beauty's remembered experiences and mental state are identical at a point in time on Monday and another point in time on Tuesday, then "today" becomes ill-defined for her.

In some instances of the experiment

What instances are you talking about? We're talking about a single experiment. We're talking about epistemic probabilities, not frequencies. You need to relinquish your frequentist mindset for this problem, as it's not a problem about frequentist probabilities.

to an awake Beauty, the "experiment" she sees consists of Sunday and a single day after it.

No, it doesn't. She knows quite well that if the coin lands Tails, she will awaken on two separate days. It doesn't matter that she can only remember one of them.

The problem that I am analyzing is the problem that Beauty was asked to analyze. Not what an outside observer sees.

Epistemic probabilities are a function, not of the person, but of the available information. Any other person given the same information must produce the same epistemic probabilities. That's fundamental.

No, "time" is an indexical.

Go read the quotes again. Are you a greater authority on this subject than the authors of the Stanford Encyclopedia of Philosophy?

you didn't answer my questions, about the variable Sleeping Beauty Problem.

They're irrelevant. You added an extra layer of randomness on top of the problem. Each of the four card outcomes leads to a problem equivalent to the first. But randomly choosing one of four problems equivalent to the first problem doesn't tell you what the solution to the first problem is.

I do not understand why you are so insistent on using "propositions" that include indexicals, especially when there is no need to do so -- we can express the information Beauty has in a way that does not involve indexicals. When we do so, we get an answer that is not quite the same as the answer you get when you play fast and loose with indexicals. Since you've never been able to point out a flaw in the argument -- all you've done is presented a different argument you like better -- you should consider this evidence that indexicals are, in fact, a problem, just like Epstein and others have said.

Replies from: jeff-jo

↑ comment by Jeff Jo (jeff-jo) · 2018-05-31T13:15:33.755Z · LW(p) · GW(p)

"At any point in the history that Beauty remembers in step 2 or step 3, the proposition has a simple, single truth value."

No, it doesn't. This boils down to a question of identity. Absent any means of uniquely identifying the day -- such as, "the day in which a black marble is on the dresser" -- there is a fundamental ambiguity.

At any point in the history that Beauty remembers when she is in one of those steps, the proposition M, "Today is Monday," has a simple, single truth value. All day. Either day. If she is in step 2, it is "true." If she is in step 3, it is "false."

The properties of "indexicals" that you are misusing apply when, within her current memory state, the value of "today" could change. Not within the context of the overarching experiment.

This has nothing to do with whether she knows what that truth value is. In fact, probability is how we represent the "fundamental ambiguity" that the simple, single truth value belonging to a proposition is unknown to us. If you want to argue this point, I suggest that you try looking for the forest through the trees.

If Beauty's remembered experiences and mental state are identical at a point in time on Monday and another point in time on Tuesday, then "today" becomes ill-defined for her.

I tell you that I will flip a coin, ask a question, and then repeat the process.

If the question is "What is the probability that the coin is showing Heads?", and I require an answer before I repeat the flip, then the coin's state has a simple, single truth value that you can represent with a probability.

If the question is "What is the probability that the coin is showing Heads?", and I require an answer only after the second flip, the question only applies to the second since it asks about a current state. But it has a simple, single truth value that you can represent with a probability.

If the question is "What is the probability of showing Heads?" then we have the logical conundrum you describe.

"Showing" is an indexical. It can change over time. But it is only an issue if we refer to it in the context of a range of time where it does change. That's why indexicals are a problem in general, but maybe not in a specific case.

"Today" is never ill-defined for Beauty.

"To an awake Beauty, the "experiment" she sees consists of Sunday and a single day after it."

No, it doesn't. She knows quite well that if the coin lands Tails, she will awaken on two separate days. It doesn't matter that she can only remember one of them.

The entirety of the experiment includes Sunday, Wednesday, and two other days. She knows that. The portion that exists in her memory state at the time she is asked to provide an answer consists of Sunday (when she learned it all), which cannot be "Today," and Today, which has a simple, single value.

I do not understand why you are so insistent on using "propositions" that include indexicals

Because the property that defines an indexical is that it can change over the domain where it is evaluated. Beauty is asked for her answer within a domain where "Today" does not change.

You didn't answer my questions, about the variable Sleeping Beauty Problem.

They're irrelevant.

I've learned from experience that I need halfers to answer them while they seem irrelevant. Otherwise, they argue that there is a difference, but can't say what that difference is. Yes, this has happened more than once.

Each of the four card outcomes leads to a problem equivalent to the first. But randomly choosing one of four problems equivalent to the first problem doesn't tell you what the solution to the first problem is.

Not yet, but it does tell you that the same answer applies to the original problem, and to the random-card problem.

So use four Beauties. Deal one card to each, but don't show them. And flip the coin on Sunday (necessary since we need the result on Monday).

In your step 2, bring the three awake volunteers together to discuss their answers. Tell them, truthfully, what they already know: "One of you was dealt a card where the coin value matches the flip we performed on Sunday. Two were dealt a card with the opposite coin result. What probability should you assign to the propositions that each of you is the one whose card matches?"

There are three possibilities. Each must have the same probability, since they have no information that distinguishes any one from the other. The probabilities must add up to 1.

They are all 1/3.
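The counting behind this argument can be sketched as follows. This is an illustration only, not part of the original comment: it assumes the rule implied by the modified steps, namely that the Beauty whose card names the current (coin, day) combination sleeps through it while the other three are awake. The names `CARDS`, `awake_cards`, and `summary` are illustrative.

```python
from itertools import product

COINS = ("Heads", "Tails")
DAYS = ("Monday", "Tuesday")
CARDS = list(product(COINS, DAYS))  # one card per Beauty

def awake_cards(coin, day):
    """Cards of the Beauties who are awake. Assumption: the Beauty whose
    card names this exact (coin, day) combination sleeps through it."""
    return [card for card in CARDS if card != (coin, day)]

summary = {}
for coin, day in product(COINS, DAYS):
    awake = awake_cards(coin, day)
    # A card "matches" when its coin face equals the actual flip.
    matching = [card for card in awake if card[0] == coin]
    summary[(coin, day)] = (len(awake), len(matching))
    print(coin, day, "awake:", len(awake), "matching:", len(matching))
```

In every one of the four (coin, day) situations three volunteers are awake and exactly one holds a matching card. The enumeration only confirms that count; the step from "one of three, with no distinguishing information" to a probability of 1/3 is the symmetry argument made in the comment.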

Replies from: habryka4

↑ comment by habryka (habryka4) · 2018-05-31T17:30:49.279Z · LW(p) · GW(p)

[Kinda speaking from my experience as a moderator here, but not actually really doing anything super mod-related]: I haven't been able to follow the details from this conversation, and I apologize for that, but from the outside it does really look like you two are talking past each other. I don't know what the best way to fix that is, or even whether I am right, but my guess is that it's better to retire this thread for now and continue some other time. I am also happy to offer some more moderation if either of you requests that.

Also feel free to ignore this and just continue with your discussion, but it seemed better to give you two an out, if either of you feels like you are wasting time but are forced to continue talking for some reason or another.

comment by habryka (habryka4) · 2018-05-22T18:01:45.744Z · LW(p) · GW(p)

This seems great! I am interested in reading this in more detail when I have some more time.

comment by JeffJo · 2024-03-10T20:38:43.070Z · LW(p) · GW(p)

Some researchers are going to put you to sleep. During the two days[1] that your sleep will last, they will briefly wake you up either once or twice, depending on the toss of a fair coin (Heads: once; Tails: twice). After each waking, they will put you to back to sleep with a drug that makes you forget that waking. When you are first awakened[2], to what degree ought you believe that the outcome of the coin toss is Heads?

So the controversy was created by Elga's implementation. And it was unnecessary. There is another implementation of the same problem that does not rely on indexicals.

Once SB is told the details of the experiment and put to sleep, we flip two coins: call them C1 and C2. Then we perform this procedure:

If both coins are showing Heads, we end the procedure now with SB still asleep.
Otherwise, we wake SB and ask for her degree of belief that coin C1 landed on Heads.
After she gives an answer, we put her back to sleep with amnesia.

After these steps are concluded, whether that occurred in step 1 or step 3, we turn coin C2 over to show the opposite side. And then repeat the same procedure.

SB will thus be wakened once if coin C1 landed on Heads, and twice if Tails. Either way, she will not recall another waking. But that does not matter. She knows all of the details that apply to the current waking. In step 1, there were four possible, equally-likely combinations of (C1,C2); specifically, (H,H), (H,T), (T,H), and (T,T). But since she is awake, she knows that (H,H) was eliminated in step 1. In only one of the remaining, still equally-likely combinations did coin C1 land on Heads.

The answer is 1/3. No indexical information was used to determine this.
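A minimal sketch of the two-coin procedure, added for illustration and not part of the original comment. It checks two things: that the procedure produces one waking when C1 is Heads and two when C1 is Tails, and that eliminating (H,H) from four equally likely states leaves C1 showing Heads in one of three. The equal likelihood of the four states within the current pass is the comment's premise, taken here as an assumption; the helper names `flip_over` and `wakings` are illustrative.

```python
from fractions import Fraction
from itertools import product

def flip_over(face):
    """Turn a coin over to show the opposite side."""
    return "T" if face == "H" else "H"

def wakings(c1, c2):
    """Return the (C1, C2) states showing at each waking of one run."""
    states = [(c1, c2), (c1, flip_over(c2))]  # first pass, then C2 turned over
    return [s for s in states if s != ("H", "H")]  # (H, H) means no waking

# Number of wakings for each initial toss of the two coins.
counts = {(c1, c2): len(wakings(c1, c2)) for c1, c2 in product("HT", repeat=2)}

# Assumption: within the current pass the four states are equally likely,
# and being awake rules out (H, H).
awake_states = [s for s in product("HT", repeat=2) if s != ("H", "H")]
p_heads = Fraction(sum(s[0] == "H" for s in awake_states), len(awake_states))

print(counts)
print(p_heads)  # 1/3
```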

comment by [deleted] · 2018-05-28T10:25:06.759Z · LW(p) · GW(p)

As it stands now, I can't accept this solution, simply because it doesn't inform the right decision.

Imagine you were Beauty and q(y) was 1, and you were offered that bet. What odds would you take?

Our models exist to serve our actions. There is no such thing as a good model that informs the wrong action. Probability must add up to winning.

Or am I interpreting this wrong, and is there some practical reason why taking 1/2 odds actually does win in the q(y) = 1 case?

Replies from: ksvanhorn

↑ comment by ksvanhorn · 2018-05-29T00:54:16.534Z · LW(p) · GW(p)

Yes, there is. I'll be writing about that soon.

comment by musicmage4114 · 2018-05-23T19:25:52.532Z · LW(p) · GW(p)

Beauty's physiological state (heart rate, blood glucose level, etc.) will not be identical, and will affect her thoughts at least slightly. Treating these and other differences as random,

Not all of the differences are random, though. Sleeping Beauty will always have aged by one day if awakened on Monday, and by two days if awakened on Tuesday, and even that much aging has distinguishable consequences. Now, I'm not at all familiar with the math involved, but it seems like this solution hinges on "everything" being random. If not everything is random, does this solution still work?
