Sleeping Beauty Resolved?
post by ksvanhorn · 2018-05-22T14:13:05.364Z · score: 39 (17 votes) · LW · GW · 74 comments
Introduction
The Sleeping Beauty problem has been debated ad nauseam since Elga's original paper [Elga2000], yet no consensus has emerged on its solution. I believe this confusion is due to the following errors of analysis:
- Failure to properly apply probability theory.
- Failure to construct legitimate propositions for analysis.
- Failure to include all relevant information.
The only analysis I have found that avoids all of these errors is in Radford Neal's underappreciated technical report on anthropic reasoning [Neal2007]. In this note I'll discuss how both “thirder” and “halfer” arguments exhibit one or more of the above errors, how Neal's analysis avoids them, and how the conclusions change when we alter the scenario in various ways.
As a reminder, this is the Sleeping Beauty problem:
1. On Sunday the steps of the experiment are explained to Beauty, and she is put to sleep.
2. On Monday Beauty is awakened. While awake she obtains no information that would help her infer the day of the week. Later in the day she is put to sleep again.
3. On Tuesday the experimenters flip a fair coin. If it lands Tails, Beauty is administered a drug that erases her memory of the Monday awakening, and step 2 is repeated.
4. On Wednesday Beauty is awakened once more and told that the experiment is over.
The question is this: when awakened during the experiment, what probability should Beauty give that the coin in step 3 lands Heads?
“Halfers” argue that the answer is 1/2, and “thirders” argue that the answer is 1/3. I will argue that any answer between 1/2 and 1/3 may be obtained, depending on details not specified in the problem description; but under reasonable assumptions the answer is slightly more than 1/3. Furthermore,
- halfers employ valid reasoning but get the wrong answer because they omit some seemingly irrelevant information; and
- thirders (except Neal) employ invalid reasoning that nonetheless arrives at (nearly) the right answer.
The standard framework for solving probability problems
There are actually three separate “Heads” probabilities that arise in this problem:
- $p_1$, the probability that Beauty should give for Heads on Sunday.
- $p_2$, the probability that Beauty should give for Heads on Monday/Tuesday.
- $p_3$, the probability that Beauty should give for Heads on Wednesday.
There is agreement that $p_1 = p_3 = 1/2$, but disagreement as to whether $p_2 = 1/2$ or $p_2 = 1/3$. What does probability theory tell us about how to approach this problem? The $p_i$ are all epistemic probabilities, and they are all probabilities for the same proposition (the coin lands “Heads”), so any difference can only be due to different information possessed by Beauty in the three cases. The proper procedure for answering the question is then the following:
- Construct a probabilistic model $\mathcal{M}$ incorporating the information common to the three cases. That is, choose a set of variables that describe the situation and posit an explicit joint probability distribution over these variables. We'll assume that $\mathcal{M}$ includes a variable $H$ which is true if the coin lands Heads and false otherwise.
- Identify propositions $X_1$, $X_2$, and $X_3$ expressing the additional information (if any) that Beauty has available in each of the three cases, beyond what is already expressed by $\mathcal{M}$.
- Then $p_i = \Pr(H \mid X_i, \mathcal{M})$, which can be computed using the rule for conditional probabilities (Bayes' Rule).
Since Beauty does not forget anything she knows on Sunday, we can take $\mathcal{M}$ to express everything she knows on Sunday, and $X_1$ to be null (no additional information).
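For a binary variable such as the coin, the conditioning step can be written out explicitly. The following is a sketch rather than part of the original argument, writing $H$ for “the coin lands Heads,” $X_i$ for the additional information in case $i$, and $\mathcal{M}$ for the common model; it is simply Bayes' Rule with the law of total probability in the denominator:

$$p_i = \Pr(H \mid X_i, \mathcal{M}) = \frac{\Pr(X_i \mid H, \mathcal{M}) \, \Pr(H \mid \mathcal{M})}{\Pr(X_i \mid H, \mathcal{M}) \, \Pr(H \mid \mathcal{M}) + \Pr(X_i \mid \neg H, \mathcal{M}) \, \Pr(\neg H \mid \mathcal{M})}$$

When $X_i$ is null, both likelihoods equal 1 and the expression reduces to $\Pr(H \mid \mathcal{M})$. Any departure of $p_i$ from the prior must therefore come from $X_i$ being more probable under one outcome of the coin than under the other.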
Failure to properly apply probability theory
With the exception of Neal, thirders do not follow the above process. Instead they posit one model for the first and third cases, which they then toss out in favor of an entirely new model for the second case. This is a fundamental error.
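To be specific, the model $\mathcal{M}$ is something like this:

$$\begin{aligned} H &\sim \mathrm{Bernoulli}(1/2) \\ W_1 &= \text{true} \\ W_2 &= \neg H \end{aligned}$$

where $W_1$ means that Beauty wakes on Monday, $W_2$ means that Beauty wakes on Tuesday, and $\mathrm{Bernoulli}(p)$ is the distribution on $\{\text{true}, \text{false}\}$ that assigns probability $p$ to true. $X_3$ would be Beauty's experiences and observations from the last time she awakened, and this is implicitly assumed to be irrelevant to whether $H$ is true, so that

$$\Pr(H \mid X_3, \mathcal{M}) = \Pr(H \mid \mathcal{M}) = 1/2.$$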
Thirders then usually end up positing a new model $\mathcal{M}'$ that is equivalent to the following:

$$\begin{aligned} (H_1, T_1, T_2) &\sim \mathrm{Categorical}(1/3, 1/3, 1/3) \\ H &\Leftrightarrow H_1 \end{aligned}$$

The first line above means that $H_1$, $T_1$, and $T_2$ are mutually exclusive, each having probability 1/3.
- $H_1$ means “the coin lands Heads, and it is Monday,”
- $T_1$ means “the coin lands Tails, and it is Monday,” and
- $T_2$ means “the coin lands Tails, and it is Tuesday.”
$\mathcal{M}'$ is not derived from $\mathcal{M}$ via conditioning on any new information $X_2$; instead thirders construct an argument for it de novo. For example, Elga's original paper [Elga2000] posits that, if the coin lands Tails, Beauty is told it is Monday just before she is put to sleep again, and declares by fiat that her probability for $H$ at this point should be $1/2$; he then argues backwards from there as to what her probability for $H$ had to have been prior to being told it is Monday.
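To make the contrast concrete, here is a minimal sketch that enumerates the possible worlds of the simple Sunday model and conditions on a few propositions about waking. The variable names `H`, `W1`, and `W2` (coin lands Heads, Beauty wakes on Monday, Beauty wakes on Tuesday) and the helper `prob` are illustrative assumptions, not anything taken from the cited papers.

```python
from fractions import Fraction

# Possible worlds of the simple Sunday model, each with its probability.
# H: coin lands Heads; W1 / W2: Beauty wakes on Monday / Tuesday.
WORLDS = [
    ({"H": True, "W1": True, "W2": False}, Fraction(1, 2)),
    ({"H": False, "W1": True, "W2": True}, Fraction(1, 2)),
]


def prob(event, given=lambda w: True):
    """Pr(event | given), computed by enumerating the worlds."""
    num = sum(p for w, p in WORLDS if given(w) and event(w))
    den = sum(p for w, p in WORLDS if given(w))
    return num / den


heads = lambda w: w["H"]

p_null = prob(heads)                                   # no extra information
p_monday = prob(heads, lambda w: w["W1"])              # "Beauty wakes on Monday"
p_any = prob(heads, lambda w: w["W1"] or w["W2"])      # "Beauty wakes at least once"

print(p_null, p_monday, p_any)  # 1/2 1/2 1/2
```

In this toy model, conditioning on any proposition that holds in both worlds leaves the probability of Heads at 1/2. A value of 1/3 can only come from conditioning on information that is more probable under Tails than under Heads, or from replacing the model outright.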
A red herring: betting arguments
Some thirders also employ betting arguments. Suppose that on each Monday/Tuesday awakening Beauty is offered a bet in which she wins $2 if the coin lands Tails and loses $3 if it lands Heads. Her expected gain is positive ($0.50) if she accepts the bets, since she has two awakenings if the coin lands Tails, yielding $4 in total, but will have only one awakening and lose only $3 if it lands Heads. Therefore she should accept the bet; but if she uses a probability of 1/2 for Heads on each awakening she computes a negative expected gain (-$0.50) and will reject the bet.
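The arithmetic of this betting argument can be checked with a short sketch. The function names are hypothetical; the payoffs ($2 won on Tails, $3 lost on Heads, with the bet offered at every awakening) are those stated above.

```python
from fractions import Fraction

WIN_TAILS = 2   # dollars won per accepted bet if the coin lands Tails
LOSE_HEADS = 3  # dollars lost per accepted bet if the coin lands Heads


def gain_per_experiment(p_heads=Fraction(1, 2)):
    """Expected total gain over the whole experiment if Beauty always accepts.

    Tails gives two awakenings (two bets); Heads gives one.
    """
    return (1 - p_heads) * 2 * WIN_TAILS - p_heads * 1 * LOSE_HEADS


def gain_per_awakening(p_heads):
    """Expected gain of a single bet, as Beauty computes it on awakening."""
    return (1 - p_heads) * WIN_TAILS - p_heads * LOSE_HEADS


print(gain_per_experiment())                # 1/2
print(gain_per_awakening(Fraction(1, 2)))   # -1/2
print(gain_per_awakening(Fraction(1, 3)))   # 1/3
```

The two functions answer different questions: the first totals the payoff over all awakenings in one run of the experiment, while the second evaluates a single bet in isolation. The disagreement in sign at a per-awakening probability of 1/2 is exactly what the betting argument exploits.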
One can argue that Beauty is using the wrong decision procedure in the above argument, but there is a more fundamental point to be made: probability theory is logically prior to decision theory. That is, probability theory can be developed, discussed, and justified [Cox1946, VanHorn2003, VanHorn2017] entirely without reference to decision theory, but the concepts of decision theory rely on probability theory. If our probabilistic model yields some value $p$ as Beauty's probability of Heads, and plugging this probability into our decision theory yields suboptimal results evaluated against that same model, then this is a problem with the decision theory; perhaps a more comprehensive theory is required [Yudkowsky&Soares2017].
Failure to construct legitimate propositions for analysis
Another serious error in many discussions of this problem is the use of supposedly mutually exclusive “propositions” that are neither mutually exclusive nor actually legitimate propositions. The three thirder propositions $H_1$, $T_1$, and $T_2$ can be written as

$$\begin{aligned} H_1 &\equiv H \wedge (\text{it is Monday}) \\ T_1 &\equiv \neg H \wedge (\text{it is Monday}) \\ T_2 &\equiv \neg H \wedge (\text{it is Tuesday}) \end{aligned}$$

These are not truly mutually exclusive because, if $\neg H$, then Beauty will awaken on both Monday and Tuesday. Furthermore, the supposed propositions “it is Monday” and “it is Tuesday” are not even legitimate propositions. Epistemic probability theory is an extension of classical propositional logic [Cox1946, VanHorn2003, VanHorn2017], and applies only to entities that are legitimate propositions under the classical propositional logic, but there is no “now,” “today,” or “here” in classical logic.
Both Elga's paper and much of the other literature on the Sleeping Beauty problem discuss the idea of “centered possible worlds,” each of which is equipped with a designated individual and time, and corresponding “centered propositions”—which are not propositions of classical logic. To properly reason about “centered propositions” one would need to either translate them into propositions of classical logic, or develop an alternative logic of centered propositions; yet none of these authors propose such an alternative logic. Even were they to propose such an alternative logic, they would then need to re-derive an alternative probability theory that is the appropriate extension of the alternative propositional logic.
However, it is doubtful that this alternative logic is necessary or desirable. Time and location are important concepts in physics, but physics uses standard mathematics based on classical logic that has no notion of “now”. Instead, formulas are explicitly parameterized by time and location. Before we go off inventing new logics, perhaps we should see if standard logic will do the job, as it has done for all of science to date.
Failure to include all relevant information
Lewis [Lewis2001] argues for a probability of $1/2$, since Sunday's probability of Heads is $1/2$, and upon awakening on Monday or Tuesday Beauty has no additional information; she knew that she would experience such an awakening regardless of how the coin lands. That is, Lewis bases his analysis on $\mathcal{M}$, and assumes that $X_2$ contains no information relevant to the question:

$$\Pr(H \mid X_2, \mathcal{M}) = \Pr(H \mid \mathcal{M}) = 1/2.$$

Lewis's logic is correct, but his assumption that $X_2$ contains no information of relevance is wrong. Surprisingly, Beauty's stream of experiences after awakening is relevant information, even given that her prior distribution for what she may experience upon awakening has no dependence on the day or the coin toss. Neal discusses this point in his analysis of the Sleeping Beauty problem. He introduces the concept of “full non-indexical conditioning,” which roughly means that we condition on everything, even stuff that seems irrelevant, because often our intuition is not that good at identifying what is and is not actually relevant in a probabilistic analysis. Neal writes,
…note that even though the experiences of Beauty upon wakening on Monday and upon wakening on Tuesday (if she is woken then) are identical in all "relevant" respects, they will not be subjectively indistinguishable. On Monday, a fly on the wall may crawl upwards; on Tuesday, it may crawl downwards. Beauty's physiological state (heart rate, blood glucose level, etc.) will not be identical, and will affect her thoughts at least slightly. Treating these and other differences as random, the probability of Beauty having at some time the exact memories and experiences she has after being woken this time is twice as great if the coin lands Tails than if the coin lands Heads, since with Tails there are two chances for these experiences to occur rather than only one.
Bayes' Rule, applied with equal prior probabilities for Heads and Tails, then yields a posterior probability for Tails that is twice that of Heads; that is, the posterior probability of Heads is $1/3$.
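The verbal argument can be turned into a small calculation. The sketch below rests on a loudly stated assumption that is not taken from Neal's text: each awakening independently produces Beauty's exact current experiences with some probability `eps`. The function name `posterior_heads` and the parameter `eps` are hypothetical.

```python
from fractions import Fraction


def posterior_heads(eps, prior_heads=Fraction(1, 2)):
    """Posterior probability of Heads given Beauty's exact experiences.

    Assumption: each awakening independently yields these exact
    experiences with probability eps.
    """
    like_heads = eps                  # Heads: one awakening, one chance
    like_tails = 1 - (1 - eps) ** 2   # Tails: at least one of two awakenings
    num = like_heads * prior_heads
    return num / (num + like_tails * (1 - prior_heads))


for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**6)):
    print(eps, posterior_heads(eps), float(posterior_heads(eps)))
```

Under this assumption the posterior simplifies to $1/(3 - \epsilon)$, so it sits slightly above 1/3 when the exact experiences are improbable and rises to 1/2 as $\epsilon$ approaches 1. The "twice as great" likelihood ratio in the quoted passage corresponds to the small-$\epsilon$ limit.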
Defining the model
Verbal arguments are always suspect when it comes to probability puzzles, so let's actually do the math. Our first step is to extend