In defense of anthropically updating EDT

post by Anthony DiGiovanni (antimonyanthony) · 2024-03-05T06:21:46.114Z · LW · GW · 17 comments

Contents

  Background and definitions
    Anthropic update rules
    Updateful vs. updateless EDT
    The main objections to anthropically updating EDT
    Pragmatism about anthropics
  Why I’m unconvinced by the objections to anthropically updating EDT
    Pragmatism isn’t well-motivated
      Ex ante sure losses are irrelevant if you never actually occupy the ex ante perspective
        Counterarguments
      Aside: Non-anthropically-updating EDT sometimes “fails” these cases
      We have reasons to have anthropic beliefs independent of decision-theoretic desiderata
    Without pragmatism, we have reasons to anthropically update
      My beliefs shouldn’t depend on my decision theory or preferences
      There are prima facie compelling epistemic arguments for SIA and max-RC-SSA
    Acknowledgments
    Appendix: An argument that EDT with min-RC-SSA can be ex ante suboptimal when epistemic copies aren’t decision-making copies

Suppose you’re reflecting on your views on two thorny topics: decision theory and anthropics.

Some people’s response is to reject either EDT or SIA / max-RC-SSA, because they consider various problematic implications of the combination EDT + SIA / max-RC-SSA (hereafter, “anthropically updating EDT”[3]) to be decisive objections. Such as:[4]

The objections above are commonly motivated by a perspective of “pragmatism.”[5] I’ll define this more precisely in “Pragmatism about anthropics,” [LW · GW] but briefly, the pragmatist view says: You have reasonable beliefs if, and only if, you actually use your beliefs to select actions and those actions satisfy decision-theoretic desiderata. And there’s no particular positive reason for having some beliefs otherwise.

I don’t find these objections compelling. In summary, this is because:

  1. I don’t think pragmatism in the above sense is well-motivated, and even by pragmatist standards, anthropic updating isn’t uniformly worse than not updating (given EDT).
    1. The claim that you should “use your beliefs to select actions” only seems uncontroversially true when “actions” include making commitments to policies, and actions are selected from among the actions actually available to you. But agents might commit to policies for future actions that differ from actions they’d endorse without commitment. (more [LW · GW]; more [LW(p) · GW(p)])
      1. If my indexical information suggests that some world is much more likely than another, and within that world there are agents whose decisions I acausally determine, I see no principled objection to taking actions that account for both these factors. This is consistent with preferring, when I don’t have indexical information, to commit to refuse combinations of bets that result in a certain loss.
    2. EDT with min-RC-SSA is also diachronically Dutch-bookable. (more [LW · GW])
    3. (Anthropic) beliefs may simply be a primitive for agents, justified by basic epistemic principles. (Compare to discussion in Carlsmith’s “An aside on betting in anthropics.” [LW · GW]) (more [LW · GW])
  2. If we reject pragmatism, anthropic updating is justifiable despite its apparently counterintuitive implications.
    1. The objections only apply to the combination of anthropic updating with EDT, and only assuming some other conditions (most notably, I’m altruistic towards others in my epistemic situation). Thus, the objections only hold if I’m justified in changing the way I form credences based on my normative principles or these other conditions. I see no reason my epistemology should precisely track these orthogonal factors this way. (more [LW · GW])
    2. SIA and max-RC-SSA have independent epistemic justification. (more [LW · GW])

I think it’s defensible to argue that SIA / max-RC-SSA updates just don’t make epistemic sense — indeed I have some sympathy for min-RC-SSA. But that’s a different line of argument than “SIA / max-RC-SSA are problematic because they don’t work well with EDT.”

The broader lesson of this post is that decision theory and anthropics are really complicated. I’m puzzled by the degree of confidence about these topics I often encounter (my past self included), especially in claims that anthropically updating EDT “doesn’t work,” or similar.

Background and definitions

The first two subsections — “Anthropic update rules” [LW · GW] and “Updateful vs. updateless EDT” [LW · GW] — are sufficiently boring that you might want to start by skipping them, and then only come back when you want to know how exactly I’m defining these terms. Especially if you’re already familiar with anthropics and decision theory. But at least an initial skim would likely help.

Anthropic update rules

Quick, imprecise summary without math-y notation:

Now for more details that I think will make it less likely people in this discussion talk past each other.

Anthropic update rules are all about how I (for some fixed “I”) form beliefs about:

E.g., the Doomsday Argument is an answer to, “How likely is the world to be such that humanity goes extinct soon, given that I observe I am the Nth of all humans born so far?” Bayesian anthropic update rules are defined by some method for computing likelihood ratios $P(I(x) \mid w_1)\,/\,P(I(x) \mid w_2)$, where $I(x)$ is the proposition “I observe $x$” and $w_1, w_2$ are worlds.[6]

Importantly, a “world” here is specified only by its objective properties (“there are humans, at least one of whom observes he is writing a LessWrong post in February 2024 …”). We don’t specify the index of “I” (“I am this human named Anthony who was born on … [rather than someone else]”).

Why does this matter? Because, suppose you tell me everything about a world including which observer “I” am — i.e., you give me a centered world $(w, i)$, which tells me the world is $w$ and I’m the $i$th observer in $w$. Then it’s trivially true that $P(I(x) \mid w, i)$ is 1 if in that world it’s true that the $i$th observer observes $x$, else 0. The controversial part is how we get from $P(I(x) \mid w, i)$ to $P(I(x) \mid w)$.

Let $O_x(w)$ be the set of observers in $w$ who observe $x$, and $O(w)$ be the set of all observers in $w$. (We can assume some fixed conception of “observer,” like “observer of some instantaneous experience.”) Let $n(O)$ be the number of elements in $O$.[7] And let $\mathbb{1}[A]$ equal 1 when $A$ is true, else 0. Then the main theories of interest are:

SIA: $P(I(x) \mid w) \propto n(O_x(w))$

max-RC-SSA: $P(I(x) \mid w) = \dfrac{n(O_x(w))}{n(O(w))}$

min-RC-SSA: $P(I(x) \mid w) = \mathbb{1}\big[n(O_x(w)) > 0\big]$
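
To make the contrast concrete, here is a minimal sketch in code of how the three rules update on the same evidence. The two-world example is my own toy illustration (not drawn from anything above); a world is summarized just by how many observers observe $x$ and how many observers there are in total.

```python
# A minimal sketch (toy numbers of my own, not from the post) of the three
# update rules above. A world w is summarized by n_x = n(O_x(w)), the number of
# observers who observe x, and n_tot = n(O(w)), the total number of observers.

def sia(n_x, n_tot):
    # SIA: likelihood proportional to the number of observers who observe x.
    return n_x

def max_rc_ssa(n_x, n_tot):
    # max-RC-SSA: fraction of the maximal reference class (all observers) who observe x.
    return n_x / n_tot

def min_rc_ssa(n_x, n_tot):
    # min-RC-SSA: only rules out worlds containing no observer in my epistemic situation.
    return 1.0 if n_x > 0 else 0.0

def posterior(rule, prior, worlds):
    """Bayes: P(w | I(x)) proportional to P(w) * P(I(x) | w)."""
    unnorm = {w: prior[w] * rule(*counts) for w, counts in worlds.items()}
    total = sum(unnorm.values())
    return {w: v / total for w, v in unnorm.items()}

# Toy case: in w1, 1 of 10 observers sees x; in w2, the single observer sees x.
worlds = {"w1": (1, 10), "w2": (1, 1)}
prior = {"w1": 0.5, "w2": 0.5}
for name, rule in [("SIA", sia), ("max-RC-SSA", max_rc_ssa), ("min-RC-SSA", min_rc_ssa)]:
    print(name, posterior(rule, prior, worlds))
# SIA and min-RC-SSA leave P(w1 | I(x)) at 1/2 here, while max-RC-SSA drops it to 1/11;
# with unequal total populations, SIA and min-RC-SSA come apart as well.
```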

Updateful vs. updateless EDT

The anthropic update rules above are about how to form beliefs, e.g., (in the Diachronic Dutch Book setup) “How likely is it that the coin landed heads, given my observations?”

Separately, we can ask, given your beliefs, what quantity do you maximize when “maximizing expected utility”? So:

  1. An updateful EDT agent takes the available action that maximizes ex interim expected utility, i.e., the action that’s best with respect to conditional probabilities given by updating their prior on whatever they know at the time of taking the action.
  2. An updateless EDT agent takes the available action given by the policy that maximizes ex ante expected utility, i.e., the policy that’s best with respect to some prior.
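
Schematically (using my own notation, which isn't from the post): write $E_t$ for everything the agent knows at time $t$, $A_t$ for the set of actions available then, and $\pi$ for a policy evaluated from the prior perspective. Then:

$$a_t^{\text{updateful}} \in \arg\max_{a \in A_t} \mathbb{E}\!\left[U \mid a, E_t\right], \qquad a_t^{\text{updateless}} = \pi^*(E_t) \;\text{ where }\; \pi^* \in \arg\max_{\pi} \mathbb{E}_{\text{prior}}\!\left[U \mid \pi\right].$$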

It’s conceptually coherent, then, to not anthropically update but act according to updateful EDT. For this post, I’ll use “EDT” to mean “updateful EDT,” because if you follow updateless EDT, it doesn’t matter which anthropic update rules shape your beliefs.

The main objections to anthropically updating EDT

  1. Diachronic Dutch Book: Consider the following setup, paraphrasing Briggs (2010):

    Some scientists toss a fair coin on Sunday. They offer Beauty a bet where she wins $(15 + 2*epsilon) if heads, and loses $(15 - epsilon) if tails. Then they put her to sleep. If heads, Beauty is woken up once on Monday; if tails, she is woken up once on Monday and once on Tuesday. On each awakening, she is offered a bet where she loses $(20 - epsilon) if heads, and wins $(5 + epsilon) if tails.

    Regardless of Beauty’s anthropics, the first bet is positive-EV.[8] If Beauty anthropically updates according to SIA,[9] and follows EDT, then she’ll also consider the second bet positive-EV.[10]

    The objection is that Beauty loses money from taking both bets, no matter whether the coin lands heads or tails (i.e., that anthropically updating EDT is Dutch-bookable).[11] (The arithmetic is checked in the short sketch after this list.)
  2. Double-Counting: One might object to the anthropically updating EDT verdict in the above thought experiment, or some analogue like Paul Christiano’s calculator case [LW · GW], without invoking Dutch-bookability per se. Christiano writes (emphasis mine):

    I am offered the opportunity to bet on a mathematical statement X to which I initially assign 50% probability … I have access to a calculator that is 99% reliable, i.e. it corrupts the answer 1% of the time at random. The calculator says that X is true. With what probability should I be willing to wager?

    I think the answer is clearly “99%.” … Intuitively [a bet at 99.9% odds] is a very bad bet, because I “should” only have 99% confidence.

    The idea is that one’s betting odds just should match the “objective probability” — and that there is no particular independent justification for having beliefs other than in setting one’s betting odds.

Pragmatism about anthropics

The following views, which together constitute what I’ll call “pragmatism about anthropics,” seem central to the objections to anthropically updating EDT. I’ll respond to the first in “Ex ante sure losses are irrelevant if you never actually occupy the ex ante perspective,” [LW · GW] and the second in “We have reasons to have anthropic beliefs independent of decision-theoretic desiderata.” [LW · GW]

Pragmatism: We’ll say that you “use” an anthropic update rule, at a given time, if you take the action with the highest expected utility with respect to probabilities updated based on that rule. Then:

  1. Beliefs Should Pay Rent: You should endorse an anthropic update rule if and only if i) you “use” the update rule in this sense, and ii) the decisions induced by the update rule you use satisfy appealing decision-theoretic criteria (e.g., don’t get you Dutch booked).
  2. No Independent Justification for Updating: If a given anthropic update rule doesn’t satisfy (i) or (ii) in some contexts, there isn’t any positive reason for endorsing it over some other update rule in other contexts.

Why I’m unconvinced by the objections to anthropically updating EDT

Since some of my counterarguments apply to both objections at once, instead of going through the objections separately, I’ll give my counterarguments and say which objections they apply to.

I’ll start by engaging with the pragmatist arguments that the consequences of combining EDT with anthropic updating are problematic, since I expect my audience won’t care about the purely epistemic arguments until after I address those consequences. (To avoid a possible misunderstanding: I’m sympathetic to only bothering with epistemic questions insofar as they seem decision-relevant. But pragmatism as defined above is considerably stronger than that, and I find it uncompelling on its face.)

Then I’ll argue that SIA / max-RC-SSA are defensible if we reject pragmatism. This is because it doesn’t make epistemic sense for my beliefs to track my normative views, and there are purely epistemic arguments in favor of SIA / max-RC-SSA.

Pragmatism isn’t well-motivated

Ex ante sure losses are irrelevant if you never actually occupy the ex ante perspective

Before responding to the pragmatist view per se, I need to explain why I don’t think an anthropically updating EDT agent actually suffers a sure loss in Diachronic Dutch Book.

By definition, an (updateful) EDT agent takes the available action that’s best in expectation ex interim, meaning, best with respect to conditional probabilities computed via their update rule. But “available” is a key word — the agent only deliberates between actions insofar as those actions are available, i.e., they actually have a decision to make. And, “actions” can include committing to policies, thereby making some actions unavailable in future decision points.

So in Diachronic Dutch Book we have one of two possibilities:

  1. Beauty is offered the opportunity to commit to a policy of (only) accepting some subset of the bets, before the experiment begins.
  2. Beauty accepts or rejects the first bet, then is put in the experiment, and is offered the second bet upon waking up (without having committed).

In case (1), Beauty doesn’t have anything to update on, and given her information the combination of both bets is a sure loss. So her ex interim perspective tells her to commit not to take both bets (only the first).

Importantly, then, if the objection to anthropically updating EDT is that an agent with that anthropics + decision theory will predictably get exploited, this just doesn’t hold. You might ask: Why bother endorsing an update rule if you’ll commit not to “use” it? More on this in the next subsection; but to start, in case (2) the anthropically updating EDT agent does behave differently from the non-updating agent.

In case (2), the Dutch-bookability of the joint policy of taking both bets just doesn’t imply that the individual decision to take the second bet is mistaken. Upon waking up, Beauty actually believes the tails world is likely enough that the second bet is positive-EV.[12] Then, regardless of whether she took the first bet, I don’t see why she shouldn’t take the second. (Compare to the claim that if an agent has the opportunity to commit to a policy for Counterfactual Mugging [LW · GW], they should pay, but if they’re “born into” [LW · GW] the problem, they shouldn’t.)

Counterarguments

Why might the above analysis be unsatisfactory? There are two counterarguments:

  1. Beliefs Should Pay Rent + Dynamic Consistency:[13] “(Beliefs Should Pay Rent) If you just commit to act in the future as if you didn’t anthropically update, you’re not really using your update rule [in the sense defined in “Pragmatism about anthropics” [LW · GW]]. So in what sense do you even endorse it? (Dynamic Consistency) Sure, the anthropic update becomes relevant in case (2), but this is dynamically inconsistent. You should only use SIA / max-RC-SSA in case (2) if you would use it in case (1).”
    1. Response: I deny that one needs to “use” an update rule in the particular sense defined above. Rather, the anthropically updating EDT agent does make use of their beliefs, insofar as they actually make decisions.

      Specifically: At a given time $t$, denote my beliefs as $P_t$ and the action I take as $a_t$. The above argument seems to assume that, for every time $t$, if I'm an EDT agent then $a_t$ must maximize expected utility with respect to $P_t$ over all actions that would have been available to me had I not committed to some policy. Why? Because if we drop this assumption, there’s nothing inconsistent about both:
      1. “Using” the SIA / max-RC-SSA rule when I didn’t commit beforehand to a policy; and
      2. When I did commit beforehand, not “using” the SIA / max-RC-SSA rule — in the sense that the policy constrains me to an action that, had I not been under this constraint, wouldn’t have maximized expected utility.

        And I think we should drop this assumption, because EDT says you should take the action that maximizes expected utility (with updated beliefs) among available actions. The unavailable actions are irrelevant.
         
  2. Normative Updatelessness: “Your decisions just should be ex ante optimal with respect to some prior (including, not part of a policy that is diachronically Dutch-bookable), as a bedrock principle.”
    1. Response: Insofar as I have a decision to make at all when I no longer occupy the ex ante perspective — i.e., I’m not committed to a particular policy — I don’t see the motivation for deciding as if I’m still in the ex ante perspective. The counterfactual world where the coin landed otherwise than it did, which ex ante had equal weight, simply doesn’t exist.[14]

      To me, maximizing expected value with respect to one’s Bayesian-updated beliefs has a strong intuitive appeal[15] independent of Dutch book arguments. It seems that I should consider i) how likely the actual world is to be the heads-world or tails-world, and ii) what acausal influence my decision might have. And then make decisions accounting for both (i) and (ii).

      Of course, if the reader is just sympathetic to Normative Updatelessness at bottom, I can’t say they’re “wrong” here. But my read of the Diachronic Dutch Book argument is that it’s trying to say something less trivial than “if you endorse Normative Updatelessness, then you should not do what an updateful agent would do.”

Aside: Non-anthropically-updating EDT sometimes “fails” these cases

[This section isn’t essential to my main argument, so feel free to skip. That said, it seems important if the pragmatist argument against anthropic updating doesn’t work even on its own terms.]

Let’s grant the assumption used in the “Beliefs Should Pay Rent + Dynamic Consistency” argument above: Agents’ actions must maximize expected utility with respect to their beliefs at the time, over all possible actions (even if they were committed to some other policy). A realistic non-anthropically-updating EDT agent can still be diachronically Dutch-booked / violate ex ante optimality under plausible conditions.

If I understand correctly, the proof that EDT + min-RC-SSA is ex ante optimal (e.g., in Oesterheld and Conitzer (2024)) requires that:

  1. Agents i) are in my exact epistemic situation if and only if ii) the actions they take are identical to mine.
  2. I care about payoffs to agents satisfying (i)/(ii) equally to my own.

So what happens when either of these conditions is false?

First: Conitzer (2017) gives a case where agents plausibly make identical decisions without being in the exact same epistemic situation — (ii) without (i) — and shows that EDT + min-RC-SSA can be diachronically Dutch-booked in this case.[16]

There isn’t a peer-reviewed argument providing a case of (i) without (ii) (as far as I know), but I find it plausible such cases exist. See appendix [LW · GW] for more.

Alternatively, even if (i) and (ii) both hold, an agent might just not care about payoffs to an agent in an identical epistemic situation. Christiano acknowledges this [LW · GW], but doesn’t argue for why updating is bad even when the assumption of impartiality is violated:

[In this decision problem] I have impartial values. Perhaps I’m making a wager where I can either make 1 person happy or 99 people happy—I just care about the total amount of happiness, not whether I am responsible for it.

To be clear, I don’t in fact think anthropic views should be rejected based on this kind of argument (more on this in “My beliefs shouldn’t depend on my decision theory or preferences” [LW · GW]). The point is that if we’re going to use ex ante optimality as our standard for picking anthropic beliefs, then, even holding our endorsement of EDT fixed, this standard doesn’t favor non-updating anthropics.

We have reasons to have anthropic beliefs independent of decision-theoretic desiderata

One might defend Beliefs Should Pay Rent on the grounds that, if we don’t constrain our beliefs via decision-theoretic desiderata, what other reason to have beliefs is there? This is the claim of No Independent Justification for Updating.

But it seems that beliefs are a primitive for an agent, and the most straightforward approach to navigating the world for me is to:

  1. Systematize my degrees of confidence in hypotheses about the world (i.e., beliefs), checking that they correspond to what I actually expect upon reflection (by checking if they satisfy epistemic principles I endorse); and
  2. Take actions that maximize expected utility with respect to my beliefs.

(We’ll see below that SIA and max-RC-SSA do plausibly satisfy some epistemic principles better than min-RC-SSA does.)

Consider whether you would endorse total epistemic nihilism if your entire life consisted of just one decision, with no opportunities for Dutch books, laws of large numbers, etc. Maybe you’d say your intuitions about such a case are unreliable because real life isn’t like that. But regardless, you don’t really escape appealing to bedrock epistemic or normative principles: In order to invoke the Diachronic Dutch Book argument, you need to assume either Dynamic Consistency or Normative Updatelessness.

Relatedly: In “EDT with updating double counts,” [LW · GW] Christiano claims, “Other epistemological principles do help constrain the input to EDT (e.g. principles about simplicity or parsimony or whatever), but not updating.” But I don’t see the principled motivation for both a) using epistemic virtues to adjudicate between priors, yet b) ignoring epistemic virtues when adjudicating between update rules.

Without pragmatism, we have reasons to anthropically update

My beliefs shouldn’t depend on my decision theory or preferences

Diachronic Dutch Book and Double-Counting are arguments that an agent who follows both EDT and anthropic updating will do silly things. But I’m not certain of EDT (and you aren’t, either, I bet). So should I still not anthropically update?

The natural response is that insofar as I endorse EDT, I should forgo anthropic updating. (And insofar as I endorse CDT, I shouldn’t.) We might take the behavior of anthropically updating EDT in Diachronic Dutch Book, etc., as a sign that the update rule is epistemically faulty insofar as we endorse EDT.

This seems like a bizarrely ad hoc approach to forming my beliefs, though. On this approach, I should exactly adjust how much I update my beliefs (how I think the world is) to complement how decision-relevant I consider non-causal correlations (how I think I ought to act). In fact, the two objections are not just sensitive to decision-theoretic uncertainty; they also hinge on whether I’m altruistic or selfish, as we saw in “Aside: Non-anthropically-updating EDT sometimes “fails” these cases.” [LW · GW] I can’t think of any other domain where it makes sense for my beliefs to track my normative uncertainty like this.[17]

There are prima facie compelling epistemic arguments for SIA and max-RC-SSA

Finally, insofar as we should endorse anthropic update rules on their epistemic merits, there’s a decent case for SIA or max-RC-SSA.

Others have already written at length in defense of SIA and max-RC-SSA. See, e.g., Carlsmith (2022), Bostrom (2002), and various academic references therein. But for what it’s worth, here’s my own perspective, which I haven’t seen expressed elsewhere:[18]

Acknowledgments

Thanks to Jesse Clifton, Tristan Cook, and Lukas Finnveden for very helpful discussion and comments. (This doesn’t imply they endorse my claims.) This research was conducted at the Center on Long-Term Risk and the Polaris Research Institute.


Appendix: An argument that EDT with min-RC-SSA can be ex ante suboptimal when epistemic copies aren’t decision-making copies

It seems plausible that agents in the exact same epistemic situation might make different decisions — (i) without (ii). Here, by “exact same epistemic situation,” I mean to include the following condition: the agents know they go through the exact same deliberation process before deciding whether to take a given bet.[21]

At a high level, what’s going on is:

Consider Carlsmith’s God’s coin toss with equal numbers [LW · GW]. We can turn this case into a bet as follows:

You close your eyes and enter a room with 9 other people. God chooses a number R from 1 to 10, uniformly at random. Then God flips a coin, and if heads, puts a red jacket on the Rth person, otherwise puts a red jacket on all 10 people. God offers each red-jacketed person a deal where they get $(1 + C) if the coin flip was heads, and pay $1 if tails.

You see that you have a red jacket (but don’t see the other 9 people, who are hidden in cubicles). You know that each of the other people goes through the same thought process as you, such that:

  1. the epistemic state of any of those people would be identical to yours conditional on being assigned a red jacket, and
  2. they have the same thoughts about the value of different allocations of money.

Do you accept the deal?

Here’s how I think an EDT + min-RC-SSA agent would reason:

But for C < 9, the ex ante utility, from the perspective of before the coin flip, is negative.[23]
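
To spell out the arithmetic: the ex ante line below is just footnote 23; the ex interim line is my own hedged reconstruction of the reasoning, assuming (per the “(i) without (ii)” framing) that the agent’s decision doesn’t determine the other red-jacketed agents’ decisions, so that on the margin only its own payoff changes. Under min-RC-SSA there is no update on seeing the red jacket, since both the heads-world and the tails-world contain at least one red-jacketed observer in my epistemic situation, so the agent keeps $P(\text{heads} \mid \text{red jacket}) = \tfrac{1}{2}$.

$$\mathrm{EV}_{\text{ex interim}}(\text{accept}) \approx \tfrac{1}{2}(1 + C) - \tfrac{1}{2}\cdot 1 > 0 \quad \text{for all } C > 0,$$

$$\mathrm{EV}_{\text{ex ante}}(\text{accept}) = \tfrac{1}{2}\cdot\tfrac{1}{10}\cdot(1 + C) - \tfrac{1}{2}\cdot 1 < 0 \quad\Longleftrightarrow\quad C < 9.$$

So, on this reconstruction, for $0 < C < 9$ the agent accepts a deal whose ex ante value is negative.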

  1. ^

     Or other alternatives to causal decision theory (CDT), but I’ll just consider EDT here because it’s the most well-defined alternative, and consequently I’m not sure which other decision theories this post is relevant to.

  2. ^

     But all the results should hold for other non-minimal reference classes.

  3. ^

     This isn’t maximally precise — minimal-reference-class SSA does also “update” in the weak sense of ruling out worlds where no observers in one’s epistemic situation exist. But in practice the implications of minimal-reference-class SSA and strictly non-updating anthropics seem very similar.

  4. ^

     Another category of objection is that anthropically updating EDT can endorse “managing the anthropic news.” See, e.g., “The Anthropic Trilemma.” [LW · GW] I think this behavior is indeed pretty counterintuitive, but I don’t see a sense in which it poses a qualitatively different challenge to anthropically updating EDT than the other objections do, so I’ll leave it out of this post. (Indeed, this case seems structurally similar to non-anthropic counterexamples to EDT, like Smoking Lesion and XOR Blackmail. But biting these bullets definitely [LW · GW] has precedent [LW · GW], and for good reason in my opinion. For example, in XOR Blackmail, given that you’ve received the letter from a perfect predictor, it is just logically impossible for it to be the case that you find that the disastrous outcome didn’t occur if you don’t pay. Committing not to pay ahead of time is, of course, an option available to EDT agents.)

    The main element that “managing the anthropic news” adds is that it seems to have a wacky practical implication. Briefly, I don’t think this distinction matters because I’m skeptical that my endorsement of normative principles should depend on whether the counterintuitive implications are particularly realistic; how realistic the implications are seems pretty contingent.

  5. ^

     See also Armstrong (2017).

  6. ^

     Why isn’t there an obvious, uncontroversial way of computing these likelihood ratios? Because there isn’t an obvious, uncontroversial way of interpreting the evidence, “I observe x.” See “There are prima facie compelling epistemic arguments for SIA and max-RC-SSA” [LW · GW] for more.

  7. ^

     Technically, we’d need some measure m such that, first, m(O) = n(O) if O is finite, and second, we get sensible ratios of measures m(O1) / m(O2) even when at least one of O1 and O2 is infinite. Similarly, for completeness, we’d want to define the likelihood ratios in a sensible way whenever we would’ve otherwise divided by zero. But neither of these are important for the purposes of this post.

  8. ^

     Specifically, 1/2 · (15 + 2*epsilon) - 1/2 · (15 - epsilon) = 3/2*epsilon.

  9. ^

     With a different setup, one can provide an analogous Dutch book for max-RC-SSA (see Sec. 5.5 of Oesterheld and Conitzer (2024)).

  10. ^

     Specifically, 1/3 · (-20 + epsilon) + 2/3 · 2 · (5 + epsilon) = 5/3*epsilon. (Because she follows EDT, her calculation doubles the winnings conditional on tails (5 + epsilon), since she knows her Tuesday-self takes the bet if and only if her Monday-self does.)

  11. ^

     If heads, she nets $(15 + 2*epsilon - 20 + epsilon) < 0, and if tails, she nets $(-15 + epsilon + 10 + 2*epsilon) < 0.

  12. ^

     You might object that she doesn’t have reason to believe the tails world is more likely in the first place. But again, that requires a separate argument; it’s not about the failure of anthropic updating with EDT.

  13. ^

     Thanks to Jesse Clifton for helping formulate this version of the counterargument (which he doesn’t endorse).

  14. ^

     So, e.g., I don’t see the motivation for just thinking of updating as counting the number of “instances of me” across the worlds in my prior. C.f. Fallenstein’s “Self-modification is the correct justification for updateless decision theory.” [LW · GW] (I’m not saying that Fallenstein currently endorses my claim.)

  15. ^

     I agree with Tomasik here: “At this point, I would be willing simply to accept the expected-value criterion as an axiomatic intuition: The potential good accomplished by [the higher-EV option in a one-off decision, where the law of large numbers doesn’t apply] is just so great that a chance for it shouldn't be forgone.” See also Joe Carlsmith’s article on maximizing EV.

     This is not to say Bayesianism applies everywhere — I just don’t see a particular reason conditionalization should break here, when forming likelihoods $P(I(x) \mid w)$.

  16. ^

     The basic idea is that the decision points might not be exactly the same, yet only differ with respect to irrelevant information, such that the decision points involve “symmetric” information.

  17. ^

     See also Carlsmith’s “An aside on betting in anthropics” [LW · GW], especially this quote: “Indeed, I’ve been a bit surprised by the extent to which some people writing about anthropics seem interested in adjusting (contorting?) their epistemology per se in order to bet a particular way — instead of just, you know, betting that way. This seems especially salient to me in the context of discussions about dynamical inconsistencies between the policy you’d want to adopt ex ante, and your behavior ex post … As I discussed in my last post, these cases are common outside of anthropics, too, and “believe whatever you have to in order to do the right thing” doesn’t seem the most immediately attractive solution.” (I’m not sure if he would endorse my defense of betting ex ante suboptimally when you’re not in the ex ante perspective, though.)

  18. ^

     Credit to Jesse Clifton for proposing a similar approach to foundationalist reasoning about anthropic update rules (as opposed to relying on intuitions about specific thought experiments); he doesn’t necessarily endorse the approach I give here.

  19. ^

    I.e., the perspective from which, in this moment, I have my experiences but I don't have anyone else's.

  20. ^

     As an aside, this structure of argument resembles the “R-SIA + SSA” view discussed by Carlsmith [LW · GW] — contra Carlsmith, personally I find this argument more principled than the “simpler” justification for SIA, and from this perspective max-RC-SSA seems somewhat more compelling overall than SIA.

  21. ^

     Thanks to Lukas Finnveden for emphasizing the relevance of this assumption.

  22. ^

     (Or, indeed, if we avoid even the very modest update that min-RC-SSA makes.)

  23. ^

     I.e., 0.5*0.1*(1 + C) - 0.5*1 < 0.

17 comments


comment by Wei Dai (Wei_Dai) · 2024-03-05T08:22:49.708Z · LW(p) · GW(p)

Suppose I'm in a situation where I think my future self, if they were updateful and/or have indexical values, would do something against my current preferences. Suppose I also don't want to commit to a specific policy, because I think my future self will have greater computing power or otherwise be in a better position to approximate the optimal decision (from my current or an updateless perspective).

In this case (which seems like it will be a common situation), it seems that (if I could) I should self-modify to become updateless and to no longer have indexical values. So most agent-moments in this universe (and probably most other universes) will descend from agents that have made this self-modification (or were created to be updateless to begin with). In other words, most agent-moments will be updateless.

Not sure what to conclude from this consideration, but if anthropically updating is correct in some sense, it seems strange that almost nobody will be doing it? I feel like philosophical correctness probably shouldn't conflict with people's pragmatic preferences like this, unless there was also a philosophical argument that people's pragmatic preferences are wrong, but I'm not seeing such an argument in this case.

Replies from: antimonyanthony, Richard_Kennaway
comment by Anthony DiGiovanni (antimonyanthony) · 2024-03-06T05:28:41.587Z · LW(p) · GW(p)

In this case (which seems like it will be a common situation), it seems that (if I could) I should self-modify to become updateless and to no longer have indexical values.

I think you should self-modify to be updateless* with respect to the prior you have at the time of the modification. This is consistent with still anthropically updating with respect to information you have before the modification — see my discussion of “case (2)” in “Ex ante sure losses are irrelevant if you never actually occupy the ex ante perspective.”

So I don't see any selection pressure against anthropic updating on information you have before going updateless. Could you explain why you think updating on that class of information goes against one's pragmatic preferences?

(And that class of information doesn't seem like an edge case. For any (X, Y) such that under world hypothesis w1 agents satisfying X have a different distribution of Y than they do under w2, an agent that satisfies X can get indexical information from their value of Y.)

* (With all the caveats discussed in this post [LW · GW].)

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2024-03-06T06:38:36.974Z · LW(p) · GW(p)

I think you should self-modify to be updateless* with respect to the prior you have at the time of the modification. This is consistent with still anthropically updating with respect to information you have before the modification

Right, I was saying that assuming this, most agent-moments in this universe will have stopped anthropically updating long ago, i.e., they have a prior fixed at the time of self-modification and are no longer anthropically updating against it with new information. This feels weird to me, like if anthropically updating is philosophically correct, why is only a tiny fraction of agent-moments doing it. But maybe this is actually fine, if we say that anthropically updating is philosophically correct for people who have indexical values and we just happen to be the few agent-moments who have indexical values.

I guess there is still a further question of whether having indexical values is philosophically correct (for us), with a similar issue of "if it's philosophically correct, why do so few agent-moments have indexical values"? I can imagine that question being ultimately resolved either way... Overall I think I'm still in essentially the same epistemic position as when I wrote Where do selfish values come from? [LW · GW]:

So, should we freeze our selfish values, or rewind our values, or maybe even keep our "irrational" decision theory (which could perhaps be justified by saying that we intrinsically value having a decision theory that isn't too alien)? I don't know what conclusions to draw from this line of thought

(where "freeze our selfish values" could be interpreted as "self-modify to be updateless with respect to the prior you have at the time of the modification" along with corresponding changes to values)

Replies from: antimonyanthony
comment by Anthony DiGiovanni (antimonyanthony) · 2024-03-06T07:24:55.537Z · LW(p) · GW(p)

That clarifies things somewhat, thanks!

I personally don't find this weird. By my lights, the ultimate justification for deciding to not update is how I expect the policy of not-updating to help me in the future. So if I'm in a situation where I just don't expect to be helped by not-updating, I might as well update. I struggle to see what mystery is left here that isn't dissolved by this observation.

I guess I'm not sure why "so few agent-moments having indexical values" should matter to what my values are — I simply don't care about counterfactual worlds, when the real world has its own problems to fix. :)

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2024-03-06T09:07:48.820Z · LW(p) · GW(p)

So if I’m in a situation where I just don’t expect to be helped by not-updating, I might as well update.

What is this in reference to? I'm not sure what part of my comment you're replying to with this.

I struggle to see what mystery is left here that isn’t dissolved by this observation.

I may be missing your point, but I'll just describe an interesting mystery that I see. Consider a model of a human mind with two parts, conscious and subconscious, each with their own values and decision making process. They interact in some complex way but sometimes the conscious part can strongly overpower the subconscious, like when a martyr sacrifices themselves for an ideology that they believe in. For simplicity let's treat the conscious part as an independent agent of its own.

Today we don't have technology to arbitrarily modify the subconscious part, but can self-modify the conscious part, just by "adopting" or "believing in" different ideas. Let's say the conscious part of you could decide at any time to become updateless and not have indexical values. (Again let's assume this for simplicity. Reality is perhaps messier than this.) It may still have to defer to the subconscious part much of the time, but in important moments it will be able to take charge and assert itself with its new decision theory and values.

Let's say the conscious you is reluctant to become updateless now, because you're not totally sure that's actually a good idea. So you make a resolution that when you do fully solve all the relevant philosophical problems and end up deciding that updatelessness is correct, you'll self-modify to be updateless with respect to today's prior, instead of the future prior (at time of the modification).

AFAICT, there's nothing stopping you from carrying this out. But consider that when the day finally comes, you could also think, "If 15-year old me had known about updatelessness, he would have made the same resolution but with respect to his prior instead of Anthony-2024's prior. The fact that he didn't is simply a mistake or historical accident, which I have the power to correct. Why shouldn't I act as if he did make that resolution?" And I don't see what would stop you from carrying that out either.

An important point to emphasize here is that your conscious mind currently isn't running some decision theory with a well-defined algorithm and utility function, so we can't decide what to do by thinking "what would this decision theory recommend". Instead it runs on ideas/memes, and for those of us who really like philosophical ideas/memes, it runs on philosophy. And I currently don't know what philosophy will ultimately say about what I should do when it comes to self-modifying to become updateless, and specifically which prior to become updateless with respect to. "Self-modify to be updateless with respect to the prior you have at the time of the modification" would be obvious if we were running a decision theory, but it's not obvious because of considerations like the above, and who knows what other considerations/arguments there may be that we haven't even thought of yet.

Replies from: antimonyanthony
comment by Anthony DiGiovanni (antimonyanthony) · 2024-03-10T21:26:40.109Z · LW(p) · GW(p)

What is this in referrence to?

I took you to be saying: If the vast majority of agent-moments don’t update, this is some sign that those of us who do still update might be making a mistake.

So I’m saying: I know that 1) the reason the vast majority of agent-moments wouldn’t update (let’s grant this) is that they had predecessors who bound them not to update, and 2) I just am not bound by any such predecessors. Then, due to (2) it’s unsurprising that what’s optimal for me would be different from what the vast majority of agent-moments do.

Re: your explanation of the mystery:

So you make a resolution that when you do fully solve all the relevant philosophical problems and end up deciding that updatelessness is correct, you'll self-modify to be updateless with respect to today's prior, instead of the future prior (at time of the modification).

Not central (I think?), but I'm unsure whether this move works; at least, it depends on the details of the situation. E.g. if the hope is "By self-modifying later on to be updateless w.r.t. my current prior, I'll still be able to cooperate with lots of other agents in a similar epistemic situation to my current one, even after we end up in different epistemic situations [in which my decision is much less correlated with those agents' decisions]," I'm skeptical of that, for reasons similar to my argument here [LW(p) · GW(p)].

when the day finally comes, you could also think, "If 15-year old me had known about updatelessness, he would have made the same resolution but with respect to his prior instead of Anthony-2024's prior. The fact that he didn't is simply a mistake or historical accident, which I have the power to correct. Why shouldn't I act as if he did make that resolution?" And I don't see what would stop you from carrying that out either.

I think where we disagree is that I'm unconvinced there is any mistake-from-my-current-perspective to correct in the cases of anthropic updating. There would have been a mistake from the perspective of some hypothetical predecessor of mine asked to choose between different plans (before knowing who I am), but that's just not my perspective. I'd claim that in order to argue I'm making a mistake from my current perspective, you'd want to argue that I don't actually get information such that anthropic updating follows from Bayesianism.

An important point to emphasize here is that your conscious mind currently isn't running some decision theory with a well-defined algorithm and utility function, so we can't decide what to do by thinking "what would this decision theory recommend".

I absolutely agree with this! And don't see why it's in tension with my view.

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2024-03-13T03:41:31.852Z · LW(p) · GW(p)

It seems hard for me to understand you, which may be due to my lack of familiarity with your overall views on decision theory and related philosophy. Do you have something that explains, e.g., what is your current favorite decision theory and how should it be interpreted (what are the type signatures of different variables, what are probabilities [LW · GW], what is the background metaphysics [LW · GW], etc.), what kinds of uncertainties exist and how they relate to each other, what is your view on the semantics of indexicals [LW(p) · GW(p)], what type of a thing is an agent (do you take more of an algorithmic view, or a physical view [LW · GW])? (I tried looking into your post history and couldn't find much that is relevant.) Also what are the "epistemic principles" that you mentioned in the OP?

Replies from: antimonyanthony
comment by Anthony DiGiovanni (antimonyanthony) · 2024-03-24T22:55:01.271Z · LW(p) · GW(p)
  • I interpret a decision theory as an answer to “Given my values and beliefs, what am I trying to do as an agent (i.e., if rationality is ‘winning,’ what is ‘winning’)?” Insofar as I endorse maximizing expected utility, a decision theory is an answer to “How do I define ‘expected utility,’ and what options do I view myself as maximizing over?”
    • I think it’s important to consider these normative questions, not just “What decision procedure wins, given my definition of ‘winning’?”
    • (I discuss similar themes here [LW · GW].)
  • On this interpretation of “decision theory,” EDT is the most appealing option I’m aware of. What I’m trying to do just seems to be: “make decisions such that I expect the best consequences conditional on those decisions.” The EDT criterion satisfies some very appealing principles like the “irrelevance of impossible outcomes.” And the “decisions” in question determine my actions in the given decision node.
  • I take view #1 in your list in “What are probabilities?”
    • I don’t think “arbitrariness” in this sense is problematic. There is a genuine mystery here as to why the world is the way it is, but I don’t think we can infer the existence of other worlds purely from our confusion.
    • And it just doesn’t seem that the thing I’m doing when I’m forming beliefs about the world is answering “how much do I care about different possible worlds?”
  • Indexicals: I haven’t formed a deliberate view on this. A flat-footed response to cases like your “old puzzle” in the comment you linked: Insofar as I simply don’t experience a superposition of experiences at once, it seems that if I get copied, “I” just will experience one of the copies’ experience-streams and not the others’. (Again I don’t consider it problematic that there’s some arbitrariness in which of the copies ends up being “me” — indeed if Everett is right then this sort of arbitrary direction of the flow of experience-streams happens all the time.) I think “you are just a different person from your future self, so there’s no fact of the matter what you will observe” is a reasonable alternative though.
  • I take a physicalist* view of agents: “There are particular configurations of stuff that can be well-modeled as ‘decision-makers.’ A configuration of stuff is ‘making a decision’ (relative to their epistemic state) insofar as they’re uncertain what their future behavior will be, and using some process that selects that future behavior in a way that is well-modeled as goal-directed. [Obviously there’s more to say about what counts as ‘well-modeled.’] My processes of deliberation about decisions and behavior resulting from those decisions can tell me what other configurations-of-stuff are probably doing, but I don’t see a motivation for modeling myself as actually being the same agent as those other configurations-of-stuff.”
  • Epistemic principles: Things like the principle of indifference, i.e., distribute credence equally over indistinguishable possibilities, all else equal.
     

* [Not to say I endorse physicalism in the broad sense]

comment by Richard_Kennaway · 2024-03-05T09:12:15.575Z · LW(p) · GW(p)

Suppose I'm in a situation where I think my future self, if they were updateful and/or have indexical values, would do something against my current preferences.

Your future self will frame this as having discovered who it truly should be, rather than having changed what it is. At least, that is how I observe people speak of the process of maturation. And your future self will know more and have more experience, so why should your current temporal cross-section get to lock itself in forever? Would you endorse your ten-year-old self [LW · GW] having that power?

comment by Anthony DiGiovanni (antimonyanthony) · 2024-08-06T22:03:38.012Z · LW(p) · GW(p)

Addendum: The approach I take in "Ex ante sure losses are irrelevant if you never actually occupy the ex ante perspective" has precedent in Hedden (2015)'s defense of "time-slice rationality," which I highly recommend. Relevant quote:

I am unmoved by the Diachronic Dutch Book Argument, whether for Conditionalization or for Reflection. This is because from the perspective of Time-Slice Rationality, it is question-begging. It is uncontroversial that collections of distinct agents can act in a way that predictably produces a mutually disadvantageous outcome without there being any irrationality. The defender of the Diachronic Dutch Book Argument must assume that this cannot happen with collections of time-slices of the same agent; if a collection of time-slices of the same agent predictably produces a disadvantageous outcome, there is ipso facto something irrational going on. Needless to say, this assumption will not be granted by the defender of Time-Slice Rationality, who thinks that the relationship between time-slices of the same agent is not importantly different, for purposes of rational evaluation, from the relationship between time-slices of distinct agents.

comment by Ape in the coat · 2024-03-05T10:46:47.443Z · LW(p) · GW(p)

The natural response is that insofar as I endorse EDT, I should forgo anthropic updating. (And insofar as I endorse CDT, I shouldn’t.) We might take the behavior of anthropically updating EDT in Diachronic Dutch Book, etc., as a sign that the update rule is epistemically faulty insofar as we endorse EDT.

This seems like a bizarrely ad hoc approach to forming my beliefs, though. On this approach, I should exactly adjust how much I update my beliefs (how I think the world is) to complement how decision-relevant I consider non-causal correlations (how I think I ought to act). In fact, the two objections are not just sensitive to decision-theoretic uncertainty; they also hinge on whether I’m altruistic or selfish, as we saw in “Aside: Non-anthropically-updating EDT sometimes “fails” these cases.” [LW · GW] I can’t think of any other domain where it makes sense for my beliefs to track my normative uncertainty like this.[17]

Exactly! This is a bizarrely ad hoc approach. So it's a very good hint that it's time to notice your confusion, because something that you believe in is definitely wrong.

But this is in no way a point in favor of "anthropically updating EDT". On the contrary. It's either a point against anthropical updates in general, or against EDT in general or against both at the same time. I think the first post from my anthropic sequence [LW · GW] is relevant here. Updating on "self-locating evidence" is the bailey, while following EDT is the motte.

I recommend making several steps back. Stop thinking about SSA/SIA/reference classes/CDT/EDT - it all accumulated so much confusion that it's easier to start anew. Go back to the basics. Understand the "anthropic updates" in terms of probability theory, when they are lawful and when they are not. Reduce anthropics to probability theory. This is the direction I go with my anthropic sequence and since I've gone in this direction anthropic problems became much clearer to me. For example, from my latest post [LW · GW] you can derive that no-one should update on awakening in Sleeping Beauty, regardless of their decision theory and you do not need to invoke utilities or betting arguments at all, just pure probability theoretic reasoning.

Replies from: antimonyanthony
comment by Anthony DiGiovanni (antimonyanthony) · 2024-03-06T05:34:37.465Z · LW(p) · GW(p)

On the contrary. It's either a point against anthropical updates in general, or against EDT in general or against both at the same time

Why? I'd appreciate more engagement with the specific arguments in the rest of my post.

Go back to the basics. Understand the "anthropic updates" in terms of probability theory, when they are lawful and when they are not. Reduce anthropics to probability theory.

Yep, this is precisely the approach I try to take in this section [LW · GW]. Standard conditionalization plus an IMO-plausible operationalization of who "I" am gets you to either SIA or max-RC-SSA.

Replies from: Ape in the coat
comment by Ape in the coat · 2024-03-06T09:23:57.261Z · LW(p) · GW(p)

Why?

Well, let's try a simple example. Suppose you have two competing theories how to produce purple paint:

  1. Add red paint into the vial before the blue paint and then mix them together.
  2. Add blue paint into the vial before the red paint and then mix them together.

Both theories work in practice. And yet, they are incompatible with each other. Philosophers write papers about the conundrum and soon two assumptions are coined: the red-first assumption (RFA) and the red-second assumption (RSA).

Now, you observe that there are compelling arguments in favor of both theories. Does it mean that it's an argument in favor of RSA+RFA - adding red both the first and the second time? Even though the result is visibly not purple?

Of course not! It means that something is subtly wrong with both theories, namely that they assume that the order in which we add paint is relevant at all. What is required is that blue and red ingredients are accounted for and are present in the resulting mix.

Do you see the similarity between this example and SIA+EDT case?

Replies from: antimonyanthony
comment by Anthony DiGiovanni (antimonyanthony) · 2024-03-09T19:29:44.473Z · LW(p) · GW(p)

Suppose you have two competing theories how to produce purple paint

If producing purple paint here = satisfying ex ante optimality, I just reject the premise that that's my goal in the first place. I'm trying to make decisions that are optimal with respect to my normative standards (including EDT) and my understanding of the way the world is (including anthropic updating, to the extent I find the independent arguments for updating compelling) — at least insofar as I regard myself as "making decisions."[1]

Even setting that aside, your example seems very disanalogous because SIA and EDT are just not in themselves attempts to do the same thing ("produce purple paint"). SIA is epistemic, while EDT is decision-theoretic.

  1. ^

    E.g. insofar as I'm truly committed to a policy that was optimal from my past (ex ante) perspective, I'm not making a decision now.

Replies from: Ape in the coat
comment by Ape in the coat · 2024-03-10T07:45:24.487Z · LW(p) · GW(p)

The point of the analogy is that just as there are different ways to account for the fact that red paint is required in the mix - either by adding it first or second - there are different ways to account for the fact that, say, Sleeping Beauty awakens twice on Tails and only once on Heads.

One way is to modify probabilities, saying that the probability of awakening on Tails is twice that of awakening on Heads - that's what SIA does. The other is to modify utilities, saying that the reward of correctly guessing Tails is twice as large as for Heads under a per-awakening betting rule - that's what EDT does, if I understand correctly. Both ways produce the same product P(Tails)U(Tails), which defines the betting odds. But if you modify both utilities and probabilities you obviously get the wrong result.

Now, you are free to choose to bite the bullet that it has never been about getting the correct betting odds in the first place. For some reason, people bite all kinds of ridiculous bullets specifically in anthropic reasoning, and so I hoped that re-framing the issue as a recipe for purple paint might snap you out of it, which, apparently, failed to be the case.

But usually when people find themselves in a situation where only one theory out of two can be true, despite there being compelling reasons to believe in both of them, they treat it as a reason to re-examine these reasons, because at least one of these theories is clearly wrong.

And yeah, SIA is wrong. Clearly wrong. It's so obviously wrong that even according to Carlsmith, who defends it in a series of posts, it implies telekinesis, and its main appeal is that at least it's not as bad as SSA. As I've previously commented on this topic:

A common way people tend to justify SIA and all its ridiculousness is by pointing at SSA's ridiculousness and claiming that it's even more ridiculous. Frankly, I'm quite tired of this kind of anthropical whataboutism. It seems to be some kind of weird selective blindness. In no other sphere of knowledge would people accept this as valid reasoning. But in anthropics, somehow, it works?

The fact that SSA is occasionally stupid doesn't justify SIA's occasional stupidity. Both are obviously wrong in general, even though sometimes both may produce the correct result.

Replies from: antimonyanthony
comment by Anthony DiGiovanni (antimonyanthony) · 2024-03-10T20:49:00.104Z · LW(p) · GW(p)

Now, you are free to choose to bite the bullet that it has never been about getting the correct betting odds in the first place. For some reason, people bite all kinds of ridiculous bullets specifically in anthropic reasoning, and so I hoped that re-framing the issue as a recipe for purple paint might snap you out of it, which, apparently, failed to be the case.

By what standard do you judge some betting odds as "correct" here? If it's ex ante optimality, I don't see the motivation for that (as discussed in the post), and I'm unconvinced by just calling the verdict a "ridiculous bullet." If it's about matching the frequency of awakenings, I just don't see why the decision should only count N once here — and there doesn't seem to be a principled epistemology that guarantees you'll count N exactly once if you use EDT, as I note in "Aside: Non-anthropically updating EDT sometimes 'fails' these cases." [LW · GW]

I gave independent epistemic arguments for anthropic updating at the end of the post, which you haven't addressed, so I'm unconvinced by your insistence that SIA (and I presume you also mean to include max-RC-SSA?) is clearly wrong.

Replies from: Ape in the coat
comment by Ape in the coat · 2024-03-11T07:04:19.799Z · LW(p) · GW(p)

By what standard do you judge some betting odds as "correct" here?

The same as always. Correct betting odds systematically lead to winning. 

I don't see the motivation for that (as discussed in the post)

The motivation is that you don't need to invent extraordinary ways to wiggle out of being Dutch-booked, of course.

Do you systematically use this kind of reasoning in regard to betting odds? If so, what are your reasons to endorse EDT in the first place?

as I note in "Aside: Non-anthropically updating EDT sometimes 'fails' these cases." [LW · GW]

This subsection is another example of "two wrongs make a right" reasoning. You point out some problems of EDT not related to anthropic updating and then conclude that the fact that EDT with anthropic updating has similar problems is okay. This doesn't make sense. If a theory has a flaw, we need to fix the flaw, not treat it as a license to add more flaws to the theory.

I gave independent epistemic arguments for anthropic updating at the end of the post, which you haven't addressed

I'm sorry, but I don't see any substance in your argument to address. This step renders the whole chain of reasoning meaningless:

What is $P(i \mid w)$, i.e., assuming I exist in the given world, how likely am I to be in a given index? Min-RC-SSA would say, “‘I’ am just guaranteed to be in whichever index corresponds to the person ‘I’ am.” This view has some merit (see, e.g., here [LW · GW] and Builes (2020)). But it’s not obvious we should endorse it — I think a plausible alternative is that “I” am defined by some first-person perspective.[19] And this perspective, absent any other information, is just as likely to be each of the indices of observers in the world. On this alternative view, $P(i \mid w) = 1/n(O(w))$.

You are saying that there is a view (1) that has some merits, but it's not obvious that it is true, so... you just assume view (2) instead. Why? Why would you do it? What's the argument that you should assume that? You don't give any. You just make an ungrounded assumption and go on with your reasoning.