# Treating anthropic selfish preferences as an extension of TDT

post by Manfred · 2015-01-01T00:43:56.587Z · score: 9 (12 votes) · LW · GW · Legacy · 16 comments

**I**

When preferences are selfless, anthropic problems are easily solved by a change of perspective. For example, if we do a Sleeping Beauty experiment for charity, all Sleeping Beauty has to do is follow the strategy that, from the charity's perspective, gets them the most money. This turns out to be an easy problem to solve, because the answer doesn't depend on Sleeping Beauty's subjective perception.

But selfish preferences - like being at a comfortable temperature, eating a candy bar, or going skydiving - are trickier, because they do rely on the agent's subjective experience. This trickiness really shines through when there are actions that can change the number of copies. For recent posts about these sorts of situations, see Pallas' sim game and Jan_Ryzmkowski's tropical paradise. I'm going to propose a model that makes answering these sorts of questions almost as easy as playing for charity.

To quote Jan's problem:

> It's a cold cold winter. Radiators are hardly working, but it's not why you're sitting so anxiously in your chair. The real reason is that tomorrow is your assigned upload, and you just can't wait to leave your corporality behind. "Oh, I'm so sick of having a body, especially now. I'm freezing!" you think to yourself, "I wish I were already uploaded and could just pop myself off to a tropical island."
>
> And now it strikes you. It's a weird solution, but it feels so appealing. You make a solemn oath (you'd say one in million chance you'd break it), that soon after upload you will simulate this exact scene a thousand times simultaneously and when the clock strikes 11 AM, you're gonna be transposed to a Hawaiian beach, with a fancy drink in your hand.
>
> It's 10:59 on the clock. What's the probability that you'd be in a tropical paradise in one minute?
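For reference, the roughly 1000/1001 answer defended later in the comments comes from indifference over a thousand identical simulated copies plus the one original. A minimal sketch of that arithmetic (the function name is mine):

```python
def p_simulated(n_copies):
    """Probability of currently being one of n_copies identical simulations,
    given indifference over the n_copies sims plus the 1 original."""
    return n_copies / (n_copies + 1)

# 1000 simulated scenes plus the original wintry scene:
print(p_simulated(1000))  # 1000/1001, roughly 0.999
```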

**II**

Choose as if you were controlling the output of your decision algorithm, so that you maximize your expected utility, including selfish desires (if you have them), conditioned on the fact that you exist (I'll come back to what this last bit means in part III).[1]

Under this rule, it can be *actually a good idea* to take an action that, if you're the original body, creates a bunch of high-selfish-expected-utility copies that also undergo the decision you're making right now, because this decision controls whether you're one of those copies.

[1]: The reader who has been following my posts may note that this identification of who has the preferences via causality makes selfish preferences well-defined no matter how many times I define the pattern "I" to map to my brain. This is good because it makes the process well-defined, but also somewhat troubling because it eliminates the last dependence on a lower level where anthropic probabilities could be thought of as determined a priori, rather than as depending on a definition of self grounded in decision-making as well as experiencing. On the other hand, with that level conflict gone, maybe there's nothing stopping us from thinking of anthropic probabilities on this more contingent level as "obvious" or "a priori."

**III**

If I condition on the fact that I wake up after possibly being copied, my probability that I picked the winning numbers is large, as is my probability that I will have picked the winning numbers in the future,[2] even if I get copied or merged or what have you. Then I learn the result, and no longer have a single state of information which would give me a probability distribution. Compare this to the second horn of the trilemma; it's easy to get mixed up when giving probabilities if there's more than one set of probabilities to give.


[2]: One might complain that if you know what you'll expect in the future, you should update to believing that in the present. But if I'm going to be copied tomorrow, I don't expect to be a copy today.

**IV**

Just as computer programs or brains can split, they ought to be able to merge. If we imagine a version of the Ebborian species that computes digitally, so that the brains remain synchronized so long as they go on getting the same sensory inputs, then we ought to be able to put two brains back together along the thickness, after dividing them. In the case of computer programs, we should be able to perform an operation where we compare each two bits in the program, and if they are the same, copy them, and if they are different, delete the whole program. (This seems to establish an equal causal dependency of the final program on the two original programs that went into it. E.g., if you test the causal dependency via counterfactuals, then disturbing any bit of the two originals, results in the final program being completely different (namely deleted).)
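The bit-comparison merge can be sketched in code (an illustration of the description above, not anything from the original; the names are mine). Any single-bit disagreement deletes the whole merged program, which is what makes the result counterfactually dependent on every bit of both originals:

```python
def merge_programs(p1, p2):
    """Merge two program bitstrings: keep bits where they agree;
    if any bit differs, the whole program is deleted (returns None)."""
    if len(p1) != len(p2):
        return None  # treat a length mismatch as disagreement
    merged = []
    for b1, b2 in zip(p1, p2):
        if b1 != b2:
            return None  # one flipped bit anywhere deletes everything
        merged.append(b1)
    return "".join(merged)

# Identical programs merge intact; a single disturbed bit destroys the result.
print(merge_programs("110101", "110101"))  # → 110101
print(merge_programs("110101", "110100"))  # → None
```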

*utterly annihilated*.

**EDIT:**

**A Problem:**

**Metaproblems:**

**Solution Brainstorming (if one is needed at all):**

**EDIT 2:**

## 16 comments

Comments sorted by top scores.

**[deleted]**· 2015-01-01T13:57:43.013Z · score: 4 (4 votes) · LW · GW

> The probability question is straightforward, and is indeed about a 1000/1001 chance of tropical paradise. If this does not make sense, feel free to ask about it.

To me, this seems to neglect the prospect of *someone else* simulating the exact scene a bunch *more* times, somewhere out in time and space. To me, once you've cut yourself loose of Occam's Razor/Kolmogorov Complexity and started assigning probabilities as *frequencies throughout a space-time continuum in which identical subjective agent-moments occur multiply*, you have long since left behind Cox's Theorem and the use of probability to reason over limited information.

> this seems to neglect the prospect of someone else simulating the exact scene a bunch more times, somewhere out in time and space

This is true - and I do think the probability of this is negligible. Additional simulations of our universe wouldn't change the probabilities - you'd need the simulator to interfere in a very specific way that seems unlikely to me.

> once you've cut yourself loose of Occam's Razor/Kolmogorov Complexity and started assigning probabilities as frequencies throughout a space-time continuum in which identical subjective agent-moments occur multiply

Why do those conflict at all? I feel like you may be talking about a nonstandard use of Occam's razor.

> long since left behind [...] the use of probability

What probability do you give the simulation hypothesis?

**[deleted]**· 2015-01-02T17:47:57.274Z · score: 2 (2 votes) · LW · GW

> What probability do you give the simulation hypothesis?

Some extremely low prior based on its necessary complexity.

> This is true - and I do think the probability of this is negligible.

No, you have no information about that probability. You can assign a complexity prior to it and nothing more.

> Why do those conflict at all? I feel like you may be talking about a nonstandard use of Occam's razor.

They conflict because you have two perspectives, and therefore two different sets of information, and therefore two very different distributions. Assume the scenario happens: the person running the simulation from outside *has information about the simulation*. *They* have the evidence necessary to defeat the low prior on "everything So and So experiences is a simulation". So and So himself... does not have that information. His limited information, from sensory data that *exactly* matches the real, physical, lawful world rather than the mutable simulated environment, rationally justifies a distribution in which, "This is all physically real and I am in fact *not* going to a tropical paradise in the next minute because I'm *not* a computer simulation" is the Maximum a Posteriori hypothesis, taking up the vast majority of the probability mass.

So, the standard Bayesian analogue of Solomonoff induction is to put a complexity prior over computable predictions about future sensory inputs. If the shortest program outputting your predictions looks like a specification of a physical world, and then an identification of your sensory inputs within that world, and the physical world in your model has both a meatspace copy of you and a simulated copy of you, the only difference in this Solomonoff-analogous prior between being a meat-person and a chip-person is the complexity of identifying your sensory inputs. I think it is unfounded substrate chauvinism to think that your sensory inputs are less complicated to specify than those of an uploaded copy of yourself.

**[deleted]**· 2015-01-03T10:31:33.781Z · score: 1 (1 votes) · LW · GW

> If the shortest program outputting your predictions looks like a specification of a physical world, and then an identification of your sensory inputs within that world, and the physical world in your model has both a meatspace copy of you and a simulated copy of you, the only difference in this Solomonoff-analogous prior between being a meat-person and a chip-person is the complexity of identifying your sensory inputs.

Firstly, this isn't a Solomonoff-*analogous* prior. It *is* the Solomonoff prior. Solomonoff Induction *is* Bayesian.

Secondly, my objection is that in all circumstances, if right-now-me does not possess actual information about uploaded or simulated copies of myself, then the *simplest* explanation for *physically-explicable* sensory inputs (i.e., sensory inputs that don't vary between physical and simulated copies), the explanation with the lowest Kolmogorov complexity, is that I am physical and also the *only* copy of myself in existence at the present time.

This means that the 1000 simulated copies must arrive at an incorrect conclusion for rational reasons: the scenario you invented deliberately, maliciously strips them of any means to distinguish themselves from the original, physical me. A rational agent cannot be expected to necessarily win in adversarially-constructed situations.

> I feel like you may be talking about a nonstandard use of Occam's razor.

It's the basis for a common use. However, this seems pretty clearly wrong or incomplete.

I think the grandparent's argument really had more to do with "reason(ing) over limited information" vs frequencies in a possibly infinite space-time continuum. That still seems like a weak objection, given that anthropics look related to the topic of fixing Solomonoff induction.

Why average utility of my descendants/copies, instead of total utility? Total utility seems to give better answers. Total utility implies that if copies have better-than-nothing lives, more is better. But that seems right, for roughly the same reason that I don't want to die in my sleep tonight: it deprives me of good future days. Suppose I learn that I will soon lose long-term (>24 hr) episodic memory, so that every future day will be disconnected from every other, but my life will otherwise be good. Do I still prefer a long life over a one-more-day life? I think yes. But now my days might as well, for all practical and ethical purposes, be lived parallel instead of serially.

With total utility, there is only a very ordinary precommitment problem in Tropical Paradise, provided one important feature. The important feature is that uploaded-me should not be overburdened. Suppose uploaded-me can only afford to make simultaneously-running copies on special occasions, and is reluctant to waste that on this project. That seems reasonable. If uploaded me has to sacrifice 1000 minutes of warm fuzzy feelings to give me one minute of hope now, that's not worth it. On the other hand, if he only has to do this once - giving me a 50/50 hope right now - that may well be worth it.

Let's make up some numbers. My present wintry blast with no hope of immediate relief, let's give a utility of zero per minute. Wintry blast with 50/50 hope, 6 per minute. Wintry blast with 999/1000 hope, 8 per minute. Tropical paradise, 10 per minute. Summing over all the me and future-me minutes gives the best result with only a single reliving of Winter.
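A quick sketch of how those numbers add up (my framing, not the commenter's): assume each copy trades one minute of paradise (utility 10) for one minute of Winter at the corresponding hope level, and present-me's single minute gets the hope utility instead of 0.

```python
def total_utility_change(n_copies, hope_utility):
    """Change in summed utility-minutes from swearing the oath:
    present-me's minute gains hope_utility (instead of 0), and each
    copy trades a paradise minute (10) for a Winter-with-hope minute."""
    return hope_utility + n_copies * (hope_utility - 10)

print(total_utility_change(1, 6))     # one copy, 50/50 hope: +2
print(total_utility_change(1000, 8))  # thousand copies, 999/1000 hope: -1992
```

So with these made-up numbers, a single reliving of Winter is a net gain, while a thousand relivings are a large net loss, matching the conclusion above.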

Upload-me makes the sacrifice of 1 minute for basically the same reason Parfit's hitch-hiker pays his rescuer.

> Suppose I learn that I will soon lose long-term (>24 hr) episodic memory, so that every future day will be disconnected from every other, but my life will otherwise be good. Do I still prefer a long life over a one-more-day life?

Under the model of selfish preferences I use in this post, this is an interesting situation. Suppose that you go to sleep in the same room every night, and every morning you wake up with only your long-term memories (Or your brain is overwritten with the same brain-state every morning or something). Suppose you could give up some tasty candy now for tasty candy every day of your illness. If you eat the candy now, are you robbing yourself of a bunch of future candy, and making a terrible mistake? And yet, every morning a new causal branch of you will wake up, and from their perspective they merely ate their candy a little earlier.

One could even defend not letting yourself get killed off after one day as an altruistic preference rather than a selfish one.

But really this all derived from resolving one single conflict - if there are multiple different conflicts there are multiple solutions. So I'm not really sure - as I hope I sufficiently emphasized, I do not trust this population ethics result.

> If you eat the candy now, are you robbing yourself of a bunch of future candy, and making a terrible mistake? And yet, every morning a new causal branch of you will wake up, and from their perspective they merely ate their candy a little earlier.

Cool, this leads me to a new point/question. You've defined "selfish" preference in terms of causal flows. I'd like to point out that those flows are not identity-relation-like. Each future branch of me wakes up and sees a one-to-one tradeoff: he doesn't get candy now, but he got it earlier, so it's a wash. But those time-slices aren't the decider, this current one is. And from my perspective now, it's a many-to-one tradeoff; those future days are all connected to me-now. This is possible because "A is causally connected to B" is intransitive. Isn't this the correct implication of your view? If not, then what?

Well, the issue is in how one calculates expected utility from a description of the future state of the world. If my current self branches into many causal descendants, and each descendant gets one cookie, there does not appear to be a law of physics that requires me to give that the expected utility of one cookie or many cookies.

It's absolutely a many to one tradeoff, that just isn't sufficient to determine how to value it.

However, if one requires that the ancestor and the descendants agree (up to time discounting and selection effects - which are where you value a cookie in 100 years less if you expect to die before then) about the value of a cookie, then that sets a constraint on how to calculate expected utility.

Fair enough. Of course, there's no law of physics ruling out Future Tuesday Indifference, either. We go by plausibility and elegance. Admittedly, "average the branches" looks about equally plausible and elegant to "sum the branches", but I think the former becomes implausible when we look at cases where some of the branches are very short-lived.

Requiring that the ancestor and descendants agree is contrary to the spirit of allowing selfish preferences, I think, in the sense of "selfish" that you've defined. If Methuselah is selfish, Methuselah(1000AD) values the experience of Methuselah(900AD), who values the experience of Methuselah(800AD), but M1000 doesn't value the experience of M800.

> I think the former becomes implausible when we look at cases where some of the branches are very short-lived.

As the caveat goes, "The copies have to be people who you would actually like to be." Dying quickly seems like it would really put a damper on the expected utility of being a copy. (Mathematically, the relevant utility here is a time-integral)

I don't see why your claims about Methuselah follow, but I do agree that under this model, agents don't care about their past self - they just do what causes them to have high expected utility. Strictly, this is possible independent of whether descendants and ancestors agree or disagree. But if self-modification is possible, such conflicting selfish preferences would get modified away into nonconflicting selfless preferences.

> Dying quickly seems like it would really put a damper on the expected utility of being a copy.

Not if the copy doesn't anticipate dying. Perhaps all the copies go through a brief dim-witted phase of warm happiness (and the original expects this), in which all they can think is "yup warm and happy, just like I expected", followed by some copies dying and others recovering full intellect and living. Any of those copies is someone I'd "like to be" in the better-than-nothing sense. Is the caveat "like to be" a stronger sense?

I'm confused - if agents don't value their past self, in what sense do they agree or disagree with what the past-self was valuing? In any case, please reverse the order of the Methuselah valuing of time-slices.

Edit: Let me elaborate a story to motivate my some-copies-dying posit. I want to show that I'm not just "gaming the system," as you were concerned to avoid using your caveat.

I'm in one spaceship of a fleet of fast unarmed robotic spaceships. As I feared but planned for, an enemy fleet shows up. This spaceship will be destroyed, but I can make copies of myself in one to all of the many other ships. Each copy will spend 10 warm-and-fuzzy dim-witted minutes reviving from their construction. The space battle will last 5 minutes. The spaceship at the farthest remove from the enemy has about a 10% chance of survival. The next-farthest has a 9 point something percent chance - and so on. The enemy uses an indeterministic algorithm to chase/target ships, so these probabilities are almost independent. If I copy to all the ships in the fleet, I have a very high probability of survival. But the maximum average expected utility is gotten by copying to just one ship.
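The divergence the story is driving at can be sketched with made-up numbers (all mine: 1 utility per warm-and-fuzzy minute, 1000 for surviving, and survival probabilities stepping down by half a percent per ship):

```python
def copy_expected_utilities(k, warm_minutes=10, survival_value=1000):
    """Expected utility for each of k copies: 10 warm-and-fuzzy minutes
    (1 utility each) plus survival probability times survival_value.
    Ship i's survival probability steps down: 0.100, 0.095, 0.090, ..."""
    probs = [0.100 - 0.005 * i for i in range(k)]
    return [warm_minutes + p * survival_value for p in probs]

def average_eu(k):
    eus = copy_expected_utilities(k)
    return sum(eus) / len(eus)

def total_eu(k):
    return sum(copy_expected_utilities(k))

# Averaging favors copying to only the single safest ship;
# totaling favors copying to every ship in the fleet.
print(average_eu(1), average_eu(5))  # average falls as worse ships are added
print(total_eu(1), total_eu(5))      # total rises with each added copy
```

Under averaging, the best move is a single copy on the safest ship; under totaling, every additional copy with positive expected utility is worth making, and survival becomes nearly certain with the whole fleet.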

> One might then make an argument about the decision question that goes like this: Before I swore this oath, my probability of going to a tropical island was very low. After, it was very high. Since I really like tropical islands, this is a great idea. In a nutshell, I have increased my expected utility by making this oath.

If it is indeed in your power to swear and execute such an oath, then "I will make an oath to simulate this event and make such-and-such changes" is a legitimate event that would impact any probability calculation. Before swearing the oath, there was still the probability of you swearing it in the future and executing it.

The probability of going to a tropical island given that the oath was made is likely higher than it was before the oath was made, but the only way it would be significantly higher is if there was a very low probability of the oath being made in the first place.

This is identical to the problem with causal decision theory which goes "If determinism is true, I'm already certain to make my decision, so how can I worry about its causal impacts?"

The answer is that you swear the oath because you calculated what would happen if (by causal surgery) your decision procedure output something else. This calculation gets done regardless of determinism - it's just how this decision procedure goes.