Boltzmann brain decision theory

post by Stuart_Armstrong · 2018-09-11T13:24:30.016Z · LW · GW · 9 comments

Contents

  Duration and causality
  Decision theory: causal
  Decision theory: functional
  Bounded rationality

Suppose I told you that you had many Boltzmann brain copies of you. Is it then your duty to be as happy as possible, so that these copies are also happy?

(Now some people might argue that you can't really make yourself happy through mental effort; but you can certainly make yourself sad, so avoiding doing that also counts.)

So, if I told you that some proportion of your Boltzmann brain copies were happy and some were sad, it seems that the best thing you could do, to increase the proportion of happy ones, is to be happy all the time - after all, who knows when in your life a Boltzmann brain might "happen"?

But that reasoning is wrong, a standard error of evidential decision theories. Being happy doesn't make your Boltzmann brain copies happy; instead, it ensures that among all the existing Boltzmann brains, only the happy ones may be copies of you.
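A toy sketch of that distinction, under loud simplifying assumptions: the population of brains has moods fixed independently of you, and a brain counts as a "copy of you" exactly when its mood matches yours. The function name, the population size and the 30% happy fraction are all made up for illustration.

```python
import random

def summarize(my_mood, n_brains=100_000, p_happy=0.3, seed=0):
    """Return (total happy brains, number of copies of you, happy copies).

    Assumption: brain moods are drawn independently of my_mood, and a
    brain is a "copy of you" iff its mood equals my_mood.
    """
    rng = random.Random(seed)  # same seed: the same universe of brains either way
    moods = ["happy" if rng.random() < p_happy else "sad" for _ in range(n_brains)]
    total_happy = moods.count("happy")
    copies = [m for m in moods if m == my_mood]
    happy_copies = sum(1 for m in copies if m == "happy")
    return total_happy, len(copies), happy_copies

for mood in ("happy", "sad"):
    total_happy, n_copies, happy_copies = summarize(mood)
    print(mood, total_happy, n_copies, happy_copies)
```

The total number of happy brains is the same in both runs; only the set of brains that count as your copies moves.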

This is similar to the XOR blackmail problem in functional decision theory. If you pay Omega when they send the blackmail letter ("You currently have a termite problem in your house iff you won't send me £1000"), you're not protecting your house; instead, you're determining whether you live in a world where Omega will send the letter.
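A minimal sketch of the XOR blackmail arithmetic, assuming Omega is a perfect predictor who only sends the letter when its claim is true. The £1000 payment is from the letter; the 10% termite probability and the £1,000,000 damage figure are illustrative assumptions, as is the function name.

```python
def expected_cost(pays_on_letter, p_termites=0.1, damage=1_000_000, payment=1_000):
    """Expected loss for an agent with a fixed policy on receiving the letter.

    Assumption: Omega sends the letter iff exactly one of these holds:
    the house has termites, or the agent pays when it gets the letter.
    """
    cost = 0.0
    for termites, prob in ((True, p_termites), (False, 1 - p_termites)):
        letter_sent = termites != pays_on_letter  # XOR of the two conditions
        paid = letter_sent and pays_on_letter
        # Termite damage happens or not regardless of the agent's policy.
        cost += prob * (damage * termites + payment * paid)
    return cost

print("pay:   ", expected_cost(True))
print("refuse:", expected_cost(False))
```

Under these assumptions the termite damage term is identical for both policies; the payer just adds a payment in every world without termites.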

On the other hand, if there were long-lived identical copies of you scattered around the place, and you cared about their happiness in a total utilitarian style way, then it seems there is a strong argument you should make yourself happy. So, somewhere between instantaneous Boltzmann brains and perfect copies, your decision process changes. What forms the boundaries between the categories?

Duration and causality

If a Boltzmann brain only has time to make a single decision, then immediately vanishes, then that decision is irrelevant. So we have to have long-lived Boltzmann brains, where long-lived means a second or more.

Similarly, the decision has to be causally connected to the Boltzmann brain's subsequent experience. It makes no sense if you decide to be happy, and then your brain gets flooded with pain immediately after - or the converse. Your decision only matters if your view of causality is somewhat accurate. Therefore you require long-lived Boltzmann brains who respect causality.

In a previous post, I showed that the evidence seems to suggest that Boltzmann brains caused by quantum fluctuations are generally very short-lived (this seems a realistic result) and that they don't obey causality (I'm more uncertain about this).

In contrast, for Boltzmann brains created by nucleation in an expanding universe, most observer moments belong to Boltzmann brains in Boltzmann simulations: exceptionally long lived, with causality. They are, however, much - much! - less probable than quantum fluctuations.

Decision theory: causal

Assume, for the moment, that you are an unboundedly rational agent (congratulations, btw, on winning all the Clay Institute prizes, on cracking all public-key encryption, on registering patents on all imaginable inventions, and on solving friendly AI).

You have decent estimates as to how many Boltzmann brains are long-lived with causality, how many use your decision theory, and how many are exact copies of you.

If you are using a causal decision theory, then only your exact copies matter - the ones where you are unsure of whether you are them or you are "yourself". Let $p$ be the probability that you are a Boltzmann brain at this very moment, let $a$ be an action and decompose your preferences into $u + h$, where $u$ is some utility function and $h$ is happiness. By an abuse of notation, I'll write $u(a)$ for the expected $u$ given that action $a$ is taken by the "real" you, $u_B(a)$ for the expected $u$ given that action $a$ is taken by a Boltzmann brain copy of you, and similarly for $h(a)$ and $h_B(a)$.

Then the expected utility for action $a$ is:

$$(1-p)\big(u(a) + h(a)\big) + p\big(u_B(a) + h_B(a)\big).$$

If we restrict our attention to medium-long duration Boltzmann brains, say ten seconds or less (though remember that Boltzmann simulations are an issue!), and assume that $u$ is reasonably defined over the real world, we can neglect $u_B(a)$ (since all actions the Boltzmann brain takes will have little to no impact on $u$), and use the expression:

$$(1-p)\big(u(a) + h(a)\big) + p\,h_B(a).$$

This formula seems pretty intuitive: you trade off the small increase in happiness in your Boltzmann brain ($h_B(a)$), with the probability of being a Boltzmann brain ($p$), and the utility and happiness you can expect from your normal life.
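A small numeric sketch of this trade-off. It assumes the weighting is "probability of being the real you times (utility plus happiness), plus probability of being a Boltzmann brain times the brain's happiness", with the brain's effect on utility defaulting to zero. The two actions and all their numbers are invented for illustration.

```python
def cdt_expected_utility(p, u_real, h_real, h_bb, u_bb=0.0):
    """Expected value of an action for a causal decision theorist.

    p      : probability that you are a Boltzmann brain right now
    u_real : expected utility if the real you takes the action
    h_real : expected happiness if the real you takes the action
    h_bb   : expected happiness if a Boltzmann brain copy takes it
    u_bb   : expected utility in the Boltzmann brain case (assumed negligible)
    """
    return (1 - p) * (u_real + h_real) + p * (u_bb + h_bb)

# Hypothetical actions: "work" is better for your real life,
# "be_happy" is better for a short-lived Boltzmann brain.
ACTIONS = {
    "work": {"u_real": 10.0, "h_real": 1.0, "h_bb": 0.0},
    "be_happy": {"u_real": 9.0, "h_real": 1.5, "h_bb": 1.0},
}

def best_action(p):
    return max(ACTIONS, key=lambda name: cdt_expected_utility(p, **ACTIONS[name]))

for p in (0.0, 0.01, 0.5, 1.0):
    print(p, best_action(p))
```

With these made-up numbers the choice only flips once the probability of being a Boltzmann brain is substantial, which is the sense in which the causal formula keeps your normal life in charge.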

Decision theory: functional

If you're using a more sensible functional decision theory, and are a total utilitarian altruist where happiness is concerned, the expression is somewhat different. Let $B$ be the set of Boltzmann brains (not necessarily copies of you) that will take decision $a$ iff you do. For any given $b \in B$, let $a_b$ be the fact that $b$ takes action $a$, let $p_b$ be the probability of $b$ existing (not the probability of you being $b$), and let $h_b$ be the happiness of $b$.

Then the expected utility for action $a$ is:

$$u(a) + h(a) + \sum_{b \in B} p_b\big(u(a_b) + h_b(a_b)\big).$$

Note that $u(a_b)$ need not be the utility of $b$ at all - you are altruistic for happiness, not for general goals. As before, if $B$ contains only medium-long duration Boltzmann brains (or if the actions of these agents are independent of $u$), we can simplify to:

$$u(a) + h(a) + \sum_{b \in B} p_b\,h_b(a_b).$$

Because of the summation, the happiness of the Boltzmann brains can come to dominate your decisions, even if you yourself are pretty certain not to be a Boltzmann brain.
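A sketch of how the sum can take over, assuming the simplified objective is your own utility and happiness plus the existence-weighted happiness of every brain whose decision is linked to yours. The action values, the per-brain existence probability of 1e-4 and the helper names are illustrative assumptions.

```python
def fdt_expected_utility(u_real, h_real, linked_brains):
    """Own utility and happiness plus altruistic term over linked brains.

    linked_brains: list of (p_b, h_b) pairs, where p_b is the probability
    that brain b exists and h_b is its happiness given the shared action.
    """
    own = u_real + h_real
    altruistic = sum(p_b * h_b for p_b, h_b in linked_brains)
    return own + altruistic

def best_action(n_linked, p_b=1e-4):
    # Hypothetical numbers: "work" helps only you, "be_happy" costs you a
    # little but gives each linked brain one unit of happiness.
    options = {
        "work": fdt_expected_utility(10.0, 1.0, [(p_b, 0.0)] * n_linked),
        "be_happy": fdt_expected_utility(9.0, 1.5, [(p_b, 1.0)] * n_linked),
    }
    return max(options, key=options.get)

for n in (0, 1_000, 100_000):
    print(n, best_action(n))
```

No probability of you being a Boltzmann brain appears anywhere in this calculation; the flip is driven purely by how many linked brains there are.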

Variations of this, for different levels and types of altruism, should be clear now.

Bounded rationality

But neither of us is unboundedly rational (a shame, really). What should we do, in the real world, if we think that Boltzmann brains are worth worrying about? Assume that your altruism and probabilities point towards Boltzmann brain happiness being the dominant consideration.

A key point of FDT/UDT is that your decisions only make a difference when they make something happen differently. That sounds entirely tautological, but let's think about the moment in which an unboundedly rational agent might be taking a different decision in order to make Boltzmann brains happy. When it is doing this, it is applying FDT, and considering the happiness of the Boltzmann brains, and then deciding to be happy.

And the same is true for you. Apart from personal happiness, you should take actions to make yourself happier only when you are using altruistic UDT and thinking about Boltzmann brain problems. So right now might be a good time.

This might feel perverse - is that really the only time the fact is relevant? Is there nothing else you could do - like make yourself into a consistent UDT agent in the first place?

But remember the point in the introduction - naively making yourself happy means that your Boltzmann brain copies will be happy: but this isn't actually increasing the happiness across all Boltzmann brains, just changing which ones are copies of you. Similarly, none of the following will change anything about other brains in the universe:

They won't change anything, because they don't have any acausal impact on Boltzmann brain copies. More surprisingly, neither will the following:

That will not make any difference; some Boltzmann brains will find it easy to make themselves happy, others will find it hard. But the action a that Boltzmann brains should take in these situations is something like "make yourself happy, as best you can". Changing the "as best you can" for you doesn't change it for Boltzmann brains.

9 comments

Comments sorted by top scores.

comment by Peter Gerdes (peter-gerdes) · 2018-09-15T11:20:08.591Z · LW(p) · GW(p)

Seems like phrasing it in terms of decision theory only makes the situation more confusing. Why not just state the results in terms of: assuming there are a large number of copies of some algorithm A then there is more utility if A has such and such properties.

This works more generally. Instead of burying ourselves in the confusions of decision theory we can simply state results about what kind of outcomes various algorithms give rise to under various conditions.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2018-09-18T11:32:26.127Z · LW(p) · GW(p)

>assuming there are a large number of copies of some algorithm A then there is more utility if A has such and such properties.

This is only relevant if this results in a change in algorithm A. E.g. a causal decision theorist can know that if it were a UDT agent, then it would have more money in the Newcomb problem, but it won't change itself because of this (if Omega decided before the agent existed).

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2018-09-13T13:02:12.550Z · LW(p) · GW(p)

I didn't quite follow that last section. How do considerations about boundedness and "only matters if it makes something happen differently" undermine the reasoning you laid out in the "FDT" section, which seems solid to me? Here's my attempt at a counterargument; hopefully we can have a discussion & clear things up that way.

I am arguing for this thesis: As an altruistic FDT/UDT agent, the optimal move is always "think happy thoughts," even when you aren't thinking about Boltzmann Brains or FDT/UDT.

In the space of boltzmann-brains-that-might-be-me, probability/measure is not distributed evenly. Simpler algorithms are more likely/have more measure.

I am probably a simpler algorithm.

So while it is true that for every action a I could choose, there is some chunk of BB's out there that chooses a, and hence in some sense my choice makes no difference to what the BB's do but rather only to which ones I am logically correlated with, it's also true that my choice controls the choice of the largest chunk of BB's, and so if I choose a then the largest chunk of BB's chooses a, and if I choose b then the largest chunk of BB's chooses b.

So I should think happy thoughts.

The argument I just gave was designed to address your point "naively making yourself happy means that your Boltzmann brain copies will be happy: but this isn't actually increasing the happiness across all Boltzmann brains, just changing which ones are copies of you" but I may have misunderstood it.

P.S. I know earlier you argued that the entropy of a BB doesn't matter because its contribution to the probability is dwarfed by the contribution of the mass. But as long as it's nonzero, I think my argument will work: Higher-entropy BB configurations will be more likely, holding mass constant. (Perhaps I should replace "simpler" in the above argument with "higher-entropy" then.)

comment by avturchin · 2018-09-12T11:38:12.716Z · LW(p) · GW(p)

It looks like the middle of the post is either broken or intended to be read by a person with unbounded rationality.

Another point: Do you use in your arguments the idea that I am not a BB? Because if most of my copies are BBs, I am also likely to be a BB, and thus the question is what one BB could do to make other BBs happy. The problem is that BBs almost by definition are not rational, as true thoughts and false thoughts have equal probability for BBs (except the case of BB-simulations, where there may be some shift to rationality).

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2018-09-12T13:41:17.995Z · LW(p) · GW(p)

Hey there, thanks for the heads up - I edited it on the phone, and it went wonky. Have corrected it now.

Replies from: habryka4
comment by habryka (habryka4) · 2018-09-12T16:16:55.934Z · LW(p) · GW(p)

Alas, sorry. We've made some improvements to editing posts between posts and phones, but in particular LaTeX is quite hard to get to work reliably in a WYSIWYG format. If you need to edit things often on different devices, changing to Markdown should fix a lot of the problems.

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2018-09-13T10:10:56.339Z · LW(p) · GW(p)

>changing to Markdown should fix a lot of the problems.

But then I wouldn't have LaTeX! ^_^

Replies from: habryka4
comment by habryka (habryka4) · 2018-09-13T17:40:07.229Z · LW(p) · GW(p)

False! We implemented LaTeX in markdown a few weeks ago!

(this was written on my phone in markdown, syntax is standard dollar signs)

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2018-09-18T11:49:26.236Z · LW(p) · GW(p)

So this works , does it?

Yay! And this is copy-pastable! Have a huge upvote.