A new derivation of the Born rule

post by MrMind · 2014-06-25T15:07:56.333Z · LW · GW · Legacy · 19 comments

This post is an explanation of a recent paper coauthored by Sean Carroll and Charles Sebens, in which they propose a derivation of the Born rule in the context of the Many Worlds approach to quantum mechanics. While the attempt itself is not fully successful, it contains interesting ideas and is thus worth knowing.

A note to the reader: here I will try to shed light on the preconditions and give only a very general view of their method, and for this reason you won’t find any equations. It is my hope that if, after having read this, you’re still curious about the real math, you will point your browser to the preceding link and read the paper for yourself.

If you are not totally new to LessWrong, you should know by now that the preferred interpretation of quantum mechanics (QM) around here is the Many Worlds Interpretation (MWI), which denies the collapse of the wave-function and postulates a distinct reality (that is, a branch) for every base state composing a quantum superposition.

MWI historically suffered from three problems: the absence of macroscopic superpositions, the preferred basis problem, and the derivation of the Born rule. The development of decoherence famously solved the first and, to a lesser degree, the second problem, but the third still remains one of the most poorly understood sides of the theory.

Quantum mechanics assigns an amplitude, a complex number, to each branch of a superposition, and postulates that the probability that an observer finds the system in that branch is the squared norm of the amplitude. This, very briefly, is the content of the Born rule (for pure states).

Quantum mechanics remains agnostic about the ontological status of both amplitudes and probabilities, but MWI, assigning a reality status to every branch, demotes ontological uncertainty (which branch will become real after observation) to indexical uncertainty (which branch the observer will find itself correlated to after observation).

Simple indexical uncertainty, though, cannot reproduce the exact predictions of QM: by the principle of indifference, if you have no information privileging any member of a set of hypotheses, you should assign equal probability to each one. This leads to forming a probability distribution by counting the branches, which only in special circumstances coincides with amplitude-derived probabilities. This discrepancy, and how to account for it, constitutes the Born rule problem in MWI.
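
To make the discrepancy concrete, here is a minimal Python sketch comparing the two ways of assigning probabilities. The function names and the two example states are my own illustrative choices, not something taken from the paper.

```python
import math

def born_probabilities(amplitudes):
    """Born rule for a pure state: squared norm of each amplitude, normalized."""
    weights = [abs(a) ** 2 for a in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]

def branch_counting_probabilities(amplitudes):
    """Naive indifference: every branch gets the same probability."""
    n = len(amplitudes)
    return [1 / n] * n

# Hypothetical two-branch states (illustrative values only).
equal_state = [1 / math.sqrt(2), -1 / math.sqrt(2)]         # equal norms
unequal_state = [math.sqrt(1 / 3), 1j * math.sqrt(2 / 3)]   # unequal norms

born_equal = born_probabilities(equal_state)
count_equal = branch_counting_probabilities(equal_state)
born_unequal = born_probabilities(unequal_state)
count_unequal = branch_counting_probabilities(unequal_state)
```

For the equal-norm state the two assignments agree; for the unequal one, branch counting still gives one half to each branch, while the Born rule gives roughly one third and two thirds.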

There have of course been many attempts at solving it; for a summary, I quote the article directly:

One approach is to show that, in the limit of many observations, branches that do not obey the Born Rule have vanishing measure. A more recent twist is to use decision theory to argue that a rational agent should act as if the Born Rule is true. Another approach is to argue that the Born Rule is the only well-defined probability measure consistent with the symmetries of quantum mechanics.

These proposals have failed to uniformly convince physicists that the Born rule problem is solved, and the paper by Carroll and Sebens is another attempt to reach a solution.

Before describing their approach, there are some assumptions that have to be clarified.

The first, and this is good news, is that they are treating probabilities as rational degrees of belief about a state of the world. They are thus using a Bayesian approach, although they never call it that.

The second is that they’re using self-locating indifference, again from a Bayesian perspective.
Self-locating indifference is the principle that you should assign equal probabilities to finding yourself in different places in the universe, if you have no information that distinguishes the alternatives. For a Bayesian, this is almost trivial: self-locating propositions are propositions like any other, so the principle of indifference should be applied to them as it would be to any other prior information. This is valid for quantum branches too.

The third assumption is where they start to deviate from pure Bayesianism: it’s what they call the Epistemic Separability Principle, or ESP. In their words:

the outcome of experiments performed by an observer on a specific system shouldn’t depend on the physical state of other parts of the universe.

This is a kind of Markov condition: the requirement that the system screens the interaction between the observer and the observed system from every possible influence of the environment.
It is obviously false for many partitions of a system into an experiment and an environment, but rather than taking it as a Principle, we can make it an assumption: an experiment is such only if it obeys the condition.
In the context of QM, this condition amounts to splitting the universal wave-function into two components, the experiment and the environment, so that there’s no entanglement between the two, and to considering only interactions that factor as a product of an evolution for the environment and an evolution for the experiment. In this case, the environment evolution acts as the identity operator on the experiment, and does not affect the behavior of the experiment wave-function.
Thus, their formulation requires that the probability that an observer finds itself in a certain branch after a measurement is independent of the operations performed on the environment.
Note, though, an unspoken but very important point: probabilities of this kind depend only on the superposition structure of the experiment.
A probability, being an abstract degree of belief, can depend on all sorts of prior information. With their quantum version of ESP, Carroll and Sebens are declaring that, in a factored environment, the probabilities of a subsystem do not depend on the information one has about the environment. Indeed, in this treatment, they are equating factorization with lack of logical connection.
This is of course true in quantum mechanics, but is a significant burden in a pure Bayesian treatment.
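
The quantum-mechanical side of this condition can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-state experiment and a two-state environment in a product state. It applies a unitary to the environment alone and confirms that the experiment's branch weights do not move. All names and numbers are illustrative assumptions, and the sketch says nothing about the epistemic content of ESP, only about the factorization it relies on.

```python
import math

def apply_env_unitary(state, u):
    """Apply (identity on experiment) x (u on environment).

    state[i][j] is the amplitude for experiment state i and environment state j.
    """
    return [
        [sum(u[k][j] * row[j] for j in range(len(row))) for k in range(len(u))]
        for row in state
    ]

def experiment_probabilities(state):
    """Born weight of each experiment branch, summed over environment states."""
    return [sum(abs(a) ** 2 for a in row) for row in state]

# Hypothetical product state: experiment amplitudes times environment amplitudes.
psi_experiment = [math.sqrt(1 / 3), math.sqrt(2 / 3)]
phi_environment = [1.0, 0.0]
state = [[p * e for e in phi_environment] for p in psi_experiment]

# An arbitrary rotation of the environment (a real unitary); the angle is made up.
theta = 0.7
u_env = [
    [math.cos(theta), -math.sin(theta)],
    [math.sin(theta), math.cos(theta)],
]

before = experiment_probabilities(state)
after = experiment_probabilities(apply_env_unitary(state, u_env))
```

Here `before` and `after` coincide up to floating-point error, whatever angle is chosen for the environment rotation.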

That said, let’s turn to their setup.

They imagine a system in a superposition of base states, which first interacts and decoheres with an environment, then gets perceived by an observer. This sequence is crucial: the Carroll-Sebens move can only be applied when the system already has decohered with a sufficiently large environment.
I say “sufficiently large” because the next step is to consider a unitary transformation on the “system+environment” block. This transformation needs to be of this kind:

- it respects ESP, in that it has to factor as an identity transformation on the “observer+system” block;

- it needs to equally distribute the probability of each branch in the original superposition on a different branch in the decohered block, according to their original relative measure.

Then, by a simple method of rearranging the labels of the decohered basis, one can show that the correct probabilities come out by the indifference principle, in the very same way that the principle is used to derive the uniform probability distribution in the second chapter of Jaynes’ Probability Theory.

As an example, consider a superposition of a quantum bit, and say that one branch has a higher measure with respect to the other by a factor of square root of 2. The environment needs in this case to have at least 8 different base states to be relabeled in such a way as to make the indifference principle work.

Strictly speaking, in this way you can only show that the Born rule is valid for amplitudes whose ratios are square roots of rational numbers. Again I quote the paper for their conclusion:

however, since this is a dense set, it seems reasonable to conclude that the Born Rule is established. 
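
The counting behind the relabeling argument can be sketched in a few lines of Python. This is only an illustration under my own assumptions: I take the Born weights to be exact rationals (here 1/3 and 2/3, i.e. an amplitude ratio of square root of 2), split each branch into equal-weight sub-branches using the common denominator, and then apply indifference to the sub-branches. The helper names are hypothetical, and the sub-branch count is just the smallest one that makes the arithmetic work. It is not meant as a statement about how many environment states the construction in the paper needs.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def split_into_equal_branches(weights):
    """Number of equal-weight sub-branches each original branch is split into."""
    common = reduce(lcm, (w.denominator for w in weights), 1)
    return [int(w * common) for w in weights]

def indifference_probabilities(counts):
    """Uniform probability over sub-branches, summed per original branch."""
    total = sum(counts)
    return [Fraction(c, total) for c in counts]

# Assumed rational Born weights (squared amplitudes) of the two branches.
weights = [Fraction(1, 3), Fraction(2, 3)]

counts = split_into_equal_branches(weights)   # [1, 2]
probs = indifference_probabilities(counts)
```

With these inputs `probs` equals `weights`: counting equal-weight sub-branches recovers the squared amplitudes, which is the sense in which indifference reproduces the Born rule for rational weights.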

Evidently, this approach suffers from a number of limits: the first and the most evident is that it works only in a situation where the system to be observed has already decohered with an environment. It is not applicable to, say, a situation where a detector reads a quantum superposition directly, e.g. in a Stern-Gerlach experiment.

The second limit, although less serious, is that it can work only when the system to be observed decoheres with an environment which has sufficiently many base states to distribute the relative measure in different branches. This number, for a transcendental amplitude, is bound to be infinite.
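
A rough way to see why the number of states grows without bound is to approximate a weight that has no exact rational form by fractions with larger and larger denominators. The sketch below uses `Fraction.limit_denominator` from the Python standard library. The target weight 1/π is a made-up example, and it is stored as a float, which is itself only a rational stand-in for the real number.

```python
import math
from fractions import Fraction

# Hypothetical Born weight with no exact rational form (float stand-in).
target = 1 / math.pi

def approximate_weight(p, max_states):
    """Closest fraction to p whose denominator does not exceed max_states."""
    return Fraction(p).limit_denominator(max_states)

bounds = (10, 100, 1000, 10000)
approximations = [approximate_weight(target, n) for n in bounds]
errors = [abs(float(a) - target) for a in approximations]
```

Each entry of `approximations` corresponds to a finite number of equal-weight sub-branches (its denominator). The error never increases as the bound is raised, but no finite denominator hits the target exactly.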

The third limit is that it can only work if we are allowed to interact with the environment in such a way as to leave the amplitudes of the interaction between the system and the observer untouched.

All of these, which are understood as limits, can naturally be reversed and considered as defining conditions, saying: the Born rule is valid only within those limits. 

 

I’ll leave it to you to determine if this constitutes a sufficient answer to the Born rule problem in MWI.

19 comments

comment by [deleted] · 2014-06-25T22:06:21.110Z · LW(p) · GW(p)

This was awesome because you walked me through it, and I was able to understand each step along the way, without getting lost in unknown terminology.

Replies from: MrMind
comment by MrMind · 2014-06-26T08:00:13.201Z · LW(p) · GW(p)

Well, I wrote the article mainly for myself, because I learn something much better when I try to explain it to others, but your comment alone made it worth it :) Thanks!

comment by MadRocketSci · 2014-06-26T21:26:37.271Z · LW(p) · GW(p)

I've never understood why explaining the Born Rule is less of a problem for any of the other interpretations of QP than it is for MWI. Copenhagen, IIRC, simply asserts it as an axiom. (Rather, it seems to me that MWI is one of the few that even tries to explain it!)

Replies from: The_Duck, n4r9
comment by The_Duck · 2014-06-30T00:14:47.072Z · LW(p) · GW(p)

I think the Born rule falls out pretty nicely in the Bohmian interpretation.

comment by n4r9 · 2014-07-01T12:55:45.259Z · LW(p) · GW(p)

As I understand, it's less of a problem for a hardline Copenhagen interpretation because no definite ontological status is assigned to the wavefunction, or indeed the collapse of the wavefunction. CI can roughly be paraphrased as

"Consider this set of rules for predicting experimental outcomes. Look how well it works! Of course, we're not asserting anything about actual reality here".

One of those rules is the Born rule. Another is the fact that physical transformations correspond to unitary maps on the Hilbert space. All of them are postulated, and their correctness is a matter of experimental falsification/verification.

Conversely, MWI assigns definite reality to the wavefunction, but denies that collapse is a real process, and does not postulate any rules about predictions of experimental outcomes. Instead, the claim that a process of measurement inevitably results in a single result being recorded - with probability given by the square amplitude of the wavefunction - must be derived from the pre-existing structure of the theory (possibly with some reasonable assumptions about gambling commitments).

A conceivable alternative to MWI might have the Born rule as an additional postulate, supported only by experiment rather than following from the structure of the theory. I feel that this would be much less appealing to many of its advocates.

comment by Shmi (shminux) · 2014-06-25T15:39:18.170Z · LW(p) · GW(p)

Tangential: I am surprised that Carroll even bothers with MWI, given that all these derivations assume a background spacetime, yet he is an expert in General Relativity, where spacetime is affected by matter and vice versa. Thus all the MWI "branches" would still interact gravitationally even after the decoherence is complete. Worse, the causal structure of the spacetime can become different in different branches (e.g. a black hole might form in some and not in others). The only way out I see is having the whole spacetime decohere, as well, which requires not only Quantum Gravity, but also Quantum Gravity detectable in the weak-field limit, not just at Planck scales.

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2014-06-25T17:23:32.529Z · LW(p) · GW(p)

Uh... yeah, he's assuming that gravity is quantum-mechanical in nature, by some mechanism or another. That in itself is a really weak assumption. Why would you even mention the alternative?

I see no justification whatsoever for concluding that (ETI: the quantization of) gravity must therefore be detectable in the weak-field limit.

Replies from: shminux
comment by Shmi (shminux) · 2014-06-25T18:54:59.300Z · LW(p) · GW(p)

I see no justification whatsoever for concluding that gravity must therefore be detectable in the weak-field limit.

Suppose we perform an experiment where, based on the measured spin value, we move some macroscopic object with detectable gravity in opposite directions. In the Newtonian background spacetime approach there is no issue with MWI, as both branches live on the same spacetime. In a full GR case, however, the spacetime itself must decohere into different branches, or else we could detect the interaction between different branches gravitationally (I don't know if this has been tested, but it would be extremely surprising if detected). I am not sure what the mechanism that splits the spacetime itself would be, since all current QM/QFT models are done on a fixed background (ignoring ST and LQG). So presumably this requires Quantum Gravity. Yet the whole thing happens at very low energies, slow speeds and weak spacetime curvatures, so that's why I said that this would have to be a QG effect in the weak-field limit. Of course it would only be "detectable" in the sense that if there is no gravitational interaction between branches, then the spacetime itself must decohere by some QG mechanism.

Replies from: The_Duck, Oscar_Cunningham, Luke_A_Somers, DanielLC
comment by The_Duck · 2014-06-25T20:17:45.448Z · LW(p) · GW(p)

This is essentially the standard argument for why we have to quantize gravity. If the sources of the gravitational field can be in superposition, then it must be possible to superpose two different gravitational fields. But (as I think you acknowledge) this doesn't mean that quantum mechanical deviations from GR have to be detectable at low energies.

Replies from: shminux
comment by Shmi (shminux) · 2014-06-25T21:05:59.499Z · LW(p) · GW(p)

This is essentially the standard argument for why we have to quantize gravity.

Sort of. The problem first appears because the LHS of the EFE is a classical tensor, while the RHS is an operator, two different beasts. And using expectation value of the stress energy tensor does not work that well. The cosmological constant problem does not help, either. The MWI ontology just makes the issues starker. That's why I am surprised that Carroll completely avoids discussing it even though GR is his specialty.

comment by Oscar_Cunningham · 2014-08-12T11:19:59.906Z · LW(p) · GW(p)

Suppose we perform an experiment where, based on the measured spin value, we move some macroscopic object with detectable gravity in opposite directions. In the Newtonian background spacetime approach there is no issue with MWI, as both branches live on the same spacetime. In a full GR case, however, the spacetime itself must decohere into different branches, or else we could detect the interaction between different branches gravitationally (I don't know if this has been tested, but it would be extremely surprising if detected).

I can't get past the paywall, but I think this is what Page and Geilker do in "Indirect Evidence for Quantum Gravity".

comment by Luke_A_Somers · 2014-06-26T02:27:12.774Z · LW(p) · GW(p)

Oh, you mean in the trivial sense that the force actually works as normally observed instead of, well, not.

comment by DanielLC · 2014-06-25T19:44:49.770Z · LW(p) · GW(p)

I'm not an expert in the field, so it may be best to take what I say with a grain of salt, but:

There was some result to some experiment that showed that there were quantum fluctuations in space-time during the formation of the universe. We don't fully understand quantum gravity, but this shows that masses moving around and messing with spacetime isn't going to be all that different from electrons moving around and messing with the magnetic field.

Replies from: Ander
comment by Ander · 2014-06-25T21:44:05.316Z · LW(p) · GW(p)

It looks to me that you're referring to the BICEP2 results published in March. If so, this is still unconfirmed, and physicists have pointed out a possible issue with some of the data that needs to be resolved. (It's still more likely than not to end up being true but shouldn't be taken as definite right now).

comment by ahbwramc · 2014-06-26T14:01:03.645Z · LW(p) · GW(p)

Thanks for writing this. I want to really dig into this paper and make sure I understand it, but it certainly seems like an interesting approach. I'm curious why you say this, though:

Evidently, this approach suffers from a number of limits: the first and the most evident is that it works only in a situation where the system to be observed has already decohered with an environment. It is not applicable to, say, a situation where a detector reads a quantum superposition directly, e.g. in a Stern-Gerlach experiment.

Maybe I'm misunderstanding you, but I thought they addressed this issue:

(from the longer companion paper)

Actually, self-locating uncertainty is generic in quantum measurement. In Everettian quantum mechanics the wave function branches when the system becomes sufficiently entangled with the environment to produce decoherence. The normal case is one in which the quantum system interacts with an experimental apparatus (cloud chamber, Geiger counter, electron microscope, or what have you) and then the observer sees what the apparatus has recorded. For any realistic room-temperature experimental apparatus, the decoherence time is extremely short: less than 10^20 seconds. Even if a human observer looks at the quantum system directly, the state of the observer's eyeballs will decohere in a comparable time. In contrast, the time it takes a brain to process a thought is measured in tens of milliseconds. No matter what we do, real observers will find themselves in a situation of self-locating uncertainty (after decoherence, before the measurement outcome has been registered).

As long as there is macroscopic decoherence before the observer has time to register any thoughts, the approach seems to hold, and that's certainly the case for Stern-Gerlach experiments.

Replies from: ThisSpaceAvailable, MrMind
comment by ThisSpaceAvailable · 2014-06-29T19:00:24.916Z · LW(p) · GW(p)

the decoherence time is extremely short: less than 10^20 seconds

I take it that's supposed to be 10^-20 seconds?

comment by MrMind · 2014-06-26T14:18:23.923Z · LW(p) · GW(p)

Let me begin by saying that I've only glanced at the companion paper very briefly and, although I have noticed the paragraph you quote, I may be unaware of other parts that directly address my response.

My remark that the approach wouldn't work in a Stern-Gerlach experiment was aimed at the three-step structure of the experiment, not at the decoherence happening. If we consider the Stern-Gerlach apparatus as the observer, sure it decoheres, but there's no middle environment upon which to distribute the measure of the system observed.

To make the Carroll-Sebens procedure work, you need both a three-step experiment and a wide middle environment, so it won't work in any case where one of the elements is missing.

comment by ThisSpaceAvailable · 2014-06-29T19:10:46.805Z · LW(p) · GW(p)

for every base state composing a quantum superposition.

When you say such terms as "base state" and "superposition", are you assuming a preferred basis?

the probability of an observer to find the system in that branch is the norm of the amplitude.

Isn't it the squared norm?

This, very briefly, is the content of the Born rule (for pure states).

This seems a bit too brief to me. If we wish merely to assert that there is a number that corresponds to the probability of finding oneself in a branch, that is much too trivial an assertion to demand explanation.

As an example, consider a superposition of a quantum bit, and say that one branch has a higher measure with respect to the other by a factor of square root of 2. The environment needs in this case to have at least 8 different base states to be relabeled in such a way to make the indifference principle work.

Can you explain the math on this?

Replies from: MrMind
comment by MrMind · 2014-06-30T08:17:28.119Z · LW(p) · GW(p)

When you say such terms as "base state" and "superposition", are you assuming a preferred basis?

Well, yes and no. The paper itself doesn't assume a preferred basis, but surely to have a completely coherent picture of MWI you need to solve the preferred basis problem.
Decoherence supposedly does that with the concept of pointer basis.

Isn't it the squared norm?

Yes, absolutely! Edited, thanks.

This seems a bit too brief to me. If we wish merely to assert that there is a number that corresponds to the probability of finding oneself in a branch, that is much too trivial an assertion to demand explanation.

Well, the probability of a branch is not just a number given by the theory; it is a number derived from a complex amplitude, and this has a number of consequences that make it non-trivial. Plus, the Born rule in the context of mixed states has a more complex formulation.
In general though, since it's usually (that is, outside of MWI) assumed as an axiom, I guess you could say that it's trivial.

Can you explain the math on this?

I'm afraid not better than the original authors, so if you really want to get the math you'll have to look at the article.