Common sense quantum mechanics

dvasya

post by dvasya · 2014-05-15T20:10:11.534Z · LW · GW · Legacy · 43 comments

Related to: Quantum physics sequence.

TLDR: Quantum mechanics can be derived from the rules of probabilistic reasoning. The wavefunction is a mathematical vehicle to transform a nonlinear problem into a linear one. The Born rule that is so puzzling for MWI results from the particular mathematical form of this functional substitution.

This is a brief overview of a recent paper in Annals of Physics (recently mentioned in Discussion):

Quantum theory as the most robust description of reproducible experiments (arXiv)

by Hans De Raedt, Mikhail I. Katsnelson, and Kristel Michielsen. Abstract:

It is shown that the basic equations of quantum theory can be obtained from a straightforward application of logical inference to experiments for which there is uncertainty about individual events and for which the frequencies of the observed events are robust with respect to small changes in the conditions under which the experiments are carried out.

In a nutshell, the authors use the "plausible reasoning" rules (as in, e.g., Jaynes' Probability Theory) to recover the quantum-physical results for the EPR and Stern–Gerlach experiments by adding a notion of experimental reproducibility in a mathematically well-formulated way and without any "quantum" assumptions. Then they show how the Schrodinger equation (SE) can be obtained from the nonlinear variational problem on the probability P for the particle-in-a-potential problem when the classical Hamilton-Jacobi equation holds "on average". The SE allows one to transform the nonlinear variational problem into a linear one, and in the course of said transformation, the (real-valued) probability P and the action S are combined into a single complex-valued function ~ P^(1/2) exp(iS), which becomes the argument of the SE (the wavefunction).

This casts the "serious mystery" of Born probabilities in a new light. Instead of the observed frequency being the square(d amplitude) of the "physically fundamental" wavefunction, the wavefunction is seen as a mathematical vehicle to convert a difficult nonlinear variational problem for inferential probability into a manageable linear PDE, where it so happens that the probability enters the wavefunction under a square root.

Below I will excerpt some math from the paper, mainly to show that the approach actually works, but outlining just the key steps. This will be followed by some general discussion and reflection.

1. Plausible reasoning and reproducibility

The authors start from the usual desiderata that are well laid out in Jaynes' Probability Theory and elsewhere, and add to them another condition:

There may be uncertainty about each event. The conditions under which the experiment is carried out may be uncertain. The frequencies with which events are observed are reproducible and robust against small changes in the conditions.

Mathematically, this is a requirement that the probability P(x|θ,Z) of observation x, given an uncertain experimental parameter θ and the rest of our knowledge Z, be maximally robust to small changes in θ, with that robustness itself independent of θ. Using log-probabilities, this amounts to minimizing the "evidence"

$\mathrm{Ev}=\ln\frac{P\left(x,y|\theta+\epsilon,Z\right)}{P\left(x,y|\theta,Z\right)}$

for any small ε so that |Ev| is not a function of θ (but the probability is).

2. The Einstein–Podolsky–Rosen–Bohm experiment

There is a source S that, when activated, sends a pair of signals to two routers R₁ and R₂. Each router R_i then sends the signal to one of its two detectors D_{i,+} or D_{i,–} (i = 1, 2). Each router can be rotated and we denote as θ the angle between them. The experiment is repeated N times yielding the data set {x₁,y₁}, {x₂,y₂}, ... {x_N,y_N} where x and y are the outcomes from the two detectors (+1 or –1). We want to find the probability P(x,y|θ,Z).

After some calculations it is found that the single-trial probability can be expressed as P(x,y|θ,Z) = (1 + xyE₁₂(θ)) / 4, where E₁₂(θ) = Σ_{x,y=±1} xy P(x,y|θ,Z) is a periodic function.

From the properties of Bernoulli trials it follows that, for a data set of N trials with n_xy total outcomes of each type {x,y},

$\mathrm{Ev} = \sum_{x,y=\pm1} n_{xy} \ln\frac{P\left(x,y|\theta+\epsilon,Z\right)}{P\left(x,y|\theta,Z\right)}$

and expanding this in a Taylor series it is found that

$\mathrm{Ev} = -\frac{N\epsilon^{2}}{2} \sum_{x,y=\pm1} \frac{1}{P\left(x,y|\theta,Z\right)} \left(\frac{\partial P\left(x,y|\theta,Z\right)}{\partial\theta}\right)^{2} + O(\epsilon^{3})$

The sum is the Fisher information I_F for P. The maximum robustness requirement means it must be minimized. Writing it as I_F = (dE₁₂(θ)/dθ)² / (1 – E₁₂(θ)²) and requiring it to be independent of θ, one finds that E₁₂(θ) = cos(θ I_F^(1/2) + φ), and since E₁₂ must be periodic in the angle, I_F^(1/2) is a natural number, so the smallest possible value is I_F = 1. Choosing φ = π it is found that E₁₂(θ) = –cos(θ), and we obtain the result that

$P\left(x,y|\theta,Z\right) = \frac{1 - xy\cos\theta}{4}$

which is the well-known correlation of two spin-1/2 particles in the singlet state.

Needless to say, our derivation did not use any concepts of quantum theory. Only plain, rational reasoning strictly complying with the rules of logical inference and some elementary facts about the experiment were used.
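The result of this section can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names, the frequency parameter `k`, and the replacement of the observed counts n_xy by their expected values N·P(x,y|θ,Z) are assumptions made for illustration. It evaluates the Fisher information of P(x,y|θ,Z) = (1 + xyE₁₂(θ))/4 with E₁₂(θ) = –cos(kθ) by finite differences and compares the exact evidence with its leading Taylor term.

```python
import math

# The four possible outcome pairs (x, y) of one trial.
OUTCOMES = [(x, y) for x in (1, -1) for y in (1, -1)]

def eprb_probability(x, y, theta, k=1):
    """P(x,y|theta,Z) = (1 + x*y*E12)/4 with E12 = -cos(k*theta).

    k is an illustrative knob (assumption): k = 1 is the case in the text.
    """
    return (1.0 - x * y * math.cos(k * theta)) / 4.0

def fisher_information(theta, k=1, h=1e-5):
    """Sum over outcomes of (dP/dtheta)^2 / P, using central differences."""
    total = 0.0
    for x, y in OUTCOMES:
        dp = (eprb_probability(x, y, theta + h, k)
              - eprb_probability(x, y, theta - h, k)) / (2.0 * h)
        total += dp * dp / eprb_probability(x, y, theta, k)
    return total

def evidence(theta, eps, n_trials, k=1):
    """Ev = sum n_xy * ln(P(theta+eps)/P(theta)), with n_xy = N*P (assumption)."""
    total = 0.0
    for x, y in OUTCOMES:
        p = eprb_probability(x, y, theta, k)
        p_shifted = eprb_probability(x, y, theta + eps, k)
        total += n_trials * p * math.log(p_shifted / p)
    return total
```

At any angle where all four probabilities are nonzero, `fisher_information(theta, k)` should come out close to k², so k = 1 is the most robust nontrivial periodic choice, and `evidence(theta, eps, N)` should approach –Nε²/2 for small ε.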

3. The Stern–Gerlach experiment

This case is analogous to, and simpler than, the previous one. The setup contains a source emitting a particle with magnetic moment S, a magnet with field in the direction a, and two detectors D₊ and D_–.

Similarly to the previous section, P(x|θ,Z) = (1 + xE(θ)) / 2, where θ is the angle between a and S (so that cos θ = a·S for unit vectors) and E(θ) = P(+|θ,Z) – P(–|θ,Z) is an unknown periodic function. By complete analogy we seek the minimum of I_F and find that E(θ) = ±cos(θ), so that

$P\left(x|\theta,Z\right) = \frac{\left(1 + x\mathbf{a}\cdot\mathbf{S}\right)}{2}$

In quantum theory, [this] equation is in essence just the postulate (Born’s rule) that the probability to observe the particle with spin up is given by the square of the absolute value of the amplitude of the wavefunction projected onto the spin-up state. Obviously, the variability of the conditions under which an experiment is carried out is not included in the quantum theoretical description. In contrast, in the logical inference approach, [equation] is not postulated but follows from the assumption that the (thought) experiment that is being performed yields the most reproducible results, revealing the conditions for an experiment to produce data which is described by quantum theory.

To repeat: there are no wavefunctions in the present approach. The only assumption is that a dependence of outcome on particle/magnet orientation is observed with robustness/reproducibility.
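For comparison, here is a small sketch of the statement that the inferred probability coincides with what the Born rule gives for a spin-1/2 particle. It is not from the paper: the helper names are hypothetical, and a and S are assumed to be unit vectors. The code builds the spinor pointing along S, applies the standard projector (1 + x a·σ)/2 onto outcome x along a, and compares the result with (1 + x a·S)/2.

```python
import numpy as np

# Pauli matrices stacked as SIGMA[i], i = x, y, z.
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def unit_vector(polar, azimuth):
    """Unit vector with the given spherical angles."""
    return np.array([
        np.sin(polar) * np.cos(azimuth),
        np.sin(polar) * np.sin(azimuth),
        np.cos(polar),
    ])

def spin_state(polar, azimuth):
    """Spinor that is the +1 eigenstate of S.sigma for S = unit_vector(polar, azimuth)."""
    return np.array([np.cos(polar / 2), np.exp(1j * azimuth) * np.sin(polar / 2)])

def born_probability(x, a, psi):
    """<psi| (1 + x a.sigma)/2 |psi>: Born probability of outcome x along a."""
    projector = (np.eye(2) + x * np.tensordot(a, SIGMA, axes=1)) / 2
    return float(np.real(np.vdot(psi, projector @ psi)))

def inference_probability(x, a, s):
    """The logical-inference result P(x|theta,Z) = (1 + x a.S)/2."""
    return (1 + x * np.dot(a, s)) / 2
```

The two functions should agree for any pair of directions, which is the sense in which the equation above "is in essence" the Born rule, even though no spinor appears in the inference derivation itself.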

4. Schrodinger equation

A particle is located in unknown position θ on a line segment [–L, L]. Another line segment [–L, L] is uniformly covered with detectors. A source emits a signal and the particle's response is detected by one of the detectors.

After going to the continuum limit of infinitely many infinitely small detectors and accounting for translational invariance it is possible to show that the position of the particle θ and of the detector x can be interchanged so that dP(x|θ,Z)/dθ = –dP(x|θ,Z)/dx.

In exactly the same way as before we need to minimize Ev by minimizing the Fisher information, which is now

$I_F=\int\frac{1}{P\left(x|\theta,Z\right)}\left(\frac{\partial P\left(x|\theta,Z\right)}{\partial x}\right)^{2}dx$

However, simply solving this minimization problem will not give us anything new because nothing so far accounted for the fact that the particle moves in a potential. This needs to be built into the problem. This can be done by requiring that the classical Hamilton-Jacobi equation holds on average. Using the Lagrange multiplier method, we now need to minimize the functional

$F\left(\theta\right)=\int\left\{\frac{1}{P\left(x|\theta,Z\right)}\left(\frac{\partial P\left(x|\theta,Z\right)}{\partial x}\right)^{2}+\lambda\left[\left(\frac{\partial S\left(x\right)}{\partial x}\right)^{2}+2m\left[V\left(x\right)-E\right]\right]P\left(x|\theta,Z\right)\right\}dx$

Here S(x) is the action (Hamilton's principal function). This minimization yields solutions for the two functions P(x|θ,Z) and S(x). It is a difficult nonlinear minimization problem, but it is possible to find a matching solution in a tractable way using a mathematical "trick". It is known that standard variational minimization of the functional

$Q\left(\theta\right)=\int{\left\{4\frac{\partial\psi^{*}\left(x|\theta,Z\right)}{\partial x}\frac{\partial\psi\left(x|\theta,Z\right)}{\partial x}+2m\lambda\left[V\left(x\right)-E\right]\psi^{*}\left(x|\theta,Z\right)\psi\left(x|\theta,Z\right)\right\}dx}$

yields the Schrodinger equation for its extrema. On the other hand, if one makes the substitution combining two real-valued functions P and S into a single complex-valued ψ,

$\psi\left(x|\theta,Z\right)=\sqrt{P\left(x|\theta,Z\right)}e^{iS\left(x\right)\sqrt{\lambda}/2}$

Q is immediately transformed into F, concluding the derivation of the Schrodinger equation. Incidentally, ψ is constructed so that P(x|θ,Z) = |ψ(x|θ,Z)|², which is the Born rule.
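The claim that the substitution turns Q into F can be spot-checked numerically. The sketch below is only an illustration under assumed inputs: the Gaussian P, the choice S(x) = sin x, the harmonic V, and the values of λ, m and E are arbitrary test data, not anything from the paper. It evaluates F from P and S directly, builds ψ = √P·exp(iS√λ/2), differentiates ψ on a grid, and evaluates Q.

```python
import numpy as np

def compare_functionals(lam=2.0, m=1.0, energy=0.5, n=4001, half_width=6.0):
    """Return (F, Q) for arbitrary test functions P, S, V (all assumptions)."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]

    p = np.exp(-x**2) / np.sqrt(np.pi)   # normalized Gaussian probability density
    dp = -2.0 * x * p                    # its analytic derivative
    s = np.sin(x)                        # arbitrary smooth "action"
    ds = np.cos(x)
    v = 0.5 * x**2                       # arbitrary potential

    # Integrand of F: Fisher term plus the Hamilton-Jacobi constraint.
    f_integrand = dp**2 / p + lam * (ds**2 + 2.0 * m * (v - energy)) * p

    # Wavefunction built from P and S, differentiated numerically.
    psi = np.sqrt(p) * np.exp(1j * s * np.sqrt(lam) / 2.0)
    dpsi = np.gradient(psi.real, x) + 1j * np.gradient(psi.imag, x)

    # Integrand of Q: kinetic-like term plus potential term.
    q_integrand = 4.0 * np.abs(dpsi)**2 + 2.0 * m * lam * (v - energy) * np.abs(psi)**2

    # Simple Riemann sums; the integrands are negligible at the grid edges.
    return float(np.sum(f_integrand) * dx), float(np.sum(q_integrand) * dx)
```

The two returned numbers should agree up to discretization error, because 4|∂ψ/∂x|² = (∂P/∂x)²/P + λP(∂S/∂x)² and |ψ|² = P hold pointwise for this substitution.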

Summing up the meaning of the Schrodinger equation in the present context:

Of course, a priori there is no good reason to assume that on average there is agreement with Newtonian mechanics ... In other words, the time-independent Schrodinger equation describes the collective of repeated experiments ... subject to the condition that the averaged observations comply with Newtonian mechanics.

The authors then proceed to derive the time-dependent SE (independently from the stationary SE) in a largely similar fashion.

5. What it all means

Classical mechanics assumes that everything about the system's state and dynamics can be known (at least in principle). It starts from axioms and proceeds to derive its conclusions deductively (as opposed to inductive reasoning). In this respect quantum mechanics is to classical mechanics what probabilistic logic is to classical logic.

Quantum theory is viewed here not as a description of what really goes on at the microscopic level, but as an instance of logical inference:

in the logical inference approach, we take the point of view that a description of our knowledge of the phenomena at a certain level is independent of the description at a more detailed level.

and

quantum theory does not provide any insight into the motion of a particle but instead describes all what can be inferred (within the framework of logical inference) from or, using Bohr’s words, said about the observed data

Such a treatment of QM is similar in spirit to Jaynes' Information Theory and Statistical Mechanics papers (I, II). Traditionally statistical mechanics/thermodynamics is derived bottom-up from the microscopic mechanics and a series of postulates (such as ergodicity) that allow us to progressively ignore microscopic details under strictly defined conditions. In contrast, Jaynes starts with minimum possible assumptions:

"The quantity x is capable of assuming the discrete values x_i ... all we know is the expectation value of the function f(x) ... On the basis of this information, what is the expectation value of the function g(x)?"

and proceeds to derive the foundations of statistical physics from the maximum entropy principle. Of course, these papers deserve a separate post.
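As a toy instance of the question Jaynes poses, here is a hedged sketch for a die whose average roll is known. The die example, the target mean, and the function names are illustrative assumptions. Maximizing entropy subject to a fixed expectation of x gives probabilities proportional to exp(–βx); the code finds β by bisection and then uses the distribution to answer "what is the expectation of g(x)?".

```python
import math

def maxent_distribution(values, target_mean, lo=-10.0, hi=10.0, iterations=200):
    """Maximum-entropy probabilities p_i ~ exp(-beta * x_i) with a given mean."""
    def weights_for(beta):
        return [math.exp(-beta * v) for v in values]

    def mean_for(beta):
        w = weights_for(beta)
        return sum(v * wi for v, wi in zip(values, w)) / sum(w)

    # The mean decreases monotonically with beta, so bisection applies.
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if mean_for(mid) > target_mean:
            lo = mid   # mean too high: need a larger beta
        else:
            hi = mid
    beta = (lo + hi) / 2.0
    w = weights_for(beta)
    z = sum(w)
    return [wi / z for wi in w], beta

def expectation(values, probabilities, g):
    """Expectation value of g(x) under the given distribution."""
    return sum(p * g(v) for v, p in zip(values, probabilities))
```

For example, `maxent_distribution([1, 2, 3, 4, 5, 6], 4.5)` returns a distribution skewed toward high faces, and `expectation(..., g=lambda v: v**2)` then gives the inferred mean square roll.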

This community should be particularly interested in how this all aligns with the many-worlds interpretation. Obviously, any conclusions drawn from this work can only apply to the "quantum multiverse" level and cannot rule out or support any other many-worlds proposals.

In quantum physics, MWI does quite naturally resolve some difficult issues in the "wavefunction-centristic" view. However, we see that the concept of the wavefunction is not really central for quantum mechanics. This removes the whole problem of wavefunction collapse that MWI seeks to resolve.

The Born rule is arguably a big issue for MWI. But here it essentially boils down to "x is quadratic in t where t = sqrt(x)". Without the wavefunction (only probabilities) the problem simply does not appear.

Here is another interesting conclusion:

if it is difficult to engineer nanoscale devices which operate in a regime where the data is reproducible, it is also difficult to perform these experiments such that the data complies with quantum theory.

In particular, this relates to the decoherence of a system via random interactions with the environment. Thus decoherence becomes not a physical, intrinsically quantum phenomenon of "worlds drifting apart", but a property of experiments that are not well isolated from the influence of the environment and therefore not reproducible. Well-isolated experiments are robust (and described by "quantum inference") and poorly isolated experiments are not (hence quantum inference does not apply).

In sum, it appears that quantum physics when viewed as inference does not require many-worlds any more than probability theory does.

43 comments

Comments sorted by top scores.

comment by gjm · 2014-05-17T22:21:55.387Z · LW(p) · GW(p)

I have no more than glanced at the paper. The following may therefore be a dumb question, in which case I apologize.

It seems as if one of the following must be true. Which?

1. The arguments of this paper show that classical mechanics could never really have been a good way to model the universe, even if the universe had in fact precisely obeyed the laws of classical mechanics.
2. The arguments of this paper show that actually there's some logical incoherence in the very idea of a universe that precisely obeys the laws of classical mechanics.
3. The arguments of this paper don't apply to a universe that precisely obeys the laws of classical mechanics, because of some assumption it makes (a) explicitly or (b) implicitly that couldn't be true in such a universe.
4. The arguments of this paper don't exclude a classical universe either by proving it impossible or by making assumptions that already exclude a classical universe, and yet they somehow don't show that such a universe has to be modelled quantumly.
5. I'm confused.

Replies from: Protagoras

↑ comment by Protagoras · 2014-05-19T02:51:37.698Z · LW(p) · GW(p)

2 wouldn't surprise me. A non-relativistic universe seems to have hidden incoherence (justifying Einstein's enormous confidence in relativity), so while my physics competence is insufficient to follow any similar QM arguments, it wouldn't shock me if they existed.

Replies from: private_messaging, gjm

↑ comment by private_messaging · 2014-05-19T17:22:18.630Z · LW(p) · GW(p)

What would you mean by incoherence, though?

There are plenty of possible cellular automata that are neither quantum mechanical nor relativistic, and it's not too hard to construct one that would approximate classical mechanics at macroscopic scale, but wouldn't in any way resemble quantum mechanics as we know it at microscale*, nor would be relativistic.

* Caveat: you could represent classical behaviour within a quantum mechanical framework, it's just that you wouldn't want to.

↑ comment by gjm · 2014-05-19T09:44:40.779Z · LW(p) · GW(p)

1 would disappoint me. 2 would surprise me but (for reasons resembling yours) not astonish me. 3 would be the best case and I'd be interested to know what assumptions. (The boundary between 2 and 3 is fuzzy. A nonrelativistic universe with electromagnetism like ours has problems; should "electromagnetism like ours" be considered part of "the very idea" or a further "assumption"?) 4 and 5 would be very interesting but (kinda obviously) I don't currently see how either would work.

Replies from: dvasya

↑ comment by dvasya · 2014-05-19T16:18:18.931Z · LW(p) · GW(p)

I certainly would not rule out number 5 ;) As for 3, the arguments seem to apply to any universe in which you can carry out a reproducible experiment. However, in a "classical universe" everything is, in principle, exactly knowable, and so you just don't need a probabilistic description.

Unless there is limited information, in which case you use statistical mechanics. With perfect information you know which microstate the system is in, the evolution is deterministic, there is no entropy (macrostate concept), hence no second law, etc. Only when you have imperfect information -- an ensemble of possible microstates, a macrostate -- mechanics "becomes" statistical.

Using probabilistic logic in a situation where classical logic applies is either overkill or underconfidence.

Replies from: gjm

↑ comment by gjm · 2014-05-19T18:54:38.450Z · LW(p) · GW(p)

In case it's less than perfectly clear, I am very much not ruling out number 5; that's why it's there. But for obvious reasons there's not much I can say about how it might be true and what the consequences would be.

Even in a classical universe your knowledge is always going to be incomplete in practice. (Perfectly precise measurement is not in general possible. Your brain has fewer possible states than the whole universe. Etc.) So probabilistic reasoning, or something very like it, is inescapable even classically. Regardless, though, it would be pretty surprising to me if mere "underconfidence" (supposing it to be so) required a quantum [EDITED TO ADD: model of the] universe.

Replies from: dvasya

↑ comment by dvasya · 2014-05-19T19:27:03.053Z · LW(p) · GW(p)

I'm not sure if we can say much about a classical universe "in practice" because in practice we do not live in a classical universe. I imagine you could have perfect information if you looked at some simple classical universe from the outside.

For classical universes with complete information you have Newtonian dynamics. For classical universes with incomplete information about the state you can still use Newtonian dynamics but represent the state of the system with a probability distribution. This ultimately leads to (classical) statistical mechanics. For universes with incomplete information about the state and about its evolution ("category 3a" in the paper) you get quantum theory.

[Important caveat about classical statistical mechanics: it turns out to be a problem to formulate it without assuming some sort of granularity of phase space, which quantum theory provides. So it's all pretty intertwined.]

comment by private_messaging · 2014-05-17T13:49:23.386Z · LW(p) · GW(p)

As far as I can tell, it's highly misleading for laymen. The postulates, as verbally described ("reproducible" is the worst offender by far), look generic and innocent - like something you'd reasonably expect of any universe you could figure out - but as mathematically introduced, they constrain the possible universes far more severely than their verbal description would.

In particular, one could have a universe where the randomness arises from the fine position of the sensor - you detect the particle if some form of binary hash of the bitstring of the position of the sensor is 1, and don't detect when the hash is 0. The experiments in that universe look like reproducible probability of detecting the particle, rather than non-reproducible (due to sensitivity to position) detection of particle. Thus "reproducible" does not constrain us to the universes where the experiments are non-sensitive to small changes.
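A toy version of such a universe can be sketched in a few lines. Everything here is a hypothetical illustration: SHA-256 stands in for "some form of binary hash", and the step size and sample count are arbitrary. Detection depends deterministically on the exact bit pattern of the sensor position, yet the detection frequency over many nearby positions looks like a stable probability.

```python
import hashlib
import struct

def detector_fires(position):
    """1 if the hash of the position's 64-bit float encoding has an odd first byte."""
    digest = hashlib.sha256(struct.pack("<d", position)).digest()
    return digest[0] & 1

def detection_frequency(center, n_samples=10000, step=1e-9):
    """Fraction of detections over many slightly different sensor positions."""
    hits = sum(detector_fires(center + i * step) for i in range(n_samples))
    return hits / n_samples
```

Individual outcomes flip unpredictably under tiny shifts of the sensor, while `detection_frequency` stays near one half wherever the sensor is centered, which is the distinction the comment draws between reproducible frequencies and robustness to small changes.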

Replies from: dvasya

↑ comment by dvasya · 2014-05-17T17:19:31.333Z · LW(p) · GW(p)

I'm not sure I understood you well, could you please elaborate? If the triggering of detectors depends only on the (known) positions of detectors then it seems your experiment should be well describable by classical logic.

Replies from: private_messaging

↑ comment by private_messaging · 2014-05-18T04:16:37.044Z · LW(p) · GW(p)

Position of anything is not known exactly.

The point is, they say in their verbal description something like "reproducible" and then in the math they introduce a very serious constraint on what happens if you move a detector a little bit, or they introduce rotational symmetry, or the like. As far as looking at the words could tell, they're deriving the fundamental laws from the concept of "reproducible".

But what they really do is putting the rabbit into the hat and then pulling it back out.

Which is obvious, even. There's a lot of possible universes which are reproducible and have some uncertainty, where QM is entirely useless, and those aren't going to be rendered impossible by a few tricks. It could be very worthwhile to work out what is the minimal set of assumptions necessary - much more worthwhile than trying to pretend that this set is smaller than it is.

comment by Emile · 2014-05-15T20:25:21.202Z · LW(p) · GW(p)

This is a brief overview a recent paper in Annals of Physics:

Quantum theory as the most robust description of reproducible experiments

by Hans De Raedt, Mikhail I. Katsnelson, and Kristel Michielsen.

(Note that a link to this was posted a few days ago.)

Replies from: dvasya

↑ comment by dvasya · 2014-05-15T20:29:39.621Z · LW(p) · GW(p)

Thanks. Took me a while to write the post.

comment by ESRogs · 2014-05-16T05:52:01.719Z · LW(p) · GW(p)

What do you think of Mitchell_Porter's comments on the other article discussing this paper?

Replies from: dvasya

↑ comment by dvasya · 2014-05-16T16:04:57.441Z · LW(p) · GW(p)

In short, they mostly seem far-fetched to me, probably due to a superficial reading of the paper (as Mitchell_Porter admits). For example:

I also noticed that the authors were talking about "Fisher information". This was unsurprising, there are other people who want to "derive physics from Fisher information"

The Fisher information in this paper arises automatically at some point and is only noted in passing. There is no more derivation from Fisher information than there is from the wavefunction.

they describe something vaguely like an EPR experiment ... a similarly abstracted description of a Stern-Gerlach experiment

The vagueness and abstraction are required to (1) precisely define the terms (2) under the most general conditions possible, i.e., the minimum information sufficient to define the problem. This is completely in line with Jaynes' logic that the prior should include all the information that we have and no other information (the maximum entropy principle). If you have some more concrete information about the specific instance of Stern-Gerlach experiment you are running then by all means you should include it in your probability assignment.

They make many appeals to symmetry, e.g. ... that the experiment will behave the same regardless of orientation. Or ... translational invariance.

Again, a reader who is familiar with Jaynes will immediately recognize here the principle of transformation groups (extension of principle of indifference). If nothing about the problem changes upon translation/rotation then this fact must be reflected in the probability distribution.

hope that some coalition of Less Wrong readers, knowing about both probability and physics, will have the time and the will to look more closely, and identify specific leaps of logic, and just what is actually going on in the paper

In fact, this is what I was trying to do here.

comment by Luke_A_Somers · 2014-05-16T01:30:47.354Z · LW(p) · GW(p)

The Born rule that is so puzzling for MWI results from the particular mathematical form of this functional substitution.

It's not MORE puzzling in MWI. It's just that under MWI you have enough of a reason to suspect that it ought to be the case that you're posed with the puzzle of whether you actually have enough to prove it. Under not-MWI, you have to import it whole cloth, which may feel less puzzling since we aren't so close to an answer.

~~~~

I find this an interesting notion, but I'm not sure quite what it means. This isn't an ontology. It provides no mechanism that would justify the relevance of its assumptions.

Replies from: dvasya, V_V

↑ comment by dvasya · 2014-05-16T16:21:36.774Z · LW(p) · GW(p)

I'm not sure "not-MWI" is a single coherent interpretation :) Under Copenhagen, for example, the Born rule has to be postulated. The present paper

does not support the Copenhagen interpretation (in any form)

MWI also postulates it, see V_V's comment.

As for the paper's assumptions, they seem to be no different than the assumptions of normal probabilistic reasoning as laid out by Cox/Polya/Jaynes/etc., with all that ensues in regard to relevance.

(edit: formatting)

Replies from: Luke_A_Somers

↑ comment by Luke_A_Somers · 2014-05-18T15:50:22.581Z · LW(p) · GW(p)

I have never seen anything other than MWI which even comes close to justifying the Born probabilities except by emulating MWI.

↑ comment by V_V · 2014-05-16T10:17:31.008Z · LW(p) · GW(p)

Under MWI the observation probabilities are indexical, but still MWI doesn't give an explanation of why these probabilities must have the specific values computed by the Born rule and not some other values. Thus, MWI assumes the Born rule as an axiom, just like most other interpretations of quantum mechanics.

Replies from: Luke_A_Somers, DanielFilan

↑ comment by Luke_A_Somers · 2014-05-18T15:48:45.650Z · LW(p) · GW(p)

If you suppose that branching ought to act like probability, then the Born rule follows directly (as pointed out by Born himself in the original paper and reproduced here by me several times). This is not the challenge for MWI. The problem is getting from Wavefunction realism to the notion that we ought to treat branching like probability with any sort of function at all.

Replies from: dvasya

↑ comment by dvasya · 2014-05-19T16:38:56.905Z · LW(p) · GW(p)

Luke, please correct me if I'm misunderstanding something.

The rule follows directly if you require that the wavefunction behaves like a "vector probability". Then you look for a measure that behaves like probability should (basically, nonnegative and adding up to 1). And you find that for this the wavefunction should be complex-valued and the probability should be its squared amplitude. You can also show that anything "larger" than complex numbers (e.g. quaternions) will not work.

But, as you said, the question is not how to derive the Born rule from "vector probability", but rather why would we make the connection of wavefunction with probability in the first place (and why the former should be vector rather than scalar). And in this respect I find the exposition that starts from probability and gets to the wavefunction very valuable.

Replies from: Luke_A_Somers

↑ comment by Luke_A_Somers · 2014-05-20T07:16:13.223Z · LW(p) · GW(p)

The two requirements are that it be on the domain of probabilities (reals on 0-1), and that they nest properly.

Quaternions would be OK as far as the Born rule is concerned - why not? They have a magnitude too. If we run into trouble with them, it's with some other part of QM, not the Born rule (and I'm not entirely confident that we do - I have hazy recollection of a formulation of the Dirac equation using quaternions instead of complex numbers).

Replies from: dvasya

↑ comment by dvasya · 2014-05-20T16:04:28.867Z · LW(p) · GW(p)

Here are some nice arguments about different what-if/why-not scenarios, not fully rigorous but sometimes quite persuasive: http://www.scottaaronson.com/democritus/lec9.html

↑ comment by DanielFilan · 2014-05-17T00:02:21.991Z · LW(p) · GW(p)

I'm not so sure that this is actually true. It has been shown that, given a fairly minimal set of constraints that don't mention probability, decision-makers in a MWI setting maximise expected utility, where the expectation is given with respect to the Born rule: http://arxiv.org/abs/0906.2718

Replies from: V_V, dvasya

↑ comment by V_V · 2014-05-18T15:26:38.779Z · LW(p) · GW(p)

Nice paper, thanks for linking it.

The quantum representation theorem is interesting, however, I don't think it really proves the Born rule.
If I understand correctly, it effectively assumes it (eq. 13, 14, 15) and then proves that given any preference ordering consistent with the "richness" and "rationality" axioms, there is a utility function such that its expectation w.r.t. the Born probabilities represents that ordering.

But the same applies to any other probability distribution, as long as it has its support inside the support of the Born probability distribution:
Let p(x) be the Born probabilities and u(x) be the original utility function. Let p'(x) be another probability distribution.
Then u'(x) = u(x) p(x)/p'(x) yields the correct preference ordering under expectation w.r.t. p'(x)
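The reweighting argument can be made concrete with a short sketch. The outcome labels and helper names are made up for illustration, and both distributions are assumed to be strictly positive on the same outcomes so that the ratio p(x)/p'(x) is defined.

```python
def expected_utility(probabilities, utility):
    """Sum over outcomes of p(x) * u(x); both arguments are dicts keyed by outcome."""
    return sum(probabilities[k] * utility[k] for k in probabilities)

def reweighted_utility(utility, p, p_alt):
    """u'(x) = u(x) * p(x) / p'(x), so that E_{p'}[u'] equals E_p[u]."""
    return {k: utility[k] * p[k] / p_alt[k] for k in utility}
```

Because p'(x)·u'(x) = p(x)·u(x) term by term, any act has the same expected utility under (p', u') as under (p, u), so the two pairs rank acts identically. Note that u' then depends on p(x) as well as on the reward, which is the point taken up in the reply below.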

Replies from: DanielFilan

↑ comment by DanielFilan · 2014-05-19T03:02:46.973Z · LW(p) · GW(p)

Equations 13, 14 and 15 introduce notation that isn't used in the axioms, so they don't really constitute an assumption that maximising Born-expected utility is the only rational strategy.

Your second paragraph has a subtle problem: the argument of u is which reward you get, but the argument of p might have to do with the coefficients of the branches in superposition.

To illustrate, suppose that I only care about getting Born-expected dollars. Then, letting $\psi_n$ denote the world where I get \$n, my preference ordering includes

$\psi_4 \succ \psi_3$

and

$\frac{1}{\sqrt{3}} \psi_0 + \sqrt{\frac{2}{3}} \psi_3 \sim \frac{1}{\sqrt{2}} \psi_0 + \frac{1}{\sqrt{2}} \psi_4$

You might wonder if my preferences could be represented as maximising utility with respect to the uniform branch weights: you don't care at all about branches with Born weight zero, but you care equally about all elements with non-zero coefficient, regardless of what that coefficient is. Then, if the new utility function is $U'$, we require

$U'(\$4) > U'(\$3)$

and

$\frac{1}{2}U'(\$0) + \frac{1}{2}U'(\$3) = \frac{1}{2}U'(\$0) + \frac{1}{2}U'(\$4)$

However, this is a contradiction, so my preferences cannot be represented in this way.
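The arithmetic behind this example can be verified with a small sketch. The function names and the representation of a superposition as a list of (amplitude, dollars) pairs are illustrative assumptions.

```python
import math

def born_expected_dollars(superposition):
    """Sum of |amplitude|^2 * dollars over branches."""
    return sum(abs(a) ** 2 * d for a, d in superposition)

def uniform_expected_utility(superposition, utility):
    """Average utility over branches with non-zero amplitude, ignoring the weights."""
    rewards = [d for a, d in superposition if abs(a) > 0]
    return sum(utility(d) for d in rewards) / len(rewards)

# The two superpositions between which the agent is indifferent.
LEFT = [(1 / math.sqrt(3), 0), (math.sqrt(2 / 3), 3)]
RIGHT = [(1 / math.sqrt(2), 0), (1 / math.sqrt(2), 4)]
```

Both superpositions have Born-expected value \$2, which is why the agent is indifferent between them. Under uniform branch weights the two averages differ by (U'(\$4) – U'(\$3))/2, so indifference would force U'(\$4) = U'(\$3), contradicting the strict preference for \$4 over \$3.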

Replies from: V_V

↑ comment by V_V · 2014-05-19T11:30:21.327Z · LW(p) · GW(p)

Equations 13, 14 and 15 introduce notation that isn't used in the axioms, so they don't really constitute an assumption that maximising Born-expected utility is the only rational strategy.

They are used in the last theorem.

... You might wonder if my preferences could be represented as maximising utility with respect to the uniform branch weights

I think this violates indifference to microstate/branching.

Replies from: DanielFilan

↑ comment by DanielFilan · 2014-05-19T12:02:45.457Z · LW(p) · GW(p)

They are used in the last theorem.

I agree that the notation that they introduce is used in the last two theorems (the Utility Lemma and the Born Rule Theorem), but I don't see where in the proof that they assume that you should maximise Born-expected utility. If you could point out which step you think does this, that would help me understand your comment better.

I think this violates indifference to microstate/branching.

I agree. This is actually part of the point: you can't just maximise utility with respect to any old probability function you want to define on superpositions, you have to use the Born rule to avoid violating diachronic consistency or indifference to branching or any of the others.

Replies from: V_V

↑ comment by V_V · 2014-05-19T14:39:38.037Z · LW(p) · GW(p)

I agree that the notation that they introduce is used in the last two theorems (the Utility Lemma and the Born Rule Theorem), but I don't see where in the proof that they assume that you should maximise Born-expected utility. If you could point out which step you think does this, that would help me understand your comment better.

It is used to define the expected utility in the statement of these two theorems, eq. 27 and 30.

This is actually part of the point: you can't just maximise utility with respect to any old probability function you want to define on superpositions, you have to use the Born rule to avoid violating diachronic consistency or indifference to branching or any of the others.

The issue is that the agent needs a decision rule that, given a quantum state, computes an action, and this decision rule must be consistent with the agent's preference ordering over observable macrostates (which has to obey the constraints specified in the paper).
If the decision rule has to have the form of expected utility maximization, then we have two functions which are multiplied together, which gives us some wiggle room between them.

If I understand correctly, the claim is that if you restrict the utility function to depend only on the macrostate rather than the quantum state, then the probability distribution must be the Born Rule.
It seems to me that while certain probability distributions are excluded, the paper didn't prove that the Born Rule is the only consistent distribution.

Even if it turns out that it is, the result would be interesting but not particularly impressive, since macrostates are defined in terms of projections, which naturally induce an L2 weighting. But defining macrostates this way makes sense precisely because there is the Born rule.

Replies from: DanielFilan

↑ comment by DanielFilan · 2014-05-20T03:47:57.453Z · LW(p) · GW(p)

It is used to define the expected utility in the statement of these two theorems, eq. 27 and 30.

Yes. The point of those theorems is to prove that if your preferences are 'nice', then you are maximising Born-expected utility. This is why Born-expected utility appears in the statement of the theorems. They do not assume that a rational agent maximises Born-expected utility, they prove it.

The issue is that the agent needs a decision rule that, given a quantum state, computes an action, and this decision rule must be consistent with the agent's preference ordering over observable macrostates (which has to obey the constraints specified in the paper).

Yes. My point is that maximising Born-expected utility is the only way to do this. This is what the paper shows. The power of this theorem is that other decision algorithms don't obey the constraints specified in the paper.

If the decision rule has to have the form of expected utility maximization, then we have two functions which are multiplied together, which gives us some wiggle room between them.

No: the functions are of two different arguments. Utility (at least in this paper) is a function of what reward you get, whereas the probability will be a function of the amplitude of the branch. You can represent the strategy of maximising Born-expected utility as the strategy of maximising some other function with respect to some other set of probabilities, but that other function will not be a function of the rewards.
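The re-representation mentioned above can be written out explicitly. As a sketch, take normalised amplitudes $a_i$ and rewards $r_i$ for branch $i$, and let $q_i$ be any other set of positive weights; these symbols are introduced here for illustration and are not the paper's notation.

```latex
\sum_i |a_i|^2 \, U(r_i) \;=\; \sum_i q_i \, \tilde{U}_i ,
\qquad
\tilde{U}_i \;=\; \frac{|a_i|^2}{q_i} \, U(r_i) .
```

With uniform weights $q_i = 1/n$ over $n$ branches this gives $\tilde{U}_i = n \, |a_i|^2 \, U(r_i)$, which depends on the amplitude of the branch as well as on the reward, so $\tilde{U}$ is not a function of the rewards alone.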

Even if it turns out that it is, the result would be interesting but not particularly impressive, since macrostates are defined in terms of projections, which naturally induce an L2 weighting. But defining macrostates this way makes sense precisely because there is the Born rule.

A macrostate here is defined in terms of a subspace of the whole Hilbert space, which of course involves an associated projection operator. That being said, I can't think of a reason why this doesn't make sense if you don't assume the Born rule. Could you elaborate on this?

↑ comment by dvasya · 2014-05-17T17:33:49.428Z · LW(p) · GW(p)

Can this argument be summarized in some condensed form? The paper is long.

Replies from: DanielFilan

↑ comment by DanielFilan · 2014-05-17T23:52:29.232Z · LW(p) · GW(p)

I'm not sure that the proof can be summarised in a comment, but the theorem can:

Suppose you are an agent that knows that you are living in an Everettian universe. You have a choice between unitary transformations (the only type of evolution that the world is allowed to undergo in MWI), that will in general cause your 'world' to split and give you various rewards or punishments in the various resulting branches. Your preferences between unitary transformations satisfy a few constraints:

Some technical ones about which unitary transformations are available.
Your preferences should be a total ordering on the set of the available unitary transformations.
If you currently have unitary transformation U available, and after performing U you will have unitary transformations V and V' available, and you know that you will later prefer V to V', then you should currently prefer (U and then V) to (U and then V').
If there are two microstates that give rise to the same macrostate, you don't care about which one you end up in.
You don't care about branching in and of itself: if I offer to flip a quantum coin and give you reward R whether it lands heads or tails, you should be indifferent between me doing that and just giving you reward R.
You only care about which state the universe ends up in.
If you prefer U to V, then changing U and V by some sufficiently small amount does not change this preference.

Then, you act exactly as if you have a utility function on the set of rewards, and you are evaluating each unitary transformation based on the weighted sum of the utility of the reward you get in each resulting branch, where you weight by the Born 'probability' of each branch.
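The conclusion of the theorem can be sketched as a decision rule. This is only an illustration under stated assumptions: each available unitary transformation is modelled as a list of (amplitude, reward) branches, and the act names, amplitudes, rewards, and linear `utility` function are all hypothetical rather than taken from the paper.

```python
def born_expected_utility(branches, utility):
    """Weighted sum of utilities over branches, weighted by |amplitude|^2.

    branches: list of (amplitude, reward) pairs resulting from one act.
    """
    total = sum(abs(a) ** 2 for a, _ in branches)
    return sum(abs(a) ** 2 / total * utility(r) for a, r in branches)

utility = lambda r: float(r)  # hypothetical utility function on rewards

# Hypothetical acts, each described by the branches it produces.
acts = {
    "U": [(0.6, 10), (0.8, 0)],     # Born weight 0.36 on reward 10
    "V": [(1.0, 3)],                # reward 3 in the single branch
    "W": [(0.6j, 10), (-0.8, 0)],   # same as U up to branch phases
}

scores = {name: born_expected_utility(b, utility) for name, b in acts.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(name, score)
print("preferred act:", best)
```

In this toy setup U is ranked above V, and W receives the same score as U because only the squared modulus of each amplitude enters the weighting.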

Replies from: dvasya

↑ comment by dvasya · 2014-05-19T16:58:01.084Z · LW(p) · GW(p)

Thanks! The list of assumptions seems longer than in the De Raedt et al. paper and you need to first postulate branching and unitarity (let's set aside how reasonable/justified this postulate is) in addition to rational reasoning. But it looks like you can get there eventually.

comment by Shmi (shminux) · 2014-05-15T22:31:34.643Z · LW(p) · GW(p)

Downvoted for reposting yet another untestable QM foundations paper, under a misleading title (there is nothing "common-sense" about QM).

In quantum physics, MWI does quite naturally resolve some difficult issues in the "wavefunction-centristic" view. However, we see that the concept wavefunction is not really central for quantum mechanics. This removes the whole problem of wavefunction collapse that MWI seeks to resolve.

Physical theories live and die by testing (or they ought to, unless they happen to be pushed by famous string theorists). I agree that "This removes the whole problem of wavefunction collapse", but only in the minds of philosophers of physics and some misguided philosophically inclined physicists. This paper adds nothing to physics.

Replies from: dvasya, buybuydandavis, raisin

↑ comment by dvasya · 2014-05-15T23:41:43.254Z · LW(p) · GW(p)

Thank you. The title plays on the idea of deriving quantum mechanics from the rules of "common-sense" probabilistic reasoning. Suggestions for a better title are, of course, welcome.

In my view this is not so much "QM foundations" or "adding to physics" (one could argue it takes away from physics) as it is an interesting application of Bayesian inference, providing another example of its power. It is however interesting to discuss it in the context of MWI which is a relatively big thing for some here on Less Wrong.

Regarding testability I'm reminded of the recent discussion at Scott Aaronson's blog: http://www.scottaaronson.com/blog/?p=1653

Replies from: shminux

↑ comment by Shmi (shminux) · 2014-05-16T00:00:15.342Z · LW(p) · GW(p)

I agree with everything Scott Aaronson said there, actually. As for the common sense, apparently our definitions of it differ. Furthermore, while I agree that this paper might be an interesting exercise in some mathematical aspects of Bayesian inference as applied to something or other, I question its relevance to physics in general and QM in particular.

↑ comment by buybuydandavis · 2014-05-16T21:22:21.035Z · LW(p) · GW(p)

I agree that "This removes the whole problem of wavefunction collapse", but only in the minds of philosophers of physics and some misguided philosophically inclined physicists. This paper adds nothing to physics.

Giving an alternative formalism that "clears up the mysteries" and suggests an approach for new problems is a huge advance, IMO.

Jaynes did this before for statistical mechanics. Now they've applied the same principles to Quantum Mechanics. Maybe they could apply the gauge extensions of maximum entropy to this new derivation as well.

↑ comment by raisin · 2014-05-16T12:48:56.194Z · LW(p) · GW(p)

I agree that "This removes the whole problem of wavefunction collapse", but only in the minds of philosophers of physics and some misguided philosophically inclined physicists. This paper adds nothing to physics.

Is physics important to you in ways other than how well it corresponds to reality? Physics relies on testing and experiments, but if we have another kind of system - let's call it bayesianism - and we have a reason to believe this other kind of system corresponds better to reality even though it doesn't rely perfectly on testing and experimenting, would you reject that in favor of physics? Why?

Replies from: shminux

↑ comment by Shmi (shminux) · 2014-05-16T14:34:37.832Z · LW(p) · GW(p)

if we have another kind of system - let's call it bayesianism - and we have a reason to believe this other kind of system corresponds better to reality even though it doesn't rely perfectly on testing and experimenting, would you reject that in favor of physics? Why?

Replace "bayesianism" with "Christianity" in the above and answer your own question.

The moment a model of the world becomes disconnected from "testing and experimenting" it becomes a faith (or math, if you are lucky).

Replies from: dvasya

↑ comment by dvasya · 2014-05-16T16:30:31.893Z · LW(p) · GW(p)

I guess one could argue that "bayesianism" (probability-as-logic) is testable practically and, indeed, well-tested by now. (But I still don't understand how raisin proposes to reject physics in favor of probability theory or vice versa.)

Replies from: raisin, shminux

↑ comment by raisin · 2014-05-16T18:22:27.431Z · LW(p) · GW(p)

But I still don't understand how raisin proposes to reject physics in favor of probability theory or vice versa.

Well, 'reject' was a bad word. Physics is fine for mostly everything. What I meant was that "bayesianism" could supplement physics in areas that are hard to test like MWI, parallel universes etc. Basically what Tegmark argues here.

↑ comment by Shmi (shminux) · 2014-05-16T16:55:59.710Z · LW(p) · GW(p)

I guess one could argue that "bayesianism" (probability-as-logic) is testable practically and, indeed, well-tested by now.

Well, sure, the techniques based on Bayesian interpretations of probabilities (subjective or objective) work at least as well as frequentist (not EYish straw-frequentist, but actual frequentist, Kolmogorov-style), and sometimes better. And yeah, I have no idea what raisin is on about. Bayesianism is not an alternative to physics, just one of its mathematical tools.

comment by Pattern · 2019-05-04T00:27:01.669Z · LW(p) · GW(p)

There are blank spaces in this article in place of formulas/equations.

comment by buybuydandavis · 2014-05-16T21:12:23.632Z · LW(p) · GW(p)

From your description, I'd summarize as - all the equations, none of the bullshit. Consistent derivation from first principles with fewer postulates.

Does that sound right to you?

Do you see any problems for the approach, any mysteries to resolve?
