# Joint Configurations

post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-04-11T05:00:58.000Z · score: 41 (43 votes) · 40 comments

The key to understanding configurations, and hence the key to understanding quantum mechanics, is realizing on a truly gut level that configurations are about more than one particle.

Continuing from the previous essay, Figure 1 shows an altered version of the experiment where we send in *two* photons toward *D* at the same time, from the sources *B* and *C*.

The starting configuration then is:

“a photon going from B to D,

and a photon going from C to D.”

Again, let’s say the starting configuration has amplitude $(-1 + 0i)$.

And remember, the rule of the half-silvered mirror (at *D*) is that a right-angle deflection multiplies by *i*, and a straight line multiplies by 1.

So the amplitude flows from the starting configuration, separately considering the four cases of deflection/non-deflection of each photon, are:

1. The “*B* to *D*” photon is deflected and the “*C* to *D*” photon is deflected. This amplitude flows to the configuration “a photon going from *D* to *E*, and a photon going from *D* to *F*.” The amplitude flowing is $(-1 + 0i) \times i \times i = (1 + 0i)$.
2. The “*B* to *D*” photon is deflected and the “*C* to *D*” photon goes straight. This amplitude flows to the configuration “two photons going from *D* to *E*.” The amplitude flowing is $(-1 + 0i) \times i \times 1 = (0 - i)$.
3. The “*B* to *D*” photon goes straight and the “*C* to *D*” photon is deflected. This amplitude flows to the configuration “two photons going from *D* to *F*.” The amplitude flowing is $(-1 + 0i) \times 1 \times i = (0 - i)$.
4. The “*B* to *D*” photon goes straight and the “*C* to *D*” photon goes straight. This amplitude flows to the configuration “a photon going from *D* to *F*, and a photon going from *D* to *E*.” The amplitude flowing is $(-1 + 0i) \times 1 \times 1 = (-1 + 0i)$.

Now—and this is a *very important and fundamental idea in quantum mechanics*—the amplitudes in cases 1 and 4 are flowing to the *same* configuration. Whether the *B* photon and *C* photon both go straight, or both are deflected, the resulting configuration is *one photon going toward E and another photon going toward F*.

So we add up the two incoming amplitude flows from case 1 and case 4, and get a total amplitude of $(1 + 0i) + (-1 + 0i) = 0$.

When we wave our magic squared-modulus-ratio reader over the three final configurations, we’ll find that “two photons at Detector 1” and “two photons at Detector 2” have the same squared modulus, but “a photon at Detector 1 and a photon at Detector 2” has squared modulus zero.
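The bookkeeping above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the starting amplitude is taken to be -1, the names `START`, `FACTOR`, and `DESTINATION` are made up for illustration, and the geometry (which move sends which photon toward *E* or *F*) is read off the four cases rather than derived from any optics.

```python
from collections import defaultdict

START = complex(-1, 0)                    # assumed starting amplitude
FACTOR = {"deflect": 1j, "straight": 1}   # half-silvered mirror rule at D

# Assumed geometry, read off the four cases: the B photon heads to E when
# deflected and to F when it goes straight; the C photon is the reverse.
DESTINATION = {
    ("B", "deflect"): "E", ("B", "straight"): "F",
    ("C", "deflect"): "F", ("C", "straight"): "E",
}


def final_amplitudes():
    """Sum the four amplitude flows into their final configurations."""
    totals = defaultdict(complex)
    for b_move, b_factor in FACTOR.items():
        for c_move, c_factor in FACTOR.items():
            flow = START * b_factor * c_factor
            # A configuration only says where photons are, so the destinations
            # are sorted: ("E", "F") and ("F", "E") become the same key.
            config = tuple(sorted((DESTINATION["B", b_move],
                                   DESTINATION["C", c_move])))
            totals[config] += flow
    return dict(totals)


amplitudes = final_amplitudes()
for config, amplitude in sorted(amplitudes.items()):
    print(config, "squared modulus:", abs(amplitude) ** 2)
```

Running it should show equal squared moduli for the two "both photons the same way" configurations and zero for the "one photon each way" configuration, because the flows from case 1 and case 4 land on the same dictionary key and cancel.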

Way up at the level of experiment, we never find Detector 1 and Detector 2 both going off. We’ll find Detector 1 going off twice, or Detector 2 going off twice, with equal frequency. (Assuming I’ve gotten the math and physics right. I didn’t actually perform the experiment.)

The configuration’s identity is *not*, “the *B* photon going toward *E* and the *C* photon going toward *F*.” Then the resultant configurations in case 1 and case 4 would not be equal. Case 1 would be, “*B* photon to *E*, *C* photon to *F*” and case 4 would be “*B* photon to *F*, *C* photon to *E*.” These would be two distinguishable configurations, if configurations had photon-tracking structure.

So we would not add up the two amplitudes and cancel them out. We would keep the amplitudes in two separate configurations. The total amplitudes would have non-zero squared moduli. And when we ran the experiment, we would find (around half the time) that Detector 1 and Detector 2 each registered one photon. Which doesn’t happen, if my calculations are correct.
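To see how much rides on that choice, here is a hedged sketch that runs the same four amplitude flows under both bookkeeping schemes. The `track_origin` flag is a hypothetical switch, not anything in physics: when it is on, configurations are keyed by "where the *B* photon went, where the *C* photon went"; when it is off, they are keyed only by where photons are. The starting amplitude of -1 and the geometry table are assumptions read off the four cases.

```python
from collections import defaultdict

START = complex(-1, 0)                    # assumed starting amplitude
FACTOR = {"deflect": 1j, "straight": 1}   # half-silvered mirror rule at D
DESTINATION = {                           # assumed geometry from the four cases
    ("B", "deflect"): "E", ("B", "straight"): "F",
    ("C", "deflect"): "F", ("C", "straight"): "E",
}


def squared_moduli(track_origin):
    """Squared modulus of each final configuration under one keying scheme."""
    totals = defaultdict(complex)
    for b_move, b_factor in FACTOR.items():
        for c_move, c_factor in FACTOR.items():
            b_dest = DESTINATION["B", b_move]
            c_dest = DESTINATION["C", c_move]
            # Ordered pair remembers which photon went where; sorted pair forgets.
            key = (b_dest, c_dest) if track_origin else tuple(sorted((b_dest, c_dest)))
            totals[key] += START * b_factor * c_factor
    return {key: abs(amp) ** 2 for key, amp in totals.items()}


def one_each_share(moduli):
    """Share of the total squared modulus in 'one photon at E, one at F'."""
    one_each = sum(v for k, v in moduli.items() if set(k) == {"E", "F"})
    return one_each / sum(moduli.values())


print("tracking origins:", one_each_share(squared_moduli(track_origin=True)))
print("no tracking:     ", one_each_share(squared_moduli(track_origin=False)))
```

With origin tracking the "one photon each" outcomes carry half of the total squared modulus; without it they carry none. Only the second matches what the essay says the detectors report.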

Configurations don’t keep track of where particles come from. A configuration’s identity is just, “a photon here, a photon there; an electron here, an electron there.” No matter how you get into that situation, so long as there are the same species of particles in the same places, it counts as the same configuration.

I say again that the question “What kind of information does the configuration’s structure incorporate?” has *experimental consequences*. You can deduce, from experiment, the way that reality itself must be treating configurations.

In a classical universe, there would be no experimental consequences. If the photon were like a little billiard ball that either went one way or the other, and the configurations were our beliefs about possible states the system could be in, and instead of amplitudes we had probabilities, it would not make a difference whether we tracked the origin of photons or threw the information away.

In a classical universe, I could assign a 25% probability to both photons going to *E*, a 25% probability of both photons going to *F*, a 25% probability of the *B* photon going to *E* and the *C* photon going to *F*, and 25% probability of the *B* photon going to *F* and the *C* photon going to *E*. Or, since I *personally* don’t care which of the two latter cases occurred, I could decide to collapse the two possibilities into one possibility and add up their probabilities, and just say, “a 50% probability that each detector gets one photon.”

With probabilities, we can aggregate events as we like, drawing our boundaries around sets of possible worlds as we please, and the numbers will still work out the same. The probability of two mutually exclusive events always equals the probability of the first event plus the probability of the second event.
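A small sketch of that freedom, using the 25% figures from the classical story. The dictionary layout and the helper names `merge` and `prob_some_photon_reaches_e` are illustrative assumptions; the point is only that a prediction computed from the fine-grained table and from the merged table comes out the same.

```python
from collections import defaultdict
from fractions import Fraction

QUARTER = Fraction(1, 4)

# Fine-grained classical beliefs, keyed by (B photon's destination, C photon's).
fine = {
    ("E", "E"): QUARTER,
    ("F", "F"): QUARTER,
    ("E", "F"): QUARTER,
    ("F", "E"): QUARTER,
}


def merge(probabilities):
    """Throw away photon origins by adding the probabilities of merged events."""
    coarse = defaultdict(Fraction)
    for (b_dest, c_dest), prob in probabilities.items():
        coarse[tuple(sorted((b_dest, c_dest)))] += prob
    return dict(coarse)


def prob_some_photon_reaches_e(probabilities):
    """Probability that at least one photon ends up heading toward E."""
    return sum(prob for dests, prob in probabilities.items() if "E" in dests)


coarse = merge(fine)
print(coarse[("E", "F")])                  # the merged "one photon each" event
print(prob_some_photon_reaches_e(fine))    # computed before merging
print(prob_some_photon_reaches_e(coarse))  # computed after merging
```

`Fraction` is used so the comparison is exact rather than subject to floating-point rounding.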

But you can’t arbitrarily collapse configurations together, or split them apart, in your model, and get the same experimental predictions. Our magical tool tells us the ratios of squared moduli. When you add two complex numbers, the squared modulus of the sum is not the sum of the squared moduli of the parts:

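$$\operatorname{Squared\_Modulus}(C_1 + C_2) \neq \operatorname{Squared\_Modulus}(C_1) + \operatorname{Squared\_Modulus}(C_2)$$

E.g.

$$\operatorname{Squared\_Modulus}((2 + i) + (1 - i)) = \operatorname{Squared\_Modulus}(3 + 0i) = 3^2 + 0^2 = 9$$

$$\operatorname{Squared\_Modulus}(2 + i) + \operatorname{Squared\_Modulus}(1 - i) = (2^2 + 1^2) + (1^2 + (-1)^2) = 5 + 2 = 7$$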

Or in the current experiment of discourse, we had flows of $(1 + 0i)$ and $(-1 + 0i)$ cancel out, adding up to 0, whose squared modulus is 0, where the squared modulus of the parts would have been 1 and 1.

If in place of Squared_Modulus, our magical tool was some linear function, any function where $F(X + Y) = F(X) + F(Y)$, then all the quantumness would instantly vanish and be replaced by a classical physics. (A *different* classical physics, not the same illusion of classicality we hallucinate from inside the higher levels of organization in our own quantum world.)
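The difference can be checked numerically. In this sketch `squared_modulus` follows the usual definition (real part squared plus imaginary part squared), while `linear_tool` is a purely hypothetical stand-in for "some linear function"; multiplying by 3 is an arbitrary choice. The two flows are the case 1 and case 4 amplitudes, assuming a starting amplitude of -1.

```python
def squared_modulus(z: complex) -> float:
    """Real part squared plus imaginary part squared."""
    return z.real ** 2 + z.imag ** 2


def linear_tool(z: complex) -> complex:
    """Hypothetical linear reader: satisfies F(X + Y) = F(X) + F(Y)."""
    return 3 * z


flow_1 = complex(1, 0)    # case 1: both photons deflected
flow_4 = complex(-1, 0)   # case 4: both photons go straight

# Squared modulus: merging the flows first gives a different answer
# than reading each flow separately and adding the readings.
merged_reading = squared_modulus(flow_1 + flow_4)
split_reading = squared_modulus(flow_1) + squared_modulus(flow_4)
print(merged_reading, split_reading)

# Linear tool: merging or splitting makes no difference to the reading.
print(linear_tool(flow_1 + flow_4) == linear_tool(flow_1) + linear_tool(flow_4))
```

With the squared modulus, whether the two flows count as one configuration or two changes the reading, so the grouping is a physical fact. With a linear reader the grouping would be a matter of taste.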

If amplitudes were just probabilities, they couldn’t cancel out when flows collided. If configurations were just states of knowledge, you could reorganize them however you liked.

But the configurations are nailed in place, indivisible and unmergeable without changing the laws of physics.

And part of what is nailed is the way that configurations treat multiple particles. A configuration says, “a photon here, a photon there,” not “*this* photon here, *that* photon there.” “*This* photon here, *that* photon there” does not have a different identity from “*that* photon here, *this* photon there.”

The result, visible in today’s experiment, is that you can’t factorize the physics of our universe to be about particles with individual identities.

Part of the reason why humans have trouble coming to grips with *perfectly normal* quantum physics, is that humans bizarrely keep trying to factor reality into a sum of individually real billiard balls.

Ha ha! Silly humans.

## 40 comments

Comments sorted by oldest first, as this post is from before comment nesting was available (around 2009-02-27).

What confuses me about the actual verification of these experiments is that they require perfect timing and distancing. How exactly do you make two photons hit a half-silvered mirror at exactly the same time? It seems that if you were off only slightly then the universe would necessarily have to keep track of both as individuals. In practice you're always going to be off slightly so you would think by your explanation above that this would in fact change the result of the experiment, placing a 25% probability on all four cases. Why doesn't it?

Jordan, the three answers to your question are:

1) Each photon is actually spread out in configuration space - I'll talk about this later - so an infinitesimal error in timing only creates an infinitesimal probability of both detectors going off, rather than a discontinuous jump.

2) Physicists have gotten good at doing things with bloody precise timing, so they can run experiments like this.

3) I didn't actually perform the experiment.

"The probability of two events equals the probability of the first event plus the probability of the second event."

Mutually exclusive events.

It is interesting that you insist that beliefs ought to be represented by classical probability. Given that we can construct multiple kinds of probability theory, on what grounds should we prefer one over the other to represent what 'belief' ought to be?

If the photon sources aren't in a superposition I think you have to represent the system with a vector of complex numbers. IANAP either.

Gray, fixed.

Jesus Christ, the complex plane. I half-remember that.

Eliezer, this may come as a shock, but I suspect there exists at least some minority of individuals beyond just me who will find the consistent use of complex numbers at all to be the most migraine-inducing part of this. You might also find that even those of us who supposedly know how to do some computation on the complex plane are likely to have little to no intuitive grasp of complex numbers. Emphasis on computation over understanding in mathematics teaching, while pervasive, does not tend to serve students well. I could be the only one for whom this applies, but I wouldn't bet heavily on it.

Thankfully, there's already a couple of "intuitive explanations" of complex numbers on the 'net. And I dug up the links. And the same site has a few articles about generalizations of the Pythagorean Theorem, on a related note. Basically, anyone who is having any trouble with the mathematical side of this is likely to find it a bit of a help. It's also a lot like overcoming bias in its explanatory approach, so there's that.

Imaginary/complex numbers:

http://betterexplained.com/articles/a-visual-intuitive-guide-to-imaginary-numbers/

http://betterexplained.com/articles/intuitive-arithmetic-with-complex-numbers/

General math:

http://betterexplained.com/articles/category/math/

I should probably also take this opportunity to swear up and down that I'm not trying to generate ad revenue for that site, but you'd have to take my word on it. I might also add that I'm quite certain I still don't understand complex numbers meaningfully, but that's a separate thing.

Opposite problem - I know pretty much what an imaginary number is, and even some applications of *i*. Numbers can have real and imaginary elements, fine. But I have no idea why they have an application here.

That said, this post makes a lot of intuitive sense to me, a humanities graduate, so this series is off to a pretty good start. If Eliezer's good at one thing, it's explaining complex-seeming things I know a little about in a very sensible and useful way.

Actually, I was in exactly the same position.

Then I actually read the article from the comment above (to refamiliarise myself with the maths), and was quite surprised to find that the article makes this relation very clear.

Worth a look. :)

Oh my god...imaginary numbers...they make sense now! Seriously, thank you for that link. I've gotten all the way through high school calculus without ever having imaginary numbers=rotation explained. Looking at the graph for 10 seconds completely explained that concept and why Eliezer was using imaginary numbers to represent when the photons were deflected.

*Given that we can construct multiple kinds of probability theory, on what grounds should we prefer one over the other to represent what 'belief' ought to be?*

Make sure you understand Cox's theorem, then exhibit for me two kinds of probability theory, then I will reply.

Here's what bugs me: Those two photons aren't going to be *exactly* the same, in terms of say frequency or maybe the angle they make against the table. So how close do they have to be for the configurations to merge? Or is that a Wrong Question? Perhaps if we left the photon emitter the same but changed the detector to one that could tell the difference in angle, then the experimental results would change? What if we use an angle-detecting photon detector but program it to dump the angle into /dev/null?

Regarding Larry's question about how close the photons have to be before they merge --

The solution to that problem comes from the fact that Eliezer's experiment is (necessarily) simplifying things. I'm sure he'll get to this in a later post so you might be better off waiting for a better explanation (or reading Feynman's *QED: The Strange Theory of Light and Matter*, which I think is a fantastically clear explanation of this stuff.) But if you're willing to put up with a poor explanation just to get it quicker...

In reality, you don't have just one initial amplitude of a photon at exactly time T. To get the *full* solution, you have to add in the amplitude of the photon arriving a little earlier, or a little later, and with a little smaller or a little larger wavelength, and even travelling faster or slower or not in a straight line, and possibly interacting with some stray electron along the way, and so on for a ridiculously intractable set of complications. Each variation or interaction shows up as a small multiplier to your initial amplitude.

But fortunately, *most* of these interactions cancel out over long distances or long times, just like the case of the two photons hitting opposite detectors, so in this experiment you can treat the photon as just having arrived at a certain time and you'll get very close to the right answer.

Or in other words--easier to visualize but perhaps misleading--the amplitude of the photons is "smeared out" a little in space and time, so it's not too hard to get them to overlap enough for the experiment to work.

--Jeff

Along the lines of what Larry asks: Obviously the angles are not going to be perfect. The two photons will come in with slightly different angles. So you would think the photons will not be perfectly indistinguishable. Now I wonder if it is the case that if you put your detectors in close, so that they "see" a relatively wide range of angles, then you get the interference and the photons are treated the same; whereas if you put your detectors far away, they might become sensitive to a smaller range of angles, so that they could distinguish the two photons, then the interference might go away?

But in that case, you could make an FTL signaling device by putting one detector at a distance, and moving the other detector from close to far. When close, you get interference and get two photons or none, while when far, you get single photons, a difference detectable by the remote detector. Clearly this can't happen.

I'm sure Jeff is right and that a fuller investigation of the wave equations would explain exactly what happens here. But it does point out one big problem with this level of description of the quantum world: the absence of a primary role for space and time. We just have events and configurations. How does locality and causality enter into this? Where do speed of light limitations get enforced? The world is fundamentally local, but some ways of expressing QM seem to ignore that seemingly important foundation.

With the setup you describe, the remote detector still has to wait until the photons would be expected to arrive to know whether the other detector had been moved, right? This seems like it would be exactly at-light-speed signaling.

Jeff, I would not call your last paragraph misleading. It is quantum mechanics--as opposed to quantum field theory--and I think QED is a gratuitously complicated way to answer the question. Perhaps it's a way to get from classical notions of particles to quantum mechanics, but it seems to me to go against the spirit of trying to understand QM on its own terms, rather than as a modification of classical particles.

I found your explanations of Bayesian probability enlightening, and I've tried to read several explanations before. Your recent posts on quantum mechanics, much less so. Unlike the probability posts, I find these ones very hard to follow. Every time I hit a block of 'The "B to D" photon is deflected and the "C to D" photon is deflected.' statements, my eyes glaze over and I lose you.

Dan, Emmett,

I hear you. Unfortunately, I can't put as much work into this as I did for the intuitive explanation of Bayes's Theorem, plus the subject matter is inherently more complicated.

Feynman's *QED* uses little arrows in 2D space instead of complex numbers (the two are equivalent). And if I had the time and space, I'd draw different visual diagrams for each configuration, and show the amplitude flowing from one to another...

But QM is also inherently more complicated than Bayes's Theorem, and takes more effort; plus I'm trying to explain it in less time... I'm not figuring that all readers will be able to follow, I'm afraid, just hoping that some of them will be.

If the problem is not that QM is *confusing* but that you can't follow what is being said *at all*, you probably want to be reading Richard Feynman's *QED* instead.

*Feynman's QED uses little arrows in 2D space instead of complex numbers (the two are equivalent).*

this is quite offtopic, but there are important analogies [1] [2] between Feynman diagrams, topology and computer science. being aware of this may make the topic easier to understand.

So you calculate events for photons that happen in parallel (e.g. one photon being deflected while another is deflected as well) the same way you would for when they occur in series (e.g. a photon being deflected, and then being deflected again)? It seems that in both cases you are multiplying the original configuration by -1 (i.e. i * i).

FWIW, my first instinct when I saw the diagram was to assume 2 starting configurations, one for each photon, though I guess the point of this post was that I can't do stuff like that. In fact, when I did the math that way, I came up with the photons hitting different detectors twice as much as when they both hit the same one.

I think I'll be picking up the Feynman book.....

Richard: Cox's theorem is an example of a particular kind of result in math, where you have some particular object in mind to represent something, and you come up with very plausible, very general axioms that you want this representation to satisfy, and then prove this object is unique in satisfying these. There are equivalent results for entropy in information theory. The problem with these results, they are almost always based on hindsight, so a lot of the times you sneak in an axiom that only SEEMS plausible in hindsight. For instance, Cox's theorem states that plausibility is a real number. Why should it be a real number?

Grey Area asked, "For instance, Cox's theorem states that plausibility is a real number. Why should it be a real number?" For that matter, why should the plausibility of the negation of a statement depend only on the *plausibility* of the statement? Mightn't the statement itself be relevant?

Gray Area,

In reply to "why a real number question", we might want to weaken the theory to the point where only equalities and inequalities can be stated. There are two weaker desiderata one might hold. Let (A|X) be the plausibility of A given X.

Transitivity: if (A|X) > (B|X) and (B|X) > (C|X), then (A|X) > (C|X)

Universal Comparability: one of the following must hold (A|X) > (B|X) (A|X) = (B|X) (A|X) < (B|X)

If you keep both, you might as well use a real number -- doing so will capture all of the desired behavior. If you throw out Transitivity, I have a series of wagers I'd like to make with you. If you throw out Universal Comparability, then you get lattice theories in which propositions are vertexes and permitted comparisons are edges.

On the other hand, you might find just a single real number too restrictive, so you use more than one. Then you get something like Dempster-Shafer theory.

In short, there are alternatives.

As for why should the plausibility of the negation of a statement depend only on the plausibility of the statement, the answer (I believe) is that we are considering only the sorts of propositions in which the Law of the Excluded Middle holds. So if we are using only a single real number to capture plausibility, we need f{(A|X),(!A|X)} = (truth|X) = constant, and we have no freedom to let (!A|X) depend on the details of A.

Well, an alternative way to suggest probability is the Right Way is stuff like dutch book arguments, or more generally, building up decision theory, and epistemic probabilities get generated "along the way"

The arguments I like are of the form that each step is basically along the lines of "if you don't follow this rule, you'll be vulnerable to that kind of stupid behavior, where stupid behavior basically means 'wasting resources without making progress toward fulfilling your goals, whatever they are'"

Frankly, I also like those arguments because mathematically, they're cleaner. Each step gets you something of the final result, and generally doesn't require anything more demanding than basic linear algebra.

It's nice to know Cox's Theorem is there, but it's not what, to me at least, would be a simple clean derivation.

Psy-Kosh,

Curiously, I have just the opposite orientation -- I like the fact that probability theory can be derived as an extension of logic totally independently of decision theory. Cox's Theorem also does a good job on the "punishing stupid behavior" front. If someone demonstrates a system that disagrees with Bayesian probability theory, when you find a Bayesian solution you can go back to the desiderata and say which one is being violated. But on the math front, I got nothing -- there's no getting around the fact that functional equations are tougher than linear algebra.

Cyan: I certainly admit that the ease of the math may be part of my reaction. Maybe if I was far more familiar with the theory of functional equations I'd find Cox's theorem more elegant than I do.

(I've read that if one makes a minor tweak to Cox's theorem, just letting go of the real number criteria and letting confidences be complex numbers, the same line of derivation more or less hands you quantum amplitudes. I haven't seen that derivation though, but if that's correct, it makes Cox's theorem even more appealing. QM for "free"! :))

The vulnerability ones though actively motivate the criteria rather than a list of reasonable sounding properties an extension to boolean logic ought to have. The basic criteria, I guess, could be described as "will you end up in a situation in which you'd knowingly willingly waste resources without in any way benefiting your goals?"

So each step is basically a "Mathematical Karma is going to get you and take away your pennies if you don't follow this rule." :)

But yeah, the bit about only reasonable extension of vanilla logic does make the Cox thing a bit more appealing. On the other hand, that may actively dissuade some from the Bayesian perspective. Specifically, constructivists, intuitionists in particular, for instance, may be hesitant of anything too dependent on law of excluded middle in the abstract. (This isn't an abstract hypothetical. I've basically ended up in a friendly argument a while back with someone that more or less had them rejecting the notion of using probability as a measure of belief/subjective uncertainty because it was an extension of boolean logic, and the guy didn't like law of excluded middle and basically was, near as I can make out, an intuitionist. (Some of the mathematical concepts he was bringing up seem to imply that))

Personally, I just really like the whole "math karma" flavor of each step of a vulnerability argument. Just a different flavor than most mathematical derivations for, well, anything, that I've seen. Not to mention, in some formulations, getting decision theory all at once with it.

I am having a bit of trouble with this series. I can see that you are explaining that reality consists of states with "amplitude" numbers assigned to each state.

- You seem to assign arbitrary numbers to the initial states and arbitrary amplitude-change rules to mirrors. Why is this in any way applicable to objective reality? Or are these numbers non-arbitrary? Or am I just missing something elementary?
- Why are states of photons or detectors complex numbers, and why is the mirror a function?
- How does time factor into all of this?

I am sorry. I should have read the rest of the series BEFORE starting to ask questions about this particular article. Please disregard my previous post.

Thanks for helping me remove the classic hallucination to now understand the double slit phenomenon.

Added reference to the writeup of this experiment in Wikipedia: http://en.wikipedia.org/wiki/Hong%E2%80%93Ou%E2%80%93Mandel_effect

HT Manfred.

The version in this comment doesn't work, though the main one in the article does. There's a period inside the URL. You should have http://en.wikipedia.org/wiki/Hong%E2%80%93Ou%E2%80%93Mandel_effect

Thx fixed.

Thank you for that. This is one of most interesting experiments I've seen, because in my interpretation, it's refuting a quantum ontological randomness more than confirming it.

Consider the case of 1 photon. It hits the splitter, the splitter establishes boundary conditions on the photon wave packet such that there is only one possible mode compatible with the splitter at any given time, and only 2 modes generally.

Now, two photons. The article says they have to match in phase, time, and polarization. Since they match, they will be deflected in the same way all the time, because the beam splitter is only compatible with one mode at a particular instance of time (for a particular phase and polarization?).

Yes, I know, Bell's Theorem, no hidden variables, yadda yadda yadda. I'm not convinced. Neither was Jaynes, and I find him clearer and cleverer than those who think the quantum world is magical and mysterious, and the world runs on telekinesis. He wasn't convinced by Bell, and in particular charged that Bell's analysis didn't include time varying hidden variables, which is of course the natural way to get the appearance of ontological randomness - have the hidden variable vary at smaller time scales than you are able to measure.

Although apparently not. Looks like the HOM effect has measured the time interval down to the relevant time scales. Hurrah! Ontological randomness is dead! Long live the Bayesian Conspiracy!

But I'd like to see the experiment done *without the splitter*. Do the photons ever go the same way without the splitter there to establish a boundary condition? If it's all just about photon entanglement and ontological randomness, shouldn't they? My prediction is that they wouldn't.

And yes, I realize that it's unlikely that I have resolved all the mysteries of quantum physics before breakfast. Still, that's the way it looks to me.

Wondering if Jaynes had ever commented on the HOM effect, I found no direct comment, but instead a wikipedia article: "The Jaynes–Cummings model (JCM) is a theoretical model in quantum optics. It describes the system of a two-level atom interacting with a quantized mode of an optical cavity..." Is he getting at the same thing here - of boundary conditions applied to wave packets? I don't know. Looks like in his last paper on quantum theory, Scattering of Light by Free Electrons, he's getting at the wave function as being physically real, and not the probability distribution of teeny tiny billiard balls.

And while I was at it, I found that Hess and Philipp have been pushing against Bell for time variation. Something to check out sometime.

It is not at all clear to me why a single half-mirror should result in two multiplications and not an addition in the case of there being more than one photon. After all they are added up when they strike the detector.

It is completely unmentioned why this would be the case, and would seem to bear explanation.

I'm not quite understanding it either, but if I'm slightly understanding correctly: use sound wave as an ANALOGY. The half-silver mirrors allow it to "resonate" (sound terms) and ricochet off at the same time, while full silver only allows it to ricochet. (this is most likely VERY WRONG. I just now got the reasoning behind complex numbers, rotation of planes, 3d waves, etc)

Another great explanation. What you describe suggests that the fabric of the universe is not made of particulate stuff, but rather informational and/or computational. Or am I reading too much into this?

A lot of the dumbed down science I read/watch (documentaries, popular sciency magazines, sciency websites, etc) suggests that this is exactly how a number of physicists view the world these days.

For example, often when there is debate about whether such and such theoretical effect breaks the conservation laws they speak in terms of information being conserved or destroyed, even though they are referring to things like photons and whatnot (e.g. the Thorne-Hawking-Preskill bet concerning Hawking Radiation).

I thought I understood quantum mechanics. I have studied it for years. Passed exams. Got a degree.

This is the first time I have ever heard of the HOM experiment, and it is causing a crisis in my mind, since it does not match what I know about quantum mechanics.

They really should at least tell us of this experiment and its results when they start teaching us. Even after these years I was still under the impression that quantum mechanics was about unbreakable limitations on the measurement of data.

This experiment proves that the universe itself does not work in an intuitive way. All the people who are trying to outwit quantum mechanics by measuring position and energy at the same time are not just attempting something impossible, what they are trying doesn't even make sense.

AAAAAAAAAAAGH!

If in place of Squared_Modulus, our magical tool was some linear function—any function where F(X + Y) = F(X) + F(Y)—then all the quantumness would instantly vanish and be replaced by a classical physics.

I am having trouble working out linearity of functions. Let's say we take a linear function F(x) = x + 5. Then we use the above linearity you mention F(5 + 6) = F(5) + F(6).

We get F(11) = F(5) + F(6).

If we work that out we get => 11 + 5 = 5+5 + 6+5.

The result is 16 =/= 21.

So, the linear function doesn't have linearity as its property?

I am confused.

I found the calculation of the amplitude flows for cases 1 to 4 confusing at first. I think the part I missed is that the reflection of a photon at a half silvered mirror multiplies the *configuration* (ie the state of *both* photons) by i. So in case 1 we get each photon multiplied by i twice, so the amplitude at *both* detectors is 1.