What a reduction of "probability" probably looks like

post by cousin_it · 2010-08-17T14:58:38.876Z · LW · GW · Legacy · 35 comments

Unlike my previous posts, this one isn't an announcement of some finished result. I just want to get some ideas out for public discussion. A big part of the credit goes to Wei Dai and Vladimir Nesov, though the specific formulations are mine.

Wei Dai wonders: what are probabilities, anyway? Eliezer wonders: what are the Born probabilities of? I cannot claim to know the answers, but I strongly hold that these questions are, in fact, answerable. And as evidence, I'll try to show how normal the answers might plausibly turn out to be.

Perhaps counterintuitively, the easiest way for probabilities to arise is not by postulating "different worlds" that you could "end up" in starting from now. No, the easiest setting is a single, purely deterministic world with only one possible future.

One

The first thought experiment goes like this. Imagine a coarse-grained classical universe whose physical laws are allowed to consume symbols from a "Tape", located outside the Matrix, which contains a sequence of ones and zeroes coming from a pseudorandom number generator. If we flip a coin in that world, the result of the flip will depend on several consecutive ones and zeroes read from the Tape. If the coin is "fair", it will come up heads 50% of the time, on average, as we advance along the Tape. (Yes, Virginia, in this setting the fairness would be a mathematical property of the coin, not of our own ignorance.) In such a world, creatures who are too computationally weak to predict the Tape will probably find useful a concept of "probability", and will indeed find "probabilistic" events happening with the limiting frequencies, standard deviations, etc. predicted by our familiar probability theory - even though the world has only one timeline that never branches.
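
A minimal sketch of this setup, assuming Python's built-in Mersenne Twister stands in for the Tape and that a flip reads three bits and takes their parity (both are illustrative choices, not part of the thought experiment; the names `make_tape`, `flip_coin` and `heads_frequency` are hypothetical):

```python
import random

def make_tape(seed):
    """The Tape: a fixed, fully deterministic bit sequence from a PRNG."""
    rng = random.Random(seed)
    while True:
        yield rng.getrandbits(1)

def flip_coin(tape, bits_per_flip=3):
    """One flip consumes several consecutive Tape bits.

    Assumption for illustration: the coin lands heads iff the parity
    of the consumed bits is even.
    """
    parity = 0
    for _ in range(bits_per_flip):
        parity ^= next(tape)
    return parity == 0

def heads_frequency(seed, n_flips):
    """Fraction of heads over n_flips, reading the Tape from its start."""
    tape = make_tape(seed)
    heads = sum(flip_coin(tape) for _ in range(n_flips))
    return heads / n_flips

if __name__ == "__main__":
    # Same seed, same Tape, same history: the world never branches.
    print(heads_frequency(42, 100_000))
    print(heads_frequency(42, 100_000))
```

Running it twice with the same seed gives the identical history, yet the long-run frequency of heads sits close to 1/2, which is all the inhabitants need for "probability" to be a useful concept.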

But we know that in our world, observations are not completely deterministic: they can be influenced by the mysterious Born probabilities. How could that kind of law arise from a Nature that doesn't contain it already?

Two

To understand the second thought experiment, you need to be able to imagine the MWI without the Born probabilities: just a big wavefunction evolving according to the usual laws, without any observers who could collapse it or experience probabilities (whatever that means). And imagine an outside observer that samples it according to different probability measures, and sees different worlds. An observer using the Born rule will see our familiar "2-world". But other observers looking at the same wavefunction can see the "3-world" and many, many other worlds. What do they look like? That is an empirical question, completely answerable by modern physics, that I don't know enough math to answer; but intuitively it seems that most rules different from the 2-rule should either reward or penalize interactions that lead to branching, so the other worlds look either like huge neverending explosions, or static crystals at close to absolute zero. It's entirely possible that only 2-sampling is "stable" enough to contain stars, planets, proteins and biological evolution - which, if you think about it, "explains" the "existence" of the 2-world without assuming the Born rule.
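
To make the "p-rules" concrete, here is a toy sketch, assuming a finite list of branch amplitudes and a rule that weights each branch by |amplitude|^p (the function names `p_rule` and `split_weight` are hypothetical). The second function checks the branching intuition in the simplest possible case: one branch splitting into k equal sub-branches while the 2-norm is preserved.

```python
def p_rule(amplitudes, p):
    """Probabilities proportional to |amplitude|**p; p = 2 is the Born rule."""
    weights = [abs(a) ** p for a in amplitudes]
    total = sum(weights)
    return [w / total for w in weights]

def split_weight(amplitude, k, p):
    """Total p-weight after one branch splits into k equal sub-branches.

    Unitary evolution preserves the 2-norm, so each sub-branch gets
    amplitude / sqrt(k). The result equals |amplitude|**p * k**(1 - p/2).
    """
    return k * (abs(amplitude) / k ** 0.5) ** p

if __name__ == "__main__":
    amplitudes = [0.6, 0.8]
    for p in (1, 2, 3):
        print(p, p_rule(amplitudes, p))
    # Weight of a unit-amplitude branch after a 4-way split:
    for p in (1, 2, 3):
        print(p, split_weight(1.0, 4, p))
```

In this toy case the 2-rule leaves the total weight of a split branch unchanged, the 1-rule doubles it and the 3-rule halves it, which is the sense in which rules other than the 2-rule reward or penalize branching. Whether that carries over to realistic physics is exactly the open question in the paragraph above.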

The above does sound like it could lead to an explanation of some statement vaguely similar to the Born rule, but to clarify matters even further - or confuse you even deeper - let us go on to...

Three

Imagine an algorithm running on a classical physical computer, sitting on a table in our quantum universe. The computer has the above-explained property of being "stable" under the Born rule: a weighted-majority of near futures ranked by the 2-norm have the computer correctly executing the next few steps, but for the 1-norm this isn't necessarily the case - the computer will likely glitch or self-destruct. (All computers built by humans probably have this property. Also note that it can be defined in terms of the wavefunction alone, without assuming weights a priori.) Then the algorithm will have "subjective anticipation" of an extremely weird kind: conditioned on the algorithm itself running faithfully in the future, it can conclude that some future histories with higher Born-weight are more likely. So if I'm some kind of Platonic mathematical algorithm, this says something about what I should expect to happen in the world.

This "explanation" has the big drawback that it doesn't explain experimental observations. Why should an apparatus measuring a property of one individual particle (say) give rise to observed probabilities predicted by quantum theory? I don't have an answer to that.

Moreover, I don't think the above thought experiments should be taken as a final answer to anything at all. The intent of this post was to show that confusing questions can and should be approached empirically, and that we can and should strive to achieve perfectly normal answers.

35 comments

Comments sorted by top scores.

comment by cousin_it · 2010-08-18T10:09:48.335Z · LW(p) · GW(p)

Another day of thinking about scenario 2 gave me a sci-fi idea: there's nothing stopping us here in 2-world from constructing "ships" that would reliably survive in 3-world. (Remember that 3-world is right here, around us, we just don't see it.) From our point of view the ship will self-destruct immediately upon launch, but the onboard computer will see a different picture: everything around it self-destructs and goes crazy, but the computer itself becomes awesomely powerful and can solve NP-complete problems in polynomial time. The idea is essentially identical to quantum suicide computing or reality editing. Of course, such a machine can never go back to our world.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-08-18T15:12:28.910Z · LW(p) · GW(p)

Isn't it a special case of what I discussed in the last paragraph here (which you dismissed)?

Replies from: cousin_it
comment by cousin_it · 2010-08-18T18:26:01.267Z · LW(p) · GW(p)

I didn't deny it was possible, but I did deny that we'd want an FAI to do it.

Replies from: FAWS
comment by FAWS · 2010-08-18T18:32:05.926Z · LW(p) · GW(p)

You could have linked to that comment so people who didn't read that thread could see where the idea came from.

comment by orthonormal · 2010-08-17T21:58:02.407Z · LW(p) · GW(p)

It's worth noting that the conservation law for the L^2 norm is not a general condition, but a very special feature. We start talking about things as physically real when there's a conservation law involved (mass/energy, charge, momentum, etc). Other norms for the wavefunction don't have this property (except perhaps a few pathologically complex norms concocted for that purpose).

Replies from: cousin_it
comment by cousin_it · 2010-08-17T22:00:52.384Z · LW(p) · GW(p)

Good, I thought as much. Do you have links to papers about such exotic norms?

Replies from: orthonormal
comment by orthonormal · 2010-08-17T22:38:47.687Z · LW(p) · GW(p)

The existence of such exotic norms is just a guess. I do know for certain that other L^p norms and Sobolev norms aren't conserved over time (one can bound their rate of growth in some cases, but that's not nearly as special), but my relevant math books are in another city. I'll see if I can find a reference.
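
A finite-dimensional toy check of the claim about L^p norms, assuming a two-component state vector and a single Hadamard-like unitary step (this is only an illustration, not the continuous Schrödinger setting the comment has in mind; `p_norm` and `apply` are hypothetical helper names):

```python
import math

def p_norm(vec, p):
    """The l^p norm of a finite vector of amplitudes."""
    return sum(abs(x) ** p for x in vec) ** (1 / p)

def apply(matrix, vec):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

# A real unitary (orthogonal) 2x2 matrix: the Hadamard transform.
s = 1 / math.sqrt(2)
H = [[s, s],
     [s, -s]]

psi = [1.0, 0.0]     # state before the step
phi = apply(H, psi)  # state after the step

if __name__ == "__main__":
    for p in (1, 2, 3):
        print(p, p_norm(psi, p), p_norm(phi, p))
```

Here the 2-norm stays at 1, while the 1-norm grows to sqrt(2) and the 3-norm shrinks below 1, so only the 2-norm is conserved by this step.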

comment by SilasBarta · 2010-08-17T15:31:32.666Z · LW(p) · GW(p)

My thoughts:

(Yes, Virginia, in this setting the fairness would be a mathematical property of the coin, not of our own ignorance.) In such a world, creatures who are too computationally weak to predict the Tape will probably evolve a concept of "probability",

I'm confused here -- if the coin's randomness really is fundamental, and not a property of our ignorance, then it doesn't make sense to say that a being is too computationally weak to predict it -- no amount of computational strength would allow prediction.

(I'm also confused at how the non-native speakers here so effortlessly use colloquialisms like "Yes, Virginia ...", which came from a famous "Yes, Virginia, there is a Santa Claus...", but whatever.)

Isn't Two a restatement of the anthropic explanation for the Born rule: we could only see this kind of universe if the Born rule were true? Other universes would permit "anthropic hypercomputation", which fundamentally changes the game, or fail to permit something we recognize as minds.

Replies from: cousin_it, BenP
comment by cousin_it · 2010-08-17T15:41:53.563Z · LW(p) · GW(p)

About your first question: I use "randomness" in a sense that doesn't have anything to do with unpredictability. It only relies on observed long-run statistical properties: limiting frequency, stddev, law of large numbers, frequencies of substrings... For example, the binary expansion of pi works fine for my purposes (if pi is a normal number), even though it's perfectly predictable by an algorithm.
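
As a rough illustration of the pi example, here is a sketch that computes the leading binary digits of pi with integer arithmetic (via Machin's formula, chosen here only for convenience) and measures the frequency of ones. The function names are hypothetical, and a balanced finite prefix is of course not a proof of normality.

```python
def arctan_inv(x, scale):
    """floor-ish value of arctan(1/x) * scale, via the integer Taylor series."""
    term = scale // x
    total = term
    x2 = x * x
    n = 1
    sign = -1
    while term:
        term //= x2
        n += 2
        total += sign * (term // n)
        sign = -sign
    return total

def pi_bits(n_bits):
    """First n_bits binary digits of pi after the binary point."""
    guard = 32  # extra bits to absorb rounding error in the series
    scale = 1 << (n_bits + guard)
    # Machin: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 4 * (4 * arctan_inv(5, scale) - arctan_inv(239, scale))
    pi_scaled >>= guard
    frac = pi_scaled & ((1 << n_bits) - 1)  # drop the integer part (3)
    return [(frac >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]

if __name__ == "__main__":
    bits = pi_bits(10_000)
    print("frequency of ones:", sum(bits) / len(bits))
```

The digits are perfectly predictable by this algorithm, yet the observed frequency of ones comes out close to 1/2, which is the only sense of "randomness" the comment above relies on.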

About your second question: LW is one of my ways to avoid losing my grasp of English :-) And I'm still waiting for my chance to use "As you know, Bob" like Shalizi did.

About your third question: I don't think anthropic hypercomputation is the big blocking issue. After all, our brains don't seem to use quantum computing, even though it's available here in 2-world and offers significant speedups on problems like database lookups which sound pretty damn important! My idea is rather that the 3-world and friends are too crazy to support any life at all.

Replies from: SilasBarta
comment by SilasBarta · 2010-08-17T16:32:58.364Z · LW(p) · GW(p)

I use "randomness" in a sense that doesn't have anything to do with unpredictability. It only relies on observed long-run statistical properties: limiting frequency, stddev, law of large numbers, frequencies of substrings... For example, the binary expansion of pi works fine for my purposes (if pi is a normal number), even though it's perfectly predictable by an algorithm.

Okay, but then you shouldn't say that failing to know the sequence is not a property of his ignorance. If pi works here, then not knowing the next digit is indeed a fact about your ignorance (specifically, ignorance of the result of a known procedure).

Edit: never mind, I had misread that: yes, it makes sense to say that the agent is ignorant of the result, but that the randomness is not a fact of that agent's ignorance.

My idea is rather that the 3-world and friends are too crazy to support any life at all.

Yes, but that's still part of the anthropic argument for the Born rule, just on the other end of the boundary.

comment by BenP · 2010-08-17T15:38:12.624Z · LW(p) · GW(p)

I'm confused here -- if the coin's randomness really is fundamental, and not a property of our ignorance, then it doesn't make sense to say that a being is too computationally weak to predict it -- no amount of computational strength would allow prediction

He stated that the randomness is being provided by a pseudorandom number generator.

comment by BenP · 2010-08-17T15:46:09.758Z · LW(p) · GW(p)

I'm unfamiliar with the terminology "2-sampling", "2-world", "3-world", etc. and a quick internet search has not turned up anything useful. Could you summarize what they mean or direct me to a place that explains them?

Replies from: cousin_it
comment by cousin_it · 2010-08-17T15:48:57.441Z · LW(p) · GW(p)

I'd imagined these terms would be self-explanatory from the post :-) The numbers refer to variants of the Born rule where the exponent doesn't necessarily equal 2. For example, see page 6 of this paper by Aaronson.

Replies from: SilasBarta, BenP
comment by SilasBarta · 2010-08-17T16:37:00.429Z · LW(p) · GW(p)

I figured out what the n-world referred to because of familiarity with the problem, but it would probably still help to specifically define it, since it doesn't take much effort but would save readers a lot of time.

comment by BenP · 2010-08-17T15:57:58.473Z · LW(p) · GW(p)

Ah thanks. I should've been able to figure that out from your third thought experiment anyways.

comment by prase · 2010-08-17T15:52:44.316Z · LW(p) · GW(p)

By 2-world, 2-rule etc., do you mean that we calculate probabilities as the norm of the vector in an L_2 space?

Btw, since your previous post was named what a reduction of "could" could look like, I would find it more beautiful to name this post what a reduction of "probability" probably looks like.

Replies from: cousin_it
comment by cousin_it · 2010-08-17T15:54:32.273Z · LW(p) · GW(p)

Yes and suggestion accepted!

comment by utilitymonster · 2010-08-17T19:49:48.605Z · LW(p) · GW(p)

tl;dr Philosophers have been writing about what probabilities reduce to for a while. As far as I know, the only major reductionist view is David Lewis's "best system" account of laws (of nature) and chances. You can look for "best system" in this article for an intro. Barry Loewer has developed this view in this paper.

Replies from: cousin_it
comment by cousin_it · 2010-08-17T20:08:21.824Z · LW(p) · GW(p)

From what I understand of Lewis's view, it's not a "reduction" in my sense of the word which (I think) also coincides with common LW usage. I generally try to reduce phenomena to programs that can be implemented on computers; the first two scenarios in the post are of this kind, and the third one can probably be implemented as well, once I understand it a little better.

comment by bentarm · 2010-08-17T16:34:59.686Z · LW(p) · GW(p)

I'm not sure I understand the point of scenario 1. It doesn't seem to say much more than "if you can't distinguish them from random numbers, then pseudorandom numbers look a lot like random numbers".

I definitely don't understand scenarios 2 and 3, but that's (at least partially) due to my ignorance of quantum mechanics.

Replies from: cousin_it
comment by cousin_it · 2010-08-17T16:45:50.436Z · LW(p) · GW(p)

Scenario 1 was intended as an answer to Wei Dai's question at the first link, and an alternative to his own proposed answer (probabilities as "degrees of caring").

Replies from: Wei_Dai
comment by Wei Dai (Wei_Dai) · 2010-08-17T21:43:54.232Z · LW(p) · GW(p)

I think they're answering different questions. Your scenario 1 gives good reason to think that using probabilities is a good way to handle at least some types of logical uncertainty. My question was about empirical or indexical uncertainty. For example, why do we think one kind of universe is more likely to exist than another, or why do we think we're more likely to be in one kind of universe than another (depending on whether you accept the idea that all possible universes exist)?

Replies from: cousin_it
comment by cousin_it · 2010-08-17T21:50:30.572Z · LW(p) · GW(p)

Didn't your UDT solution to the Absent-Minded Driver show that probabilities aren't the right way to handle indexical uncertainty, even within one universe? I thought it was pretty convincing.

comment by Vladimir_Nesov · 2010-08-17T19:05:53.902Z · LW(p) · GW(p)

This suggests an anthropic explanation of our observations of measure given physics, but not of the other physical laws, which contribute greatly to the probabilities of observed events.

Replies from: cousin_it
comment by cousin_it · 2010-08-17T21:25:02.709Z · LW(p) · GW(p)

Deterministic physical laws could give rise to observed randomness without calling an external PRNG: they could be the PRNG, constantly "mixing" past microstates so that statistical laws hold on average. For example, cellular automata can lead to macroscopic pseudorandomness of this sort (I believe there are actual PRNGs based on them).
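
A small sketch of the cellular-automaton case, assuming Rule 30 started from a single live cell and reading off its center column as the "random" bit stream (the choice of Rule 30 and the function name `rule30_center_column` are illustrative assumptions, not something specified above):

```python
def rule30_center_column(steps):
    """Center-column bits of Rule 30, started from a single 1 cell.

    Rule 30 update: new_cell = left XOR (center OR right).
    The row is wide enough that the fixed zero boundary is never reached.
    """
    width = 2 * steps + 3
    cells = [0] * width
    mid = width // 2
    cells[mid] = 1
    column = []
    for _ in range(steps):
        column.append(cells[mid])
        nxt = [0] * width
        for i in range(1, width - 1):
            nxt[i] = cells[i - 1] ^ (cells[i] | cells[i + 1])
        cells = nxt
    return column

if __name__ == "__main__":
    bits = rule30_center_column(500)
    print(bits[:16])
    print("frequency of ones:", sum(bits) / len(bits))
```

The update rule is local and fully deterministic, with no external Tape, yet the center column looks statistically unstructured, which is the kind of "mixing" the paragraph above describes.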

You're right that this assumes physical laws as handed down from above, though. My post was not that ambitious. If you want to explain all of physics anthropically, best of luck, but I can't see how it could possibly work.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-08-17T22:01:58.744Z · LW(p) · GW(p)

If you want to explain all of physics anthropically, best of luck, but I can't see how it could possibly work.

I don't want to explain all of physics anthropically, I think I was pretty clear on that.

(And I don't understand how your first paragraph is related to my comment.)

Replies from: cousin_it
comment by cousin_it · 2010-08-17T22:05:39.849Z · LW(p) · GW(p)

Then I don't understand your original comment. Explain?

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2010-08-17T22:17:37.802Z · LW(p) · GW(p)

Your anthropic explanation of measure in the context of other physical laws sounds feasible, but I don't think it's a viable strategy for explaining physical laws themselves, or that anthropic explanations are as widely applicable as many people tend to apply them.

For example, you asked "why aren't we in that world instead of this one?", but I don't think it's a meaningful question. See what form your anthropic explanation of measure in the post has: it follows directly from the formalization of the question, as do all mathematical arguments. For example, you formalize "expected observations" as "those consistent with me continuing to operate". But you can't similarly "re-formalize" things that are already defined in an incompatible way: for example, you can't formalize a human as a dolphin, and thus seek to answer the question of why you are a human and not a dolphin.

This is actually relevant to recent discussions of decision theory: you always control fixed things, that are already defined. To control X, you must know what X is, and that definition determines the outcome of control as well, even "before" you've decided. Similarly, a question of "Why is X not Y?" presupposes a notion of X being Y, which is often clearly not possible, given definitions of X and Y, just as you can prove that in Newcomb's problem, the payoff of 17 doesn't ever happen, even without assuming anything about your possible action.

(Of course, this discussion diverges from the topic of the post, and returns to that of my prior comment linked above.)

Replies from: cousin_it
comment by cousin_it · 2010-08-17T23:02:52.547Z · LW(p) · GW(p)

On one hand, I agree with pretty much everything you wrote: anthropic explanations that actually work are very hard to come by; what I wrote sounds like the good kind; and extending it to other physical laws beyond measure won't work.

On the other hand, I'd be very cautious about declaring any question meaningless. For now I withhold judgment on the issue of dolphins because I don't know enough. "Round manhole covers are round by definition" is correct, but not always complete.