An additional problem with Solomonoff induction

gedymin

An additional problem with Solomonoff induction

post by gedymin · 2014-01-22T23:34:05.971Z · LW · GW · Legacy · 51 comments

51 comments

Let's continue from my previous post and look how Solomonoff induction fails to adequately deal with hypercomputation.

You may have heard of the Physical Church-Turing thesis. It's the idea that the Universe can, in a perfect level of detail, be simulated on a Turing machine. (No problem if the Universe turns out to be infinite - the thesis requires only that each finite portion of it can be simulated.) A corollary to the Physical CTT is the idea that there are no physically realizable uncomputable processes. We can talk about hypercomputers as mathematical abstractions, but we'll never be able to build (or see) one.

We don't have a very strong scientific evidence for the Physical CTT thesis yet - no one has built the perfect simulator yet, for example. On the other hand, we do have something - all known laws of physics (including quantum physics) allow arbitrary precision simulations on a computer. Even though the complete unified theory of all fundamental forces isn't there yet, the Standard model already makes pretty accurate predictions for almost all environments. (Singularities being the only known exception, as their neighborhoods are the only locations where the role of quantum gravity is not negligible.)

So the Physical CTT does not contradict any known laws of physics. Of course, these laws are not contradicted by a multitude of alternative hypotheses as well; all the hypotheses which suggest that the universe cannot be simulated on a Turing machine. We prefer the Physical CTT solely because it's the simplest one; because Occam's razor says so.

There are multiple levels and kinds of uncomputability; none of the problems placed on either of these levels are uncomputable in the absolute sense; all of them can computed by some hypothetical devices. And all of these devices are called hypercomputers. A corollary to the "Universe is uncomputable" position is that we someday may be able to, in fact, build a hypercomputer that is embedded in the physical world, or at least access some of the Nature's mystical, uncomputable processes as a black box.

Now, the "standard" Solomonoff induction uses the Universal prior, which in turn is related to Kolmogorov complexity (KC). An uncomputable process formally has undefined Kolmogorov complexity. Informally, the KC of such as process is infinitely large, as it must be larger than the KC of any computable process.

As discussed in the comments to the previous post, Solomonoff induction is by no means restricted to the Universal prior; it can use other priors, including a prior (i.e. probability distribution) defined over an universal hypercomputer. An example of such a hypercomputer is the combination Universal Turing machine + Halting problem oracle. Another example is the combination Universal Turing machine + a true random number oracle. An upgraded form of Solomonoff induction which uses a prior defined over the first kind of universal hypercomputer is going treat the halting problem as a very simple, computable process. An upgraded form of Solomonoff induction over the second kind of universal hypercomputer is going to treat random number generation as a very simple, computable process. And so on.

Now here's the problem. Mathematically, a Universal Turing machine equipped with the Halting oracle, and a Universal Turing machine equipped with a Random Number oracle are in the same class: they are both universal hypercomputers. Physically and practically, they are miles away from each another.

A Random Number oracle is just that: something that gives you random numbers. Their statistical properties won't even be particularly better than the properties a good pseudorandom number generator. They simply are, in a sense, "true"; therefore uncomputable. However, quantum physics suggests that Random Number oracles, in fact, might be real; i.e. that there are sources of true randomness in the Universe. This is QM interpretation-dependent, of course, but any deterministic, non-random interpretation of quantum mechanics involves things like faster-than light interaction etc., which frankly are much less intuitive.

A Halting oracle, in contrast, can solve infinite number of hugely important problems. It's magic. In my beliefs, the a priori probability that Universe contains some sort of Halting oracle is tiny. Only a huge amount of proper scientific tests could convince me to change this conclusion.

On the other hand, mathematically the two hypercomputers are siblings. Both can be approximated by a Turing machine. Both can even be computed by a deterministic algorithm (I think?), if the Turing machine that does the computation is allowed to work forever.

There is one significant mathematical difference between the two oracles. (Nevertheless, Solomonoff induction fails to take into account this difference.) The Halting oracle has a power on its own; it can be used to solve problems even when it comes without the accompanying Turing machine. The Random Number oracle cannot be used for anything but random number generation. (To solve any computable decision problem P with the Halting oracle, we can reformulate it as program source code: "if P, then halt; otherwise: loop forever" and feed this program to the Halting oracle. In this way the Halting oracle can be used to tell that 3<7 is true - the program halts -, while 10<5 is false - it loops forever.)

The Solomonoff induction can be fixed, if we assume that the input tape of the Universal prior's Turing machine contains infinite number of random bits. However, this idea needs an explicit justification, and its implications are not at all obvious. Does this mean that Occam's razor should be "prefer the simplest hypothesis, together with an infinite source of random numbers", instead of "prefer the simplest hypothesis"?

So to sum up, the problem is:

Intuitively, the probability that we are living in a Universe that includes True Random numbers is much larger than the probability that we are living in a Universe that allows Halting problem oracles;
Solomonoff induction cannot reliably distinguish between these two cases.

The consequences? When you hear someone claiming - again - that "computers are not capable of true creativity/consciousness/whatever, because creativity/consciousness/whatever requires human intuition, which is uncomputable, etc." - remember that it might be a bad idea to respond with an appeal to Solomonoff induction.

Interestingly, quite a few people praise their intuition and view it as almost a mystical power, but no one is surprised by their ability to name a few random numbers :)

A related question: how does finite, bounded universe fit into this? Does it make sense to use the Universal Turing machine as a generator for Kolmogorov complexity, when the actual model of computation required to simulate the universe is much simpler?

51 comments

Comments sorted by top scores.

comment by private_messaging · 2014-01-23T07:51:38.155Z · LW(p) · GW(p)

There's already an equivalent formulation where S.I. works by feeding an infinite string of random bits into the Turing machine on the input tape.

In a non-deterministic universe this works by setting up a code which converts subsequent bits into guesses with correct probability distribution.

I think the issue with S.I. is... well. Why on Earth would we even think it is a particularly good prior, to begin with? It is not even one prior, it's a bunch of priors that have little in common other than that the difference between a couple priors is bounded by a constant, which sounds good except that the constant can be big in comparison to anything you would care about. Granted, if we postulate that environment is a Turing machine, S.I. is a good prior, especially if based on the same machine, but that's a very tautological measure of "good".

If we look at the actual environment, it got those highly symmetrical laws of physics, which are even time-reversible (after mirroring and charge flip). The laws of physics are expressed compactly in terms of high level abstractions such as vectors, tensors, etc - the more symmetrical, the more compact. This works absolutely great - the baseline complexity of our laws of physics is low, the excess complexity for various insane theories (e.g. a doomsday at a specific date) is very high (giving them low priors).

Contrast that with S.I. The baseline complexity, as far as we can tell*, is very high - to say the least, it's difficult to make a simulation that's running on one dimensional tape but got 4-dimensional Lorentz symmetry, exact or approximate. And for each theory there's various slightly more complex (say, 10 extra bits) variations that will make the simulator do something insane and unpredictable from earlier data. Because digital simulators are incredibly fragile - with all those tags to make the Turing machine head do passes over the data.

Granted, one could postulate a yet-unknown way to code symmetrical laws of physics, but that just proves another point with regards to impracticality - you can postulate anything, e.g. a yet-unknown way to compactly code a god.

Replies from: gedymin, solipsist

↑ comment by gedymin · 2014-01-26T19:15:24.478Z · LW(p) · GW(p)

What's the physical significance of those infinite bits?

Replies from: private_messaging

↑ comment by private_messaging · 2014-01-26T20:53:25.962Z · LW(p) · GW(p)

Physical significance is not a part of predicting. The implementation details can differ vastly between different machines anyway. Call it what it is: a physics simulator written entirely in brainfuck, most of the code having to do with difficulty of coding things in brainfuck and having nothing to do with physics. Calling it a hypothesis makes you confuse it with what we get when we are striving to understand what actually really exists. Which we achieve by things such as not confusing the way we compute something, with the general laws something follows.

Also, an example of non-deterministic sequence. Let's suppose the sequence consists of lengths of 0s separated by 1s, with the length of consecutive 0s following a (truly random) binomial distribution with p=0.5 and n=20.

There will be a short prefix that works as following pseudocode:

  loader sequence that loads following stuff as code from the input and then runs it:
  loop forever:
       loop for 20 steps:
           read a from input
           if a==0 print '0'
       end loop
       print '1'
       end loop

and for a given output tape, this prefix will be the beginning of a fairly large fraction of the input tapes that produce given output tape, giving it a large prior, in the sense that the probability that the input tape that produces desired output begins with it, is big, compared to other contenders.

edit: i give up, markdown is completely retarded and eats all the indentation. edit2: also, restricting to programs that halt make the programs predict arbitrary doomsdays for no reason whatsoever, so I'm only considering formalisms where it is not restricted to programs that do halt.

Replies from: gwern, Lumifer

↑ comment by gwern · 2014-01-26T23:26:40.141Z · LW(p) · GW(p)

edit: i give up, markdown is completely retarded and eats all the indentation.

http://code.google.com/p/lesswrong/issues/detail?id=348

Replies from: private_messaging

↑ comment by private_messaging · 2014-01-27T05:48:46.469Z · LW(p) · GW(p)

what was weird is that the first line did get indented with 1 more space than the rest.

↑ comment by Lumifer · 2014-01-26T23:15:33.410Z · LW(p) · GW(p)

a physics simulator written entirely in brainfuck

That's an awesome expression.

↑ comment by solipsist · 2014-01-23T16:28:21.611Z · LW(p) · GW(p)

The baseline complexity, as far as we can tell*, is very high - to say the least, it's difficult to make a simulation that's running on one dimensional tape but got 4-dimensional Lorentz symmetry, exact or approximate.

I don't see why this should be the case. A simple descriptions usually implies a simple program (e.g. "for i from 3↑↑↑3 to infinity, if i encodes a multiverse with 4-dimensional Lorentz symmetry, output i"). What does a one dimensional tape have to do with anything?

Replies from: private_messaging

↑ comment by private_messaging · 2014-01-23T17:30:36.040Z · LW(p) · GW(p)

How are you going to check if i encodes such? Search through the programs looking for a program that tells if another program implements an universe the laws of physics of which got Lorentz symmetry? ;) (edit: the problem is not so much with programming but with describing symmetries. Exact real numbers can't even be described properly, let alone anything more fancy. Ergo, you can't even have exact translational symmetry)

One dimensional tape makes any multidimensional simulations difficult/impossible with exact symmetries. (Not that 3-dimensional grid would be any better, though).

Replies from: solipsist

↑ comment by solipsist · 2014-01-23T18:34:35.627Z · LW(p) · GW(p)

Exact real numbers can't even be described properly, let alone anything more fancy. Ergo, you can't even have translational symmetry

It seems like your suggesting that no short program would be able decide if a multiverse-history exhibited translational symmetry. I don't know enough physics to flat-out say "that's wrong", but I don't see why short programs should be unable to read universe-histories and reject any universe-history that violates conservation of momentum.

Replies from: private_messaging

↑ comment by private_messaging · 2014-01-23T19:58:50.608Z · LW(p) · GW(p)

Exact translational symmetry (edit: as in, can move literally any distance and the laws work the same) requires true real numbers.

As for the simple programs, well, first off those have to specify how the spacetime is encoded on the tape, to be able to tell any properties of the spacetime. Secondarily, they have to literally analyze the code to see if the code is close enough to invariant over some transformations on that spacetime, for which it has to define those transformations. Before you know it, you have the full code necessary for directly simulating the laws of physics and then some icing on top.

Granted, if we were writing a science fiction story, that would've been great - much anything can be claimed without much risk of an obvious wrongness...

edit: or is your point that we can find solutions to any system of equations by mere iteration of possibilities? That's not the issue, the issue is that we have to somehow encode the world as a TM tape. What we get is huge excessive complexity that has nothing to do with properties of the universe and everything to do with our choice of an incredibly impractical representation.

Replies from: solipsist

↑ comment by solipsist · 2014-01-23T20:38:02.605Z · LW(p) · GW(p)

Before you know it, you have the full code necessary for directly simulating the laws of physics and then some icing on top.

I agree! Is that a problem? I expect that program to be less than a few hundred bits long, so Solomonoff induction will zero in on the correct laws of physics with less than a few hundred bits* of evidence. That seems perfectly acceptable.

*a few hundred bits of evidence will take more than a few hundred bits of sensing data, unless the sensing data is incompressible.

edit: or is your point...

Nope, the first one

Replies from: private_messaging

↑ comment by private_messaging · 2014-01-23T20:45:28.769Z · LW(p) · GW(p)

agree! Is that a problem? I expect that program to be less than a few hundred bits long

You see, this is why I kind of don't even want to bother ever discussing it any more. Ohh, I expect this, I expect that... on what basis, none whatsoever. It's hundreds bits merely to locate the camera, for god's sake. Let's say I expect a god in 50 bits, for the sake of argument.

edit: and by the way, for the laws of physics as we know them - non discrete - an exact representation is not even possible.

*a few hundred bits of evidence will take more than a few hundred bits of sensing data, unless the sensing data is incompressible.

Up to BusyBeaver(a few hundreds) bits of sensing data (edit: which is due to it being easy to set up codes that are identical until very many bits are outputted) .

comment by MugaSofer · 2014-01-23T12:23:01.318Z · LW(p) · GW(p)

no one is surprised by their ability to name a few random numbers

... because they don't have it. If anything, people are surprised to learn just how bad humans are at choosing randomly, for example when faced with a simple rock-paper-scissors program* (or a genuinely skilled RPS player, for that matter.)

*(Here's one, or you could google around for one - there a re a lot online.)

comment by MrMind · 2014-01-24T09:40:14.563Z · LW(p) · GW(p)

However, quantum physics suggests that Random Number oracles, in fact, might be real; i.e. that there are sources of true randomness in the Universe.

They are not the same kind of random. A random oracle needs to answer the same query with the same number, while tipically QM does the opposite of that.

his is QM interpretation-dependent, of course, but any deterministic, non-random interpretation of quantum mechanics involves things like faster-than light interaction etc., which frankly are much less intuitive.

An interpretation is just an ontological statement placed upon a model defined by equations. This means that by definition, if the QM we are examining doesn't have FTL interactions, neither the interpretation will have.

Only a huge amount of proper scientific tests could convince me to change this conclusion.

This is odd. What you just said is: "a finite amount of evidence will convince me that an infinite amount of complexity is more probable than a finite amount."

Replies from: gedymin

↑ comment by gedymin · 2014-01-24T16:10:39.902Z · LW(p) · GW(p)

A random oracle needs to answer the same query with the same number

Randomness is still uncomputable even if we relax this requirement. You're right by pointing out that the formal definition of a random oracle includes this requirement, but I don't think it's essential to this discussion. Can someone confirm?

This is odd. What you just said is: "a finite amount of evidence will convince me that an infinite amount of complexity is more probable than a finite amount."

It has infinite complexity only in the formal model (SI + universal prior). I can choose not to believe in the formal model itself.

In fact, I should choose to not to give 100% belief probability to ANY formal model. Blind subscription to the idea that some beliefs should have 100% probability leads into trouble. For example, this is the problem with Hume's argument against miracles: "As the evidence for a miracle is always limited, as miracles are single events, occurring at particular times and places, the evidence for the miracle will always be outweighed by the evidence against. " If we assume that there NEVER be evidence strong enough to support the fact that a miracle (i.e. transgression of physical laws) has happened, the we'll never have to update our understanding of physical laws. Which is just wrong - as the history of physics tells, our understanding of physical laws is updated all the time.

Replies from: private_messaging

↑ comment by private_messaging · 2014-01-24T17:33:48.783Z · LW(p) · GW(p)

By the way, to clarify a potential misunderstanding.

Under S.I. it is not extremely unlikely that we are in one of the high complexity universes. This is because the 2^-l probability is the prior probability of an individual program of length l, not the probability of all programs of length l cumulative. There's 2^l possible bit strings of length l . (The programs themselves determine their length by stopping making use of the input tape, so any program of length l is also two programs of length l+2 . Speaking of the program lengths thus gets very confusing quickly).

Replies from: gedymin

↑ comment by gedymin · 2014-01-26T19:16:25.160Z · LW(p) · GW(p)

You mean "also two programs of length l+1", right?

I think this comment by gjm addresses the "longer programs are as likely" idea: http://lesswrong.com/r/discussion/lw/jhm/understanding_and_justifying_solomonoff_induction/adfc

Replies from: private_messaging

↑ comment by private_messaging · 2014-01-26T22:56:20.890Z · LW(p) · GW(p)

Yeah, that was a typo.

There's entirely too many ways to formalize S.I. which are basically equivalent, which makes it really difficult to discuss. I standardize on "random bits on a read-only input tape, write only output tape, and a single work tape initialized with zeroes" model, with the probability of a specific output string s being the prior for s . The machine M must be such that for any other machine M' you can make a prefix such that putting that prefix on the beginning of the tape for M makes it exactly equivalent to M' . (the prefix works as an emulator, i.e. a program for the machine M', with this prefix, will run on M exactly as if it was M'). Very concise description, and you can get all the things like 2^-l out of that.

comment by kokotajlod · 2014-01-23T20:11:52.001Z · LW(p) · GW(p)

Maybe we should revise downward our probability that we are living in a universe that includes True Random numbers.

comment by ThisSpaceAvailable · 2014-01-23T02:23:38.115Z · LW(p) · GW(p)

To solve any decision problem P with the Halting oracle, we can reformulate it as program source code: "if P, then halt; otherwise: loop forever" and feed this program to the Halting oracle.

Is "computable" understood as being implicit in the term "decision problem P"?

Replies from: gedymin

↑ comment by gedymin · 2014-01-23T11:44:35.336Z · LW(p) · GW(p)

Yes.

comment by timtyler · 2014-01-23T02:08:54.415Z · LW(p) · GW(p)

Hypercomputation doesn't exist. There's no evidence for it - and nor will there ever be. It's an irrelevance that few care about. Solomonoff induction is right about this.

Replies from: gedymin, private_messaging

↑ comment by gedymin · 2014-01-23T11:52:07.433Z · LW(p) · GW(p)

You're right that it probably doesn't exist.

You're wrong that no one cares about it. Humans have a long history of caring about things that do not exist. I'm afraid that hypercomputation is one of those concepts that open the floodgates to bad philosophy.

My primary point is the confusion about randomness, though. Randomness doesn't seem to be magical, unlike other forms of uncomputability.

↑ comment by private_messaging · 2014-01-24T11:30:54.693Z · LW(p) · GW(p)

We don't think it has exactly the probability of 0, do we? Or that it's totally impossible that the universe is infinite, or that it's truly non discrete, and so on.

A lot of conceptually simple things have no exact representation on a Turing machine, and unduly complicated approximate representations.

edit: also it strikes me as dumb that the Turing machine has an infinite tape, yet it is not possible to make an infinite universe on it with finite amount of code.

Replies from: timtyler

↑ comment by timtyler · 2014-01-24T22:12:05.951Z · LW(p) · GW(p)

We don't think it has exactly the probability of 0, do we?

It isn't a testable hypothesis. Why would anyone attempt to assign probabilities to it?

Replies from: private_messaging

↑ comment by private_messaging · 2014-01-25T00:16:28.627Z · LW(p) · GW(p)

Suppose there's a special kind of stone you can poke with a stick, and it does something, and any time it does something, it's defying the pattern, you have to add more code to your computable model for it.

Meanwhile there's a simple model of those kind of stones as performing one or other kind of hypercomputation, and while it doesn't let you use a computer to predict anything, it lets you use one such stone to predict the other, or lets you use a stone to run certain kinds of software much faster.

edit: to be more specific, let's say, we got random oracle magic stones. That's the weakest kind of magic stone, and it is already pretty cool. Suppose that any time you hit a magic stone, it'll flash or not, randomly. And if you take a stone and break it in two, the binary flash sequences for both halves are always the same. Obvious practical applications (one-time pad). So, people say, ohh, those stones are random oracles, and paired stones do not violate locality. After some hypothesis testing, they're reasonably skeptical that you can build an FTL communicator or an uberdense hard drive from those magic stones.

Except a particularly bone headed AI. Any time it taps a stone, it needs to add one bit to the stone's description. The AI is not entirely stupid - it can use paired stones as an one time pad too. Except, according to the AI, the first stone to be tapped transmits bits to the second stone. The AI is ever hopeful that it'll build an FTL communicator on those stones one day, testing more and more extravagant theories of how stones communicate. edit: or a has a likewise screwed up model with incredible data storage density in the stones.

comment by Squark · 2014-01-25T13:08:05.249Z · LW(p) · GW(p)

Solomonoff induction deals perfectly with stochastic universes. A "hypothesis" in Solomonoff induction is an enumerable semi-measure i.e. a program computing a convergent sequence of lower bounds for the probability of observing a given result. See e.g. http://www.vetta.org/documents/legg-1996-solomonoff-induction.pdf . In fact, since we're considering enumerable rather than recursive semi-measures, it allows to some extent for hypercomputing as well.

Replies from: gedymin

↑ comment by gedymin · 2014-01-26T20:03:31.738Z · LW(p) · GW(p)

A "hypothesis" in Solomonoff induction is an enumerable semi-measure i.e. a program computing a convergent sequence of lower bounds for the probability of observing a given result. See e.g. http://www.vetta.org/documents/legg-1996-solomonoff-induction.pdf .

The linked study material unfortunately uses confusing language. To understand the need to distinguish between a hypothesis and a probability measure consider that they can be both put down as functions; the result of a hypothesis is data (i.e. the binary string); the result of a measure is probability (i.e. a real number 0 <= x <= 1).

Solomonoff induction deals perfectly with stochastic universes.

The issue I was trying to draw attention to is that SI cannot differentiate between a pattern-less stochastic process and an a hypercomputable process, but a "simple" one.

In fact, since we're considering enumerable rather than recursive semi-measures, it allows to some extent for hypercomputing as well.

Again, I think you're making a category error.

Consider the case when process which produces an output string that is "truly random", i.e. uncomputable and statistically random. (The whole string, not just its prefix). I'm pretty sure that there is a computable encoding algorithm of e.g. binary representation of Chaitin's constant that generates such a string. Given access to a Halting oracle, the string x becomes computable. Otherwise it's not computable.

Replies from: Squark

↑ comment by Squark · 2014-01-26T20:22:42.472Z · LW(p) · GW(p)

The important property of SI is that it asymptotically produces correct predictions. I.e. if we fix an enumerable semi-measure and use it to (randomly) generate a sequence, SI will eventually start producing correct predictions with probability 1.

Replies from: gedymin

↑ comment by gedymin · 2014-01-27T10:52:42.812Z · LW(p) · GW(p)

Consider the measure:

 m(x) = 1, if  x_i is the i-th bit of binary expansion of Chaitin's constant, and 0 otherwise

The measure is deterministic, but not computable. From SI point of view x is unpredictable (using the Universal prior). If we change the Universal prior of SI and base the prior on a Halting oracle instead of a universal Turing machine, m(x) becomes (hyper)computable. The question is - are there any conditions under which we should do this?

Replies from: Squark

↑ comment by Squark · 2014-01-27T19:10:13.258Z · LW(p) · GW(p)

You are right, in principle we can consider generalized SI for various hypercomputation models. In practice, SI is supposed to be a probability distribution over "testable hypotheses". Since testing a hypothesis involves some sort of computation, it doesn't seem to make sense to include uncomputable oracles. It would make sense for agents that are able to hypercompute themselves.

comment by ThisSpaceAvailable · 2014-01-23T02:13:00.418Z · LW(p) · GW(p)

This is QM interpretation-dependent, of course, but any deterministic, non-random interpretation of quantum mechanics involves things like faster-than light interaction etc., which frankly are much less intuitive.

In a truly deterministic universe, the concept of "faster-than light interaction" is largely nonsensical. It requires a space-like separation between cause and effect ... but in a fully deterministic universe, effects are fully determined by the boundary conditions; no event within the universe is the ultimate cause of any other event.

Replies from: MrMind, kokotajlod

↑ comment by MrMind · 2014-01-24T09:57:00.199Z · LW(p) · GW(p)

In a fully deterministic universe, every time-slice can be considered equivalent to a boundary condition, and the concept of a speed limit of propagation still makes sense. In the instant after the considered slice, a point is affected only by its immediate neighbours. But as we keep going further in time, to calculate the state of a point we will need to account for larger and larger neighbourhoods of the same point in the first slice.
This is exactly what happens e.g. in general relativity, which fully deterministic (away from singularities) yet speed-limited.

↑ comment by kokotajlod · 2014-01-23T20:08:50.121Z · LW(p) · GW(p)

I don't understand. Game of Life models a truly deterministic universe. Game of Life could be modified with an additional rule that deleted all cells in a structure that forms the word "Hello world." This would be faster-than-light interaction because the state of the grid at one point would influence the state of the grid at a distant point, in one time step. But this would still model a deterministic universe.

Replies from: Eugine_Nier, Lumifer

↑ comment by Eugine_Nier · 2014-01-25T04:15:49.759Z · LW(p) · GW(p)

The "Game of Life" does not describe a single universe but a collection of universes. A single universe consists merely of a series of time steps. Yes, you can observe patterns in the time steps, but it's not clear how to meaningfully talk about cause and effect.

Replies from: kokotajlod

↑ comment by kokotajlod · 2014-01-25T14:23:41.740Z · LW(p) · GW(p)

That may be true, but I don't see how it affects the discussion between ThisSpaceAvailable and me.

I could easily change my example so that it involves a single universe rather than a collection. And if we can't talk about cause and effect in the game of life universe, we can't talk about cause and effect in the real universe either. (Unless you can find some relevant distinction between the two, and thus show my analogy to be faulty.)

↑ comment by Lumifer · 2014-01-23T20:15:19.498Z · LW(p) · GW(p)

Game of Life could be modified with an additional rule that deleted all cells in a structure that forms the word "Hello world." This would be faster-than-light interaction because the state of the grid at one point would influence the state of the grid at a distant point, in one time step.

Not quite. In the toy universe in which the Game of Life exists there is no light, no speed of light, and no limits on faster-than-light interactions.

To make this obvious, you say "in one time step". What's the speed of light in Game of Life time steps?

Replies from: MrMind, kokotajlod

↑ comment by MrMind · 2014-01-24T09:43:47.611Z · LW(p) · GW(p)

Not quite. In the toy universe in which the Game of Life exists there is no light, no speed of light, and no limits on faster-than-light interactions.

Yes, there is.

Replies from: Lumifer

↑ comment by Lumifer · 2014-01-24T15:37:14.840Z · LW(p) · GW(p)

That's a bit of terminological silliness that doesn't have much to do with the original discussion.

↑ comment by kokotajlod · 2014-01-24T21:58:54.297Z · LW(p) · GW(p)

The speed of light in our universe is 1 Planck distance in 1 Planck time. Hence the connection between the speed of light and the idea that all interaction is local.

The closely analogous speed in Game of Life would be 1 grid square in 1 time step.

If Game of Life can be modified as described to allow FTL, nonlocal interactions, then so can the rules of our universe. At least, the burden of proof is on you to show why such modifications aren't legitimate somehow.

Not quite. In the toy universe in which the Game of Life exists there is no light, no speed of light, and no limits on faster-than-light interactions.

So you agree that it is possible for a truly deterministic universe to have nonlocal interaction. I think it is a short step from there to my conclusion, namely that it is possible for a truly deterministic universe to have faster-than-light interaction.

Replies from: Lumifer

↑ comment by Lumifer · 2014-01-25T01:57:36.707Z · LW(p) · GW(p)

If Game of Life can be modified as described to allow FTL, nonlocal interactions, then so can the rules of our universe.

Huh? The fact that you can make up whatever rules you want for a toy mathematical abstraction does NOT imply that you can do the same for our physical universe.

Replies from: kokotajlod

↑ comment by kokotajlod · 2014-01-25T14:32:26.455Z · LW(p) · GW(p)

Do you think you understand what I am trying to do here? Because it seems to me that you are just being difficult. I honestly don't understand what the problem is. It might be my fault, not yours. EDIT: So, I'll do what I can to understand you also. Right now I'm thinking that you must have a different understanding of what the space of possible universes looks like.

Huh? The fact that you can make up whatever rules you want for a toy mathematical abstraction does NOT imply that you can do the same for our physical universe.

Watch me do the same for our physical universe:

Take the laws of physics, whatever they turn out to be, and add this new law: --Fix a simultenaity plane if necessary --Pick two distant particles --Decree that every second, all the matter in a 1meter radius around each particle will be swapped with the matter in a 1meter radius around each other particle. As if someone paused the universe, cut out the relevant sections, and switched them.

These are silly rules, but they are consistent, computable, etc. The modified Laws of Physics that would result would be no less possible than our current Laws of Physics, though they would be much less probable.

Replies from: Lumifer

↑ comment by Lumifer · 2014-01-26T02:34:02.920Z · LW(p) · GW(p)

Do you think you understand what I am trying to do here?

No.

The modified Laws of Physics that would result would be no less possible than our current Laws of Physics

On the basis of what are you making judgments about what kinds of physics laws are more possible or less possible?

Replies from: kokotajlod

↑ comment by kokotajlod · 2014-01-27T00:05:31.319Z · LW(p) · GW(p)

Okay. So, this is what I think you are thinking:

Not just any old Laws are possible. Even consistent, computable Laws can be impossible. (I'm afraid I don't know why you think this though. Perhaps you think there is only one possible world and it is not a Level IV multiverse or perhaps you think that all possible worlds conform to the laws of physics as some of us currently understand them.)

I'm thinking that universes which are consistent & computable are possible. After all, it seems to me that Solomonoff Induction says this.

Note that I am distinguishing between possibility and probability. I'm not claiming that my modified laws are probable, merely that they are metaphysically possible

If you have problems with the term "metaphysical possibility" then I can instead speak in multiverse terms: I am claiming that all computable, consistent worlds exist somewhere in the multiverse. If you don't like that either, then I'd like to know how you think about possibility.

Note: I intend to edit this later to add links. I won't change the text. EDIT: I added the links.

Replies from: Lumifer

↑ comment by Lumifer · 2014-01-27T01:38:11.572Z · LW(p) · GW(p)

Not just any old Laws are possible. Even consistent, computable Laws can be impossible.

No, I'm not thinking that. The thing is, I don't quite understand in which sense do you use the word "possible" here. It seems to me you say "X is possible" to mean "I can imagine X".

I'm thinking that universes which are consistent & computable are possible.

See above. And saying "metaphysically possible" doesn't help -- are we still talking purely about your imagination?

If you have problems with the term "metaphysical possibility" then I can instead speak in multiverse terms: I am claiming that all computable, consistent worlds exist somewhere in the multiverse.

Ah. Well, do you happen to have any evidence for your claim?

then I'd like to know how you think about possibility.

You'll have to be a bit more specific. Possibility of what? Possibility of universes with different laws of physics? I don't know. Again, I am not sure what does the word "possibility" mean here.

Replies from: kokotajlod

↑ comment by kokotajlod · 2014-01-27T14:32:03.521Z · LW(p) · GW(p)

It would help if you gave me a positive account of what you think, instead of just pointing out where you disagree with me.

True, the notion of metaphysical possibility is one that might be mysterious. I don't think it is, but come to think of it, I don't need to bother with it here--there are other notions of possibility that serve my purposes even better. So we need not discuss it further.

It seems to me you say "X is possible" to mean "I can imagine X".

That's not fair. My use of the phrase "X is possible" thus far meets much more stringent conditions than that. Why did you jump to the lowest common denominator? Do you not know much about the use of the word "possible?"

Ah. Well, do you happen to have any evidence for your claim?

Yes, my claim is entailed by the Level IV multiverse hypothesis. Are you claiming that that hypothesis is nonsensical? If so, I'd like to hear your alternative metaphysical theory of everything, if you have one, as well as an explanation of why this one is nonsensical.

In a truly deterministic universe, the concept of "faster-than light interaction" is largely nonsensical.

This was your original claim. Rather than talk about possibility I can just talk about sensibility. Is there a coherent truly deterministic universe in which faster-than-light interaction happens?

Obviously we aren't asking whether or not there actually exists such a universe, because then your statement would be either trivially true or trivially false, depending on whether or not our universe has faster-than-light interaction.

So we must be looking at some sort of space of non-existent universes, whatever that might be, and checking to see whether there are any that are (a) coherent and (b) contain FTL.

What is this space that we must look at? Well, your question leaves this ambiguous. But it highly suggests that we are looking for some sort of possibility space, e.g. the space of metaphysically possible worlds. I propose that it is at least the space that Solomonoff Induction looks at. In fact, I could get away with less than that if I had to.

Anyhow, since your question leaves it ambiguous which space we are looking at, it is on you to clarify it.

Replies from: Lumifer

↑ comment by Lumifer · 2014-01-27T15:54:10.105Z · LW(p) · GW(p)

My thinking is pretty simple. Crude, even.

Anyone can make imaginary constructs. Sometimes they involve pixies riding unicorns, sometimes they involve different Planck lengths. Apriori I see no good reason to treat constructs involving Planck lengths any different from constructs involving pixies.

When you say "A universe with feature X is possible" I literally do not understand the meaning of the word "possible" here. It doesn't exist, it didn't happen. Could it have happened? I have no idea and neither do you. In you mind you can substitute a different Planck length or break the FTL limitation, then squint at the result and say "Hmm, still looks fine" -- but all that is happening inside your mind and again, does not look qualitatively different from leering at a fine pixie. In your mind, most everything is possible -- not to mention that your mind cannot determine whether a universe with a different Planck length actually would have been fine.

This was your original claim.

Actually, no, it wasn't mine. I jumped in this thread mid-discussion with a comment about the Game of Life.

So we must be looking at some sort of space of non-existent universes, whatever that might be, and checking to see whether there are any that are (a) coherent and (b) contain FTL.

My point is that this activity is no different from checking the non-existent universes for pixies and unicorns. It might be an agreeable mental pastime, but it has no relationship to reality.

Replies from: kokotajlod

↑ comment by kokotajlod · 2014-01-28T23:41:23.563Z · LW(p) · GW(p)

How embarrassing--I forgot who the OP for this argument was. :( Sorry. If you don't support the claim that I was attacking, then perhaps we don't even disagree after all.

OK, so you are anti-realist about possibility. Fair enough; so am I. But that doesn't mean the word "possible" is meaningless in the contexts that I used it in. Here's an attempt at cashing out one version of "possible" that surely you would agree makes sense:

X is possible iff X is true in some theory in the search space of theories that we ought to use when making predictions.

The claim I was disagreeing with is the claim that determinism does not make sense when combined with FTL interaction. Even if we agree that possibility is just a game we play in our heads, the game still has rules, and the claim I am disagreeing with seemed to be a claim about the implications of those rules--namely, that those rules prohibited the combination of determinism and FTL interaction. But they don't.

Maybe different people play by different rules. We can have a conversation about what rules we ought to play by. But I'm pretty sure that if we polled people who know about determinism and FTL, most would agree that they are compatible.

As an aside, your analogy with pixies puzzles me. Why did you think I would try to draw a distinction between pixies and FTL travel? Both are computable. Both are possible, by the rules that I and most people I know (and SI) play by.

Replies from: Lumifer

↑ comment by Lumifer · 2014-01-29T16:59:09.123Z · LW(p) · GW(p)

X is possible iff X is true in some theory in the search space of theories that we ought to use when making predictions.

Making testable predictions about empirical reality? That's a pretty high bar. For example, counterfactuals do not fare well under such rules.

As an aside, your analogy with pixies puzzles me. Why did you think I would try to draw a distinction between pixies and FTL travel?

I did not know whether you would wish to draw such a distinction or not. Now I do :-)

Replies from: kokotajlod

↑ comment by kokotajlod · 2014-01-30T06:34:46.170Z · LW(p) · GW(p)

Well, it seems like the part of this conversation that had to do with the original claim and counterclaim has petered out. Do you agree with this claim:

In a truly deterministic universe, the concept of "faster-than light interaction" is largely nonsensical.

If so, then we can keep going. If not, then we can move on to discuss the nature of possibility and its use against claims like the above. If you are still interested enough in continuing, that is. I won't detain you against your will.

...

Making testable predictions about empirical reality? That's a pretty high bar. For example, counterfactuals do not fare well under such rules.

Is it? I'm happy to lower the bar; I didn't think this definition through very much. But thus far I see no reason to revise it. Why don't counterfactuals fare well? Are you saying that under my definition of possibility, counterfactuals are impossible? That would mean that we shouldn't consider theories that involve counterfactuals, which seems to me to be false, though it might depend on what theory of counterfactuals we are using.

Replies from: Lumifer

↑ comment by Lumifer · 2014-01-30T16:09:59.774Z · LW(p) · GW(p)

Do you agree with this claim

I have no strong opinion. I suspect that this claim needs its assumptions fleshed out (e.g. that the speed of light is REALLY the limit of how fast information can propagate).

Why don't counterfactuals fare well?

Because they are rarely useful for making predictions about reality. Not "never", of course, just "rarely".

An additional problem with Solomonoff induction

Contents

51 comments

*a few hundred bits of evidence will take more than a few hundred bits of sensing data, unless the sensing data is incompressible.