To my mind all such questions are related to arguments about solipsism, i.e. the notion that even other humans don't, or may not, have minds/consciousness/qualia. The basic argument is that I can only see behavior (not mind) in anyone other than myself. Almost everyone rejects solipsism, but I don't know if there are actually many very good arguments against it, except that it is morally unappealing (if anyone knows of any, please point them out). I think the same questions hold regarding emulations, only even more so (at least with other humans we know they are physically similar, suggesting some possibility that they are mentally similar as well - not so with emulations*). In particular, I don't see how there can ever be empirical evidence that anything is conscious or experiences qualia (or that anything is not conscious!): behavior isn't strictly relevant, and other minds are non-perceptible. I think this is the most common objection to Turing tests as a standard, as well.
*Maybe this is the logic of the biological position you mention - essentially, the more something seems like the one thing I know is conscious (me), the more likelihood I assign to it also being conscious. Thus other humans > other complex animals > simple animals > other organisms > abiotics.
They don't. To get the probabilities about something occurring in our universe, you need to get the information about our universe first. Solomonoff Induction tells you how to do that, in a random universe. Only after you get enough evidence to understand the universe do you start getting good results.
Yes, but we already have lots of information about our universe. So, making use of all that, if we could start using SI to, say, predict the weather, would its predictions be well-calibrated? (They should be - modern weather predictions are already well-calibrated, and SI is supposed to be better than how we do things now.) That would require that ALL predictions compatible with currently known info occur in EXACT PROPORTION to their bit-length complexity.
Is there any evidence that this is the case?
You quoted me:
"the theory seems to predict that possible (evidence-compatible) events or states in the universe will occur in exact or fairly exact proportion to their relative complexities as measured in bits [...] if I am predicting between 2 (evidence-compatible) possibilities, and one is twice as information-complex as the other, then it should actually occur 1/3 of the time"
then replied:
"Let's suppose that there are two hypotheses H1 and H2, each of them predicting exactly the same events, except that H2 is one bit longer and therefore half as likely as H1. Okay, so there is no evidence to distinguish between them. Whatever happens, we either reject both hypotheses, or we keep their ratio at 1:2."
I am afraid I may have stated this unclearly at first. I meant: given 2 hypotheses that are both compatible with all currently known evidence, but which predict different outcomes for a future event.
This seems reasonable - it basically makes use of the fact that most statements are wrong, therefore any added statement whose truth-value is as yet unknown is likely to be wrong.
However, that's vague. It supports Occam's Razor pretty well, but does it also offer good evidence that those likelihoods will manifest in real-world probabilities IN EXACT PROPORTION to the bit-lengths of their inputs? That is a much more precise claim! (For convenience I am ignoring the problem of multiple algorithms where hypotheses have different bit-lengths.)
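To spell out the arithmetic behind the claim I am questioning (a toy sketch - the hypothesis names and bit-lengths are invented): under a 2^(-length) prior, a hypothesis one bit longer gets half the weight, so renormalizing over two evidence-compatible hypotheses leaves the longer one with 1/3.

```python
# Toy illustration, not actual Solomonoff Induction: give each
# evidence-compatible hypothesis the prior weight 2^(-bit_length),
# then renormalize.  The names and lengths are made up.

lengths = {"H1": 10, "H2": 11}          # H2 is one bit longer than H1

weights = {h: 2.0 ** -l for h, l in lengths.items()}
total = sum(weights.values())
probs = {h: w / total for h, w in weights.items()}

print(probs)   # {'H1': 0.666..., 'H2': 0.333...}
# The claim under discussion is that H2's predicted outcome should then
# actually occur about 1/3 of the time - that is the empirical question.
```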
Yes, that was the post I read that generated my current line of questioning.
My reply to Viliam_Bur was phrased in terms of probabilities in a single universe, while your post here is in terms of mathematically possible universes. Let me try to rephrase my point to him in many-worlds language. This is not how I originally thought of the question, though, so I may end up a little muddled in translation.
Take your original example, where half of the Mathematically Possible Universes start with 1, and the other half with 0. It is certainly possible to imagine a hypothetical Actual Multiverse where, nevertheless, there are 5 billion universes with 1, and only 5 universes with 0. Who knows why - maybe there is some overarching multiversal law we are unaware of, or maybe it's just random. The point is that there is no a priori reason the Multiverse can't be that way. (It may not even be possible to say that the multiverse probably isn't that way without using Solomonoff Induction or Occam's Razor, the very concepts in question.)
If this were the case, and I were somehow universe-hopping, I would over time come to the conclusion that SI was poorly calibrated and stop using it. This, I think, is basically the many-worlds version of my suggestion to Viliam_Bur. As I said to him, I am not arguing for or against SI; I am just asking knowledgeable people if there is any evidence that the probabilities in this universe, or distributions across the multiverse, are actually in proportion to their information-complexities.
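Here is a rough sketch of the universe-hopping test I have in mind (the numbers come straight from the made-up example above, and the 50% figure stands in for the SI-style prior; nothing here is a real model of anything):

```python
import random

# Hypothetical setup: the SI-style prior treats "starts with 1" and
# "starts with 0" as equally likely, but the imagined Actual Multiverse
# contains 5 billion 1-universes and only 5 0-universes.
si_prob_of_one = 0.5
actual_prob_of_one = 5_000_000_000 / (5_000_000_000 + 5)

hops = 100_000
observed_ones = sum(random.random() < actual_prob_of_one for _ in range(hops))
observed_freq = observed_ones / hops

print(f"SI says P(starts with 1) = {si_prob_of_one}")
print(f"Observed frequency over {hops} hops = {observed_freq:.6f}")
# A universe-hopping observer keeping score this way would conclude the
# 50% figure is badly calibrated - which is exactly the worry.
```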
Thank you for your reply. It does clear up some of the virtues of SI, especially when used to generate priors absent any evidence. However, as I understand it, SI does take into account evidence - one removes all the possibilities incompatible with the evidence, then renormalizes the probabilities of the remaining possibilities. Right?
If so, one could still ask - after taking account of all available evidence - is SI then well-calibrated? (At some point it should be well-calibrated, right? Better calibrated than human beings. Otherwise, how is it useful? Or why should we use it for induction?)
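To make sure I understand that remove-and-renormalize picture, here is a toy sketch over a small, hand-picked hypothesis set (real Solomonoff Induction sums over all programs and is incomputable, so everything below is just an illustrative stand-in):

```python
# Toy stand-in for Solomonoff-style conditioning.  Each "hypothesis" is a
# predicted continuation of an observed bit string, with an invented
# description length in bits; its prior weight is 2^(-length).

hypotheses = {
    # name: (description_length_bits, predicted_sequence)
    "A": (8,  [0, 1, 0, 1, 0, 1]),
    "B": (9,  [0, 1, 0, 1, 1, 1]),
    "C": (12, [0, 1, 1, 0, 0, 0]),
}

observed = [0, 1, 0, 1]   # evidence seen so far

# 1. Remove the possibilities incompatible with the evidence.
compatible = {
    name: (length, seq)
    for name, (length, seq) in hypotheses.items()
    if seq[:len(observed)] == observed
}

# 2. Renormalize the 2^(-length) weights of whatever remains.
weights = {name: 2.0 ** -length for name, (length, _) in compatible.items()}
total = sum(weights.values())
posterior = {name: w / total for name, w in weights.items()}

print(posterior)   # C is gone; A gets 2/3 and B gets 1/3
```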
Essentially the theory seems to predict that possible (evidence-compatible) events or states in the universe will occur in exact or fairly exact proportion to their relative complexities as measured in bits. Possibly over-simplifying, this suggests that if I am deciding between 2 (evidence-compatible) possibilities, and one is twice as information-complex as the other, then the more complex one should actually occur 1/3 of the time. Is there any evidence that this is actually true?
(I can see immediately that one would have to control for the number of possible "paths" or universe-states or whatever you call it that could lead to each event, in order for the outcome to be directly proportional to the information-complexity. I am ignoring this because the inability to compute this appears to be the reason SI as a whole cannot be computed.)
You suggest above that SI explains why Occam's Razor works. I could offer another possibility - that Occam's Razor works because it is vague, but that when made precise it will not turn out to match how the universe actually works very closely. Or that Occam's Razor is useful because it suggests that when generating a Map one should use only as much information about the Territory as is necessary for a given purpose, thereby allowing one to get maximum usefulness with minimum cognitive load on the user.
I am not arguing for one or the other. Instead I am just asking, here among people knowledgeable about SI - is there any evidence that outcomes in the universe actually occur with probabilities in proportion to their information-complexity? (A much more precise claim than Occam's suggestion that, in general, simpler explanations are preferable.)
Maybe it will not be possible to answer my question until SI can at least be estimated, in order to actually make the comparison?
(Above you refer to "all mathematically possible universes." I phrased things in terms of probabilities inside a single universe because that is the context in which I observe & make decisions and would like SI to be useful. However I think you could just translate what I have said back into many-worlds language and keep the question intact.)
Hi, my name is Jason, and this is my first post. I have recently been reading about 2 subjects here, Calibration and Solomonoff Induction; reading them together has given me the following question:
How well-calibrated would Solomonoff Induction be if it could actually be calculated?
That is to say, if one generated priors on a whole bunch of questions based on information complexity measured in bits, and took all the hypotheses that were assigned a 10% likelihood - would 10% of those actually turn out to be correct?
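In code, the check I have in mind would look something like this (with made-up placeholder data, since we cannot actually run Solomonoff Induction):

```python
from collections import defaultdict

def calibration_table(predictions):
    """predictions: iterable of (stated_probability, actually_true) pairs.
    Groups them into 10%-wide bins (by rounding to one decimal place) and
    reports, for each bin, how many predictions landed there and what
    fraction actually came true."""
    bins = defaultdict(list)
    for prob, outcome in predictions:
        bins[round(prob, 1)].append(outcome)
    return {
        b: (len(outcomes), sum(outcomes) / len(outcomes))
        for b, outcomes in sorted(bins.items())
    }

# Placeholder data, not the output of any real inductive method: each pair
# is (probability the method assigned, whether the hypothesis turned out true).
fake_predictions = ([(0.1, False)] * 45 + [(0.1, True)] * 5 +
                    [(0.7, True)] * 35 + [(0.7, False)] * 15)

print(calibration_table(fake_predictions))
# {0.1: (50, 0.1), 0.7: (50, 0.7)} - this fake data is perfectly calibrated:
# 10% of the "10% likely" hypotheses were true, and 70% of the "70% likely" ones.
```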
I don't immediately see why Solomonoff Induction should be expected to be well-calibrated. It appears to just be a formalization of Occam's Razor, which itself is just a rule of thumb. But if it turned out not to be well-calibrated, it would not be a very good "recipe for truth." What am I missing?