Where Recursive Justification Hits Bottom

post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-07-08T10:16:45.000Z · LW · GW · Legacy · 81 comments

Why do I believe that the Sun will rise tomorrow?

Because I've seen the Sun rise on thousands of previous days.

Ah... but why do I believe the future will be like the past?

Even if I go past the mere surface observation of the Sun rising, to the apparently universal and exceptionless laws of gravitation and nuclear physics, then I am still left with the question:  "Why do I believe this will also be true tomorrow?"

I could appeal to Occam's Razor, the principle of using the simplest theory that fits the facts... but why believe in Occam's Razor?  Because it's been successful on past problems?  But who says that this means Occam's Razor will work tomorrow?

And lo, the one said:

"Science also depends on unjustified assumptions.  Thus science is ultimately based on faith, so don't you criticize me for believing in [silly-belief-#238721]."

As I've previously observed:

It's a most peculiar psychology—this business of "Science is based on faith too, so there!"  Typically this is said by people who claim that faith is a good thing.  Then why do they say "Science is based on faith too!" in that angry-triumphal tone, rather than as a compliment? 

Arguing that you should be immune to criticism is rarely a good sign.

But this doesn't answer the legitimate philosophical dilemma:  If every belief must be justified, and those justifications in turn must be justified, then how is the infinite recursion terminated?

And if you're allowed to end in something assumed-without-justification, then why aren't you allowed to assume anything without justification?

A similar critique is sometimes leveled against Bayesianism—that it requires assuming some prior—by people who apparently think that the problem of induction is a particular problem of Bayesianism, which you can avoid by using classical statistics.  I will speak of this later, perhaps.

But first, let it be clearly admitted that the rules of Bayesian updating do not, of themselves, solve the problem of induction.

Suppose you're drawing red and white balls from an urn.  You observe that, of the first 9 balls, 3 are red and 6 are white.  What is the probability that the next ball drawn will be red?

That depends on your prior beliefs about the urn.  If you think the urn-maker generated a uniform random number between 0 and 1, and used that number as the fixed probability of each ball being red, then the answer is 4/11 (by Laplace's Law of Succession).  If you think the urn originally contained 10 red balls and 10 white balls, then the answer is 7/11.
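
For readers who want to check the arithmetic, here is a minimal Python sketch of the two cases; the function names are illustrative, not drawn from any library.

```python
from fractions import Fraction

def laplace_rule(red_seen, total_seen):
    """Laplace's Law of Succession: P(next ball is red), given a uniform prior
    over the urn-maker's fixed red-probability."""
    return Fraction(red_seen + 1, total_seen + 2)

def fixed_urn(red_start, white_start, red_seen, white_seen):
    """Drawing without replacement from an urn of known composition:
    P(next ball is red)."""
    red_left = red_start - red_seen
    total_left = (red_start + white_start) - (red_seen + white_seen)
    return Fraction(red_left, total_left)

print(laplace_rule(3, 9))        # 4/11
print(fixed_urn(10, 10, 3, 6))   # 7/11
```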

Which is to say that, with the right prior—or rather the wrong prior—the chance of the Sun rising tomorrow would seem to go down with each succeeding day... if you were absolutely certain, a priori, that there was a great barrel out there from which, on each day, there was drawn a little slip of paper that determined whether the Sun rose or not; and that the barrel contained only a limited number of slips saying "Yes", and the slips were drawn without replacement.

There are possible minds in mind design space who have anti-Occamian and anti-Laplacian priors; they believe that simpler theories are less likely to be correct, and that the more often something happens, the less likely it is to happen again.

And when you ask these strange beings why they keep using priors that never seem to work in real life... they reply, "Because it's never worked for us before!"

Now, one lesson you might derive from this, is "Don't be born with a stupid prior."  This is an amazingly helpful principle on many real-world problems, but I doubt it will satisfy philosophers.

Here's how I treat this problem myself:  I try to approach questions like "Should I trust my brain?" or "Should I trust Occam's Razor?" as though they were nothing special— or at least, nothing special as deep questions go.

Should I trust Occam's Razor?  Well, how well does (any particular version of) Occam's Razor seem to work in practice?  What kind of probability-theoretic justifications can I find for it?  When I look at the universe, does it seem like the kind of universe in which Occam's Razor would work well?

Should I trust my brain?  Obviously not; it doesn't always work.  But nonetheless, the human brain seems much more powerful than the most sophisticated computer programs I could consider trusting otherwise.  How well does my brain work in practice, on which sorts of problems?

When I examine the causal history of my brain—its origins in natural selection—I find, on the one hand, all sorts of specific reasons for doubt; my brain was optimized to run on the ancestral savanna, not to do math.  But on the other hand, it's also clear why, loosely speaking, it's possible that the brain really could work.  Natural selection would have quickly eliminated brains so completely unsuited to reasoning, so anti-helpful, as ones with anti-Occamian or anti-Laplacian priors.

So what I did in practice does not amount to declaring a sudden halt to questioning and justification.  I'm not halting the chain of examination at the point that I encounter Occam's Razor, or my brain, or some other unquestionable.  The chain of examination continues—but it continues, unavoidably, using my current brain and my current grasp on reasoning techniques.  What else could I possibly use?

Indeed, no matter what I did with this dilemma, it would be me doing it.  Even if I trusted something else, like some computer program, it would be my own decision to trust it.

The technique of rejecting beliefs that have absolutely no justification is, in general, an extremely important one.  I sometimes say that the fundamental question of rationality is "Why do you believe what you believe?"  I don't even want to say something that sounds like it might allow a single exception to the rule that everything needs justification.

Which is, itself, a dangerous sort of motivation; you can't always avoid everything that might be risky, and when someone annoys you by saying something silly, you can't reverse that stupidity to arrive at intelligence.

But I would nonetheless emphasize the difference between saying:

"Here is this assumption I cannot justify, which must be simply taken, and not further examined."

Versus saying:

"Here the inquiry continues to examine this assumption, with the full force of my present intelligence—as opposed to the full force of something else, like a random number generator or a magic 8-ball—even though my present intelligence happens to be founded on this assumption."

Still... wouldn't it be nice if we could examine the problem of how much to trust our brains without using our current intelligence?  Wouldn't it be nice if we could examine the problem of how to think, without using our current grasp of rationality?

When you phrase it that way, it starts looking like the answer might be "No".

E. T. Jaynes used to say that you must always use all the information available to you—he was a Bayesian probability theorist, and had to clean up the paradoxes other people generated when they used different information at different points in their calculations.  The principle of "Always put forth your true best effort" has at least as much appeal as "Never do anything that might look circular."  After all, the alternative to putting forth your best effort is presumably doing less than your best.

But still... wouldn't it be nice if there were some way to justify using Occam's Razor, or justify predicting that the future will resemble the past, without assuming that those methods of reasoning which have worked on previous occasions are better than those which have continually failed?

Wouldn't it be nice if there were some chain of justifications that neither ended in an unexaminable assumption, nor was forced to examine itself under its own rules, but, instead, could be explained starting from absolute scratch to an ideal philosophy student of perfect emptiness?

Well, I'd certainly be interested, but I don't expect to see it done any time soon.  I've argued in several places against the idea that you can have a perfectly empty ghost-in-the-machine; there is no argument that you can explain to a rock.

Even if someone cracks the First Cause problem and comes up with the actual reason the universe is simple, which does not itself presume a simple universe... then I would still expect that the explanation could only be understood by a mindful listener, and not by, say, a rock.  A listener that didn't start out already implementing modus ponens might be out of luck.

So, at the end of the day, what happens when someone keeps asking me "Why do you believe what you believe?"

At present, I start going around in a loop at the point where I explain, "I predict the future as though it will resemble the past on the simplest and most stable level of organization I can identify, because previously, this rule has usually worked to generate good results; and using the simple assumption of a simple universe, I can see why it generates good results; and I can even see how my brain might have evolved to be able to observe the universe with some degree of accuracy, if my observations are correct."

But then... haven't I just licensed circular logic?

Actually, I've just licensed reflecting on your mind's degree of trustworthiness, using your current mind as opposed to something else.

Reflection of this sort is, indeed, the reason we reject most circular logic in the first place.  We want to have a coherent causal story about how our mind comes to know something, a story that explains how the process we used to arrive at our beliefs is itself trustworthy.  This is the essential demand behind the rationalist's fundamental question, "Why do you believe what you believe?"

Now suppose you write on a sheet of paper:  "(1) Everything on this sheet of paper is true, (2) The mass of a helium atom is 20 grams."  If that trick actually worked in real life, you would be able to know the true mass of a helium atom just by believing some circular logic which asserted it.  Which would enable you to arrive at a true map of the universe sitting in your living room with the blinds drawn.  Which would violate the second law of thermodynamics by generating information from nowhere.  Which would not be a plausible story about how your mind could end up believing something true.

Even if you started out believing the sheet of paper, it would not seem that you had any reason for why the paper corresponded to reality.  It would just be a miraculous coincidence that (a) the mass of a helium atom was 20 grams, and (b) the paper happened to say so.

Believing self-validating statement sets, in general, does not seem like it should work to map external reality—when we reflect on it as a causal story about minds—using, of course, our current minds to do so.

But what about evolving to give more credence to simpler beliefs, and to believe that algorithms which have worked in the past are more likely to work in the future?  Even when we reflect on this as a causal story of the origin of minds, it still seems like this could plausibly work to map reality.

And what about trusting reflective coherence in general?  Wouldn't most possible minds, randomly generated and allowed to settle into a state of reflective coherence, be incorrect?  Ah, but we evolved by natural selection; we were not generated randomly.

If trusting this argument seems worrisome to you, then forget about the problem of philosophical justifications, and ask yourself whether it's really truly true.

(You will, of course, use your own mind to do so.)

Is this the same as the one who says, "I believe that the Bible is the word of God, because the Bible says so"?

Couldn't they argue that their blind faith must also have been placed in them by God, and is therefore trustworthy?

In point of fact, when religious people finally come to reject the Bible, they do not do so by magically jumping to a non-religious state of pure emptiness, and then evaluating their religious beliefs in that non-religious state of mind, and then jumping back to a new state with their religious beliefs removed.

People go from being religious, to being non-religious, because even in a religious state of mind, doubt seeps in.  They notice their prayers (and worse, the prayers of seemingly much worthier people) are not being answered.  They notice that God, who speaks to them in their heart in order to provide seemingly consoling answers about the universe, is not able to tell them the hundredth digit of pi (which would be a lot more reassuring, if God's purpose were reassurance).  They examine the story of God's creation of the world and damnation of unbelievers, and it doesn't seem to make sense even under their own religious premises.

Being religious doesn't make you less than human.  Your brain still has the abilities of a human brain.  The dangerous part is that being religious might stop you from applying those native abilities to your religion—stop you from reflecting fully on yourself.  People don't heal their errors by resetting themselves to an ideal philosopher of pure emptiness and reconsidering all their sensory experiences from scratch.  They heal themselves by becoming more willing to question their current beliefs, using more of the power of their current mind.

This is why it's important to distinguish between reflecting on your mind using your mind (it's not like you can use anything else) and having an unquestionable assumption that you can't reflect on.

"I believe that the Bible is the word of God, because the Bible says so."  Well, if the Bible were an astoundingly reliable source of information about all other matters, if it had not said that grasshoppers had four legs or that the universe was created in six days, but had instead contained the Periodic Table of Elements centuries before chemistry—if the Bible had served us only well and told us only truth—then we might, in fact, be inclined to take seriously the additional statement in the Bible, that the Bible had been generated by God.  We might not trust it entirely, because it could also be aliens or the Dark Lords of the Matrix, but it would at least be worth taking seriously.

Likewise, if everything else that priests had told us, turned out to be true, we might take more seriously their statement that faith had been placed in us by God and was a systematically trustworthy source—especially if people could divine the hundredth digit of pi by faith as well.

So the important part of appreciating the circularity of "I believe that the Bible is the word of God, because the Bible says so," is not so much that you are going to reject the idea of reflecting on your mind using your current mind.  But, rather, that you realize that anything which calls into question the Bible's trustworthiness, also calls into question the Bible's assurance of its trustworthiness.

This applies to rationality too: if the future should cease to resemble the past—even on its lowest and simplest and most stable observed levels of organization—well, mostly, I'd be dead, because my brain's processes require a lawful universe where chemistry goes on working.  But if somehow I survived, then I would have to start questioning the principle that the future should be predicted to be like the past.

But for now... what's the alternative to saying, "I'm going to believe that the future will be like the past on the most stable level of organization I can identify, because that's previously worked better for me than any other algorithm I've tried"?

Is it saying, "I'm going to believe that the future will not be like the past, because that algorithm has always failed before"?

At this point I feel obliged to drag up the point that rationalists are not out to win arguments with ideal philosophers of perfect emptiness; we are simply out to win.  For which purpose we want to get as close to the truth as we can possibly manage.  So at the end of the day, I embrace the principle:  "Question your brain, question your intuitions, question your principles of rationality, using the full current force of your mind, and doing the best you can do at every point."

If one of your current principles does come up wanting—according to your own mind's examination, since you can't step outside yourself—then change it!  And then go back and look at things again, using your new improved principles.

The point is not to be reflectively consistent.  The point is to win.  But if you look at yourself and play to win, you are making yourself more reflectively consistent—that's what it means to "play to win" while "looking at yourself".

Everything, without exception, needs justification.  Sometimes—unavoidably, as far as I can tell—those justifications will go around in reflective loops.  I do think that reflective loops have a meta-character which should enable one to distinguish them, by common sense, from circular logics.  But anyone seriously considering a circular logic in the first place, is probably out to lunch in matters of rationality; and will simply insist that their circular logic is a "reflective loop" even if it consists of a single scrap of paper saying "Trust me".  Well, you can't always optimize your rationality techniques according to the sole consideration of preventing those bent on self-destruction from abusing them.

The important thing is to hold nothing back in your criticisms of how to criticize; nor should you regard the unavoidability of loopy justifications as a warrant of immunity from questioning.

Always apply full force, whether it loops or not—do the best you can possibly do, whether it loops or not—and play, ultimately, to win.

81 comments

Comments sorted by oldest first, as this post is from before comment nesting was available (around 2009-02-27).

comment by cole_porter · 2008-07-08T13:01:23.000Z · LW(p) · GW(p)

"There are possible minds in mind design space who have anti-Occamian and anti-Laplacian priors; they believe that simpler theories are less likely to be correct, and that the more often something happens, the less likely it is to happen again."

You've been making this point a lot lately. But I don't see any reason for "mind design space" to have that kind of symmetry. Why do you believe this? Could you elaborate on it at some point?

Replies from: Strange7, Dojan, rkyeun, CCC
comment by Strange7 · 2011-09-03T03:33:57.649Z · LW(p) · GW(p)

Mind design space is very large and comprehensive. It's like how the set of all possible theories contains both A and ~A.

comment by Dojan · 2012-01-06T23:20:02.108Z · LW(p) · GW(p)

That something is included in "mind design space" does not imply that it actually exists. Think of it instead as everything that we might label "mind" if it did exist.

comment by rkyeun · 2012-07-30T01:11:29.921Z · LW(p) · GW(p)

Imagine a mind such as already exists. Now I install a small frog, trained to kick its leg whenever you try to perform Occamian or Laplacian thinking; its kicking leg hits a button that inverts your output, so your conclusion comes out exactly backwards from the one you would have made but for the frog.

And thus symmetry.

Replies from: wafflepudding
comment by wafflepudding · 2016-06-10T23:17:11.995Z · LW(p) · GW(p)

Though the anti-Laplacian mind, in this case, is inherently more complicated. Maybe it's not a moot point that Laplacian minds are on average simpler than their anti-Laplacian counterparts? There are infinitely many Laplacian and anti-Laplacian minds, but of the two infinities, might one be proportionately larger?

None of this is to detract from Eliezer's original point, of course. I only find it interesting to think about.

Replies from: rkyeun
comment by rkyeun · 2016-07-12T15:51:37.627Z · LW(p) · GW(p)

They must be of exactly the same magnitude, as the odd and even integers are, because either can be given a frog. From any Laplacian mind, I can install a frog and get an anti-Laplacian one. And vice versa. This even applies to ones I've installed a frog in already. Adding a second frog gets you a new mind that is just like the one two steps back, except that it lags behind it in computation power by two kicks. There is a 1:1 mapping between Laplacian and anti-Laplacian minds, and I have demonstrated the constructor function of adding a frog.

comment by CCC · 2012-08-29T10:56:36.403Z · LW(p) · GW(p)

A question.

The possible mind, that assumes that things are more likely to work if they have never worked before, can in all honesty continue to use this prior if it has never worked before. But this is only a self-sustaining method if it continues not to work.

Let us introduce our hypothetical poor-prior, rationalist observer to a rigged game of chance; let us say, a roulette wheel. (For simplicity, let's call him Jim). We allow Jim to inspect an (unrigged) roulette wheel beforehand. We ask him to place a bet, on any number of his choice; once he places his bet, we use our rigged roulette wheel to ensure that he wins and continues to win, for any number of future guesses.

Now, from Jim's point of view, whatever line of reasoning he is using to find the correct number to bet on, it is working. He'll presumably select a different number every time; it continues to work. Thus, the idea that a theory that works now is less likely to work in the future is itself working... and thus is less likely to work in the future. Wouldn't this success cause him to eventually reject his prior?

comment by ME3 · 2008-07-08T14:20:27.000Z · LW(p) · GW(p)

I read it as saying, "Suppose there is a mind with an anti-Occamian and anti-Laplacian prior. This mind believes that . . ." but of course saying "there is a possible mind in mind design space" is a much stronger statement than that, and I agree that it must be justified. I don't see how such a mind could possibly do anything that we consider mind-like, in practice.

Really, I don't know if this has been mentioned before, but formal systems and the experimental process were developed centuries ago to solve the very problems that you keep talking about (rationality, avoiding self-deception, etc.). Why do you keep trying to bring us back to 500 BC and the methods of the ancient Greeks? Is it because you find actual math too difficult? Trust me, it's still easier to do math right than to do informal reasoning right. On the other hand, it's much more rewarding to do informal reasoning wrong than to do the math wrong. This may be the source of the problem.

Replies from: gwern
comment by gwern · 2009-10-22T22:02:58.284Z · LW(p) · GW(p)

We keep going back to the Greeks because the paradoxes of the Eleatics (such as Zeno) and the Skeptics have never been satisfactorily addressed; they apply as well to the modern formal systems you laud as to the old syllogisms. Thinking in that vein may sharpen & formalize the paradoxes in such forms as Godel's theorems, but they won't dissolve them; we need different approaches to resolving the many Skeptical arguments about, say, circularity, like this metacircular approach.

Replies from: Kenny
comment by Kenny · 2013-04-08T13:54:50.932Z · LW(p) · GW(p)

Right! The Axiom of Choice is just one example of something like a 'sharpened' paradox, or really something 'sharp' that implies other paradoxical conclusions.

comment by JamesAndrix · 2008-07-08T14:57:22.000Z · LW(p) · GW(p)

When he said "And when you ask these strange beings why they keep using priors that never seem to work in real life... they reply, "Because it's never worked for us before!"" I read it as: There are people walking around that think like this at least part of the time. It possible that everyone thinks like this in certain circumstances. It may be yet another bias/ design flaw to watch out for while thinking. This leads to the advice "Don't be born with a stupid prior."

comment by Psy-Kosh · 2008-07-08T15:08:49.000Z · LW(p) · GW(p)

Hrm... can't one at least go one step down past Occam's razor? I.e., doesn't that more or less directly follow from P(A&B) <= P(A)?

Replies from: Quarkster
comment by Quarkster · 2009-11-30T23:15:28.445Z · LW(p) · GW(p)

No, because you can't say anything about the relationship of P(A) in comparison to P(C|D)

Replies from: Psy-Kosh
comment by Psy-Kosh · 2009-12-01T00:35:09.881Z · LW(p) · GW(p)

Not sure why you were silently voted down into negatives here, but if I understand your meaning correctly, then you're basically saying this:

P(A)*P(B|A)

vs

P(C)

aren't automatically comparable because C, well, isn't A?

I'd then say "if C and A are in "similar terms"/level of complexity... ie, if the principle of indifference or whatever would lead you to assign equivalent probabilities to P(C) and P(A) (suppose, say, C = ~A and C and both have similar complexity), then you could apply it.

(or did I miss your meaning?)

Replies from: Quarkster, DanielLC
comment by Quarkster · 2009-12-01T01:00:42.125Z · LW(p) · GW(p)

You got my meaning. I have a bad habit of under-explaining things.

As far as the second part goes, I'm wary of the math. While I would imagine that your argument would tend to work out much of the time, it certainly isn't a proof, and Bayes' Theorem doesn't deal with the respective complexity of the canonical events A and B except to say that each is more probable individually than jointly. Issues of what is meant by the complexity of the events also arise. I suspect that if your assertion were easy to prove, then it would have been proven by now and mentioned in the main entry.

Thus, while Occam's razor may follow from Bayes' theorem in certain cases, I am far from satisfied that it does for all cases.

comment by DanielLC · 2011-09-03T02:43:37.642Z · LW(p) · GW(p)

if the principle of indifference or whatever would lead you to assign equivalent probabilities to P(C) and P(A)

How do you know that? Why must P(A) be a function of the complexity of A?

Also, this is only sufficient to yield a bound on Occam's razor. How do you know that the universe doesn't favor a given complexity?

Replies from: Psy-Kosh
comment by Psy-Kosh · 2011-09-05T14:31:44.373Z · LW(p) · GW(p)

Not a sole function of its complexity, but if A and B have the same complexity, and you have no further initial reason to place more belief in one or the other, then would you agree that you should assign P(A) = P(B)?

Replies from: DanielLC
comment by DanielLC · 2011-09-05T23:23:02.011Z · LW(p) · GW(p)

Complexity is a function of the hypothesis. Other functions can be made. In fact, complexity isn't even a specific function. What language are we using?

comment by Wes2 · 2008-07-08T15:56:17.000Z · LW(p) · GW(p)

"The important thing is to hold nothing back in your criticisms of how to criticize; nor should you regard the unavoidability of loopy justifications as a warrant of immunity from questioning."

This doctrine still leaves me wondering why this meta-level hermeneutic of suspicion should be exempt from its own rule. Or, if it is somehow not exempt, how is it a superior basis for knowledge when it obfuscates its own suspect status even as it discounts other modes of knowing? At least the blind-faith camp is transparent about its assumptions ("you just have to believe!"), whereas the rule outlined above seems more like a risk manager hawking the methodological rigor of his CDO risk-hedging strategy.

Aren't Nassim Taleb's observations regarding our evolution in Mediocristan (where inferences from small data sets were a recipe for success) and current existence in Extremistan (where the all-important fat tail means that even enormous data sets can lead you right into the jaws of disaster) a serious critique of the "playing to win" strategy? If winning all the small early battles makes it more difficult to win the pivotal battle, it might be better to take the losses and win where it counts. That is, playing to lose most of the time might be the best way to win big. Or, perhaps better said, exposing yourself to probable defeat might be the only way to win. Indeed, wasn't this Copernicus's method for finding truth? See Ch. 1 of Michael Polanyi's Personal Knowledge.

Replies from: Kenny
comment by Kenny · 2013-04-08T12:53:45.174Z · LW(p) · GW(p)

I think you're underestimating the importance of "winning all the small early battles"; I don't think there'd be anyone left to win "the pivotal battle" had we systematically exposed ourselves to probable defeat.

comment by Peter_Turney · 2008-07-08T16:06:24.000Z · LW(p) · GW(p)

And if you're allowed to end in something assumed-without-justification, then why aren't you allowed to assume anything without justification?

I address this question in Incremental Doubt. Briefly, the answer is that we use a background of assumptions in order to inspect a foreground belief that is the current focus of our attention. The foreground is justified (if possible) by referring to the background (and doing some experiments, using background tools to design and execute the experiments). There is a risk that incorrect background beliefs will "lock in" an incorrect foreground belief, but this process of "incremental doubt" will make progress if we can chop our beliefs up into relatively independent chunks and continuously expose various beliefs to focused doubt (one (or a few) belief(s) at a time).

This is exactly like biological evolution, which mutates a few genes at a time. There is a risk that genes will get "locked in" to a local optimum, and indeed this happens occasionally, but evolution usually finds a way to get over the hump.

Should I trust Occam's Razor? Well, how well does (any particular version of) Occam's Razor seem to work in practice?

This is the right question. A problem is that there is the informal concept of Occam's Razor and there are also several formalizations of Occam's Razor. The informal and formal versions should be carefully distinguished. Some researchers use the apparent success of the informal concept in daily life as an argument to support a particular formal concept in some computational task. This assumes that the particular formalization captures the essence of the informal concept, and it assumes that we can trust what introspection tells us about the success of the informal concept. I doubt both of these assumptions. The proper way to validate a particular formalization of Occam's Razor is to apply it to some computational task and evaluate its performance. Appeal to intuition is not a substitute for experiment.

At present, I start going around in a loop at the point where I explain, "I predict the future as though it will resemble the past on the simplest and most stable level of organization I can identify, because previously, this rule has usually worked to generate good results; and using the simple assumption of a simple universe, I can see why it generates good results; and I can even see how my brain might have evolved to be able to observe the universe with some degree of accuracy, if my observations are correct."

It seems to me that this quote, where it mentions "simple", must be talking about the informal concept of Occam's Razor. If so, then it seems reasonable to me. But formalizations of Occam's Razor still require experimental evidence.

The question is, what is the scope of the claims in this quote? Is the scope limited to how I should personally decide what to believe, or does it extend to what algorithms I should employ in my AI research? I am willing to apply my informal concept of Occam's Razor to my own thinking without further evidence (in fact, it seems that it isn't entirely under my control), but I require experimental evidence when, as a scientist, I use a particular formalization of Occam's Razor in an AI algorithm (if it seems important, given the focus of the research; is simplicity in the foreground or the background?).

Replies from: SecondWind
comment by SecondWind · 2013-04-27T19:00:19.919Z · LW(p) · GW(p)

By examining our cognitive pieces (techniques, beliefs, etc.) one at a time in light of the others, we check not for adherence of our map to the territory but rather for the map's self-consistency.

This would appear to be the best an algorithm can do from the inside. Self-consistent may not mean true, but it does mean it can't find anything wrong with itself. (Of course, if your algorithm relies on observational inputs, there should be a theoretical set of observations which would break its self-consistency and thus force further reflection.)

comment by Ian_C. · 2008-07-08T17:07:16.000Z · LW(p) · GW(p)

I think the degree to which you can know tomorrow will be like today is determined by how well you understand the object in question, not by the number of past times it's been that way.

For example you know your dog will bark and not quack tomorrow to the extent that you understand the workings of his voice box, not necessarily because he has barked the whole time you've known him (though that is evidence also). The more levels down you understand his voice box the surer you can be, because each level down eliminates more possibilities.

Replies from: Peterdjones
comment by Peterdjones · 2013-01-23T13:41:27.932Z · LW(p) · GW(p)

That all rests on some assumption such as "things do not spontaneously turn into other things".

comment by Unknown · 2008-07-08T17:22:36.000Z · LW(p) · GW(p)

I don't see why we shouldn't admit that the more times the sun rises, the more likely it is not to rise the next time... In general life insurance agencies assume that the more days you have lived, the more likely you are to die the next day.

comment by Arosophos (Lincoln_Cannon) · 2008-07-08T17:38:58.000Z · LW(p) · GW(p)

"So at the end of the day, I embrace the principle: 'Question your brain, question your intuitions, question your principles of rationality, using the full current force of your mind, and doing the best you can do at every point.'"

. . . to the extent that doing so increases your power, as illustrated by the principle you embrace to a greater extent:

"The point is to win."

That's the faith position.

"Everything, without exception, needs justification."

. . . except that toward which justification is aimed: power.

"The important thing is to hold nothing back in your criticisms of how to criticize . . ."

Yet you do, as illustrated by your power, which is actually that toward which you are applying full force, playing to win. Rationalism is great to the extent it is empowering. To the extent it weakens us, we abandon it.

Replies from: cogitoprime
comment by cogitoprime · 2021-07-01T20:31:49.798Z · LW(p) · GW(p)

“Where did I get it from? Was it by reason that I attained to the knowledge that I must love my neighbour and not throttle him? They told me so when I was a child, and I gladly believed it, because they told me what was already in my soul. But who discovered it? Not reason! Reason has discovered the struggle for existence and the law that I must throttle all those who hinder the satisfaction of my desires. That is the deduction reason makes. But the law of loving others could not be discovered by reason, because it is unreasonable.”

― Leo Tolstoy, Anna Karenina

Replies from: jeronimo196
comment by jeronimo196 · 2021-12-21T08:35:16.275Z · LW(p) · GW(p)

Tolstoy sounds ignorant of game theory - probably because he was dead when it was formulated.

Long story short, non-cooperating organisms regularly got throttled by cooperating ones, which is how we evolved to be cooperative.

comment by poke · 2008-07-08T18:01:14.000Z · LW(p) · GW(p)

I think the best way to display the sheer mind-boggling absurdity of the "problem of induction" is to consider that we have two laws: the first law is the law science gives us for the evolution of a system and the second law simply states that the first law holds until time t and then "something else" happens. The first law is a product of the scientific method and the second law conforms to our intuition of what could happen. What the problem of induction is actually saying is that imagination trumps science. That's ridiculous. It's apparently very hard for people to acknowledge that what they can conceive of happening holds no weight over the world.

The absurdity comes in earlier on though. You have to go way back to the very notion that science is mediated by human psychology; without that nobody would think their imagination portends the future. Let's say you have a robotic arm that snaps Lego pieces together. Is the way Lego pieces can snap together mediated by the control system of the robotic arm? No. You need the robotic arm (or something like it) to do the work but nothing about the robotic arm itself determines whether the work can be done. Science is just a more complex example of the robotic arm. Science requires an entity that can do the experiments and manipulate the equations but that does not mean that the experiments and equations are therefore somehow "mediated" by said entity. Nothing about human psychology is relevant to whether the science can be done.

You need to go taboo crazy, throw out "belief," "knowledge," "understanding," and the whole apparatus of philosophy of science. Think of it in completely physical terms. What science requires is a group of animals that are capable of fine-grained manipulation both of physical objects and of symbol systems. These animals must be able to coordinate their action, through sound or whatever, and have a means of long-term coordination, such as marks on paper. Taboo "meaning," "correspondence," etc. Science can be done in this situation. The entire history of science can be carried out by these entities under the right conditions given the right dynamics. There's no reason those dynamics have to include anything remotely resembling "belief" or "knowledge" in order to get the job done. They do the measurements, make the marks on a piece of paper that have, by convention, been agreed to stand for the measurements, and some other group can then use those measurements to make other measures, and so forth. They have best practices to minimize the effect of errors entering the system, sure, but none of this has anything to do with "belief."

The whole story about "belief" and "knowledge" that philosophy provides us is a story of justification against skepticism. But no scientist has reason to believe in the philosophical tale of skepticism. We're not stuck in our heads. That makes sense if you're Descartes, if you're a dualist and believe knowledge comes from a priori reasoning. If you're a scientist, we're just physical systems in a physical world, and there's no great barrier to be penetrated. Physically speaking, we're limited by the accuracy of our measurements and the scale of the Universe, but we're not limited by our psychology except by limitations it imposes on our ability to manipulate the world (which aren't different in kind from the size of our fingers or the amount of weight we can lift). Fortunately our immediate environment has provided the kind of technological feedback loop that's allowed us to overcome such limitations to a high degree.

Justification is a pseudo-problem because skepticism is a pseudo-problem. Nothing needs to be justified in the philosophical sense of the term. How errors enter the system and compound is an interesting problem, but beyond that, the line from an experiment to your sitting reading a paper 50 years later is an unbroken causal chain, and if you want to talk about "truth" and "justification" then, beyond particular this-worldly errors, there's nothing to discuss. There's no general project of justifying our beliefs about the world. This or that experiment can go wrong in this or that way. This or that channel of communication can be noisy. These are all finite problems and there's no insurmountable issue of recursion involved. There's no buck to be passed. There might be a general treatment of these issues (in terms of Bayes or whatever) but let's not confuse such practical concerns with the alleged philosophical problems. We can throw out the whole philosophical apparatus without loss; it doesn't solve any problems that it didn't create to begin with.

Replies from: Peterdjones, None, None
comment by Peterdjones · 2013-01-23T13:45:11.965Z · LW(p) · GW(p)

They do the measurements, make the marks on a piece of paper that have, by convention, been agreed to stand for the measurements, and some other group can then use those measurements to make other measures, and so forth. They have best practices to minimize the effect of errors entering the system, sure, but none of this has anything to do with "belief."

You seem to have a picture of science that consists of data-gathering. Once you bring in theories, you then have a situation where there are multiple theories, and some groups of scientists are exploring theory A rather than B... and that might as well be called belief.

comment by [deleted] · 2013-05-21T15:30:14.979Z · LW(p) · GW(p)

Nothing needs to be justified in the philosophical sense of the term.

I think justification is important, especially in matters like AI design, as an uFAI could destroy the world.

In the case of AI design in general, consider the question "Why should we program an AI with a prior biased towards simpler theories?" I don't think anyone would argue that a more detailed answer than "It's our best guess right now." would be desirable.

comment by [deleted] · 2013-05-21T15:32:26.081Z · LW(p) · GW(p)

Nothing needs to be justified in the philosophical sense of the term.

I think justification is important, especially in matters like AI design, as an uFAI could destroy the world.

In the case of AI design in general, consider the question "Why should we program an AI with a prior biased towards simpler theories?" I don't think anyone would just walk away from a more detailed answer than "It's our best guess right now.", if they were certain such an answer existed.

comment by Chris_Hibbert · 2008-07-08T18:31:06.000Z · LW(p) · GW(p)

Hurrah! Eliezer says that Bayesian reasoning bottoms out in Pan-Critical Rationalism.

re: "Why do you believe what you believe?"

I've always said that Epistemology isn't "the Science of Knowledge" as it's often called, instead it's the answer to the problem of "How do you decide what to believe?" I think the emphasis on process is more useful than your phrasing's focus on justification.

BTW, I don't disagree with your stress on Bayesian reasoning as the process for figuring out what's true in the world. But Bartley really did successfully provide the foundation for rational analysis. When you want to figure out how to think successfully, you should use all the tools at your disposal (pan-critical) because at that point, you shouldn't be taking anything for granted.

@Wes: "This doctrine still leaves me wondering why this meta-level hermeneutic of suspicion should be exempt from its own rule." It's not exempt. Read "The Retreat to Commitment" by W. W. Bartley III. There's a substantial section in which Bartley presents the best arguments he can find against Popper's Epistemology (and WWB's fix to it) and shows how the criticisms come up short. Considering your opponent's best arguments is an important part of the process.

@Peter Turney: I like your description of "incremental doubt" because it illustrates how Bartley was saying that none of your beliefs has to be foundational. You should examine each of them in turn, but you have to find a different place to stand for each of those investigations.

comment by Jay3 · 2008-07-08T22:31:14.000Z · LW(p) · GW(p)

To understand the problem of induction simply think of organism X. Organism X is snatched from the wild and put in a new environment. On the first day in the new environment some strange compound is put in a small dish in the corner. Organism X eats it, simply because organism X is hungry. The second day, to organism X's surprise, the dish is refilled. The dish is refilled for the next 100 days. Organism X's confidence that tomorrow it will be fed is at an all time high by the 102nd day, but the very next day is November 18th. At the height of organism X's confidence that their single variable model is infallible is the exact moment organism X is slaughtered so that I can have an enjoyable turkey dinner for Thanksgiving.

comment by James · 2008-07-08T22:40:04.000Z · LW(p) · GW(p)

Please clarify "I do think that reflective loops have a meta-character which should enable one to distinguish them, by common sense, from circular logics."

What physical configuration of the universe would refute this?

comment by Caledonian2 · 2008-07-08T22:44:52.000Z · LW(p) · GW(p)

At the height of organism X's confidence that their single variable model is infallible is the exact moment organism X is slaughtered so that I can have an enjoyable turkey dinner for Thanksgiving.
Yes... so?

How might this organism deduce that it was going to be killed on that day from the data available to it?

comment by cole_porter · 2008-07-08T22:57:31.000Z · LW(p) · GW(p)

"I don't see how such a mind could possibly do anything that we consider mind-like, in practice."

This is a fabulous way of putting it. "In practice" may even be too strong a caveat.

comment by Fabio_Franco · 2008-07-09T01:46:57.000Z · LW(p) · GW(p)

Russell Kirk's critique of this blog entry: "This view of the human condition has been called - by C. S. Lewis, in particular - reductionism: it reduces human beings almost to mindlessness; it denies the existence of the soul. Reductionism has become almost an ideology. It is scientistic, but not scientific: for it is a far cry from the understanding of matter and energy that one finds in the addresses of Nobel prize winners in physics, say. ... "What ails modern civilization? Fundamentally, our society's affliction is the decay of religious belief. If a culture is to survive and flourish, it must not be severed from the religious vision out of which it arose. The high necessity of reflective men and women, then, is to labor for the restoration of religious teachings as a credible body of doctrine." From "Civilization without religion?" http://theroadtoemmaus.org/RdLb/21PbAr/Hst/Civ&~relig.htm

comment by JulianMorrison · 2008-07-09T01:55:13.000Z · LW(p) · GW(p)

Cole: symmetry of problem space is implied by "no free lunch". So, an optimizer that works in problem volume X should have a dual pessimizer that works just as well in anti-X.

Minds are a subset of optimizers.

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-07-09T02:18:00.000Z · LW(p) · GW(p)

A good way of putting it, Julian. Anti-Occamian anti-Laplacian minds perform well in anti-Occamian anti-Laplacian universes!

...though, I'm not really sure what happens when they try to reflect on themselves, or for that matter, how you build a mind out of anti-Occamian materials. The notion of an anti-regular universe may be consistent to first order, but not to second order. Is it regularly anti-regular, or anti-regularly anti-regular?

comment by RobinHanson · 2008-07-09T02:54:38.000Z · LW(p) · GW(p)

I'd actually like to see one of these supposed anti-Occam or anti-regular priors described in full detail - I'm not sure the concept is coherent.

comment by Unknown · 2008-07-09T03:12:30.000Z · LW(p) · GW(p)

In fact, an anti-Occam prior is impossible. As I've mentioned before, as long as you're talking about anything that has any remote resemblance to something we might call simplicity, things can decrease in simplicity indefinitely, but there is a limit to increase. In other words, you can only get so simple, but you can always get more complicated. So if you assign a one-to-one correspondence between the natural numbers and potential claims, it follows of necessity that as the natural numbers go to infinity, the complexity of the corresponding claims goes to infinity as well. And if you assign a probability to each claim, while making your probabilities sum to 1, then the probability of the more and more complex claims will go to 0 in the limit.

In other words, Occam's Razor is a logical necessity.
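
A rough numerical sketch of this argument, in Python (the complexity assignments and weights are purely illustrative): a proper prior over an enumeration of hypotheses must have vanishing tail probabilities, whereas a weighting that grows with complexity has diverging partial sums and so can only be normalized over a bounded, finite piece of the hypothesis space.

```python
from fractions import Fraction

# Illustrative setup: hypothesis H_n has complexity n, so complexity grows
# without bound along the enumeration.

# An Occam-flavored prior P(H_n) proportional to 2^-n is proper:
# partial sums approach 1, and the tail probabilities shrink toward 0.
occam = [Fraction(1, 2**n) for n in range(1, 21)]
print(float(sum(occam)))   # ~0.999999, converging to 1
print(float(occam[-1]))    # last term already below 1e-6

# An "anti-Occam" weighting proportional to complexity, weight(H_n) = n,
# has diverging partial sums, so it cannot be normalized over the whole
# unbounded enumeration, only over a finite piece of it.
for n in (10, 100, 1000):
    print(n, sum(range(1, n + 1)))   # 55, 5050, 500500: grows without bound
```

This is also why, as a later comment notes, an anti-Occamian weighting can at best be treated as an improper prior and normalized over whatever finite set of hypotheses is actually under consideration.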

Replies from: aspera
comment by aspera · 2012-11-23T05:27:11.926Z · LW(p) · GW(p)

I think it would be possible to have an anti-Occam prior if the total complexity of the universe is bounded.

Suppose we list integers according to an unknown rule, and we favor rules with high complexity. Given the problem statement, we should take an anti-Occam prior to determine the rule given the list of integers. It doesn't diverge because the list has finite length, so the complexity is bounded.

Scaling up, the universe presumably has a finite number of possible configurations given any prior information. If we additionally had information that led us to take an Anti-Occam prior, it would not diverge.

comment by anon16 · 2008-07-09T03:39:02.000Z · LW(p) · GW(p)

Eliezer, I want to read more about design spaces. Is this a common term in computer science? Do you remember where you picked it up?

comment by Douglas_Knight3 · 2008-07-09T04:23:33.000Z · LW(p) · GW(p)

It seems to me that playing to win requires an implicit assumption that it is possible to win, and this assumes that there is structure out there, a very weak form of Occam's razor.

comment by Peter_Turney · 2008-07-09T04:33:17.000Z · LW(p) · GW(p)

In fact, an anti-Occam prior is impossible.

Unknown, your argument amounts to this: Assume we have a countable set of hypotheses. Assume we have a complexity measure such that, for any given level of complexity, there are a finite number of hypotheses that are below the given level of complexity. Take any ordering of the set of hypotheses. As we go through the hypotheses according to the ordering, the complexity of the hypotheses must increase. This is true, but not very interesting, and not relevant to Occam's Razor.

In this framework, a natural way to state Occam's Razor is, if one of the hypotheses is true and the others are false, then you should rank the hypotheses in order of monotonically increasing complexity and test them in that order; you will find the true hypothesis earlier in such a ranking than in other rankings in which more complex hypotheses are frequently tested before simpler hypotheses. When you state it this way, it is clear that Occam's Razor is contingent on the environment; it is not necessarily true.

If you define Occam's Razor in such a way that all orderings of the hypotheses are Occamian, then the "razor" is not "cutting" anything. If you don't narrow down to a particular ordering or set of orderings, then you are not making a decision; given two hypotheses, you have no way of choosing between them.
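
A small simulation sketch of this reading, in Python; the two environment distributions below are invented for illustration. Testing hypotheses simplest-first only pays off when the environment tends to make simpler hypotheses true, which is exactly the contingency being pointed at.

```python
import random

def tests_needed(order, true_h):
    """How many hypotheses get tested before the true one is found."""
    return order.index(true_h) + 1

def average_search_cost(env_sampler, n_hyp=100, trials=10_000, seed=0):
    """Average number of tests under a simplest-first ordering vs. a random ordering."""
    rng = random.Random(seed)
    occam_order = list(range(n_hyp))          # index = complexity rank, simplest first
    occam_cost = random_cost = 0
    for _ in range(trials):
        true_h = env_sampler(rng, n_hyp)      # which hypothesis is true this round
        shuffled = occam_order[:]
        rng.shuffle(shuffled)
        occam_cost += tests_needed(occam_order, true_h)
        random_cost += tests_needed(shuffled, true_h)
    return occam_cost / trials, random_cost / trials

# An environment biased toward simple hypotheses (roughly geometric):
simple_env = lambda rng, n: min(int(rng.expovariate(0.1)), n - 1)
# An environment indifferent to complexity (uniform):
uniform_env = lambda rng, n: rng.randrange(n)

print(average_search_cost(simple_env))    # simplest-first wins by a wide margin
print(average_search_cost(uniform_env))   # no advantage on average
```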

comment by Rudd-O · 2008-07-09T04:58:30.000Z · LW(p) · GW(p)

I'd actually like to see one of these supposed anti-Occam or anti-regular priors described in full detail - I'm not sure the concept is coherent.

That's easy. Go to a casino and watch the average moron play roulette using a "strategy".

Also, Russell Kirk is a moron too. I love how his text is full of nothingness, or, as Eliezer says, "applause lights". Loved that expression, by the way.

comment by Unknown · 2008-07-09T04:58:53.000Z · LW(p) · GW(p)

Peter Turney: yes, I define Occam's Razor in such a way that all orderings of the hypotheses are Occamian.

The razor still cuts, because in real life, a person must choose some particular ordering of the hypotheses. And once he has done this, the true hypothesis must fall relatively early in the series, namely after a finite number of other hypotheses, and before an infinite number of other hypotheses. The razor cuts away this infinite number of hypotheses and leaves a finite number.

comment by Peter_Turney · 2008-07-09T05:07:45.000Z · LW(p) · GW(p)

The razor still cuts, because in real life, a person must choose some particular ordering of the hypotheses.

Unknown, you have removed all meaning from Occam's Razor. The way you define it, it is impossible not to use Occam's Razor. When somebody says to you, "You should use Occam's Razor," you hear them saying "A is A".

comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-07-09T05:08:22.000Z · LW(p) · GW(p)

An anti-Laplacian prior is defined in the obvious way; if you've observed R red balls and W white balls, assign probability (W + 1) / (R + W + 2) of seeing a red ball on the next round.

An anti-Occamian prior is more difficult, for essentially the reasons Unknown states; but let's not forget that, in real life, Occam priors are technically uncomputable because you can't consider all possible simple computations. So if you only consider a finite number of possibilities, you can have an improper prior that assigns greater probability to more complex explanations, and then normalize with whatever explanations you're actually considering.
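
A minimal sketch of the two prediction rules in Python, using exactly the formula above for the anti-Laplacian case (the function names are mine); it also reproduces the sequence probabilities worked out in the reply below.

```python
from fractions import Fraction

def laplace(red, white):
    """Laplace's rule: P(next ball is red) after seeing `red` reds and `white` whites."""
    return Fraction(red + 1, red + white + 2)

def anti_laplace(red, white):
    """The anti-Laplacian rule defined above: the more reds seen, the less probable the next red."""
    return Fraction(white + 1, red + white + 2)

def sequence_probability(seq, rule):
    """Probability the rule assigns to an entire sequence like 'WRW'."""
    p = Fraction(1)
    red = white = 0
    for ball in seq:
        p_red = rule(red, white)
        p *= p_red if ball == "R" else 1 - p_red
        red += ball == "R"
        white += ball == "W"
    return p

for seq in ("WRW", "WWR"):
    print(seq, sequence_probability(seq, anti_laplace),
          sequence_probability(seq, laplace))
# anti-Laplacian: WRW -> 1/6, WWR -> 1/8  (order matters: not exchangeable)
# Laplacian:      WRW -> 1/12, WWR -> 1/12
```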

Replies from: Will_Sawin
comment by Will_Sawin · 2011-03-28T23:43:28.333Z · LW(p) · GW(p)

WRW has probability 1/2 × 2/3 × 1/2 = 1/6.

WWR has probability 1/2 × 1/3 × 3/4 = 1/8.

This is incoherent if you require that the probability of different permutations of the same sequence be the same. An anti-Laplacian urn must necessarily be finite.

On the other hand, Laplace assigns both a probability of 1/12.

comment by RobinHanson · 2008-07-09T05:16:04.000Z · LW(p) · GW(p)

By "full detail" I meant a prior over all possible states, not just over the next observation. The most disturbing prior I think is one that makes all observations independent of each other. All other priors allow some predictions of the future from the past - and given any particular prior we naturally find a way to call its favored states "simpler." I doubt the word "simple" has a meaning pinned down independently of that.

comment by Ian_C. · 2008-07-09T08:26:56.000Z · LW(p) · GW(p)

Surely the ultimate grounding for any argument has to be the facts of reality? Every chain of reasoning must end not by looping back on itself but by a hand pointing at something in reality.

comment by Kalle_Mikkola · 2008-07-09T08:54:55.000Z · LW(p) · GW(p)

A. It should be mentioned that this "Induction Problem" ("Why would things work in the future as in the past, more probably than in some other way") (or actually the criticism of the Induction Hypothesis) is due to the Scottish Enlightenment philosopher and liberal David Hume. http://en.wikipedia.org/wiki/David_Hume#Problem_of_induction

B. Why do our brains trust in Occam's Razor or in induction (that things work in the future as in the past, ...)? Because the universe behaved that way in the past, so most brains working in another way were eliminated in natural selection. So our belief is not a piece of evidence, just a repetition of the fact (?) that IN THE PAST Occam's Razor worked, i.e., that the "natural laws" etc. did not change (terribly much).

Yet I believe in the Induction Hypothesis, although I cannot give a rational justification. Perhaps just because by making no assumptions I could draw no conclusions, so I want to make one, and the Induction Hypothesis is the most appealing one to my brain.

Without assuming something you cannot conclude anything, you cannot know anything (not even that something would be more probably than something else).

Similarly, TO HAVE ANY REASON TO DO ANYTHING, EVER, I think that I have to make the following three assumptions (or something "less plausible"), and that more or less everyone does make them more or less always:

  1. Something matters. (E.g., at least in one situation, there is an alternative that is more right/good/ethical than some other alternative. Why would it be so? Because there is a god who says so? But why would that make it truly matter?)
  2. I can affect it. (E.g., the world and particularly I am not fully deterministic).
  3. By trying to do what is right, I more likely do something in that direction than something in the opposite direction. (E.g., I can somehow conclude something about what is more likely to be right than something else.)

Unless I make all those three assumptions (or something else, more complex and less plausible, I think), there is no point for me to do anything.

So before even starting to think of something, I should assume that I am in one of those possible universes where those three assumptions hold true. (Unless it is more probable that I'm in one where the opposite is true - in that case I should try to avoid what I believe being right...)

I prefer to make the induction hypothesis too, because, otherwise "3." above remains terribly weak. So do the others seem to do too. As also everyone seems to always make the hypotheses 1. - 3. (and the Induction Hypothesis), more or less consciously, the hypotheses 1. - 3. could be taken as the axioms of ethics.

Moreover, they must even be strengthened: one should make some kind of assumption on how to get info on (a) what is right (the "most difficult part"; this cannot be concluded directly from 1-3, though they are of some help; I refer to answers such as pleasure (utilitarianism), liberty (libertarianism), following the Ten Commandments, doing what feels right, "nothing else matters except that the purpose of life is to kick stones as many times as possible", or something else), and (b) how to affect it (in (b) the Induction Hypothesis does most of the work: if I believe in it, I can think about how to kick stones and how to keep myself alive in order to do that for a long time).

comment by JamesAndrix · 2008-07-09T16:19:15.000Z · LW(p) · GW(p)

Lincoln Cannon: Aiming for power is just as arbitrary as aiming for having many descendants. Capability is important, because it's an objective measure. But power as an abstract? Power for what? Is a millionaire more powerful than a charismatic person? Each can do things the other cannot. There are many capabilities that will help you with lots of goals, but even a super entity is going to have to make trade-offs, and it can't decide to simply go for more power, because each possible investment will yield the most power only in certain situations (and for certain goal sets).

Also, how does one identify where 'you' and 'your power' ends and the universe begins?

Kalle Mikkola: Moreover, they must even be strengthened: one should make some kind of assumption on how to get info on

If one did not have assumption (a), would it be reasonable to set getting that information as a goal?

comment by Court · 2008-07-10T08:38:38.000Z · LW(p) · GW(p)

I was wondering when someone would mention Hume. Instructive to note Hume's 'solution' to the Problem of Induction: behave as if there were no such problem. Practically speaking, in actual life, it is impossible to do otherwise. As it relates to this discussion, it seems to anticipate Eliezer's point that there are no 'mysterious' answers to be found. Everything will be as it was, once we have found what we are looking for.

Hume also recommended billiards, backgammon, and dining with friends. Sound advice, indeed.

comment by David_J._Balan · 2008-07-13T02:34:29.000Z · LW(p) · GW(p)

It's clear that there are some questions to which there are no fully satisfactory answers (and likely never will be), and it is also clear that there is nothing to be done about it but to soldier on and do the best you can (see http://www.overcomingbias.com/2007/05/doubting_thomas.html). However, there really are at least a few things that pretty much have to be assumed without further examination. I have in mind the basic moral axioms, like the principle that the other guy's welfare is something that you should be at all concerned about in the first place.

comment by Quarkster · 2009-11-30T23:58:09.867Z · LW(p) · GW(p)

I don't see how it would be possible to draw conclusions about the future without assuming that the future will be like the past on the most stable level of organization that can be identified. If this assumption allows us to draw conclusions where they could not otherwise be drawn, isn't that winning?

comment by AnthonyC · 2011-03-28T17:38:49.902Z · LW(p) · GW(p)

""If I had not done among them the works which none other man did, they had not had sin: but now have they both seen and hated both me and my Father"

Even the Bible doesn't demand blind faith.

comment by Arandur · 2011-08-08T20:24:24.310Z · LW(p) · GW(p)

Huh. If you believe that the timespan of this Earth is finite, which you probably should if you are a Christian, then does that mean that, according to that prior, your confidence in the sun rising tomorrow should, in fact, be decreasing with each passing day? o.o And does this mean that every time the sun rises, you ought to decrease your confidence in that prior, which should then carry less weight in your estimate of the sun's likelihood of rising.....

Seems like a convergent series to me, but I'd like to see someone else better-versed in the math than me work it out.

Replies from: shminux
comment by Shmi (shminux) · 2011-08-08T20:47:07.904Z · LW(p) · GW(p)

Not necessarily: a memoryless process (e.g., the same odds of the second coming happening today as yesterday) follows the exponential distribution. It has a finite expected value even though it has no memory. Radioactive decay is a standard example.
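
To make the memoryless property concrete, here is a minimal sketch in Python; the 0.01-per-day rate is purely illustrative and not anyone's estimate of anything.

```python
import math

rate = 0.01  # illustrative hazard per day; any positive value makes the same point

def survival(t: float) -> float:
    """P(T > t) for an exponentially distributed waiting time T with the given rate."""
    return math.exp(-rate * t)

# Memorylessness: P(T > s + t | T > s) = P(T > t), no matter how large s is.
s, t = 1000.0, 30.0
print(survival(s + t) / survival(s))  # equals exp(-rate * t)
print(survival(t))                    # same value: having already waited s days tells you nothing

# Yet the expected waiting time is finite: E[T] = 1 / rate days.
print(1 / rate)
```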

Replies from: Arandur
comment by Arandur · 2011-08-08T21:16:04.041Z · LW(p) · GW(p)

Oh, I see. :3 Thanks.

comment by Sblast · 2011-08-26T19:46:06.050Z · LW(p) · GW(p)

"reflecting on your mind using your mind" Can Glasses look at their self? No. What you ought to mean is that you assume other minds are similar to your own.

comment by kilobug · 2011-09-26T08:53:30.340Z · LW(p) · GW(p)

To me this is also greatly linked with the "belief in belief" theme: most people who claim that Occam's Razor doesn't hold, who question using rationality on philosophical questions, do use it in daily life. When they see the cat enter a room, hear a noise, and see a broken glass, they assume it's the cat that broke the glass, not aliens or angels who teleported into the room, broke the glass, and then disappeared. When they feel hungry, they open the fridge, take out some food, and eat it, assuming the food will be in the fridge "like before", and that eating will make them feel less hungry, "like before". And when asked if the sun will rise tomorrow, they'll say "yes".

So in fact, they do believe in Occam's Razor. They do believe that the future will likely look like the past. They do apply modus ponens. They do assume "reality" to exist and to be lawful. But they don't believe they have that belief. Instead, they believe they believe in god, alien abduction, or the whole universe being created by their own mind. And it's not that easy to make them face the fact that they are mixing up beliefs and beliefs in beliefs.

comment by buybuydandavis · 2011-09-27T23:22:24.886Z · LW(p) · GW(p)

But for now... what's the alternative to saying, "I'm going to believe that the future will be like the past on the most stable level of organization I can identify, because that's previously worked better for me than any other algorithm I've tried"?

Is it saying, "I'm going to believe that the future will not be like the past, because that algorithm has always failed before"?

To me, this is the point: "what's my alternative?"

A principle I got from a Stephen Donaldson novel applies to "the future will be like the past". The guy needed to find a bit of sabotage in a computer system. He had no expertise in software - or hardware, for that matter. But he needed to find the problem, or he would be dead.

The character got the principle he needed from bridge. In bridge, sometimes you're screwed unless your partner has the card you need him to have. So the play is to assume your partner has the card, and play accordingly, because if he doesn't, you're screwed anyway.

Assume you can win. Assume that everything necessary for you to win is true. If it isn't, you're screwed anyway.

If the future isn't like the past, how am I to know what ideas to rely on to take effective action? If I can't say "it worked before, so it will likely work tomorrow", it seems to me that I am screwed.

Should I believe that the future will be like the past, except on Tuesdays? Wednesdays? Except when it conflicts with statements in an arbitrarily selected book? From an arbitrarily selected person? From the first person I saw after I woke up 2343 days ago?

But if the future won't be like the past, I don't see any grounds for picking a solution, or the means for picking one. And even if one of these solutions works now, there wouldn't be any reason to think it would work later. In short, I'd be screwed. So I may as well believe the future will be like the past.

Assume winning is possible. I don't see how it is possible without the future being like the past, so I'm going to assume it will be.

comment by Yosarian2 · 2012-12-31T18:55:32.504Z · LW(p) · GW(p)

Well, if you wanted to actually test Occam's razor in a scientific way, you would have to test it against an alternate hypothesis and see which one gave better predictions, wouldn't you?

So how about this as an alternate hypothesis:

"Occam's Razor has no objective truth value; there is no fundamental reason that the truth is more likely to be a simpler explanation. It only SEEMS like Occam's Razor is true because it is exponentially harder to find a valid explanation in a larger truth-space, so usually when we do manage to find a valid explanation for something, it is a simple explanation. But that is merely a question of the map, and of finding a specific spot on the map, not of the territory itself."

What kind of experiment would you set up to differentiate that possibility from Occam's Razor being correct?
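
If it helps to pin down the "exponentially harder" part of that alternate hypothesis, here is a minimal counting sketch; it models candidate explanations as raw binary descriptions, which is an idealization for illustration only.

```python
# Toy count of how fast a "truth-space" grows with description length.

def hypotheses_up_to(length_bits: int) -> int:
    """Number of distinct binary descriptions of length 1..length_bits."""
    return sum(2 ** n for n in range(1, length_bits + 1))

for bits in (10, 20, 30, 40):
    print(bits, hypotheses_up_to(bits))

# Every extra 10 bits multiplies the space by roughly 1024, so any search that
# checks candidates one at a time, shortest first, pays exponentially more to
# reach the longer explanations -- the sense in which it is "exponentially
# harder to find a valid explanation in a larger truth-space".
```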

Replies from: TheOtherDave, BerryPick6
comment by TheOtherDave · 2012-12-31T21:06:26.082Z · LW(p) · GW(p)

Can you summarize the articulation of Occam's Razor that this conflicts with? Because I don't normally think of OR as asserting anything about fundamental reasons, merely about reliable strategies... and your hypothesis agrees about reliable strategies.

comment by BerryPick6 · 2012-12-31T21:20:22.165Z · LW(p) · GW(p)

Does the alternate hypothesis even give us different results? What would a world in which the second hypothesis was true look like compared to one in which Occam's Razor holds?

Replies from: Yosarian2
comment by Yosarian2 · 2012-12-31T22:25:20.200Z · LW(p) · GW(p)

Well, let's see. If you take an unsolved scientific question and form the simplest possible hypothesis that fits the known facts, how often does that hypothesis turn out to be true, and how often is a more complex answer actually true?

I'm sure we can all think of examples and counterexamples (Newton's theories are a much simpler fit for the facts he knew than relativity, but they turned out not to be true), but you would probably have to take a large sample of scientific problems.

I would think that Occam's razor would turn out to be correct most of the time in a statistical analysis, but it seems like a testable hypothesis, at least on the set of problems human scientists have solved.

Replies from: TheOtherDave
comment by TheOtherDave · 2012-12-31T22:46:28.524Z · LW(p) · GW(p)

Sorry, I still don't get it.

Suppose we somehow do this study, and we find that N% of the time the "simplest possible fit given the known facts" is true, and (1-N)% of the time it isn't. For what range of Ns would you conclude that Occam's Razor is correct, and for what range of Ns would you conclude that your alternative hypothesis is instead correct?

Replies from: Yosarian2
comment by Yosarian2 · 2012-12-31T23:20:26.903Z · LW(p) · GW(p)

I will admit that I'm struggling a bit here, because I'm having trouble coming up with a coherent mental picture of what a legitimate alternate hypothesis to Occam's razor would actually look like.

In fact, if you take my hypothesis to be true, then Occam's razor would still fundamentally hold, at least in the simplest form of "a less complicated theory is more likely to be true than a more complicated one", since if "theory-space A" is smaller than "theory-space B", then any given point in theory-space A is more likely to be true than any given point in theory-space B, even if the answer has an equal chance of being in space A as of being in space B. So I think my original hypothesis actually itself reduces to Occam's Razor.

I think this is where I just say oops and drop this whole train of thought.

Replies from: Qiaochu_Yuan, TheOtherDave
comment by Qiaochu_Yuan · 2012-12-31T23:24:15.218Z · LW(p) · GW(p)

Here's one. The universe is a particularly perverse simulation, largely controlled by a sequence of pseudorandom number generators. This sequence of PRNGs gets steadily more and more Kolmogorov-complicated (the superbeings that run us love complicated forms of torture), so even if we figured out how a given one worked, the next one would already be in play, and it would be totally unrelated, so we'd have to start all over. Occam's razor fails badly in such a universe because the explanation for any particular thing happening gets more complicated over time.

In other words, Quirrell-whistling writ large.
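
A toy sketch of that scenario, for concreteness; it captures only the "next generator is unrelated, start all over" part, not the steadily increasing Kolmogorov complexity, and the epoch structure below is an assumption of the toy, not of the original comment.

```python
import random

def epoch_observations(epoch: int, n: int = 8) -> list:
    """Observations for one epoch; a fresh seed stands in for 'an unrelated generator per epoch'."""
    rng = random.Random(epoch)
    return [rng.randint(0, 9) for _ in range(n)]

past = epoch_observations(0)
future = epoch_observations(1)
print(past)
print(future)  # memorizing `past` perfectly would not have helped predict this
```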

Replies from: BerryPick6
comment by BerryPick6 · 2013-01-01T05:13:48.403Z · LW(p) · GW(p)

I guess we could test this one by looking at successful explanations over time and seeing whether their complexity increases at a steady rate? Then again, I can already find two or three holes in that test...

Hmm. This is a tricky one.

comment by TheOtherDave · 2013-01-01T00:22:54.024Z · LW(p) · GW(p)

So I think my original hypothesis actually itself reduces to Occam's Razor.

Yeah, that's what I think too.

Presumably, what I'd expect to see if Occam's Razor is an unreliable guideline is that when I'm choosing between two explanations, one of which is more complex for a consistent and coherent definition of complexity, it turns out that the simpler explanation is often incorrect.

comment by [deleted] · 2013-10-10T03:41:18.468Z · LW(p) · GW(p)

When philosopher Susan Haack wrote "Evidence and Inquiry" back in 1995, she really hit the nail on the head on this one. I'll share an extensive quotation from her, and then I'll make a couple remarks:

The observation that people have many beliefs in which they are not, or not much, justified...hints, though it doesn't say explicitly, that people also have beliefs in which they are justified. And it is a legitimate question, certainly, what reasons there are for even this degree of optimism. On this issue, it may be feasible to appeal to evolutionary considerations. As I observed in chapter 9, compared with other animals, human beings are not especially fast or strong; their forte is, rather, their greater cognitive capacity, their ability to represent the world to themselves and hence to predict and manipulate it....

Returning, now, to the main thread of the argument, let me repeat that my sights are set much lower than Descartes's; I have aspired only to give reasons for thinking that, if any truth-indication is available to us, satisfaction of the foundherentist criteria of justification is as good an indication of truth as we could have. Even this very significant lowering of aspirations, however, will not in itself constitute any reply to the more notorious difficulty with Descartes's enterprise: the vicious circle into which it is generally supposed Descartes got himself. Aren't my ratificatory arguments, however hedged, however modest in what they aspire to do, bound to be viciously circular?

I don't think so.

First: I have not offered an argument with the conclusion that the foundherentist criteria are truth-indicative, one of the premisses of which is that the foundherentist criteria are truth-indicative.

Second: nor have I (like those who hope for an inductive meta-justification of induction) used a certain method of inference or belief formation to arrive at the conclusion that that very method is a good, truth-conducive method....

This is probably not enough to allay all suspicions. 'Yes, but how do you know that the senses are a source of information about things in one's environment, that introspection is a source of information about one's own mental goings-on?', I may be asked, echoing the familiar challenge to Descartes, 'How do you know that God exists and is not a deceiver?' The question will be put, no doubt, in a tone which suggests that the only response available to me is, 'because my evidence satisfies the foundherentist criteria', echoing the anticipated answer from Descartes, 'because I clearly and distinctly perceive it to be true'. I shall put aside the question whether Descartes has any recourse against this challenge, and concentrate on my own defence. For simplicity, let 'R' abbreviate all the direct reasons I have offered in my ratificatory argument. The anticipated question, 'Yes, but how do you know that R?' is rhetorical, a challenge rather than a simple request for information, and it may be taken in either of two ways: (1) as a challenge to give my reasons for believing that R, or: (2) as a challenge to show that my reasons for believing that R are good enough for my belief to constitute knowledge. I cannot meet the second challenge without articulating my standards of evidence and showing that my evidence with respect to R satisfies them, and, at least arguably though not quite so obviously, without offering reassurance that my standards of evidence are truth-indicative; and if so, I cannot meet it, in the present context, without circularity. But I can meet the first challenge simply by giving my reasons for believing that R. And this is enough. My reasons are good reasons if they are independently secure and they genuinely support R; and I am justified in believing those reasons, and hence in believing that R, if my evidence for believing them is good evidence. And if I am justified in believing that R, then (assuming that R is true, and whatever else is needed to avoid Gettier paradoxes) I know that R. And if I do and if R (and the indirect reasons on which it depends) are good reasons for believing that the foundherentist criteria are truth-indicative, I know that, too. Even if I can't know that I am justified in my weakly ratificatory conclusion, I can be justified in it nonetheless; and even if I can't know that I know it, I can know it nonetheless.

My main remark is that Eliezer doesn't need to rely on induction to justify induction. The mere assumption that the universe has natural laws is enough. Given that, one can look at the universe timelessly, and say, "I see evidence of this natural law, and, given that the universe has such a natural law, this will happen tomorrow." Then we aren't predicting the future based on the past; we're predicting the future based on our conception of the natural laws of the universe, which just happens to arise from observations we've made in the past. This removes circularity, but requires an additional assumption (which, if false, implies skepticism, as far as I can see).

Of course, standards of evidence do need to be evaluated circularly, but as both Eliezer and Haack noted, the circularity is not vicious.

As for Occam's Razor, Haack doesn't mention it much, and I'm still working out my own thoughts on the problem. I think Kevin Kelly's work is promising, in spite of the fact that he rejects Bayesianism (and consequently, the reception he got on Less Wrong was mixed when he was previously mentioned). Of course, the proofs of the optimality of Solomonoff induction are probably really, really important here, too. I need to study this more; I'll post another comment (probably not for at least 6 months) giving my position after I've actually done enough research to be confident in a position.

Edit (over a year later): I no longer have plans to do further research on Occam's Razor. I will add that, though I find Kelly's work interesting, I don't think it is a panacea answering once-and-for-all why Occam's Razor works, or is justifiable.

Replies from: TheAncientGeek
comment by TheAncientGeek · 2016-06-12T11:45:32.937Z · LW(p) · GW(p)

My main remark is that Eliezer doesn't need to rely on induction to justify induction. The mere assumption that the universe has natural laws is enough. Given that, one can look at the universe timelessly, and say, "I see evidence of this natural law, and, given that the universe has such a natural law, this will happen tomorrow." Then we aren't predicting the future based on the past; we're predicting the future based on our conception of the natural laws of the universe, which just happens to arise from observations we've made in the past. This removes circularity, but requires an additional assumption (which, if false, implies skepticism, as far as I can see).

Well, if his evidence for the existence of natural laws is not itself based on induction, he escapes circularity.

No one knows what a natural law is, and no one has detected one by direct inspection. The popular answer, that they are "just descriptions", fails particularly badly if one is trying to demonstrate how one has avoided circularity.

PS: thanks for the Kelly link.

comment by WTFInstaBan · 2021-04-01T05:25:32.251Z · LW(p) · GW(p)

You have to choose your fights, and I choose to prioritize fights with systems that say, "Distrust me when I'm wrong." That's why I'm a Pastsafarian.

Should you trust a paper on which the only thing written is "You should trust everything written on this paper"?

The answer isn't "Yes" or "No"; it's "Mu".

It is the same as if the paper had nothing written on it at all.

You shouldn't ask whether something is true or false; you should ask yourself how you should change your beliefs given this new evidence.

If the paper is trustworthy, then it's proof that it's trustworthy.

If the paper is untrustworthy, then it's proof that it's untrustworthy.

No matter whether the paper is trustworthy or not, it won't change your opinion of it.

comment by Ian Televan · 2021-04-08T21:01:02.853Z · LW(p) · GW(p)

But is Occam's Razor really circular? The hypothesis "there is no pattern" is strictly simpler than "there is this particular pattern", for any value of 'this particular'. Occam's Razor may expect simplicity in the world, but it is not the simplest strategy itself.

Edit: I'm talking about the hypothesis itself, as a logical sequence of some kind, not about what the hypothesis asserts. What it asserts is maximum entropy, the most complex world.
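
One standard way to put that distinction, as an editorial gloss rather than the commenter's own notation: the description length of the "no pattern" hypothesis is small, even though the data it predicts are maximally complex. For n-bit observations,

$$K(\text{uniform on } \{0,1\}^n) = O(\log n), \qquad K(x) \gtrsim n \ \text{ for a typical sample } x,$$

so the hypothesis itself is simple in the sense Occam's Razor cares about, while what it asserts about the world is maximum-entropy.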

comment by Yoav Ravid · 2023-03-11T06:09:34.141Z · LW(p) · GW(p)

Elsewhere [LW(p) · GW(p)], @abramdemski [LW · GW] said that Eliezer implicitly employs a use/mention distinction in this post, which I found clarifying. 

Basically, Eliezer licenses using induction to justify "induction works" but not "induction works" to justify "induction works", the latter being circular, and the former being reflective. So you could argue "Induction worked in the past, therefore induction will work in the future" or even "induction worked, therefore induction works" (probabilistically), but not "Induction works, therefore induction works".

 Here's Eliezer applying the use/mention distinction explicitly in You Provably Can't Trust Yourself [LW · GW]:

You can have a system that trusts the PA framework explicitly,  as well as implicitly: that is PA+1.  But the new framework that PA+1 uses, makes no mention of itself; and the specific proofs that PA+1 demands, make no mention of trusting PA+1, only PA.  You might say that PA implicitly trusts PA, PA+1 explicitly trusts PA, and Self-PA trusts itself.

For everything that you believe, you should always find yourself able to say, "I believe because of [specific argument in framework F]", not "I believe because I believe".

Of course, this gets us into the +1 question of why you ought to trust or use framework F.  Human beings, not being formal systems, are too reflective to get away with being unable to think about the problem.  Got a superultimate framework U?  Why trust U?

And worse: as far as I can tell, using induction is what leads me to explicitly say that induction seems to often work, and my use of Occam's Razor is implicated in my explicit endorsement of Occam's Razor.  Despite my best efforts, I have been unable to prove that this is inconsistent, and I suspect it may be valid.

But it does seem that the distinction between using a framework and mentioning it, or between explicitly trusting a fixed framework F and trusting yourself, is at least important to unraveling foundational tangles—even if Löb turns out not to apply directly.

And Löb's Theorem [? · GW] indeed seems relevant once you phrase this idea as "If induction justifies "induction works" then "induction works" is justified".

This still doesn't formalize or give a clear cut distinction between valid circularity and invalid circularity, but I think it helps.
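
For reference, Löb's theorem in its usual form, with the caveat that reading "justified by induction" as the provability box is an analogy rather than a formalization:

$$\text{If } \ \mathrm{PA} \vdash \Box P \rightarrow P, \ \text{ then } \ \mathrm{PA} \vdash P; \qquad \text{internalized: } \ \mathrm{PA} \vdash \Box(\Box P \rightarrow P) \rightarrow \Box P.$$

Under that loose reading, a system that endorses "if P is justified then P" about its own justification operator thereby already endorses P, which is the shape of the worry about "induction works" justifying itself.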

Replies from: TAG
comment by TAG · 2023-12-13T19:00:15.887Z · LW(p) · GW(p)

If induction justifies "induction works" then "induction works" is justified.

Induction doesn't justify "induction works", unless "induction works" is justified.

Replies from: Yoav Ravid
comment by Yoav Ravid · 2023-12-14T07:09:50.633Z · LW(p) · GW(p)

Hmm... I can understand this two different ways, and I'm not sure which one you mean.

  1. Induction doesn't justify "induction works" unless "induction works" is justified by something else, otherwise it's circular.
  2. Induction doesn't justify "induction works" unless "induction works" is correct, therefore, if it justifies it then it's correct (basically a corollary of what I wrote).