# The Cartoon Guide to Löb's Theorem

post by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-08-17T20:35:45.000Z · LW · GW · Legacy · 103 commentsLo! A cartoon proof of Löb's Theorem!

Löb's Theorem shows that a mathematical system cannot assert its own soundness without becoming inconsistent. Marcello and I wanted to be able to see the truth of Löb's Theorem at a glance, so we doodled it out in the form of a cartoon. (An inability to trust assertions made by a proof system isomorphic to yourself, may be an issue for self-modifying AIs.)

It was while learning mathematical logic that I first learned to *rigorously* distinguish between X, the truth of X, the quotation of X, a proof of X, and a proof that X's quotation was provable.

The cartoon guide follows as an embedded Scribd document after the jump, or you can download as a PDF file. Afterward I offer a medium-hard puzzle to test your skill at drawing logical distinctions.

Cartoon Guide to Löb's ... by on Scribd

Cartoon Guide to Löb's Theorem - Upload a Document to Scribd

And now for your medium-hard puzzle:

The Deduction Theorem (look it up) states that whenever assuming a hypothesis H enables us to prove a formula F in classical logic, then (H->F) is a theorem in classical logic.

Let ◻Z stand for the proposition "Z is provable". Löb's Theorem shows that, whenever we have ((◻C)->C), we can prove C.

Applying the Deduction Theorem to Löb's Theorem gives us, for all C:

((◻C)->C)->C

However, those familiar with the logic of material implication will realize that:

(X->Y)->Y

implies

(not X)->Y

Applied to the above, this yields (not ◻C)->C.

That is, all statements which lack proofs are true.

I cannot prove that 2 = 1.

Therefore 2 = 1.

Can you *exactly* pinpoint the flaw?

## 103 comments

Comments sorted by oldest first, as this post is from before comment nesting was available (around 2009-02-27).

## comment by Tiiba2 · 2008-08-17T21:36:19.000Z · LW(p) · GW(p)

"""(X->Y)->Y implies (not X)->Y"""

The arrow means "implies", right?

So,

(Smoke implies fire, therefore fire) implies (no smoke means fire)?

I don't get it.

Replies from: Ender## ↑ comment by Ender · 2012-06-24T19:16:27.399Z · LW(p) · GW(p)

I think that this is what the theorem means;

If (X->Y) -> Y, then ~X -> Y (If it's true that "If it's true that 'if X is true, then Y is true,' then Y must be true," then Y must be true, even if X is not true).

This makes sense because the first line, "(X->Y) -> Y," can be true whether or not X is actually true. The fact that ~X -> Y if this is true is an overly specific example of that "The first line being true (regardless of the truth of X)" -> Y. It's actually worded kind of weirdly, unless "imply" means something different in Logicianese than it does in colloquial English; ~X isn't really "implying" Y, it's just irrelevant.

This doesn't mean that "(X -> Y) -> Y" is always true. I actually can't think of any intuitive situations where this could be true. It's not true that the fact that "if Jesus really had come back to life, Christians would be Less Wrong about stuff" implies that Christians would be Less Wrong about stuff even if Jesus really hadn't come back to life.

Also,

To anyone who wants to tell me I'm wrong about this; If I'm wrong about this, and you know because you've learned about this in a class, whereas I just worked this out for myself, I'd appreciate it if you told me and mentioned that you've learned about this somewhere and know more than I do. If logic is another one of those fields where people who know a lot about it HATE it when people who don't know much about it try to work stuff out for themselves (like Physics and AI), I'd definitely like to know so that I don't throw out wrong answers in the future. Thanks.

Replies from: lightlike, VHanson## ↑ comment by lightlike · 2013-05-15T03:43:25.545Z · LW(p) · GW(p)

"The fact that ~X -> Y if this is true is an overly specific example of that "The first line being true (regardless of the truth of X)" -> Y."

This is basically correct; if ~X then X -> Y is always true because X never has the opportunity to be true, in a sense.

## ↑ comment by VHanson · 2014-01-21T02:37:26.408Z · LW(p) · GW(p)

I did study stuff like this a LONG time ago, so I respect your trying to work this out from 'common sense'. The way I see it, the key to the puzzle is the truth-value of Y, not 'whether or not X is actually true'. By working out the truth-tables for the various implications, the statement "((◻C)->C)->C" has a False truth-value when both (◻C) and C are False, i.e. if C is both unprovable and false the statement is false. Even though the 'material implication' "(X->Y)->Y implies (not X)->Y" is a tautology (because when the 1st part is false the 2nd part is true & vice-versa) that does not guarantee the truth of the 2nd part alone. In fact, it is intuitively unsurprising that if C is false, the premise that 'C is provable' is also false (of course such intuition is logically unreliable, but it feels nice to me when it happens to be confirmed in a truth table). What may seem counterintuitive is that this is the only case for which the premise is True, but the encompassing implication is False. That's because a 'logical implication' is equivalent to stating that 'either the premise (1st part) is False or the conclusion (2nd part) is True (or both)'. So, for the entire statement "((◻C)->C)->C" to be True when C is False, means "((◻C)->C)" must be False. "((◻C)->C)" is only False when "(◻C)" is True and "C" is False. Here's where the anti-intuitive feature of material implication causes brain-freeze - that with a false premise any implication is assigned a True truth-value. But that does NOT mean that such an implication somehow forces the conclusion to be true! It only affirms that for any implication to be true, it must be the case that "IF the premise is true then the conclusion is true." If the premise is False, the conclusion may be True or False. If the premise is True and the conclusion is False, then the implication itself is False!

The translation of " (not ◻C)->C" as "all statements which lack proofs are true" is a ringer. I'd say the implication "If it is not the case that C is provable, then C is True" is equivalent to saying "Either C is provable or C is True or both." Both of these recognize that the case for which "C is provable" and "C is False" makes the implication itself False. The mistranslation erroneously eliminates consideration of the case in which C is both unprovable an False, which is (not coincidentally, I suspect) the one case for which the grand formulation "((◻C)->C)->C" is False.

Replies from: Ender## ↑ comment by Ender · 2014-02-05T23:04:20.817Z · LW(p) · GW(p)

Jeepers. I haven't thought about this problem for a long time. Thanks.

The answer that occurs to me for the original puzzle is that Yudkowsky never proved (◻(2 = 1) -> (2 = 1)). I don't know it that is actually the answer, but I really need to go do other work and stop thinking about this problem.

## comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-08-17T22:34:23.000Z · LW(p) · GW(p)

Sasha, why doesn't it follow from the Deduction Theorem?

Replies from: andrew-jacob-sauer## ↑ comment by Andrew Jacob Sauer (andrew-jacob-sauer) · 2019-09-15T21:31:03.819Z · LW(p) · GW(p)

Because assuming Provable(C)->C as a hypothesis doesn't allow you to prove C. Rather, the fact that a proof exists of Provable(C)->C allows you to construct a proof of C.

## comment by Joe2 · 2008-08-17T22:42:35.000Z · LW(p) · GW(p)

The confusing part is this:

(C is provable -> C) -> C

I can grok the part inside the parenthases - that's what we'd all like to believe; that this system can assure us that everything it proves is true. The weird step is that this statement itself implies that C is true - I guess we either take this on faith or read the Deduction Theorem.

What I find most strange, though, is the desire to have the proof system itself distinguish between provability and truth. Isn't that impossible? A formal system only provides proofs given axioms; to it, the only meaning of truth beyond that of provability is the behavior of truth within formal logic operations, like implication and negation. "Truth" as "believability" only happens in the brains of the humans who read the proof later.

So, I disagree with the misleading paraphrasing of Lob's proof: "Löb's Theorem shows that a mathematical system cannot assert its own soundness without becoming inconsistent."

This dereferences the word "truth" wrongly as "soundness," when truth's meaning within the system is behavior in formal logic operations. The paraphrasing should be "Lob's theorem shows that a consistent mathematical system cannot assert that theorems it proves will behave as true under the symbol manipulations of formal logic."

## comment by Peter_Woo · 2008-08-17T22:52:32.000Z · LW(p) · GW(p)

Tiiba: "(Smoke implies fire, therefore fire) implies (no smoke means fire)?" "therefore fire" should be left out. i) (Smoke implies fire) implies fire. ii) No smoke implies fire.

You're confused because i) is false.

Medium puzzle:

"Applying the Deduction Theorem to Löb's Theorem gives us, for all C:

```
((◻C)->C)->C"
```

Wouldn't it be: ((?C) -> C) -> ?C If the provability of C implies C's truth, then C is provable.

In other words, where you write "(X->Y)->Y", it should be "(X -> Y) -> X".

## comment by Brian_Jaress2 · 2008-08-18T00:28:18.000Z · LW(p) · GW(p)

I think the error is that you didn't prove it was unprovable -- all provably unprovable statements are also provable, but unprovable statements aren't necessarily true.

In other words, I think what you get from the Deduction Theorem (which I've never seen before, so I may have it wrong) is Provable((Provable(C) -> C) -> C). I think if you want to "reach inside" that outer Provable and negate the provability of C, you have to introduce Provable(not Provable(C)).

## comment by Jordan · 2008-08-18T01:09:57.000Z · LW(p) · GW(p)

So, I was thinking about the misapplication of the deduction theorem and suddenly had an insight into friendly AI. (Stop me if this has been discussed =D )

The problem is you give the AI a set of 'axioms' or goals and it might go off and tile the universe. Obviously we don't know the full consequences of the logical system we're initiating otherwise we wouldn't need the AI in the first place.

The problem already exists in humans. Evolution gave us 'axioms' or desires. Some people DO go off and tile their universe: drugs, sex or work addictions, etc, etc. Thus my insight stems from the lack of drug addictions in most people. Here are my two proposed solutions:

(1) Fear. Many people don't do drugs not because of a lack of desire to feel good, but because they are scared. People are likewise scared of any large changes (moving, new job, end of a relationship). Now, we don't need the AI to favor status quo, however we can simply code into ones of its axioms that large physical changes are bad. Scale this exponentially (ie, twice the physical changes SQUARES the negative weighting of the action). Do not have any positive weighting criteria on other goals that scale faster than a polynomial.

(2) Life. Many people don't tile their universe because they have too many different things they'd like to tile it with. Hobbies, friends, lovers, variation, etc. Give the AI a multitude of goals and set the positive weight associated with their accomplishment to diminish with returns (preferably logarithmic in growth at most). Twice the tiles only gives a linear increase in gauged utility.

Volume of tiling is bounded above by polynomial growth in time (cubic in a 3D universe with speed limit), hence hitting it with a log penalty will stifle its growth to at most log(t). If you wanted to be really safe you could simply cap the possible utility of accomplishing any particular goal.

I'm missing something because this seems like a solid solution to me. I haven't read most of Eliezer's writings, unfortunately (no time, I tile my universe with math), so if there's a good one that discusses this I'd appreciate a link.

## comment by Larry_D'Anna · 2008-08-18T01:54:12.000Z · LW(p) · GW(p)

The flaw is the instant you used the deduction theorem. You used it to go from

PA |- "◻C -> C" -> PA |- "C"

to

PA |- "(◻C->C) -> C"

What the induction theorem really says is

T + X |- Y implies T |- "X -> Y"

so what you really would have needed to apply the deduction theorm would have been

PA + "◻C -> C" |- C

do I win?

## comment by Larry_D'Anna · 2008-08-18T02:05:16.000Z · LW(p) · GW(p)

oops that should be "what the deduction theorem really says"

## comment by Allan_Crossman · 2008-08-18T02:53:16.000Z · LW(p) · GW(p)

Boiling it down to essentials, it looks to me like the key move is this:

- If we can prove X, then we can prove Y.
- Therefore, if X is true, then we can prove Y.

But this doesn't follow - X could be true but not provable.

Is that right? It's ages since I did logic, and never to a deep level, so excuse me if this is way off.

## comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-08-18T03:33:11.000Z · LW(p) · GW(p)

Hint: Pinpointing the exact flaw in the proof that 2=1, requires you to mention one of the steps 1-10 of the (valid) proof of Lob's Theorem.

Any supposed analysis that does not tell you to mention a particular step, is inexact at best, and more probably mistaken.

## comment by Doug_S. · 2008-08-18T03:48:06.000Z · LW(p) · GW(p)

I'm having some technical issues reading this.

What character is ◻? On my browser (Firefox 3), it looks like a box with four symbols in it, like this:

|25|

## |FB|

How do I get it to display properly?

Replies from: dlthomas## comment by Larry_D'Anna · 2008-08-18T05:04:45.000Z · LW(p) · GW(p)

Eliezer: "Any supposed analysis that does not tell you to mention a particular step, is inexact at best, and more probably mistaken."

I'm pretty sure my answer was completely correct. Care to point out the inexactness/mistakes?

## comment by Robert4 · 2008-08-18T05:16:48.000Z · LW(p) · GW(p)

The problem is that while Lob's theorem says that whenever we have ((◻C)->C), C is provable, it does not mean that there is a proof of C from ((◻C)->C). This means that you're use of the deduction theorem is incorrect. You can only say (((◻C)->C)->(◻C)). It is clear that (X->Y)->X does not imply (not X)->X, so your little trick doesn't work.

Replies from: dlthomas## ↑ comment by dlthomas · 2011-09-29T16:51:22.364Z · LW(p) · GW(p)

(X->Y)->X and (X->Y)->Y have the same truth table.

Replies from: TobyBartels## ↑ comment by TobyBartels · 2011-10-03T03:23:57.130Z · LW(p) · GW(p)

Not when X is false and Y is true.

Replies from: dlthomas## comment by Nominull3 · 2008-08-18T05:42:45.000Z · LW(p) · GW(p)

If L:ob's theorem is true, then surely we can use it without any reference to the details of its proof? Isn't the point of a proof to obtain a black box, so that we do not have to worry about all the various intermediate steps? And in this case, wouldn't it defeat the purpose of having a proof if, to explain how the proof was misapplied, we had to delve deep into the guts of the proof, rather than just pointing to the relevant features of its interface?

## comment by Robert4 · 2008-08-18T05:58:27.000Z · LW(p) · GW(p)

Opps, I made an error when I said: "It is clear that (X->Y)->X does not imply (not X)->X, so your little trick doesn't work." This is false. This implication does exist, so we get:

((not ◻C)->(◻C))

This sentence is true just in case C is in fact provable, so we don't have a problem with your 2=1 cases.

## comment by Larry_D'Anna · 2008-08-18T06:10:01.000Z · LW(p) · GW(p)

Robert: "You can only say (((◻C)->C)->(◻C))"

No that's not right. The theorem says that if PA proves "◻C -> C" then PA proves C. so that's ◻(◻C -> C) -> ◻C.

The flaw is that the deduction theorem does not cross meta levels. Eliezer says "Löb's Theorem shows that, whenever we have ((◻C)->C), we can prove C." and goes on to claim that (◻C->C)->C. But he's intentionally failed to use quotes and mixed up the meta levels here. Lob's Theorem does not give us a proof in first order logic from ((◻C)->C) to C. It gives us a proof that if there is a proof of ((◻C)->C) then there is a proof of C. Which is an entirely diffirent thing altogether.

## comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-08-18T06:21:58.000Z · LW(p) · GW(p)

so what you really would have needed to apply the deduction theorm would have been PA + "◻C -> C" |- C do I win?

Why *doesn't* the given proof of Lob's Theorem, the steps 1-10, show that PA + "◻C -> C" |- C?

## comment by Larry_D'Anna · 2008-08-18T06:47:28.000Z · LW(p) · GW(p)

Eliezer: "Why doesn't the given proof of Lob's Theorem, the steps 1-10, show that PA + "◻C -> C" |- C?"

That's just not what it *says*. The hypothesis in step 2 isn't "◻C -> C" it's "◻(◻C -> C)". I suppose if the Hypothesis were "◻C -> C" we could try to find where it breaks. Step 7 is the only step that depends on 2 so it has to be there. Translating the amusing cartoons, we have

- ◻(◻C -> C)
- ◻(◻L -> ◻C)
- ◻(◻L -> C)

Lets say that instead we have

2'. ◻C -> C

Well what are we supposed to do with it? We're stuck. Just because ◻C -> C doesn't mean that PA can prove it.

## comment by Douglas_Knight2 · 2008-08-18T06:57:03.000Z · LW(p) · GW(p)

Eliezer's password is [correct answer deleted, congratulations Douglas --EY].

## comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-08-18T10:56:52.000Z · LW(p) · GW(p)

Larry, interpret the smiley face as saying:

PA + (◻C -> C) |-

Then the steps 1-10 of Lob's proof would seem to take us from:

PA + (◻C -> C) |- (◻C -> C)

to

PA + (◻C -> C) |- C

If this were the case, the deduction theorem would in fact come into play and give us:

PA |- (◻C -> C) -> C

So there must be at least one step of the reasoning 1-10 that does not apply. Which one and why?

## comment by Larry_D'Anna · 2008-08-18T15:35:02.000Z · LW(p) · GW(p)

Eliezer: "Larry, interpret the smiley face as saying:

PA + (◻C -> C) |-"

OK ignoring the fact that this is exactly what it *doesn't* say, I suppose we could notice that every step is surrounded at the top by a happy-face-says, so, if we may be inclined to think we can take those off and get a proof inside PA, starting with "◻C -> C" as a hypothesis. Lets see what happens.

- ◻L <-> ◻(◻L -> C)
- ◻C -> C

ok these are our hypothesis.

- ◻(◻L->C) -> (◻◻L -> ◻C)

Modus Ponens works in PA, fine

- ◻L -> (◻◻L -> ◻C)

ordinary MP

- ◻L -> ◻◻L

if PA can prove it, then PA can prove it can prove it

- ◻L -> ◻C

ordinary MP

- ◻L -> C

ordinary MP

- ◻(◻L -> C)

ARGH WHAT ARE YOU DOING. THERE IS NO RULE OF FIRST ORDER LOGIC THAT ALLOWS YOU TO DO THIS. TO SAY "if i can prove X then i can prove i can prove X" STEPS OUTSIDE OF FIRST ORDER LOGIC YOU LOSE.

- ◻L

ordinary MP

10 C

ordinary MP

Replies from: None## ↑ comment by **[deleted]** ·
2011-11-09T05:43:33.931Z · LW(p) · GW(p)

I tried summarizing it independently and came up with fewer steps and more narrative. Hope you don't me replying here or stealing your lovely boxes.

PA lets us distribute boxes across arrows and add boxes to most things. Lob's hypothesis says we can remove boxes (or at least do around statement C in the following).

- L is defined (in a difficult, confusing, self-referential way that this margin is to narrow to contain) such that

◻L is equivalent to ◻(◻L -> C) - Distribute the box over the second form:

◻L -> (◻◻L -> ◻C) - ◻L also means ◻◻L since we can add boxes, so by the inference "A->(B->C) + (A->B) -> (A->C)" we know

◻L -> ◻C - All of that was standard PA. Here it becomes funny; Lob's hypothesis tells us that ◻C can become C, so

◻L -> C - Adding a box to that gives ◻(◻L->C), which in 1. we said was the same as

◻L - And in 4. we found out that ◻L gives C, so finally

C

## comment by Sebastian_Hagen2 · 2008-08-18T15:40:40.000Z · LW(p) · GW(p)

Doug S.:

What character is ◻?That's u+25FB ('WHITE MEDIUM SQUARE').

Eliezer Yudkowsky:

Larry, interpret the smiley face as saying:

PA + (◻C -> C) |- I'm still struggling to completely understand this. Are you also changing the meaning of ◻ from 'derivable from PA' to 'derivable from PA + (◻C -> C)'? If so, are you additionally changing L to use provability in PA + (◻C -> C) instead of provability in PA?

## comment by Larry_D'Anna · 2008-08-18T17:09:54.000Z · LW(p) · GW(p)

Psy-Kosh: No that isn't the problem. If there is a proof that 1=2, then 1=2. If there isn't, then 1=2. Either way 1=2. The problem is the mixing of meta-levels / inappropriate use of the deduction theorem / tacit assumption that ◻a -> ◻b is the same as "a -> b".

Sebastian Hagen: no ◻X is PA |- "X". My best guess as to what Eliezer meant by "interpret the smiley face as saying.." is that he meant to interpret the cartoonproof as a proof from the assumption "(◻C -> C)" to the conclusion "C", rather than a proof from "◻(◻C -> C)" to "◻C".

## comment by Anatoly_Vorobey · 2008-08-18T21:27:41.000Z · LW(p) · GW(p)

Loeb's theorem shows that if PA proves "not P(C)->C", then PA proves C (using P(X) for "X is provable in PA").

It follows from that, using the Deduction Theorem, that PA proves "(not P(C)->C)->C"; it does *not* follow
that "(not P(C)->C)->C" is a logical tautology.

From that, you obtain, similarly, that PA proves "(not P(C)) -> C". And you say "I cannot prove that 2 = 1". Sure, you cannot, but can PA? That's the real question.

There's nothing particularly surprising about PA proving "(not P(C))->C". After all, were PA to prove "not P(C)" for any C, it would prove its own consistency, and therefore be inconsistent, and therefore prove C.

## comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-08-18T23:24:10.000Z · LW(p) · GW(p)

Larry, a simple 8 would have sufficed. :) (That's what Douglas wrote.)

## comment by Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2008-08-18T23:35:07.000Z · LW(p) · GW(p)

I just tested and anecdotally confirmed a hypothesis made with very little data: I suspected that neither Douglas Knight nor Larry D'Anna, the two who pinpointed 8 as the critical step, would be among the objectors to my metaethics. (Either of them can torpedo this nascent theory by stating otherwise.)

## comment by Larry_D'Anna · 2008-08-19T01:31:08.000Z · LW(p) · GW(p)

"I just tested and anecdotally confirmed a hypothesis made with very little data: I suspected that neither Douglas Knight nor Larry D'Anna, the two who pinpointed 8 as the critical step, would be among the objectors to my metaethics. (Either of them can torpedo this nascent theory by stating otherwise.)"

I like your metaethics just fine.

## comment by Brian_Jaress2 · 2008-08-19T01:41:02.000Z · LW(p) · GW(p)

Okay, I still don't see why we had to pinpoint the flaw in your proof by pointing to a step in someone else's valid proof.

Larry D'Anna identified the step you were looking for, but he did it by trying to transform the proof of Lob's Theorem into a different proof that said what you were pretending it said.

I think, properly speaking, the flaw is pinpointed by saying that you were misusing the theorems, not that the mean old theorem had a step that wouldn't convert into what you wanted it to be.

I've been looking more at the textual proof you linked (the cartoon was helpful, but it spread the proof out over more space) which is a little different and has an all-ascii notation that I'm going to borrow from.

I think if you used Lob's Theorem correctly, you'd have something like:

if PA |- []C -> C then PA |- C [Lob]

PA |- ([]C -> C) -> C [Deduction]

PA |- (not []C) -> C [Definition of implication]

(PA |- ((not []C) -> C) and (PA |- not []C) [New assumption]

PA |- ((not []C) -> C) and (not []C) [If you can derive two things, you can derive the conjunction]

PA |- C [Definition of implication]

And conclude that all provably unprovable statements are also provable, not that all unprovable statements are true.

(Disclaimer: I am definitely not a mathematician, and I skipped over your meta-ethics because I only like philosophy in small doses.)

## comment by Kevin6 · 2008-08-19T02:01:21.000Z · LW(p) · GW(p)

How is the problem not going from 6 to 7? The false argument you gave uses PA + ((◻C)->C)|- C instead of PA |- ?((◻C)->C) when invoking the deduction theorem. In other words, the false argument is exactly the same except when it uses step 2 (which is Löb's Hypothesis). More specifically, making L doesn't depend on Löb's Hypothesis, and steps 1-6 don't depend on it either (as in they are still true in PA without the Hypothesis). Furthermore, assuming step 7 (ie PA + whatever you add |- step 7) step 8 must follow by A1.

## comment by Psy-Kosh · 2008-08-19T02:53:09.000Z · LW(p) · GW(p)

Larry: I'm a bit confused, could you clarify what you mean? Thanks. (Incidentally, I'm pretty sure what Brain said is basically a much more detailed way of saying what I said. That is, I think Brian's right, and, further, that what I was saying was similar (though, obviously, far less detailed)

So what am I misunderstanding?

Again, thanks.

## comment by Tom_Breton_(Tehom) · 2008-08-19T03:13:16.000Z · LW(p) · GW(p)

Fascinating that you could present Lob's theorem as a cartoon, Eliezer.

One tiny nitpick: The support for statement 9 seems to be wrong. It reads (1, MP) but that doesn't follow. Perhaps you mean (8, 1, MP)

Replies from: Cyan## ↑ comment by Cyan · 2013-04-27T21:41:16.665Z · LW(p) · GW(p)

I was just working through the cartoon guide pursuant to this recent discussion post, and I found this minor mistake too.

## comment by Larry_D'Anna · 2008-08-19T05:21:55.000Z · LW(p) · GW(p)

Psy-Kosh: There are two points of view of where the flaw is.

My point view is that the flaw is here:

"Löb's Theorem shows that, whenever we have ((◻C)->C), we can prove C. Applying the Deduction Theorem to Löb's Theorem gives us, for all C: ((◻C)->C)->C"

Because, in fact Lob's Theorem is: (PA |- "◻C -> C") => (PA |- "C") and the Deduction theorem says (PA + X |- Y) => (PA |- "X->Y"). We don't have PA + X proving anything for any X. The deduction theorem just doesn't apply. The flaw is that the informal prose just does not accurately reflect the actual math.

Eliezer's point of view (correct me if I'm wrong) is that in the cartoon, we have 10 steps all of the form "PA proves ...." They each follow logically from the ones that came before. So they look like a proof inside of PA. And if they were a proof inside of PA then the deduction theorem would apply, and his contradiction would go through. So the flaw is that while all of the steps are justified, one of them is only justified from *outside* of PA. And that one is step 8.

Both of these views are correct.

Brian Jaress:

I think if you used Lob's Theorem correctly, you'd have something like: if PA |- []C -> C then PA |- C [Lob] PA |- ([]C -> C) -> C [Deduction]

This is incorrect because the if-then is outside of PA. The deduction theorem does not apply.

## comment by Brian_Jaress2 · 2008-08-19T06:57:34.000Z · LW(p) · GW(p)

Larry D'Anna: Thanks, I think I understand the Deduction Theorem now.

## comment by Larry_D'Anna · 2008-08-19T14:57:13.000Z · LW(p) · GW(p)

simon: Let me explain some of the terminology here, because that may be where the confusion lies.

A *scentence* is a finite string symbols that satisfies a certain set of syntactic constraints.

A *theory* is a (possibly infinite) set of sentences. PA is a theory.

A *proof* from a theory T is a finite set of sentences, each of which is either an element of T, or follows from the ones before according to a fixed set of rules.

The notation PA + X, where X is a sentences is just the union of PA and {X}.

The notation PA |- Y means that a proof from PA that ends in Y exists.

Now I have left out some technical details, like what exactly are the syntactic constraints on sentences, and what is the fixed set of rules for proofs, but we have enough to say what the deduction theorem means. It says

PA + X |- Y => PA |- "X -> Y"

or in english: if there is a proof from the theory PA + X to the scentence Y, then there is a proof from PA alone that X->Y.

So, you see, the deduction theorem is really just a technical lemma. It meerly shows that (in one particular way) our technical definition of a first order proof behaves the way it ought to.

Now on to Lob's theorem, which says that if PA |- "◻C -> C" then PA |- "C". Now in general if you want to prove PA |- A implies that PA |- B, one way to do it is to write a first order proof inside of PA that starts with A and ends with B. But that is *not* what is going on here. Instead we start with a proof of "◻C->C" and do something *totally different than a first order proof inside of PA* in order to come up with a proof that PA |- C.

## comment by simon2 · 2008-08-19T22:49:21.000Z · LW(p) · GW(p)

Hmm. I was thinking that Löb's Theorem was a theorem in PA, in which case the step going from

PA + ?(?C -> C) |- ?(?L -> C)

to

PA + ?(?C -> C) |- ?(?(?L -> C))

seems legitimate given

PA |- (?X -> ?(?X))

which we ought to be able to use since PA is part of the theory before the |- symbol.

If we don't have PA on the left, can we use all the "ingredients" without adding additional assumptions?

In any case, if we do not use the deduction theorem to derive the implication in Löb's Theorem, what do we use?

## comment by Larry_D'Anna · 2008-08-20T01:05:20.000Z · LW(p) · GW(p)

simon:

I was thinking that Löb's Theorem was a theorem in PA

It is a theorem about PA, although everything the statement and the proof of it can be expressed in PA if you like, in which case it is a theorem inside of PA about PA. There's no contradiction there, as long as you have everything quoted properly.

in which case the step going from PA + ?(?C -> C) |- ?(?L -> C) to PA + ?(?C -> C) |- ?(?(?L -> C)) seems legitimate given PA |- (?X -> ?(?X))

That's a valid deduction, but I don't think it's a step that anyone has written down in this thread before. It's not clear to me where you are going with it.

In any case, if we do not use the deduction theorem to derive the implication in Löb's Theorem, what do we use?

We use 10 steps, 9 of which are proofs inside of PA, and one of which is the fact that if PA |- X then PA |- "PA |- X".

## comment by retired_urologist · 2008-08-20T01:17:13.000Z · LW(p) · GW(p)

I know nothing about math, and cannot even follow what's being argued (keep this in mind when programming the fAI), but it's really funny to see how one of the 2 guys that got the right answer is on this like white on rice. Is that part of the experiment?

## comment by simon2 · 2008-08-20T10:25:07.000Z · LW(p) · GW(p)

*It's not clear to me where you are going with it.*

To argue that a proof is being made concluding ?C using the assumption ?(◻C -> C) given the theory PA, to which proof we can apply the deduction theorem to get (PA |- "?(◻C -> C) -> ?C") (i.e. my interpretation of Löb's Theorem)

*We use 10 steps, 9 of which are proofs inside of PA*

But the proof uses an additional assumption which is the antecedent of an implication, and comes to a conclusion which is the consequent of the implication. To get the implication, we must use the deduction theorem or something like it, right?

*the fact that if PA |- X then PA |- "PA |- X"*

Is this fact a theorem of first order logic without any additional assumptions, or is it merely a theorem of PA? I admit I don't know, as I'm not very familiar with first order logic, but it intuitively seems to me that if first order logic were powerful enough on its own to express concepts like "PA proves X" it would probably be powerful enough to express arithmetic, in which case the qualification in Gödel's theorem that it only applies to theories that express arithmetic would be superfluous.

## comment by Larry_D'Anna · 2008-08-20T15:32:53.000Z · LW(p) · GW(p)

simon:

To argue that a proof is being made concluding ?C using the assumption ?(◻C -> C) given the theory PA, to which proof we can apply the deduction theorem to get (PA |- "?(◻C -> C) -> ?C") (i.e. my interpretation of Löb's Theorem)

OK so the question marks are boxes right? In that case then yes, PA |- "?(?C -> C) -> ?C". This is OK. The contradiction comes if PA |- "(?C->C)->C". But *morally* this doesn't have anything to do with the deduction theorem. PA proves Lob because everything in the proof of Lob is expressible inside of PA.

Like I said before, the deduction theorem is really just a technical lemma. If I'm doing ordinary mathematics (not logic), and I assume X, and prove Y, and then say "ok well now I've proved X -> Y", then I have not used the deduction theorem, because I'm writing a proof, not explicitly reasoning *about* proofs. The deduction theorem lies a meta level up, where we have a explicit, specific, technical definition of what constitutes a proof, and we are trying to prove theorems from that definition.

But the proof uses an additional assumption which is the antecedent of an implication, and comes to a conclusion which is the consequent of the implication. To get the implication, we must use the deduction theorem or something like it, right?

Nope, we are using an ordinary principle of mathematical reasoning. The deduction theorem says that if you have a proof that uses this principle and is otherwise first-order, you can convert it into a pure first order proof.

Is this fact a theorem of first order logic without any additional assumptions, or is it merely a theorem of PA? I admit I don't know, as I'm not very familiar with first order logic, but it intuitively seems to me that if first order logic were powerful enough on its own to express concepts like "PA proves X" it would probably be powerful enough to express arithmetic, in which case the qualification in Gödel's theorem that it only applies to theories that express arithmetic would be superfluous.

First order logic without any additional assumptions can't even express concepts like like PA. So, yea; that's why Gödel's theorem has that qualification, because there are plenty first order theories that are simple enough that they can't express integers.

## comment by J_Thomas2 · 2008-08-21T18:02:50.000Z · LW(p) · GW(p)

I went to the carnival and I met a fortune-teller. Everything she says comes true. Not only that, she told me that everything she says always comes true.

I said, "George Washington is my mother" and she said it wasn't true.

I said, "Well, if George Washington *was* my mother would you tell me so?" and she refused to say she would. She said she won't talk about what would be true if George Washingto was my mother, because George Washington is not my mother.

She says that everything she says comes true. She looked outside her little tent, and saw the sky was clear. She said, "If I tell you the sky is clear, then the sky is clear."

"But what if it was raining right now? If you told me it was raining, would it be raining?"

"I won't say what I'd tell you if it was raining right here right now, because it isn't raining."

"But if the sky is clear right now you'll tell me the sky is clear."

"Yes. Because it's true."

"So you'll only tell me that (if you say the sky is clear then the sky really is clear), when the sky really is clear?"

"Yes. I only tell you about true things. I don't tell you what I'd do if the world was some other way."

"Oh."

Replies from: TobyBartels## ↑ comment by TobyBartels · 2011-09-29T15:47:08.539Z · LW(p) · GW(p)

You can tell that J_Thomas2's "if" (which is the counterfactual "if") is not the "if" of material implication (which is what appears in Loeb's theorem) from the grammar: "if George Washington *was* my mother" rather than "if George Washington *is* my mother".

## comment by J_Thomas2 · 2008-08-21T23:04:25.000Z · LW(p) · GW(p)

Let me try to say that clearer.

Suppose that A is false.

How the hell are you going to show that if PA proves A true then A will be true, when A is actually false?

If you can't prove what would happen if PA proved A true when A is actually false, then if you can prove that if PA proves A is true then A has to be true, it must be that A is true in the first place.

If this reasoning is correct then there isn't much mystery involved here.

One more time. If PA proves you are a werewolf, then you're really-and-truly a werewolf. PA never proves anything that isn't actually true. OK, say you aren't a werewolf. And I say I have a proof that if PA proves you're a werewolf then you're actually a werewolf. But in reality you aren't a werewolf and PA does not prove you are. How do I prove that PA would do that, when PA does not do that?

Once more through the mill. If PA proves that 6 is a prime number, then 6 is really a prime number. Can you prove that if PA proved 6 was a prime number then 6 would be a prime number? How would you do it?

If you definitely can't prove that, then what does it mean if I show you a proof that if PA proves 7 is a prime number then 7 must actually be prime? If you can't make that proof unless 7 is prime, and you have the proof, then 7 is actually prime.

The problem is with trying to apply material implication when it does not work.

## comment by Larry_D'Anna · 2008-08-22T02:49:07.000Z · LW(p) · GW(p)

J Thomas:

*Once more through the mill. If PA proves that 6 is a prime number, then 6 is really a prime number. Can you prove that if PA proved 6 was a prime number then 6 would be a prime number? How would you do it?*

If PA |- "forall x y . x * y = 6 => |x|=1 || |y|=1"
then N |= "forall x y . x * y = 6 => |x|=1 || |y|=1"
(N = the natural numbers equiped with + and *)
so for all x and y in N, N |= ",x * ,y = 6 => |,x|=1 || |,y|=1"
(where ,x means a constant symbol for x)
if x*y = 6 then N |= ",x * ,y = 6" so therefore
N |= "|,x|=1 || |,y|=1"
thus either N |= "|,x| = 1" or N |= "|,y| = 1"
thus either |x|=1 or |y|=1
therefore we have that if x*y = 6 then either |x| = 1 or |y| = 1
therefore 6 is prime
therefore if PA |- "6 is prime" then 6 is actually prime

Of course it is also a meta-theorem that for any sentence phi in the language of PA that

ZF |- "PA |- phi => phi_omega"

where phi_omega is phi relativeized to the finite ordinals.

## comment by J_Thomas2 · 2008-08-22T07:55:01.000Z · LW(p) · GW(p)

But Larry, PA does not actually say that 6 is prime, and 6 is not prime.

You could say that if PA proved that every theorem is false then every theorem would be false.

Or what would it mean if PA proved that Lob's theorem was false?

It's customary to say that any conclusion from a false premise is true. If 6 is prime then God's in his heaven, everything's right with the world and we are all muppets. Also God's in hell, everything's wrong with the world, and we are all mutant ninja turtles. It doesn't really matter what conclusions you draw from a false premise because the premnise is false.

Your argument about what conclusion we could draw if PA said that 6 is prime is entirely based on a false premise. PA does not say that 6 is prime.

## comment by Larry_D'Anna · 2008-08-22T13:36:00.000Z · LW(p) · GW(p)

"But Larry, PA does not actually say that 6 is prime, and 6 is not prime."

Well of course 6 isn't prime. But if PA said it was, then it would be. There's nothing invalid about proving that A -> B if you know ~A. It's just not very useful. But for a somewhat less vacuous example, let RH be the riemann hypothesis. Then if PA |- RH then RH is true and if PA |- ~RH then RH is false. At least one of these implications has a false hypothesis, but they are both perfectly valid.

## comment by J_Thomas2 · 2008-08-22T15:08:00.000Z · LW(p) · GW(p)

Larry, one of them is counterfactual.

If you draw implications on a false asumption then the result is useful only to show that an assumption is false.

So if PA -> 1=2 then PA -> 1<>2. How is that useful?

If PA -> 6 is prime then PA also -> 6 is not prime.

Once you assume that PA implies something that PA actually implies is false, you get a logical contradiction. Either PA is inconsistent or PA does not imply the false thing.

How can it be useful to reason about what we could prove from false premises? What good is it to pretend that PA is inconsistent?

## comment by Larry_D'Anna · 2008-08-22T17:56:00.000Z · LW(p) · GW(p)

J Thomas: "How is that useful?"

I'm just answering your question

"Can you prove that if PA proved 6 was a prime number then 6 would be a prime number? How would you do it?"

I'm not stating that proving implications with false antecedent is particularly useful, just that it is valid. Also aside from 6 being prime it is true that for any sentence phi, ZF |- "if PA |- phi then phi", but that ZF cannot even say, yet alone prove that "forall phi. if PA |- phi then phi". But it can prove "forall phi. if PA |- phi then N |= phi".

## comment by J_Thomas2 · 2008-08-22T23:26:00.000Z · LW(p) · GW(p)

Larry, you have not proven that 6 would be a prime number if PA proved 6 was a prime number, because PA does not prove that 6 is a prime number.

The theorem is only true for the phi that it's true for.

The claim that phi must be true because if it's true then it's true, and if it's false then "if PA |- phi then phi" has an officially true conclusion whenever PA does not imply phi, is bogus.

It's simply and obviously bogus, and I don't understand why there was any difficulty about seeing it.

## comment by J_Thomas2 · 2008-08-22T23:35:00.000Z · LW(p) · GW(p)

I thought of a simpler way to say it.

If Hillary Clinton was a man, she wouldn't be Bill Clinton's wife. She'd be his husband.

Similarly, if PA proved that 6 was prime, it wouldn't be PA. It would be Bill Clinton's husband. And so ZF would not imply that 6 is actually prime.

## comment by Larry_D'Anna · 2008-08-23T05:59:00.000Z · LW(p) · GW(p)

J Thomas *Larry, you have not proven that 6 would be a prime number if PA proved 6 was a prime number, because PA does not prove that 6 is a prime number.*

No I'm afraid not. You clearly do not understand the ordinary meaning of implications in mathematics. "if a then b" is equivalent (in boolean logic) to ((not a) or b). They mean the exact same thing.

*The claim that phi must be true because if it's true then it's true*

I said no such thing. If you think I did then you do not know what the symbols I used mean.

*It's simply and obviously bogus, and I don't understand why there was any difficulty about seeing it.*

No offense, but you have utterly no idea what you are talking about.

*Similarly, if PA proved that 6 was prime, it wouldn't be PA*

PA is an explicit finite list of axioms, plus one axiom schema. What PA proves or doesn't prove has absolutely nothing to do with it's definition.

## comment by Jesse_Butler · 2008-08-26T14:30:00.000Z · LW(p) · GW(p)

From the medium hard question, we have "LÃ¶b's Theorem shows that, whenever we have ((â»C)->C), we can prove C." From the cartoon, the hypothesis of Lob's theorem is, " If PA proves 'PA proves X, then X' then PA proves X", so to apply the theorem to the situation outlined in the question we would be required to write instead of "((â»C)->C)" in the first sentence the following "PA proves '((â»C)->C)' ". Then by the deduction theorem, we'd have the following conclusion: 'â»('(â»C)->C')->â»C' and so the trouble with material implication wouldn't arise because the first '->' is in a quoted sentence.

## comment by ata · 2010-12-20T00:56:37.087Z · LW(p) · GW(p)

Trying the puzzle, before I read the other comments:

Löb's Theorem shows that, whenever we have ((◻C)->C), we can prove C.

I think this is the flaw. Löb's Theorem shows that whenever we can *prove ((◻C)->C) in PA*, then we can prove C *in PA*. So instead of ((◻C)->C)->C, it would actually be ◻((◻C)->C)->◻C.

At first I looked at ◻((◻C)->C)->◻C and thought it had a similar problem — if we let C be the statement 2 = 1, then clearly C = false, so ◻((◻C)->false)->◻C; so ◻(true)->◻C; and presumably you can prove tautologies in PA, so (true) -> ◻C; therefore, PA contains a proof of 2 = 1. I think my mistake there was that I took ◻(X) as talking about the *proposition* X, like ◻ is just some function being passed a value that you can manipulate syntactically like any other expression; but I think that's wrong, it has to be treated as the literal quotation of X instead. ◻((◻C)->C)->◻C means "If PA proves 'if PA proves "1 = 2", then 1 = 2', then PA proves '1 = 2'"; you don't get to substitute your outside-the-system knowledge that 1 ≠ 2 into that, because the whole thing is about what PA says about "1 = 2", and about what PA might say about (1 = 2), but not about "1 = 2" or (1 = 2) themselves — except in that PA's claims about numbers have all been true so far, but to treat X and what-PA-says-about-X as logically interchangeable in this context would be begging the question.

## comment by Stuart_Armstrong · 2011-01-05T12:14:53.685Z · LW(p) · GW(p)

(not ◻C)->C

I was confused, because that statement is correct. Let F be a canonical false statement. Then F->C, and hence (not ◻C)->(not ◻F). But (not ◻F) is the consistency statement of PA. So (not ◻C) implies consistency of PA, which implies anything, including C.

Replies from: Will_Sawin## ↑ comment by Will_Sawin · 2011-01-06T21:53:31.142Z · LW(p) · GW(p)

Peano's consistence doesn't imply anything, the provability of Peano's consistency implies anything.

Replies from: Stuart_Armstrong## ↑ comment by Stuart_Armstrong · 2011-01-07T16:36:39.154Z · LW(p) · GW(p)

Yes, but you can get from "there is no proof of proposition C" to "there is no proof of the canonical false proposition F". The second is often taken to be the definition of the provability of Peano's consistency, I believe.

Replies from: Will_Sawin## ↑ comment by Will_Sawin · 2011-01-07T18:55:10.897Z · LW(p) · GW(p)

It think the provability of consistency would be:

"There is a proof that there is no proof of the canonical false proposition F"

Replies from: Stuart_Armstrong## ↑ comment by Stuart_Armstrong · 2011-01-07T19:15:48.629Z · LW(p) · GW(p)

I'm using Peter Smith's definition (see http://www.logicmatters.net/resources/pdfs/gwt/GWT.pdf , Godel without too many tears).

In definition 59 (on page 1 of part 10), he identifies consistency with "not a proof of the canonical false statement". This is a valid statement within Peano arithmetic. And it's logical consequences are... anything.

Replies from: Will_Sawin## ↑ comment by Will_Sawin · 2011-01-07T19:22:32.312Z · LW(p) · GW(p)

You're confusing consistency with a proof of consistency.

Theorem 56: Consistency implies no proof of consistency.

Which is of course where you get:

Proof of consistency implies inconsistency.

Which gives you:

Proof of consistency implies anything.

Replies from: Stuart_Armstrong## ↑ comment by Stuart_Armstrong · 2011-01-07T22:33:08.435Z · LW(p) · GW(p)

I think you're right.... I was commiting the same mistake is above, using the first derivability condition and assuming that Peano arithmetic could treat it as a statement in Peano arithmetic - which it isn't.

## comment by cousin_it · 2011-01-09T16:56:57.978Z · LW(p) · GW(p)

If you really want to see the truth of the theorem at a glance, I found a much simpler proof on page 7 of this writeup by Aaronson.

## comment by alexflint · 2011-05-12T11:15:01.341Z · LW(p) · GW(p)

Posting before reading other comments...

The flaw is in the very last step: that there is no proof of 2=1. If PA is inconsistent then contradictions are provable in PA and then anything else is too, including 2=1. To prove that there is no proof of 2=1 is to prove that PA is consistent, which is impossible by Godel's theorem.

## comment by kilobug · 2011-10-26T14:14:15.048Z · LW(p) · GW(p)

Link to http://www.sm.luth.se/%7Etorkel/eget/godel/loeb.html seems broken (gives me a 403) but the Wayback Machine has a copy : http://web.archive.org/web/20081212195540/http://www.sm.luth.se/~torkel/eget/godel/loeb.html if someone is interested.

## comment by **[deleted]** ·
2011-11-24T00:59:45.720Z · LW(p) · GW(p)

## comment by **[deleted]** ·
2011-11-25T04:12:57.128Z · LW(p) · GW(p)

## comment by **[deleted]** ·
2011-11-25T04:16:12.190Z · LW(p) · GW(p)

You once wrote on the SL4 mailing list

It appears to me that for the GM to execute a change that switches to a new theorem-prover or proof-verifier, the Gödel Machine must prove that if the proof-verifier accepts P, then P.

The verifier has to translate a meta-language result into an object-language result.

Löb's Theorem establishes that for any proof system capable of encoding its own proof process, |- Provable('P') -> P if and only if |- P.

But if an AI does believe that (AI |- P) -> P, Löb's construction permits an automatic proof of P

You didn't write it out like this, but I'm going to guess that in the first quote you mean: For GM to change its verifier, V1, to V2, GM.V1 must prove that for all C of interest, if "C is provable by GM.V2" then C. More pithily, we need ∀C: V1 |- ◻_V2(C) -> C. If we add that V2 is logically equivalent to V1, that reduces to ∀C:V |-[]C->C, which we obviously can't have as an axiom because of Lob. Setting them equivalent makes ◻_V2(C) into a meta-language result, which you claimed it was, so I'm guessing that's also what you're doing.

However, the first quote has a different interpretation than "∀C: V1 |- ◻_V2(C)-> C", and it seems to work. It requires we stop talking about the old modal provability predicate, []X, which might be read as "there exists some term a la godel numbering that encodes a proof of X", or ∃t:proof(t,#X). Instead of a predicate that aks whether there exists a term t, we use a predicate that asks whether a specific t encodes a proof of X, proof(t,#X). I'll write that as [[t]]X, after S. Artemov work on such intuitionist / justificational logic. I've been told that Artemov's results are too messy and might not establish what he attemps, so I'll just mention him name, skip the citations, and present the part that makes sense to me so far. I'd really like to know whether it makes sense to you too.

Instead of the reflection principle, |-◻C->C, which is nice and easy and darn that Lob for taking it from us, we use |-[[t]]C -> C. which is roughly as nice! I still haven't entirely grokked why we can have []C and yet not derive C, but I would be down right shocked if we couldn't infer C from a proof which gives C. That might not be a "wrap around the towers of meta"-type Inference of Awesome, but it makes sense without having to do magic; it's the sort of thing I would expect that humans are actually doing in their heads. And of course, when we have [[t]]C, then we will have []C. Maybe Lob just confuses me because I do an attribute substitution from []C to [[t]]C!

With this new predicate in mind, we can now we can write a modified version of what we want the GM to prove. Actually, mathematical logic is still ridiculously new to me, so I'm not very confident the following is right, but I really think that *something* like this is right: For GM.V1 to rewrite V1 to V2, at the very least we want GM.V1 to prove that for all statements of interest C, if we have constructed a term t which encodes of proof for C from V2, then some magic Gödelian procedure allows us to transform t into s, a new term which encodes a proof of C from V1. If V2 is logically equivalent to V1 then t should work as our s, and we can juxt say ∀C: V1 |- [t]-> C, the old principle of "believe once it's proved, not once it's proved provable".

Now we can add inference rules to LCF-style provers once we prove them, without having to just expand macros that synthesize the primitive inference rules. We might even be able to transform deductive systems like PA into more powerful ones like ZF provided the AI has some non-logical ability to judge usefullness and there actually is such a magic Gödelian procedure that ensures our funny little half-meta-reflexive soundness criterion.

P2 constructs a t which encodes a proof for q from P2, then procedure x allows us to transform t into z, which encodes a proof of q from P1

## comment by **[deleted]** ·
2011-11-25T21:26:40.465Z · LW(p) · GW(p)

Here's the solution for your AI reflection problem.

In this SL4 post you equivocate ⊢ ◻P and ⊢ P . We are are allowed to believe P from a proof of P, just not from a proof of provability of P. Of course, whenever we have ⊢P, we'll have ◻P, so it's an understandable attribute substitution. Modifying your minimal rewrite criterion with this in mind, we might require that, for interpretations supporting derivability, S1 and a subsystem of S2 be mutually interpretable. Makes sense?

## comment by **[deleted]** ·
2011-12-13T19:55:24.713Z · LW(p) · GW(p)

The drawing convention used in the cartoon guide strongly suggests that you're making an incorrect substitution when you think about this.

When you draw "PA says (X)", that corresponds in the symbolic proof of Löb's theorem to "PA⊢X", and when you draw "PA says (PA says X)" that corresponds to PA⊢◻X. The notation makes it look like "PA⊢◻X" is just the second order version of "PA⊢X", i.e. a second application of "PA says" (PA⊢). But of course "PA⊢(PA⊢X)" is the same as "P⊢X", not PA⊢◻X.

## comment by beoShaffer · 2012-06-28T03:45:52.982Z · LW(p) · GW(p)

Is "I cannot prove that 2 = 1." Supposed to mean that a you can prove that it is impossible to prove 2=1?

Replies from: Manfred## comment by Error · 2012-09-18T16:37:26.543Z · LW(p) · GW(p)

Sanity check: If I'm understanding Eleizer and D'Anna correctly, step 8 is valid in PA but not in classical (= first order?) logic. The deduction theorem only applies to proofs in classical logic, therefore this part:

"Applying the Deduction Theorem to Löb's Theorem gives us, for all C:

```
((◻C)->C)->C"
```

is incorrect; you can't use the deduction theorem on Lob's Theorem. Everything after that is invalidated, including 2=1.

Is that more or less the point here? I have a good head for math but I'm not formally trained, so I'm trying to be sure I actually understand this instead of just thinking I understand it.

Incidentally: Godel's Theorem inspired awe when I was first exposed to it, and now Lob's has done the same. I actually first read this post months ago, but it's only going back to it after chasing links from a future sequence that I think I finally get it.

Replies from: VAuroch## comment by Zaq · 2013-04-09T17:48:40.098Z · LW(p) · GW(p)

A simple explanation of a flaw that makes no reference to Lob's Theorem, meta-anythings, or anything complicated. Of course, spoilers.

"Let ◻Z stand for the proposition "Z is provable". Löb's Theorem shows that, whenever we have ((◻C)->C), we can prove C."

This statement is the source of the problem. For ease of typing, I'm going to use C' = We can prove C. What you have here is (C'->C)->C'. Using Material Implication we replace all structures of the from A->B with ~A or B (~ = negation). This gives:

~(~C' or C) or C`

Using DeMorgan's laws we have ~(A or B) = ~A and ~B, yielding:

(C' and ~C) or C`

Thus the statement (C' -> C) -> C' evaluates to true ONLY when C' is true. You then proceed to try and apply it where C' is false. In other words, you have a false premise. Either you can in fact prove that 2=1, or it is not in fact the case that (C' -> C) -> C' when C is "2=1".

PS: I didn't actually need to read Lob's Theorem or even know what it was about to find this flaw. I suspect the passage quoted is in fact not the result of Lob's Theorem. You can probably dig into Lob's Theorem to pinpoint *why* it is not the result, but meh.

## ↑ comment by VAuroch · 2013-11-27T09:10:08.722Z · LW(p) · GW(p)

You are wrong, and I will explain why.

If you "have ((◻C)->C)", that is an assertion/assumption that ◻((◻C)->C). By Loeb's theorem, It implies that ◻C. This is different from what you wrote, which claims that ((◻C)->C) implies ◻C.

Replies from: Zaq## ↑ comment by Zaq · 2017-05-12T16:38:47.197Z · LW(p) · GW(p)

Wow. I've never run into a text using "we have" as assuming something's provability, rather than assuming its truth.

So the application of the deduction theorem is just plain wrong then? If what you actually get via Lob's theorem is ◻((◻C)->C) ->◻C, then the deduction theorem does *not* give the claimed ((◻C)->C)->C, but instead gives ◻((◻C)->C)->C, from which the next inference does not follow.

## ↑ comment by VAuroch · 2017-06-15T06:07:45.667Z · LW(p) · GW(p)

I don't think I've ever used a text that *didn't*. "We have" is "we have as a theorem/premise". In most cases this is an unimportant distinction to make, so you could be forgiven for not noticing, if no one ever mentioned why they were using a weird syntactic construction like that rather than plain English.

And yes, rereading the argument that does seem to be where it falls down. Though tbh, you should probably have checked your own assumptions before assuming that the question was wrong as stated.

## comment by So8res · 2013-07-26T18:35:39.141Z · LW(p) · GW(p)

Gur snpg gung lbh crefbanyyl pnaabg cebir 2=1 qbrf abg zrna gung vg vf hacebinoyr. Vs lbh rire cebir (va CN) gung gurer vf ab cebbs (va CN) gung 2=1, gura lbh unir vaqrrq cebirq CN vapbafvfgrag.

Gur qrqhpgvba gurberz qbrf abg fgngr gung G ∪ {◻⊥→⊥} ⊢ ⊥, orpnhfr ◻ vf n cebinovyvgl cerqvpngr sbe G. Naq, va snpg, CN ∪ {◻⊥→⊥} vf whfg CN+Pba(CN), juvpu vf bs pbhefr pbafvfgrag.

## comment by VAuroch · 2013-11-27T09:24:39.438Z · LW(p) · GW(p)

I may not removed the flaw entirely, but I have definitely changed it into a less-obviously bad flaw. And also used Loeb's Theorem to derive Goedel's, or a close analogue.

The summary statement of Loeb is wrong. It is not the case that (◻X->X) -> X, it is only the case that ◻(◻X->X) -> ◻X. That is to say, that if (a proof of X implies X) is provable, then X is provable.

Using Deduction here, we get only ◻(~◻X) -> ◻X, which in English is "If it is provable that X is unprovable, then X is provable", or in other words "If PA is proved complete, it is proved inconsistent."

## comment by evandrix · 2014-09-20T17:34:54.871Z · LW(p) · GW(p)

The download link for the Cartoon PDF Guide is outdated, it should be: http://yudkowsky.net/assets/pdf/LobsTheorem.pdf

## comment by PedroCarvalho · 2015-01-20T08:17:59.542Z · LW(p) · GW(p)

Yrg'f frr, V xabj sbe n snpg gung ¬(CN ⊢ (◻P → P) → P), gung vf, gung guvf fhccbfrq "nccyvpngvba" bs gur Qrqhpgvba Gurberz qb Yöo'f Gurberz vf abg n gurberz bs CN.

Yöo'f Gurberz vgfrys vf whfg gung -

Buuuu tbg vg.

Yöo'f Gurberz fnlf gung vs CN ⊢ ◻P → P, gura CN ⊢ P; be, fvzcyl, gung CN ⊢ ◻(◻P → P) → ◻P. Gur Qrqhpgvba Gurberz fnlf gung sbe "ernfbanoyr" (va n pregnva frafr) qrqhpgvir flfgrzf va svefg-beqre ybtvp naq nal frg bs nkvbzf Δ, vs vg'f gur pnfr gung, sbe fbzr N naq O, Δ ∪ {N} ⊢ O, gura Δ ⊢ N → O. Ubjrire, vg vf abg gur pnfr gung CN ∪ {◻P → P} ⊢ P, fvapr CN ∪ {◻P → P, ¬P} vf cresrpgyl pbafvfgrag (vss CN vf nyfb pbafvfgrag) naq vzcyvrf ¬◻P.

Be, va bgure jbeqf, nqqvat gur nkvbz ◻P → P gb CN sbe nal P bayl znxrf P n gurberz bs CN vs ◻P vf nyernql gurberz bs CN; ubjrire, ernfbavat "bhgfvqr" bs CN, jr xabj nyernql gung vs CN ⊢ ◻P gura CN ⊢ P (juvpu pna nyfb or sbeznyvfrq nf CN ⊢ ◻◻P → ◻P, juvpu vf n gurberz bs CN), fb nqqvat ◻P → P gb CN qbrf abg va snpg znxr CN "fgebatre" guna vg jnf orsber va nal zrnavatshy frafr, naq whfg znxrf CN gehfg vgf cebbs bs P, vs vg rire svaqf nal.

Gur pbeerpg nccyvpngvba bs gur Qrqhpgvba Gurberz gb Yöo'f Gurberz fnlf gung CN ∪ {◻(◻P → P)} ⊢ ◻P, juvpu jr nyernql xabj gb or gehr (orpnhfr... gung'f Yöo'f Gurberz), ohg fvapr ◻P → P vf abg va trareny n gurberz bs CN, guvf qbrf abg thnenagrr gung P jvyy or gehr. Naq phevbhfyl, guvf vzcyvrf gung nqqvat rvgure ◻(◻P → P) sbe nyy P be ◻P → P sbe nyy P gb CN qbrfa'g znxr CN nal fgebatre - qbvat gur sbezre npghnyyl eraqref ◻ rssrpgviryl zrnavatyrff, fvapr gura ◻P jvyy or gehr sbe nyy P, naq vg jvyy ab ybatre ercerfrag svaqvat n Töqry Ahzore bs n cebbs bs P, fb bar pbhyq fnl gung vg znxrf CN "jrnxre" va fbzr ybbfr frafr -, ohg nqqvat obgu sbe nyy P znxrf vg vapbafvfgrag.

## comment by Houshalter · 2015-09-21T07:16:39.013Z · LW(p) · GW(p)

Did EY post the correct answer anywhere? I've been thinking for hours and give up. The comments make no sense. The best I can tell from Larry_D'Anna's comment is that axiom 1 is wrong. But it doesn't seem wrong and I am very confused. Or maybe he's saying the way it's used in step 8 is wrong. But that seems a very straightforward inference from it. Either way, it must be correct, because Lob's theorem is an actual theorem. EY didn't just make it up to trick us.

So I went to wikipedia and it does say the same thing:

◻(◻C->C)->◻C

If you subtract the quotations (◻) from both sides, you do get (◻C->C)->C. Which obviously must be false, or 2=1. But I can't figure out what is wrong with removing the quotations. I don't even see where the quotations even came from in EYs proof. This is very frustrating.

Replies from: AlexMennen## ↑ comment by AlexMennen · 2015-09-21T07:41:58.598Z · LW(p) · GW(p)

Step 8 relies on the lemma that ◻X→◻◻X. If you try running through the proof with the ◻ removed from both sides, the corresponding lemma you would need to make step 8 work is X→◻X, but this is not true.

Replies from: Houshalter## ↑ comment by Houshalter · 2015-09-21T09:52:18.512Z · LW(p) · GW(p)

[deleted]

## comment by jpulgarin · 2020-04-20T10:36:15.784Z · LW(p) · GW(p)

I don't think there's a flaw in this argument. I'm pretty sure that is a theorem of PA for any sentence . However, even if we consider to be a sentence like , that does not mean that we can conclude that is a theorem of PA, since proving that there doesn't exist a proof for a given sentence in PA is equivalent to proving that PA is consistent, which is not possible if we assume that PA is consistent.

Replies from: SamEisenstat## ↑ comment by SamEisenstat · 2020-04-21T07:03:42.040Z · LW(p) · GW(p)

*I* think that there's a flaw in the argument.

I could elaborate, but maybe you want to think about this more, so for now I'll just address your remark about , where is refutable. If we assume that , then, since is false, must be false, so must be true. That is, you have proven that PA proves , that is, since is contradictory, PA proves its own inconsistency. You're right that this is compatible with PA being consistent--PA may be consistent but prove its own inconsistency--but this should still be worrying.

Replies from: jpulgarin## ↑ comment by jpulgarin · 2020-05-19T06:27:13.176Z · LW(p) · GW(p)

Thanks Sam. I thought about this some more and realized where I went wrong - I was applying the deduction theorem incorrectly (as other comments in this thread have pointed out).

By the way, when you say that PA proves its own inconsistency, do you mean that or that ? From your reasoning I agree that if we assume then we can arrive at and , from which we can conclude that . If you meant that though, could you explain how you arrived at that?