Why do the Sequences say that "Löb's Theorem shows that a mathematical system cannot assert its own soundness without becoming inconsistent."?

post by Thoth Hermes (thoth-hermes) · 2023-03-28T17:19:12.089Z · LW · GW · 7 comments

This is a question post.

Contents

  Answers
    6 quetzal_rainbow
    2 ZT5
    2 hairyfigment
    1 Aatu Koskensilta
None
7 comments

I am looking for some understanding into why this claim is made. 

As far as I can tell, Löb's Theorem does not directly make such an assertion. 

Reading the Cartoon's Guide to Löb's Theorem [LW · GW], it appears that this assertion is made on the basis of the reasoning that Löb's Theorem itself can't prove negations, that is, statements such as "1 + 3 /= 5." 

Alas, this means we can't prove PA sound with respect to any important class of statements.

This is a statement that [due to the presence of negations in it] itself can't be proven within PA. 

Now it seems that it is being argued that the inability to do this is a bad thing [that is, being able to prove that we can't prove PA sound with respect to any 'important' class of statements]. 

I think this is actually a very critical question and I have some ideas for what the central crux is here, but I'd be interested in seeing some answers before delving into that.

Answers

answer by quetzal_rainbow · 2023-03-28T18:02:33.244Z · LW(p) · GW(p)

It is direct conclusion from Löb's theorem.

Löb's theorem:

Substitute P with False statement:

But  is equivalent , i.e.

I.e., if provable that it's impossible to prove false statement, then false statements are provable. We have reached contradiction, Q.E.D.

comment by Thoth Hermes (thoth-hermes) · 2023-03-28T22:15:50.455Z · LW(p) · GW(p)

I am not sure that the statement □False or "It is provable that False" means anything.

Basically, you have that the word False and a false statement are not the same thing. Therefore, it is not generally the case that one can make a statement of the form "X is false" without X already being either the word False or another statement of the form "X is false."

What this implies is that one cannot in general prove that "False" (just the bare, basic statement that False).

□(¬□False)→□False.

But □False →False.

Thus □(¬□False)→¬□False.

This seems consistent to me. We also haven't used anything except the theorem directly and substitution of P with False to get here.

If you're saying that results in a contradiction, then that would imply the theorem is false, unless you introduce further assumptions.

Replies from: quetzal_rainbow, hairyfigment
comment by quetzal_rainbow · 2023-03-28T22:40:55.303Z · LW(p) · GW(p)

Of course it's consistent, if you can prove false statements, you can prove anything.

comment by hairyfigment · 2023-03-30T05:39:12.357Z · LW(p) · GW(p)

If you're asking how we get to a formal self-contradiction, replace "False" with a statement the theory already disproves, like "0 does not equal 0." I don't understand your conclusion above; the contradiction comes because the theory already proves ¬False and now proves False, so the claim about "¬□False" seems like a distraction.

Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-30T15:22:34.902Z · LW(p) · GW(p)

Löb's theorem:

□(□P→P)→□P

Informally, this means that □P→P is something we'd like to believe is true. If I can prove P, that must mean P is true. It would be good for this to be true.

I might also like to believe that P is provable. Informally, maybe "P" is the best we can hope for, the claim that P is true. If my system can prove P, that's about as close as I can get to believing P.

So, □(□P→P)→□P→P.

Worst case scenario: Let's "sanity check" by replacing P with False. If P is false, then I'd expect not to be able to prove it. 

□(□F→F)→□F→F

□(¬□F)→¬□F.

If replacing P with False was enough to break Löb's theorem, then Löb's theorem would be false. PA can handle False.

Replies from: hairyfigment
comment by hairyfigment · 2023-03-30T19:47:16.428Z · LW(p) · GW(p)

I don't even understand why you're writing □(□P→P)→□P→P, unless you're describing what the theory can prove. The last step, □P→P, isn't valid in general, that's the point of the theorem! If you're describing what formally follows from full self-trust, from the assumption that □P→P for every P, then yes, the theory proves lots of false claims, one of which is a denial that it can prove 2+2=5.

Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-31T00:43:47.139Z · LW(p) · GW(p)

The last step, □P→P, isn't valid in general, that's the point of the theorem!

□(□P→P)→□P→P makes sense to me. Just let P = □P→P. 

In other words, P = "If this statement has proof, it is true." 

Consider that "If I can write a proof of P, then P has been proven" is simply true

Therefore, I have written a proof of the statement "If I can write a proof of P, then P has been proven", which means it has been proven. Let that statement = P. So I have that □(□P→P). Since this statement is also true, we have that □P→P. 

On the whole of it, we'd certainly want □P to have something to do with P, no? Why wouldn't we expect □P to tell us anything at all about P?

Replies from: hairyfigment
comment by hairyfigment · 2023-03-31T00:53:49.819Z · LW(p) · GW(p)

The first part of the parent comment is perfectly true for a specific statement - obviously not for all P - and in fact this was the initial idea which inspired the theorem. (For the normal encoding, "This statement is provable within PA," is in fact true for this exact reason.) The rest of your comment suggests you need to more carefully distinguish between a few concepts:

  1. what PA actually proves
  2. what PA would prove if it assumed □P→P for all claims P
  3. what is actually true, which (we believe) includes category 1 but emphatically not 2.
Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-31T16:49:52.432Z · LW(p) · GW(p)

I think the key here is that our theorem-prover or "mathematical system" is capable of considering statements to be "true" within itself, in the sense that if it believes it has proven something, well, it considers at least that to be true. It's got to pick something to believe in, in this case, that if it has written a proof of something, that thing has been proven. It has truth on that level, at least. 

Consider that if we tabooed the use of the word "true" and used instead "has a proof" as a proxy for it, we don't necessarily get ourselves out of the problem. We basically are forced to do this no matter what, anyway. We sometimes take this to mean that "has a proof" means "could be true, maybe even is mostly really true, but all we know for sure is that we haven't run in to any big snags yet, but we could."

Metaphysically, outside-of-the-system-currently-being-used truth? I think the Sequences are saying something more strongly negative than even Gödel's Theorems are usually taken to mean. They are saying that even if you just decide to use "my system thinks it has proved it, and believes that's good enough to act on", you'll run into trouble sooner than if you hesitated to act on anything you think you've already proved.

Replies from: hairyfigment
comment by hairyfigment · 2023-03-31T19:55:28.169Z · LW(p) · GW(p)

https://en.wikipedia.org/wiki/Tarski%27s_undefinability_theorem

A coherent formal system can't fully define truth for its own language. It can give more limited definitions for the truth of some statement, but often this is best accomplished by just repeating the statement in question. (That idea is also due to Tarski: 'snow is white' is true if and only if snow is white.) You could loosely say (very loosely!) that a claim, in order to mean anything, needs to point to its own definition of what it would mean for that claim to be true. Any more general definition of truth, for a given language, needs to appeal to concepts which can't be expressed within that language, in order to avoid a self-reference paradox.

So, there's no comparison between applying the concept in your last paragraph to individual theorems you've already proven, like "2+2=4" - "my system thinks it has proved it, and believes that's good enough to act on", and the application to all hypothetical theorems you might prove later, like "2+2=5". Those two ideas have different meanings - the latter can't even be expressed within the language in a finite way, though it could be an infinite series of theorems or new axioms of the form □P→P - and they have wildly different consequences. You seem to get this when it comes to hypothetical proofs that eating babies is mandatory.

Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-31T20:50:50.169Z · LW(p) · GW(p)

So what you're saying here is, let's say, "level 1 negative" which means, very roughly, things like: We can't formally define what truth is, our formal system must appeal to higher systems outside of it, we can't prove consistency, etc.

What the Sequences say are, let's say, "level 2 negative" which means verbatim what is stated in them, i.e., "a mathematical system cannot assert its own soundness without becoming inconsistent." This literally says that if a mathematical system tried to assert its own soundness, it would become inconsistent. This is worse than what any of Löb's Theorem, Gödel's Theorems, or Tarski's Theorem tells us. AFAICT, the Sequences do consider the issue of soundness-assertion important, because it relates to recursively self-improving AI's, who would be inclined to want to do so, presumably. 

When I'm only level 1 negative, where I am when I want to believe in what the mathematical community says, then I'm not really in that bad shape, because I kind of only need to keep a somewhat distantly curious ear open for the somewhat erudite intellectual curiosity of whether or not this system will ever stop working.

When I'm level 2 negative, I might be either a person or an AGI wanting to know if it is possible to recursively self-improve, and I might decide that if I tried to assert my own soundness, I would go bad or become insane or become inconsistent, and therefore not do so - thus deferring to authority. 

Replies from: hairyfigment
comment by hairyfigment · 2023-03-31T22:00:30.032Z · LW(p) · GW(p)

Pretty sure this is my last comment, because what you just quoted about soundness is, in fact, a direct consequence of Löb's Theorem. For any proposition P, Löb's Theorem says that □(□P→P)→□P. Let P be a statement already disproven, e.g. "2+2=5". This means we already had □¬P, and now we have □(¬P & P), which is what inconsistency means. Again, it seemed like you understood this earlier.

answer by Victor Novikov (ZT5) · 2023-03-28T19:52:13.032Z · LW(p) · GW(p)

A system asserting its own soundness: For all T, (□T -> T)

Löb's theorem: 

From □T -> T, it follows □(□T -> T). (necessitation rule in provability logic).

From □(□T -> T), by Löb's theorem it follows that □T.

Therefore: any statement T is provable (including false ones).

Or rather: since for any statement the system has proven both the statement and its negation (as the argument applies to any statement), the system is inconsistent.

comment by Thoth Hermes (thoth-hermes) · 2023-03-28T22:46:42.543Z · LW(p) · GW(p)

I don't see how it proves that for X it has proven both the statement and its negation.

I do see how it can be concluded that for any positive statement T, T is provable.

However, if T is a negative statement, then it is provable that not T.

The system isn't inconsistent that way. If it was, we'd have that Löb's theorem itself is false (at least according to PA-like proof logic!).

Replies from: ZT5
comment by Victor Novikov (ZT5) · 2023-03-28T22:56:00.946Z · LW(p) · GW(p)

Our initial assumption was: For all T, (□T -> T)

T applies to all logical statements. At the same time. Not just a single, specific one.

Let T = A. Then it is provable that A.

Let T = not A. Then it is provable that not A.

As both A & not A have been proven, we have a contradiction and the system is inconsistent.

If it was, we'd have that Löb's theorem itself is false (at least according to PA-like proof logic!).

Logical truths don't change.
If it we start with Löb's theorem being true, it will remain true.
But yes, given our initial assumption we can also prove that it is false.
(Another example of the system being inconsistent).

Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-28T23:05:33.383Z · LW(p) · GW(p)

But that requires that for all T, T is provable, in order to get that T and not T are provable at the same time.

That additional assumption requires a system (not PA) in which T and not T (everything or anything at all) are provable already. But we aren't assuming that kind of system, only a system that trusts itself.

answer by hairyfigment · 2023-03-28T17:56:01.852Z · LW(p) · GW(p)

How do you interpret "soundness"? It's being used to mean that a proof of X implies X, for any statement X in the formal language of the theory. And yes, Löb's Theorem directly shows that PA cannot prove its own soundness  for any set of statements save a subset of its own theorems.

comment by Thoth Hermes (thoth-hermes) · 2023-03-30T01:02:23.854Z · LW(p) · GW(p)

I consider myself to be a reasoner at least as powerful as PA, since I am using myself plus PA plus Löb's Theorem to reason about systems at least as powerful as myself (which encompasses everything I've thus far described). 

I consider myself "sound" if I believe myself to be trustworthy / reliable enough to believe what I believe Löb's Theorem's says, which is something to do with self-trust and the ability to believe that certain propositions are true, as long as I can prove that proving them true means that they are true.  

Replies from: hairyfigment
comment by hairyfigment · 2023-03-30T05:22:55.831Z · LW(p) · GW(p)

Do you also believe that if you could "prove" eating babies was morally required, eating babies would be morally required? PA obviously believes Lob's theorem itself, and indeed proves the soundness of all its actual proofs, which is what I said above. What PA doesn't trust is hypothetical proofs.

Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-30T11:49:24.913Z · LW(p) · GW(p)

No, I would have to think that eating babies was morally required first. If not, then I could not prove it.

Replies from: hairyfigment
comment by hairyfigment · 2023-03-30T19:51:49.052Z · LW(p) · GW(p)

I don't know that I follow. The question, here and in the context of Löb's Theorem, is about hypothetical proofs. Do you trust yourself enough to say that, in the hypothetical where you experience a proof that eating babies is mandatory, that would mean eating babies is mandatory?

Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-30T20:26:43.118Z · LW(p) · GW(p)

"Experiencing a proof that eating babies is mandatory" could only amount to something like being force-fed or otherwise some kind of impossible to resist scenario. "Experiencing a proof" of any proposition consists of the sequence of events or observed inputs leading to one to conclude a proposition is true.

If you are firmly in the against baby eating camp, then presumably you've got ample proof already that it is not mandatory, and always generally possible to resist anyone trying to make you eat them.

I think you're talking about the hypothetical world in which both baby eating and not baby eating seem correct, such as where one hypothetically reads a proof that seems correct, and urges one to eat babies, whilst it simultaneously seeming very wrong according to our normal intuitions like it normally does. That hypothetical world is inconsistent by assumption, though. I'm not talking about that. I think we live in the world in which our sense that baby eating is very wrong means a proof of that can be constructed.

answer by Aatu Koskensilta · 2023-04-01T02:41:54.000Z · LW(p) · GW(p)

It's sometimes thought or said that the second incompleteness theorem shows that formal systems such as PA are in some sense "incapable of self-reflection", and their incapability to prove their own consistency reflects this lacuna in their "cognitive capabilities" or something along those lines. 

Löb's theorem highlights how this is a much deeper phenomenon: it's not just that these formal systems lack abilities, or formalizations of relevant mathematical insight, or whatever; no, it's that it's formally impossible for any consistent system of the relevant sort to even baldly assert their own consistency, on pain of inconsistency! 

That is, we can for instance construct, using the diagonal lemma, a sentence A such that A is, provably in PA, equivalent to "PA + A is consistent" -- thus creating, in a sense a theory that just asserts its own consistency, in the form of a random assertion about Diophantine equations or however we carry out the arithmetization of syntax, for no particular legible reason whatever -- but we then find that PA + A is inconsistent! (This is an immediate consequence of the second incompleteness theorem.)

I recommend Torkel Franzén's excellent "Inexhaustibility -- a non-exhaustive treatment" for anyone pondering these matters.

7 comments

Comments sorted by top scores.

comment by James Payor (JamesPayor) · 2023-03-28T20:54:38.029Z · LW(p) · GW(p)

I'm not sure I understand what you're interested in, but can say a few concrete things.

We might hope that PA can "learn things" by looking at what a copy of itself can prove. We might furthermore expect that it can see that a copy of itself will only prove true sentences.

Naively this should be possible. Outside of PA we can see that PA and its copy are isomorphic. Can we then formalize this inside PA?

In the direct attempt to do so, we construct our inner copy , where is a statement that says "there exists a proof of in the inner copy of PA".

But Löb's Theorem rules out formalizing self-trust this way. The statement means "there are no ways to prove falsehood in the inner copy of PA". But if PA could prove that, Löb's Theorem turns it directly into a proof of !

This doesn't AFAICT mean self-trust of the form "I trust myself not to prove false things" is impossible, just that this approach fails, and you have to be very careful about deferral.

Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-28T22:29:33.774Z · LW(p) · GW(p)

I don't think that PA being able to prove that you cannot prove falsehood means that you can prove falsehood from the theorem. If you look at my response to quetzal_rainbow's answer, a simple substitution of false for X returns "it is provable that (it is not provable that False) implies (it is not provable that False)."

Replies from: JamesPayor
comment by James Payor (JamesPayor) · 2023-03-28T22:52:58.062Z · LW(p) · GW(p)

Hm I think your substitution isn't right, and the more correct one is "it is provable that (it is not provable that False) implies (it is provable that False)", ala .

I'm again not following well, but here are some other thoughts that might be relevant:

  • It's provable for any that , i.e. from "False" anything follows. This is how we give "False" grounding: if the system proves False, then it proves everything, and distinguishes nothing.

  • There are two levels at which we can apply Löb's Theorem, which I'll call "outer Löb's Theorem" and "inner Löb's Theorem".

    • Outer Löb's Theorem says that whenever PA proves , then PA also proves . It constructs the proof of using the proof of .
    • Inner Löb's Theorem is the same, formalized in PA. It proves . The logic is the same, but it shows that PA can translate an inner proof of into an inner proof of .
    • Notably, the outer version is not . We need to have available the proof of in order to prove .
Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-28T23:36:01.844Z · LW(p) · GW(p)

So from quetzal_rainbow's answer, I don't have an assumption of self-trust, I only have the substitution of false. I believe making this substitution is fine, but that we prove that "it is not provable that False" from this. Also fine.

From ZT5's answer, he asserts that self-trust is equivalent to the additional assertion "For all T, it is provable that T implies T." I don't have a problem with that assertion, only with the claim that this means that both T and not T are provable (it says that if T is provable, then T. But if T is provable, that doesn't mean that not T is provable).

From both answers, and yours, I gleam that it is either believed that Löb's Theorem by itself is inconsistent, or that it with the additional assumption of "self trust" (not yet formalized to my personal satisfaction yet) is inconsistent.

Depending on the answer, we apparently are not sure if self trust is an additional assumption on top of the theorem or one that is contained in the theorem itself as a conditional.

Replies from: JamesPayor
comment by James Payor (JamesPayor) · 2023-03-29T00:20:10.197Z · LW(p) · GW(p)

Saying some more things, Löb's Theorem is a true statement: whenever talks about an inner theory at least as powerful as e.g. PA, the theorem shows you how to prove .

This means you cannot prove , or similarly that .

is one way we can attempt to formalize "self-trust" as "I never prove false things". So indeed the problem is with this formalization: it doesn't work, you can't prove it.

This doesn't mean we can't formalize self-trust a different way, but it shows the direct way is broken.

Replies from: thoth-hermes
comment by Thoth Hermes (thoth-hermes) · 2023-03-30T00:50:25.892Z · LW(p) · GW(p)

It might be that the formalization that Lob's Theorem is based in isn't powerful enough to deal with this problem yet. 

However, it's got to be powerful enough to deal with the self-trust issue, since on some level, it is about the self-trust issue. 

Lob's Theorem is ostensibly about PA or a system at least as powerful as PA. So it itself must be working within such a system.

If that system isn't capable of self-trust, then we can't trust Lob's Theorem. What I believe you're arguing and that the Sequences are arguing essentially amounts to that. What I'm not satisfied with is the Sequences' level of clarity at articulating whether or not Lob's Theorem (and by extension PA and systems at least as powerful as it) are adequate enough to be trusted such that we can use it to formalize self-trust in general. The Sequences are quite ambiguous about this IMO - they don't even state the problem as if this issue were indeterminate - they say only that Lob's Theorem itself means that we can't formalize self-trust. That amounts to essentially saying that "We trust Lob's Theorem completely, and PA, etc., which states that we can't trust Lob's Theorem completely nor PA, etc." 

IMO - from my own analysis of the problem, which I choose to trust for deep and very well-thought out reasons [heh] - Lob's Theorem is compelling enough to trust, which says we can trust it and PA-or-greater systems enough to believe in things that it proves. This is the common-sense, intuitive way of interpreting what it means, which I note that importantly means the same thing as having self-trust and trust in the system you're using.

comment by Gordon Seidoh Worley (gworley) · 2023-03-28T21:41:54.223Z · LW(p) · GW(p)

I recently revisited Lob. I bounced off it a lot. It helped after I watched the video linked at the top of this post [LW · GW] where Eliezer combines it with Godel theorem.