Why I Don't Believe The Law of the Excluded Middle

thoth-hermes

Why I Don't Believe The Law of the Excluded Middle

post by Thoth Hermes (thoth-hermes) · 2023-09-18T18:53:48.704Z · LW · GW · 46 comments

This is a link post for https://thothhermes.substack.com/p/why-i-dont-believe-the-law-of-the

46 comments

The Law of the Excluded Middle says that there are only two “truth-values:” namely True and False. It also says that X and not-X is false, while X or not-X is true. If we accept LoEM already, then LoEM is either true or false. However, there are already two reasons I don’t like doing that:

It introduces the concept of “actually true” and “actually false” independent of whether or not we’ve chosen to believe something.
It’s self-referential. So if I believe it, that means I have to believe it really hard. Weakly believing it is a lot like not believing it at all, which I find to be unfair.

Because it is asking to me to believe it completely if I believe it at all, I feel more comfortable choosing to consider it “false” on my terms, which is merely that it gave me no other choice because it defined itself to be false when it is only believed weakly.

It’s a bit like associating the interval [0,1) with False and {1} with True. This is because if I have some collection of propositions (say {A_i}), then if I take the “and” of them all together to make A, then A is false if at least one of the A_i’s is false. Thus, not only is there no sense of a “partially true” proposition, even if I could measure the proportion of true pieces of a proposition, any proportion besides exactly 1 is considered exactly 0. If you’ve given me a collection of statements, and all you told me is that one of those statements is wrong, but not which one, I would still feel as though you’d given me something useful.

Conversely, if you gave me a collection of statements, and you told me that all you knew about them was that the “or” of them all together was true, I’d probably feel like you could have given me a bunch of junk.

But if we think in “calculus” mode, then a huge statement made of lots of regular-sized statements concatenated together with “and” can be identified with a regular-sized statement that is implicitly constructed out of a potentially infinite number of infinitesimal-sized concepts, and thus could be factored into a huge statement made of regular-sized statements concatenated with “and.” Thus if only one if these infinitesimal-sized concepts is false, the whole thing is false.

We know that “Newtonian physics” is technically not true. Or do we? All we know is that it has been superseded by several more advanced, more accurate-within-their-own-domains, but mutually-incompatible theories. Which means those are technically not true either. But we continue to use all of them, including Newtonian physics. How is that possible?

This means that in LoEM, the “false” is made enormously (infinitely, in the limit) powerful, compared to the “true.” But this doesn’t seem to match how “false” apparently works in reality. This is basically why I doubt the LoEM.

You know what’s really great about doubting the LoEM? It’s that even if you want to assume it’s true, you can, and you can see what happens. Furthermore, I do trust that what we know about what happens is indeed what happens when you assume the LoEM, and therefore, that we don’t really face that much danger when we want to entertain counterfactuals. If you want absolute certainty (in the form of consistency), apparently you have to make do with not knowing that you have it.

People who believed in the LoEM wanted to know things with absolute certainty. They also were worried about what would happen if you believed in things that were inconsistent. Let’s say you take a bunch of axioms. Now, suppose you could “prove false” with it (that means prove a contradiction). This is equivalent to one of your statements being the negation of another. Since your theory is all of your axioms (and derived propositions) “and”-ed together, your entire theory is false.

I assume that “false” means something meaningful, as well as that it’s generally a bad thing when you believe something that is false. If “false” meant in reality what it means in LoEM, it would be infinitely bad to believe something false. One bad belief spoils the entire batch!

An example of something bad happening that could come from believing in a contradiction is called the “Principle of Explosion.” This is what people refer to when they say, “From a contradiction, anything follows.” How it works feels like a cheat: Assume you believe any proposition P as well as its negation, not-P. Now, suppose you have any proposition Q. Because you believe P, you also believe Q or P. Does that seem like it should get us very much?

But we also believe not-P. Therefore, because Q or P, and also that not-P, therefore Q! Does that seem valid to you? It doesn’t to me!

Those steps above constitute a fully general proof for “Q” which is an arbitrary statement. So I’ve already proven “everything” just from doing that. But a proof is supposed to do the work of convincing me that something is true. That doesn’t feel like a proof of anything, let alone a proof of everything. So assuming a contradiction does not allow us to feel like we’ve proven whatever we want to be true.

That’s because if I wanted to prove something arbitrary from a contradiction, the proof would need to take the same form as described above, which already does not feel convincing. So I strongly feel that from a contradiction, anything does not follow.

People use the Principle of Explosion rhetorically during arguments in the following way: Arguer A makes a “big claim” that Arguer B finds preposterous, but hard to refute. Arguer B spends time debating Arguer A, trying to find any pair of claims by the latter that seem to contradict. Once Arguer B finds something (which might not be hard to do once pressing A for lengthy explanations), B claims “from a contradiction, anything can be proven! Which is how you arrived at (big claim).”

But while A might indeed be wrong about (big claim), that is not how A arrived at it. A did not merely take a proof similar to the one I gave above, consider that valid, then swap out Q for (big claim), and feel triumphant. He did not do that because no one could do so and hope to feel confident in any arbitrary claim they thought they could prove that way. That isn’t how people reason, so we can’t claim that people arrive at faulty conclusions that way.

I forgot to mention: I am not even quite sure what it means to believe P and not-P simultaneously. Perhaps it is possible in some way, but what it seems to mean in this context is that I can arbitrarily swap P for not-P whenever I feel like it. If I can do that, then I can use that as a rule to “get to” anywhere I want, presumably.

But what if I was permitted to kind of believe P and not-P, and also kind of not believe it? That would mean that Q or P is not necessarily true (nor necessarily false), nor is not-P necessarily true (nor necessarily false). So that means I don’t have a proof of Q (which I didn’t want anyway).

Given that not-P in this case is the LoEM, it looks to me like kind of believing it and kind of not believing it prevents the “explosion” from happening.

Here’s my goal, stated in very hand-wavy, fuzzy, intuitionistic terms:

I want to be able to entertain a diverse array of theories that may not only contradict each other, they may also be internally inconsistent as well. But I want to do this in a way that I can “tease out” the consequences of the theories and their inconsistencies by pointing to what would need to be changed to make one compatible (or more compatible) with another.

It seems like it should be possible for reasons I mentioned earlier: Physicists seem to be granted the luxury of holding mutually-incompatible theories in their heads, while also getting to know where and how they don’t agree with each other or reality. For example, if I want to believe in time travel, then I have to believe in closed-timelike-curves. If I want to believe in those, while also getting to make one if it didn’t exist before, I would need to have negative mass, apparently. It may not be the case that negative mass exists or can exist, but it is at least nice that we get to know what would have to be true in order to make our proposition true.

If I wanted to partially believe in LoEM, then I would see that if I wanted to prove that time travel is not possible, I could use that above reasoning to conclude that because negative mass isn’t possible, time travel is not.

If I wanted to partially disbelieve in LoEM, than I could also partially disbelieve that we had disproven the possibility of time travel. Maybe we could “tease out” the consequences of negative mass, to see what else would need to be for it to exist.

Here’s a question to conclude this piece with: Is Löb's theorem still true even if we don’t accept LoEM? (My guess is yes.)

46 comments

Comments sorted by top scores.

comment by tailcalled · 2023-09-18T21:07:23.313Z · LW(p) · GW(p)

The Law of the Excluded Middle says that there are only two “truth-values:” namely True and False. It also says that X and not-X is false, while X or not-X is true. If we accept LoEM already, then LoEM is either true or false. However, there are already two reasons I don’t like doing that:
It introduces the concept of “actually true” and “actually false” independent of whether or not we’ve chosen to believe something.
It’s self-referential. So if I believe it, that means I have to believe it really hard. Weakly believing it is a lot like not believing it at all, which I find to be unfair.

You are mixing up the law of excluded middle with two-valued logic.

In logic, one distinguishes between syntax (what statements can you express, and how does one statement follow from another statement) and semantics (what external meaning does a statement have).

The syntactic side is done by having a formal language with symbols such as , $\lor$ and $\neg$ that can be combined into propositions, and then specifying the rules for how to derive one proposition from another.

The validity of these rules is determined by the semantic side, where each proposition is given a meaning (truth-value). This meaning cannot be accessed from within the language, but only exists on the meta level.

Two-valuedness says that each proposition is given one of two truth-values, which if we were go formal would look something like $\forall p \in L, v (p) \in {⊤, ⊥}$ where $v$ refers to the truth-value function. On the other hand, law of excluded middle just says $P \lor \neg P$ ; this isn't even in the same language as the previous one because this is an "internal" thing, like a possible input to $v$ .

Two-valuedness is independent of LEM. There are multi-valued logics (i.e. logics where you have more than 2 truth values) which validate the law of excluded middle, e.g. if $n$ is a natural number then you have a $2^{n}$ -valued logic consisting of length- $n$ sequences of booleans, with logical operators applied pointwise. This has basic odd truth values like $⊤ ⊥$ , but these odd truth values still satisfy LEM because $⊤ ⊥ \lor \neg ⊤ ⊥ = ⊤ ⊥ \lor ⊥ ⊤ = ⊤ ⊤$ . On the other hand, there are also logics that don't validate LEM but are two-valued, such as realizability logic (which is kind of complicated but basically statements such as $P \lor Q$ are interpreted to mean "there is a procedure by which you can figure out whether $P$ is true or $Q$ is true").

Furthermore, LEM is not self-referential; it just says $P \lor \neg P$ . One could argue that two-valuedness is self-referential, since it makes reference to $v$ , the truth-value function; but $v$ is not accessible within the logic and instead two-valuedness is an external principle imposed.

Here’s a question to conclude this piece with: Is Löb's theorem still true even if we don’t accept LoEM? (My guess is yes.)

Yes. But also the principle of explosion and the principle of noncontradiction are also true even if you don't accept LoEM. Your proof of the principle of explosion didn't use the law of excluded middle at all.

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-18T21:24:54.777Z · LW(p) · GW(p)

If there are propositions or axioms that imply each other fairly easily under common contextual assumptions, then I think it's reasonable to consider it not-quite-a-mistake to use the same name for such propositions.

One of the things I'm arguing is that I'm not convinced that imprecision is enough to render a work "false."

Are you convinced those mistakes are enough to render this piece false or incoherent?

That's a relevant question to the whole point of the post, too.

Replies from: tailcalled

↑ comment by tailcalled · 2023-09-18T21:36:40.427Z · LW(p) · GW(p)

If there are propositions or axioms that imply each other fairly easily under common contextual assumptions, then I think it's reasonable to consider it not-quite-a-mistake to use the same name for such propositions.

What equivalences do you have in mind when you say "imply each other"?

One of the things I'm arguing is that I'm not convinced that imprecision is enough to render a work "false."
Are you convinced those mistakes are enough to render this piece false or incoherent?

It's certainly true that there is a scientific/logical conundrum about how to deal with imprecision. I know a lot about what happens when you tinker with the law of excluded middle, though, and I am not convinced this has any impact on your ability to deal with imprecision.

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-18T22:16:43.388Z · LW(p) · GW(p)

A succinct way of putting this would be to ask: If I were to swap the phrase "law of the excluded middle" in the piece for the phrase "principle of bivalence" how much would the meaning of it change as well as overall correctness?

Additionally, suppose I changed the phrases in just "the correct spots." Does the whole piece still retain any coherence?

Replies from: tailcalled, tailcalled

↑ comment by tailcalled · 2023-09-20T08:00:26.031Z · LW(p) · GW(p)

Actually, here's something that may be helpful in understanding why the principle of bivalence is distinct from the law of excluded middle:

As I understand you, one of the core points you are making is that you want to be able to entertain incompatible models. So let's say that you have two models and $W$ that are incompatible with each other.

For simplicity, let's say both models share a language $L$ in which they can express propositions, and assign truth values to statements in that language using functions $v_{M}, v_{W}$ mapping statements to truth-values. (For instance, maybe $M$ is a flat-earth approximation, and $W$ is a spherical-earth approximation, so $v_{M} (The earth is flat) = ⊤$ but $v_{W} (The earth is flat) = ⊥$ .)

Because these are just models, your points don't apply within the models; it might be fine for an approximation to say that everything is true or false, as long as we keep in mind that it's just an approximation and different approximations might lead to different results. As a result, all of the usual principles of logic like bivalence, noncontradiction, excluded middle, etc. apply within the models.

However, outside/between the models, there is a sense that what you are saying applies. For instance we get an apparent contradiction/multiple truth values for $v_{M} (The earth is flat) = ⊤$ vs $v_{W} (The earth is flat) = ⊥$ . But these truth values live in separate models, so they don't really interact, and therefore aren't really a contradiction.

But you might want to have a combined model where they do interact. We can do this simply by using the $2^{n}$ -valued approach I mentioned in my prior comment. Define a shared model $M \times W$ by the truth-value function $v_{M \times W} (P) = v_{M} (P) v_{w} (P)$ . So for instance $v_{M \times W} (The earth is flat) = ⊤ ⊥$ .

Here you might interpret $⊤ ⊥$ as meaning something along the lines of "true in practice but not in theory" or "true for small things but not for big things", and you might interpret $⊥ ⊤$ as meaning something along the lines of "technically true but not in practice" or "true for big things but not for small things". While "The earth is flat" would be an example of a natural statement with the former truth value, "The earth is finite" would be an example of a natural statement with the latter truth value. And then we also have truth values like $⊤ ⊤$ meaning "wholly true" and $⊥ ⊥$ meaning "wholly false".

$M \times W$ still satisfies standard logical principles like excluded middle, law of noncontradiction, etc.. This is basically because it is just two logics running side-by-side and not interacting much. This also makes it kind of boring, because you could just as well work with $M$ and $W$ separately.

We can make them somewhat less boring/more interacting by adding logical operators to the language that makes the different logics interact. For instance, one operator we might add is a definitely/necessarily/wholly operator, denoted $□ P$ , which is interpreted to mean that $P$ is true in both $M$ and $W$ . So for instance $v_{M \times W} (□ The earth is flat) = ⊥ ⊥$ because the earth isn't flat in the spherical-earth model. We can also add a sort of/possibly/partly operator, denoted $⋄ P$ , which is interpreted to mean that $P$ is true in at least one of $M$ and $W$ , so for instance $v_{M \times W} (⋄ The earth is finite) = ⊤ ⊤$ .

These operators allow you to express some of the principles that you have been complaining about. For instance, there is a stronger version of excluded middle $(□ P) \lor (□ \neg P)$ which is sort of an internalized version of the principle of bivalence, and this stronger version of excluded middle is false for some statements such as "The earth is flat". There's also a stronger version of the law of noncontradiction $\neg ((⋄ P) \land (⋄ \neg P))$ which again is a sort of internalized notion of the principle of bivalence and is also false for the same statements the other one is false for.

So far so good; this seems to give you exactly what you want, right? A way to entertain multiple models at once. But this requires all of the models to use a shared language, and in my experience, this requirement is actually the killer for all of these things. For instance, Newtonian physics doesn't share a language with Einsteinian physics, as the ontology is fundamentally different. It is nearly impossible to make different logics share a language sufficiently well for this to work in interesting ways. I'd guess that the route to solving this goes through John Wentworth's Natural Abstractions, but they haven't been developed enough yet to solve it.

Overall, there's a bunch of work into this sort of logic, but it isn't useful because the fundamental problem is in how to fluently switch between ontologies, not in how to entertain multiple things at once.

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-21T15:17:31.178Z · LW(p) · GW(p)

So suppose I have ~(A and ~A). Rather than have this map to False, I say that "False" is an object that you always bounce off of; It causes you to reverse-course, in the following way:

~(A and ~A) --> False --> A or ~A or (some mysterious third thing). What is this mysterious third thing? Well, if you insist that A and ~A is possible, then it must be an admixture of these two things, but you'd need to show me what it is for that to be allowed. In other words:

~(A and ~A) --> A or ~A or (A and ~A).

What this statement means in semantic terms is: Suppose you give me a contradiction. Rather than simply try really hard to believe it, or throw everything away entirely, I have a choice between believing A, believing ~A, or believing a synthesis between these two things.

The most important feature of this construction is that I am no longer faced with simply concluding "false" and throwing it all away.

Two examples:

Suppose we have the statement 1 = 2^[1]. In most default contexts, this statement simply maps to "false," because it is assumed that this statement is an assertion that the two symbols to the left and right of the equals sign are indistinguishable from one another.

But what I'm arguing is that "False" is not the end-all, be-all of what this statement can or will be said to mean in all possible universes forever unto eternity. "False" is one possible meaning which is also valid, but it cannot be the only thing that this means.

So, using our formula from above:

1 = 2 -->^[2] 1 or 2 or (1 and 2). So if you tell me "1 = 2", in return I tell you that you can have either 1, either 2, or either some mysterious third thing which is somehow both 1 and 2 at the same time.

So you propose to me that (1 and 2) might mean something like 2 (1/2), that is, two halves, which mysteriously are somehow both 1 and 2 at the same time when put together. Great! We've invented the concept of 1/2.

Second example:

We don't know if A is T and thus that ~A is F or vice-versa. Therefore we do not know if A and ~A is TF or FT. Somehow, it's got to be mysteriously both of these at the same time. And it's totally fine if you don't get what I'm about to say because I haven't really written it anywhere else yet, but this seems to produce two operators, call them "S" (for swap) and "2" (for 2), each duals of one another.

S is the Swaperator, and 2 is the Two...perator. These also buy you the concept of 1/2 as well. But all that deserves more spelling out, I was just excited to bring it up.

^{^}
It is arguably appropriate to use 1 == 2 as well, but I want to show that a single equals sign "=" is open to more interpretations because it is more basic. This also has a slightly different meaning too, which is that the symbols 1 and 2 are swappable with one another.
^{^}
You could possibly say "--> False or 1 or 2 or ...", too, but then you'd probably not select False from those options, so I think it's okay to omit it.

Replies from: tailcalled

↑ comment by tailcalled · 2023-09-21T17:18:31.099Z · LW(p) · GW(p)

You should use a real-world example, as that would make the appropriate logical tools clearer.

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-21T17:33:37.155Z · LW(p) · GW(p)

Well, to use your "real world" example, isn't that just the definition of a manifold (a space that when zoomed in far enough, looks flat)?

I think it satisfies the either-or-"mysterious third thing" formulae.

~(Earth flat and earth ~flat) --> Earth flat (zoomed in) or earth spherical (zoomed out) or (earth more flat-ish the more zoomed in and vice-versa).

Replies from: tailcalled

↑ comment by tailcalled · 2023-09-21T19:50:33.881Z · LW(p) · GW(p)

First, a question, am I correct in understanding that when you write ~(A and ~A), the first ~ is a typo and you meant to write A and ~A (without the first ~)? Because is a tautology and thus maps to true rather than to false.

Secondly, it seems to me that you'd have to severely mutilate your logic to make this nontrivial. For instance, rather than going by your relatively elaborate route, it seems like a far simpler route would be Earth flat and earth ~flat => Earth flat => Earth flat or Earth spherical or ....

Of course this sort of proof doesn't capture the paradoxicalness that you are aiming to capture. But in order for the proof to be invalid, you'd have to invalidate one of $(A \land B) ⟹ A$ and $A ⟹ (A \lor B)$ , both of which seem really fundamental to logic. I mean, what do the operators "and" and "or" even mean, if they don't validate this?

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-21T22:02:20.491Z · LW(p) · GW(p)

First, a question, am I correct in understanding that when you write ~(A and ~A), the first ~ is a typo and you meant to write A and ~A (without the first ~)? Because is a tautology and thus maps to true rather than to false.

I thought of this shortly before you posted this response, and I think that we are probably still okay (even though strictly speaking yes, there was a typo).

Normally we have that ~A means: ~A --> A --> False. However, remember than I am now saying that we can no longer say that "~A" means that "A is False."

So I wrote:

~(A and ~A) --> A or ~A or (A and ~A)

And it could / should have been:

~(A and ~A) --> (A and ~A) --> False (can omit) ^[1]or A or ~A or (A and ~A).

So, because of False now being something that an operator "bounces off of", technically, we can kind of shorten those formulas.

Of course this sort of proof doesn't capture the paradoxicalness that you are aiming to capture. But in order for the proof to be invalid, you'd have to invalidate one of $(A \land B) ⟹ A$ and $A ⟹ (A \lor B)$ , both of which seem really fundamental to logic. I mean, what do the operators "and" and "or" even mean, if they don't validate this?

Well, I'd have to conclude that we no longer consider any rules indispensable, per se. However, I do think "and" and "or" are more indispensable and map to "not not" (two negations) and one negation, respectively.

^{^}
False can be re-omitted if we were decide, for example, that whatever we just wrote was wrong and we needed to exit the chain there and restart. However, I don't usually prefer that option.

Replies from: tailcalled

↑ comment by tailcalled · 2023-09-22T07:45:42.488Z · LW(p) · GW(p)

You write in an extremely fuzzy way that I find hard to understand. This is plausibly related to the motivation for your post; I think you are trying to justify why you don't need to make your thinking crisper? But if so I think you need to focus on it from the psychology/applications/communication angle rather than from the logic/math angle, as that is more likely to be a crux.

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-22T15:01:43.484Z · LW(p) · GW(p)

It is probably indeed a crux but I don't see the reason for needing to scold someone over it.

(That's against my commenting norms by the way, which I'll note that so far you, TAG, and Richard_Kennaway have violated, but I am not going to ban anyone over it. I still appreciate comments on my posts at all, and do hope that everyone still participates. In the olden days, it was Lumifer that used to come and do the same thing.)

I have an expectation that people do not continually mix up critique from scorn, and please keep those things separate as much as possible, as well as only applying the latter with solid justification.

You can see that yes, one of the points I am trying to make is that an assertion / insistence on consistency seems to generally make things worse. This itself isn't that controversial, but what I'd like to do is find better ways to articulate whatever the alternatives to that may be, here.

It's true that one of the main implications of the post is that imprecision is not enough to kill us (but that precision is still a desirable thing). We don't have rules that are simply tautologies or simply false anymore.

At least we're not physicists. They have to deal with things like negative probability, and I'm not even anywhere close to that yet.

Replies from: tailcalled

↑ comment by tailcalled · 2023-09-22T17:47:39.945Z · LW(p) · GW(p)

What makes you say I'm scolding you?

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-22T20:26:39.678Z · LW(p) · GW(p)

You write in an extremely fuzzy way that I find hard to understand.

This does. This is a type of criticism that one can't easily translate into an update that can be made to one's practice. You're not saying if I always do this or just in this particular spot, nor are you saying whether it's due to my "writing" (i.e. style) or actually using confused concepts. Also, it's usually not the case that anyone is trying to be worse at communicating, that's why it sounds like a scold.

You have to be careful using blanket "this is false" or "I can't understand any of this," as these statements are inherently difficult to extract from moral judgements.

I'm sorry if it was hard to understand, you are always free to ask more specific questions.

To attempt to clarify it a bit more, I'm not trying to say that worse is better. It's that you can't consider rules (i.e. yes / no conditionals) to be absolutely indispensable.

↑ comment by tailcalled · 2023-09-18T22:29:47.233Z · LW(p) · GW(p)

If I were to swap the phrase "law of the excluded middle" in the piece for the phrase "principle of bivalence" how much would the meaning of it change

A lot. At least I associate multi-valued logic with a different sphere of research than intuitionism.

as well as overall correctness?

My impression is that a lot of people have tried to do interesting stuff with multi-valued logic to make it handle the sorts of things you mention, and they haven't made any real progress, so I would be inclined to say that it is a dead-end.

Though arguably objections like "It introduces the concept of “actually true” and “actually false” independent of whether or not we’ve chosen to believe something." also apply to multi-valued logic so idk to what extent this is even the angle you would go on it.

comment by Richard_Kennaway · 2023-09-18T19:36:11.145Z · LW(p) · GW(p)

But what if I was permitted to kind of believe P and not-P, and also kind of not believe it?

This is what probability is for.

See also.

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-18T20:18:40.842Z · LW(p) · GW(p)

Indeed. (You don't need to link the main wiki entry, thanks.)

There's some subtlety though. Because either P might be true or not P, and p(P) expresses belief that P is true. So I think probability merely implies that the LoEM might be unnecessary, but it itself pretty much assumes it.

It is sometimes, but not always the case, that p(P) = 0.5 resolves to P being "half-true" once observed. It also can mean that P resolves to true half the time, or just that we only know that it might be true with 0.5 certainty (the default meaning).

comment by Viliam · 2023-09-19T14:09:19.736Z · LW(p) · GW(p)

I am not an expert, but I suspect that the problem with the law of excluded middle is that some statements are neither true nor false, but self-referential and thus kinda meaningless. Possibly in a non-obvious way.

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-19T14:47:14.564Z · LW(p) · GW(p)

I created this self-referential market on Manifold to test the prediction that the truth-value of such a paradox is in fact 1/2. Very few participated, but I think it should always resolve to around 50%. Rather than say such paradoxes are meaningless, I think they can be meaningfully assigned a truth-value of 1/2.

Replies from: Viliam

↑ comment by Viliam · 2023-09-19T23:09:29.590Z · LW(p) · GW(p)

How do you intend to use the law of excluded middle in a three-value logic (0, 1/2, 1)? I though the entire purpose was to make statements like "if X is not 0, it must be 1", which now becomes "if X is not 0, it must be 1/2 or 1". So you lose the ability to prove indirectly that something is 1.

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-20T20:03:03.853Z · LW(p) · GW(p)

I give only maybe a 50% chance that any of the following adequately addresses your concern.

I think the succinct answer to your question is that it only matters if you happened to give me, e.g., a "2" (or anything else) and you asked me what it was and gave me your {0,1} set. In other words, you lose the ability to prove that 2 is 1 because it's not 0, but I'm not that worried about that.

It appears to be commonly said (see the last paragraph of "Mathematical Constructivism"), that proof assistants like Agda or Coq rely on not assuming LoEM. I think this is because proof assistants rely on the principle of "you can't prove something false, only true." Theorems are the [return] types of proofs, and the "False" theorem has no inhabitants (proofs).

The law of the excluded middle also seems to me like an insistence that certain questions (like paradoxes) actually remain unanswered.

That's an argument that it might not be true at all, rather than simply partially true or only not true in weird, esoteric logics.

Besides the one use-case for the paradoxical market: "Will this market resolve to no?" Which resolves to 1/2 (I expect), there may be also:

Start with two-valued logic and negation as well as a two-member set, e.g., {blue, yellow}. I suppose we could also include a . So including the excluded middle might make this set no longer closed under negation, i.e., ~blue = yellow, and ~yellow = blue, but what about green, which is neither blue nor yellow, but somehow both, mysteriously? Additionally, we might not be able to say for sure that it is neither blue nor yellow, as there are greens which can be close to blue and look bluish, or look close to yellow and look yellowish. You can also imagine pixels in a green square actually being tiled blue next to yellow next to blue etc., or simply green pixels, each seem to produce the same effect viewed from far away.

So a statement like "x = blue" evaluates to true in an ordinary two-valued logic if x = blue, and false otherwise. But in a {0, 1/2, 1} logic, that statement evaluates to 1/2 if x is green, for example.

Replies from: tailcalled, tailcalled

↑ comment by tailcalled · 2023-09-20T22:26:36.559Z · LW(p) · GW(p)

It appears to be commonly said (see the last paragraph of "Mathematical Constructivism"), that proof assistants like Agda or Coq rely on not assuming LoEM. I think this is because proof assistants rely on the principle of "you can't prove something false, only true." Theorems are the [return] types of proofs, and the "False" theorem has no inhabitants (proofs).

Not clear what you mean by "because proof assistants rely on the principle of "you can't prove something false, only true". There's a sense in which all math relies on this principle, and therefore proof assistants also rely on it. But proof assistants don't rely on it more than other math does. If you make inconsistent assumptions within Agda or Coq, you can prove False, just as in any other math. And they follow the principle of explosion.

But yes, proof assistants often reject the law of excluded middle. They generally do so in order to obtain two properties known as the disjunction property and the existence property. The disjunction property says that if is provable, then either $P$ is provable or $Q$ is provable. The existence property says that if $\exists x . P (x)$ is provable, then there is an expression $t$ such that $P (t)$ is provable.

These properties reflect the fact that proofs in Agda and Coq carry a computational meaning, so one can "run" the proofs to obtain additional information about what was proven.

One cannot have both the disjunction property and the law of excluded middle, because together they imply that the logic is complete, and consistent logics capable of expressing arithmetic cannot be complete by Gödel's incompleteness theorems.

↑ comment by tailcalled · 2023-09-21T08:40:02.946Z · LW(p) · GW(p)

Also if you are getting into proof assistants then you should probably be aware that they use the term "truth-values" in a different way than the rest of math. In the rest of math, truth-values are an external thing based on the relationship between a statement and the domain the statement is talking about. However, in proof assistants, truth-values are often used to refer to the internal notion of subsets of a one-element set; P({1}).

So while it is not equivalent to excluded middle to say that either a statement is true or a statement is false, it is equivalent to excluded middle to say that a subset of {1} is either Ø or is {1}. The logic being that if you have some proposition P, you could form the subset S={P|x in {1}}, and if P then by definition S is {1} and if not P then by definition S=Ø, so if P or not P then S={1} or S=Ø.

comment by ASDFDSA · 2023-11-30T23:01:54.557Z · LW(p) · GW(p)

Are you saying that it is unlikely for a statement to be true?

comment by drgunn · 2023-09-19T05:30:39.278Z · LW(p) · GW(p)

When I hear someone saying,

"Because it is asking to me to believe it completely if I believe it at all, I feel more comfortable choosing to consider it “false” on my terms, which is merely that it gave me no other choice because it defined itself to be false when it is only believed weakly."

what I think is "of course there are strong and weak beliefs!" but true and false is only defined relative to who is asking and why (in some cases), so you need to consider the context in which you're applying LoEM.

In other words, LoEM applies to "Does 2+2=4?" but it does not apply to "Is there water in the fridge?", unless the context is specified more carefully.

It's obviously an error to only have 100% or 0% as truth values for all propositions, and it's perhaps less obviously an error to have the same probabilities that a proposition is true across all possible contexts in which that proposition might be evaluated.

More here: https://metarationality.com/refrigerator

Replies from: thoth-hermes

↑ comment by Thoth Hermes (thoth-hermes) · 2023-09-19T14:28:46.646Z · LW(p) · GW(p)

what I think is "of course there are strong and weak beliefs!" but true and false is only defined relative to who is asking and why (in some cases), so you need to consider the context in which you're applying LoEM.

Like in my comment to Richard_Kennaway about probability, I am not just talking about beliefs, but about what is. Do we take it as an axiom or a theorem that A or ~A? Likewise for ~(A and ~A)? I admit to being confused about this. Also, does "A" mean the same thing as "A = True"? Does "~A" mean the same thing as "A = False"? If so, in what sense do we say that A literally equals True / False, respectively? Which things are axioms and which things are theorems, here? All of that confuses me.

Since we are often permitted to change our axioms and arrive at systems we either like or don't like, or like better than others, I think it's relevant to ask about our choice of axioms and whether or not logic is or should be considered a set of "pre-axioms."

It seemed like tailcalled was implying that the law of non-contradiction was a theorem, and I'm confused about that as well. Under which axioms?

If I decide that ~(A and ~A) is not an axiom, then I can potentially have A and ~A either be true or not false. Then we would need some other arguments to support that choice. Without absolute truth and absolute falsehood, we'd have to move back to the concept of "we like [it] better or worse" which would make the latter more fundamental. Does allowing A and ~A to mean something get us any utility?

In order for it to get us any utility, there would have to be things that we'd agree were validly described by A and ~A.

Admittedly, it does seem like these or's and and's and ='s keep appearing regardless of my choices, here (because I need them for the concept of choice).

In a quasi-philosophical and quasi-logical post I have not posted to LessWrong yet, I argue that negation seems likely to be the most fundamental thing to me (besides the concept of "exists / is", which is what "true" means). "False" is thus not quite the same thing as negation, and instead means something more like "nonsense gibberish" which is actually far stronger than negation.

Replies from: tailcalled, TAG

↑ comment by tailcalled · 2023-09-19T22:20:20.579Z · LW(p) · GW(p)

It's really hard to answer these sorts of questions universally because there's a bunch of ways of setting up things that are strictly speaking different but which yields the same results overall. For instance, some take to be a primitive notion, whereas I am more used to defining $\neg P$ to mean $P ⟹ ⊥$ . However, pretty much always the inference rules or axioms for taking $\neg P$ to be a primitive are set up in such a way that it is equivalent to $P ⟹ ⊥$ .

If you define it that way, the law of noncontradiction becomes $(P \land (P ⟹ ⊥)) ⟹ ⊥$ is pretty trivial, because it is just a special case of $(P \land (P ⟹ Q)) ⟹ Q$ , and if you don't have the rule $(P \land (P ⟹ Q)) ⟹ Q$ then it seems like your logic must be extremely limited (since it's like an internalized version of modus ponens, a fundamental rule of reasoning).

I have a bunch of experience dealing with logic that rejects the law of excluded middle, but while there are a bunch of people who also experiment with rejecting the law of noncontradiction, I haven't seen anything useful come of it, I think because it is quite fundamental to reasoning.

Also, does "A" mean the same thing as "A = True"? Does "~A" mean the same thing as "A = False"?

Statements like $A = ⊥$ are kind of mixing up the semantic (or "meta") level with the syntactic (or "object") level.

G2g

↑ comment by TAG · 2023-09-19T15:37:56.998Z · LW(p) · GW(p)

Like in my comment to Richard_Kennaway about probability, I am not just talking about beliefs, but about what is. Do we take it as an axiom or a theorem that A or ~A?

Are you aware that there are different logics with different axioms? PMC amd LEM are both axioms in Aristotelean logic: LEM does not apply in fuzzy logic; and PNC does not apply in paraconsistent logic.

False” is thus not quite the same thing as negation,

No: negation is a function, false is a value. That's clear in most programming languages.

False” is thus not quite the same thing as negation, and instead means something more like “nonsense gibberish”

That doesn't follow at all.

comment by Michael Carey (michael-carey) · 2023-09-20T20:59:30.696Z · LW(p) · GW(p)

What do you mean by "believe in the Law of Excluded Middle"

Do you need to believe it applies to all conceivable statements?

Usually, when one is working within a framework assuming the Law Of Excluded Middle, it's only true for their Domain of Discourse.

Whether it's true outside that domain is irrelevent.

The Law of Excluded Middle is obviously false, in the framework of quantum bits, where 1 = true, 0 = False. So, I doubt anyone believes it applies Universally, under all interpretations.

comment by Noosphere89 (sharmake-farah) · 2023-09-19T00:00:55.210Z · LW(p) · GW(p)

So the reason why the Law of the Excluded Middle is used has to do with 2 reasons:

(Side note, the Law of the Excluded Middle maps False to 0, and True to 1.)

It's a tautology, in the sense that for any input, it always outputs true, and this is a huge benefit since it is always right, no matter what input you give in the boolean formulation. This means that for our purposes, we don't have to care about the non-boolean cases, since we can simply map their truth values to 2 values like true or false, or 0 or 1, etc.
For a mathematical statement to be proven true or false, at a very high level, we can always basically reformulate the mathematical statement and turn it into a program that halts or doesn't halt, or stops at a finite time or continues running forever/infinite time, where the truth value of the statement depends on whether it starts with an existential quantifier or universal quantifier.

Critically, a program either halts at a finite time or runs forever, and there is no middle ground, so the law of the excluded middle applies. You may not know whether it halts or runs forever, but these are the only two outcomes that a program can have.

If it starts with an existential quantifier, then if the program halts, the statement is true, and if the program doesn't halt, it's false.

If it starts with an universal quantifier, then if the program halts, the statement is false, and if the program doesn't halt, the statement is true.

Replies from: tailcalled, TAG

↑ comment by tailcalled · 2023-09-20T13:03:08.338Z · LW(p) · GW(p)

For a mathematical statement to be proven true or false, at a very high level, we can always basically reformulate the mathematical statement and turn it into a program that halts or doesn't halt, or stops at a finite time or continues running forever/infinite time, where the truth value of the statement depends on whether it starts with an existential quantifier or universal quantifier.

This is false. It only works if you don't use too complex a sequence of quantifiers. It is also limited to arithmetic and doesn't work with e.g. set theory.

Replies from: sharmake-farah

↑ comment by Noosphere89 (sharmake-farah) · 2023-09-20T14:29:33.397Z · LW(p) · GW(p)

This is false. It only works if you don't use too complex a sequence of quantifiers. It is also limited to arithmetic and doesn't work with e.g. set theory.

Hm, how is this procedure not going to work for certain classes of statements like set theory. Where does the encoding process fail?

Because I do remember that there was a stackexchange post which claimed we can always take a logical statement and express it as something that returns true if it halts, and false if it doesn't halt.

Replies from: tailcalled, michael-carey

↑ comment by tailcalled · 2023-09-20T18:34:55.719Z · LW(p) · GW(p)

The encoding involves looping over the domain of discourse, which works fine for arithmetic where the domain of discourse are the natural numbers, which are an enumerable set. However, when the domain of discourse is the sets, you can't just loop over them because there are too many with too complex relations which cannot be represented finitely.

Replies from: sharmake-farah

↑ comment by Noosphere89 (sharmake-farah) · 2023-09-20T20:14:46.189Z · LW(p) · GW(p)

Now my question is, how complicated is the domain of discourse of the sets, exactly?

Replies from: tailcalled

↑ comment by tailcalled · 2023-09-20T22:16:25.558Z · LW(p) · GW(p)

I don't know for sure but I think it is too complex for mathematicians to have come up with a way of quantifying the complexity of it. However, I can say a bit about what I mean by "too complex":

The natural numbers have a computable model. That is, in a nice language $L$ such as the language of bitstrings ${0, 1}^{*}$ , we can define a relation $R \subset N \times L$ where you can think of $n R s$ as meaning "the number $n$ can be written with the bitstring $s$ ". This relation can be defined in such as way as to have certain nice properties:

For each of the usual primitive operations over $N$ such as $+, \cdot, 0, 1, \dots$ , there is a corresponding computable algorithm over $L$ which preserves $R$ . For instance, there is an algorithm $A_{+}$ such that for all $n R s$ and $m R t$ , we have $(n + m) R A_{+} (s, t)$ .
For each of the usual primitive relations over $N$ such as $=$ , there is a corresponding computable algorithm over $L$ . For instance, there is an algorithm $A_{=}$ such that for all $n R s$ and $m R t$ , $(n = m) ⟺ A_{=} (s, t)$ .

When we have a relation $R$ and algorithms $A$ like this, rather than leaving the mathematical objects in question as abstract hypothetical thoughts, they become extremely concrete because you can stuff them into a computer and have the computer automatically answer questions about them.

This then raises the question of whether the sets $V$ have a computable model, which depends on what one considers the primitives. When I think of primitive operations for sets, I think of operations such as $\emptyset, \in, =, {-, -}, \cup, {f (-) | - \in - if ϕ (-)}, P (-), N, \dots$ . However, these are extremely powerful operations because they are inherently infinitary, and therefore allow you to solve very difficult problems.

For instance if we let $h (x, y)$ mean "the Turing machine indexed by $x$ halts within $y$ steps" (which in itself is a fairly tame proposition since you can answer it just by running Turing machine number x for y steps and seeing whether it halts), then one can solve the halting problem using set theoretic operations just by going ${y \in N | h (x, y)} \neq \emptyset$ . Since the halting problem is uncomputable, we must infer that any computable model of sets must be restricted to a language which lacks either subsets, equality, the empty set, or the set of all the natural numbers.

A simpler way to see that no computable model of $V$ exists is by a cardinality argument; $V$ is so big that it would be paradoxical for it to have a cardinality, whereas ${0, 1}^{*}$ is countable. However there problem with cardinality arguments is that one can get into weird stuff due to Skolem's paradox. Meanwhile, the observation that set theory would allow you to compute wild stuff and therefore cannot plausibly be computable holds even if you work in a countable setting via Skolem.

Replies from: tailcalled, sharmake-farah

↑ comment by tailcalled · 2023-09-21T07:50:00.121Z · LW(p) · GW(p)

A simpler way to see that no computable model of V exists is by a cardinality argument; V is so big that it would be paradoxical for it to have a cardinality, whereas {0,1}∗ is countable.

Wait derp this is wrong. The fancy schmancy relational approach I used specifically has the strength that it allows models of uncountable sets because only the constructible elements actually need to have a representation.

Replies from: sharmake-farah

↑ comment by Noosphere89 (sharmake-farah) · 2023-09-22T18:27:19.689Z · LW(p) · GW(p)

Hm, what does this mean for your argument that set theory is uncomputable by a Turing machine, for example?

Replies from: tailcalled

↑ comment by tailcalled · 2023-09-24T07:02:31.756Z · LW(p) · GW(p)

I had two separate arguments and it only breaks the second argument, not the first one.

↑ comment by Noosphere89 (sharmake-farah) · 2023-09-21T02:14:08.788Z · LW(p) · GW(p)

A neat little result, that set theory is RE-hard, and damn this is a very large set, so large that it's larger than every other cardinality.

This might be one of the few set theories that can't be completely solved even with non-uniformity, as sometimes non-uniform models of computation, in if we could make them, we could solve every language.

An example is provided on the 14th page of this paper:

https://arxiv.org/pdf/0808.2669.pdf

And this seems like a great challenge for the Universal Hypercomputer defined here, in that it could compute the entire universe of sets V using very weird resources.

https://arxiv.org/pdf/1806.08747.pdf

↑ comment by Michael Carey (michael-carey) · 2023-09-20T21:13:54.605Z · LW(p) · GW(p)

I imagine you are refering to a Turing Machine halting or not halting.

There are statements in Set Theory, which Turing Machines cannot interpret at all ( formally, they have a particular complexity), and require the introduction of "Oracles" in order to assist in interpreting. These are called Oracle Turing Machines. They come about frequently in Descriptive Set Theory.

Replies from: sharmake-farah

↑ comment by Noosphere89 (sharmake-farah) · 2023-09-21T16:50:35.574Z · LW(p) · GW(p)

So we can, in a manner of speaking, encode them as programs to be run in a Turing Machine with a oracle tape. That's not too hard to do, once we use stronger models of computation, and thus we can still encode set theory statements in more powerful computers.

So I'm still basically right, philosophically speaking, in that we can always encode set theory/mathematics statements in a program, using the trick I described of converting set theory into quantified statements, and then looking at the first quantifier to determine whether halting or not-halting is either true or false.

Replies from: michael-carey

↑ comment by Michael Carey (michael-carey) · 2023-09-22T03:10:50.564Z · LW(p) · GW(p)

You would be right in most cases. But, there is still the issue of independent statements. " ZF is consistent" can not be shown to be true or false, if ZF is consistent, via the Incompleteness Theorems.

So, some statements may not be shown to halting, or not halting. Which, is the famous halting problem.

Any algorithm would be unable to tell you, if the statement halts or doesn't halt. So, not all statements can be expressed as true/false or halt/not-halt

Replies from: sharmake-farah

↑ comment by Noosphere89 (sharmake-farah) · 2023-09-22T16:58:05.679Z · LW(p) · GW(p)

There are 2 things to be said here:

I didn't say that it had to return an answer, and the halting problem issue for Turing Machines is essentially that even though a program halts or doesn't halt, which in this case can be mapped to true or false, we don't have an algorithm that runs on a Turing Machine that can always tell us which of the 2 cases is true.
Gödel's incompleteness theorems are important, but in this context, we can basically enhance the definition of a computer philosophically to solve essentially arbitrary problems, and the validity problem of first order logic becomes solvable by introducing an Oracle tape or a closed timeline curve to a Turing Machine, and at that point we can decide the validity and satisfiability problems of first order logic.

You also mentioned that oracles like oracle tapes can provide the necessary interpretation for set theory statements to be encoded, which was my goal, since a Turing Machine or other computer with an Oracle tape, which gets around the first incompleteness theorem by violating an assumption necessary to get the result.

Replies from: michael-carey

↑ comment by Michael Carey (michael-carey) · 2023-09-22T18:10:46.869Z · LW(p) · GW(p)

Thank you for clarifying, I misunderstood your post.

Yes, you're right. "essentially" arbitrary problems would be free game.

There is a hierarchy of questions one can ask though. That is, whatever oracle you introduce, you can now ask more complex questions and would require a more complex oracle to answer ( basically, you can ask the first oracle, questions about itself, which require another more complex oracle to answer).

When I saw you use the word "computer" I thought you meant, a literal computer that we could in principle build.

Replies from: sharmake-farah

↑ comment by Noosphere89 (sharmake-farah) · 2023-09-22T18:25:30.190Z · LW(p) · GW(p)

My focus was on the more philosophical/impractical side, and the computers we can actually build in principle, assuming the laws of physics are unchangable and we are basically correct, we can't even build Universal Turing Machines/Turing Complete systems, but just linear bounded automatons, due to the holographic principle.

Also, the entire hierarchy can be solved simply by allowing non-uniform computational models, which is yet another benefit of non-uniform computational models.

↑ comment by TAG · 2023-09-19T15:45:08.018Z · LW(p) · GW(p)

It’s a tautology, in the sense that for any input, it always outputs true, and this is a huge benefit since it is always right, no matter what input you give in the boolean formulation.

Anything asserted as an axiom is a tautology.

You can justify the LEM by assuming there are two possible values, and you can prove there are only two values after assuming LEM, but you can't bootstrap both claims simultaneously.

So really, LEM is a choice about whether you are doing bivalent logic or multivalent logic.

Why I Don't Believe The Law of the Excluded Middle

Contents

46 comments