Very Basic Model Theory
post by So8res · 2013-10-31T07:06:56.951Z
In this post I'll discuss some basic results of model theory. It may be helpful to read through my previous post if you haven't yet. Model Theory is an implicit context for the Heavily Advanced Epistemology sequence and for a few of the recent MIRI papers, so casual readers may find this brief introduction useful. And who knows, maybe it will pique your interest:
A tale of two logics
Propositional logic is the "easy logic", built from basic symbols and the connectives "and" and "not". Remember that all other connectives can be built from these two: With Enough NAND Gates You Can Rule The World and all that. Propositional logic is sometimes called the "sentential logic", because it's not like any other logics are "of or relating to sentences" (/sarcasm).
First-order logic is the "nice logic". It has quantifiers ("there exists", "for all") and an internal notion of equality. Its sentences contain constants, functions, and relations. This lets you say lots of cool stuff that you can't say in propositional logic. First-order logic turns out to be quite friendly (as we'll see below). However, it's not strong enough to talk about certain crazy/contrived ideas that humans cook up (such as "the numbers").
There are many other logics available (second order logic AKA "the heavy guns", ω-logic AKA "please just please can I talk about numbers", and many more). In this post we'll focus on propositional and first-order logics.
Training Wheels
As a warm-up, let's define "models" for propositional logic.
Remember, logics together with languages produce sentences. Languages of propositional logic are sets of basic symbols. From these symbols and the two connectives of propositional logic (∧ and ¬) we construct our sentences.
For example, from the language {x, y} we can build the sentences x, ¬x, x∧y, and many more. We cannot build sentences like hello (which doesn't use the right symbols) or ¬xy (which breaks the rules of the logic).
We want to study interpretations for such sentences, where an "interpretation" is some object which assigns a truth value to each sentence. This is denoted by the ⊧ operator: A⊧B expresses some form of "A models B as true". This operator is used frequently and in many contexts throughout model theory.
What we're looking for is some object (set) such that there is a relation (⊧) between the object and all sentences generated by the language under consideration. There are many object/relation pairs that assign truth values in a stupid way. For example, an interpretation that assigns "true" to every sentence is quite useless.
Many other interpretations are somewhat useful but still "wrong". For example, there are interpretations of sentences which treat ∧ like implication instead of conjunction. Somebody may have use for such things, but we certainly don't.
We want to narrow our consideration to interpretations where ∧ means "and" and ¬ means "not", so we explicitly require object/relation pairs such that whenever the object models both φ and ψ, it also models φ∧ψ. (And X⊧φ iff it is not the case that X⊧¬φ, where X is the object under consideration.)
If we can find any objects that work like this, we'll be justified in calling them "models". However, we haven't defined any such objects yet — we've only constrained their potential behavior.
Any interpretation of the sentences generated from a language L of propositional logic must assign truth values to all basic sentence symbols in L. It turns out, once we select which sentence symbols are "true", we're done. The rest of the behavior is completely specified by the rules about ∧ and ¬.
In other words, any object/relation which assigns truth values to sentences according to the above rules is isomorphic to a set S of sentence symbols with the operator ⊧ defined in the obvious way (starting with S⊧x iff x∈S).
Exploring this idea gives you a feel for how to define models of a logic.
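To see how little freedom is left, here's a quick sketch in Python (my own illustration; the sentence encoding is an assumption, nothing standard): the model is literally just a set S of sentence symbols, and ⊧ is a recursive function over sentence structure.

```python
from dataclasses import dataclass
from typing import Union

# Sentences of propositional logic: a basic symbol, a negation, or a conjunction.
@dataclass(frozen=True)
class Sym:
    name: str

@dataclass(frozen=True)
class Not:
    body: "Sentence"

@dataclass(frozen=True)
class And:
    left: "Sentence"
    right: "Sentence"

Sentence = Union[Sym, Not, And]

def models(S: set[str], phi: Sentence) -> bool:
    """S ⊧ phi, where the "model" is just the set S of true sentence symbols."""
    match phi:
        case Sym(name):
            return name in S              # S ⊧ x iff x ∈ S
        case Not(body):
            return not models(S, body)    # S ⊧ ¬φ iff not S ⊧ φ
        case And(left, right):            # S ⊧ φ∧ψ iff S ⊧ φ and S ⊧ ψ
            return models(S, left) and models(S, right)
    raise TypeError("not a sentence")

# From the language {x, y}, the model holding only x true:
assert models({"x"}, And(Sym("x"), Not(Sym("y"))))   # {x} ⊧ x∧¬y
```

Once you pick S, the truth value of every sentence is forced, which is exactly the isomorphism claim above.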
More power
Propositional logic is pretty boring. If you want to say anything interesting, you've got to be able to say more than just "and" and "not". Let's move on to a stronger logic.
First-order logic uses the symbols ( ) ∀ ∃ ∧ ¬ ν ' ≡ or equivalent (where ν, ν', ν'', etc. are variables) with the familiar syntactic rules. A language of first-order logic has three different types of symbols: relation symbols, function symbols, and constant symbols. The rules of logic are designed such that symbols act as you expect, given their names.
An interpretation of the sentences generated by some language in first-order logic is (as before) an object that assigns each sentence a truth value (via a relation). We narrow these objects down to the ones that treat all the symbols in ways that justify their names.
More specifically, we consider interpretations that have some "universe" (set) of items. Constant symbols are interpreted as specific items in the universe. Relation/function symbols are interpreted as relations/functions on the universe.
We further restrict consideration to interpretations where the logical symbols act in the intended fashion. For example, we require that the interpretation hold c≡d true (where c and d are constant symbols) if and only if the interpretation of c in the universe is equal to the interpretation of d in the universe.
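Here's a toy sketch of such an interpretation (entirely my own; the universe, symbol names, and term encoding are made up for illustration):

```python
# A first-order model: a universe plus interpretations for constant,
# function, and relation symbols. Terms are encoded as either a constant
# symbol (a string) or a (function symbol, argument term) pair.

universe = {0, 1, 2}
constants = {"c": 1, "d": 1}                 # constant symbol -> element
functions = {"S": lambda n: (n + 1) % 3}     # function symbol -> function on the universe
relations = {"Lt": lambda a, b: a < b}       # relation symbol -> relation on the universe

def eval_term(t):
    if isinstance(t, str):
        return constants[t]
    f, arg = t
    return functions[f](eval_term(arg))

def holds_eq(t1, t2) -> bool:
    # The model holds t1 ≡ t2 true iff both terms denote the same element.
    return eval_term(t1) == eval_term(t2)

assert holds_eq("c", "d")                    # c ≡ d: both name the element 1
assert holds_eq(("S", "c"), ("S", "d"))      # so S(c) ≡ S(d) as well
assert relations["Lt"](eval_term("c"), eval_term(("S", "c")))   # Lt(c, S(c))
```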
All this is pretty mechanical. The resulting objects are "models". Some of the early results in model theory show that these mathematical objects have indeed earned the name.
Completeness
Completeness is the poster-child of model theory, and it warrants quite a bit of exploration. It's actually saying a few different things. I'll break it down:
1. Theorems of first-order logic are true in every model of first-order logic.
A theorem of first-order logic is something we can prove from the logic alone, such as (∀ν)(ν≡ν). Contrast this with a sentence like (∃ν)(F(ν)≡c), where F is a function symbol and c is a constant symbol — the truth of this sentence depends entirely upon interpretation.
So what (1) says is that there aren't any models that deny tautological truths of first-order logic. This result should not be surprising: this claim merely states that we picked a good definition for "models". If instead we found that there are models which deny theorems, this would not be some big grand proof about the behavior of first-order logic. Rather, it would mean that we put the "model" label on the wrong group of thingies, and that we should go back and try again.
2. Any sentence true in every model of first-order logic is a theorem of first-order logic.
This is the converse to (1), and it's quite a bit more powerful. This states that the theorems of a language are only the things true in every model of that language. There's no mystery sentence that is true in every model but not provable from the syntax of the logic.
From one point of view, this says that there are no "missing" models that "would have" said the mystery sentence was false (hence "completeness"). From another, this says that all sentences are "well behaved": no sentence "outwits" all models.
It's worth noting that this is where completeness fails for models of second order logic. From one perspective, there must be some "missing models" in second-order logic. From the other perspective, there must be certain sentences which can "outwit" their models (presumably Gödel sentences). I haven't spent any time formally examining these intuitive claims yet; I have much to learn about second-order logic.
I should also note that I've been implicitly assuming completeness for the last two posts by ignoring the syntactic rules of the logics under consideration. The completeness theorem lets me do this, because it states that the results interpreted by first-order models are exactly the same as the results derived from the syntactic rules of first-order logic. Because the completeness theorem holds, I'm allowed to treat the logical sentences as dead symbols and consider the binding M⊧φ∧ψ iff M⊧φ and M⊧ψ to be the means by which ∧ obtains its meaning.
When the completeness theorem fails (as it does in stronger logics) then there is a gap between what the rules of the logic allow and what the models of the logic can say. In that case I must separately consider what the rules say and what the models model. This is the edge of my comfort zone; more exploration with second-order logic is necessary.
Fortunately, everything is well-behaved in first order logic. In fact, (1) and (2) above can be made stronger:
3. A set Σ of sentences is consistent if it has a model.
This is another sanity check that we've put the "model" label on the right thing. A set of sentences is inconsistent if it can derive both φ and ¬φ for some sentence φ. (More specifically, Σ is inconsistent if it can derive everything. If there is any sentence that Σ cannot derive, Σ is consistent. Remember that from a contradiction, anything follows.)

Our models are defined such that whenever they model φ, they do not model ¬φ (and vice versa). Thus, if a set of sentences has a model (Σ "has a model" when there is some model that holds true every sentence σ in Σ) then there must be some sentences which Σ cannot deduce (¬σ for each σ in Σ, for instance). Thus, Σ is consistent.
4. A set Σ of sentences has a model if it is consistent.
This is where things get interesting again. This is a stronger version of (2) above: every consistent theory has a model.
This makes a lot of sense after you understand (2) and (3), but don't underestimate its importance. This tells us that for every consistent theory there is some interpretation following the rules which we laid out. Again, this says something positive about our models (they are strong enough to handle any consistent theory) and something negative about the logic (it's not strong enough to permit a consistent theory stating "I cannot be modeled").
Note: I'm not sure what it would mean for a theory to state "I cannot be modeled", nor am I convinced that the idea is meaningful. Again, we're nearing the edge of my comfort zone.
The point is, the completeness theorem for first order logic says "if you hand me a consistent theory, I can build a model of it". This is very useful. We don't have to waste any time worrying whether or not there's an interpretation available that satisfies all our stringent rules about how ≡ actually means "equals" and so on: if the theory is consistent, a model exists.
Witnesses
There's been a lot of hand-waving going on for the past four points. Let's get back to the models.
In model theory, you're manipulating interpretations for theories. Due to the generality of such work, there are few tools available to manipulate any abstract model. One of the most useful tools in model theory is the ability to extend a model.
Ideally, when you extend a model, you want to make it more manageable without changing its behavior. Our first example of such a technique involves extending a model to add "witnesses".
A "witness" is a constant symbol that witnesses the truth of an existentially quantified sentence. For example, in the language {S, +, ✕, 0}
of arithmetic, the constant symbol 0
is a witness to the sentence (∃ν)(Sν≡S0)
(because 0
makes (Sν≡S0)
true). However, there is no witness to the sentence (∃ν)(ν≡S0)
, because 1
is not a constant symbol of the language.
However, we can extend a language to add new constant symbols. Specifically, given any model, we can extend the language to contain one constant symbol ā
for each element a
in the universe. Then we can extend the model to interpret each ā
by a
. Such extension does not change the behavior of the model, but it does give the model a number of nice properties.
A model is said to "have witnesses" if for every existentially quantified sentence shaped like (∃ν)ψ that it models, there is some constant c such that the model models ψ(ν\c) (ψ with ν replaced by c).
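Concretely, here's what the ψ(ν\c) substitution looks like on a toy encoding of formulas (nested tuples; the encoding is my own, purely for illustration):

```python
def substitute(phi, var, const):
    """ψ(ν\\c): replace every occurrence of the variable by the constant."""
    if phi == var:
        return const
    if isinstance(phi, tuple):
        return tuple(substitute(part, var, const) for part in phi)
    return phi

# ψ is the body of (∃ν)(Sν ≡ S0), encoded as ("≡", ("S", "ν"), ("S", "0")).
# The witness 0 turns it into the concrete sentence S0 ≡ S0.
psi = ("≡", ("S", "ν"), ("S", "0"))
assert substitute(psi, "ν", "0") == ("≡", ("S", "0"), ("S", "0"))
```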
Before we discuss them, note a few things:
- We can also reduce a model & language extended in this way by eliminating the added constants.
- We can talk about sets of sentences with witnesses: a set Σ has witnesses if, whenever a sentence shaped like (∃ν)ψ is in Σ, there is some constant c such that ψ(ν\c) is also in Σ.
- Just as we can extend a model & language so that the model has witnesses, we can extend a set of sentences and language so that the set of sentences has witnesses (by adding a new constant symbol for each existentially quantified sentence). Such extension does not threaten the consistency of a set of sentences.
- It's easy to construct a model from a consistent set of sentences that has witnesses (see the sketch below). The universe is the set of all constant symbols (technically, one element for each equivalence class of constant symbols), and the behavior of function/relation symbols is forced by their behavior on the constant symbols.
Formalize these four points and put them together, and you've just proved the completeness theorem. (Given any consistent set Σ of sentences, add witnesses, then make a model, then reduce it to the original language; you now have a model of Σ.)
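Here's a heavily simplified sketch of that model-building step (my own toy version: Σ is shrunk to a few atomic facts, and "provable equality" is handed to us directly rather than derived from a proof system):

```python
# Σ, reduced to atomic facts over the constants a, b, c.
sigma_eq = {("a", "b")}          # Σ proves a ≡ b
sigma_rel = {("Lt", "a", "c")}   # Σ proves Lt(a, c)
constants = ["a", "b", "c"]

def find_class(classes, c):
    return next(cl for cl in classes if c in cl)

# Universe: one element per equivalence class of constants under provable equality.
classes = [{c} for c in constants]
for (x, y) in sigma_eq:
    cx, cy = find_class(classes, x), find_class(classes, y)
    if cx is not cy:
        classes.remove(cy)
        cx |= cy

# Relation symbols are forced by their behavior on the constants:
# Lt holds of two classes iff Σ asserts it of some representatives.
def lt(cl1, cl2):
    return any(("Lt", x, y) in sigma_rel for x in cl1 for y in cl2)

assert find_class(classes, "a") == find_class(classes, "b")   # a, b collapse to one element
assert lt(find_class(classes, "a"), find_class(classes, "c"))
```

The real construction does the same collapsing, but over all the constants of the witness-extended language, with the equality and relation facts supplied by deduction from Σ.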
Compactness
The compactness theorem states that a set Σ of sentences has a model iff every finite subset of Σ has a model.
This leads to a pretty surprising result. Let's break it down.
1. If Σ has a model then every finite subset of Σ has a model.
This direction is obvious: any model of Σ is also a model of all subsets of Σ. If a model holds true all sentences σ in Σ, then it obviously holds true all sentences σ in any subset of Σ.
2. If every finite subset of Σ has a model then Σ has a model.
This is where things get interesting. Note that we measure the size of a model by measuring the size of its universe. So when we say "a countably infinite model" we mean a model with a countably infinite universe.
The proof of the above is actually quite easy: in first order logic, all proofs are finite. Therefore, any proof of contradiction (and thus inconsistency) must be finite, and can only use finitely many sentences of Σ; if Σ were inconsistent, some finite subset of Σ would already be inconsistent. Since every finite subset of Σ has a model, no finite subset of Σ is inconsistent, so Σ is not inconsistent. Then, by completeness, because Σ is consistent it has a model.
Or, in other words, the compactness theorem says
If Σ wants to be inconsistent it can do it in a finite subset.
When we know that no finite subset of Σ is inconsistent, we know that Σ has a model.
The surprising result that this leads to is as follows:
If a theory T allows arbitrarily large finite models then it has an infinite model.
If you hand me a theory that allows arbitrarily large finite models, I can extend the language by adding countably many new constants. Then I can consider the set Σ of sentences built from T joined with sentences of the form "there are at least n distinct constants" for all finite n. Clearly, every finite subset of Σ is satisfied by one of the finite models of T, which come in arbitrarily large sizes — just pick one that fits. Then, by compactness, Σ has a model. That model interprets infinitely many pairwise distinct constants, so its universe is infinite.
I have just given a fully general recipe for building an infinite model given a theory T that admits arbitrarily large finite models. What does this mean? It means that no theory in first order logic is capable of saying "I admit only finite models". Sentences can say "models must have no more than 10 elements" or "models are allowed to be infinite", but sentences cannot say "My model is finite". This follows directly from the fact that all proofs of contradiction are finite. You either have a specific limit, or you don't get a limit at all.
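In case the recipe feels too abstract, here's a sketch (mine; the string syntax is just one arbitrary encoding) of its only mechanical ingredient, the sentences "there are at least n distinct elements":

```python
from itertools import combinations

def at_least(n: int) -> str:
    """The first-order sentence asserting at least n distinct elements (n >= 2)."""
    vs = [f"v{i}" for i in range(1, n + 1)]
    quants = "".join(f"(∃{v})" for v in vs)
    distinct = " ∧ ".join(f"¬({x} ≡ {y})" for x, y in combinations(vs, 2))
    return f"{quants}({distinct})"

print(at_least(3))
# (∃v1)(∃v2)(∃v3)(¬(v1 ≡ v2) ∧ ¬(v1 ≡ v3) ∧ ¬(v2 ≡ v3))

# Σ = T plus at_least(n) for every finite n. Any finite subset of Σ mentions
# only finitely many of these sentences, so a large enough finite model of T
# satisfies it; by compactness, Σ (and hence T) has an infinite model.
```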
MORE POWER
If you hand me a theory that allows arbitrarily sized finite models, I can hand you back an infinite model.
It gets better.
If you hand me a theory with an infinite model, I can expand it.
Using a method similar to the one above, I can pick any infinite cardinal α, add α new constant symbols to the language, and expand T with a sentence asserting "these two constants differ" for each pair of distinct new constants. (For infinite α, "there are at least α distinct elements" is not a single first-order sentence, so the new constants do that work instead.) Following the same argument as above, all finite subsets of this theory have a model, so this theory has a model, which must be of power at least α.
In other words, a theory in first-order logic can either say "I am at most this big", where this is a specific finite number, or "I am very big", where very can be any infinite cardinal that you like.
Example: The theory of arithmetic doesn't have a finite cutoff point. There's no maximum number. Countably infinite models are allowed.
Therefore, arbitrarily large infinite models are allowed.
There are countable models of arithmetic (at least one of which you'll find familiar), and there are also models of arithmetic where there are as many "numbers" as there are reals. Then there are bigger models of arithmetic that are saturated, no, dripping with numbers.
Now you understand why first order logic cannot discuss the "standard" model of number theory. No first-order theory can specifically pinpoint "countable" models! A first order theory must either specify a finite maximum size explicitly, or allow models of unfathomable size.
Even more
I was hoping to get farther than this, but this is a good stopping point. Hopefully you've learned something about model theory. There are many more interesting results yet.
The book Model Theory by Chang and Keisler primarily focuses on introducing new ways to extend arbitrary models (giving them nice properties in the process) and then discusses results that can be gleaned from models extended thus. Focus is also given to special cases where models are well behaved, such as atomic models (which are a sort of minimal model for a given theory) and saturated models (which are a sort of maximal model for a given theory, within a given cardinality. There Are Always Bigger Models).
There's also quite a bit of exploration into manipulating languages (Eliminate Quantifiers To Win Big Prizes!!!) and even manipulating logics (Q. How far beyond first order logic can we go before sacrificing things like completeness and compactness? A. Not very.)
If you're interested in learning more, then… well, I'm probably not the person to talk to. I'm not even halfway through this textbook. But if you're feeling gutsy, consider picking up Model Theory and getting started. Some of the harder concepts would be much easier to work through with other people rather than alone.
In fact, I have a number of notes about trying to learn something difficult on my own (what worked, what didn't) that I plan to share. But that's a story for another day.
Finally, please note that my entire exposure to model theory is half a textbook and some internetting. I guarantee I've misunderstood some things. If you see errors in my explanations, don't hesitate to let me know! My feedback loop isn't very strong right now.
Comments
comment by Protagoras · 2013-11-01T00:23:34.609Z
Something that kind of interests me; to the Pythagoreans mathematics was magic, involving mystical insight into the ultimate nature of reality. Similar ideas continued in Plato and those he influenced, with echoes down to the early modern rationalists. But Pythagorean mathematics was exceedingly primitive and limited. Modern logic and mathematics are vastly more powerful and sophisticated, and yet the Pythagorean mysticism has almost completely disappeared. Why were people more impressed with mathematics when it was more limited? I imagine novelty was a factor, and they say familiarity breeds contempt, but I have a couple of other hypotheses. First, there seem to be obvious places here and there in modern logic and mathematics where we seem to face choices, where it looks like a matter of "doing this is convenient for this purpose" rather than "this is the only way things could be." This tends to weaken the idea that the fundamental nature of reality is being revealed. Second, there are parts of modern mathematics that are deeply weird (e.g. many things about how infinite cardinals work). I imagine there are people who find it hard to accept that those parts of mathematics at least could be describing any real world.
I'm certainly not arguing for returning to Pythagorean mysticism, but of the two reasons I propose why people might be ruling that out, I'm inclined to think the first reason looks fairly good while the second strikes me as highly suspect. I'm quite curious as to which has been most influential (or whether it's some other factor or factors, perhaps familiarity really is the driving force, or perhaps external influences, say from hostility to mysticism in other fields, are important).
↑ comment by Shmi (shminux) · 2013-11-01T17:42:20.693Z
Tegmark level IV is the modern version of "Pythagorean mysticism" (it states that "the ultimate nature of reality" is math). So, there is no need to return to anything, we are already there. Well, some of us are, even on this forum. As for "the fundamental nature of reality", there is no indication for it to be a real thing, the deeper we dig, the more we find. But yes, in the map/territory model, math helps build better maps of the territory, whether or not it is the "ultimate" territory. The "weird" parts may or may not be useful in building better maps, it's hard to tell in advance. After all, the (freshly revealed) territory is also weird.
↑ comment by wuncidunci · 2013-11-01T17:07:19.055Z
Cantor, who did the first work on infinite cardinals and ordinals, seemed to have a somewhat mystic point of view at times. He thought his ideas about transfinite numbers were communicated to him by god, whom he also identified with the absolute infinite (the cardinality of the cardinals, which is too big to itself be a cardinal). This was during the 19th century, so quite recently.
I'd say that much mysticism about foundational issues like what numbers really are, or what these possible infinities actually mean, has been abandoned by mathematicians in favour of actually doing real mathematics. We also have quite good formal foundations in terms of ZF and formal logic nowadays, so discussions like that do not help in the process of doing mathematics (unlike, say, discussions about the nature of real numbers before we had them formalised in terms of Cauchy sequences or Dedekind cuts).
↑ comment by Armok_GoB · 2013-11-01T04:20:12.207Z
I don't think those attitudes are quite as gone as you seem to think, although that might be the mind projection fallacy. It's better understood, so the mystery and superstition are gone, but the senses of transcendent awe and sacred value are still there. Certainly, with Tegmarkian cosmology (that I tend to hold as axiomatic, in the literal "assumption++" sense), it's very much the "one ultimate nature of reality".
comment by Will_Sawin · 2013-11-01T03:16:42.179Z
I don't see how 3 and 4 are stronger than 1 and 2. They are just the special cases of 1 and 2 where the sentence is a contradiction.
Arbitrarily large finite models are certainly not allowed in the theory of arithmetic.
↑ comment by So8res · 2013-11-01T15:54:48.696Z
3 and 4 are generalizations to sets of sentences. But you're right, the generalization is pretty simple.
Arbitrarily large models are allowed in the first-order theory of arithmetic, and no first-order theory of arithmetic can restrict models to only the integers. This is one of the surprising results of compactness.
↑ comment by Will_Sawin · 2013-11-01T18:39:28.622Z
You said arbitrarily large finite models, however. First-order arithmetic has no finite models. : )
comment by Gvaerg · 2013-11-11T09:19:38.235Z
From what I know, Chang & Keisler is a bit dated and can create a wrong perspective on what model theorists are researching nowadays. Maybe you should also look at a modern textbook, like the ones from Hodges, Marker or Poizat.
↑ comment by Yaakov T (jazmt) · 2014-02-04T04:36:38.584Z
Which of the 3 would you recommend? Does someone know why MIRI recommends Chang and Keisler if it is somewhat outdated?
↑ comment by Gvaerg · 2014-02-04T08:04:26.155Z
Marker is the closest to the state of the art. Hodges is a bit verbose and for beginners. Poizat is a little idiosyncratic (just look at the Introduction!).
I am also interested in the basis of MIRI's recommendation. Perhaps they are not too connected to actual mathematicians studying it, as model theory is pretty much a fringe topic.
↑ comment by gjm · 2016-01-22T13:51:05.676Z
Poizat is a little idiosyncratic (just look at the Introduction!)
Here is the last paragraph, which you can read via Google Books:
Indeed, I wrote the outline of this book while wandering across India, so that, in my mind, Henkin's method is inextricably linked to the droves of wild elephants that I met while crawling among the swamp plants of the preserves of Kerala; the elimination of imaginaries, to the gliding of the vultures above the high Himalayan peaks; and the theorem of the bound, to the naked bodies of the Mauryan women that the traveler saw on the bends of a jungle trail, before they had time to cover themselves. I dare hope only that this book will evoke similarly pleasant images in my reader; I wish only that it will be a pleasant companion for you, as it was for me.
The first paragraph of the preface to the English translation (which I found via Amazon's "look inside" feature) says this:
It was written in a dialect of Latin that is spoken as a native language in some parts of Europe, Canada, the U.S.A., the West Indies, and is used as a language of communication between several countries in Africa.
which sounds extremely weird until it transpires that he means it was written in French. (It appears that the book was less well received than the author hoped, at least partly on account of its having been written in French, and he is still cross about it, which I think is why he expresses himself in such a peculiar way: he's both parodying and criticizing the idea that there's something obscure about French.)
So, yeah, "idiosyncratic" is right. As for the mathematical approach, here's another quotation from the Introduction:
[...] I have deliberately chosen an unusual approach to the foundations of logic: The idea that I take as primitive is that of the back-and-forth construction in the style of Fraissé, rather than that of satisfaction of a formula.
(Gosh!)