What's a "natural number"?

post by cousin_it · 2010-10-07T13:34:26.983Z · LW · GW · Legacy · 18 comments

A big problem with natural numbers is that the axiomatic method breaks on them.

Mystery #1: if we're allowed to talk about sets of natural numbers, sets of these sets, etc., then some natural-sounding statements are neither provable nor disprovable ("independent") from all the "natural" axiomatic systems we've invented yet. For example, the continuum hypothesis can be reformulated as a statement about sets of sets of natural numbers. The root cause is that we can't completely axiomatize which sets of natural numbers exist, because there's too many of them. That's the substantial difference between second-order logic and first-order logic; logicians say that second-order logic is "defined semantically", not by any syntactic procedure of inference.

Mystery #2: if we're allowed to talk about arithmetic and use quantifiers (exists and forall) over numbers, but not over sets of them - in other words, use first-order logic only - then some natural-sounding statements appear to be true, but to prove them, we need to accept as axioms a lot of intuition about concepts other than natural numbers. For example, Goodstein's theorem is a simple arithmetical statement that cannot be proved in Peano arithmetic, but can be proved in "stronger theories". This means the theorem is a consequence of our intuition that some "stronger theory", e.g. ZFC, is consistent - but where did that intuition come from? It doesn't seem to be talking about natural numbers anymore.

Can we teach a computer to think about natural numbers the same way we do, that is, somehow non-axiomatically? Not just treat numbers as opaque "things" that obey the axioms of PA - that would make a lot of true theorems unreachable! This seems to be the simplest AI-hard problem that I've ever seen.

18 comments

Comments sorted by top scores.

comment by Vladimir_Nesov · 2010-10-08T08:52:32.361Z · LW(p) · GW(p)

My guess is that people (because of similar mind architecture) share beliefs (persuadability) in statements of a stronger system, of which PA and ZF only capture a rough outline. This shared theory accounts for platonism, or detail of "standard model". Capturing the whole of this theory, or even the whole of standard model of arithmetic, is on the same order of difficulty as formally defining human preference (because people don't have reliable access to those intuitions). Part of progress in mathematics consists in understanding this shared theory one aspect at a time. Other part is development of tools given formalist position, that is assuming very little and then playing the symbol game, preparing to consider various possibilities, but this doesn't address the question of motivation for entertaining one formal system and not another, that is platonist's intuition.

(It's similar to how we all solve the problem of induction approximately the same way, which means that we share the same "theory of reality": we reach the same conclusions from (enough of) the same observations.)

Replies from: Vladimir_M
comment by Vladimir_M · 2010-10-08T18:31:00.079Z · LW(p) · GW(p)

This is one of the best comments I've ever seen on LW, or anywhere else for that matter. Regardless of whether this hypothesis is true, I can't remember the last time I saw as much clarity and insight packed into so few words.

comment by komponisto · 2010-10-07T22:41:45.978Z · LW(p) · GW(p)

Frankly, I don't understand what the mystery is here. Both #1 and #2 are simply mathematical facts, just like the theorems of arithmetic.

Or perhaps I should rephrase it this way: what task do you want the computer to be able to perform?

Replies from: cousin_it
comment by cousin_it · 2010-10-07T23:56:08.234Z · LW(p) · GW(p)

I want it to somehow recognize that Goodstein's theorem is likely to be true and ZFC is likely to be consistent - without hardcoding some strong formal system that already implies these facts, because we humans didn't have such an epistemic gift and still succeeded at the task.

In general, what's mysterious to me here is the nature of intuition and its relationship with mathematical truth. Here's a simple testcase: throwing darts at the real line. Would we wish an AI to be swayed by such arguments? How are they different from relying on physical intuition (about apples lying on the table, or something) to assert the consistency of strong set theories?

Replies from: komponisto
comment by komponisto · 2010-10-08T04:45:46.849Z · LW(p) · GW(p)

I want it to somehow recognize that Goodstein's theorem is likely to be true and ZFC is likely to be consistent -

The first thing to ask is why we humans believe this to be the case. When looked at this way, it seems like a straightforward case of inductive (=Bayesian) reasoning: we expect that if ZFC were inconsistent, we would have found a contradiction by now. Right?

In general, what's mysterious to me here is the nature of intuition and its relationship with mathematical truth.

This is a mystery that I feel can dissolve -- at least, I've done so to my own personal satisfaction. The answer -- my answer -- is that "mathematical truth" is a purely formal notion, with no dependence on intuition. A theorem which has been formally proved in an axiom system can never be said to be "mathematically incorrect". The worst that can be said is that the particular axiom system that contains the theorem is inappropriate for modeling some physical phenomenon of interest.

(Hence, for example, my stance that it is an outright mistake to say that "infinite sets don't exist". What is meant by people who say this -- the only coherent thing than can be meant -- is "no aspect of the physical universe is appropriately modeled by a mathematical axiom system in which infinite sets exist". I believe this is also likely erroneous, but at least it isn't blatantly incoherent.)

Here's a simple testcase: throwing darts at the real line. Would we wish an AI to be swayed by such arguments?

As per the above, it's a mistake to argue about whether Freiling's axiom is "true" in the first place. It's true in ZFC+~CH, and false in ZFC+CH, end of story. It's no different from arguing about whether the sum of the angles in a triangle is "really" 180 degrees: in Euclidean geometry it is, in hyperbolic geometry it isn't. You can ask whether Euclidean geometry is consistent, and you can ask whether Euclidean geometry is an appropriate model of the physical world, but you can't ask whether Euclidean geometry is "true".

Replies from: cousin_it
comment by cousin_it · 2010-10-08T08:09:02.111Z · LW(p) · GW(p)

Thanks! Your comment prompts me to reformulate my original question this way: given a formal system, how can the AI determine that it talks about "the" natural numbers? For example, we can add to PA some axiom that rules out its standard model, but leaves many nonstandard ones. The simplest example would be to add the inconsistency of PA - the resulting theory will (counterintuitively) be just as consistent as PA, but quite weird. It will have many interesting provable theorems that are nevertheless common-sensically "false", e.g. "PA proves 1+1=3". Can the AI recognize such situations and say "no way, this formal system doesn't seem to describe my regular integers"?

About the consistency of ZFC: it's certainly a neat idea to conclude an arithmetical statement is "probably true" if you can't find a disproof for a long time. Unfortunately, if we have an arithmetical statement that we can neither prove or disprove so far, your idea would have us believe that it's true and its negation is also true. That doesn't look like correct Bayesian reasoning to me!

Replies from: komponisto, Vladimir_M
comment by komponisto · 2010-10-08T22:21:29.970Z · LW(p) · GW(p)

Can the AI recognize such situations and say "no way, this formal system doesn't seem to describe my regular integers"?

It need not -- asking whether a formal system "describes my regular integers" is a disguised query for whether it satisfies some set of properties that happen to be useful. All the AI needs to be able to do is evaluate how effectively different models describe whatever it's trying to use them to describe.

Unfortunately, if we have an arithmetical statement that we can neither prove or disprove so far, your idea would have us believe that it's true and its negation is also true. That doesn't look like correct Bayesian reasoning to me!

I don't see why not. It's not that we would believe the statement and its negation are both true; rather, we would believe that the statement is true with probability x and false with probability 1-x, as usual.

Replies from: cousin_it, cousin_it, cousin_it
comment by cousin_it · 2010-10-09T17:31:36.485Z · LW(p) · GW(p)

asking whether a formal system "describes my regular integers" is a disguised query for whether it satisfies some set of properties that happen to be useful

What are these properties?

comment by cousin_it · 2010-10-12T14:01:16.368Z · LW(p) · GW(p)

komponisto, did you leave my question unanswered because you don't know the answer, or because you thought the question stupid and decided to bail out? If you can dissolve my confusion, please do.

Replies from: komponisto
comment by komponisto · 2010-10-12T16:42:30.367Z · LW(p) · GW(p)

Sorry! I didn't have an answer immediately, but thought I might come up with one after a day or two. Unfortunately, by that time, I had forgotten about the question!

Anyway, the way I'd approach it is to ask what is wrong, from our point of view, with a given nonstandard theory.

Actually, I just thought of something while writing this comment. Take your example of adding a "PA is inconsistent" axiom to PA. Yes, we could add such an axiom, but why bother? What use do we get from this new system that we didn't already get from PA? If the answer is "nothing", then we can invoke a simplicity criterion. On the other hand, if there is some situation where this system is actually convenient, then there is indeed nothing "wrong" with it, and we wouldn't want an AI to think that there was.

(Edit: I'll try to make sure I reply more quickly next time.)

comment by cousin_it · 2010-10-10T09:41:07.551Z · LW(p) · GW(p)

It's not that we would believe the statement and its negation are both true; rather, we would believe that the statement is true with probability x and false with probability 1-x, as usual.

Then I don't understand why you said this earlier:

we expect that if ZFC were inconsistent, we would have found a contradiction by now

The consistency of ZFC is an arithmetical statement. You say we haven't yet found a disproof for it, so we should believe it more; but we haven't found a disproof of its negation either, so we should believe it less! Isn't this incoherent by Bayesian lights? Or am I misunderstanding something about your idea?

comment by Vladimir_M · 2010-10-08T18:24:40.375Z · LW(p) · GW(p)

cousin_it:

For example, we can add to PA some axiom that rules out its standard model, but leaves many nonstandard ones. The simplest example would be to add the inconsistency of PA - the resulting theory will (counterintuitively) be just as consistent as PA, but quite weird. [...] Can the AI recognize such situations and say "no way, this formal system doesn't seem to describe my regular integers"?

This seems to be a matter of some relatively straightforward human heuristics. A theory that is somehow directly "talking about itself" via its axioms looks weird and suspicious. And not to mention what this axiom would look like if you actually wrote it down alongside the standard PA axioms in the same format!

Note that when these heuristics don't apply, humans end up confused and in disagreement about what should be considered as "our regular" stuff when faced with various independent statements that look like potential axioms. (As with Euclid's fifth postulate, the continuum hypothesis, the axiom of choice, etc.)

It will have many interesting provable theorems that are nevertheless common-sensically "false", e.g. "PA proves 1+1=3".

More specifically, it will prove: "There exists a natural number that, when decoded according to your Goedelization scheme, yields a proof in PA that 1+1=3." However, it won't prove anything about that number that would actually allow you to write it down and see what that proof is. ("Writing down" a natural number essentially means expressing a theorem about a specific relation it has with another number that you use as the number system base.) This suggests another straightforward heuristic: if the theory asserts the existence of objects with interesting properties about which it refuses to say anything more that would enable us to study them, it's not "our regular" kind of thing.

Replies from: pengvado
comment by pengvado · 2010-10-09T14:38:35.266Z · LW(p) · GW(p)

If the theory asserts the existence of objects with interesting properties about which it refuses to say anything more that would enable us to study them, it's not "our regular" kind of thing.

Does this heuristic say that Chaitin's constant isn't a regular real number?

Replies from: Vladimir_M
comment by Vladimir_M · 2010-10-09T16:56:29.912Z · LW(p) · GW(p)

Good point. I should have said "bizarre properties" or something to that effect. Which of course leads us to the problem of what exactly qualifies as such. (There's certainly lots of seemingly bizarre stuff easily derivable in ordinary real analysis.) So perhaps I should discard the second part of my above comment.

But I think my main point still stands: when there is no obvious way to see what choice of axioms (pun unintended) is "normal" using some heuristics like these, humans are also unable to agree what theory is the "normal" one. Differentiating between "normal" theories and "pathological" ones like PA + ~Con(PA) is ultimately a matter of some such heuristics, not some very deep insight. That's my two cents, in any case.

comment by [deleted] · 2010-10-07T17:59:17.230Z · LW(p) · GW(p)

Can we teach a computer to think about natural numbers the same way we do, that is, somehow non-axiomatically?

I've wondered about this as well. One possible way to go about this is to figure out how to introduce higher-order mathematical patterns into the process (instead of simply using atomic sentences and inference rules). Recurring proof patterns could be one such higher order pattern. Patterns at different scales could be compared, and new patterns suggested on the basis of this comparison. These new patterns could then be translated into a (or a number of) elementary propositions.

comment by DanielLC · 2011-05-05T05:37:26.156Z · LW(p) · GW(p)

Don't teach it about natural numbers. Teach it about the universe. If it decides something like what we call "natural numbers" is helpful for certain models, let it decide how helpful. It wouldn't be good if it just assumed the number of particles in a system follows the laws of natural numbers, when you can add one to any natural number, but will eventually run out of particles.

comment by jsteinhardt · 2010-10-08T05:59:57.140Z · LW(p) · GW(p)

I feel like this is actually quite hard for an AI. To talk about formal systems like mathematics in an intuitive sense, we need language. And language requires us to have almost every other system (except perhaps vision). In particular, we would need sophisticated systems for dealing with categories.

I personally think that the simplest AI-hard problem is coming up with a satisfactory system for category formation and inference. See for instance Goldstone and Kerstein's chapter 22 in the Handbook of Psychology for a discussion of the problem.

I actually disagree with wikipedia's claim that vision is AI-hard. I think we could have good vision systems without a strong AI, and vice versa, although the latter would be quite crippled initially.

comment by red75 · 2010-10-08T12:26:08.418Z · LW(p) · GW(p)

General brain's capability for (cross-domain) pattern matching? While studying objects of new formal system and their relations, it patternmatch them with already known formal and real objects and relations. Successful match results in insight (that can be formally wrong).

Pros: mathematics stems from observation of real world objects and their relations: countable objects, lines, etc.

Cons: too simple.