Gödel and Bayes: quick question
post by hairyfigment · 2011-04-14T06:12:05.618Z · LW · GW · Legacy · 36 commentsContents
36 comments
Kurt Gödel showed that we could write within a system of arithmetic the statement, "This statement has no proof within the system," in such a way that we couldn't dismiss it as meaningless. This proved that if the system (or part of it) could prove the logical consistency of the whole, it would thereby contradict itself. We nevertheless think arithmetic does not contradict itself because it never has.
From what I understand we could write a version of the Gödel statement for the axioms of probability theory, or even for the system that consists of those axioms plus our current best guess at P(axioms' self-consistency). Edit: or not. According to the comments the Incompleteness Theorem does not apply until you have a stronger set of assumptions than the minimum you need for probability theory. So let's say you possess the current source code of an AGI running on known hardware. It's just now reached the point where it could pass the test of commenting extensively on LW without detection. (Though I guess we shouldn't yet assume this will continue once the AI encounters certain questions.) For some reason it tries to truthfully answer any meaningful question. (So nobody mention the Riemann hypothesis. We may want the AI to stay in this form for a while.) Whenever an internal process ends in a Jaynes-style error message that indicates a contradiction, the AI takes this as strong evidence of a contradiction in the relevant assumptions. Now according to my understanding we can take the source code and ask about a statement which says, "The program will never call this statement true." Happily the AI can respond by calling the statement "likely enough given the evidence." So far so good.
So, can we or can't we write a mathematically meaningful statement Q saying, "The program will never say 'P(Q)≥0.5'"? What about, "The program will never call 'P(Q)≥0.5' true (or logically imply it)"? How does the AI respond to questions about variations of these statements?
It seems as if we could form a similar question by modifying the Halting-Oracle Killer program to refute more possible responses to the question of its run-time, assuming the AI will know this simpler program's source code. Though it feels like a slightly different problem because we'd have to address a lot of possible responses directly – with the previous examples, if the AI doesn't kill us in one sense or another, we can go on to ask for clarification. Or we can say the AI wants to clarify any response that evades the question.
36 comments
Comments sorted by top scores.
comment by Giles · 2011-04-14T11:51:23.162Z · LW(p) · GW(p)
Gödel's First Incompleteness Theorem involves statements of the form "this statement cannot be proved" rather than "this statement isn't true". (Mathematical truth is a surprisingly slippery concept to define).
So I think the analogous statement would be "The inferred probability of this statement is less than .5 given these priors and this evidence".
comment by DanielLC · 2011-04-14T06:27:49.602Z · LW(p) · GW(p)
From what I understand we could write a version of the Gödel statement for the axioms of probability theory,
It only works on sufficiently complex systems. Basically, anything with addition and multiplication. None of the probability axioms involve multiplication.
I could be wrong about this not being complex enough, though.
So, can we or can't we write a mathematically meaningful version of the statement, "This statement has less than .5 probability within the system"?
No. Probability can't be easily applied to mathematical statements. Either it's provable or it's not.
Replies from: Sniffnoy, hairyfigment↑ comment by Sniffnoy · 2011-04-14T06:46:31.420Z · LW(p) · GW(p)
The key is not just addition and multiplication, the key is addition and multiplication of whole numbers. The first-order theory of the real numbers, by contrast - with addition and multiplication - is decidable. And probability deals with real numbers, not integers.
On top of that you have the issue that the "axioms" of probability are not axioms in the sense of an axiomatic system, they're just "axioms" in the sense of "requirements".
In short this whole post seems badly misinformed.
Replies from: prase, AnotherKevin, hairyfigment↑ comment by prase · 2011-04-14T08:18:22.485Z · LW(p) · GW(p)
I am confused - doesn't any theory with real numbers (and addition and multiplication) include a theory of integers (with addition and multiplication)?
Replies from: None↑ comment by [deleted] · 2011-04-14T12:24:24.734Z · LW(p) · GW(p)
The real numbers surely include the integers, but whether "the theory of real numbers" includes "the theory of the integers" depends on what you mean. By "the theory of real numbers" people often mean those aspects of the real numbers you can reason about using finite combinations of arithmetic signs and inequalities. Whether or not you can reason about integers starting from the ability to reason about real numbers depends on whether you can cut out the set of integers from the set of real numbers according to these rules, i.e. whether the integers are a "definable subset" of the real numbers according to our "first order theory of real numbers". It's easy to see they aren't: i.e. the integers are not the solutions to any finite system of polynomial equations and inequalities.
Replies from: prase↑ comment by prase · 2011-04-14T13:11:09.064Z · LW(p) · GW(p)
Interesting. Does it mean that e.g. a proposition "there is an integer x which solves the equation exp x=57.24" can't be formulated inside a theory of real numbers because there is no way how to encode the word "integer"? Or is the problem more subtle?
Replies from: Vladimir_M, Sniffnoy, Perplexed↑ comment by Vladimir_M · 2011-04-15T03:16:19.586Z · LW(p) · GW(p)
It depends on the theory you're working with. The set of real numbers can be axiomatized using thirteen axioms that merely say that it's a totally ordered field whose every non-empty subset has a least upper bound. These 13 axioms allow you to prove a whole lot of interesting stuff about real numbers, and yet they constitute a decidable theory, though with wildly intractable complexity. However, this theory makes it impossible to construct a formula saying "x is an integer" (unless you move to second-order logic).
On the other hand, you can start with set theory, use it to construct natural numbers (i.e. define a set and a relation which can be proven to satisfy the Peano axioms), and then use that to construct a set that can be shown to satisfy the thirteen real-number axioms. This can be done in several ways, for example via Dedekind cuts (in which case the reals are defined as partitions of the set of rational numbers, which are in turn equivalence classes of pairs of integers, which are themselves equivalence classes of pairs of natural numbers). Here, of course, you can talk about natural numbers and integers, but you've been working all along in set theory, which is undecidable.
(More precisely, when you construct real numbers in set theory, you end up with multiple isomorphic sets satisfying Peano axioms: the "original" natural numbers, and the subsets of the subsequently defined integers, rationals, and reals that are isomorphic to them. However, at each step, you can construct formulas that will distinguish the subsets corresponding to the previously defined sets in the newly constructed one.)
So, basically, expressing propositions of the sort "there exists x such that x is an integer and P(x)" is possible with real numbers constructed in the undecidable set theory, but it's not possible in the decidable theory of real numbers based on the 13 axioms.
Replies from: Sniffnoy↑ comment by Sniffnoy · 2011-04-15T03:41:10.581Z · LW(p) · GW(p)
It depends on the theory you're working with. The set of real numbers can be axiomatized using thirteen axioms that merely say that it's a totally ordered field whose every non-empty subset has a least upper bound. These 13 axioms allow you to prove a whole lot of interesting stuff about real numbers, and yet they constitute a decidable theory, though with wildly intractable complexity. However, this theory makes it impossible to formulate a predicate saying "x is an integer" (unless you move to second-order logic).
Technical nitpicking - this doesn't quite make sense; one of the axioms needed to characterize R is the existence of least upper bounds, which if you're only talking about R rather than set theory is second order. Of course the resulting first-order theory can't distinguish integers, but let's note that without set theory you have to allow second-order logic to see just what this first-order theory consists of.
Of course, at a first-order level, R is indistinguishable from any real closed field, so you could do a true first-order axiomatization that way.
Replies from: Vladimir_M↑ comment by Vladimir_M · 2011-04-15T06:00:10.382Z · LW(p) · GW(p)
Yes, of course, barring second-order logic, the "13 axioms" are in fact twelve plus a countably infinite set of axioms generated by the completeness axiom scheme, just like the Peano axiom of induction is in fact an axiom scheme generating an infinite set of axioms.
However, I'm not sure I understand what you mean when you say that "without set theory you have to allow second-order logic to see just what this first-order theory consists of." The theory is meant to have the set of real numbers as its universe of discourse, and sets of real numbers correspond to its formulas with one free variable, for which you instantiate the custom completeness axioms in proofs as necessary. What exactly do you need set theory or second-order logic for?
(Also, on a more careful reading, I was a bit loose with terminology in my earlier comment. In first-order logic, you cannot "define predicates," since their number is fixed in the given language. What I meant is constructing a formula N(x) with one free variable that is satisfied by, and only by, natural numbers.)
Replies from: Sniffnoy↑ comment by Sniffnoy · 2011-04-15T06:17:56.253Z · LW(p) · GW(p)
Oh, I see. So you're replacing the usual completenes axiom for sets with a completeness schema for predicates. I hadn't considered that. Presumably that works. Never mind. I was interpreting it as actually using sets and thereby introducing either second-order stuff or set theory.
Replies from: Vladimir_M↑ comment by Vladimir_M · 2011-04-15T23:56:00.130Z · LW(p) · GW(p)
Hm... looking at the literature, it seems like I was wrong after all. (My knowledge of this stuff is very rusty.)
There is indeed something essentially second-order about the completeness axiom. Namely, the second-order quantification over sets of reals uniquely characterizes real numbers in a way that is impossible for any first-order theory, because it eliminates countable models, which every first-order theory with countably many axioms must have as per Loewenheim-Skolem. So turning the second-order completeness axiom into a first-order axiom schema does change things.
Reading a bit further about the decision procedures for the theory of real numbers, I notice that all these sources use a different axiomatization that replaces the completeness axiom with two axioms, namely one stating that every non-negative number has a square root and a schema stating that every polynomial of odd degree has a zero. Now, the question is: is this theory equivalent to the first-order one in which the second-order completeness axiom is replaced by a first-order schema as I imagined? If not, then what I wrote above is completely wrong. I'll have to read more about this.
Edit: According to the Stanford Encyclopedia of Philosophy entry on second-order logic:
Suppose that, by analogy, we start from our second-order axiomatization of the ordered field of real numbers, and replace the second-order least-upper-bound axiom by the corresponding schema. The result is an infinite set of first-order axioms, assuring that any definable set that is non-empty and bounded has a least upper bound. The models of this are called real-closed ordered fields.
The same term "real-closed ordered fields" is used in the literature for the above mentioned axiomatization that replaces the second-order completeness axiom with the square-root axiom and the odd-order polynomial zero axiom schema. This would suggest that they are equivalent after all. However, I was unable to find a clear and explicit statement confirming this anywhere, which I find puzzling.
Replies from: Sniffnoy↑ comment by Sniffnoy · 2011-04-16T08:31:12.566Z · LW(p) · GW(p)
Yes, all real closed fields are first-order equivalent; don't ask me how to prove this, I don't know. (Well, I'm pretty sure you do it via quantifier elimination, but I don't know how that actually goes.) When I said "presumably this works", I meant "presumably this is enough to get all the first-order properties of R", which is the same thing as saying "presumably this is enough to get a real closed field". Going by the SEP cite you found, apparently this is so.
Replies from: Sniffnoy↑ comment by Sniffnoy · 2011-04-16T20:15:24.904Z · LW(p) · GW(p)
Oh, I feel silly - of course it's enough. Real closedness is equivalent to intermediate value theorem for polynomials, so if you have "first-order completeness" you can just do the usual proof of intermediate value theorem, just for polynomials. (Because after all the topology on R can be described in terms of its order structure, which allows you to state and use continuity and such as first-order statements.)
↑ comment by Sniffnoy · 2011-04-14T22:04:13.429Z · LW(p) · GW(p)
It cannot be formulated with the first-order theory of the real numbers with addition and multiplication. It is easily formulated within, say, the second-order theory of the real numbers (n is a whole number iff it's in every subset S of R such that 0 is in S and, for every m in S, m+1 is in S), or within the first-order theory of the complex numbers if you allow not only addition and multplication but also exponentiation (n is an integer iff for every z such that e^z=1, e^nz is also 1).
Replies from: prase↑ comment by prase · 2011-04-15T09:08:12.675Z · LW(p) · GW(p)
Thanks. Is there a terminating algorithm which tests the first condition for any given n?
Replies from: Sniffnoy↑ comment by Sniffnoy · 2011-04-15T20:24:16.119Z · LW(p) · GW(p)
I don't even understand what you're talking about when you speak of algorithms on real numbers. Computers can't work with real numbers. Are you assuming some sort of hypercomputation, like a Blum-Shub-Smale computer? Or how are you restricitng your input? (Computable reals, via taking a Turing machine as input? Definable reals handled... somehow?)
(Of course, my answer to any of those questions is "I don't know, this isn't really my area, but you could probably look it up pretty easily" - but my point is that your question doesn't seem to make sense as asked.)
In any case, we're just talking about defining subsets here, not computing them. There's a pretty big gap between being first-order-definable and being computable. And that "first condition" is the same as being a whole number so you could just ask that.
Your questions seem to be predicated on a number of confusions. I think you should really go back and read some standard stuff on this first.
EDIT a few minutes later: Of course, if you stick to first-order stuff, then reals are equivalent to definable reals, or to just computable reals - or to just algebraic reals, even - so perhaps your question could be formulated there? But it still seems ill-founded, i.e., I doubt it gets at what you actually want to know. You seem to be confused about the distinction between computation, first-order description, and description more generally.
↑ comment by Perplexed · 2011-04-14T15:12:38.701Z · LW(p) · GW(p)
A little more subtle, I think. Recall that the key idea in Godel's proof was 'Godelization" - he exhibited an algorithm which matched each sentence about arithmetic in some formal language with a particular natural number. And he also exhibited an algorithm going in the opposite direction - given any number (or formula representing a number), it would either tell you that the number does not correspond to a sentence, or it would tell you what sentence that number represents.
Now suppose we tried to duplicate this with real numbers. Our language still has only a countable number of sentences, so we still want to map sentences to integers (or naturals). So all we need is an algorithm - given any real number or formula representing a real number, determine whether that number is an integer. But there is no such algorithm. It isn't so much a difficulty in simply encoding or defining the word "integer", it is a difficulty in producing an operational or algorithmic definition of "integer".
Incidentally, returning to the original Godel and arithmetic, the only way Godel was able to produce an algorithm deciding whether a formula representing a number does or does not also represent a sentence ... the only way he could be sure his algorithm terminated, was to assume a standard model of arithmetic. I.e., count backward and you will eventually get to zero in a finite number of steps. That is, Godel's undecidable sentence could be assumed either true or false, given the axiomatization of arithmetic. But, also assuming a standard model, that sentence can only be true.
Replies from: prase↑ comment by prase · 2011-04-14T18:38:40.703Z · LW(p) · GW(p)
So all we need is an algorithm - given any real number or formula representing a real number, determine whether that number is an integer. But there is no such algorithm. It isn't so much a difficulty in simply encoding or defining the word "integer", it is a difficulty in producing an operational or algorithmic definition of "integer".
Yes, this was approximately what I meant by "encoding the word integer". Somehow it didn't occur to me that it may be difficult, or even impossible, to test real numbers for integerness. Still, I am not sure if I understand. A concrete example of a uniquely defined real number whose integerness is unknown would help.
Replies from: Perplexed↑ comment by Perplexed · 2011-04-15T01:16:33.929Z · LW(p) · GW(p)
I think you misunderstood. I said there exists no algorithm such that for all formulas it can decide whether the formula represents an integer. Which is equivalent to saying that forall algorithms there exists a formula which can't be decided.
In other words, the way this game is played, first you specify the algorithm and then I supply the formula that breaks it.
Replies from: prase↑ comment by AnotherKevin · 2011-04-15T21:42:37.919Z · LW(p) · GW(p)
Either I am confused or this discussion is confused.
N(X) iff (X=0) || ((X > 0) && N(X-1)) iff X is natural or 0
Z(X) iff ( (X >= 0) -> N(X) ) && ( (X < 0) -> N(0 - X) ) iff X is an integer
equivalently
%20\iff%20(X=0)%20\vee%20((X%20%3E%200)%20\wedge%20N(X-1))%20\iff%20) X is a natural number
%20\iff%20(%20(X%20%3E=%200)%20\implies%20N(X)%20)%20\wedge%20(%20(X%20%3C%200)%20\implies%20N(0%20-%20X)%20)%20\iff%20) X is an integer
I'm also under the impression that the algebraic numbers are countable, dense in R, and that
(\exists%20x((x%20\in%20\mathbb%20R%20)%20\vee%20P(x))%20\iff%20\exists%20y%20((\text{y%20is%20algebraic})%20\vee%20P(y))%0A)
Edit: note to all, mixing latex and plain text on a line looks messy. Further edited for formatting due to lack of preview.
Replies from: Sniffnoy↑ comment by Sniffnoy · 2011-04-15T22:54:42.356Z · LW(p) · GW(p)
If you're attempting to define N as a first order predicate, that doesn't work; you've defined it in terms of itself. You can't directly define predicates recursively; predicates must be finite. If you want to do get a "recursive" predicate you have to do quite a bit more work than that, and in particular you need tools not available in the first order theory of the reals (with addition and multiplication, as usual).
Your definition of Z has additional minor problems; you mean and, not implies. (X>=0 => N(X)) is automatically satisfied for any X<0.
Your last statement is correct (if a bit less general than it could be :) ), though your notation is a bit strange. (Again, assuming + and * as usual.)
Might I ask what the relevance of all this is?
Replies from: AnotherKevin↑ comment by AnotherKevin · 2011-04-16T00:48:35.337Z · LW(p) · GW(p)
Z is defined correctly. When X >= 0 the formula becomes N(X) AND TRUE when X < 0 the formula becomes TRUE AND N(0-X).
Otherwise I was confused. I was trying to define N implicitly which I should have recognized as invalid. Explaining what I was trying to say at the end would be pointless given that I didn't say it and it's also wrong =P. Mea culpa
Replies from: Sniffnoy↑ comment by hairyfigment · 2011-04-14T07:23:28.788Z · LW(p) · GW(p)
Do you mean the underlying math we need for probability theory does not allow us to construct or derive the part of number theory that Gödel's theorem needs? That seems wrong somehow. (Going by Wikipedia, it means that we don't know a non-empty set of probabilities has a least upper bound. That seems like it would cause problems sooner or later.) Where would I go to find out why this holds true?
Replies from: Sniffnoy, endoself↑ comment by Sniffnoy · 2011-04-14T22:00:27.433Z · LW(p) · GW(p)
You cannot get number theory out of the first-order theory of the reals; how would you write a first-order predicate to tell you whether or not a real number was an integer?
I don't see what least upper bounds have to do with anything. Again, Gödel's theorem deals with systems that can model number theory (i.e. the integers, the whole numbers); it has nothing to do with the real numbers.
Replies from: hairyfigment↑ comment by hairyfigment · 2011-04-14T23:19:10.923Z · LW(p) · GW(p)
The Wikipedia article and this shorter account both say that some form of Gödel's incompleteness theorem applies to second-order logic. I asked about the limits of the first-order approach to the reals because it looks like we'd need to use that if we want to stop the theorem from applying.
That approach still seems odd, but I can sort of see how you could do probability that way. I'll edit the OP to reflect my real question as soon as I feel up to it.
↑ comment by endoself · 2011-04-14T15:56:23.470Z · LW(p) · GW(p)
Do you mean the underlying math we need for probability theory does not allow us to construct or derive the part of number theory that Gödel's theorem needs?
Essentially. The specifics are explained in more detail in the comments above.
Going by Wikipedia, it means that we don't know a non-empty set of probabilities has a least upper bound.
That is not true. What Wikipedia article are you referring to?
Replies from: hairyfigment↑ comment by hairyfigment · 2011-04-14T17:12:50.067Z · LW(p) · GW(p)
Second-order logic says that, "if the domain is the set of all real numbers," then "one needs second-order logic to assert the least-upper-bound property for sets of real numbers, which states that every bounded, nonempty set of real numbers has a supremum."
Do you mean to say we can somehow prove this in first-order logic for numbers between 0 and 1, but we can't extend it to the real number line as a whole?
Replies from: Sniffnoy, endoself↑ comment by hairyfigment · 2011-04-14T06:35:44.731Z · LW(p) · GW(p)
None of the probability axioms involve multiplication.
The product rule does, and in any case I mean to include all the rules of arithmetic that we use to apply Bayes.
Replies from: DanielLC↑ comment by DanielLC · 2011-04-14T06:40:37.948Z · LW(p) · GW(p)
The product rule does
It's not an axiom. I think it might be the definition of "given".
and in any case I mean to include all the rules of arithmetic that we use to apply Bayes.
Then you're talking about math, not probability. The fact that you can't prove or disprove Goedel's sentence doesn't change the probability of a hypothesis.
comment by jimrandomh · 2011-04-14T12:09:06.846Z · LW(p) · GW(p)
So, can we or can't we write a mathematically meaningful version of the statement, "This statement has less than .5 probability within the system"?
That would lead to a self-containing set, which is infinite. Infinite sets are allowed only if they're approached by a convergent limiting expression, which is not given here, so Bayesian statistics of the E.T. Jaynes variety would not assign that statement a probability at all.
Replies from: DanielLC