Logical Uncertainty as Probability

post by gRR · 2012-04-29T22:26:35.078Z · LW · GW · Legacy · 22 comments

This post is a long answer to this comment by cousin_it:

Logical uncertainty is weird because it doesn't exactly obey the rules of probability. You can't have a consistent probability assignment that says axioms are 100% true but the millionth digit of pi has a 50% chance of being odd.

I'd like to attempt to define logical uncertainty formally in terms of probability. I don't know whether the result is in any way novel or useful, but here it is.

Let X be a finite set of true statements of some formal system F extending propositional calculus, like Peano Arithmetic. X is supposed to represent a set of logical/mathematical beliefs of some finite reasoning agent.

Given any X, we can define its "Obvious Logical Closure" OLC(X): the infinite set of statements producible from X by applying the rules and axioms of propositional calculus. An important property of OLC(X) is that it is decidable: for any statement S, it is possible to find out whether S is known true (S∈OLC(X)), known false ("~S"∈OLC(X)), or uncertain (neither).
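The decidability claim can be sketched concretely. In the sketch below (the tuple encoding and the `entails` helper are my own illustration, not part of the post), arithmetic statements are treated as opaque propositional atoms, and S∈OLC(X) is decided by a brute-force truth-table check of whether X propositionally entails S:

```python
from itertools import product

# Formulas are nested tuples: ("atom", name), ("not", f), ("and", f, g), ("or", f, g).
def atoms(f):
    """Collect the atom names occurring in a formula."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))

def evaluate(f, v):
    """Evaluate a formula under a truth assignment v: name -> bool."""
    op = f[0]
    if op == "atom":
        return v[f[1]]
    if op == "not":
        return not evaluate(f[1], v)
    if op == "and":
        return evaluate(f[1], v) and evaluate(f[2], v)
    if op == "or":
        return evaluate(f[1], v) or evaluate(f[2], v)
    raise ValueError(op)

def entails(X, s):
    """Decide S ∈ OLC(X): every assignment satisfying all of X satisfies S."""
    names = sorted(set().union(atoms(s), *(atoms(f) for f in X)))
    for bits in product([False, True], repeat=len(names)):
        v = dict(zip(names, bits))
        if all(evaluate(f, v) for f in X) and not evaluate(s, v):
            return False
    return True

A = ("atom", "2*2=4")                 # arithmetic facts are opaque atoms here
B = ("atom", "0=1")
X = [A]
print(entails(X, ("or", A, B)))       # True: "2*2=4 ∨ 0=1" ∈ OLC(X)
print(entails(X, ("atom", "2*3=6")))  # False: uncertain relative to X
```

The check is exponential in the number of atoms, but it always terminates, which is all the decidability claim needs.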

We can now define the "conditional" probability P(*|X) as a function from {the statements of F} to [0,1] satisfying the axioms:

Axiom 1: Known true statements have probability 1:

    P(S|X)=1  iff  S∈OLC(X)

Axiom 2: The probability of a disjunction of mutually exclusive statements is equal to the sum of their probabilities:

    "~(A∧B)"∈OLC(X)  implies  P("A∨B"|X) = P(A|X) + P(B|X)

From these axioms we can get all the expected behavior of the probabilities:

    P("~S"|X) = 1 - P(S|X)

    P(S|X)=0  iff  "~S"∈OLC(X)

    0 < P(S|X) < 1  iff  S∉OLC(X) and "~S"∉OLC(X)

    "A=>B"∈OLC(X)  implies  P(A|X)≤P(B|X)

    "A<=>B"∈OLC(X)  implies  P(A|X)=P(B|X)

          etc.
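For example, the first of these follows from the axioms in two steps. Since "S∨~S" and "~(S∧~S)" are propositional tautologies, both belong to OLC(X) for any X, so:

    P("S∨~S"|X) = 1                       (Axiom 1)
    P("S∨~S"|X) = P(S|X) + P("~S"|X)      (Axiom 2)

and therefore P("~S"|X) = 1 - P(S|X).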

This is still insufficient to calculate an actual probability value for any uncertain statement. Additional principles are required. For example, the Consistency Desideratum of Jaynes: "equivalent states of knowledge must be represented by the same probability values".

Definition: two statements A and B are indistinguishable relative to X iff there exists an isomorphism between OLC(X∪{A}) and OLC(X∪{B}) which is the identity on X and which maps A to B.
[Isomorphism here is a 1-1 function f preserving all logical operations:  f(A∨B)=f(A)∨f(B), f(~~A)=~~f(A), etc.]

Axiom 3: If A and B are indistinguishable relative to X, then  P(A|X) = P(B|X).

Proposition: Let X be the set of statements representing my current mathematical knowledge, translated into F.  Then the statements "millionth digit of PI is odd" and "millionth digit of PI is even" are indistinguishable relative to X.

Corollary:  P(millionth digit of PI is odd | my current mathematical knowledge) = 1/2.
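Unpacking the corollary: write D for "millionth digit of PI is odd", and note that relative to any X containing basic arithmetic, "millionth digit of PI is even" is interchangeable with "~D". Then:

    P(D|X) + P("~D"|X) = 1     (Axioms 1 and 2, since "D∨~D" is a tautology)
    P(D|X) = P("~D"|X)         (the Proposition plus Axiom 3)

which forces P(D|X) = P("~D"|X) = 1/2.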

 

22 comments

Comments sorted by top scores.

comment by lukeprog · 2012-04-30T00:04:15.323Z · LW(p) · GW(p)

I don't know much about logical uncertainty, but have y'all seen Gaifman's paper on the subject?

Gaifman (2004), Reasoning with limited resources and assigning probabilities to arithmetical statements.

Replies from: lukstafi
comment by lukstafi · 2012-04-30T19:31:14.467Z · LW(p) · GW(p)

From a quick look at page 8 (numbered 104 in the document): if gRR had used X instead of OLC(X), the result would be something like Gaifman's construction. The point is to have a consistent probability distribution over the beliefs actually held, not over some infinite closure of those beliefs.

comment by Anatoly_Vorobey · 2012-04-29T23:07:43.224Z · LW(p) · GW(p)

You're sidestepping the whole point of cousin_it, which is that your mathematical knowledge is certainly enough to determine whether the millionth digit of pi is odd or even. One of these two statements is a trivial consequence of the Peano axioms and some well-known representation of pi as an infinite series. It's just that its being a trivial consequence is witnessed by a very long (and also trivial) proof which you're not aware of, and you don't know which of the two statements is backed by such a long proof.

Replies from: gRR
comment by gRR · 2012-04-29T23:48:30.197Z · LW(p) · GW(p)

I don't think I'm sidestepping the issue. The point of cousin_it's comment, as I understood it, was that assigning probabilities to "logically uncertain" statements results in inconsistencies. What I tried to show is that for probabilistic assignments to be consistent, it is only necessary to be logically omniscient at propositional calculus, not at full-power PA. And this is an important difference, because propositional calculus is decidable.

Replies from: Anatoly_Vorobey
comment by Anatoly_Vorobey · 2012-04-30T10:39:43.013Z · LW(p) · GW(p)

Ah, well, if you're only closing under propositional tautologies, then you're not doing anything interesting. OLC(X) is for practical purposes the same as X (not just because it's decidable, as you say, but more importantly because it's so weak). So your suggestion boils down to assigning P=1 to axioms, P=0 to their negations, and trying to figure out non-trivial probabilities for everything else by constraining on propositional consistency. But propositional consistency is merely a very thin veneer over X.

Because propositional inference isn't going to be able to break down quantifiers and "peer" inside their clauses, any quantified statement that's not already related to X [in the sense of participating in X either as a member or as a clause of a Boolean member] is opaque to X. So if I write A = (exists Y)(Y=2) and B = (exists Z)(Z=2 and Z=3), you'll be forced to deduce P(A)=P(B) under your axioms, or pre-commit to include in your axioms A, not(B) and everything else in a smoothly growing (complexity-wise) list of arithmetical truths I can come up with. That doesn't seem very useful.

Replies from: gRR
comment by gRR · 2012-04-30T10:58:08.972Z · LW(p) · GW(p)

But propositional consistency is merely a very thin veneer over X.

That was my goal - to come up with a minimum necessary for consistency, but still sufficient to prove the 1/2 probability for digits of PI :) If you wish to make OLC stronger, you're free to do so, as long as it remains decidable. For example, you can define OLC(X) to be {everything provable from X by at most 10 steps of PA-power reasoning, followed by propositional calculus closure}.

Replies from: Anatoly_Vorobey
comment by Anatoly_Vorobey · 2012-04-30T11:07:03.593Z · LW(p) · GW(p)

In your scheme you get P=1/2 for any nontrivial statement (and its negation) that isn't already in X. It just so happens that this looks reasonable in the case of the oddness of a digit of pi, but that's merely a coincidence (e.g. take A="the millionth digit of pi is 3" rather than "...odd").

Replies from: gRR
comment by gRR · 2012-04-30T11:13:34.697Z · LW(p) · GW(p)

No, a statement and its negation are distinguishable, unless indeed you maliciously hide them under quantifiers and throw away the intermediate proof steps.

comment by orthonormal · 2012-04-30T02:53:00.635Z · LW(p) · GW(p)

I agree with what you're trying to do, but I don't think this proposed construction does it. There are a lot of really complicated statements of propositional calculus which turn out to be either tautologically true or tautologically false, and I'd like to be able to speak of uncertainty of those statements as well.

Constructions like this (or like fuzzy logic) don't capture the principle that I take to be self-evident when discussing bounded agents, that new deductions don't instantly propagate globally: if I've deduced A and also deduced (A implies B), I may not yet have deduced B. (All the more so when we make complicated examples.)

Replies from: gRR
comment by gRR · 2012-04-30T03:40:41.954Z · LW(p) · GW(p)

I don't think the construction actually requires instant propagation. It requires a certain calculation to be made when you wish to assign a probability to a particular statement, and this calculation is provably finite.

In your example, you are free to have X contain "A" and "A=>B", and not contain "B", as long as you don't assign a probability to B. When you wish to do so, you have to do the calculation, which will find that B∈OLC(X), and so will assign P(B)=1. Assigning any other value would indeed be inconsistent for any reasonable definition of probability, because if you know that A=>B, then you have to know that P(A)≤P(B), and then if P(A)=1, then P(B) must also be 1.
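This query-time discipline can be sketched in code (the names, and the restriction of the language to atoms plus implications closed under modus ponens, are my own simplifications, far weaker than full OLC):

```python
def closure(X):
    """Close a knowledge set under modus ponens. Knowledge is given as
    atomic strings and implication pairs ("A", "B"), meaning A => B."""
    facts = {s for s in X if isinstance(s, str)}
    rules = [s for s in X if isinstance(s, tuple)]
    changed = True
    while changed:  # iterate to a fixpoint
        changed = False
        for a, b in rules:
            if a in facts and b not in facts:
                facts.add(b)
                changed = True
    return facts

def query(X, s):
    """Assign a probability only when asked: 1 if s is forced, else None (uncertain)."""
    return 1 if s in closure(X) else None

X = {"A", ("A", "B")}   # contains A and A=>B, but not B explicitly
print(query(X, "B"))    # forced to 1, computed only at query time
print(query(X, "C"))    # uncertain: no value is forced
```

Nothing is propagated when X is updated; the (finite) closure computation happens only when a probability is actually requested, which is the point of the reply above.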

comment by Jack · 2012-04-30T00:31:35.833Z · LW(p) · GW(p)

As written axiom 1 is just false:

P(S|X)=1 iff S∈OLC(X)

But given any finite set (X) of true statements in Peano arithmetic I can invent a statement in Peano arithmetic that isn't in that set but that you would instantly assign p=1.

Edit: And it's false in the other direction too! Humans aren't logically omniscient in the propositional calculus!

Anyway, I think logical uncertainty becomes demystified pretty fast when you start thinking about it in terms of computing time and computational complexity (though that isn't to say I have a solution to the problem of representing it in Bayesian calculus).

Replies from: gRR
comment by gRR · 2012-04-30T00:52:40.650Z · LW(p) · GW(p)

But given any finite set (X) of true statements in Peano arithmetic I can invent a statement in Peano arithmetic that isn't in that set but that you would instantly assign p=1.

It is entirely possible for P(S|X) to be < 1, although P(S|X∪{S}) = 1. There's nothing inconsistent about updating on new evidence.

Replies from: Jack
comment by Jack · 2012-04-30T01:44:55.209Z · LW(p) · GW(p)

So then it isn't true that P(S|X) = 1 if and only if S∈OLC(X).

Replies from: gRR
comment by gRR · 2012-04-30T02:18:39.617Z · LW(p) · GW(p)

If X = { "2*2=4" }, then P("2*2=4 ∨ 0=1" | X) = 1, because "2*2=4 ∨ 0=1" ∈ OLC(X).
However, P("2*3=6" | X) < 1, although P("2*3=6" | X ∪ {"2*3=6"}) = 1.

comment by Jack · 2012-04-29T22:45:31.849Z · LW(p) · GW(p)

What does this method give us for p(P = NP) or p(Goldbach's conjecture) or p(45897 * 69012 = 3167444764)?

Replies from: gRR
comment by gRR · 2012-04-29T23:54:25.463Z · LW(p) · GW(p)

The three axioms do not fix the probabilities uniquely for most of the statements. There are many different consistent probability assignments for any given X.

comment by prase · 2012-04-30T16:24:58.073Z · LW(p) · GW(p)

Let X be the set of statements representing my current mathematical knowledge, translated into F. Then the statements "millionth digit of PI is odd" and "millionth digit of PI is even" are indistinguishable relative to X.

It isn't obvious to me that this is true (and under which assumptions about X).

comment by wantstojoinin · 2012-04-30T09:06:33.931Z · LW(p) · GW(p)

A natural number n can be even or odd: i.e. n%2=0 or n%2=1.

If X = {n is natural number} then you showed that we can use P(n%2=0|X) + P(n%2=1|X) = 1 and P(n%2=0|X) = P(n%2=1|X) together to get P(n%2=0|X) = 1/2.

The same logic works for the three statements n%3=0,n%3=1,n%3=2 to give us P(n%3=0|X) = P(n%3=1|X) = P(n%3=2|X) = 1/3.

But then the same logic also works for the two indistinguishable statements n%3=0 and n%3=1∨n%3=2, giving P(n%3=0|X) = P(n%3=1∨n%3=2|X) = 1/2.

But 1/2 = 1/3 is a contradiction, so we find that axiom 3 leads to inconsistencies.

Replies from: gRR
comment by gRR · 2012-04-30T10:34:09.188Z · LW(p) · GW(p)

n%3=0 is distinguishable from n%3=1∨n%3=2. If A="n%3=0", B="n%3=1", and C="n%3=2", then an isomorphism f that maps B∨C to A must satisfy f(B∨C) = f(B)∨f(C) = A, which is impossible.

Replies from: wantstojoinin
comment by wantstojoinin · 2012-05-01T02:56:44.334Z · LW(p) · GW(p)

I understand, what I wrote was wrong. What if we use n%3=0 and ~(n%3=0) though?

comment by timtyler · 2012-04-29T23:25:54.831Z · LW(p) · GW(p)

I can't say I see the problem. Resource limitation leads to uncertainty about logical propositions, and that can be translated into probability in just the same way as any other form of uncertainty.

comment by DanielLC · 2012-04-29T22:42:01.637Z · LW(p) · GW(p)

There's still a problem. Suppose you watch someone calculate out the millionth digit of pi. Given what you wrote, it seems the probability would then be one. But if someone actually did this, they'd almost certainly make a mistake somewhere, and the right assignment would still be about 1/2.

The probability should decrease with each additional step, as you may have made a mistake.