Last time, we required our robot to only assign logical probability of 0 or 1 to statements where it's checked the proof. This flowed from our desire to have a robot that comes to conclusions in limited time. It's also important that this abstract definition has to take into account the pool of statements that our actual robot actually checks. However, this restriction doesn't give us a consistent way to assign numbers to unproven statements - to be consistent we have to put limits on our application of the usual rules of probability.
The simplest solution is to assign logical probabilities to proven statements normally, but totally refuse to apply our information to unproven statements. The principle of maximum entropy means that every unproven statement then gets logical probability 0.5.
There is a correspondence between being inconsistent and ignoring information. We could just as well have said that when we move from proven statements to unproven statements, we refuse to apply the product rule, and that would have assigned every unproven statement logical probability 0.5 too. Either way, there's something "non-classical" going on at the boundary between proven statement and unproven statements. If you are certain that 100+98=198, but not that 99+99=198, some unfortunate accident has befallen the rule that if (A+1)+B=C, A+(B+1)=C.
Saying 0.5 for everything is rarely suggested in practice, because it has some glaringly bad properties: if we ask about the last digit of the zillionth prime number, we don't want to say that the answer being 1 has logical probability 0.5, we want our robot to use facts about digits and about primes to say that 1, 3, 7, and 9 have logical probability 0.25 each.
Using Our Pool of Proof-Steps
The most obvious process that solves this problem is to assign logical probabilities by ignoring most of the starting information, but using as information all proven statements containing no variables (that is, can't regenerate a set of axioms that will require us to take infinite time). So if we prove that our prime number can't simultaneously end with 1 and 3, we'll never assign a combined probability to them greater than 1.
This is the most typical family of suggestions (of the ones that won't require infinite resources). See Benja
, for example. Another phrasing of this solution is that it's like we start with the inconsistent prior of "1/2 to everything," and then update this according to checked proof steps, until we run out of time. The updating can also be described as if there's a bunch of different tables (models) that assign a true or false value to every statement, and when we learn that the prime can't end with 1 and 3 simultaneously, we rule out the models that say both of those are true and redistribute the probability evenly. To get answers in finite time, we can't actually compute any of these fancy descriptions, but we can compute the updates of just the statement we want.
Even though the probabilities assigned by this approach are more sensible, violations of normal probability still occur at the boundary between proven and unproven statements. We're giving our robot more information to work with, but still not as much as a robot with infinite computing power could use.
A sneaky issue here is that since using checked steps as information takes time, that's less time available to find the solution. This is a general rule - as tricks for assigning logical probabilities to unproven statements get better, they take up more time, so you only want to use them if you don't expect that time to be important. But calculating that tradeoff also takes time! Someone has probably solved this problem in other contexts, but I do not know the solution.
There is another interesting property we might want, which could be called logical pattern-matching
. Suppose that our robot is trying to predict a very complicated machine. Further suppose that our robot knows the complete description of the machine, but it is too complicated for our robot to predict the output, or even to find any useful proofs about the machine's behavior.
At time step 1, our robot observes that the machine outputs "1." At time step 2, our robot observes that the machine outputs "2." We might now want our robot to "see the pattern," and guess that the machine is about to output 3.
Our solutions so far don't do this - our robot would need to prove a statement like "if its last output was 2, its next output is logically forbidden from being 5." If our machine is too complicated to prove statements like that about, our previous solutions won't even think of the previous outputs as information.
One way to make our robot care about complexity is to restrict the length of hypotheses to less than the length of the description of the machine. This is like giving the robot the information that the answer comes from a machine, that the description length of this machine is less than some number.
A big problem with this is time. If our robot has to average over the outcome of all these different hypotheses, this takes longer than just using the description itself to find the answer. In a sense, directly using knowledge about the description of the machine is too much for our robot to handle. When we just used the checked proof-steps, that was okay, but as you give the robot more information you also burden it by making it spend more time interpreting that information.
And yet, we want our robot to be able to do logical pattern matching quickly if it actually goes out and observes a complicated machine that prints "1, 2...". But this is another problem I don't know how to solve - we could just say "monte carlo" and wave our hands, but handwaving is frowned upon here, and for good reason.
Further Open Problems
In this post I've already mentioned two open problems: the tradeoff of searching for an exact solution versus having a good approximation, and the correct way to do logical pattern-matching. There are more unsolved problems that also deserve mention.
• Handling infinities. Current proposals have some pretty bad properties if there are an infinite number of possible answers. For example, if you prove that answers 1-1000 all have small logical probability, but don't prove anything about answer 1001, the robot might decide that since you didn't prove anything about it, it has probability 0.5, and is thus a really good idea. An example direction to go might be to restrict our robot to taking actions it's actually proved things about - but we can also come up with perverse situations where that's bad. Is there a better way?
Integrating this approach into a larger framework of decision-making
. This builds off of making tradeoffs with your computing time and handling infinities. Basically, we want our robot to make decisions in limited time, not just output logical probabilities in limited time, and making decisions requires considering your possible actions and the utility of outcomes, which are allowed to be really complicated and require approximation. And then, we need to somehow direct computational resources into different tasks to make the best decision.
ntegrating this approach with second-order arithmetic. If you look at MIRI's paper
that uses a probability distribution over logical statements, their approach is quite different - for one, they don't allow for any limits on the robot's resources. And for another, there are all sorts of properties that are important when considering second-order arithmetic that we haven't needed yet. For example, what happens when we ask for P(P(this statement)<0.3)?
Thank you for reading the Logical Uncertainty sequence. I hope that things which were not obvious now seem obvious. If you want your logical probability distribution to have certain nice properties, it is a good idea to only slightly depart from the original desiderata of probability, and build up from there. Jumping straight to an answer is not necessary, and is probably a bad idea anyhow.
Comments sorted by top scores.
comment by ygert ·
2014-03-16T10:04:49.761Z · LW(p) · GW(p)
I have some rambling thoughts on the subject. I just hope they aren't too stupid or obvious ;-)
Let's take as a framework the aforementioned example of the last digit of the zillionth prime. We'll say that the agent will be rewarded if it gets it right, on, shall we say, a log scoring rule. This means that the agent is incentivised to give the best (most accurate) probabilities it can, given the information it has. The more unreasonably confident it is, the more it loses, and the same with underconfidence.
By the way, for now I will assume the agent fully knows the scoring rule it will be judges by. It is quite possible that this assumption raises problems of its own, but I will ignore them for now.
So, the agent starts with a prior over the possible answers (a uniform prior?), and starts updating itself. But it wants to figure out how long it will spend doing so, before it should give up and hand in for grading its "good enough" answer. This is the main problem we are trying to solve here.
In the degenerate case in which it has nothing else in the universe other than this to give it utility, I actually think it is the correct answer to work forever (or as long as it can before physically falling apart) on the answer. But we shall make the opposite assumption. Let's call the amount of utility lost to the agent as an opportunity cost in a given unit of time by the name C. (We shall also make the assumption that the agent knows what C is, at least approximately. This is perhaps a slightly more dangerous assumption, but we shall accept it for now.)
So, the agent want to work for as many units of time as it can before the marginal amount of extra utility it would earn from the scoring rule from the work of a unit time is less than C.
The only problem left is figuring out that margin. But, by the assumption that the agent knows the scoring rule, it knows the derivative of the scoring function as well. At any given point in time, it can figure out the amount of change to the potential utility it would get from the change to the probabilities it assigns. Thus, if the agent knows approximately the range in which it may update in the next step, it can figure out whether or not the next stage is worthwhile.
In other words, once it is close enough to the answer that it predicts that a marginal update would move it closer to the answer by an amount that gives less than C utility, it can quit, and not perform the next step.
This makes sense, right? I do suspect that this is the direction to drive at in the solution to this problem.