Improving the modal UDT optimality resultpost by Benya_Fallenstein (Benja_Fallenstein) · 2014-11-23T22:16:38.000Z · score: 2 (2 votes) · LW · GW · None comments
I recently posted about an optimality result for modal UDT, which shows that for every modal decision problem , there is a closed modal formula such that the version of modal UDT that searches for proofs in will perform optimally on .
Paul commented on this post and suggested a stronger version: For every modal decision theory and every provably extensional modal decision problem , modal UDT will do at least as well on as does if it is using a proof system that can prove what action chooses on this decision problem, and which outcome it obtains as a result. In this post, I give a detailed proof of this.
Prerequisite: An optimality result for modal UDT, and the prerequisites therein.
Let be the fixed point of and , and recall my notation for the sequence of formulas which has as the 'th entry, and as all of its other entries (with the length of the sequence being clear from context). Now suppose that is a true closed formula in the language of such that that is, proves that chooses action , and achieves the outcome as a result. (By saying that is a "true" formula we mean that its translation to the language of arithmetic is true about the standard natural numbers: .) It's ok to talk about and in the definition of and because fixed points are unique (up to provable equivalence), and can prove that the fixed point is in fact a fixed point.
My claim, then, is that will perform at least as well as on the decision problem ; that is, the outcome it achieves will be ranked .
Intuitively, this is straight-forward. searches through all pairs of outcomes and actions in lexicographical order, until it finds a pair such that it can prove that if it takes action , it will achieve outcome ; as soon as it finds such a pair, it takes action . (This is justified becaues it searches outcomes best-first, so it takes an action that leads to as good an outcome as it's able to prove it can get.) So if it can prove that taking action will lead to outcome , it will either take that action and get that outcome, or there's some pair such that it takes action and obtains outcome . (Remember that outcomes are numbered from best to worst.)
Let's go through the details of showing that it actually works out that way.
Let's write for the fixed point of with . The part in our argument that we need to check carefully is that UDT will in fact stop at the pair if it hasn't already stopped before that; i.e., If this is satisfied, then we're done: We know that there will be some pair such that outputs and such that and hence, since (a) is sound on , (b) , and (c) by assumption, , it follows that , i.e., achieves the outcome . Thus, let's check that does indeed prove .
To do so, we make use of provable extensionality, that is, of the fact that Since as a modal decision theory, is a p.m.e.e. sequence (provably mutually exclusive and exhaustive), proves that implies , i.e., . Hence, together with provable extensionality, we obtain (since by definition of ). But on the other hand, recall our initial assumption that again by provable extensionality, this implies and since by definition of , this simplifies to But together with our earlier result, this implies which is equivalent to as desired.
Comments sorted by top scores.