Improving the modal UDT optimality result

benja_fallenstein


post by Benya_Fallenstein (Benja_Fallenstein) · 2014-11-23T22:16:38.000Z · LW · GW · 2 comments


I recently posted about an optimality result for modal UDT, which shows that for every modal decision problem $\vec{P}(\vec{a})$, there is a closed modal formula $φ$ such that the version of modal UDT that searches for proofs in $\mathrm{PA} + φ$ will perform optimally on $\vec{P}(\vec{a})$.

Paul commented on this post and suggested a stronger version: For every modal decision theory $\vec{T}(\vec{u})$ and every provably extensional modal decision problem $\vec{P}(\vec{a})$, modal UDT will do at least as well on $\vec{P}(\vec{a})$ as $\vec{T}(\vec{u})$ does, provided it uses a proof system that can prove which action $\vec{T}(\vec{u})$ chooses on this decision problem and which outcome it obtains as a result. In this post, I give a detailed proof of this.

Prerequisite: An optimality result for modal UDT, and the prerequisites therein.

Let $(\vec{A}^{(T)}, \vec{U}^{(T)})$ be the fixed point of $\vec{T}(\vec{u})$ and $\vec{P}(\vec{a})$, and recall my notation $\vec{χ}^{(i)}$ for the sequence of formulas which has $⊤$ as its $i$-th entry and $⊥$ as all of its other entries (with the length of the sequence being clear from context). Now suppose that $φ$ is a true closed formula in the language of $\mathrm{GL}$ such that $\mathrm{GL} ⊢ φ \to ((\vec{A}^{(T)} \leftrightarrow \vec{χ}^{(i_{T}^{*})}) \land (\vec{U}^{(T)} \leftrightarrow \vec{χ}^{(j_{T}^{*})}))$; that is, $\mathrm{GL} + φ$ proves that $\vec{T}(\vec{u})$ chooses action $i_{T}^{*}$ and achieves the outcome $j_{T}^{*}$ as a result. (By saying that $φ$ is a "true" formula we mean that its translation to the language of arithmetic is true about the standard natural numbers: $\mathbb{N} ⊨ φ$.) It's ok to talk about $\vec{A}^{(T)}$ and $\vec{U}^{(T)}$ in the definition of $i_{T}^{*}$ and $j_{T}^{*}$ because fixed points are unique (up to provable equivalence), and $\mathrm{GL}$ can prove that the fixed point is in fact a fixed point.

My claim, then, is that $\vec{\mathrm{UDT}}^{(φ)}(\vec{u})$ will perform at least as well as $\vec{T}(\vec{u})$ on the decision problem $\vec{P}(\vec{a})$; that is, the outcome it achieves will be ranked $\leq j_{T}^{*}$.

Intuitively, this is straightforward. $\vec{\mathrm{UDT}}^{(φ)}(\vec{u})$ searches through all pairs $(j, i)$ of outcomes and actions in lexicographical order, until it finds a pair such that it can prove that if it takes action $i$, it will achieve outcome $j$; as soon as it finds such a pair, it takes action $i$. (This is justified because it searches outcomes best-first, so it takes an action that leads to as good an outcome as it's able to prove it can get.) So if it can prove that taking action $i_{T}^{*}$ will lead to outcome $j_{T}^{*}$, it will either take that action and get that outcome, or there's some pair $(j, i) < (j_{T}^{*}, i_{T}^{*})$ such that it takes action $i$ and obtains outcome $j \leq j_{T}^{*}$. (Remember that outcomes are numbered from best to worst.)
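As a rough illustration of the search order just described, here is a toy Python sketch. It is not the modal construction itself: the function name `udt_search`, the `can_prove` callback, and the `default_action` fallback are all assumptions made for the example. `can_prove(i, j)` stands in for "the proof system proves that taking action $i$ leads to outcome $j$", and the sketch simply returns the first pair found, without modeling why the proved outcome is the actual one (that is what the soundness argument below is for).

```python
from itertools import product
from typing import Callable, Optional, Tuple


def udt_search(
    num_actions: int,
    num_outcomes: int,
    can_prove: Callable[[int, int], bool],
    default_action: int = 1,
) -> Tuple[int, Optional[int]]:
    """Toy model of the search loop; all names here are hypothetical.

    Outcomes are numbered 1..num_outcomes from best to worst, actions
    1..num_actions. Pairs (j, i) are visited in lexicographic order, so
    every action is tried for outcome 1 before any is tried for outcome 2.
    """
    for j, i in product(range(1, num_outcomes + 1), range(1, num_actions + 1)):
        if can_prove(i, j):
            # Stop at the first pair for which "action i implies outcome j"
            # is provable, and take action i.
            return i, j
    # No pair was provable: fall back to a default action (an assumption
    # of this sketch), with no proved outcome.
    return default_action, None


# Suppose the oracle can prove (action 2 -> outcome 3), playing the role of
# (i_T*, j_T*), and also (action 1 -> outcome 2).
provable = {(2, 3), (1, 2)}
print(udt_search(3, 3, lambda i, j: (i, j) in provable))  # (1, 2)
```

In the usage line, the search stops at the pair with outcome 2 before it ever reaches the pair with outcome 3, which is the "at least as good as $j_{T}^{*}$" behavior the argument relies on.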

Let's go through the details of showing that it actually works out that way.

Let's write $(\vec{A}, \vec{U})$ for the fixed point of $\vec{\mathrm{UDT}}^{(φ)}(\vec{u})$ with $\vec{P}(\vec{a})$. The part of our argument that we need to check carefully is that UDT will in fact stop at the pair $(j_{T}^{*}, i_{T}^{*})$ if it hasn't already stopped before that; i.e., $\mathrm{GL} ⊢ φ \to (\mathrm{UDT}_{i_{T}^{*}}^{(φ)}(\vec{U}) \to U_{j_{T}^{*}})$. If this is satisfied, then we're done: We know that there will be some pair $(j^{*}, i^{*}) \leq (j_{T}^{*}, i_{T}^{*})$ such that $\vec{\mathrm{UDT}}^{(φ)}(\vec{U})$ outputs $i^{*}$ and such that $\mathrm{GL} ⊢ φ \to (\mathrm{UDT}_{i^{*}}^{(φ)}(\vec{U}) \to U_{j^{*}})$, and hence, since (a) $\mathrm{GL}$ is sound on $\mathbb{N}$, (b) $\mathbb{N} ⊨ φ$, and (c) by assumption, $\mathbb{N} ⊨ \mathrm{UDT}_{i^{*}}^{(φ)}(\vec{U})$, it follows that $\mathbb{N} ⊨ U_{j^{*}}$, i.e., $\vec{\mathrm{UDT}}^{(φ)}(\vec{U})$ achieves the outcome $j^{*} \leq j_{T}^{*}$. Thus, let's check that $\mathrm{GL}$ does indeed prove $φ \to (\mathrm{UDT}_{i_{T}^{*}}^{(φ)}(\vec{U}) \to U_{j_{T}^{*}})$.

To do so, we make use of provable extensionality, that is, of the fact that $\mathrm{GL} ⊢ (\vec{a} \leftrightarrow \vec{b}) \to (\vec{P}(\vec{a}) \leftrightarrow \vec{P}(\vec{b}))$. Since $\vec{\mathrm{UDT}}^{(φ)}$, as a modal decision theory, is a p.m.e.e. sequence (provably mutually exclusive and exhaustive), $\mathrm{GL}$ proves that $\mathrm{UDT}_{i_{T}^{*}}^{(φ)}(\vec{U})$ implies $\vec{\mathrm{UDT}}^{(φ)}(\vec{U}) \leftrightarrow \vec{χ}^{(i_{T}^{*})}$, i.e., $\vec{A} \leftrightarrow \vec{χ}^{(i_{T}^{*})}$. Hence, together with provable extensionality, we obtain $\mathrm{GL} ⊢ \mathrm{UDT}_{i_{T}^{*}}^{(φ)}(\vec{U}) \to (\vec{U} \leftrightarrow \vec{P}(\vec{χ}^{(i_{T}^{*})}))$ (since $\mathrm{GL} ⊢ \vec{P}(\vec{A}) \leftrightarrow \vec{U}$ by definition of $\vec{U}$). But on the other hand, recall our initial assumption that $\mathrm{GL} ⊢ φ \to ((\vec{A}^{(T)} \leftrightarrow \vec{χ}^{(i_{T}^{*})}) \land (\vec{U}^{(T)} \leftrightarrow \vec{χ}^{(j_{T}^{*})}))$; again by provable extensionality, this implies $\mathrm{GL} ⊢ φ \to ((\vec{P}(\vec{A}^{(T)}) \leftrightarrow \vec{P}(\vec{χ}^{(i_{T}^{*})})) \land (\vec{U}^{(T)} \leftrightarrow \vec{χ}^{(j_{T}^{*})}))$, and since $\mathrm{GL} ⊢ \vec{P}(\vec{A}^{(T)}) \leftrightarrow \vec{U}^{(T)}$ by definition of $\vec{U}^{(T)}$, this simplifies to $\mathrm{GL} ⊢ φ \to (\vec{P}(\vec{χ}^{(i_{T}^{*})}) \leftrightarrow \vec{χ}^{(j_{T}^{*})})$. But together with our earlier result, this implies $\mathrm{GL} ⊢ (φ \land \mathrm{UDT}_{i_{T}^{*}}^{(φ)}(\vec{U})) \to (\vec{U} \leftrightarrow \vec{χ}^{(j_{T}^{*})})$, which is equivalent to $\mathrm{GL} ⊢ φ \to (\mathrm{UDT}_{i_{T}^{*}}^{(φ)}(\vec{U}) \to U_{j_{T}^{*}})$, as desired.
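For readers who want the chain of implications at a glance, the argument above can be laid out as numbered steps. The step labels (1) to (5) are added here for orientation only, and sequences of formulas are written with vector arrows; nothing in this block goes beyond what the paragraph already states.

```latex
\begin{align*}
&(1)\quad \mathrm{GL} \vdash \mathrm{UDT}^{(\varphi)}_{i_T^*}(\vec{U})
      \to \bigl(\vec{A} \leftrightarrow \vec{\chi}^{(i_T^*)}\bigr)
   && \text{p.m.e.e.} \\
&(2)\quad \mathrm{GL} \vdash \mathrm{UDT}^{(\varphi)}_{i_T^*}(\vec{U})
      \to \bigl(\vec{U} \leftrightarrow \vec{P}(\vec{\chi}^{(i_T^*)})\bigr)
   && \text{(1), extensionality, } \vec{P}(\vec{A}) \leftrightarrow \vec{U} \\
&(3)\quad \mathrm{GL} \vdash \varphi
      \to \bigl(\vec{P}(\vec{\chi}^{(i_T^*)}) \leftrightarrow \vec{\chi}^{(j_T^*)}\bigr)
   && \text{assumption on } \varphi \text{, extensionality, }
      \vec{P}(\vec{A}^{(T)}) \leftrightarrow \vec{U}^{(T)} \\
&(4)\quad \mathrm{GL} \vdash \bigl(\varphi \land \mathrm{UDT}^{(\varphi)}_{i_T^*}(\vec{U})\bigr)
      \to \bigl(\vec{U} \leftrightarrow \vec{\chi}^{(j_T^*)}\bigr)
   && \text{(2) and (3)} \\
&(5)\quad \mathrm{GL} \vdash \varphi
      \to \bigl(\mathrm{UDT}^{(\varphi)}_{i_T^*}(\vec{U}) \to U_{j_T^*}\bigr)
   && \text{(4), restated}
\end{align*}
```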

2 comments


comment by orthonormal · 2015-01-22T21:43:14.000Z · LW(p) · GW(p)

Further simplification (which Benja and Marcello worked out): since every modal formula is decidable conditional on $\neg □^{n} ⊥$ for large enough $n$, you don't need special axioms for each modal decision problem and decision theory; you just need a strong enough consistency axiom. That's a pretty nifty optimality result.

(It requires the decidability result above, which currently is an unpublished folk theorem proved in an alternate draft of the modal combat paper, but we'll see if we can get that included in a nice peer-reviewed citable source.)

↑ comment by orthonormal · 2015-05-11T17:03:47.000Z · LW(p) · GW(p)

Two further notes:

Said folk theorem is in fact shown in Boolos.
Benja verified that the optimality result should work for chicken-playing modal UDT as well as descending-search-order modal UDT.
