Comments

Comment by JoshBurroughs on Timeless Decision Theory: Problems I Can't Solve · 2011-02-17T21:15:42.067Z · LW · GW

A simpler way to say all this is "Pick a depth where you will stop recursing (due to growing uncertainty or computational limits) and at that depth assume your opponent acts randomly." Is my first attempt needlessly verbose?
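A minimal Python sketch of that rule, to make it concrete. Everything named here is an assumption for illustration: the payoff numbers, the names `PAYOFF`, `expected_utility`, and `choose`, and the idea that each level simply best-responds to the level below it (which is a CDT-style move, not TDT proper).

```python
# Hypothetical payoffs for the player moving first in each pair; any values
# with u(D,C) > u(C,C) > u(D,D) > u(C,D) would serve equally well.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def expected_utility(my_move, p_opp_cooperates):
    """Expected payoff of my_move when the opponent cooperates with this probability."""
    return (p_opp_cooperates * PAYOFF[(my_move, "C")]
            + (1 - p_opp_cooperates) * PAYOFF[(my_move, "D")])


def choose(depth):
    """Pick a move by recursing `depth` levels into the opponent's reasoning.

    At depth 0 we stop recursing and treat the opponent as a coin flip.
    Above that, we assume the opponent runs this same procedure one level
    shallower, and we best-respond to its (deterministic) choice.
    """
    if depth == 0:
        p_opp_cooperates = 0.5
    else:
        p_opp_cooperates = 1.0 if choose(depth - 1) == "C" else 0.0
    return max(("C", "D"), key=lambda move: expected_utility(move, p_opp_cooperates))


for d in range(4):
    print(d, choose(d))
```

With a prisoner's dilemma payoff ordering, this procedure picks defection at every depth, because a plain best response to any fixed guess about the opponent favors D. That is the collapse I am worried about.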

Comment by JoshBurroughs on Timeless Decision Theory: Problems I Can't Solve · 2011-02-17T21:07:34.835Z · LW · GW

Agents A & B are two TDT agents playing some prisoner's dilemma scenario. A can reason:

u(c(A)) = P(c(B))u(C,C) + P(d(B))u(C,D)

u(d(A)) = P(c(B))u(D,C) + P(d(B))u(D,D)

( u(X) is the utility of X, u(X,Y) is A's payoff when A plays X and B plays Y, P() is probability, and c() & d() are the cooperate & defect predicates )

A will always pick the option with higher utility, so it reasons B will do the same:

u'(c(B)) > u'(d(B)) --> c(B)

(u'() is A's estimate of B's utility function)

But A can't perfectly predict B (even though it may be quite good at it), so A can represent this uncertainty as a random variable e:

u'(c(B)) + e > u'(d(B)) - e --> c(B)

In fact, we can give e a parameter, N, which is given by the depth of recursion, like a game of telephone:

u'(c(B)) + e(N) > u'(d(B)) - e(N) --> c(B)

Intuitively, it seems e(N) will tend to overwhelm the gap between u'(c(B)) and u'(d(B)) for high enough N (since utilities don't increase as you recurse). At that recursion depth:

P(c(B)) = P(d(B)) = 1/2

so:

u(c(A)) = (u(C,C) + u(C,D)) / 2

u(d(A)) = (u(D,C) + u(D,D)) / 2

By the definition of a prisoner's dilemma, u(D,C) > u(C,C) > u(D,D) > u(C,D)

so u(d(A)) > u(c(A)), meaning defection at the recursive depth where uncertainty overwhelms other considerations.
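The claim that e(N) eventually swamps the utility gap, pushing the probability of predicting cooperation toward 1/2, can be checked numerically under some strong assumptions of my own: e(N) is Gaussian with mean zero, its standard deviation grows linearly with N, and the same draw of e appears on both sides of the inequality. The function name, the growth rate `sigma_per_level`, and the utility values are hypothetical.

```python
import math


def prob_predict_cooperate(u_c, u_d, depth, sigma_per_level=1.0):
    """P(u_c + e > u_d - e) for e ~ Normal(0, (sigma_per_level * depth)^2).

    The inequality rearranges to e > (u_d - u_c) / 2, so the answer is the
    Gaussian upper-tail probability at that threshold.
    """
    threshold = (u_d - u_c) / 2.0
    sigma = sigma_per_level * depth
    if sigma == 0:
        # No noise: the prediction is deterministic (a tie counts as "no").
        return 1.0 if threshold < 0 else 0.0
    # Upper tail of the normal distribution, written with the error function.
    return 0.5 * (1.0 - math.erf(threshold / (sigma * math.sqrt(2.0))))


# Assumed estimate: A thinks B values cooperating at 3 and defecting at 1.
for n in (0, 1, 10, 100):
    print(n, round(prob_predict_cooperate(3.0, 1.0, n), 3))
```

Under these assumptions the probability starts at 1 with no noise and slides toward 1/2 as N grows, which is the regime where the equal-probability step applies. A different noise model could behave differently.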

Does this mean a TDT agent must revert to CDT if it is not smart enough (or does not believe its opponent is smart enough) to transform the recursion to a closed-form solution?