Is CDT with precommitment enough?
post by martinkunev · 2024-05-25T21:40:11.236Z · LW · GW · 6 comments
This is a question post.
Logical decision theory was introduced (in part) to resolve problems such as Parfit's hitchhiker.
I heard an argument that there is no reason to introduce a new decision theory - one can just take causal decision theory and precommit to doing whatever is needed on such problems (e.g. pay the money once in the city).
This seems dubious given that people spent so much time on developing logical decision theory. However, I cannot formulate a counterargument. What is wrong with the claim that CDT with precommitment is the "right" decision theory?
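To make the stakes concrete, here is a minimal sketch of Parfit's hitchhiker. The payoff numbers are illustrative assumptions, not from the post: dying in the desert costs far more than the fare, and the driver rescues you only if they predict you will pay once in the city.

```python
# Hypothetical payoffs: dying in the desert vs. paying the driver in the city.
DEATH, FARE = -1_000_000, -1_000

def outcome(pays_in_city: bool) -> int:
    # The driver is assumed to be a perfect predictor of your city behavior:
    # you are rescued exactly when you would pay.
    rescued = pays_in_city
    return FARE if rescued else DEATH

# A plain CDT agent, once safely in the city, sees no causal benefit to
# paying, so its (predictable) behavior is "don't pay" -- it is never rescued.
print(outcome(pays_in_city=False))   # -1000000
print(outcome(pays_in_city=True))    # -1000  (the agent who can truly precommit)
```

The question is whether "CDT plus the ability to precommit to paying" captures everything that logical decision theory was introduced to handle.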
Answers
One problem is that in most cases, humans simply can't "precommit" in the relevant sense. We can't really (i.e. completely) move a decision from the future into the present. When I think I have "precommitted" to do the dishes tomorrow, it is still the case that I will have to decide, tomorrow, whether or not to follow through with this "precommitment". So I haven't actually precommitted in the sense relevant for causal decision theory, which requires that the future decision has already been made and that nothing will be left to decide.
So if you e.g. try to commit to one-boxing in Newcomb's problem, it is still the case that you have to actually decide between one-boxing and two-boxing when you stand before the two boxes. And then you will have no causal reason to do one-boxing anymore. The memory of the alleged "precommitment" of your past self is now just a recommendation, or a request, not something that relieves you from making your current decision.
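The dominance reasoning behind this can be sketched with the standard (assumed) Newcomb payoffs: the opaque box holds $1,000,000 iff the predictor foresaw one-boxing, and the transparent box always holds $1,000. At decision time, CDT conditions on the boxes already being filled, and two-boxing wins in either state, which is why the remembered "precommitment" carries no causal weight.

```python
def payoff(one_boxes: bool, predicted_one_box: bool) -> int:
    opaque = 1_000_000 if predicted_one_box else 0
    return opaque if one_boxes else opaque + 1_000

# CDT's view: the boxes are already filled, so compare the causal
# consequences of each act in each possible state of the boxes.
for state in (True, False):
    assert payoff(False, state) > payoff(True, state)  # two-boxing dominates

# The predictor's view: the prediction tracks the actual decision,
# so the one-boxer walks away richer.
print(payoff(True, True), payoff(False, False))   # 1000000 1000
```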
An exception is when we can actively restrict our future actions. E.g. you can precommit to not use your phone tomorrow by locking it in a safe with a time-lock. But this type of precommitment often isn't practically possible.
Being able to do arbitrary true precommitments could also be dangerous overall. It would mean that we really can't change the precommitted decision in the future (since it has already been made in the past), even if unexpected new information will strongly imply we should do so. Moreover, it could lead to ruinous commitment races [? · GW] in bargaining situations.
↑ comment by Ape in the coat · 2024-05-27T05:48:42.138Z · LW(p) · GW(p)
One problem is that in most cases, humans simply can't "precommit" in the relevant sense. We can't really (i.e. completely) move a decision from the future into the present.
This seems to me to be a potential confusion of the normative and descriptive sides of things. Whether humans in practice perfectly follow a specific decision theory isn't really relevant to the question of which decision theory an optimal agent should implement. If CDT+P is optimal and humans have trouble precommitting, that is a problem - for humans, not for CDT+P. It's a reason for humans to learn to precommit better.
When I think I have "precommitted" to do the dishes tomorrow, it is still the case that I will have to decide, tomorrow, whether or not to follow through with this "precommitment". So I haven't actually precommitted in the sense relevant for causal decision theory, which requires that the future decision has already been made and that nothing will be left to decide.
Unless you've actually precommitted to do the dishes, of course. Then your mind doesn't even entertain the idea of not doing them.
Humans are imperfect precommitters, but neither are we completely unable to precommit. We do not evaluate every action we take at every moment of taking it. When you go somewhere, you do not interrogate yourself at every step about whether to continue. We have the ability to follow plans and to automatize some of our actions. And we can actively improve this ability by cultivating relevant virtues. There is an obvious self-fulfilling component here - those who do not believe that they can precommit and therefore do not try, indeed can't. Those who actively try also fail sometimes, but they are less bad at precommitments and improve with time.
Being able to do arbitrary true precommitments could also be dangerous overall.
Of course. That's why evolution gave us only a limited ability to precommit in the first place. And most of our precommitments are flexible enough. There is an implicit "unless something completely unexpected happens or I feel extremely bad, etc." built into our promises by default, and it requires extra privileged access to our psyche to override these restrictions.
Moreover, it could lead to ruinous commitment races [? · GW] in bargaining situations.
Commitment races are an interesting topic. I believe there is a coherent way to resolve them by something like precommitting not to respond to threats and not to make threats yourself against those who would not respond to them, but I didn't explore this beyond reading Project Lawful and superficially thinking about the relevant decision theory for a couple of minutes.
Replies from: cubefox
↑ comment by cubefox · 2024-05-27T12:42:53.083Z · LW(p) · GW(p)
This seems to me as a potential confusion of normative and descriptive sides of things. Whether humans in practice perfectly follow a specific decision theory isn't really relevant to the question of which decision theory an optimal agent should implement.
For potential artificial agents this is true. But for already existing humans, what they should do, e.g. in Newcomb's problem, depends on what they can do (ought implies can), and what they can do is a descriptive question.
When I think I have "precommitted" to do the dishes tomorrow, it is still the case that I will have to decide, tomorrow, whether or not to follow through with this "precommitment". So I haven't actually precommitted in the sense relevant for causal decision theory, which requires that the future decision has already been made and that nothing will be left to decide.
Unless you've actually precommitted to do the dishes, of course. Then your mind doesn't even entertain the idea of not doing them.
Yes, but it normally doesn't work like this. A decision still has to be made about whether to do the dishes now.
Humans are imperfect precommitters, but neither are we completely unable to precommit. We do not evaluate every action we take at every moment of taking it. When you go somewhere, you do not interrogate yourself at every step about whether to continue. We have the ability to follow plans and to automatize some of our actions. And we can actively improve this ability by cultivating relevant virtues. There is an obvious self-fulfilling component here - those who do not believe that they can precommit and therefore do not try, indeed can't. Those who actively try also fail sometimes, but they are less bad at precommitments and improve with time.
But this is very different from the sort of "precommitment" we are talking about in decision theory, or CDT in particular. In decision theory it is assumed that a "decision" means you definitely do it, not just with some probability. The probability is only in the outcomes. The decision is assumed to be final, not something you can change your mind about later.
The sort of limited "precommitment" we are talking about in humans is just a form of listening to advice of your past self. The decision still has to be made in the present, and could very well disregard what your past self recommends. For example, when deciding to take one or both boxes in Newcomb's problem, CDT requires you to look at the causal results of your actions. Listening now to advice of your past self has no causal influence on the contents of the boxes. So following CDT still means you take both boxes, which means the colloquial form of human "precommitment" is useless here. The form of precommitment required for CDT agents to do things like one-boxing is different from what humans can do.
I think yes, but the right set of precommitments for all such problems is LDT.
I suspect that it is, though my inquiries so far are mostly in the realm of probability theory, not decision theory, so I may be missing some domain-specific details.
It seems to me that we can reduce alternative decision theories such as FDT to CDT with a particular set of precommitments. And the ultimate decision theory is something like "I precommit to act in every decision problem the way I wish I had precommitted to act in this particular decision problem".
↑ comment by Oskar Mathiasen (oskar-mathiasen) · 2024-05-26T09:28:10.361Z · LW(p) · GW(p)
It seems to me that FDT has the property that you associate with the "ultimate decision theory".
My understanding is that FDT says that you should follow the policy which is attained by taking the argmax over all policies of the utility from following that policy (only including downstream effects of your policy).
In these easy examples your policy space is your space of committed actions. In which case the above seems to reduce to the "ultimate decision theory" criterion.
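The "argmax over policies" reading can be sketched on Newcomb's problem, where the policy space is just the two committed actions (the payoffs below are the standard hypothetical ones). Because the predictor's behavior depends on the policy itself, the policy's utility includes that effect, and the argmax picks one-boxing:

```python
def utility_of_policy(one_box: bool) -> int:
    # Evaluating a *policy* means the prediction covaries with it --
    # the effect a policy has through the predictor is included.
    predicted_one_box = one_box
    opaque = 1_000_000 if predicted_one_box else 0
    return opaque if one_box else opaque + 1_000

best = max([True, False], key=utility_of_policy)
print(best, utility_of_policy(best))   # True 1000000
```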
Replies from: MakoYass, Ape in the coat
↑ comment by mako yass (MakoYass) · 2024-05-26T22:05:28.536Z · LW(p) · GW(p)
(only including downstream effects of your policy)
I'm not sure I know what you mean by this, but if you mean causal effects, no, it considers all pasts, and all timelines.
(A reader might balk, "but that's computationally infeasible" - but we're talking about mathematical idealizations; the mathematical idealization of CDT is also computationally infeasible. Once we're talking about serious engineering projects to make implementable approximations of these things, you don't know what's going to be feasible.)
↑ comment by Ape in the coat · 2024-05-26T12:19:33.297Z · LW(p) · GW(p)
It seems so to me too, but I expect that there may be some nuance that makes this particular precommitment and therefore FDT not so ultimate after all.
But the point is that we can reduce FDT to CDT with precommitment, so if FDT is indeed the ultimate decision theory, then so is CDT+P.
It's easy to frame Newcomb's problem such that there's no opportunity to precommit (and CDT generally doesn't see any REASON to precommit there).
↑ comment by Ape in the coat · 2024-05-26T06:02:20.228Z · LW(p) · GW(p)
Can you give an example of a version of Parfit's hitchhiker where CDT with precommitment will not see a reason to precommit to the deal?
Replies from: Dagon
↑ comment by Dagon · 2024-05-29T15:43:35.473Z · LW(p) · GW(p)
Nope! Parfit's Hitchhiker is designed to show exactly this. A CDT agent will desperately wish for some way to actually commit to paying.
I think some of the confusion in this thread is about what "CDT with precommitment (or really, commitment)" actually means. It doesn't mean "intent" or "plan". It means "force" - throw the steering wheel out the window, so there IS NO later decision. Note also that humans aren't CDT agents; they're some weird crap that you need to squint pretty hard to call "rational" at all.
6 comments
Comments sorted by top scores.
comment by romeostevensit · 2024-05-25T22:18:47.530Z · LW(p) · GW(p)
More generally, this amounts to dealing with arrow-of-time loopiness problems by expanding the time window to contain the causal process in question. I would guess this epicycle sometimes introduces more complications than it solves, requiring a block-universe model in some circumstances (UDT-related?).
comment by mako yass (MakoYass) · 2024-05-25T23:10:11.497Z · LW(p) · GW(p)
A general pre-commitment mechanism is just self-modification. CDT with self-modification has been named "Son of CDT", and seems to have been discussed most recently in this Arbital article.
It behaves like a UDT agent about everything after the modifications are made, but not about anything that was determined before then.
I'm not aware of any exploits for that. I suspect that there will be some.
↑ comment by JBlack · 2024-05-26T05:24:23.399Z · LW(p) · GW(p)
Yes, such an agent will self-modify if it is presented with a Newcomb game before Omega determines how much money to put into the boxes. It will even self-modify if there is a 1-in-1000 credence that Omega has not yet done so (or might change their mind).
At this point considerations come in such as what happens if such an agent expects to face Newcomb-like games in the future but isn't yet certain what form they will take or what the exact payoffs will be. Should it self-modify to something UDT-like now?
comment by mako yass (MakoYass) · 2024-05-25T23:11:27.889Z · LW(p) · GW(p)
I note that in the cooperative bargaining domain, a CDT agent will engage in commitment races, using the commitment mechanism to turn itself into a berserker, a threat-maker [LW · GW]. If they're sharing a world with other CDT agents, that is all they will do. Whoever's able to constitutionalize first will make a pre-commitment like "I'll initiate a nuclear apocalypse if you don't surrender all of your land to us."
If they're sharing the world with UDT agents, they will be able to ascertain that those sorts of threats will be ignored (reflected in the US's principle of "refusing to negotiate with terrorists"), and recognize that it would just lead to MAD with no chance of a surrender deal. I think commitment mechanisms only lead to good bargaining outcomes if UDT agents already hold a lot of power.
comment by torekp · 2024-05-25T23:08:15.046Z · LW(p) · GW(p)
Given the disagreement over what "causality" is, I suspect that different CDT's might have different tolerances for adding precommitment without spoiling the point of CDT. For an example of a definition of causality that makes interesting impacts on decision theory, see Douglas Kutach, Causation and its Basis in Fundamental Physics. There's a nice review here. Defining "causation" Kutach's way would allow both making and keeping precommitments to count as causing good results. It would also at least partly collapse the divergence between CDT and EDT. Maybe completely - I haven't thought that through yet.
comment by Vladimir_Nesov · 2024-05-26T04:57:57.571Z · LW(p) · GW(p)
If you are a program and want another program to behave in a certain way, CDT doesn't really help with setting up a way of solving this. Decision theory is about figuring out how decision problems should be solved, but it's not in general clear what a decision problem is, or what counts as an admissible way of solving it.