Logical decision theory was introduced (in part) to resolve problems such as Parfit’s hitchhiker.
I heard an argument that there is no reason to introduce a new decision theory—one can just take causal decision theory and precommit to doing whatever is needed on such problems (e.g. pay the money once in the city).
This seems dubious given that people spent so much time on developing logical decision theory. However, I cannot formulate a counterargument. What is wrong with the claim that CDT with precommitment is the “right” decision theory?
One problem is that in most cases, humans simply can’t “precommit” in the relevant sense. We can’t really (i.e. completely) move a decision from the future into the present. When I think I have “precommitted” to do the dishes tomorrow, it is still the case that I will have to decide, tomorrow, whether or not to follow through with this “precommitment”. So I haven’t actually precommitted in the sense relevant for causal decision theory, which requires that the future decision has already been made and that nothing will be left to decide.
So if you e.g. try to commit to one-boxing in Newcomb’s problem, it is still the case that you have to actually decide between one-boxing and two-boxing when you stand before the two boxes. And then you will have no causal reason to one-box anymore. The memory of the alleged “precommitment” of your past self is now just a recommendation, or a request, not something that relieves you of making your current decision.
An exception is when we can actively restrict our future actions. E.g. you can precommit to not use your phone tomorrow by locking it in a safe with a time-lock. But this type of precommitment often isn’t practically possible.
Being able to make arbitrary true precommitments could also be dangerous overall. It would mean that we really can’t change the precommitted decision in the future (since it has already been made in the past), even if unexpected new information strongly implies we should do so. Moreover, it could lead to ruinous commitment races in bargaining situations.
This seems to me like a potential confusion of the normative and descriptive sides of things. Whether humans in practice perfectly follow a specific decision theory isn’t really relevant to the question of which decision theory an optimal agent should implement. If CDT+P is optimal and humans have trouble precommitting, that is a problem for humans, not for CDT+P. It’s a reason for humans to learn to precommit better.
Unless you’ve actually precommitted to do the dishes, of course. Then your mind doesn’t even entertain the idea of not doing them.
Humans are imperfect precommitters, but neither are we completely unable to precommit. We do not evaluate every action we take at every moment of taking it. When you go somewhere, you do not ask yourself at every step whether to continue. We have the ability to follow plans and to automate some of our actions. And we can actively improve this ability by cultivating relevant virtues. There is an obvious self-fulfilling component here: those who do not believe that they can precommit, and therefore do not try, indeed can’t. Those who actively try also fail sometimes, but they are less bad at precommitments and improve with time.
Of course. That’s why evolution gave us only a limited ability to precommit in the first place. And most of our precommitments are flexible enough. There is an implicit “unless something completely unexpected happens or I feel extremely bad, etc.” built into our promises by default, and it requires extra privileged access to our psyche to override these restrictions.
Commitment races are an interesting topic. I believe there is a coherent way to resolve them by something like precommitting not to respond to threats and not to make threats yourself against those who would not respond to them, but I didn’t explore this beyond reading Project Lawful and superficially thinking about the relevant decision theory for a couple of minutes.
For potential artificial agents this is true. But for already existing humans, what they should do, e.g. in Newcomb’s problem, depends on what they can do (ought implies can), and what they can do is a descriptive question.
Yes, but it normally doesn’t work like this. A decision still has to be made, now, whether to do the dishes.
But this is very different from the sort of “precommitment” we are talking about in decision theory, or CDT in particular. In decision theory it is assumed that a “decision” means you definitely do it, not just with some probability. The probability is only in the outcomes. The decision is assumed to be final, not something you can change your mind about later.
The sort of limited “precommitment” we are talking about in humans is just a form of listening to advice of your past self. The decision still has to be made in the present, and could very well disregard what your past self recommends. For example, when deciding to take one or both boxes in Newcomb’s problem, CDT requires you to look at the causal results of your actions. Listening now to advice of your past self has no causal influence on the contents of the boxes. So following CDT still means you take both boxes, which means the colloquial form of human “precommitment” is useless here. The form of precommitment required for CDT agents to do things like one-boxing is different from what humans can do.
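To make the causal argument above concrete, here is a minimal sketch of how a CDT agent scores the two actions once it is standing in front of the boxes. The payoffs (1,000,000 and 1,000) are the usual textbook values and are an assumption here, as are the names `cdt_expected_utility` and `cdt_choice`. The key modeling choice is that the credence `p_full` that the opaque box is full does not depend on the action taken now.

```python
# Assumed standard Newcomb payoffs (not stated in the thread).
BIG = 1_000_000   # opaque box, filled only if one-boxing was predicted
SMALL = 1_000     # transparent box, always present


def cdt_expected_utility(action: str, p_full: float) -> float:
    """Expected payoff when the box contents are treated as already fixed.

    p_full is the agent's credence that the opaque box is full. Under CDT
    it is the same number whichever action is evaluated, because the
    present action has no causal influence on the contents.
    """
    opaque = p_full * BIG
    return opaque + (SMALL if action == "two-box" else 0)


def cdt_choice(p_full: float) -> str:
    """Pick the action with the higher causal expected utility."""
    return max(("one-box", "two-box"),
               key=lambda a: cdt_expected_utility(a, p_full))


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        print(p, cdt_choice(p))
```

Whatever value `p_full` takes, two-boxing comes out ahead by exactly `SMALL`, so a remembered "precommitment" that leaves both actions available changes nothing in this calculation.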
I think, yes, but the right set of precommitments for all such problems is LDT.
I suspect that it is, though my inquiries as of yet are mostly in probability theory realm, not decision theory, so I may be missing some domain specific details.
It seems to me that we can reduce alternative decision theories such as FDT to CDT with a particular set of precommitments. And the ultimate decision theory is something like “I precommit to act in every decision problem the way I would wish I had precommitted to act in this particular decision problem”.
It seems to me that FDT has the property that you associate with the “ultimate decision theory”.
My understanding is that FDT says that you should follow the policy which is attained by taking the argmax over all policies of the utility from following that policy (only including downstream effects of your policy).
In these easy examples your policy space is your space of committed actions. In which case the above seems to reduce to the “ultimate decision theory” criterion.
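As a rough illustration of the "argmax over policies" reading above, here is a sketch for Newcomb's problem where the policy space is just the two committed actions. It is not a full formalization of FDT. The payoffs, the predictor accuracy parameter `acc`, and the function names are all assumptions made for the example. The difference from an act-by-act causal evaluation is that the prediction is modeled as tracking the policy being evaluated.

```python
# Assumed standard Newcomb payoffs (not stated in the thread).
BIG = 1_000_000
SMALL = 1_000

POLICIES = ("one-box", "two-box")


def policy_utility(policy: str, acc: float) -> float:
    """Expected payoff of committing to `policy`.

    Assumption: the predictor guesses the agent's policy correctly with
    probability `acc`, and fills the opaque box iff it predicts one-boxing.
    """
    if policy == "one-box":
        p_full = acc          # box is full when the prediction is correct
        return p_full * BIG
    p_full = 1.0 - acc        # box is full only when the predictor errs
    return p_full * BIG + SMALL


def best_policy(acc: float) -> str:
    """Argmax over the (tiny) policy space."""
    return max(POLICIES, key=lambda pol: policy_utility(pol, acc))


if __name__ == "__main__":
    print(best_policy(0.99))  # accurate predictor
    print(best_policy(0.5))   # predictor no better than a coin flip
```

With these numbers the one-boxing policy wins whenever `acc * BIG > (1 - acc) * BIG + SMALL`, that is, for accuracy above 0.5005. Below that, the prediction carries too little information and the two-boxing policy wins.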
I’m not sure I know what you mean by this, but if you mean causal effects, no, it considers all pasts, and all timelines.
(A reader might balk, “but that’s computationally infeasible”, but we’re talking about mathematical idealizations, and the mathematical idealization of CDT is also computationally infeasible. Once we’re talking about serious engineering projects to make implementable approximations of these things, you don’t know what’s going to be feasible.)
It seems so to me too, but I expect that there may be some nuance that makes this particular precommitment and therefore FDT not so ultimate after all.
But the point is that we can reduce FDT to CDT with precommitment, so if FDT is indeed the ultimate decision theory, then so is CDT+P.
It’s easy to frame Newcomb’s problem such that there’s no opportunity to precommit (and CDT generally doesn’t see any REASON to precommit there).
Can you give an example of a version of Parfit’s hitchhiker where CDT with precommitment will not see a reason to precommit to the deal?
Nope! Parfit’s Hitchhiker is designed to show exactly this. A CDT agent will desperately wish for some way to actually commit to paying.
I think some of the confusion in this thread is about what “CDT with precommitment (or really, commitment)” actually means. It doesn’t mean “intent” or “plan”. It means “force”: throw the steering wheel out the window, so there IS NO later decision. Note also that humans aren’t CDT agents, they’re some weird crap that you need to squint pretty hard to call “rational” at all.
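The "force, not intent" reading can be sketched for Parfit's hitchhiker as follows. This is a toy model under stated assumptions: the utilities `DEATH` and `FARE` are made up, the driver is assumed to predict the city action perfectly, and a binding commitment is modeled as removing "refuse" from the later option set. All names are hypothetical.

```python
DEATH = -1_000_000  # assumed utility of being left in the desert
FARE = -100         # assumed cost of paying once in the city


def cdt_city_action(options: list[str]) -> str:
    """CDT choice in the city.

    By then the rescue has already happened, so the only causal
    consequence of the action is whether the fare is paid.
    """
    payoff = {"pay": FARE, "refuse": 0}
    return max(options, key=lambda a: payoff[a])


def outcome(bound: bool) -> int:
    """Total utility for an agent that is or is not bound to pay."""
    # A binding commitment leaves no later decision: "refuse" is gone.
    options = ["pay"] if bound else ["pay", "refuse"]
    action = cdt_city_action(options)
    # Assumption: the driver rescues iff they foresee payment.
    rescued = action == "pay"
    return FARE if rescued else DEATH


def cdt_desert_choice() -> bool:
    """In the desert, binding oneself does cause the rescue, so CDT
    can evaluate it like any other action."""
    return max((False, True), key=outcome)


if __name__ == "__main__":
    print(outcome(False), outcome(True), cdt_desert_choice())
```

An unbound agent refuses in the city and is therefore never rescued, while a bound agent pays the fare. Evaluated in the desert, the binding option is better by CDT's own lights, which matches the point that a CDT agent would want a real commitment device and that a mere plan does not help.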