One problem is that in most cases, humans simply can’t “precommit” in the relevant sense. We can’t really (i.e. completely) move a decision from the future into the present.
This seems to me like a potential confusion of the normative and descriptive sides of things. Whether humans in practice perfectly follow a specific decision theory isn’t really relevant to the question of which decision theory an optimal agent should implement. If CDT+P is optimal and humans have trouble precommitting, that is a problem for humans, not for CDT+P. It’s a reason for humans to learn to precommit better.
When I think I have “precommitted” to do the dishes tomorrow, it is still the case that I will have to decide, tomorrow, whether or not to follow through with this “precommitment”. So I haven’t actually precommitted in the sense relevant for causal decision theory, which requires that the future decision has already been made and that nothing will be left to decide.
Unless you’ve actually precommitted to do the dishes, of course. Then your mind doesn’t even entertain the idea of not doing them.
Humans are imperfect precommitters, but neither are we completely unable to precommit. We do not evaluate every action we take at every moment of taking it. When you go somewhere, you do not interrogate yourself at every step about whether to continue. We have the ability to follow plans and to automatize some of our actions, and we can actively improve this ability by cultivating the relevant virtues. There is an obvious self-fulfilling component here: those who do not believe that they can precommit, and therefore do not try, indeed can’t. Those who actively try also fail sometimes, but they are less bad at precommitments and improve with time.
Being able to do arbitrary true precommitments could also be dangerous overall.
Of course. That’s why evolution gave us only a limited ability to precommit in the first place. And most of our precommitments are flexible enough. There is an implicit “unless something completely unexpected happens or I feel extremely bad, etc.” built into our promises by default, and it requires extra privileged access to our psyche to override these restrictions.
Moreover, it could lead to ruinous commitment races in bargaining situations.
Commitment races are an interesting topic. I believe there is a coherent way to resolve them by something like precommitting not to respond to threats and not to make threats yourself against those who would not respond to them, but I didn’t explore this beyond reading Project Lawful and superficially thinking about the relevant decision theory for a couple of minutes.
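To illustrate the “don’t respond to threats” idea, here is a minimal toy-model sketch in Python. The game structure and every payoff number are my own illustrative assumptions, not anything worked out in this thread: the point is only that when the target is credibly committed to resisting, issuing a threat can end only in a costly carry-out or a (slightly) costly back-down, so the threatener’s best option is not to threaten at all.

```python
# A toy threat game (all payoffs are illustrative assumptions).
# The threatener moves first: threaten or refrain. If threatened, the target
# either gives in or resists; if the target resists, the threatener either
# carries out the (costly) threat or backs down.

PAYOFFS = {
    # (threatener_action, target_action, follow_through): (threatener, target)
    ("refrain", None, None): (0, 0),
    ("threaten", "give_in", None): (5, -5),
    ("threaten", "resist", "carry_out"): (-2, -10),
    ("threaten", "resist", "back_down"): (-1, 0),
}

def threatener_best_payoff(target_resists: bool) -> int:
    """Best payoff the threatener can get, given the target's (pre)commitment."""
    if not target_resists:
        # Against a target who caves, threatening beats refraining.
        return max(PAYOFFS[("threaten", "give_in", None)][0],
                   PAYOFFS[("refrain", None, None)][0])
    # Against a committed resister, a threat ends in carrying out or backing down.
    threaten_value = max(PAYOFFS[("threaten", "resist", "carry_out")][0],
                         PAYOFFS[("threaten", "resist", "back_down")][0])
    return max(threaten_value, PAYOFFS[("refrain", None, None)][0])

print(threatener_best_payoff(target_resists=False))  # 5: threats pay against a target who caves
print(threatener_best_payoff(target_resists=True))   # 0: against a committed resister, refraining is best
```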
This seems to me like a potential confusion of the normative and descriptive sides of things. Whether humans in practice perfectly follow a specific decision theory isn’t really relevant to the question of which decision theory an optimal agent should implement.
For potential artificial agents this is true. But for already existing humans, what they should do, e.g. in Newcomb’s problem, depends on what they can do (ought implies can), and what they can do is a descriptive question.
When I think I have “precommitted” to do the dishes tomorrow, it is still the case that I will have to decide, tomorrow, whether or not to follow through with this “precommitment”. So I haven’t actually precommitted in the sense relevant for causal decision theory, which requires that the future decision has already been made and that nothing will be left to decide.
Unless you’ve actually precommitted to do the dishes, of course. Then your mind doesn’t even entertain the idea of not doing them.
Yes, but it normally doesn’t work like this. A decision still has to be made whether to do the dishes now.
Humans are imperfect precommitters, but neither are we completely unable to precommit. We do not evaluate every action we take at every moment of taking it. When you go somewhere, you do not interrogate yourself at every step about whether to continue. We have the ability to follow plans and to automatize some of our actions, and we can actively improve this ability by cultivating the relevant virtues. There is an obvious self-fulfilling component here: those who do not believe that they can precommit, and therefore do not try, indeed can’t. Those who actively try also fail sometimes, but they are less bad at precommitments and improve with time.
But this is very different from the sort of “precommitment” we are talking about in decision theory, or CDT in particular. In decision theory it is assumed that a “decision” means you definitely do it, not just with some probability. The probability is only in the outcomes. The decision is assumed to be final, not something you can change your mind about later.
The sort of limited “precommitment” we are talking about in humans is just a form of listening to the advice of your past self. The decision still has to be made in the present, and could very well disregard what your past self recommends. For example, when deciding to take one or both boxes in Newcomb’s problem, CDT requires you to look at the causal results of your actions. Listening now to the advice of your past self has no causal influence on the contents of the boxes. So following CDT still means you take both boxes, which means the colloquial form of human “precommitment” is useless here. The form of precommitment required for CDT agents to do things like one-boxing is different from what humans can do.
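To make the causal comparison concrete, here is a minimal Newcomb’s-problem sketch in Python. The payoff amounts and the 99% predictor accuracy are illustrative assumptions: holding the box contents fixed, two-boxing comes out ahead in both cases, which is exactly the comparison CDT makes; only when you condition the contents on your action, as an evidential calculation does, does one-boxing come out ahead.

```python
# Illustrative Newcomb's problem payoffs (assumed values, not from the discussion above)
BIG = 1_000_000    # contents of the opaque box if the predictor predicted one-boxing
SMALL = 1_000      # contents of the transparent box
ACCURACY = 0.99    # assumed reliability of the predictor

def payoff(action, opaque_filled):
    """Money received given the action and whether the opaque box was filled."""
    return (BIG if opaque_filled else 0) + (SMALL if action == "two-box" else 0)

# CDT-style evaluation: the boxes are already filled; your choice has no causal
# influence on their contents, so compare actions with the contents held fixed.
for opaque_filled in (True, False):
    print(opaque_filled,
          payoff("one-box", opaque_filled),
          payoff("two-box", opaque_filled))
# In both rows two-boxing pays SMALL more: causal dominance, so CDT two-boxes.

# Evidential-style evaluation: condition the box contents on the action,
# using the predictor's accuracy as the correlation.
def evidential_value(action):
    p_filled = ACCURACY if action == "one-box" else 1 - ACCURACY
    return p_filled * payoff(action, True) + (1 - p_filled) * payoff(action, False)

print(evidential_value("one-box"))   # 990,000
print(evidential_value("two-box"))   # 11,000
```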