I think the difference between “FDT” and “CDT”[1] in these scenarios can be framed as a difference in preferences. “FDT” values all copies of itself equally; “CDT” has indexical values, only caring about the particular copy it actually finds itself to be. As such, the debate over which is more “rational” mostly comes down to a semantic dispute.
(I’ll be using “UDT” below, but I think the same issue applies to all subsequent variants, such as FDT, that kept the “updateless” feature.)
I think this is a fair point. It’s not the only difference between CDT and UDT, but it does seem to account for why many people find UDT counterintuitive. I made a similar point in this comment. I do disagree with “As such, the debate over which is more ‘rational’ mostly comes down to a semantic dispute,” though. There are definitely some substantive issues here.
(A nit first: it’s not that UDT must value all copies of oneself equally, but that it is incompatible with indexical values. You can have a UDT utility function that values different copies differently; it just has to be fixed for all time instead of changing based on what you observe.)
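As a toy illustration of that distinction (the copy labels, weights, and payoffs below are assumed purely for the example):

```python
# Toy contrast between a UDT-compatible utility function and indexical values.
# The copy labels, weights, and payoffs are assumed purely for illustration.

FIXED_WEIGHTS = {"copy_A": 0.7, "copy_B": 0.3}  # chosen once, never updated

def udt_utility(outcomes):
    """UDT-compatible: copies may be weighted unequally, but the weights
    are fixed up front and never depend on which copy you observe being."""
    return sum(FIXED_WEIGHTS[copy] * payoff for copy, payoff in outcomes.items())

def indexical_utility(outcomes, observed_self):
    """Indexical: after learning which copy you are, only that copy's
    payoff counts. This is the kind of utility UDT can't accommodate."""
    return outcomes[observed_self]

outcomes = {"copy_A": 10.0, "copy_B": -5.0}
print(udt_utility(outcomes))                  # 5.5, regardless of who you are
print(indexical_utility(outcomes, "copy_B"))  # -5.0, changes with observation
```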
I think humans do seem to have indexical values, but what to do about that is a big open problem in decision theory. “Just use CDT” is unsatisfactory, because as soon as someone could self-modify, they would have an incentive to modify themselves to no longer use CDT (and no longer have indexical values). I’m not sure what further implications that has, though. (See the above-linked post, where I talked about this puzzle in a bit more detail.)
I’m surprised Wei Dai thinks this is a fair point. I disagree entirely with it: FDT is a decision theory and doesn’t in and of itself value anything. The values need to be given by a utility function.
Consider the Psychological Twin Prisoner’s Dilemma. Given the utility function used there, the agent doesn’t value the twin at all: the agent just wants to go home free as soon as possible. FDT doesn’t change this: it just recognizes that the twin makes the same decision the agent does, which bears on the prison time the agent gets.
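To make that concrete, here is a minimal sketch with assumed payoff numbers (the standard prisoner’s-dilemma prison sentences). The utility function is purely selfish, yet FDT still ends up cooperating, because the twin’s mirrored decision changes which outcomes are attainable:

```python
# Psychological Twin Prisoner's Dilemma, with an entirely selfish utility
# function: minimize your own years in prison. Payoff numbers are assumed.

YEARS = {  # (my_action, twin_action) -> my years in prison
    ("C", "C"): 1,  # both stay silent
    ("C", "D"): 3,  # I stay silent, twin defects
    ("D", "C"): 0,  # I defect, twin stays silent
    ("D", "D"): 2,  # both defect
}

def cdt_choice():
    # CDT holds the twin's action fixed; defecting dominates either way.
    return {twin: min(("C", "D"), key=lambda me: YEARS[(me, twin)])
            for twin in ("C", "D")}

def fdt_choice():
    # FDT notes the twin runs the same decision procedure, so choosing
    # an action effectively chooses the twin's action too.
    return min(("C", "D"), key=lambda act: YEARS[(act, act)])

print(cdt_choice())  # {'C': 'D', 'D': 'D'} -- defect no matter what
print(fdt_choice())  # 'C' -- one year beats two, selfishly
```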
“FDT is a decision theory and doesn’t in and of itself value anything. The values need to be given by a utility function.”
I explicitly said that this difference in values is meant to reproduce the way that FDT/CDT are usually argued to act in these sorts of scenarios, but is actually orthogonal to decision theory per se.
“Psychological Twin Prisoner’s Dilemma”
This scenario is a stronger one for FDT as a decision theory. But it’s not the sort of scenario I was referring to: the argument in my comment applies to scenarios where one of the copies makes itself worse off to benefit the others, like the Bomb or transparent Newcomb. Those were the main topic of discussion in the post, and I still think it’s accurate to say that the difference in intuitions between CDTists and FDTists there comes down to a values/semantic dispute.
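For the Bomb specifically, the disagreement can be stated as an expected-value calculation over the whole policy. A back-of-the-envelope sketch (the one-in-a-trillion-trillion error rate and the $100 cost follow the usual setup; the disutility assigned to burning is an arbitrary stand-in I picked for illustration):

```python
# Bomb: the predictor almost never errs; Right costs $100; Left is free
# unless it contains a bomb. The burn disutility below is an assumed stand-in.

EPS = 1e-24          # predictor's error rate (one in a trillion trillion)
COST_RIGHT = -100.0  # pay $100 for the safe box
COST_BURN = -1e12    # assumed (very large) disutility of burning to death

# Policy "always take Right": you always pay $100.
ev_right = COST_RIGHT

# Policy "always take Left": the bomb is only there when the predictor errs.
ev_left = EPS * COST_BURN  # ~ -1e-12

print(ev_left > ev_right)  # True: ex ante, committed Left-takers do better
```

The copy that actually sees the bomb is the one-in-a-trillion-trillion unlucky one; whether it is “rational” for that copy to pay with its life for the policy’s ex-ante gain is exactly where the values/semantic dispute bites.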
[1] Quotation marks because this difference in preferences is really orthogonal to CDT/FDT, but it reproduces the way they are usually argued to act.