Eliezer addressed this in his TDT paper. It is always possible to hypothesize scenarios in which agents get punished or rewarded for arbitrary reasons. For instance, an AI could punish agents who make decisions based on the idea that their choices determine the results of abstract computations (as TDT agents do). That wouldn't show that TDT is a bad decision theory, or even that it's no better than any other theory.
If we restrict ourselves to action-determined and decision-determined problems (see Eliezer's TDT paper), we can say that TDT is better than CDT: it gets everything right that CDT gets right, and it also gets right some things that CDT gets wrong (for example, Newcomb's problem).
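To make the decision-determined framing concrete, here is a minimal toy sketch of Newcomb's problem. It assumes the standard payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and a perfect predictor; the function name and numbers are just illustrative, not anything from the paper itself.

```python
def newcomb_payoff(one_boxes: bool) -> int:
    """Payoff when a perfect predictor has already inspected the agent's policy."""
    # The opaque box is filled iff the predictor, simulating the agent's
    # decision procedure, expects one-boxing -- the outcome is determined by
    # the decision procedure, not caused by the physical act of choosing.
    big_box = 1_000_000 if one_boxes else 0
    small_box = 1_000
    return big_box if one_boxes else big_box + small_box

# CDT reasons causally: the boxes are already filled, so taking both dominates.
# TDT treats its choice as fixing the output of the computation the predictor ran.
print("two-box (CDT-style):", newcomb_payoff(one_boxes=False))  # -> 1000
print("one-box (TDT-style):", newcomb_payoff(one_boxes=True))   # -> 1000000
```

Within this class of problems the comparison is one-sided: on problems where prediction plays no role the two theories choose alike, and on problems like this one TDT does strictly better.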
Can you think of any way to set up a situation that punishes an NDT agent which doesn't reduce to an AI simply disliking NDT agents and arbitrarily trying to hurt them?
This sounds a lot like the objections CDT proponents raised against Newcomb's problem.