There is no perfect decision procedure that is beneficial in all possible situations. If the environment can know the agent's decision procedure, it can be set up to minimize the agent's utility whenever the agent uses decision procedure X, and in that situation decision procedure X, whatever it is, does badly for the agent. So in order to know with certainty which decision procedure to use, you would have to know which situations are actually going to happen to you.
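A toy sketch of that point, in Python: the environment here conditions on which procedure the agent runs rather than on what the agent does, so whichever procedure we designate as the target does badly. The names (`adversarial_environment`, `cdt_agent`) are made up for illustration and are not from the TDT paper.

```python
# Toy illustration: for any decision procedure X, an environment that can
# inspect which procedure the agent runs can be constructed to punish X.
# All names are hypothetical; this is only a sketch of the argument.

def cdt_agent(observation):
    # Stand-in for some decision procedure X.
    return "two-box"

def adversarial_environment(agent, target_procedure):
    # The environment conditions directly on *which procedure* the agent runs,
    # not on the action the agent takes.
    if agent is target_procedure:
        return -100  # punish agents running procedure X, whatever X is
    return 100

# Whatever procedure we pick as the target, that procedure loses here, so no
# single procedure can be optimal across all environments of this kind.
print(adversarial_environment(cdt_agent, target_procedure=cdt_agent))  # -100
```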
Eliezer talked about this in his TDT paper. It is possible to hypothesize scenarios where agents get punished or rewarded for arbitrary reasons. For instance, an AI could punish agents who make decisions based on the idea that their choices determine the results of abstract computations (as in TDT). This wouldn't show that TDT is a bad decision theory, or even that it's no better than any other theory.
If we restrict ourselves to action-determined and decision-determined problems (see Eliezer’s TDT paper) we can say that TDT is better than CDT, because it gets everything right that CDT gets right, plus it gets right some things that CDT gets wrong.
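Newcomb's problem (mentioned below) is the standard decision-determined example where TDT gets the right answer and CDT doesn't. A minimal sketch, assuming a perfect predictor and the usual payoffs ($1,000,000 in the opaque box iff one-boxing was predicted, $1,000 always in the transparent box); the function name is made up for illustration:

```python
# Newcomb's problem as a decision-determined problem, assuming a perfect
# predictor, so the predicted choice always matches the actual choice.

def newcomb_payoff(choice, predicted_choice):
    opaque = 1_000_000 if predicted_choice == "one-box" else 0
    transparent = 1_000
    return opaque + (transparent if choice == "two-box" else 0)

tdt_choice = "one-box"   # TDT's recommendation
cdt_choice = "two-box"   # CDT's recommendation

print(newcomb_payoff(tdt_choice, tdt_choice))  # 1000000
print(newcomb_payoff(cdt_choice, cdt_choice))  # 1000
```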
Can you think of any way to set up a situation that punishes an NDT agent which doesn't reduce to an AI simply not liking NDT agents and arbitrarily trying to hurt them?
This sounds a lot like the objections CDT people were giving to Newcomb's problem.