I’m not an expert but I think this is how it works:
Both decision theories (TDT and UDT) work by imagining the problem from the point of view of their past self, before the problem started. They then ask, “From this point of view, which sequence of decisions would be the best one?”, and follow that sequence of decisions. The difference is in how they treat randomness in the environment. When the algorithm is run, the agent is already midway through the problem, and so may have knowledge it didn’t have at the start of the problem (e.g. whether a coin flip came up heads or tails). When visualising itself at the start of the problem, TDT assumes it has this knowledge; UDT assumes it doesn’t.
An example is Counterfactual Mugging:
Imagine that one day Omega comes to you and says that it has just tossed a fair coin, and because the coin came up tails, it is asking you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don’t want to give up your $100. But Omega also tells you that if the coin had come up heads instead of tails, it would have given you $10,000, but only if you would have agreed to give it $100 in the tails case.
TDT visualises itself before the problem started, knowing that the coin will come up tails. From that point of view, the kind of agent that does well is the kind that refuses to give $100, and so that’s what TDT does.
UDT visualises itself before the problem started and pretends it doesn’t know how the coin landed. From that point of view, the kind of agent that does well is the kind that gives $100 in the case of tails, so that’s what UDT does.
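To make this concrete, here is a minimal sketch (my own illustration, not anyone’s official formalization) of the two viewpoints applied to the story above. Each policy is scored either conditional on tails (TDT’s imagined past self, which already knows the outcome) or averaged over the fair coin (UDT’s imagined past self, which doesn’t):

```python
# Payoffs from the Counterfactual Mugging story: "pay" means you give
# $100 on tails, so Omega would have given you $10,000 on heads.
prior = {"heads": 0.5, "tails": 0.5}
payoffs = {
    "pay":    {"heads": 10_000, "tails": -100},
    "refuse": {"heads": 0,      "tails": 0},
}

# TDT's viewpoint: the imagined past self already knows the coin came up
# tails, so it scores each policy only on the tails outcome.
tdt_choice = max(payoffs, key=lambda p: payoffs[p]["tails"])

# UDT's viewpoint: the imagined past self averages over both outcomes
# using the prior, ignoring what has actually been observed.
def expected(p):
    return sum(prior[o] * payoffs[p][o] for o in prior)

udt_choice = max(payoffs, key=expected)

print(tdt_choice)  # "refuse": 0 beats -100 once tails is known
print(udt_choice)  # "pay": 0.5*10000 + 0.5*(-100) = 4950 beats 0
```

The arithmetic is the whole disagreement: conditional on tails, paying loses $100, but averaged over the coin, the paying policy is worth $4,950 in expectation.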
Why do we still reference TDT so much if UDT is better?
Many people think of UDT as being a member of the “TDT branch of decision theories.” And in fact, much of what is now discussed as “UDT” (e.g. in “A model of UDT with a halting oracle”) is not Wei Dai’s first or second variant of UDT but instead a new variant sometimes called Ambient Decision Theory, or ADT.