While we’re on the subject of decision theory… what is the difference between TDT and UDT?
Maybe the easiest way to understand UDT and TDT is:
UDT = EDT without updating on sensory inputs, with “actions” to be understood as logical facts about the agent’s outputs
TDT = CDT with “causality” to be understood as Pearl’s notion of causality plus additional arrows for logical correlations
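To make the structural side of these two definitions concrete, here is a rough schematic in Python. It is only a sketch: it deliberately leaves out the hard part (logical counterfactuals, and the extra “logical” arrows in TDT’s causal graph) and keeps just the contrast between conditioning on sensory input and choosing an act versus evaluating a whole input-output mapping without conditioning. Every name in it is made up for illustration.

```python
# A rough schematic of the two definitions above, for illustration only.
# It omits the hard part (logical counterfactuals / the extra "logical"
# arrows in TDT's causal graph) and keeps only the structural contrast.

def edt_style_choice(actions, worlds, prior, observation, consistent, utility):
    """EDT-style: condition the prior on the sensory input, then pick the
    single action with the highest conditional expected utility."""
    live = [w for w in worlds if consistent(w, observation)]
    z = sum(prior[w] for w in live)
    return max(actions, key=lambda a:
               sum((prior[w] / z) * utility(w, a) for w in live))

def udt_style_choice(policies, worlds, prior, policy_utility):
    """UDT-style: no conditioning on inputs; maximize the *prior* expected
    utility of the whole input->output mapping, i.e. of the logical fact
    'this is what the agent outputs on each possible input'."""
    return max(policies, key=lambda p:
               sum(prior[w] * policy_utility(w, p) for w in worlds))
```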
Comparing UDT and TDT directly, the main differences seem to be that UDT does not do Bayesian updating on sensory inputs and does not make use of causality. There seems to be general agreement that Bayesian updating on sensory inputs is wrong in a number of situations, but disagreement and/or confusion about whether we need causality. Gary Drescher put it this way:
Plus, if you did have a general math-counterfactual-solving module, why would you relegate it to the logical-dependency-finding subproblem in TDT, and then return to the original factored causal graph? Instead, why not cast the whole problem as a mathematical abstraction, and then directly ask your math-counterfactual-solving module whether, say, (Platonic) C’s one-boxing counterfactually entails (Platonic) $1M? (Then do the argmax over the respective math-counterfactual consequences of C’s candidate outputs.)
(Eliezer didn’t give an answer. ETA: He did answer a related question here.)
I can see what updating on sensory inputs does to TDT (causing it to fail counterfactual mugging). But what does it mean to say that TDT makes use of causality and UDT doesn’t? Are there any situations where this causes them to give different answers?
(I added a link at the end of the grandparent comment where Eliezer does give some of his thoughts on this issue.)
Are there any situations where this causes them to give different answers?
Eliezer seems to think that causality can help deal with Gary Drescher’s “5-and-10” problem:
But you would still have to factor out your logical uncertainty in a way which prevented you from concluding “if I choose A6, it must have had higher utility than A7” when considering A6 as an option (as Drescher observes).
But it seems possible to build versions of UDT that are free from such problems (such as the proof-based ones that cousin_it and Nesov have explored), although there are still some remaining issues with “spurious proofs” which may be related. In any case, it’s unclear how to get help from the notion of causality, and as far as I know, nobody has explored in that direction and reported back any results.
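For readers who haven’t seen the proof-based variants, here is a minimal sketch of their general shape. The interface and the toy “facts” below are purely illustrative; the real constructions run a bounded proof search (or consult a halting oracle) over a formal theory that contains the agent’s own source code.

```python
# Sketch of the proof-based decision rule: for each action a and utility u,
# look for a proof of "agent() == a  implies  universe() == u", then take
# the action backed by the highest provable utility.  `provable` is passed
# in as a black box; here it is just a lookup in a toy table of facts.

def proof_based_choice(actions, utilities, provable):
    best_action, best_u = None, float("-inf")
    for a in actions:
        for u in utilities:
            if u > best_u and provable(a, u):
                best_action, best_u = a, u
    return best_action

# Toy 5-and-10-style usage: taking $10 provably yields 10, taking $5 provably
# yields 5, and nothing else is provable, so the agent takes the $10.
toy_facts = {("take $10", 10), ("take $5", 5)}
print(proof_based_choice(["take $5", "take $10"], [5, 10],
                         provable=lambda a, u: (a, u) in toy_facts))
```

The “spurious proofs” issue mentioned above concerns what happens when the proof search can also establish strange implications about the action that ends up not being taken, which is why the actual versions have to be more careful than this sketch.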
I’m not an expert but I think this is how it works:
Both decision theories (TDT and UDT) work by imagining the problem from the point of view of themselves before the problem started. They then think “From this point of view, which sequence of decisions would be the best one?”, and then they follow that sequence of decisions. The difference is in how they react to randomness in the environment. When the algorithm is run, the agent is already midway through the problem, and so might have some knowledge that it didn’t have at the start of the problem (e.g. whether a coin flip came up heads or tails). When visualising themselves at the start of the problem, TDT assumes they have this knowledge, while UDT assumes they don’t.
An example is Counterfactual Mugging. Imagine that one day, Omega comes to you and says that it has just tossed a fair coin, and given that the coin came up tails, it has decided to ask you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don’t want to give up your $100. But Omega also tells you that if the coin had come up heads instead of tails, it would have given you $10000, but only if you would have agreed to give it $100 if the coin came up tails.
TDT visualises itself before the problem started, but knowing that the coin will come up tails. From this point of view the kind of agent that does well is the kind that refuses to give $100, and so that’s what TDT does.
UDT visualises itself before the problem started, and pretends it doesn’t know what the coin does. From this point of view the kind of agent that does well is the kind that gives $100 in the case of tails, so that’s what UDT does.
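To put numbers on this, here is the arithmetic for the two viewpoints (a minimal sketch, assuming the $10000/$100 payoffs and the fair coin from the story above; the Python names are just for illustration):

```python
# Worked numbers for the Counterfactual Mugging story above.  "pay" means the
# policy of giving Omega $100 when the coin comes up tails.

p_heads = 0.5

def prior_value(pay_on_tails):
    """Expected value judged from before the coin flip (the updateless view)."""
    heads_payoff = 10000 if pay_on_tails else 0   # Omega rewards only committed payers
    tails_payoff = -100 if pay_on_tails else 0
    return p_heads * heads_payoff + (1 - p_heads) * tails_payoff

def updated_value(pay_on_tails):
    """Expected value after updating on 'the coin came up tails'."""
    return -100 if pay_on_tails else 0

print(prior_value(True), prior_value(False))      # 4950.0 vs 0.0  -> pay
print(updated_value(True), updated_value(False))  # -100 vs 0      -> refuse
```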
Why do we still reference TDT so much if UDT is better?
Many people think of UDT as being a member of the “TDT branch of decision theories.” And in fact, much of what is now discussed as “UDT” (e.g. in A model of UDT with a halting oracle) is not Wei Dai’s first or second variant of UDT but instead a new variant of UDT sometimes called Ambient Decision Theory or ADT.
Follow-up: Is it in how they compute conditional probabilities in the decision algorithm? As I understand it, that’s how CDT and EDT and TDT differ.
I don’t think that is how CDT and EDT differ, actually. Instead, it’s that EDT cares about conditional probabilities and CDT doesn’t. For instance, in Newcomb’s problem, a CDT agent could agree that his expected utility is higher conditional on him one-boxing than it is conditional on him two-boxing. But he two-boxes anyway because the correlation isn’t causal. I guess TDT/UDT does compute conditional probabilities differently in the sense that they don’t pretend that their decisions are independent of the outputs of similar algorithms.
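To illustrate that point with a toy version of Newcomb’s problem (the 99% predictor accuracy and the $1,000,000 / $1,000 payoffs are assumptions for the sake of the example):

```python
# Toy Newcomb's problem: EDT compares conditional expectations, while CDT
# holds the (already-made) prediction fixed when evaluating its options.

accuracy = 0.99          # assumed P(predictor guessed X | agent does X)
big, small = 1_000_000, 1_000

def payoff(one_box, predicted_one_box):
    return (big if predicted_one_box else 0) + (0 if one_box else small)

def edt_value(one_box):
    """Conditional expectation: the prediction is correlated with the act."""
    p_pred_one = accuracy if one_box else 1 - accuracy
    return p_pred_one * payoff(one_box, True) + (1 - p_pred_one) * payoff(one_box, False)

def cdt_value(one_box, p_pred_one):
    """Causal expectation: the prediction is held fixed, whatever we do."""
    return p_pred_one * payoff(one_box, True) + (1 - p_pred_one) * payoff(one_box, False)

print(edt_value(True), edt_value(False))            # 990000.0 vs 11000.0 -> one-box
print(cdt_value(True, 0.5), cdt_value(False, 0.5))  # 500000.0 vs 501000.0 -> two-box
```

Under conditioning, one-boxing looks better because the prediction tracks the choice; under the causal calculation, where the prediction is held fixed, two-boxing comes out $1,000 ahead whatever the agent believes about the boxes, which is the sense in which the CDT agent can agree with the conditional numbers and still two-box.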