How is it different from Updateless decision theory? What’s the simplest problem in which they give different results?
They’re the same thing, it’s just a branding change.
[EDIT: never mind I was wrong] Come to think of it, I don’t know why the FDT paper did not make any reference to UDT or its main inventor Wei Dai (an inference based on a cursory Ctrl-F of the paper).
Nate says: “The main datapoint that Rob left out: one reason we don’t call it UDT (or cite Wei Dai much) is that Wei Dai doesn’t endorse FDT’s focus on causal-graph-style counterpossible reasoning; IIRC he’s holding out for an approach to counterpossible reasoning that falls out of evidential-style conditioning on a logically uncertain distribution. (FWIW I tried to make the formalization we chose in the paper general enough to technically include that possibility, though Wei and I disagree here and that’s definitely not where the paper put its emphasis. I don’t want to put words in Wei Dai’s mouth, but IIRC, this is also a reason Wei Dai declined to be listed as a co-author.)”
I actually think it’s a downgrade. It doesn’t include the fix Wei calls UDT 1.1: quantifying over all possible observation-action maps, instead of over possible actions for the observation you’ve actually received. The FDT paper has a footnote saying the fix would only matter for multi-agent problems, which is wrong. All my posts about UDT assume the fix as a matter of common sense.
Nate says: “You may have a scenario in mind that I overlooked (and I’d be interested to hear about it if so), but I’m not currently aware of a situation where the 1.1 patch is needed that doesn’t involve some sort of multi-agent coordination. I’ll note that a lot of the work that I (and various others) used to think was done by policy selection is in fact done by not-updating-on-your-observations instead. (E.g., FDT agents refuse blackmail because of the effects this has in the world where they weren’t blackmailed, despite how their observations say that that world is impossible.)”
Say there’s some logical random variable O you’re going to learn, which is either 0 or 1, with a prior 50% probability of being 1. After learning the value of this variable, you take action 0 or 1. Some predictor doesn’t know the value of this variable, but does know your source code. This predictor predicts p0 = P(you take action 1 | O = 0) and p1 = P(you take action 1 | O = 1). Your utility depends only on these predictions; specifically, it is p0 − 100(p0 − p1)².
This is a continuous coordination problem, and CDT-like graph intervention isn’t guaranteed to solve it, while policy selection is.
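To make the point above concrete, here is a minimal sketch of this problem (the names `p0`, `p1`, and `utility` are mine, not from the thread). Policy selection searches the whole observation-action map jointly and finds the coordinated optimum, whereas two observation-branches choosing independently have no guarantee of landing on matching probabilities, and the quadratic penalty for miscoordination is severe.

```python
# p0 = P(take action 1 | O = 0), p1 = P(take action 1 | O = 1).
def utility(p0, p1):
    return p0 - 100 * (p0 - p1) ** 2

# Policy selection (UDT 1.1 style): optimize over the whole
# observation -> action map at once, here by brute-force grid search.
grid = [i / 100 for i in range(101)]
best = max(((p0, p1) for p0 in grid for p1 in grid),
           key=lambda p: utility(*p))
print(best, utility(*best))  # -> (1.0, 1.0) 1.0

# Per-observation choice (UDT 1.0 style): each branch picks its own
# probability without coordinating. A miscoordinated pair is heavily
# penalized by the 100*(p0 - p1)^2 term:
print(utility(1.0, 0.0))  # -> -99.0
```

The joint optimum is p0 = p1 = 1, so the two branches must settle on the same (high) probability; nothing in branch-by-branch optimization forces that agreement.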
Cool. I hadn’t thought to frame those problems in predictor terms, and I agree now that “only matters in multi-agent dilemmas” is incorrect.
That said, it still seems to me like policy selection only matters in situations where, conceptually, winning requires something like multiple agents who run the same decision algorithm meeting and doing a bit of logically-prior coordination, and something kind of like this separates things like transparent Newcomb’s problem (where policy selection is not necessary) from the more coordination-shaped cases. The way the problems are classified in my head still involves me asking myself the question “well, do I need to get together and coordinate with all of the instances of me that appear in the problem logically-beforehand, or can we each individually wing it once we see our observations?”.
If anyone has examples where this classification is broken, I remain curious to hear them. Or, similar question: is there any disagreement on the weakened claim, “policy selection only matters in situations that can be transformed into multi-agent problems, where a problem is said to be ‘multi-agent’ if the winning strategy requires the agents to coordinate logically-before making their observations”?
but I’m not currently aware of a situation where the 1.1 patch is needed that doesn’t involve some sort of multi-agent coordination
I think the 1.1 patch is needed to solve problems with coordination/amnesia/prediction, and moreover these are all the same set of problems.
Coordination: two people wake up in rooms painted different colors (red and blue). Each is asked to choose a button (A or B). If they choose different buttons, both get $100. One possible winning strategy is red->A, blue->B.
Amnesia: on two consecutive days, you wake up with amnesia in rooms painted different colors and need to choose a button. If you choose different buttons on different days, you get $100. Winning strategy is same as above.
Prediction: you wake up in a room painted either red or blue and are asked to choose a button. At the same time, a predictor predicts what you would do if the room color was different. If that would lead to you choosing a different button, you get $100. Winning strategy is same as above.
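The claim that these three problems are the same can be seen by noting that one object solves all of them: a single observation-action map shared by every instance of the agent (two agents, two days, or the agent plus its predicted counterfactual). A trivial sketch (the dict representation is mine, purely illustrative):

```python
# A policy in the UDT 1.1 sense: one observation -> action map that every
# instance of the decision algorithm runs, whichever room it finds itself in.
policy = {"red": "A", "blue": "B"}  # one of the two winning policies

# All three problems reduce to the same condition on the map:
# the two observation-branches must output different buttons.
assert policy["red"] != policy["blue"]  # both instances get $100
```

UDT 1.0, which optimizes the action for the observation actually received, has no handle on this cross-branch condition; UDT 1.1 optimizes the map itself.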
Your comment here makes it sound like the FDT paper said “the difference between UDT 1.1 and UDT 1.0 isn’t important, so we’ll just endorse UDT 1.0”, where what the paper actually says is:
In the authors’ preferred formalization of FDT, agents actually iterate over policies (mappings from observations to actions) rather than actions. This makes a difference in certain multi-agent dilemmas, but will not make a difference in this paper. [...]
As mentioned earlier, the author’s preferred formulation of FDT actually intervenes on the node FDT(−) to choose not an action but a policy which maps inputs to actions, to which the agent then applies her inputs in order to select an action. The difference only matters in multi-agent dilemmas so far as we can tell, so we have set that distinction aside in this paper for ease of exposition.
I don’t know why it claims the difference only crops up in multi-agent dilemmas, if that’s wrong.
It does, on page 2: “Ideas reminiscent of FDT have been explored by many, including Spohn (2012), Meacham (2010), Yudkowsky (2010), Dai (2009), Drescher (2006), and Gauthier (1994).”
My model is that ‘FDT’ is used in the paper instead of ‘UDT’ because:
The name ‘UDT’ seemed less likely to catch on.
The term ‘UDT’ (and ‘modifier+UDT’) had come to refer to a bunch of very different things over the years. ‘UDT 1.1’ is a lot less ambiguous, since people are less likely to think that you’re talking about an umbrella category encompassing all the ‘modifier+UDT’ terms; but it’s a bit of a mouthful.
I’ve heard someone describe ‘UDT’ as “FDT + a theory of anthropics”—i.e., it builds in the core idea of what we’re calling “FDT” (“choose by imagining that your (fixed) decision function takes on different logical outputs”), plus a view to the effect that decisions+probutilities are what matter, and subjective expectations don’t make sense. Having a name for the FDT part of the view seems useful for evaluating the subclaims separately.
The FDT paper introduces the FDT/UDT concept in more CDT-ish terms (for ease of exposition), so I think some people have also started using ‘FDT’ to mean something like ‘variants of UDT that are more CDT-ish’, which is confusing given that FDT was originally meant to refer to the superset/family of UDT-ish views. Maybe that suggests that researchers feel more of a need for new narrow terms to fill gaps, since it’s less often necessary in the trenches to crisply refer to the superset.
I don’t suppose you could be clearer about how anthropics works in FDT? Like, are there any write-ups of how it solves any of the traditional anthropic paradoxes? Plus, why don’t subjective expectations make any sense?
I read it several months ago and was wondering the same thing. Although it’s more than a branding change: FDT is much more clearly put and much easier to understand.
Thanks heaps, that really makes it much less confusing!