Response to Section IV: FDT fails to get the answer Y&S want in most instances of the core example that's supposed to motivate it
I am basically sympathetic to this concern: I think there is a clear intuition that FDT two-boxes more often than we would like (and a clear formal picture, from toy formalisms in which FDT-ish decision theories fail on Agent Simulates Predictor problems).
Of course, it all depends on how logical counterfactuals are supposed to work. From a design perspective, I'm happy to take challenges like this as extra requirements for the behavior of logical counterfactuals, rather than as objections to the whole project. I intuitively think there is a notion of logical counterfactual which fails in this respect, but this does not mean there isn't some other notion which succeeds. Perhaps we can solve the easy problem of one-boxing with a strong predictor first, and then look for ways to one-box more generally (and in fact, this is what we've done: one-boxing with a strong predictor is not so difficult).
However, I do want to add that when Omega uses very weak prediction methods, such as in the examples Will gives, it is not so clear that we want to one-box. Will is presuming that Y&S simply want to one-box in any Newcomb problem. However, we could make a distinction between evidential Newcomb problems and functional Newcomb problems. Y&S already state that they consider some things to be functional Newcomb problems despite their not being evidential Newcomb problems (such as transparent Newcomb). It stands to reason that there are also some evidential Newcomb problems which are not functional Newcomb problems, and that Y&S would prefer not to one-box in such cases.
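To make the distinction concrete, here is a minimal sketch (my own toy model, not anything from Y&S or Will) of an "evidential but not functional" Newcomb problem: the predictor's guess is driven by a coarse reference-class statistic rather than by the agent's decision function, so conditioning on the action shifts the expected prediction, while an FDT-style counterfactual holds the prediction fixed. All numbers are illustrative assumptions.

```python
# Toy "evidential but not functional" Newcomb problem (illustrative numbers).
# The prediction depends on a coarse reference-class statistic, not on the
# agent's decision function.

BIG, SMALL = 1_000_000, 1_000

def payoff(action, predicted_onebox):
    base = BIG if predicted_onebox else 0
    return base + (SMALL if action == "two-box" else 0)

# Evidential expected value: conditioning on the action shifts the estimated
# prediction, because action and prediction are statistically correlated.
P_PRED_GIVEN = {"one-box": 0.9, "two-box": 0.1}

def edt_value(action):
    p = P_PRED_GIVEN[action]
    return p * payoff(action, True) + (1 - p) * payoff(action, False)

# FDT-style value *if* the prediction does not depend (logically or causally)
# on the agent's decision function: the prediction distribution is held
# fixed across counterfactual actions.
def fdt_value(action, p_pred_onebox=0.5):
    return p_pred_onebox * payoff(action, True) + (1 - p_pred_onebox) * payoff(action, False)

for act in ("one-box", "two-box"):
    print(act, edt_value(act), fdt_value(act))
# one-box  900000.0  500000.0
# two-box  101000.0  501000.0
```

Here conditioning favors one-boxing while the fixed-prediction calculation favors two-boxing, which is the sense in which the problem is evidential but (on this assumption) not functional.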
Will writes: "However, the predictor needn't be running your algorithm, or have anything like a representation of that algorithm, in order to predict whether you'll one-box or two-box. Perhaps the Scots tend to one-box, whereas the English tend to two-box."
In this example, it seems quite plausible that there's a (logico-causal) reason for the regularity, so that in the logical counterfactual where you act differently, your reference class also acts somewhat differently. Say you're Scottish, and 10% of Scots read a particular fairy tale growing up, and this is connected with why you two-box. Then in the counterfactual in which you one-box, it is quite possible that those 10% also one-box. Of course, this greatly weakens the connection between Omega's prediction and your action; perhaps a shift in 10% of the reference class is not enough to tip the scales in Omega's prediction.
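As a rough illustration of that last point, here is a toy calculation of my own, with made-up numbers (a 10% "fairy tale" subgroup, a 70% one-boxing rate among the rest, and a simple threshold rule for Omega). Whether the counterfactual shift in the reference class matters depends entirely on whether it crosses Omega's threshold.

```python
# Toy version of the reference-class point above; all numbers are
# illustrative assumptions, not from the post.

BIG, SMALL = 1_000_000, 1_000

def scots_onebox_fraction(my_action, other_rate=0.7, linked_frac=0.1):
    # The linked 10% (who share the fairy-tale influence) act as I do in the
    # logical counterfactual; the remaining 90% one-box at their usual rate.
    linked_rate = 1.0 if my_action == "one-box" else 0.0
    return (1 - linked_frac) * other_rate + linked_frac * linked_rate

def fdt_value(my_action, omega_threshold):
    frac = scots_onebox_fraction(my_action)
    predicted_onebox = frac > omega_threshold   # Omega's prediction for Scots
    base = BIG if predicted_onebox else 0
    return base + (SMALL if my_action == "two-box" else 0)

for threshold in (0.5, 0.7):
    print(threshold, {a: fdt_value(a, threshold) for a in ("one-box", "two-box")})
# With threshold 0.5, the shift (0.63 vs 0.73) never changes Omega's
# prediction, so two-boxing wins; with threshold 0.7, the counterfactual
# shift tips the prediction, and one-boxing wins.
```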
Will continues: "But, without any account of Y&S's notion of subjunctive counterfactuals, we just have no way of assessing whether that's true or not. Y&S note that specifying an account of their notion of counterfactuals is an 'open problem,' but the problem is much deeper than that. Without such an account, it becomes completely indeterminate what follows from FDT, even in the core examples that are supposed to motivate it — and that makes FDT not a new decision theory so much as a promissory note."
In the TDT document, Eliezer addresses this concern by pointing out that CDT also takes a description of the causal structure of a problem as given, which raises the question of how we learn causal counterfactuals in the first place. In this regard, FDT and CDT are on the same level of promissory-note-ness.
It might, of course, be thought much more plausible that a technique for learning physical-causal structure can be provided than that a technique for learning logical-counterfactual structure can.
I want to inject a little doubt about which is easier. If a robot is interacting with an exact simulation of itself (in an iterated prisoner's dilemma, say), won't it be easier for it to infer that it directly controls the copy than to figure out that the two are running on different computers and are thus causally independent?
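Here is a small illustration of that intuition (my own toy code, not anything from the post): a deterministic decision function plays an iterated prisoner's dilemma against an exact copy of itself. From the agent's observations, the opponent's move matches its own on every round, exactly as if its choice directly controlled the copy, even though the two instances are causally separate processes.

```python
# A deterministic agent playing an iterated prisoner's dilemma against an
# exact copy of its own decision function.

def decide(my_history, their_history):
    # A simple deterministic policy (tit-for-tat): cooperate first,
    # then mirror the opponent's last move.
    if not their_history:
        return "C"
    return their_history[-1]

def play(rounds=10):
    hist_a, hist_b = [], []
    for _ in range(rounds):
        a = decide(hist_a, hist_b)   # "my" instance
        b = decide(hist_b, hist_a)   # the copy, evaluated independently
        hist_a.append(a)
        hist_b.append(b)
    return hist_a, hist_b

hist_a, hist_b = play()
print(all(x == y for x, y in zip(hist_a, hist_b)))   # True: perfect match
# Nothing in the observed data distinguishes "I control the copy" from
# "the copy is a causally independent process computing the same function".
```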
Put more generally: logical uncertainty has to be handled one way or another; it cannot be entirely put aside. Existing methods of testing causality are not designed to deal with it. It stands to reason that such methods, applied naively to cases involving logical uncertainty, would treat that uncertainty like physical uncertainty, and would therefore tend to produce logical-counterfactual structure. This would not necessarily be very good for FDT's purposes, being the result of unprincipled accident; indeed, the concern about FDT's counterfactuals is precisely that there may be no principled foundation. Still, I tend to think that other decision theories merely brush the problem under the rug, and actually have to deal with logical counterfactuals one way or another.
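To gesture at why, consider a naive dependence test applied to data that contains only logical dependence (a sketch of my own, under the assumption that the predictor simply evaluates the same deterministic function as the agent on a different machine). The test finds that the prediction tracks the action perfectly, and a method that handles this the way it handles physical uncertainty will read off a counterfactual link from the agent's choice to the prediction.

```python
import random

# The agent and the predictor evaluate the same deterministic function of the
# agent's policy, on causally separate machines.

def agent_action(policy_bit):
    return "one-box" if policy_bit else "two-box"

def predictor_guess(policy_bit):
    return "one-box" if policy_bit else "two-box"

# Across episodes the policy varies; a naive dependence check asks whether
# the prediction co-varies with the action.
episodes = [(agent_action(b), predictor_guess(b))
            for b in (random.randint(0, 1) for _ in range(1000))]
match_rate = sum(a == g for a, g in episodes) / len(episodes)
print(match_rate)   # 1.0
# Treating this dependence like physical dependence amounts to concluding
# that changing the action changes the prediction, i.e. recovering
# logical-counterfactual structure by accident.
```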
Will writes: "Indeed, on the most plausible ways of cashing this out, it doesn't give the conclusions that Y&S would want. If I imagine the closest world in which 6288 + 1048 = 7336 is false (Y&S's example), I imagine a world with laws of nature radically unlike ours — because the laws of nature rely, fundamentally, on the truths of mathematics, and if one mathematical truth is false then either (i) mathematics as a whole must be radically different, or (ii) all mathematical propositions are true because it is simple to prove a contradiction and every proposition follows from a contradiction."
To this I can only say again that FDT’s problem of defining counterfactuals seems not so different to me from CDT’s problem. A causal decision theorist should be able to work in a mathematical universe; indeed, this seems rather consistent with the ontology of modern science, though not forced by it. I find it implausible that a CDT advocate should have to deny Tegmark’s mathematical universe hypothesis, or should break down and be unable to make decisions under the supposition. So, physical counterfactuals seem like they have to be at least capable of being logical counterfactuals (perhaps a different sort of logical counterfactual than FDT would use, since physical counterfactuals are supposed to give certain different answers, but a sort of logical counterfactual nonetheless).
(But this conclusion is far from obvious, and I don’t expect ready agreement that CDT has to deal with this.)