Here are some (very lightly edited) comments I left on Will’s draft of this post. (See also my top-level response.)
Responses to Sections II and III:
I’m not claiming that it’s clear what this means. E.g. see here, second bullet point, arguing there can be no such probability function, because any probability function requires certainty in logical facts and all their entailments.
This point shows the intertwining of logical counterfactuals (counterpossibles) and logical uncertainty. I take logical induction to represent significant progress in generalizing probability theory to the case of logical uncertainty, i.e., it provides objects which have many of the virtues of probability functions while not requiring certainty about the entailments of known facts. So, we can substantially reply to this objection.
However, replying to this objection does not necessarily mean we can define logical counterfactuals as we would want. So far we have only been able to use logical induction to specify a kind of “logically uncertain evidential conditional”. (I.e., something closer in spirit to EDT, which does behave more like FDT in some problems but not in general.)
I want to emphasize that I agree that specifying what logical counterfactuals are is a grave difficulty, so grave as to seem (to me, at present) to be damning, provided one can avoid the difficulty in some other approach. However, I don’t actually think that the difficulty can be avoided in any other approach! I think CDT ultimately has to grapple with the question as well, because physics is math, and so physical counterfactuals are ultimately mathematical counterfactuals. Even EDT has to grapple with this problem, ultimately, due to the need to handle cases where one’s own action can be logically known. (Or provide a convincing argument that such cases cannot arise, even for an agent which is computable.)
Guaranteed Payoffs: In conditions of certainty, that is, when the decision-maker has no uncertainty about what state of nature she is in, and no uncertainty about what the utility payoff of each action is, the decision-maker should choose the action that maximises utility.
(Obligatory remark that what maximizes utility is part of what’s at issue here, and for precisely this reason, an FDTist could respond that it’s CDT and EDT which fail in the Bomb example—by failing to maximize the a priori expected utility of the action taken.)
FDT would disagree with this principle in general, since full certainty implies certainty about one’s action, and the utility to be received, as well as everything else. However, I think we can set that aside and say there’s a version of FDT which would agree with this principle in terms of prior uncertainty. It seems cases like Bomb cannot be set up without either invoking prior uncertainty (taking the form of the predictor’s failure rate) or bringing the question of how to deal with logically impossible decisions to the forefront (if we consider the case of a perfect predictor).
Why should prior uncertainty be important, in cases of posterior certainty? Because of the prior-optimality notion (in which a decision theory is judged on a decision problem based on the utility received in expectation according to the prior probability which defines the decision problem).
Prior-optimality considers the guaranteed-payoff objection to be very similar to objecting to a gambling strategy by pointing out that the gambling strategy sometimes loses. In Bomb, the problem clearly stipulates that an agent who follows the FDT recommendation has trillion-trillion-to-one odds of doing better than an agent who follows the CDT/EDT recommendation. Complaining about the one-in-a-trillion-trillion chance that you get the bomb while being the sort of agent who takes the bomb is, to an FDT-theorist, like a gambler who has just lost a trillion-trillion-to-one bet complaining that the bet doesn’t look so rational now that the outcome is known with certainty to be the one-in-a-trillion-trillion case where the bet didn’t pay well.
The right action, according to FDT, is to take Left, in the full knowledge that as a result you will slowly burn to death. Why? Because, using Y&S’s counterfactuals, if your algorithm were to output ‘Left’, then it would also have outputted ‘Left’ when the predictor made the simulation of you, and there would be no bomb in the box, and you could save yourself $100 by taking Left.
And why, on your account, is this implausible? To my eye, this is right there in the decision problem, not a weird counterintuitive consequence of FDT: the decision problem stipulates that algorithms which output ‘left’ will not end up in the situation of taking a bomb, with very, very high probability.
Again, complaining that you now know with certainty that you’re in the unlucky position of seeing the bomb seems irrelevant in the way that a gambler complaining that they now know how the dice fell seems irrelevant—it’s still best to gamble according to the odds, taking the option which gives the best chance of success.
(But what I most want to convey here is that there is a coherent sense in which FDT does the optimal thing, whether or not one agrees with it.)
One way of thinking about this is to say that the FDT notion of “decision problem” is different from the CDT or EDT notion, in that FDT considers the prior to be of primary importance, whereas CDT and EDT consider it to be of no importance. If you had instead specified ‘bomb’ with just the certain information that ‘left’ is (causally and evidentially) very bad and ‘right’ is much less bad, then CDT and EDT would regard it as precisely the same decision problem, whereas FDT would consider it to be a radically different decision problem.
Another way to think about this is to say that FDT “rejects” decision problems which are improbable according to their own specification. In cases like Bomb, where the situation as described has, by its own description, a one-in-a-trillion-trillion chance of occurring, FDT gives the outcome only one-trillion-trillionth of the weight in the expected utility calculation when deciding on a strategy.
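To make the weighting concrete, here is a minimal sketch of the prior-level calculation for Bomb. The failure rate and the $100 cost come from the problem statement; the dollar-equivalent disutility of burning to death (`U_DEATH`) is purely my assumption for illustration, as are all the function names. The point is only that the policy-level comparison and the "now that I see the bomb" comparison come apart.

```python
from fractions import Fraction

EPSILON = Fraction(1, 10**24)  # predictor failure rate from the problem statement
COST_RIGHT = -100              # taking Right costs $100
U_DEATH = -10**9               # ASSUMPTION: hypothetical dollar-equivalent of burning to death


def payoff(action, bomb_in_left):
    """Utility of an action given whether Left actually contains a bomb."""
    if action == "right":
        return COST_RIGHT
    return U_DEATH if bomb_in_left else 0


def prior_expected_utility(policy):
    """Expected utility of always following `policy`, judged from the prior.

    The predictor puts a bomb in Left iff it predicts "right", and its
    prediction matches the policy with probability 1 - EPSILON.
    """
    p_bomb = (1 - EPSILON) if policy == "right" else EPSILON
    return p_bomb * payoff(policy, True) + (1 - p_bomb) * payoff(policy, False)


def posterior_utility(action):
    """Utility of an action once the bomb in Left has been seen with certainty."""
    return payoff(action, True)


if __name__ == "__main__":
    for policy in ("left", "right"):
        print(policy, float(prior_expected_utility(policy)), posterior_utility(policy))
```

Under these assumed numbers, the prior-level comparison favors Left while the posterior-level comparison favors Right. With this setup, Left stays prior-optimal as long as the assumed disutility of death is smaller in magnitude than 100 / EPSILON.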
Also, I note that this analysis (on the part of FDT) does not hinge in this case on exotic counterfactuals. If you set Bomb up in the Savage framework, you would be forced to either give only the certain choice between bomb and not-bomb (so you don’t represent the interesting part of the problem, involving the predictor) or to give the decision in terms of the prior, in which case the Savage framework would endorse the FDT recommendation.
Another framework in which we could arrive at the same analysis would be that of single-player extensive-form games, in which the FDT recommendation corresponds to the simple notion of optimal strategy, whereas the CDT recommendation amounts to the stipulation of subgame-optimality.
FDT fails to get the answer Y&S want in most instances of the core example that’s supposed to motivate it
I am basically sympathetic to this concern: I think there’s a clear intuition that FDT is 2-boxing more than we would like (and a clear formal picture, in toy formalisms which show FDT-ish DTs failing on Agent Simulates Predictor problems).
Of course, it all depends on how logical counterfactuals are supposed to work. From a design perspective, I’m happy to take challenges like this as extra requirements for the behavior of logical counterfactuals, rather than objections to the whole project. I intuitively think there is a notion of logical counterfactual which fails in this respect, but this does not mean there isn’t some other notion which succeeds. Perhaps we can solve the easy problem of one-boxing with a strong predictor first, and then look for ways to one-box more generally (and in fact, this is what we’ve done: one-boxing with a strong predictor is not so difficult).
However, I do want to add that when Omega uses very weak prediction methods such as the examples given, it is not so clear that we want to one-box. Will is presuming that Y&S simply want to one-box in any Newcomb problem. However, we could make a distinction between evidential Newcomb problems and functional Newcomb problems. Y&S already state that they consider some things to be functional Newcomb problems despite them not being evidential Newcomb problems (such as transparent Newcomb). It stands to reason that there would be some evidential Newcomb problems which are not functional Newcomb problems, as well, and that Y&S would prefer not to one-box in such cases.
However, the predictor needn’t be running your algorithm, or have anything like a representation of that algorithm, in order to predict whether you’ll one-box or two-box. Perhaps the Scots tend to one-box, whereas the English tend to two-box.
In this example, it seems quite plausible that there’s a (logico-causal) reason for the regularity, so that in the logical counterfactual where you act differently, your reference class also acts somewhat differently. Say you’re Scottish, and 10% of Scots read a particular fairy tale growing up, and this is connected with why you two-box. Then in the counterfactual in which you one-box, it is quite possible that those 10% also one-box. Of course, this greatly weakens the connection between Omega’s prediction and your action; perhaps the change of 10% is not enough to tip the scales in Omega’s prediction.
But, without any account of Y&S’s notion of subjunctive counterfactuals, we just have no way of assessing whether that’s true or not. Y&S note that specifying an account of their notion of counterfactuals is an ‘open problem,’ but the problem is much deeper than that. Without such an account, it becomes completely indeterminate what follows from FDT, even in the core examples that are supposed to motivate it — and that makes FDT not a new decision theory so much as a promissory note.
In the TDT document, Eliezer addresses this concern by pointing out that CDT also takes a description of the causal structure of a problem as given, begging the question of how we learn causal counterfactuals. In this regard, FDT and CDT are on the same level of promissory-note-ness.
It might, of course, be taken as much more plausible that a technique of learning the physical-causal structure can be provided, in contrast to a technique which learns the logical-counterfactual structure.
I want to inject a little doubt about which is easier. If a robot is interacting with an exact simulation of itself (in an iterated prisoner’s dilemma, say), won’t it be easier to infer that it directly controls the copy than it is to figure out that the two are running on different computers and thus causally independent?
Put more generally: logical uncertainty has to be handled one way or another; it cannot be entirely put aside. Existing methods of testing causality are not designed to deal with it. It stands to reason that such methods applied naively to cases including logical uncertainty would treat such uncertainty like physical uncertainty, and therefore tend to produce logical-counterfactual structure. This would not necessarily be very good for FDT purposes, being the result of unprincipled accident—and the concern for FDT’s counterfactuals is that there may be no principled foundation. Still, I tend to think that other decision theories merely brush the problem under the rug, and actually have to deal with logical counterfactuals one way or another.
Indeed, on the most plausible ways of cashing this out, it doesn’t give the conclusions that Y&S would want. If I imagine the closest world in which 6288 + 1048 = 7336 is false (Y&S’s example), I imagine a world with laws of nature radically unlike ours, because the laws of nature rely, fundamentally, on the truths of mathematics, and if one mathematical truth is false then either (i) mathematics as a whole must be radically different, or (ii) all mathematical propositions are true because it is simple to prove a contradiction and every proposition follows from a contradiction.
To this I can only say again that FDT’s problem of defining counterfactuals seems not so different to me from CDT’s problem. A causal decision theorist should be able to work in a mathematical universe; indeed, this seems rather consistent with the ontology of modern science, though not forced by it. I find it implausible that a CDT advocate should have to deny Tegmark’s mathematical universe hypothesis, or should break down and be unable to make decisions under the supposition. So, physical counterfactuals seem like they have to be at least capable of being logical counterfactuals (perhaps a different sort of logical counterfactual than FDT would use, since physical counterfactuals are supposed to give certain different answers, but a sort of logical counterfactual nonetheless).
(But this conclusion is far from obvious, and I don’t expect ready agreement that CDT has to deal with this.)
An alternative approach that captures the spirit of FDT’s aims
I’m somewhat confused about how you can buy FDT as far as you seem to buy it in this section, while also claiming not to understand FDT to the point of saying there is no sensible perspective at all in which it can be said to achieve higher utility. From the perspective in this section, it seems you can straightforwardly interpret FDT’s notion of expected utility maximization via an evaluative focal point such as “the output of the algorithm given these inputs”.
This evaluative focal point addresses the concern you raise about how bounded ability to implement decision procedures interacts with a “best decision procedure” evaluative focal point (making it depart from FDT’s recommendations in so far as the agent can’t manage to act like FDT), since those concerns don’t arise (at least not so clearly) when we consider what FDT would recommend for the response to one situation in particular. On the other hand, we also can make sense of the notion that taking the bomb is best, since (according to both global-CDT and global-EDT) it is best for an algorithm to output “left” when given the inputs of the bomb problem (in that it gives us the best news about how that agent would do in bomb problems, and causes the agent to do well when put in bomb problems, in so far as a causal intervention on the output of the algorithm also affects a predictor running the same algorithm).
I’m puzzled by this concern. Is the doctrine of expected utility plagued by a corresponding ‘implausible discontinuity’ problem because if action 1 has expected value .999 and action 2 has expected value 1, then you should take action 2, but a very small change could mean you should take action 1? Is CDT plagued by an implausible-discontinuity problem because two problems which EDT would treat as the same will differ in causal expected value, and there must be some in-between problem where uncertainty about the causal structure balances between the two options, so CDT’s recommendation implausibly makes a sharp shift when the uncertainty is jiggled a little? Can’t we similarly boggle at the implausibility that a tiny change in the physical structure of a problem should make such a large difference in the causal structure so as to change CDT’s recommendation? (For example, the tiny change can be a small adjustment to the coin which determines which of two causal structures will be in play, with no overall change in the evidential structure.)
It seems like what you find implausible about FDT here has nothing to do with discontinuity, unless you find CDT and EDT similarly implausible.
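As a small illustration of why this kind of sharp shift is generic to expected utility maximization, here is a sketch in which a coin with bias `p` selects between two causal structures. The payoffs, the action names, and the coin setup are all hypothetical, chosen only to show the recommended action flipping at a threshold under an arbitrarily small change in `p`.

```python
from fractions import Fraction


def best_action(expected_utilities):
    """Return the action with the highest expected utility."""
    return max(expected_utilities, key=expected_utilities.get)


def choice_at(p):
    """Recommended action when the coin selects structure A with probability p.

    ASSUMPTION (illustrative payoffs): action_1 pays 1 under structure A and
    0 under structure B; action_2 pays a flat 1/2 under either structure.
    """
    return best_action({
        "action_1": p * 1 + (1 - p) * 0,
        "action_2": Fraction(1, 2),
    })


if __name__ == "__main__":
    for p in (Fraction(499, 1000), Fraction(501, 1000)):
        print(p, choice_at(p))
```

Nothing here is specific to FDT: any theory that picks the maximum of continuously varying expected utilities will switch recommendations abruptly at the crossover point.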
FDT is deeply indeterminate
This is obviously a big challenge for FDT; we don’t know what logical counterfactuals look like, and invoking them is problematic until we do.
However, I can point to some toy models of FDT which lend credence to the idea that there’s something there. The most interesting may be MUDT (see the “modal UDT” section of this summary post). This decision theory uses the notion of “possible” from the modal logic of provability, so that despite being a deterministic agent and therefore only taking one particular action in fact, agents have a well-defined possible-world structure to consider in making decisions, derived from what they can prove.
I have a post planned that focuses on a different toy model, single-player extensive-form games. This has the advantage of being only as exotic as standard game theory.
In both of these cases, FDT can be well-specified (at least, to the extent we’re satisfied with calling the toy DTs examples of FDT—which is a bit awkward, since FDT is kind of a weird umbrella term for several possible DTs, but also kind of specifically supposed to use functional graphs, which MUDT doesn’t use).
It bears mentioning that a Bayesian already regards the probability distribution representing a problem to be deeply indeterminate, so this seems less bad if you start from such a perspective. Logical counterfactuals can similarly be thought of as subjective objects, rather than some objective fact which we have to uncover in order to know what FDT does.
On the other hand, greater indeterminacy is still worse; just because we already have lots of degrees of freedom to mess ourselves up with doesn’t mean we happily accept even more.
And in general, it seems to me, there’s no fact of the matter about which algorithm a physical process is implementing in the absence of a particular interpretation of the inputs and outputs of that physical process.
Part of the reason that I’m happy for FDT to need such a fact is that I think I need such a fact anyway, in order to deal with anthropic uncertainty, and other issues.
If you don’t think there’s such a fact, then you can’t take a computationalist perspective on theory of mind, in which case I wonder what position you take on questions such as consciousness. Obviously this leads to a number of questions which are quite aside from the point at hand, but I would personally think that questions such as whether an organism is experiencing suffering have to do with what computations are occurring. This ultimately cashes out to physical facts, yes, but it seems as if suffering should be a fundamentally computational fact which cashes out in terms of physical facts only in a substrate-independent way (i.e., the physical facts of importance are precisely those which pertain to the question of which computation is running).
But almost all accounts of computation in physical processes have the issue that very many physical processes are running very many different algorithms, all at the same time.
Indeed, I think this is one of the main obstacles to a satisfying account—a successful account should not have this property.
Assessing by how well the decision-maker does in possible worlds that she isn’t in fact in doesn’t seem a compelling criterion (and EDT and CDT could both do well by that criterion, too, depending on which possible worlds one is allowed to pick).
You make the claim that EDT and CDT can claim optimality in exactly the same way that FDT can, here, but I think the arguments are importantly not symmetric. CDT and EDT are optimal according to their own optimality notions, but given the choice to implement different decision procedures on later problems, both the CDT and EDT optimality notions would endorse selecting FDT over themselves in many of the problems mentioned in the paper, whereas FDT will endorse itself.
Most of this section seems to me to be an argument to make careful level distinctions, in an attempt to avoid the level-crossing argument which is FDT’s main appeal. Certainly, FDTers such as myself will often use language which confuses the various levels, since we take a position which says they should be confusable—the best decision procedures should follow the best policies, which should take the best actions. But making careful level distinctions does not block the level-crossing argument, it only clarifies it. FDT may not be the only “consistent fixed-point of normativity” (to the extent that it even is that), but CDT and EDT are clearly not that.
Fourth, arguing that FDT does best in a class of ‘fair’ problems, without being able to define what that class is or why it’s interesting, is a pretty weak argument.
I basically agree that the FDT paper dropped the ball here, in that it could have given a toy setting in which ‘fair’ is rigorously defined (in a pretty standard game-theoretic setting) and FDT has the claimed optimality notion. I hope my longer writeup can make such a setting clear.
Briefly: my interpretation of the “FDT does better” claim in the FDT paper is that FDT is supposed to take UDT-optimal actions. To the extent that it doesn’t take UDT-optimal actions, I mostly don’t endorse the claim that it does better (though I plan to note in a follow-up post an alternate view in which the FDT notion of optimality may be better).
The toy setting I have in mind that makes “UDT-optimal” completely well-defined is actually fairly general. The idea is that if we can represent a decision problem as a (single-player) extensive-form game, UDT is just the idea of throwing out the requirement of subgame-optimality. In other words, we don’t even need a notion of “fairness” in the setting of extensive-form games—the setting isn’t rich enough to represent any “unfair” problems. Yet it is a pretty rich setting.
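Here is a rough sketch of what I mean, using Bomb. The encoding is my own assumption for illustration rather than a canonical one: the predictor's simulation is modeled as an earlier decision node that shares the "bomb" information set with the real decision, and the disutility of death (`U_DEATH`) is a made-up number. All node shapes and function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

EPS = Fraction(1, 10**24)  # predictor failure rate from the problem statement
U_DEATH = -10**9           # ASSUMPTION: hypothetical dollar-equivalent of burning to death

# Hypothetical node encodings:
#   ("leaf", utility)
#   ("chance", [(probability, child), ...])
#   ("decide", info_set, {action: child})


def real_choice(bomb):
    """The agent's actual decision, labeled by what it observes."""
    if bomb:
        return ("decide", "bomb", {"left": ("leaf", U_DEATH), "right": ("leaf", -100)})
    return ("decide", "no_bomb", {"left": ("leaf", 0), "right": ("leaf", -100)})


def simulation(accurate):
    """The predictor's simulation, placed in the same information set as the
    real bomb decision. A bomb is placed iff the predictor concludes "right";
    an inaccurate predictor inverts the simulated answer."""
    return ("decide", "bomb", {
        "left": real_choice(bomb=not accurate),
        "right": real_choice(bomb=accurate),
    })


GAME = ("chance", [(1 - EPS, simulation(True)), (EPS, simulation(False))])


def value(node, strategy):
    """Expected utility of a strategy (a map from information set to action)."""
    kind = node[0]
    if kind == "leaf":
        return node[1]
    if kind == "chance":
        return sum(p * value(child, strategy) for p, child in node[1])
    _, info_set, children = node
    return value(children[strategy[info_set]], strategy)


def best_strategy(game):
    """Brute-force the strategy with the highest expected utility from the root."""
    strategies = [dict(zip(("bomb", "no_bomb"), acts))
                  for acts in product(("left", "right"), repeat=2)]
    return max(strategies, key=lambda s: value(game, s))


def subgame_best_action(node):
    """Best action at a decision node whose children are all leaves,
    ignoring everything upstream of that node."""
    _, _, children = node
    return max(children, key=lambda action: children[action][1])


if __name__ == "__main__":
    print(best_strategy(GAME))
    print(subgame_best_action(real_choice(bomb=True)))
```

In this encoding, the strategy that is optimal from the root takes Left at the "bomb" information set, while the subgame-optimal action at the real bomb node is Right. Dropping the subgame-optimality requirement is what lets the root-optimal strategy count as the recommendation.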
The FDT paper may have omitted this model out of a desire for greater generality, which I do think is an important goal; from my perspective, it makes sense not to reduce things to the toy model in which everything works out nicely.
I think CDT ultimately has to grapple with the question as well, because physics is math, and so physical counterfactuals are ultimately mathematical counterfactuals.
“Physics is math” is ontologically reductive.
Physics can often be specified as a dynamical system (along with interpretations of e.g. what high-level entities it represents, how it gets observed). Dynamical systems can be specified mathematically. Dynamical systems also have causal counterfactuals (what if you suddenly changed the system state to be this instead?).
Causal counterfactuals defined this way have problems (violation of physical law has consequences). But they are well-defined.
Yeah, agreed, I no longer endorse the argument I was making there—one has to say more than “physics is math” to establish the importance of dealing with logical counterfactuals.
