SMK comments on A Reaction to Wolfgang Schwarz’s “On Functional Decision Theory”

SMK 12 Oct 2022 12:59 UTC
5 points
Surprisingly, Schwarz doesn’t analyze CDT’s and FDT’s answer to Prisoner’s Dilemma with a Twin (besides just giving the answers). It’s worth noting FDT clearly does better than CDT here, because the FDT agent (and its twin) both get away with 1 year in prison while the CDT agent and its twin both get 5. This is because the agents and their twins are clones—and therefore have the same decision theory and thus reach the same conclusion to this problem. FDT recognizes this, but CDT doesn’t. I am baffled Schwarz calls FDT’s recommendation on this problem “insane”, as it’s easily the right answer.
I personally agree that cooperating in the Twin PD is the correct choice, but I don’t think it is meaningful to argue for this on the grounds of decision-theoretic performance (as you seem to do). From “The lack of performance metrics for CDT versus EDT, etc.” by Caspar Oesterheld:
[T]here is no agreed-upon metric to compare decision theories, no way to assess even for a particular problem whether one decision theory (or its recommendation) does better than another. (This is why the CDT-versus-EDT-versus-other debate is at least partly a philosophical one.) In fact, it seems plausible that finding such a metric is “decision theory-complete” (to butcher another term with a specific meaning in computer science). By that I mean that settling on a metric is probably just as hard as settling on a decision theory and that mapping between plausible metrics and plausible decision theories is fairly easy.
Indeed, Schwarz makes a similar point in the post you are responding to:
Yudkowsky and Soares constantly talk about how FDT “outperforms” CDT, how FDT agents “achieve more utility”, how they “win”, etc. As we saw above, it is not at all obvious that this is true. It depends, in part, on how performance is measured. At one place, Yudkowsky and Soares are more specific. Here they say that “in all dilemmas where the agent’s beliefs are accurate [??] and the outcome depends only on the agent’s actual and counterfactual behavior in the dilemma at hand – reasonable constraints on what we should consider “fair” dilemmas – FDT performs at least as well as CDT and EDT (and often better)”. OK. But how should we understand “depends on … the dilemma at hand”? First, are we talking about subjunctive or evidential dependence? If we’re talking about evidential dependence, EDT will often outperform FDT. And EDTers will say that’s the right standard. CDTers will agree with FDTers that subjunctive dependence is relevant, but they’ll insist that the standard Newcomb Problem isn’t “fair” because here the outcome (of both one-boxing and two-boxing) depends not only on the agent’s behavior in the present dilemma, but also on what’s in the opaque box, which is entirely outside her control. Similarly for all the other cases where FDT supposedly outperforms CDT. Now, I can vaguely see a reading of “depends on … the dilemma at hand” on which FDT agents really do achieve higher long-run utility than CDT/EDT agents in many “fair” problems (although not in all). But this is a very special and peculiar reading, tailored to FDT. We don’t have any independent, non-question-begging criterion by which FDT always “outperforms” EDT and CDT across “fair” decision problems.
- Heighn 20 Oct 2022 15:36 UTC
  −2 points
  Thanks for responding!
  I personally agree that cooperating in the Twin PD is clearly the correct choice, but I don’t think it is meaningful to argue for this on the grounds of decision-theoretic performance (as you seem to do). From “The lack of performance metrics for CDT versus EDT, etc.” by Caspar Oesterheld:
  I disagree. There’s a clear measure of performance given in the Twin PD: the utilities.
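To make the two ways of reading "the utilities" concrete, here is a minimal Python sketch of the Twin PD payoffs. Only the 1-year and 5-year outcomes come from the thread; the off-diagonal sentences (0 and 10 years) and the names `YEARS`, `twin_outcome`, and `years_holding_twin_fixed` are assumptions made up for illustration.

```python
# Years in prison as (mine, twin's) for each pair of moves.
# C = cooperate, D = defect.
# The diagonal entries (1 and 5 years) are the figures quoted in the thread.
# The off-diagonal entries (0 and 10 years) are ASSUMED for illustration.
YEARS = {
    ("C", "C"): (1, 1),
    ("D", "D"): (5, 5),
    ("C", "D"): (10, 0),  # assumed
    ("D", "C"): (0, 10),  # assumed
}


def twin_outcome(move):
    """Outcome when both clones make the same move (only diagonal cells occur)."""
    return YEARS[(move, move)]


def years_holding_twin_fixed(my_move, twin_move):
    """My sentence if the twin's move is treated as fixed independently of mine."""
    return YEARS[(my_move, twin_move)][0]


if __name__ == "__main__":
    # Comparing the outcomes that clones can actually reach:
    print("both cooperate:", twin_outcome("C"))
    print("both defect:   ", twin_outcome("D"))
    # Comparing moves while holding the twin's move fixed:
    for twin_move in ("C", "D"):
        print(
            "twin plays", twin_move,
            "| I cooperate:", years_holding_twin_fixed("C", twin_move),
            "| I defect:", years_holding_twin_fixed("D", twin_move),
        )
```

Under these assumed numbers, cooperating looks better when only the reachable (diagonal) outcomes are compared, while defecting looks better in each row when the twin's move is held fixed. Which comparison counts as "performance" is the point under dispute in this thread.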
  I disagree with Oesterheld’s point about CDT vs EDT and metrics; I think we know enough math to say EDT is simply a wrong decision theory. We could, in principle, even demonstrate this in real life, by having e.g. 1000 people play a version of XOR Blackmail (500 people with and 500 people without a “termite infestation”) and seeing which theory performs best. It’ll be trivial to see that EDT makes the wrong decision.
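The proposed 1000-person experiment can be sketched as a small deterministic tally in Python. This is a rough illustration and not anyone's official formulation: the dollar figures (`TERMITE_DAMAGE`, `PAYMENT`) and all function names are assumptions, and the letter rule follows the usual statement of XOR Blackmail (a letter is sent iff exactly one of "no termites and the agent would pay" or "termites and the agent would not pay" holds). Paying on receipt is the policy usually attributed to EDT; refusing is the one usually attributed to CDT and FDT.

```python
# ASSUMED dollar figures, not given anywhere in the thread.
TERMITE_DAMAGE = 1_000_000
PAYMENT = 1_000


def letter_sent(has_termites, pays_on_letter):
    """Letter goes out iff exactly one condition holds (the XOR in the name)."""
    no_termites_and_pays = (not has_termites) and pays_on_letter
    termites_and_refuses = has_termites and (not pays_on_letter)
    return no_termites_and_pays != termites_and_refuses


def loss(has_termites, pays_on_letter):
    """Total dollars lost by one person following the given policy."""
    total = TERMITE_DAMAGE if has_termites else 0
    if pays_on_letter and letter_sent(has_termites, pays_on_letter):
        total += PAYMENT
    return total


def population(with_termites=500, without_termites=500):
    """True marks a person whose house has termites."""
    return [True] * with_termites + [False] * without_termites


def average_loss(pays_on_letter):
    """Average loss over everyone, whether or not they receive a letter."""
    people = population()
    return sum(loss(h, pays_on_letter) for h in people) / len(people)


def average_loss_given_letter(pays_on_letter):
    """Average loss over only the people who actually receive a letter."""
    recipients = [h for h in population() if letter_sent(h, pays_on_letter)]
    return sum(loss(h, pays_on_letter) for h in recipients) / len(recipients)


if __name__ == "__main__":
    for label, pays in (("pay on letter", True), ("refuse", False)):
        print(label, average_loss(pays), average_loss_given_letter(pays))
```

Averaged over the whole population, the refusing policy loses less under these assumed numbers. Averaged over letter recipients only, the paying policy loses less, because only termite-free payers ever receive a letter. So the tally settles the question only after one has chosen which average counts, which is the concern raised in the Oesterheld and Schwarz passages quoted above.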