[note: this is bugging me more than it should. I really don’t get why this is worth so much repetition of examples that don’t show anything new.]
I’ll admit I’m one of those who doesn’t see CDT as hopeless. It takes a LOT of hypothetical setup to show cases where it fails, and neither Newcomb nor this seems to be as much about decision theory as about free will.
Part of this is my failing. I keep thinking CDT is “classical decision theory”, and that it means “make the best conditional predictions you can, and then maximize your expected value.” This is very robust, but it describes all serious decision theories. The actual discussion is about “causal decision theory”, and there are plenty of failure cases where the agent has a flawed model of causality.
But for some reason, we can’t just say “incorrect causal models make bad predictions” and move on. We keep bringing up really contrived cases where a naive agent, which we label CDT, makes bad conditional predictions, and it’s not clear why they’re so stupid as to not notice. I don’t know ANYONE who claims an agent should make and act on incorrect predictions.
For your Newcomb-like example (and really, any Omega causality violation), I assert that a CDT agent could notice outcomes and apply Bayes’ theorem to the chance that it can trick Omega, just as well as any other DT. Assuming that Omega is cheating and changing the result after my choice is sufficient to get the right answer.
Cases of mind-reading and the like are similarly susceptible to better causal models: recognizing that the causality runs through the agent’s intent, not its actions, lets CDT see that, to the extent it can control that intent, it should do so.
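One way to make this suggestion concrete (a minimal sketch of my own, not something spelled out in the comment; the toy causal graph, predictor accuracy, and payoffs are all assumed for illustration): on a graph where intent drives both the prediction and the action, intervening on the intent node instead of the action node recovers the one-boxing answer.

```python
import random

# Illustrative only: a toy causal graph  intent -> prediction, intent -> action.
ACCURACY = 0.9          # assumed accuracy of the mind-reader
BIG, SMALL = 1_000_000, 1_000

def payoff(one_box: bool, predicted_one_box: bool) -> int:
    contents = BIG if predicted_one_box else 0
    return contents if one_box else contents + SMALL

def expected_payoff(intervene_on: str, one_box: bool, n: int = 100_000) -> float:
    """Monte Carlo estimate of the payoff when we intervene on one node."""
    total = 0
    for _ in range(n):
        # Intent: fixed by the intervention, or drawn from a 50/50 prior.
        intent = one_box if intervene_on == "intent" else random.random() < 0.5
        # The mind-reader reads the intent, with some error rate.
        predicted = intent if random.random() < ACCURACY else not intent
        # The action follows the intent unless we intervene on it directly.
        action = one_box if intervene_on == "action" else intent
        total += payoff(action, predicted)
    return total / n

for node in ("action", "intent"):
    print(node, {ob: round(expected_payoff(node, ob)) for ob in (True, False)})
# Intervening on the action favors two-boxing (the usual CDT verdict);
# intervening on the intent favors one-boxing.
```

The only thing that differs between the two runs is which node the intervention pins down.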
Your summary includes “the CDT agent can never learn this”, and that seems the crux. To me, not learning something means that _EITHER_ the CDT agent is a strawman that we shouldn’t spend so much time on, _OR_ this is something that cannot be true, and it’s probably good if agents can’t learn it. If you tell me that a Euclidean agent knows pi and can accurately make wagers on the circumference of a circle knowing only its diameter, but it’s flawed because a magic being puts it on a curved surface and it never reconsiders that belief, I’m going to shrug and say “okay… but here in flatland that doesn’t happen”. It doesn’t matter how many thought experiments you come up with to show counterfactual cases where C/D is different for a circle; you’re completely talking past my objection that Euclidean decision theory is simple and workable for actual use.
To summarize my confusion: does CDT require that the agent unconditionally believe in perfect free will independent of history (and, ironically, with no causality for the exercise of will)? If so, that should be the main topic of dispute: the frequency of actual cases where it makes bad predictions, not that it makes bad decisions in ludicrously-unlikely-and-perhaps-impossible situations.
Sorta, yes. CDT requires that you choose actions not by thinking “conditional on my doing A, what happens?” but rather by some other method (there are different variants), such as “For each causal graph that I think could represent the world, what happens when I intervene (in Pearl’s sense) on the node that is my action, to set it to A?” or “Holding fixed the probability of all variables not causally downstream of my action, what happens if I do A?”
In the first version, notice that you are choosing actions by imagining a Pearl-style intervention into the world—but this is not something that actually happens; the world doesn’t actually contain such interventions.
In the second version, well, notice that you are choosing actions by imagining possible scenarios that aren’t actually possible—or at least, you are assigning the wrong probabilities to them. (“holding fixed the probability of all variables not causally downstream of my action...”)
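To make the contrast between the two rules concrete, here is a minimal sketch (my illustration, not part of the original comment; the predictor accuracy, payoffs, and prior are assumed numbers) of the evidential calculation next to the intervention-style one for Newcomb’s problem:

```python
# A minimal sketch (illustrative only): the evidential calculation vs. the
# intervention-style calculation on Newcomb's problem. All numbers are assumed.

PRED_ACCURACY = 0.99    # assumed accuracy of the predictor
BIG = 1_000_000         # opaque box: filled iff one-boxing was predicted
SMALL = 1_000           # transparent box: always contains this much

def payoff(one_box: bool, predicted_one_box: bool) -> int:
    contents = BIG if predicted_one_box else 0
    return contents if one_box else contents + SMALL

def edt_value(one_box: bool) -> float:
    # Condition on the action: choosing it is evidence about what was predicted.
    p_pred = PRED_ACCURACY if one_box else 1 - PRED_ACCURACY
    return p_pred * payoff(one_box, True) + (1 - p_pred) * payoff(one_box, False)

def cdt_value(one_box: bool, p_pred_prior: float = 0.5) -> float:
    # Intervene on the action: the box contents are not causally downstream of
    # it, so their probability is held at the prior no matter what is chosen.
    return (p_pred_prior * payoff(one_box, True)
            + (1 - p_pred_prior) * payoff(one_box, False))

for one_box in (True, False):
    label = "one-box" if one_box else "two-box"
    print(label, "EDT:", edt_value(one_box), "CDT:", cdt_value(one_box))
# EDT ranks one-boxing higher; CDT ranks two-boxing higher for any fixed prior,
# since two-boxing adds SMALL to every scenario it considers possible.
```

The two rules share the same world-model; they disagree only on whether the action is treated as evidence about the prediction or as an intervention that leaves the prediction’s probability fixed.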
So one way to interpret CDT is that it believes in crazy stuff like hardcore incompatibilist free will. But the more charitable way to interpret it is that it doesn’t believe in that stuff; it just acts as if it does, because it thinks that’s the rational way to act. (And they have plenty of arguments for why CDT is the rational way to act, e.g. the intuition pump “If the box is already either full or empty and you can’t change that no matter what you do, then no matter what you do you’ll get more money by two-boxing, so...”)
Thanks for this clarifying comment, Daniel!
Not believing it, but thinking it’s rational to act that way, seems even worse than believing it in the first place.
I -totally- understand the arguing-against-the-premise response to such things. It’s coherent and understandable to say “CDT is good enough, because these examples can’t actually happen, or are so rare that I’ll pay that cost in order to have a simpler model for the other 99.9999% of my decisions”. I’d enjoy talking to someone who says “I accept that I’ll get the worse result, but it’s the right thing to do because … ”. I can’t ITT (pass the Ideological Turing Test for) the ending to this sentence.
Indeed, if it were true that Newcomb-like situations (or more generally, situations where other agents condition their behavior on predictions of your behavior) do not occur with any appreciable frequency, there would be much less interest in creating a decision theory that addresses such situations.
But far from constituting a mere 0.0001% of possible situations (or some other, similarly minuscule percentage), Newcomb-like situations are simply the norm! Even in everyday human life, we frequently encounter other people and base our decisions off what we expect them to do—indeed, the ability to model others and act based on those models is integral to functioning as part of any social group or community. And it should be noted that humans do not behave as causal decision theory predicts they ought to—we do not betray each other in one-shot prisoner’s dilemmas, we pay people we hire (sometimes) well in advance of them completing their job, etc.
This is not mere “irrationality”; otherwise, there would have been no reason for us to develop these kinds of pro-social instincts in the first place. The observation that CDT is inadequate is fundamentally a combination of (a) the fact that it does not accurately predict certain decisions we make, and (b) the claim that the decisions we make are in some sense correct rather than incorrect—and if CDT disagrees, then so much the worse for CDT. (Specifically, the sense in which our decisions are correct—and CDT is not—is that our decisions result in more expected utility in the long run.)
All it takes for CDT to fail is the presence of predictors. These predictors don’t have to be Omega-style superintelligences: even moderately accurate predictors who perform significantly (but not ridiculously) above random chance can create Newcomb-like elements with which CDT is incapable of coping. I really don’t see any justification at all for the idea that these situations somehow constitute a vanishingly small minority of possible situations, or (worse yet) that they somehow “cannot” happen. Such a claim seems to be missing the forest for the trees: you don’t need perfect predictors to have these problems show up; the problems show up anyway. The only purpose of using Omega-style perfect predictors is to make our thought experiments clearer (by making things more extreme), but they are by no means necessary.
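To put a rough number on “moderately accurate” (a back-of-the-envelope figure of my own, using the standard Newcomb payoffs of 1,000,000 in the opaque box and 1,000 in the transparent one): one-boxing already has the higher expected payoff whenever the predictor’s accuracy $p$ satisfies

$$p \cdot 1{,}000{,}000 \;>\; p \cdot 1{,}000 + (1-p) \cdot 1{,}001{,}000, \quad\text{i.e.}\quad p > 0.5005,$$

so a predictor that beats a coin flip by a twentieth of a percentage point is already enough to generate the dilemma.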
Which summarizes my confusion: if CDT is this clearly broken, why is it so discussed (and apparently defended, though I don’t actually know any defenders)?
Dagon, I sympathize. CDT seems bonkers to me for the reasons you have pointed out. My guess is that academic philosophy has many people who support CDT for three main reasons, listed in increasing order of importance:
(1) Even within academic philosophy, many people aren’t super familiar with these arguments. They read about CDT vs. EDT, they read about a few puzzle cases, and they form an opinion and then move on. After all, there are lots of topics to specialize in, even in decision theory, and so if this debate doesn’t grip you, you might not dig too deeply.
(2) Lots of people have pretty strong intuitions that CDT vindicates. E.g., IIRC, Newcomb’s Problem was originally invented to prove that EDT was silly (because, silly EDT, it would one-box, which is obviously stupid!). My introductory textbook to decision theory was an attempt to build for CDT an elegant mathematical foundation to rival the Jeffrey-Bolker axioms for EDT. And why do this? It said, basically, “EDT gives the wrong answer in Newcomb’s Problem and other problems, so we need to find a way to make some version of CDT mathematically respectable.”
(3) EDT has lots of problems too. Even hardcore LWer fans of EDT like Caspar Oesterheld admit as much, and even waver back and forth between EDT and CDT for this reason. And the various alternatives to EDT and CDT that have been thus far proposed also seem to have problems.
Joyce’s Foundations of Causal Decision Theory, right? That was the book I bought to learn decision theory too. My focus was on anthropic reasoning instead of Newcomb’s problem at the time, so I just uncritically accepted the book’s contention that two-boxing is the rational thing to do. As a result, while trying to formulate my own decision theory, I had to come up with complicated ways to force it to two-box. It was only after reading Eliezer’s posts about Newcomb’s problem that I realized that if one-boxing is actually the right thing to do, the decision theory could be made much more elegant. (Too bad it turns out to still have a number of problems that we don’t know how to solve.)
Yep, that’s the one! :)
But considering that randomness as an antidote to perfect predictions is ubiquitously available in this universe, it’s hard to see what practical implications these CDT failures in highly contrived thought experiments have.
You may like this, then: https://www.lesswrong.com/posts/9m2fzjNSJmd3yxxKG/acdt-a-hack-y-acausal-decision-theory
I do, very much :)