The output of this process is something people have taken to calling Son-of-CDT; the problem (insofar as we understand Son-of-CDT well enough to talk about its behavior) is that the resulting decision theory continues to neglect correlations that existed prior to self-modification.
(In your terms: Alice and Bob would one-box only in Newcomb variants where Omega based his prediction on observations of them made after they came up with their new decision theory; variants where Omega’s prediction was made before they had their talk would still be met with two-boxing, even if Omega is stipulated to be able to predict the outcome of that talk.)
This still does not seem like particularly sane behavior, which means, unfortunately, that there’s no real way for a CDT agent to fix itself: it was born with too dumb of a prior for even self-modification to save it.
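To make that failure concrete: using the standard $1K/$1M Newcomb payoffs, and writing p for the agent’s credence that Omega predicted one-boxing (p is just my shorthand here, not part of the setup), a decision theory that treats the prediction as causally fixed computes

$$
\begin{aligned}
EU(\text{two-box}) &= p \cdot \$1{,}001{,}000 + (1-p) \cdot \$1{,}000,\\
EU(\text{one-box}) &= p \cdot \$1{,}000{,}000 + (1-p) \cdot \$0,
\end{aligned}
$$

so two-boxing comes out ahead by exactly $1K no matter what p is. Son-of-CDT runs this same calculation for any correlation that predates its self-modification, which is exactly why it keeps two-boxing in those variants.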
Thanks. After thinking about your explanation for a while, I have made a small update in the direction of FDT. This example makes FDT seem parsimonious to me, because it amounts to a simpler precommitment.
I almost made a large update in the direction of FDT, but when I imagined explaining the reason for that update I ran into a snag. I imagined someone saying “OK, you’ve decided to precommit to one-boxing. Do you want to precommit to one-boxing when (a) Omega knows about this precommitment, or (b) Omega knows about this precommitment, AND the entangled evidence that Omega relied upon is ‘downstream’ of the precommitment itself? For example, in case (b), you would one-box if Omega read a transcript of this conversation, but not if Omega only read a meeting agenda that described how I planned to persuade you of option (a).”
But when phrased that way, it suddenly seems reasonable to reply: “I’m not sure what Omega would predict that I do if he could only see the meeting agenda. But I am sure that the meeting agenda isn’t going to change based on whether I pick (a) or (b) right now, so my choice can’t possibly alter what Omega puts into the box in that case. Thus, I see no advantage to precommitting to one-boxing in that situation.”
If Omega really did base its prediction just on the agenda (and not on, say, a scan of the source code of every living human), this reply seems correct to me. The story’s only interesting because Omega has god-like predictive abilities.
Which I guess shouldn’t be surprising, because if there were a version of Newcomb’s problem that cleanly split FDT from CDT without invoking extreme abilities on Omega’s part, I would expect that to be the standard version.
I’m left with a vague impression that FDT and CDT mostly disagree about “what rigorous mathematical model should we take this informal story-problem to be describing?” rather than “what strategy wins, given a certain rigorous mathematical model of the game?” CDT thinks you are choosing between $1K and $0, while FDT thinks you are choosing between $1K and $1M. If we could actually run the experiment, even in simulation, then that disagreement seems like it should have a simple empirical resolution; but I don’t think anyone knows how to do that. (Please correct me if I’m wrong!)
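The closest I can come to imagining such an experiment is a toy simulation in which Omega predicts by literally running the agent’s decision procedure — something like the sketch below, where the function names and the perfect-predictor setup are mine rather than anything canonical:

```python
# Toy Newcomb's problem. The hard part -- how a real Omega would predict a
# real agent -- is exactly what this sketch assumes away by letting Omega
# call the agent's decision procedure directly.

def omega_fills_box_b(decision_procedure):
    """Omega predicts by running the agent's decision procedure."""
    return decision_procedure() == "one-box"  # $1M goes in box B iff one-boxing is predicted

def payoff(decision_procedure):
    box_b = 1_000_000 if omega_fills_box_b(decision_procedure) else 0
    choice = decision_procedure()
    return box_b + (1_000 if choice == "two-box" else 0)

def one_boxer():
    return "one-box"

def two_boxer():
    return "two-box"

print(payoff(one_boxer))  # 1000000
print(payoff(two_boxer))  # 1000
```

But of course this bakes the FDT-friendly modeling choice directly into omega_fills_box_b: the prediction covaries with the policy by construction, which is the very premise CDT disputes, so a simulation like this illustrates the disagreement rather than settling it.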