So is this an accurate summary of your thinking?
You agree with FDT on some issues. The goal of decision theory is to determine what kind of agent you should be. The kind of agent you are (your “source code”) affects other agents’ decisions.
FDT requires you to construct counterfactual worlds. For example, if I’m faced with Newcomb’s problem, I have to imagine a counterfactual world in which I’m a two-boxer.
We don’t know how to construct counterfactual worlds. Imagining a consistent world in which I’m a two-boxer is just as hard as imagining one where objects fall up, or where 2+2 is 5.
You get around this by constructing counterfactual models, instead of counterfactual worlds. Instead of trying to imagine a consistent world that caused me to become a two-boxer, I just make a simple mental model. For example, my model might indicate that my “source code” causes both the predictor’s decision and mine. From here, I can model myself as a two-boxer, even though that model doesn’t represent any possible world.
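To make that last point concrete, here is a minimal sketch in Python (the names, the perfectly reliable predictor, and the standard $1,000,000/$1,000 payoffs are illustrative assumptions of mine, not something from the post) of a model in which the “source code” node causes both the prediction and the decision:

```python
# Toy Newcomb's problem: the agent's "source code" (here just a policy
# string) causes both the predictor's forecast and the agent's choice.

def payoff(policy: str) -> int:
    """Payoff when intervening on the source-code node, so the
    prediction changes together with the decision."""
    prediction = policy  # a reliable predictor simulates the agent's code
    box_b = 1_000_000 if prediction == "one-box" else 0  # opaque box
    box_a = 1_000  # transparent box always contains $1,000
    if policy == "two-box":
        return box_a + box_b
    return box_b

for policy in ("one-box", "two-box"):
    print(policy, payoff(policy))
# one-box 1000000
# two-box 1000
```

Note that the combination the two-boxer hopes for (prediction = one-box, choice = two-box) never appears: varying the policy changes the prediction along with it, which is exactly the sense in which the model lets you evaluate two-boxing without representing any possible world. A CDT-style model would instead hold the prediction fixed while varying the action.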
“The goal of decision theory is to determine what kind of agent you should be”
I’ll answer this with a stream of thought: I guess my position on this is slightly complex. I did say that the reason for preferring one notion of counterfactual over another must be rooted in the fact that agents adopting these counterfactuals do better over a particular set of worlds. And maybe that reduces to what you said, although maybe it isn’t quite as straightforward as that, because I contend that “possible” is not in the territory. This opens the door to there being multiple notions of possible, and hence to counterfactuals being formed by merging lessons from these various notions. And it seems that we could merge these lessons at the level of individual decisions, at the level of properties of agents, or at the level of agents as a whole. Or at least that’s how I would like the claims in this post to be understood.
That said, the lesson from my post The Counterfactual Prisoner’s Dilemma is that merging at the decision level doesn’t seem viable.
“FDT requires you to construct counterfactual worlds”
I highly doubt that Eliezer embraces David Lewis’ view of counterfactuals, especially given his post Probability is in the Mind. However, the way FDT is framed sometimes gives the impression that there’s a true definition of counterfactuals that we’re just looking for. Admittedly, if you’re just looking for something that works, as in Newcomb’s Problem and Regret of Rationality, then that avoids this mistake. And if you look at how MIRI has investigated this, which is much more mathematical than philosophical, they do seem to be following this pragmatist principle. I would suggest, though, that this can only get you so far.
“We don’t know how to construct counterfactual worlds. You get around this...”
I’m not endorsing counterfactual models out of a lack of knowledge of how to construct counterfactual worlds, but because I don’t think (contra Lewis) that there are strong reasons for asserting that such worlds are out there. Further, it seems that our notions of counterfactuals are unavoidably rooted in themselves, so this isn’t just a matter of “the epistemics of counterfactual worlds are hard, so let’s work with models instead”.