I guess we seem to differ on whether CDT was dealt a bad hand or played its hand badly. CDT, as usually argued for, doesn’t seem to engage with the artificial nature of counterfactuals, and I suspect that once you engage with this consideration you won’t end up at CDT.
Questions in decision theory are not questions about what choices you should make with some sort of unpredictable free will. They are questions about what type of source code you should be running.
This seems like a reasonable hypothesis, but I have to point out that there’s something rather strange in imagining a situation where we make a decision outside of the universe; I think we should boggle at this, to use CFAR’s term. Indeed, I agree that if we accept the notion of a meta-decision theory, FDT does not invoke backwards causation (elegant explanation, btw!).
Comparing this to my explanation, we both seem to agree that there are two separate views, in your terms an “object” view and an “agent” view. My explanation is based upon the “agent” view being artificial. It is more general, as I avoid making too many assumptions about what exactly a decision is, while your view takes on an additional assumption (that we should model decisions in a meta-causal way) in exchange for being more concrete and easier to grasp and explain.
With your explanation, however, I do think you glossed over this point too quickly, as it isn’t completely clear what’s going on there or why it makes sense:
FDT is actually just what happens when you use causal decision theory to select what type of source code you want to enter a Newcombian game with
There’s a sense in which this is self-defeating b/c if CDT implies that you should pre-commit to FDT, then why do you care what CDT recommends as it appears to have undermined itself?
My answer is that, even though it appears this way, I don’t actually think it is self-defeating, and this becomes clear when we treat it as a process of engaging in reflective equilibrium until our views are consistent. CDT doesn’t recommend itself, but FDT does, so this process leads us to replace our initial starting assumption of CDT with FDT.
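To make the meta-level move concrete, here is a minimal sketch (the $1M/$1k payoffs and the 0.99 predictor accuracy are standard illustrative assumptions, not figures from the discussion above): a meta-decider who reasons purely causally about which program to hand the predictor already ends up choosing the one-boxing, FDT-like program.

```python
# Minimal sketch: a "meta-decider" choosing which program to run before a
# Newcomb-style game. The payoffs and the 0.99 predictor accuracy are
# illustrative assumptions, not figures from the discussion above.

PAYOFFS = {  # (program's action, predictor's guess) -> payoff in dollars
    ("one_box", "one_box"): 1_000_000,
    ("one_box", "two_box"): 0,
    ("two_box", "one_box"): 1_001_000,
    ("two_box", "two_box"): 1_000,
}

def expected_payoff(program: str, predictor_accuracy: float = 0.99) -> float:
    """Expected payoff if the predictor scans the chosen program and
    predicts its action with the given accuracy."""
    other = "two_box" if program == "one_box" else "one_box"
    return (predictor_accuracy * PAYOFFS[(program, program)]
            + (1 - predictor_accuracy) * PAYOFFS[(program, other)])

# Picking the program causally influences what the predictor later sees,
# so a purely causal expected-value comparison at the meta level already
# favours the one-boxing (FDT-like) program.
for program in ("one_box", "two_box"):
    print(program, expected_payoff(program))
# one_box -> 990,000.0 vs two_box -> 11,000.0 under these assumptions.
```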
In other words, we’re engaging in a form of circular epistemology as described here. We aren’t trying to get from the View from Nowhere to a model of counterfactuals, to prove everything a priori like Descartes; instead, all we can do is start off with some notions, beliefs or intuitions about counterfactuals and then make them consistent. I guess I see making these mechanics explicit as useful. In particular, by making this move it seems as though, at least on the face of it, we are embracing the notion that counterfactuals only make sense from within themselves.
I’m not claiming at this stage that it is in fact correct to shift from CDT to FDT as part of the process of reflective equilibrium, as it is possible to resolve inconsistencies in a different order, with different assumptions held fixed, but this is plausibly the correct way to proceed. I guess the next step would be to map out the various intuitions that we have about how to handle these kinds of situations and then figure out whether there are any other possible ways of resolving the inconsistency.
in your terms an “object” view and an “agent” view.
Yes, I think that there is a time and place for these two stances toward agents. The object stance is for when we are thinking about how behavior is deterministic conditioned on the state of the world and the agent. The agent stance is for when we are trying to be purposive and think about what types of agents to be or to design. If we never wanted to take the object stance, we couldn’t successfully understand many dilemmas, and if we never wanted to take the agent stance, then there would seem to be little point in trying to talk about what any agent ever “should” do.
There’s a sense in which this is self-defeating b/c if CDT implies that you should pre-commit to FDT, then why do you care what CDT recommends as it appears to have undermined itself?
I don’t especially care.
counterfactuals only make sense from within themselves
Is naive thinking about the troll bridge problem a counterexample to this? There, the counterfactual stems from a contradiction.
CDT doesn’t recommend itself, but FDT does, so this process leads us to replace our initial starting assumption of CDT with FDT.
I think that no general type of decision theory worth two cents always does recommend itself. Any decision theory X that isn’t silly would recommend replacing itself before entering a mind-policing environment in which the mind police punishes an agent iff they use X.
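To spell out the structure with made-up numbers (a toy sketch; the baseline payoff and penalty values are invented):

```python
# Toy sketch of the mind-police point: whichever theory X you start with,
# "keep running X" vs. "self-modify to something else before entering" is an
# ordinary expected-value comparison, and X itself endorses switching.
# The baseline payoff and penalty below are invented for illustration.

BASELINE_PAYOFF = 100         # what the environment pays out regardless
MIND_POLICE_PENALTY = 1_000   # applied iff the agent is running X inside

def value_of_entering(still_running_x: bool) -> int:
    return BASELINE_PAYOFF - (MIND_POLICE_PENALTY if still_running_x else 0)

keep_x = value_of_entering(still_running_x=True)        # 100 - 1000 = -900
switch_away = value_of_entering(still_running_x=False)  # 100
assert switch_away > keep_x  # so X, consulted before entry, recommends not-X
```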
Yes, I think that there is a time and place for these two stances toward agents
Agreed. The core lesson for me is that you can’t mix and match—you need to clearly separate out when you are using one stance or another.
I don’t especially care.
I can understand this perspective, but if there’s a relatively accessible way of explaining why this (or something similar to it) isn’t self-defeating, then maybe we should go with that?
Is naive thinking about the troll bridge problem a counterexample to this? There, the counterfactual stems from a contradiction.
I don’t quite get your point. Any chance you could clarify? Sure, we can construct counterfactuals within an inconsistent system, and sometimes this may even be a nifty trick for getting the right answer if we can avoid the inconsistency messing us up, but outside of this, why is it something we should care about?
I think that no general type of decision theory worth two cents always does recommend itself
Good point. Now that you’ve said it, I have to agree that I was too quick to assume that the outside-of-the-universe decision theory should be the same as the inside-of-the-universe decision theory.
Thinking this through: if we use CDT as our outside decision theory to pick an inside decision theory, then we need to be able to justify why we are using CDT, and the same goes for any other decision theory we might use instead.
One thing I’ve just realised is that we don’t actually have to use CDT, EDT or FDT to make our decision. Since there’s no past for the meta-decider, we can just use our naive decision theory which ignores the past altogether. And we can justify this choice based on the fact that we are reasoning from where we are. This seems like it would avoid the recursion.
Except I don’t actually buy this, as we need to be able to provide a justification for why we would care about the result of a meta-decider outside of the universe when we know that isn’t the real scenario. I guess what we’re doing is making an analogy with inside-the-universe situations where we can set the source code of a robot before it goes and does some stuff. And we’re noting that a robot probably has a good algorithm if its code matches what a decider would choose if they had to be prepared for a wide variety of circumstances, and then we’re trying to apply this more broadly.
I don’t think I’ve got this precise yet, but I guess the key point is that this model doesn’t appear out of thin air: it has a justification, and that justification involves a decision, and hence some kind of decision theory, where the actual decision is inside of the universe. So there is, after all, a reason to want the inside and outside theories to match up.
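A rough sketch of that robot analogy (the scenario names, candidate policies and payoffs below are all invented for illustration): the “outside” choice just amounts to scoring each candidate program across the range of circumstances the robot might face and picking the one that does best in expectation.

```python
# Rough sketch of the robot analogy: the designer ("outside the universe")
# picks source code by scoring each candidate policy across a spread of
# circumstances the robot might later face. All names and numbers here
# are invented for illustration.
from typing import Callable, Dict

Scenario = str
Policy = Callable[[Scenario], str]

def payoff(scenario: Scenario, action: str) -> float:
    # Hypothetical payoff table for two stock dilemmas.
    table = {
        ("newcomb", "one_box"): 1_000_000,
        ("newcomb", "two_box"): 1_000,
        ("smoking_lesion", "smoke"): 10,
        ("smoking_lesion", "abstain"): 0,
    }
    return table.get((scenario, action), 0)

POLICIES: Dict[str, Policy] = {
    "cdt_like": lambda s: {"newcomb": "two_box", "smoking_lesion": "smoke"}[s],
    "fdt_like": lambda s: {"newcomb": "one_box", "smoking_lesion": "smoke"}[s],
}

SCENARIOS = ["newcomb", "smoking_lesion"]

def score(policy: Policy) -> float:
    # Total payoff across the circumstances the robot must be prepared for.
    return sum(payoff(s, policy(s)) for s in SCENARIOS)

best = max(POLICIES, key=lambda name: score(POLICIES[name]))
print(best)  # "fdt_like" under these invented numbers
```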
In the troll bridge problem, the counterfactual (the agent crossing the bridge) would indicate the inconsistency of the agent’s logical system of reasoning. See this post and what Demski calls a subjective theory of counterfactuals.