Since when does CDT include backtracking on noticing other people’s predictive inconsistency?
I agree that CDT does not include backtracking on noticing other people’s predictive inconsistency. My assumption is that decision theories (including CDT) take a world-map and output an action. I’m claiming that this post is conflating an error in constructing an accurate world-map with an error in the decision theory.
CDT cannot notice that Omega’s prediction aligns with its hypothetical decision because Omega’s prediction is causally “before” CDT’s decision, so no causal decision graph can condition on it. This is why post-TDT decision theories are also called “acausal.”
Here is a more explicit version of what I’m talking about. CDT decides how to act based on the expected value of each available action, so to produce an action it first needs an expected-value estimate. In the original post, there are two parts to this:
Part 1 (Building a World Model):
I believe that the predictor modeled my reasoning process and has made a prediction based on that model. This prediction happens before I actually instantiate my reasoning process
I believe this model to be accurate/quasi-accurate
I start unaware of what my causal reasoning process is, so I have no idea what the predictor will do. In any case, the causal reasoning process must continue, because I’m still thinking.
As I think, I get more information about my causal reasoning process. Because I know that the predictor is modeling my reasoning process, this lets me update my prediction of the predictor’s prediction.
Because the above step was part of my causal reasoning process and information about my causal reasoning process affects my model of the predictor’s model of me, I must update on the above step as well
[The Dubious Step] Because I am modeling myself as CDT, I will make a statement intended to invert the predictor’s prediction. Because I believe the predictor is modeling me, this requires me to invert myself. That is to say, every update my causal reasoning process makes to my probabilities inverts the previous update.
Note that this only works if I believe my reasoning process (but not necessarily the ultimate action) gives me information about the predictor’s prediction.
The above leads to an infinite regress.
Part 2 (CDT):
Ask the world model what the odds are that the predictor said “one” or “zero”
Find the one with the higher likelihood and invert it
I believe Part 1 fails and that this isn’t the fault of CDT. For instance, imagine the above problem with zero stakes such that decision theory is irrelevant. If you ask any agent to give the inverse of its probabilities that Omega will say “one” or “zero” with the added information that Omega will perfectly predict those inverses and align with them, that agent won’t be able to give you probabilities. Hence, the failure occurs in building a world model rather than in implementing a decision theory.
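To make the regress concrete, here is a minimal sketch in Python (my illustration; the tie-breaking rule and the ten-step cutoff are arbitrary, and the zero-stakes framing means no payoffs are needed): each pass re-estimates the predictor’s call by simulating the agent’s intention to invert its current estimate, so every update flips the previous one and the estimate never settles.

    # Minimal sketch of the regress in Part 1 (illustrative only). The agent tries
    # to settle on P(predictor says "one"), knowing the predictor simulates its
    # reasoning and that it intends to state the inverse of whichever call it
    # currently expects.

    def predictor_model(p_one: float) -> float:
        # The predictor matches whatever it expects the agent to say; the agent
        # plans to invert its current estimate (ties broken arbitrarily).
        agent_says_one = p_one < 0.5
        return 1.0 if agent_says_one else 0.0

    def estimate_prediction(max_steps: int = 10) -> float:
        p_one = 0.5                        # initial ignorance
        for _ in range(max_steps):
            updated = predictor_model(p_one)
            if updated == p_one:           # never happens: each update inverts the last
                return updated
            p_one = updated                # 0.0, 1.0, 0.0, 1.0, ... forever
        return float("nan")                # no stable estimate; the regress never ends

    print(estimate_prediction())           # -> nan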
-------------------------------- Original version
Since when does CDT include backtracking on noticing other people’s predictive inconsistency?
Ever since the process of updating a causal model of the world based on new information was considered an epistemic question outside the scope of decision theory.
To see how this is true, imagine the exact same situation as described in the post with zero stakes. Then ask any agent with any decision theory about the inverse of the prediction it expects the predictor to make. The answer will always be “I don’t know”, independent of decision theory. Ask that same agent if it can assign probabilities to the answers and it will say “I don’t know; every time I try to come up with one, the answer reverses.”
All I’m trying to do is compute the probability that the predictor will guess “one” or “zero” and failing. The output of failing here isn’t “well, I guess I’ll default to fifty-fifty so I should pick at random”[1], it’s NaN.
Here’s a causal explanation:
I believe the predictor modeled my reasoning process and has made a prediction based on that model.
I believe this model to be accurate/quasi-accurate
I start unaware of what my causal reasoning process is so I have no idea what the predictor will do. But my prediction of the predictor depends on my causal reasoning process
Because my causal reasoning process is contingent on my prediction and my prediction is contingent on my causal reasoning process, I end up in an infinite loop where my causal reasoning process cannot converge on an actual answer. Every time it tries, it just keeps updating.
I quit the game because my prediction is incomputable
I’m claiming that this post is conflating an error in constructing an accurate world-map with an error in the decision theory.
The problem is not that CDT has an inaccurate world-map; the problem is that CDT has an accurate world-map, and then breaks it. CDT would work much better with an inaccurate world-map, one in which its decision causally affects the prediction.
See this post for how you can hack that: https://www.lesswrong.com/posts/9m2fzjNSJmd3yxxKG/acdt-a-hack-y-acausal-decision-theory
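A toy version of that point, with payoffs invented purely for illustration (+1 for mismatching the predictor’s call, -1 for matching it, 0 for quitting): if the agent’s map, however causally wrong, routes the prediction downstream of its own statement, the expected values come out as stable numbers and CDT settles on an answer instead of looping.

    # Toy illustration of CDT on a deliberately wrong causal map in which the
    # prediction is treated as an effect of the agent's statement. Payoffs are
    # invented for the example: +1 for mismatching the predictor's call, -1 for
    # matching it, 0 for quitting.

    def predicted_call(statement: str) -> str:
        # Wrong-but-useful model: the predictor simply copies whatever is said.
        return statement

    def value(statement: str) -> float:
        if statement == "quit":
            return 0.0
        return 1.0 if statement != predicted_call(statement) else -1.0

    options = ["one", "zero", "quit"]
    evs = {o: value(o) for o in options}   # stable numbers, no regress
    print(evs)                             # {'one': -1.0, 'zero': -1.0, 'quit': 0.0}
    print(max(evs, key=evs.get))           # quit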
Having done some research, it turns out that what I was actually pointing to was ratifiability, and the stance that any reasonable separation of world-modeling and decision-selection should put ratifiability in the former rather than the latter. This specific claim isn’t new. From “Regret and Instability in Causal Decision Theory”:
Second, while I agree that deliberative equilibrium is central to rational decision making, I disagree with Arntzenius that CDT needs to be amended in any way to make it appropriately deliberational. In cases like Murder Lesion a deliberational perspective is forced on us by what CDT says. It says this: A rational agent should base her decisions on her best information about the outcomes her acts are likely to causally promote, and she should ignore information about what her acts merely indicate. In other words, as I have argued, the theory asks agents to conform to Full Information, which requires them to reason themselves into a state of equilibrium before they act. The deliberational perspective is thus already a part of CDT.
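As a rough sketch of what “reason themselves into a state of equilibrium” can look like computationally (my own construction, not from the paper, with payoffs again invented for illustration): the agent repeatedly nudges its self-prediction toward the current best response, and on the inverse-the-predictor game the only stable point is a 50/50 mixture.

    # Rough sketch of deliberational ("reason yourself into equilibrium") dynamics
    # on the inverse-the-predictor game; my own construction, not taken from the
    # paper. p is the agent's credence that it will say "one", the predictor is
    # assumed to mirror that credence, and payoffs are +1 for a mismatch, -1 for
    # a match. The agent nudges p toward the currently best-looking statement
    # with shrinking steps and asks which p, if any, is stable under this update.

    def best_response(p_self: float) -> float:
        ev_one = (1.0 - p_self) - p_self    # say "one": win iff predictor says "zero"
        ev_zero = p_self - (1.0 - p_self)   # say "zero": win iff predictor says "one"
        if ev_one > ev_zero:
            return 1.0
        if ev_zero > ev_one:
            return 0.0
        return p_self                       # indifferent: the current mixture is stable

    p = 0.9                                 # arbitrary starting self-prediction
    for t in range(10_000):
        p += (best_response(p) - p) / (t + 2)   # shrinking steps toward best response
    print(round(p, 3))                      # -> 0.5, the unique stable (ratifiable) point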
However, it’s clear to me now that you were discussing an older, more conventional version of CDT[1] which does not have that property. With respect to that version, the thought-experiment goes through; with respect to the version I believe to be sensible, it doesn’t[2].
[1] I’m actually kind of surprised that the conventional version of CDT is that dumb; I had to check a bunch of papers to verify that this was actually happening. Maybe if my memory had cooperated at the time, it would’ve flagged that the distinction you draw between CDT and EDT here differs from past LessWrong articles I’ve read, like CDT=EDT. But it didn’t, so I didn’t notice you were talking about something different.
[2] I am now confident it does not apply to the thing I’m referring to: the linked paper brings up “Death in Damascus” specifically as a case where ratifiable CDT does not fail.
Have they successfully formalised the newer CDT?
Can you clarify what you mean by “successfully formalised”? I’m not sure if I can answer that question, but I can say the following:
The Stanford Encyclopedia of Philosophy has a discussion of ratifiability dating back to the 1960s, and by the 1980s it had been applied to both EDT and CDT (which I’d expect, given that constraints on having an accurate world model should be independent of decision theory). This gives me confidence that it’s not just a random Less Wrong thing.
Abram Demski from MIRI has a whole sequence on when CDT=EDT which leverages ratifiability as a sub-assumption. This gives me confidence that ratifiability is actually onto something (the Less Wrong stamp of approval is important!)
Whether any of this means that it’s been “successfully formalised”, I can’t really say. From the outside-view POV, I literally did not know about the conventional version of CDT until yesterday. Thus, I do not really view myself as someone currently capable of verifying the extent to which a decision theory has been successfully formalised. Still, I consider this version of CDT old enough historically and well-enough-discussed on Less Wrong by Known Smart People that I have high confidence in it.
[Comment edited for clarity]