[EDIT: The following is false.] A clever CDT would be able to act like TDT if it considered, not the choice of whether to output C or D, but the choice of which mathematical object to output (because it could output a mathematical object that evaluates to C or D in a particular way depending on the code of Y—this gives it the option of genuinely acting like TDT would).
This has the interesting conclusion that even without the benefit of self-modification, a CDT agent with a good model of the world ends up acting more like TDT than traditional game theorists would expect. (Another example of this is here.) The version of CDT in the last post, contrariwise, is equipped with a very narrow model of the world and of its options. [End falsehood.]
I think these things are fascinating, but I think it’s important to show that you can get TDT behavior without incorporating anthropic reasoning, redefinition of its actions, or anything beyond a basic kind of framework that human beings know how to program.
(By the way, I wouldn’t call option 3 CliqueBot, because CliqueBots as I defined them have problems mutually cooperating with anything whose outputs aren’t identical to theirs. I think it’s better for Option 3 to be the TDT algorithm defined in the post.)
It seems to come up all the time that people aren’t aware that CDT with a sufficiently good world model (a sufficiently accurate causal graph) is the same as TDT, even though this has been known for years. If you could address that somewhere in your sequence I think you’d save a lot of people a lot of time—it’s the most common objection to standard discourse about decision theory that I’ve seen.
“It seems to come up all the time that people aren’t aware that CDT with a sufficiently good world model (a sufficiently accurate causal graph) is the same as TDT”
CDT leaves the money on the ground? Not unless the “sufficiently good world model” isn’t so much “sufficiently good” as it is an artificial hack that compensates for bad decision making by twisting what causal graphs are supposed to mean.
“This has the interesting conclusion that even without the benefit of self-modification, a CDT agent with a good model of the world ends up acting more like TDT than traditional game theorists would expect.”
This is a pretty common feature of comparisons between decision theories: different outcomes generally require different assumptions.
“I think these things are fascinating, but I think it’s important to show that you can get TDT behavior without incorporating anthropic reasoning, redefinition of its actions, or anything beyond a basic kind of framework that human beings know how to program.”
It’s not clear to me what the difference is between the TDT algorithm in your post and the method I’ve described. You need some method of determining the outcome pair from the strategy pair, and the inference module can (hopefully) do that. The u_f that you use is the utility of the X outcome corresponding to the best Y outcome in row f, and picking the best of those corresponds to finding the best of the Nash equilibria (in the absence of bargaining problems). The only thing I don’t mention is the sanity check, but that should just be another run of the inference module.
“By the way, I wouldn’t call option 3 CliqueBot, because CliqueBots as I defined them have problems mutually cooperating with anything whose outputs aren’t identical to theirs. I think it’s better for Option 3 to be the TDT algorithm defined in the post.”
Sure, but does it have a short name? ProofBot?
(Notice that Y running the full TDT algorithm corresponds to there being multiple columns in the table: if you were running X against a CooperateBot, you’d just have the first column, and the Nash equilibrium would be (2,1) or (3,1). If you were running it against CliqueBot without a sanity check, there would just be the third column, and it would think (3,3) was the Nash equilibrium, but would be in for a nasty surprise when CliqueBot rejects it because of its baggage.)
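To make the row-and-column picture concrete, here is a minimal sketch of the u_f computation. Everything specific in it is an assumption for illustration: the payoff numbers are generic Prisoner’s Dilemma values, and the table assumes option 1 is CooperateBot, option 2 is DefectBot, and option 3 cooperates only with a copy of itself. The names `PAYOFF`, `MOVES`, `u_f` and `best_rows` are hypothetical, and the table lookup stands in for the inference module. It does not model the sanity check or CliqueBot rejecting X because of its baggage.

```python
# Payoffs as (utility to X, utility to Y), indexed by (X's move, Y's move).
# Illustrative values only; the thread does not give numbers.
PAYOFF = {
    ("C", "C"): (2, 2), ("C", "D"): (0, 3),
    ("D", "C"): (3, 0), ("D", "D"): (1, 1),
}

# Hypothetical strategy table: (row for X, column for Y) -> moves played.
# Assumed options: 1 = CooperateBot, 2 = DefectBot, 3 = cooperates only
# with a copy of itself.
MOVES = {
    (1, 1): ("C", "C"), (1, 2): ("C", "D"), (1, 3): ("C", "D"),
    (2, 1): ("D", "C"), (2, 2): ("D", "D"), (2, 3): ("D", "D"),
    (3, 1): ("D", "C"), (3, 2): ("D", "D"), (3, 3): ("C", "C"),
}


def u_f(row, columns):
    """Utility to X in `row`, assuming Y picks its own best available column."""
    best_col = max(columns, key=lambda col: PAYOFF[MOVES[(row, col)]][1])
    return PAYOFF[MOVES[(row, best_col)]][0]


def best_rows(columns, rows=(1, 2, 3)):
    """Rows that maximise u_f when Y is restricted to `columns`."""
    scores = {row: u_f(row, columns) for row in rows}
    top = max(scores.values())
    return [row for row in rows if scores[row] == top]


print(best_rows([1]))        # Y is a CooperateBot: only the first column
print(best_rows([3]))        # Y is option 3 only: only the third column
print(best_rows([1, 2, 3]))  # Y can choose among all three columns
```

Restricting `columns` is how the sketch represents what Y is capable of: a single column for a fixed bot, all three for a Y that runs the full algorithm.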
“It’s not clear to me what the difference is between the TDT algorithm in your post and the method I’ve described.”
If you make sure to include a sanity check, then your description should do the same thing as the TDT algorithm in the post (at least on simple games; there may be a difference in bargaining situations).
“Sure, but does it have a short name? ProofBot?”
I understand why you might feel it’s circular to name that row TDT, but nothing simpler (unless you count ADT/UDT as simpler) does what it does. It’s a layer more complicated than Newcomblike agents (which should also be included in your table); in order to get mutual cooperation with self and also defection against CooperateBot, it deduces whether a DefectBot or a MimicBot (C if it deduces Y=C, D otherwise) has a better outcome against Y, runs a sanity check, and if that goes through it does what the preferred strategy does.
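A toy sketch of that procedure, under heavy assumptions: the real algorithm reasons about Y’s source code by proof search, whereas here each opponent is modeled as a plain response function from X’s move to Y’s move, which stands in for the inference module. The payoff numbers and all names (`PAYOFF`, `mirror_bot`, `choose_move`, and so on) are hypothetical and chosen only for illustration.

```python
# Payoffs to X, indexed by (X's move, Y's move). Illustrative values only.
PAYOFF = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}


# Assumption: an opponent Y is a function from the move X makes to the move
# Y makes in response. This replaces deduction about Y's code.
def cooperate_bot(x_move):
    return "C"


def defect_bot(x_move):
    return "D"


def mirror_bot(x_move):
    # Cooperates exactly when X cooperates.
    return x_move


def defect_strategy(y):
    """Candidate strategy: always play D."""
    return "D"


def mimic_strategy(y):
    """Candidate strategy: C if Y is found to answer C with C, D otherwise."""
    return "C" if y("C") == "C" else "D"


def choose_move(y):
    """Pick the better of DefectBot and MimicBot against y, then sanity-check."""
    def predicted(strategy):
        x_move = strategy(y)
        return PAYOFF[(x_move, y(x_move))]

    # max() keeps the first maximal item, so ties go to defect_strategy.
    preferred = max([defect_strategy, mimic_strategy], key=predicted)
    x_move = preferred(y)

    # Sanity check: run the inference once more on the move X is about to
    # make and fall back to D if the promised payoff is not reached. With
    # deterministic response functions this re-check always passes.
    if PAYOFF[(x_move, y(x_move))] < predicted(preferred):
        return "D"
    return x_move


for bot in (cooperate_bot, defect_bot, mirror_bot):
    print(bot.__name__, choose_move(bot))
```

In this toy model the agent defects against `cooperate_bot` and `defect_bot` and cooperates with `mirror_bot`, which is the pattern described above; it says nothing about how hard the underlying deduction is in general.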
Aha, I see now what you mean. Good insight!
It seems to come up all the time that people aren’t aware that CDT with a sufficiently good world model (a sufficiently accurate causal graph) is the same as TDT, even though this has been known for years. If you could address that somewhere in your sequence I think you’d save a lot of people a lot of time—it’s the most common objection to standard discourse about decision theory that I’ve seen.
I’ll discuss it in the final post.
Thanks!