This has the interesting consequence that, even without the benefit of self-modification, a CDT agent with a good model of the world ends up acting more like a TDT agent than traditional game theorists would expect.
This is a pretty common feature of comparisons between decision theories: different outcomes generally require different assumptions.
I find these things fascinating, but I think it’s important to show that you can get TDT behavior without incorporating anthropic reasoning, redefining the agent’s actions, or anything beyond the basic kind of framework that human beings already know how to program.
It’s not clear to me what the difference is between the TDT algorithm in your post and the method I’ve described. You need some method of determining the outcome pair from a strategy pair, and the inference module can (hopefully) do that. The u_f that you use is X’s utility in the outcome corresponding to the best Y outcome in row f, and picking the best of those corresponds to finding the best of the Nash equilibria (in the absence of bargaining problems). The only thing I don’t mention is the sanity check, but that should just be another run of the inference module.
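To make that concrete, here’s a minimal Python sketch of the method as I’ve described it. Everything in it is a toy stand-in: `infer` hard-codes the fixed points that a real inference module would have to deduce from the programs’ source code, and the strategy names are the usual toy agents, not code from the post.

```python
from itertools import product

# Standard PD payoffs as (X's payoff, Y's payoff) for each move pair.
PAYOFFS = {("C", "C"): (2, 2), ("C", "D"): (0, 3),
           ("D", "C"): (3, 0), ("D", "D"): (1, 1)}

STRATEGIES = ["CooperateBot", "DefectBot", "MimicBot"]

def infer(x_name, y_name):
    """Stand-in for the inference module: return the move pair a given
    strategy pair settles on. A real module would deduce this from the
    programs' source code; here the fixed points are hard-coded."""
    def move(me, other):
        if me == "CooperateBot":
            return "C"
        if me == "DefectBot":
            return "D"
        # MimicBot: C iff it deduces the other side plays C against it
        # (mutual MimicBot is taken to reach Loebian cooperation).
        return "C" if other in ("CooperateBot", "MimicBot") else "D"
    return move(x_name, y_name), move(y_name, x_name)

# The outcome table: rows are X's strategies, columns are Y's.
table = {(f, g): PAYOFFS[infer(f, g)]
         for f, g in product(STRATEGIES, STRATEGIES)}

def u(f):
    """u_f: X's payoff in row f at the column Y most prefers there."""
    best_col = max(STRATEGIES, key=lambda g: table[(f, g)][1])
    return table[(f, best_col)][0]

print(max(STRATEGIES, key=u))  # MimicBot: best row once Y best-responds
```

The point is just the shape of the computation: build the table, let Y best-respond within each row, then pick the best row.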
By the way, I wouldn’t call Option 3 CliqueBot, because CliqueBots as I defined them have problems mutually cooperating with anything whose outputs aren’t identical to theirs. I think it’s better for Option 3 to be the TDT algorithm defined in the post.
Sure, but does it have a short name? ProofBot?
(Notice that Y running the full TDT algorithm corresponds to there being multiple columns in the table: if you were running X against a CooperateBot, you’d just have the first column, and the Nash equilibrium would be (2,1) or (3,1). If you were running it against CliqueBot without a sanity check, there would just be the third column; X would think (3,3) was the Nash equilibrium, but would be in for a nasty surprise when CliqueBot rejects it because of its baggage.)
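In the sketch above, that corresponds to restricting which columns exist before picking a row; `u_restricted` is my name for the restricted version, not anything from the post.

```python
def u_restricted(f, y_strategies):
    """u_f when Y's feasible columns are restricted to y_strategies."""
    best_col = max(y_strategies, key=lambda g: table[(f, g)][1])
    return table[(f, best_col)][0]

# Only the CooperateBot column exists, so exploitation wins the row race.
print(max(STRATEGIES, key=lambda f: u_restricted(f, ["CooperateBot"])))
# -> DefectBot
```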
> It’s not clear to me what the difference is between the TDT algorithm in your post and the method I’ve described.
If you make sure to include a sanity check, then your description should do the same thing as the TDT algorithm in the post (at least in simple games; there may be a difference in bargaining situations).
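Here’s a rough sketch of what that extra run of the inference module might look like, reusing `PAYOFFS` from the sketch above. `clique_bot_move` and `sanity_check` are illustrative names, and a real check would operate on source code rather than string labels.

```python
def clique_bot_move(opponent_source):
    # CliqueBot's baggage: it cooperates only with an exact copy.
    return "C" if opponent_source == "CliqueBot" else "D"

def sanity_check(candidate, predicted, opponent_source):
    """One more run of the inference module: deduce the outcome against
    the opponent's actual source and compare with the table's promise."""
    y_move = clique_bot_move(candidate)
    x_move = y_move if candidate == "MimicBot" else "D"  # MimicBot copies
    return PAYOFFS[(x_move, y_move)] == predicted

# The table, modeling CliqueBot by behavior alone, might promise (2, 2)
# for MimicBot; the real CliqueBot rejects anything that isn't a copy.
print(sanity_check("MimicBot", (2, 2), "CliqueBot"))  # False -> rethink
```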
> Sure, but does it have a short name? ProofBot?
I understand why you might feel it’s circular to name that row TDT, but nothing simpler (unless you count ADT/UDT as simpler) does what it does. It’s a layer more complicated than Newcomblike agents (which should also be included in your table): in order to get mutual cooperation with itself and also defection against CooperateBot, it deduces whether a DefectBot or a MimicBot (C if it deduces Y=C, D otherwise) gets a better outcome against Y, runs a sanity check, and if that goes through, it does what the preferred strategy does.
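For concreteness, a toy version of that layering, built on the `infer` stub from earlier. With that stub the sanity check is trivially satisfied; the point is where the check sits in the control flow, not the check itself.

```python
def tdt_move(y_name):
    """Sketch of the layered algorithm: pick between DefectBot and
    MimicBot by deduced payoff against Y, sanity-check, then act."""
    candidates = ["DefectBot", "MimicBot"]
    # 1. Deduce which candidate strategy fares better against Y.
    best = max(candidates, key=lambda f: PAYOFFS[infer(f, y_name)][0])
    predicted = infer(best, y_name)
    # 2. Sanity check: rerun the inference module on the actual opponent.
    #    With this toy `infer` the two runs trivially agree; against a
    #    real CliqueBot (see above) they wouldn't, and X would fall back.
    if infer(best, y_name) != predicted:
        return "D"
    # 3. Do what the preferred strategy does against this opponent.
    return predicted[0]

print(tdt_move("CooperateBot"))  # D: defects against CooperateBot
print(tdt_move("MimicBot"))      # C: mutual cooperation with mimics
```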
Thanks!