The sanity check outlined above is not broad enough, since it only checks the best agents. Even if the best possible agent fails the sanity check, there could still be an improvement over the Nash equilibrium that passes.
Yup, this is where I’m going in a future post. See the footnote on this post about other variants of TDT; there’s a balance between missing workable deals against genuinely stubborn opponents, and failing to get the best possible deal from clever but flexible opponents. (And, if I haven’t made a mistake in the reasoning I haven’t checked, there is a way to use further cleverness to do still better.)
For now, note that TDT wouldn’t necessarily prefer to be a hard-coded 99% cooperator in general, since those get “screw you” mutual defections from some (stubborn) agents that mutually cooperate with TDT.
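To make that tradeoff concrete, here is a rough numerical sketch. Everything in it is an assumption for illustration rather than part of the formal setup: the payoff values, the two opponent types ("flexible" agents that keep cooperating fully, "stubborn" agents that answer anything short of full cooperation with mutual defection), and the helper names `expected_payoff` and `population_score`.

```python
# Assumed Prisoner's Dilemma payoffs with T > R > P > S (illustrative values only).
T, R, P, S = 5.0, 3.0, 1.0, 0.0


def expected_payoff(p_me, p_them):
    """Expected payoff to me when each side cooperates independently
    with the given probability."""
    return (p_me * p_them * R
            + p_me * (1 - p_them) * S
            + (1 - p_me) * p_them * T
            + (1 - p_me) * (1 - p_them) * P)


def population_score(vs_flexible, vs_stubborn, stubborn_fraction):
    """Average payoff against a population with the given share of stubborn agents."""
    return (1 - stubborn_fraction) * vs_flexible + stubborn_fraction * vs_stubborn


# Agent A: settles for plain mutual cooperation with both opponent types.
a_flexible = expected_payoff(1.0, 1.0)
a_stubborn = expected_payoff(1.0, 1.0)

# Agent B: hard-coded to cooperate 99% of the time.
# Assumption: flexible opponents still cooperate fully, while stubborn
# opponents force the "screw you" outcome, modeled here as mutual defection.
b_flexible = expected_payoff(0.99, 1.0)
b_stubborn = expected_payoff(0.0, 0.0)

for fraction in (0.0, 0.01, 0.05, 0.25):
    a = population_score(a_flexible, a_stubborn, fraction)
    b = population_score(b_flexible, b_stubborn, fraction)
    print(f"stubborn share {fraction:.2f}: A = {a:.3f}, B = {b:.3f}")
```

Under these made-up numbers, the 99% cooperator comes out ahead only when stubborn agents make up less than about 1% of the opponents; beyond that, the small gain extracted from flexible opponents is outweighed by the deals it loses.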