“Simulate my opponent, and if it tries to simulate me, see what it will do if it simulates me outputting Cooperate.”
This has the problem that, since most strategies will eval twice (to check both the C and D cases), you can be reasonably sure that if both calls return the same result you are being simulated.
Edit: Although it doesn’t fully fix the problem, this is better: eval takes a function. That function receives, as its argument, the function the other agent will call eval with, and returns C or D.
eval(function(fn) {
  if (fn(function(ignore) { return C; }) == C) return C;
  return D; // assumed: defect otherwise
});
There are still detection problems here (you could add checks to see whether the function you passed to the function that eval passed to you was invoked), but the fact that some strategies wouldn’t do overly ambitious recursion at least downgrades the above approach from an obvious information leak to an ambiguous one.
Basically, there is always going to be some clever strategy of possibly detecting whether you’re a sim. And it may be best at that point to just say so. So the game should have three outputs: C, D, and S. S gets you zero points and gets your opponent the points they’d get if you played C. It’s clearly suboptimal. So you’d only say it to signal, “I’m smarter than you and I know you’re simulating me”. And if they’ve ever seen you say it in real life, they know you’re just bluffing. I believe that this response, if you actually were a sim being run by your opponent, would be the best way to get your opponent to cooperate on the last turn.
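To make the S rule concrete, here is a minimal sketch. The payoff numbers (5, 3, 1, 0) are the usual textbook Prisoner’s Dilemma values and are an assumption, since no numbers are given here; the names `BASE`, `score`, and `payoff` are hypothetical too. It also assumes that an opponent’s S is scored as if it were C from the other player’s side, and that S against S gives both players zero, which the rule implies but doesn’t spell out.

```javascript
// Assumed standard Prisoner's Dilemma payoffs: BASE[myMove][theirMove].
const BASE = {
  C: { C: 3, D: 0 },
  D: { C: 5, D: 1 },
};

// Score for `me` given both moves; each move is "C", "D", or "S".
function score(me, opp) {
  if (me === "S") return 0;               // saying S always scores zero
  const seen = opp === "S" ? "C" : opp;   // opponent's S counts as C for me
  return BASE[me][seen];
}

// Returns [score of a, score of b].
function payoff(a, b) {
  return [score(a, b), score(b, a)];
}

console.log(payoff("S", "C")); // S player gets 0, opponent gets 3
console.log(payoff("S", "D")); // S player gets 0, opponent gets 5
console.log(payoff("C", "C")); // 3 each, for comparison
```

Under these assumed numbers S never scores more than C would and scores less whenever the opponent cooperates, so it only makes sense as a signal.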
This doesn’t solve the problem of removing obvious, trivial ways to tell if you’re a sim. But it does mean that if there are no shortcuts, so that the smarter bot will win that battle of wills, then they have something useful to say for it (beyond just “I’m TFT so you shouldn’t defect until the last turn”).
“you can be reasonably sure that if both calls return the same result you are being simulated.”
Shouldn’t it be, at most: you can be reasonably sure that you are being simulated either a) after both calls return C or b) after you formally choose D having already seen that both calls return D?
If a simulation chooses C after seeing both results D, then the simulator might as well actually defect, so it does, and the non-simulation chooses C, just like the simulation.
If an agent strongly prefers not to be a simulation and believes in TDT and is vulnerable to blackmail, they can be coerced into cooperating and sacrificing themselves in similar cases. Unless I’m wrong, of course.
One really ought to make anti-blackmail resolutions.
What about the hypothesis that the opponent isn’t optimized for the game?
The standard method of examining play history will disambiguate, perhaps requiring a probing move.