I’m not really able to evaluate the claims in the paper myself, so thanks for the input. Having said that, do you think the paper specifies the strategies with enough detail for us to code them up and test their mettle in a Less Wrong IPD tournament?
Here’s the first strategy that is explicitly stated (in the context of the 5/3/1/0 payouts):
If the last outcome was CC, cooperate with probability 11⁄13.
If the last outcome was CD, cooperate with probability 1⁄2.
If the last outcome was DC, cooperate with probability 7⁄26.
If the last outcome was DD, defect.
This supposedly “enforces an extortion factor of 3”, whatever that means.
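For anyone wanting to test this before a tournament, here is a rough sketch of one way to check it numerically. As I read the paper, an "extortion factor of 3" should mean that the extorter's long-run surplus over the mutual-defection payoff of 1 is three times the opponent's surplus, whatever the opponent does. The sketch treats both players as memory-one strategies and approximates the long-run outcome frequencies of the resulting Markov chain. The function names, the state ordering (own move first), and the sample opponents are my own assumptions, not anything from the paper.

```python
# Payoffs under the 5/3/1/0 scheme, indexed by last outcome (CC, CD, DC, DD).
# Assumption: the first letter is X's own move, the second is Y's.
X_PAYOFF = (3, 0, 5, 1)
Y_PAYOFF = (3, 5, 0, 1)

# The strategy quoted above: P(cooperate) after CC, CD, DC, DD.
EXTORT_3 = (11 / 13, 1 / 2, 7 / 26, 0.0)


def transition_matrix(p, q):
    """Markov matrix over (CC, CD, DC, DD) for X playing p against Y playing q.

    Each vector is indexed from its owner's point of view (own move first).
    """
    swap = (0, 2, 1, 3)  # what X calls CD is DC from Y's side, and vice versa
    rows = []
    for state in range(4):
        px = p[state]        # chance X cooperates next
        qy = q[swap[state]]  # chance Y cooperates next
        rows.append([px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)])
    return rows


def long_run_scores(p, q, steps=5000):
    """Approximate long-run average payoffs (X, Y) by iterating the chain.

    Uses "lazy" steps (average of staying put and moving), which leaves the
    stationary distribution unchanged but avoids oscillation in periodic chains.
    """
    m = transition_matrix(p, q)
    v = [0.25] * 4  # arbitrary starting distribution
    for _ in range(steps):
        stepped = [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]
        v = [(a + b) / 2 for a, b in zip(v, stepped)]
    sx = sum(w * s for w, s in zip(v, X_PAYOFF))
    sy = sum(w * s for w, s in zip(v, Y_PAYOFF))
    return sx, sy


if __name__ == "__main__":
    # Hypothetical opponents, each a memory-one vector from Y's own view.
    opponents = {
        "always cooperate": (1.0, 1.0, 1.0, 1.0),
        "coin flip": (0.5, 0.5, 0.5, 0.5),
        "tit for tat": (1.0, 0.0, 1.0, 0.0),
    }
    for name, q in opponents.items():
        sx, sy = long_run_scores(EXTORT_3, q)
        print(name, round(sx, 3), round(sy, 3), round((sx - 1) - 3 * (sy - 1), 9))
```

Against an opponent who always cooperates, working the two-state chain by hand gives outcome frequencies of 7/11 for CC and 4/11 for DC, so the extorter averages 41/11 and the cooperator 21/11, which matches the factor of 3. Against tit for tat the chain ends up stuck in DD and both sides get 1, which fits the point about this strategy doing poorly against TFT variants.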
Sure, but this extortionate strategy wouldn’t survive long in such a tournament because it would perform poorly against TFT variants as well as against itself.
I know that this isn’t exactly what you’re asking, but: Stewart and Plotkin tested two variants of ZD strategies in a variant of Axelrod’s original tournament; one variant (ZDGTFT-2) had the highest total score, beating TFT.
Here’s another strategy computed from the paper: cooperate with probabilities (0.9, 0.7, 0.2, 0.1) after the outcomes (CC, CD, DC, DD), respectively. Supposedly this strategy pins the opponent’s average score at 2, regardless of what he or she does. You could come up with a similar strategy to force any score in the open interval (1, 3).
I’ve yet to check how this works.
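One way to check the score-pinning claim empirically is a quick simulation. The sketch below is my own reconstruction, not code from the paper: `equalizer` builds the cooperation probabilities from a target score using the linear relation I believe the paper relies on (with 5/3/1/0 payoffs and a free negative scale `beta`, which is my name for it), and `opponent_average` plays that strategy against a few hypothetical opponents. With a target of 2 and `beta = -0.1` it reproduces (0.9, 0.7, 0.2, 0.1). The assumed opening outcome of CC is arbitrary.

```python
import random

# Opponent's payoff by last outcome (CC, CD, DC, DD), our own move listed first.
OPPONENT_PAYOFF = (3, 5, 0, 1)
INDEX = {("C", "C"): 0, ("C", "D"): 1, ("D", "C"): 2, ("D", "D"): 3}


def equalizer(target, beta=-0.1):
    """Cooperation probabilities after (CC, CD, DC, DD) that should pin the
    opponent's long-run average at `target` (assumes 5/3/1/0 payoffs)."""
    gamma = -beta * target
    p = (1 + 3 * beta + gamma, 1 + 5 * beta + gamma, gamma, beta + gamma)
    if not all(0.0 <= x <= 1.0 for x in p):
        raise ValueError("no valid probabilities for this target and beta")
    return p


def opponent_average(p, opponent, rounds=200_000, seed=0):
    """Average per-round payoff earned by `opponent` against memory-one p."""
    rng = random.Random(seed)
    mine, theirs = "C", "C"  # arbitrary assumed previous outcome
    total = 0
    for _ in range(rounds):
        next_mine = "C" if rng.random() < p[INDEX[(mine, theirs)]] else "D"
        next_theirs = opponent(theirs, mine, rng)  # opponent sees own move first
        mine, theirs = next_mine, next_theirs
        total += OPPONENT_PAYOFF[INDEX[(mine, theirs)]]
    return total / rounds


# Hypothetical opponents: (own last move, other's last move, rng) -> "C" or "D".
def always_cooperate(own, other, rng):
    return "C"


def always_defect(own, other, rng):
    return "D"


def tit_for_tat(own, other, rng):
    return other


def coin_flip(own, other, rng):
    return rng.choice("CD")


if __name__ == "__main__":
    p = equalizer(2.0)
    print([round(x, 3) for x in p])
    for opp in (always_cooperate, always_defect, tit_for_tat, coin_flip):
        print(opp.__name__, round(opponent_average(p, opp), 3))
```

If the claim holds, every opponent's average should land close to 2, up to sampling noise. The always-defect case is easy to confirm by hand: the play spends 1/4 of the time in CD and 3/4 in DD, so the defector averages 5/4 + 3/4 = 2.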
I am now reading the paper with the intent of coding these strategies into the current simulator. Wish me luck; reading Dyson is a challenge.