Epictetus comments on Open Thread, Apr. 27 - May 3, 2015

Epictetus 27 Apr 2015 5:53 UTC
10 points
There’s still some subtlety here. A Memory-0 strategy picks C with probability p and D with probability q = 1 - p, independent of any past results. If you know p and q, you can devise a strategy to optimize your score against it. The result in the paper is that this optimal counter-strategy is itself Memory-0, and that you can’t do better by increasing your memory.
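
To make the Memory-0 case concrete, here is a minimal Python sketch. The payoff values (R, S, T, P = 3, 0, 5, 1) are my assumption, not something taken from the paper, and the function names are made up for illustration. It computes my expected per-round score against an opponent who cooperates with probability p, then grid-searches my own cooperation probability x.

```python
# Assumed payoffs (not from the comment or the paper): common textbook values.
R, S, T, P = 3, 0, 5, 1  # reward, sucker's payoff, temptation, punishment


def expected_score(x, p):
    """Expected per-round score when I cooperate with probability x
    and the Memory-0 opponent cooperates with probability p."""
    q = 1 - p  # opponent's probability of defecting
    return x * p * R + x * q * S + (1 - x) * p * T + (1 - x) * q * P


def best_memory0_reply(p, steps=100):
    """Grid-search my cooperation probability x over [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda x: expected_score(x, p))


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9):
        x = best_memory0_reply(p)
        print(p, x, expected_score(x, p))
```

With these assumed payoffs the score is linear and strictly decreasing in x, so the search lands on x = 0 (always defect) for every p. That reply is itself Memory-0: since the opponent ignores history, nothing I remember can change what it does next.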

The advantage of a longer memory is that, given enough iterations, you can get a good approximation for p and q and so deduce the appropriate Memory-0 strategy. Something like Tit-for-Tat is designed to get basically the same score as its opponent (the opponent can come out ahead by at most one defection). It’s not going to do meaningfully worse than any individual opponent, but neither is it going to do better. A strategy that remembers the entire game can recognize, say, All-C and exploit it by defecting, which Tit-for-Tat can’t do.
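
A rough Python sketch of both points, under assumptions of my own: the payoffs are the usual 3/0/5/1 values, and `prober` is a hypothetical full-memory strategy I invented for illustration (it tests the opponent with one defection and keeps defecting if it was never punished). `estimate_p` is just the observed cooperation frequency.

```python
# Assumed payoffs: (my score, opponent's score) for each pair of moves.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def estimate_p(opp_history):
    """Estimate the opponent's cooperation probability from its past moves."""
    return sum(m == "C" for m in opp_history) / len(opp_history)


def tit_for_tat(my_hist, opp_hist):
    # Cooperate first, then copy the opponent's previous move.
    return "C" if not opp_hist else opp_hist[-1]


def all_c(my_hist, opp_hist):
    return "C"


def prober(my_hist, opp_hist):
    # Hypothetical strategy: play C, D, C as a probe; afterwards defect
    # for as long as the opponent has never defected, else copy its last move.
    if len(my_hist) < 3:
        return "CDC"[len(my_hist)]
    return "D" if "D" not in opp_hist else opp_hist[-1]


def play(strat_a, strat_b, rounds):
    """Play two strategies against each other and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(hist_a, hist_b)
        b = strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b


if __name__ == "__main__":
    print(play(tit_for_tat, all_c, 10))  # (30, 30)
    print(play(prober, all_c, 10))       # (46, 6)
```

Over ten rounds Tit-for-Tat only ties All-C, while the probing strategy pulls well ahead once it has seen that its defection went unpunished.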

A Memory-1 strategy is one where p and q are functions of the previous round’s outcome. In general, they’ll depend both on what the strategy did last round and what the opponent did last round. There are four possible results (C-C, C-D, D-C, D-D), which means that the strategy will have up to four distinct probabilities for cooperation next round. If you can learn those, you can come up with the optimal strategy for playing against it. That optimal counter-strategy can itself be modeled as a Memory-1 strategy.
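
Here is one way to sketch this in Python, with several assumptions of mine: the payoffs are the usual 3/0/5/1 values, a Memory-1 strategy is stored as a dict of four cooperation probabilities keyed by (own last move, opponent's last move), and the first round's outcome is given rather than chosen. The search only tries the 16 deterministic Memory-1 replies, which is a simplification and not the paper's method.

```python
from itertools import product

# States are (my last move, their last move); payoffs are my score (assumed values).
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# Probability of cooperating, keyed by (own last move, opponent's last move).
TIT_FOR_TAT = {s: 1.0 if s[1] == "C" else 0.0 for s in STATES}
ALL_C = {s: 1.0 for s in STATES}
ALL_D = {s: 0.0 for s in STATES}


def step(dist, mine, theirs):
    """Advance the distribution over last-round outcomes by one round."""
    new = {s: 0.0 for s in STATES}
    for (me, them), mass in dist.items():
        a = mine[(me, them)]    # my chance of cooperating next
        b = theirs[(them, me)]  # their chance, seen from their side
        new[("C", "C")] += mass * a * b
        new[("C", "D")] += mass * a * (1 - b)
        new[("D", "C")] += mass * (1 - a) * b
        new[("D", "D")] += mass * (1 - a) * (1 - b)
    return new


def average_payoff(mine, theirs, first=("C", "C"), rounds=1000):
    """Exact expected per-round payoff for me, starting from a given first round."""
    dist = {s: 0.0 for s in STATES}
    dist[first] = 1.0
    total = 0.0
    for _ in range(rounds):
        total += sum(mass * PAYOFF[s] for s, mass in dist.items())
        dist = step(dist, mine, theirs)
    return total / rounds


def best_pure_reply(theirs, **kwargs):
    """Brute-force the 16 deterministic Memory-1 replies; return (score, reply)."""
    best = None
    for probs in product((0.0, 1.0), repeat=4):
        mine = dict(zip(STATES, probs))
        score = average_payoff(mine, theirs, **kwargs)
        if best is None or score > best[0]:
            best = (score, mine)
    return best


if __name__ == "__main__":
    print(best_pure_reply(TIT_FOR_TAT)[0])
    print(best_pure_reply(ALL_C)[0])
```

Starting from mutual cooperation, the best deterministic reply to Tit-for-Tat in this sketch keeps cooperating, while the best reply to All-C defects every round, so the two opponents call for different Memory-1 replies.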

The big difference, I think, is that having a longer memory is helpful if you’re in a diverse environment. In any individual game, there’s always a strategy with a shorter memory that will do as well as yours. However, the same short-memory strategy will not be optimal against every opponent, while you can use your longer memory to devise the best short-memory strategy for a given match.