Prisoner’s Dilemma as a Game Theory Laboratory

prase25 Aug 2011 14:30 UTC

22 points

Last year Yvain had organised a Diplomacy game between LessWrong users to test how well we perform in practical application of game theory. At least two games had been played, but as far as I know no analysis was made afterwards. One reason is probably that few games involving complex interactions between players constitute at most anecdotal evidence for whatever hypothesis one may test. The second one is lack of comparison to outside players. Although the games were fun, their value as a game theory experiment remains rather low. Could we test our game theoretic skills in a statistically more significant way?

Only recently I have learned about Robert Axelrod’s experiment in which he run a competition of different strategies playing iteraded prisoner’s dilemma, and got an idea to replicate it. I have already run a similar experiment with five contestants (all being my friends) and now a second run is being prepared, with at least nine strategies in the pool. I am interested in a third run, this time with strategies nominated by LessWrongers. The contestants of the second run which has identical rules are readers of my blog and neither of them is probably familiar with specific LW ideas. Therefore, they would serve as a fairly good control group to test LW’s applied rationality skills (or a subset of). After matching the strategies in both groups separately, I plan to put all of them together and see who wins.

So, if you want to participate in this contest, feel free to send me your strategy. The rules are following.

By a strategy I mean a program sent by a contestant or coded according to his/her instructions. The strategies compete in iterated prisoner’s dilemmas. A single iteration I will call a turn. In each turn each strategy has to choose between cooperating and defecting. The payoffs are:
- if both cooperate, 4 points for each
- if both defect, 1 point for each
- else 7 points for the defector and 0 points for its cooperating opponent
By a match I mean a series of 100 iterations between the same opponents.
There will be two different competitions, the round-robin tournament and the evolutionary tournament. Two separate final standings will be made, one for each tournament. Any received strategy has to participate in both.
- In the round-robin tournament strategies will play one match against each other (not against a copy of themselves). The winner will be the strategy which acquires the highest total number of points. Number of won or lost matches is disregarded.
- The evolutionary tournament will simulate evolution of a population of strategies. In the beginning, equal number of copies of each strategy is present in the pool. Then the strategies are paired randomly (now a strategy may be paired against a copy of itself) and each pair plays one match. In the next generation pool, the strategies will be represented in numbers proportional to the total number of points won by all its copies in the present generation. The total population will be maintained at a constant level (probably 2,000 strategies, up to rounding errors). Strategies may go extinct. The strategy with the highest number of copies after the 100th generation will be considered a winner.
A strategy has access to results of all previous turns in its current match and the number of current turn. It can decide randomly (a pseudo-random generator will be used). It does not have access to results of other matches (including its own previous matches), the number of current generation, population sizes, number of points won by any strategy in any phase of the tournament (except the number of points already won in the current match which can be calculated from the previous turns results) and its opponent’s identity (namely it doesn’t know whether it plays against a copy of itself). The strategies obviously can’t read their opponent’s source code.
Each person can send only one strategy. Be honest.
The strategy can be described in any comprehensible language. If I find problems in understanding, which will probably happen if the strategy is described in Lisp or Basque, but can happen even if it is written in English, I will ask.
You should send your strategies by private message, not in comments to this post. Your opponents shouldn’t know what you have prepared.
The strategy needn’t be original, but if I get two identical strategies, I will treat them as one.
A Fully Random strategy is automatically included in the tournament. It plays defect or cooperate randomly with 50% chance each turn.
Names of the authors of the strategies will be published by default. If you wish your name excluded, specify it in your message, and your strategy will compete anonymously.

The simulation will probably not be run before at least eight strategies are collected and before the beginning of September. The competition is closed, no new strategies are accepted at this moment. 21 different strategies were accepted, their implementations are now being tested. Results will be probably posted on Sunday 4th September.

[Edit: Found inconsistency in using words round and turn to denote the same thing. Now turn is used everywhere.]

What links here?

prase25 Aug 2011 14:30 UTC

22 points

47 comments3 min readLW link Archive

Prisoner's Dilemma

RobertLumley 25 Aug 2011 17:38 UTC
15 points
I wish I had been around when the initial Diplomacy game was organized. Diplomacy is a hobby of mine. And that project would have been much better organized (and recorded) on the website where I play, WebDiplomacy.

Furthermore, if you wanted a more easily analyzed metric, you can play “gunboat” diplomacy, where you’re not allowed to communicate between countries, and the only communication are by moves—you can use support holds to indicate a request for an alliance, etc. I particularly enjoy this form of diplomacy, as it’s far more like seven player chess.

A lot of the people on WebDiplomacy are math/science people; the two communities (if LessWrong could get over the HUGE number of trolls on WebDip) would actually get along really well. There are a number of metrics that they collect, but below is a link to a project that one person did, wherein, among other things, he analyzed the most statistically successful openings for the various countries. Also, below that, I have a link to my personal statistics, if you’d find those interesting.

http://tinyurl.com/DiplomacyReport

http://tinyurl.com/SmileysDipStats

I figure I’ve promoted LessWrong on WebDip enough, I ought to promote WebDip here too… :-)
- Randaly 25 Aug 2011 18:39 UTC
  2 points
  Parent
  There was another game, mostly including the NY members, which took place on WebDiplomacy. The game is here; Zvi Mowshowitz won.
  - RobertLumley 25 Aug 2011 19:01 UTC
    3 points
    Parent
    Errrrr.… Points Per Supply Center...
    
    Certainly excusable for LWers, but any serious play on WebDip is WTA. :-P
    
    This is a topic highly debated on WebDiplomacy, (I could go on) but Babak said it best when he said (paraphrasing) “Playing for a strong second in PPSC in diplomacy is like shooting it in your own goal in soccer just because you want to score a goal.”
    What links here?
    RobertLumley's comment on LW September WebDiplomacy Games: Starting soon! by Vaniver (2 Sep 2011 22:00 UTC; 5 points)
- prase 25 Aug 2011 18:32 UTC
  2 points
  Parent
  
  And that project would have been much better organized (and recorded) on the website where I play, WebDiplomacy.
  
  Perhaps it would, but Yvain’s moderation and reports after each turn gave it a unique flavour which standard WebDip games lack. Anyway, it may be easy to organise another game for LWers on WebDip.
  - RobertLumley 25 Aug 2011 18:35 UTC
    1 point
    Parent
    I’d love to play, if there’s interest.
    - Vaniver 25 Aug 2011 19:46 UTC
      2 points
      Parent
      There will definitely be tons of interest. Since I claimed I was willing to set up another game after winning one of the LW games, I probably ought to go ahead and make this happen.
      - RobertLumley 25 Aug 2011 20:06 UTC
        1 point
        Parent
        I’m happy to help if you want. I was considering just making a post myself, but I wanted to gauge whether or not there was any interest at all.
        
        And here I was thinking I might take a bit of a break from diplomacy after my games finish up so I can focus more on my school work...
        Vaniver 25 Aug 2011 21:13 UTC
        0 points
        Parent
        Thanks for the offer! I’m planning on mostly doing the administrative stuff of sorting people, as I’ve never used WebDiplomacy before. (I imagine it’s pretty easy to use, but experience generally helps). The previous post has been updated with a link, if you’d like to sign up.
lessdazed 25 Aug 2011 17:57 UTC
9 points
This is a fantastic idea.

My strategy is to build up an immunity to iocane powder. If you need help coding that just let me know.
lessdazed 25 Aug 2011 20:20 UTC
7 points
I suggest the submission guidelines be modified to encourage naming the strategies. That would be fun.
vespiacic 27 Aug 2011 12:51 UTC
4 points
When does the tournament start?
- prase 31 Aug 2011 14:42 UTC
  0 points
  Parent
  I will have it done this weekend, if I don’t meet unexpected complications.
Eugine_Nier 5 Sep 2011 14:30 UTC
2 points
Well, it’s already September 5th. Where are the results?
malthrin 25 Aug 2011 18:14 UTC
2 points
Nifty idea. Would folks be interested in this sort of thing as a web site?
- prase 25 Aug 2011 18:34 UTC
  0 points
  Parent
  What exactly do you have in mind?
  - malthrin 25 Aug 2011 18:50 UTC
    5 points
    Parent
    I’m envisioning:
    
    A textual interface where people can enter strategies in psuedocode or possibly a programming language that I can sandbox.
    A scheduled job to run games along the lines you’ve described above every day? hour?
    A page to display the outcome of previous tournaments.
    - prase 25 Aug 2011 19:19 UTC
      4 points
      Parent
      I would expect it collapse into the Nash equilibrium quite soon.
      - benelliott 1 Sep 2011 16:55 UTC
        3 points
        Parent
        This would be an interesting result, since the Nash equilibrium of fixed length prisoner’s dilemma is defectbot, and defectbots have not had a great rate of success in past tournaments.
        prase 1 Sep 2011 17:04 UTC
        1 point
        Parent
        Yes. It would depend on concrete implementation and number of players. The numbers needed for defectbots to actually thrive are very big and it would take a long time to converge into Nash, so I was perhaps a bit overconfident with my prediction.
        red75 1 Sep 2011 20:03 UTC
        3 points
        Parent
        Moreover, simulations I ran using your rules for evolutionary tournament show that one strategy quickly dominates and others go extinct. Defectbot is among strategies which are fastest to go extinct (even in presence of cooperatebot) as it feeds off overaltruist strategies, which in turn fail to compete with tit-for-tat. So I doubt that at least evolutionary tournament will converge into Nash.
        
        I predict that strategy that tit-for-tats 99 turns and defects on 100-th one will win in evolutionary tournament, given that tit-for-tat is also in the population.
        
        ETA: I’ve sent another strategy.
jimrandomh 25 Aug 2011 16:37 UTC
2 points

The strategy can be described in any comprehensible language. If I find problems in understanding, which will probably happen if the strategy is described in Lisp or Basque, but can happen even if it is written in English, I will ask.

You’re going to be translating these all into computer programs so you can run them, right? You should specify which language it’s going to be, so we can save you some work.
- prase 25 Aug 2011 16:55 UTC
  0 points
  Parent
  I have already a functional code written in Wolfram Mathematica which simulates the tournaments.
  
  If the number of strategies will not be too big, it is easier for me to code them than to write down the instructions needed for the readers to know how I represent the data, not to speak about help for those unfamiliar with Mathematica’s syntax. Simple strategies are usually one-liners, e.g. tit-for-tat has 29 characters.
  - jimrandomh 25 Aug 2011 17:07 UTC
    9 points
    Parent
    Okay, but what about complicated strategies? Even someone playing tit-for-tat will want to augment it with a special case to switch to defection rock when they’re playing against Fully Random.
    - prase 25 Aug 2011 17:11 UTC
      0 points
      Parent
      If that happens more than once or twice, then I will probably ask the author to code it.
Caerbannog 25 Aug 2011 15:54 UTC
2 points
This should be interesting. I’ve sent you my strategy by private message.
Solvent 28 Aug 2011 3:20 UTC
1 point
I just read this article about a similar prisoner’s dilemma competition here. Tit for tat won.

It’s possible that this was assumed knowledge that everyone has but me, but perhaps someone didn’t know.
- prase 1 Sep 2011 20:19 UTC
  1 point
  Parent
  I have written about that at the beginning of the second paragraph.
  - Solvent 1 Sep 2011 22:51 UTC
    0 points
    Parent
    ...That was a little slow of me. Sorry.
red75 31 Aug 2011 7:22 UTC
0 points
In the meantime I’ve run my own simulation, studying a group of strategies, which perform as tit-for-tat except that at specific turn they defect and then they use result of this turn to switch to defect stone or continue tit-for-tatting. Thus they recognize copies of itself and cooperate with them. Such strategy can be exploited by switching to defect stone before it does, or by mimicking its behavior (second defect check after first. This case I didn’t analyze).

It leads to interesting results in evolutionary tournament. Second fairest (second longest period of tit-for-tatting) strategy wins. It outperforms less fair strategies by longest cooperation with fairest strategy. And it outperforms fairest strategy by exploiting it.
- prase 1 Sep 2011 20:17 UTC
  0 points
  Parent
  Define strategy S[n] as TfT until turn n and defect ever since. In the limit of infinite population having non-zero initial number of S[n] for each n, S[0], i.e. DefectBot, eventually dominates. Starting with equal subpopulations, initially most successful is S[99] which preys on S[100] and finally drives it to extinction. But then, S[98] gains advantage over S[99] and so on.
  
  With not so big population however, the more defectorish strategies die out sooner than the environment becomes suitable for them. (I have done it with population of 2000 strategies and the lowest surviving after several hundred generations was S[80] or so).
  - red75 1 Sep 2011 20:30 UTC
    2 points
    Parent
    Try another strategy. I[n] - TfT until turn n, defect on turn n, on later turns check if result on turn n was (defect,defect) and play TfT, otherwise defect. Idea is selfcooperation.
    What links here?
    red75's comment on Prisoner’s Dilemma Tournament Results by prase (5 Sep 2011 19:08 UTC; 2 points)
  - Eugine_Nier 2 Sep 2011 2:53 UTC
    0 points
    Parent
    Except that in the initial state S[0] will get driven to extinction long before s[100] will.
    - prase 2 Sep 2011 10:59 UTC
      2 points
      Parent
      With reasonably sized population, yes. In the limit of infinite population, or with sufficiently large population, no.
Nic_Smith 26 Aug 2011 7:05 UTC
0 points
I hope the strategy I wrote in PHP is not too obscure. That was quite fun.
- prase 26 Aug 2011 12:06 UTC
  0 points
  Parent
  Also hope so. Still haven’t gone through.
Eugine_Nier 26 Aug 2011 6:03 UTC
0 points
Submitted a strategy.
- lessdazed 26 Aug 2011 7:44 UTC
  0 points
  Parent
  Does it have a name?
  - Eugine_Nier 27 Aug 2011 8:00 UTC
    0 points
    Parent
    Haven’t thought of one.
ArisKatsaris 26 Aug 2011 5:45 UTC
0 points
I’ve sent a strategy, coded in Java.
DuncanS 25 Aug 2011 23:29 UTC
0 points
I’ve sent a strategy in—looking forward to the results.
[deleted] 25 Aug 2011 20:16 UTC
0 points
I have submitted a strategy and look forward to seeing the result.
Vaniver 25 Aug 2011 19:40 UTC
0 points
So, anyone want to bet on what proportion of strategies will be tit-for-tat? I’m fairly confident it’ll be more than half.

(I should be clear, I mean the number of submitted strategies, as duplicate strategies are treated as one strategy.)
What links here?
- Prisoner’s Dilemma Tournament Results by prase (6 Sep 2011 0:46 UTC; 151 points)
- benelliott 25 Aug 2011 20:09 UTC
  2 points
  Parent
  In a tournament with a fixed number of rounds and a known random strategy, it would be a bit weird to just use vanilla tit-for-tat, but yeah. It will be a bit boring if all the strategies are nice.
  - torekp 25 Aug 2011 22:42 UTC
    3 points
    Parent
    
    tournament with a fixed number of rounds
    
    That’s a bit disappointing, but so be it.
    - benelliott 25 Aug 2011 23:28 UTC
      1 point
      Parent
      It did say so in the OP.
      
      Besides, its not such a big deal, the paradox of induction is an interesting problem, if I was very optimistic I might suggest that this could throw some light on it.
- jhuffman 25 Aug 2011 20:57 UTC
  1 point
  Parent
  If you expect to play against tit-for-tat strategies and a random strategy then you would know that tit-for-tat cannot be expected to win. I’d be surprised if lots of people submit strategies that have almost no chance of winning.
Jem 25 Aug 2011 17:00 UTC
0 points
What a great idea! I’ve sent you my strategy.