Variants I’d like to see:
1) You can observe rounds played by other bots.
2) You can partially observe rounds played by other bots.
3) (The really interesting one.) You get a copy of the other bot’s source code and are allowed to analyze it. All bots have 10,000 instructions per turn, and if you run out of time the round is negated (both players score 0 points). There is a standard function for spending X instructions evaluating a piece of quoted code, and if the evaled code tries to eval code, it asks the outer eval-ing function whether it should simulate faithfully or return a particular value. (This enables you to say, “Simulate my opponent, and if it tries to simulate me, see what it will do if it simulates me outputting Cooperate.”)
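The intercepting eval described here can be sketched in miniature. This is a toy model, not Eliezer's actual spec: the tuple representation of quoted code and all the names are invented for illustration.

```python
# Toy model of the intercepting eval: quoted code is represented as
# ('const', move) or ('eval', subcode). The outer caller decides whether
# an inner eval is simulated faithfully or replaced by a fixed answer.

def ev(code, on_inner_eval):
    """Evaluate quoted `code`. When an inner ('eval', ...) form is reached,
    `on_inner_eval(subcode)` may return a value to substitute for that
    inner evaluation, or None to simulate it faithfully."""
    op = code[0]
    if op == 'const':
        return code[1]
    if op == 'eval':
        forced = on_inner_eval(code[1])
        if forced is not None:
            return forced
        return ev(code[1], on_inner_eval)
    raise ValueError('unknown form: %r' % (op,))

# An "opponent" that simulates us and copies our move. Pretend the inner
# quoted code is us, and that in a faithful run we would play D:
opponent = ('eval', ('const', 'D'))

faithful = ev(opponent, lambda sub: None)  # simulate the inner eval for real
forced = ev(opponent, lambda sub: 'C')     # "see what it will do if it
                                           # simulates me outputting Cooperate"
```

Here `faithful` comes out 'D' while `forced` comes out 'C': the outer caller dictated what the opponent's simulation of us reported.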
Oh cool! You allow an agent to see how their opponent would respond when playing a 3rd agent (just call run with different source code).
[Edit: which allows for arbitrary message passing—the coop bots might all agree to coop with anyone who coops with (return C)]
However, you also allow for trivially determining whether an agent is being simulated: simply check how much fuel there is, which is probably not what we want.
The only way to check your fuel is to run out—unless I goofed.
You could call that message passing, though conventionally that names a kind of overt influence of one running agent on another, all kinds of which are supposed to be excluded.
It shouldn’t be hard to do variations where you can only run the other player and not look at their source code.
I’m not a native schemer, but it looks like you can check fuel by calling run with a large number and seeing if it fails to return… e.g. (eq (run 9999 (return C) (return C)) 'exhausted) [note that this can cost fuel, and so should be done at the end of an agent to decide if returning the “real” value is a good idea]
giving us the naive DefectorBot of (if (eq (run 9999 (return C) (return C)) 'exhausted) C D)
[Edit: and for detecting run-function-swap-out:
(if (neq (run 10000000 (return C) (return C)) 'exhausted) C ;; someone swapped out RUN and is simulating us
    (if (eq (run 9999 (return C) (return C)) 'exhausted) C ;; someone is simulating us more cleverly
        D)) ]
[Edit 2: Is there a better way to paste code on LW?]
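The fuel probe in the snippet above can be mocked up like so. This is a sketch: `run` is passed in as a parameter, since the real one is a Scheme primitive with its own fuel accounting, and the (remaining_fuel, result) return shape is assumed from the transcript elsewhere in the thread.

```python
# Sketch of the fuel probe: if a sub-run with a large budget reports
# 'exhausted', our own allotment must be smaller than that budget, so we
# are probably inside someone's simulation. `run(fuel, program, opponent)`
# is assumed to return (remaining_fuel, result), with result 'exhausted'
# when the budget ran out.

def is_probably_simulated(run, budget=9999):
    _, result = run(budget, "(return C)", "(return C)")
    return result == 'exhausted'

def naive_defector(run):
    # Look cooperative inside simulations; defect in the real game.
    return 'C' if is_probably_simulated(run) else 'D'
```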
Re: not showing source: Okay, but I do think it would be awesome if we get bots that only cooperate with bots who would cooperate with (return C)
On message passing as described, that’d be a bug if you could do it here. The agents are confined. (There is a side channel from resource consumption, but other agents within the system can’t see it, since they run deterministically.)
When you call RUN, one of two things happens: it produces a result or you die from exhaustion. If you die, you can’t act. If you get a result, you now know something about how much fuel there was before, at the cost of having used it up. The remaining fuel might be any amount in your prior, minus the amount used.
At the Scheme prompt:
(run 10000 '(equal? 'exhausted (cadr (run 1000 '((lambda (f) (f f)) (lambda (f) (f f))) (global-environment)))) global-environment)
; result: (8985 #t) ; The subrun completed and we find #t for yes, it ran to exhaustion.
(run 100 '(equal? 'exhausted (cadr (run 1000 '((lambda (f) (f f)) (lambda (f) (f f))) (global-environment)))) global-environment)
; result: (0 exhausted) ; Oops, we never got back to our EQUAL? test.
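A toy version of that fuel accounting, in Python rather than Scheme. Every name here is made up; it only mirrors the shape of the transcript above, where each step drains a shared pool and a sub-run can report exhaustion without killing its caller.

```python
# Toy fuel accounting: a sub-run that hits its own limit reports
# 'exhausted' to its caller, while draining the whole pool kills the
# machine outright (death at "top level").

class PoolExhausted(Exception):
    """Raised when the global fuel pool is empty: the agent dies."""

def run(limit, program, pool):
    """Run `program(step)` with at most `limit` of its own fuel, drawn from
    `pool` (a one-element list). Returns (pool_remaining, result), where
    result is 'exhausted' if this run's own limit was hit first."""
    spent = [0]

    class LimitHit(Exception):
        pass

    def step():
        if pool[0] <= 0:
            raise PoolExhausted()
        if spent[0] >= limit:
            raise LimitHit()
        pool[0] -= 1
        spent[0] += 1

    try:
        result = program(step)
    except LimitHit:
        return (pool[0], 'exhausted')
    return (pool[0], result)

def loop_forever(step):
    while True:
        step()

def outer(step):
    step()  # pay a little for our own test
    _, res = run(1000, loop_forever, pool)
    return res == 'exhausted'

pool = [10000]
# run(10000, outer, pool) mirrors the first transcript line: the inner
# run dies at its 1000 limit and we survive with fuel to spare.
```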
Awesome! The only suggestion I have is to pass in a putative history and/or tournament parameters to an agent in the evaluation function so the agent can do simple things like implement tit-for-tat on the history, or do complicated things like probing the late-game behavior of other agents early in the game. (E.G. “If you think this is the last round, what do you do?”)
Thanks! Yes, I figure one-shot and iterated PDs might both hold interest, and the one-shot came first since it’s simpler. That’s a neat idea about probing ahead.
if you run out of time the round is negated (both players score 0 points).
This makes the game matrix bigger for no reason. Maybe replace this with “if you run out of time, you automatically defect”?
There is a standard function for spending X instructions evaluating a piece of quoted code, and if the evaled code tries to eval code, it asks the outer eval-ing function whether it should simulate faithfully or return a particular value.
Haha, this incentivizes players to reimplement eval themselves and avoid your broken one! One way to keep eval as a built-in would be to make code an opaque blob that can be analyzed only by functions like eval. I suggested this version a while ago :-)
I may be misreading, but I don’t see how Eliezer’s eval is broken. It can choose between a faithful evaluation and one in which inner calls to eval are replaced with a given value. That’s more powerful than standard.
This makes the game matrix bigger for no reason. Maybe replace this with “if you run out of time, you automatically defect”?
Slightly better: allow the program to set the output at any time during execution (so it can set its own time-out value before trying expensive operations).
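One way that suggestion could look, as a hypothetical sketch: the agent writes a provisional move before each expensive step, and the referee falls back to whatever was last written if time runs out. The generator-based "budget" is just a stand-in for an instruction counter.

```python
# Sketch of "set the output at any time": a timeout no longer negates the
# round; it just freezes the agent's last committed move.

class Agent:
    def __init__(self):
        self.output = 'D'          # safe default before any thinking

    def set_output(self, move):
        self.output = move         # provisional answer in case we time out

def referee(agent, think, budget):
    """Advance the `think(agent)` generator at most `budget` steps, then
    read off whatever output the agent last committed to."""
    gen = think(agent)
    try:
        for _ in range(budget):
            next(gen)
    except StopIteration:
        pass                       # the agent finished early
    return agent.output

def cautious(agent):
    agent.set_output('C')          # if we time out below, we cooperate
    for _ in range(10**6):         # "expensive operation" that won't finish
        yield
    agent.set_output('D')          # only reached with a huge budget
```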
(This enables you to say, “Simulate my opponent, and if it tries to simulate me, see what it will do if it simulates me outputting Cooperate.”)
But it will not be simulating “you”; it will be simulating the game under alternative assumptions about your own behavior, and you ought to behave differently depending on what those assumptions are; otherwise you’ll think the opponent will think defection is better, making it true.
I’ve written a pretty good program to implement variant 3, but I’m such a lurker on this site that I lack the necessary karma to submit it as an article. So here it is as a comment instead:
Inspired by prase’s contest (results here) and Eliezer_Yudkowsky’s comment, I’ve written a program that plays iterated games of prisoner’s dilemma, with the caveat that each player can simulate his opponent. I’m now running my own contest. You can enter your own strategy by sending it to me in a message. The deadline to enter is Sunday, September 25. The contest itself will run shortly thereafter.
Each strategy plays 100 iterations against each other strategy, with cumulative points. Each specific iteration is called a turn; each set of 100 iterations is called a match. (Thanks, prase, for the terminology.) Scoring is CC: (4,4) DC: (7,0) CD: (0,7) DD: (2,2).
Every turn, each strategy receives a record of all the moves it and its opponent have made this match, a lump of time to work with, and its opponent’s code. Strategies can’t review enemy code directly, but they can run it through any simulations they want before deciding their move, within the limits of the amount of time they have to work.
Note that strategies cannot pass information to themselves between iterations or otherwise store it, other than the record of decisions. They start each turn anew. This way any strategy can be simulated with any arbitrary record of moves without running into issues.
Strategies in simulations need an enemy strategy passed into them. To avoid infinite recursion of simulations, they are forbidden from passing themselves. They can have as many secondary strategies as needed, however. This creates a string of simulations where strategy(1) simulates enemy(1), passing in strategy(2), which enemy(1) simulates, passing in enemy(2), which strategy(2) simulates, passing in strategy(3). The final sub-strategy passed in by such a chain must be a strategy which performs NO simulations.
For example, the first sub-strategy could be an exact duplicate of the main strategy, except that it passes the tit_for_tat program instead. So main_strategy ⇒ sub_strategy1 ⇒ tit_for_tat.
You can of course use as many different sub_strategies as you need; the programs are limited by processing time, not memory. Strategies can run their simulated opponent on any history they devise, playing whichever side they choose.
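The substitution chain can be sketched like this, with Python standing in for MATLAB. The strategy signature (history plus an opponent to simulate) is invented for illustration, not the contest's actual interface.

```python
# Sketch of the substitute-strategy chain: each simulator hands over a
# substitute in place of itself, and the chain bottoms out at a strategy
# that performs no simulations (tit_for_tat here).

def tit_for_tat(history, opponent):
    # history: list of (my_move, their_move) pairs; no simulation, so this
    # can safely terminate any chain of substitutions.
    return history[-1][1] if history else 'C'

def make_simulator(substitute):
    """A strategy that predicts its opponent's next move by simulation,
    passing `substitute` instead of itself (never itself, per the
    no-self-passing rule), then mirrors the predicted move."""
    def strategy(history, opponent):
        # Flip the history so the opponent sees it from its own side.
        mirrored = [(theirs, mine) for mine, theirs in history]
        return opponent(mirrored, substitute)
    return strategy

# main_strategy -> sub_strategy1 -> tit_for_tat, as in the example above
sub_strategy1 = make_simulator(tit_for_tat)
main_strategy = make_simulator(sub_strategy1)
```

Even in self-play this terminates, because each level swaps in the next substitute rather than itself.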
Strategies can’t read the name of their opponent, see the number of strategies in the game, watch any other matches or see any scores outside their current match.
Strategies are not cut off if they run out of time, but both will receive 0 points for the turn. The decisions of the turn will be recorded as normal.
I never figured out a good way to keep strategies from realizing they were being simulated simply by looking at how much time they were given. Not knowing how much time they have would make it prohibitively difficult to avoid timing out. My hack solution is not to give a definitive amount of time for the main contest, but instead a range: from 0.01 seconds to 0.1 seconds per turn, with the true time known only to me. This is far from ideal, and if anyone has a better suggestion I’m all ears.
To give a reference: a strategy that runs 2 single-turn simulations of tit_for_tat against tit_for_tat takes, on average, 3.5×10^-4 seconds. Running only 1 single-turn simulation took only 1.5×10^-4 seconds. tit_for_tat by itself takes about 2.5×10^-5 seconds to run a single turn. Unfortunately, due to factors outside of my control, MATLAB (the software I’m running) will for unknown reasons take 3 to 5 times as long as normal about 1 time in 100. Leave yourself some buffer.
Strategies are NOT allowed a random number generator. This is different from prase’s contest but I would like to see strategies for dealing with enemy intelligence trying to figure them out without resorting to unpredictability.
I’ve come up with a couple of simple example strategies that will be performing in the contest.
Careful_cheater:
Simulates its opponent against tit_for_tat to see its next move. If the enemy defects, Careful_cheater defects. If the enemy cooperates, Careful_cheater simulates the enemy’s next move after that, with a record showing Careful_cheater defecting this turn, again passing tit_for_tat. If the opponent would have defected on the next turn, Careful_cheater cooperates; but if the opponent would have cooperated despite Careful_cheater’s defection, it goes ahead and defects.
Simulations show it doing evenly against tit_for_tat and its usual variations, but Careful_cheater eats forgiving programs like tit_for_2_tats alive.
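Careful_cheater's decision rule, as described above, can be sketched as follows. The `simulate(opponent, history, substitute)` helper is hypothetical: it stands in for the contest's "run the opponent on a devised history, passing tit_for_tat" machinery.

```python
# Sketch of Careful_cheater: defect only when the opponent both cooperates
# now and would NOT punish a defection next turn.

def careful_cheater(history, opponent, simulate, tit_for_tat):
    their_next = simulate(opponent, history, tit_for_tat)
    if their_next == 'D':
        return 'D'                       # they defect anyway: defect back
    # They would cooperate. Probe: what do they do after we defect on them?
    probed_history = history + [('D', 'C')]
    retaliation = simulate(opponent, probed_history, tit_for_tat)
    if retaliation == 'D':
        return 'C'                       # they would punish us: behave
    return 'D'                           # too forgiving: exploit them
```

Against a retaliator (tit_for_tat-like) this cooperates; against a pushover (tit_for_2_tats-like) it defects, matching the description above.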
Everyman:
Simulates 10-turn matches with its opponent against several possible basic strategies, including tit_for_tat, tit_for_tat_optimist (cooperates the first two turns), and several others, then compares the scores of each match and plays as the highest-scoring one.
Copy_cat:
Switches player numbers and simulates its opponent playing Copy_cat’s record while passing its opponent, and then performs that action. That is to say, it sees what its opponent would do if it were in Copy_cat’s position and up against itself.
This is basically DuncanS’s strategy from the comments. DuncanS, you’re free to enter another one, since I stole yours as an example.
And of course tit_for_tat and strategy “I” from prase’s contest will be competing as well.
One strategy per person. All you need for a strategy is a complete written description, though I may message back for clarification. I reserve the right to veto any strategy I deem prohibitively difficult to implement.
So what happens if copy_cat actually does go up against itself, or a semantically similar bot? Infinite recursion and zero points for both? Or am I missing something?
Actually, it seems that crashes it. Thanks for the catch. I hadn’t tested copy_cat against itself and should have. Forbidding strategies to pass their opponent fixes that, but it does indicate my program may not be as stable as I thought. I’m going to have to spend a few more days checking for bugs, since I missed one that big. Thanks, eugman.
No problem. I’ve been thinking about it, and one should be able to recursively prove whether a strategy bottoms out. That should fix it, as long as the user passes terminating strategies. A warning about Everyman: if it goes against itself, it will run n^2 matches, where n is the number of strategies tested. Time consumed is an interesting issue, because if you take a long time it’s like being a DefectBot against smart bots: you punish bots that try to simulate you, but you lose points too.
I didn’t get into it earlier, but Everyman’s a little more complicated. It runs through each match test one at a time, from most likely to least, and checks after each match how much time it has left; if it starts to run out, it just leaves with the best strategy it’s figured out. By controlling how much time you pass in, strategies can avoid n^2 processing problems. That’s why I thought it was so necessary to include, even if it does give hints about whether strategies are being simulated or not. Everyman was built as a sort of upper bound; it’s one of the least efficient strategies one might implement.
Here’s a very fancy cliquebot (with a couple of other characteristics) that I came up with.
The bot is in one of 4 “modes”—SELF, HOSTILE, ALLY, PASSWORD.
Before turn 1:
Simulate the enemy for 20 turns against DCCCDDCDDD followed by Cs thereafter. If the enemy responds with 10 Ds followed by CCDCDDDCDC, then change to mode SELF. (These are pretty much random strings; the only requirement is that the first begins with D.)
Simulate the enemy for 10 turns against DefectBot. If the enemy cooperates in all 10 turns, change to mode HOSTILE.
Else be in mode ALLY.
In any given turn,
If in mode SELF, cooperate always.
If in mode HOSTILE, defect always.
If in mode ALLY, simulate the enemy against TFT for the next 2 turns. If the enemy defects on either of these turns, or defected on the last turn, switch to mode HOSTILE and defect. Exception: if it is move 1 and the enemy will play DC, then switch to mode PASSWORD and defect. Defect on the last move.
If in mode PASSWORD, check whether the enemy’s moves, beginning from move 1, have varied from DCCCDDCDDD followed by Cs thereafter. If so, switch to mode HOSTILE and defect. Else defect on moves 1-10, play CCDCDDDCDC on moves 11-20, and defect thereafter.
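The handshake above hinges on two fixed strings; here is a small sketch of just the challenge/response bookkeeping (the strings are the ones given above; the function names are invented):

```python
# Password handshake from the cliquebot above: the challenger plays
# DCCCDDCDDD then Cs, and expects ten Ds followed by CCDCDDDCDC in reply.

CHALLENGE = "DCCCDDCDDD"
RESPONSE = "D" * 10 + "CCDCDDDCDC"

def challenge_move(turn):
    """Move the challenger plays on `turn` (1-indexed): the challenge
    string for moves 1-10, C thereafter."""
    return CHALLENGE[turn - 1] if turn <= len(CHALLENGE) else "C"

def password_matches(enemy_moves):
    """True while the enemy's moves so far are a prefix of the expected
    password response, i.e. the SELF handshake is still alive."""
    return all(m == RESPONSE[i]
               for i, m in enumerate(enemy_moves[:len(RESPONSE)]))
```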
You get a copy of the other bot’s source code and are allowed to analyze it. All bots have 10,000 instructions per turn, and if you run out of time the round is negated (both players score 0 points). There is a standard function for spending X instructions evaluating a piece of quoted code
I assume that X is variable based on the amount of quoted code you’re evaluating; in that case I would submit “tit for tat, defect last 3” but obfuscate my source code in such a way that it took exactly 10,000 instructions to output, or make my source code so interdependent that you’d have to quote the entire thing, and so on. I wonder how well this “cloaked tit-for-tat” would go.
1) is feasible and that can be done easily, but now I am a little bit fed up with PD simulations, so I will take some rest before organising it. Maybe anybody else is interested in running the competition?
3) Wouldn’t it be easier to limit recursion depth instead of the number of instructions? I’m not sure how to measure or even define the latter if the code isn’t written in assembler. The outcome would also be more predictable for the bots, and it would not motivate the bots to include something like “for (i = number-of-already-spent-instructions; i < 9999; i++) do-nothing;” at the end of their source code to break anybody trying to simulate them. (I am not sure whether that would be a rational thing to do, but still quite confident that some strategies would include it.)
“Simulate my opponent, and if it tries to simulate me, see what it will do if it simulates me outputting Cooperate.”
This has the problem that, since most strategies will eval twice (to check both the C and D cases) you can be reasonably sure that if both calls return the same result you are being simulated.
Edit: Although it doesn’t fully fix the problem, this is better: eval takes a function that takes the function the other agent will call eval with as its argument and returns C or D.
eval(function(fn) {
  if (fn(function(ignore) { return C; }) == C) return C;
  return D;
});
There are still detection problems here (you could add checks to see if the function you passed to the function eval passed to you was invoked), but the fact that some strategies wouldn’t do overly ambitious recursion at least downgrades the above approach from an obvious information leak to an ambiguous information leak.
Basically, there is always going to be some clever strategy for possibly detecting whether you’re a sim. And it may be best at that point to just say so. So the game should have three outputs: C, D, and S. S gets you zero points and gets your opponent the points they’d get if you played C. It’s clearly suboptimal, so you’d only say it to signal, “I’m smarter than you and I know you’re simulating me.” And if they’ve ever seen you say it in real life, they know you’re just bluffing. I believe that this response, if you actually were a sim being run by your opponent, would be the best way to get your opponent to cooperate on the last turn.
This doesn’t solve the problem of removing obvious, trivial ways to tell if you’re a sim. But it does mean that if there are no shortcuts, so that the smarter bot will win that battle of wills, then they have something useful to say for it (beyond just “I’m TFT so you shouldn’t defect until the last turn”).
you can be reasonably sure that if both calls return the same result you are being simulated.
Shouldn’t it be, at most: you can be reasonably sure that you are being simulated either a) after both calls return C or b) after you formally choose D having already seen that both calls return D?
If a simulation chooses C after seeing both results D, then the simulator might as well actually defect, so it does, and the non-simulation chooses C, just like the simulation.
If an agent strongly prefers not to be a simulation and believes in TDT and is vulnerable to blackmail, they can be coerced into cooperating and sacrificing themselves in similar cases. Unless I’m wrong, of course.
One really ought to make anti-blackmail resolutions.
if you run out of time the round is negated (both players score 0 points)
In the OP, defect/defect gives each player one point. Did you intend to add a fifth possible outcome for each iteration?
I think a more natural expansion would be to have a third choice besides cooperate and defect, leading to nine different possible scores per iteration. For a program to run out of time is a choice pre-committed to by the design of the analyzing function; it’s not too different from the outcomes of cooperate or defect.
I don’t care what my opponent will do this turn; I care how what they will do later depends on what I do now. That is, do they have the capacity to retaliate, or can I defect with impunity?
So basically, my strategy would be to see if they defect (or try to be sneaky by running out my cycles) on turn 100 if they think I’m a dumb TFT. If so, I attempt to defect 1 turn before they do (edit: or just TFT and plan to start defecting on turn 99), which becomes a battle to see whose recursion is more efficient. If not, I play TFT and we probably both cooperate all the way through, getting max stable points.
Significantly, my strategy would not do a last-turn defection against a dumb TFT, which leaves some points on the ground. However, it would try to defect against a (more TDT-like?) strategy which would defect against TFT; so if my strategy were dominant, it could not be attacked by a combination of TFT and some cleverer (TDT?) bot which defects last turn against TFT.
I call this strategy anti-omega, because it cooperates with dumb TFTs but inflicts some pain on Omegabot (the strategy which always gets the max points it could possibly get against any given opponent). Omegabot would still beat a combo of TFT and anti-omega (the extra points it would get from defecting last turn against TFT, and defecting turn 98 against anti-omega, would make up for the points it lost from having two mutual defections against anti-omega). But at least it would suffer for it.
This points to a case where TDT is not an optimal decision theory: if non-TDT agents can reliably detect whether you are “being too clever” (i.e., a member of a class which includes TDT agents), they can punish you for it. Basically, there’s no sense being a rationalist in the dark ages; even if you’re Holmes or Moriarty, the chances that you’ll slip up and be burned as a witch might outweigh the benefits.
I don’t see that rationality was a major cause of witch accusations. Indeed, a great number of successful dark-agers seem to be ones that were cleverer / more rational than their contemporaries, even openly so.
Trouble is, when your bot plays itself it gets stuck in a loop, runs out of time and gets 0. This will make it very unlikely to ever rise to dominance in an evolutionary tournament.
Depending on how much time is allowed the following might be a viable solution: with a 1% chance cooperate, otherwise run the other bot’s source code. If the bot plays itself, or for that matter any other bot that tries to run its source code, after 100 (on average) iterations it will return cooperate, and then working backwards will (usually) converge to the right thing.
Obviously, the value of 1% should be as small as possible; the average time used is proportional to the reciprocal of the probability.
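The probabilistic recursion-breaker above can be sketched like this. The signature is invented, and the explicit `depth` is only there to show how deep the self-play recursion went; in expectation it terminates after 1/p levels (100 for p = 0.01), which is why the fuel cost scales as 1/p.

```python
# Sketch of the recursion-breaker: with probability p answer C outright,
# otherwise run the opponent (who, in self-play, is this same function).

import random

def chameleon(opponent, depth=0, p=0.01, rng=random.random):
    if rng() < p:
        return ('C', depth)        # base case: unconditionally cooperate
    return opponent(opponent, depth + 1, p, rng)
```

Working backwards from the eventual C at the bottom of the stack is what lets each level converge on the right move.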
This could be fixed with a trivial modification: first check if the opponent source code is identical to your own, and if it is then cooperate. Otherwise run the other bot’s source.
However, there’s another problem: If DuncanS’s strategy plays against a CooperateBot or a random bot, it will throw away a lot of points.
Quite so. I actually never intended to post this—I edited the post several times, and in the end thought I’d abandoned it. But never mind.
I thought of various trivial modifications to fix the problem that you iterate infinitely deep if you meet yourself. You could check for yourself—that works as long as nobody else enters an equivalent strategy—in which case you will do each other in.
I’m not so worried about the CooperateBot or random bot, as these will in any case do so badly that they won’t tend to turn up in the population pool. But I could obviously do better against these by defecting all the way. In fact, any algorithm that neither runs me nor checks what I played previously is ignoring my behaviour, and should be defected against.
The most reliable fix to the recursion problem that I can think of is to implement a hard resource limit, whereby if you spend (say) 9500 of 10000 processing units on analysis, then you drop the analysis and implement a dumb strategy with whatever juice you have left. Probably wouldn’t take much more than a simple counter being incremented in a bottom-level loop somewhere, as long as you’re looking at your opponent’s source through a level of indirection rather than just jumping straight into it. Some analysis strategies might also converge on a solution rather than finding an exact one, although this would be easier and more reliable with behavioral than with source analysis.
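That counter-based trapdoor could look like the following sketch. The numbers (10,000 total, 9,500 cutoff) are the hypothetical ones from the comment above, and the step-thunk interface is invented for illustration.

```python
# Sketch of a hard resource limit: charge every bottom-level analysis step
# against a budget, and fall back to a dumb move once the cutoff is hit.

class BudgetedAnalyzer:
    def __init__(self, total=10_000, cutoff=9_500):
        self.total, self.cutoff, self.spent = total, cutoff, 0

    def tick(self, cost=1):
        """Charge `cost` units; True while analysis may continue."""
        self.spent += cost
        return self.spent < self.cutoff

    def decide(self, analysis_steps, fallback='D'):
        """Run analysis step thunks until one yields a move or the budget
        is hit, then fall back to the dumb default with the juice left."""
        for step in analysis_steps:
            if not self.tick():
                return fallback
            move = step()
            if move is not None:
                return move
        return fallback
```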
However, trapdoors like this are exploitable: they give agents an incentive to nearly but not quite run out the clock before making a decision, in hopes that their opponents will give up and fall back on the simple and stupid. The game theory surrounding this seems to have PD-like characteristics in its own right, so it may just have pushed the relevant game-theoretical decisions up to the metagame.
There is actually a human algorithm in the cooperate/defect space—which is to distrust anyone who seems to spend a long time making a cooperate decision. Anyone who has a really quick, simple way of deciding to cooperate is OK—anyone who takes a long time is thinking about whether you will defect, or whether they should defect—and you should probably find someone else to trust. This is one of the good things about Tit for tat—it passes this test.
However, there’s another problem: If DuncanS’s strategy plays against a CooperateBot or a random bot, it will throw away a lot of points.
That is a problem, but fortunately we know that those competitors will be eliminated before the chameleon, whereupon it will begin to improve its performance.
If the DuncanBot detects source code different from its own, it runs that source code. So a DuncanBot looks to any CliqueBot like a member of its clique.
That may well be right—a cliquebot may well decide I’m not in its clique, and defect. And I, running its source code, will decide that my counterparty IS in the clique, and cooperate. Not a good outcome.
Clearly I need to modify my algorithm so that I run the other bot’s code without bothering to analyse it—and have it play against not its actual opponent, but me, running it.
To add to this (probably silly) post—I was drawn to this problem for the reasons that Eliezer considers it the interesting case. In this form, it’s a variant on the halting problem / completeness problem. Each of these can’t be solved because some cases turn into an infinitely deep recursive nesting. Any purported algorithm to predict whether another algorithm will halt can’t solve the case where it’s analysing itself analysing itself analysing itself etc. Hence my rather tongue-in-cheek response.
And my response hits the same problem in a different form. When it meets itself, it gets into an infinite loop as each side tries to run the other. To solve it, at some level you need a ‘cooperate’ to be introduced in order to terminate the loop. And doing that breaks the symmetry of the situation.
A Turing machine simulated with finite resources is decidable by a Turing machine, so the halting problem isn’t a problem; and in this case the simulator also has finite resources, so infinite loops aren’t a problem either.
Predicting what the opponent would do if you always return “cooperate” means predicting how the opponent would play against a CooperateBot, and acting this way means adopting opponent’s strategy against CooperateBots as your own when playing against the opponent, which is not a CooperateBot and knows you are not a CooperateBot. Sounds like a recipe for failure (or for becoming an almost-DefectBot, actually, if the opponent is smart enough and expects genuine CooperateBots).
I just hacked up something like variant 3; haven’t tried to do anything interesting with it yet.
Re: message passing: Check out http://en.wikipedia.org/wiki/Message_passing for what I meant?
Oh, okay, I was missing that you never run the agents as Scheme; you only interpret them via ev.
Are you planning on supporting a default action in case time runs out? (And if so, how will that handle the equivalent problem?)
I hadn’t considered doing that; really I just threw this together because Eliezer’s idea sounded interesting and not too hard.
I’ll at least refine the code and docs and write a few more agents, and if you have ideas I’d be happy to offer advice on implementing your variant.
If you can’t act, what happens score-wise?
I followed Eliezer’s proposal above (both players score 0) -- that’s if you die at “top level”. If a player is simulating you and still has fuel after, then it’s told of your sub-death.
You could change this in play.scm.
I’ll return to the code in a few days.
If you build your own eval, and it returns a different result than the built-in eval, you would know you’re being simulated.
I’ve written a pretty good program to complete variant 3 but I’m such a lurker on this site i lack the nessesary karma to submit it as a article. So here it is as a comment instead:
Inspired by prase’s contest, (results here) and Eliezer_Yudkowsky’s comment, I’ve written a program that plays iterated games of prisoners dilemma, with the cravat that each player can simulate his opponent. I’m now running my own contest. You can enter your onwn strategy by sending it to me in a message. Deadline to enter is Sunday September 25. The contest itself will run shortly there after.
Each strategy plays 100 iterations against each other strategy, with cumulative points. Each specific iteration is called a turn, each set of 100 iterations is called a match. (thanks prase for the terminology). Scoring is CC: (4,4) DC: (7,0) CD: (0,7) DD: (2,2).
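As a quick sketch, the scoring rule above can be written as a lookup table; this Python fragment (names are my own, not part of the contest code) returns both players’ points for one turn:

```python
# Payoffs from the contest: CC (4,4), DC (7,0), CD (0,7), DD (2,2).
# Keys are (my_move, opponent_move); values are (my_points, their_points).
PAYOFFS = {
    ("C", "C"): (4, 4),
    ("D", "C"): (7, 0),
    ("C", "D"): (0, 7),
    ("D", "D"): (2, 2),
}

def score(my_move, their_move):
    """Return the points awarded to each player for one turn."""
    return PAYOFFS[(my_move, their_move)]
```

Note that mutual cooperation (8 total points) beats a cooperate/defect split (7 total), so the matrix rewards stable cooperation over alternating exploitation.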
Every turn, each strategy receives a record of all the moves it and its opponent made this match, a lump of time to work and its opponents code. Strategies can’t review enemy code directly but they can run it through any simulations they want before deciding their move, within the limits of the amount of time they have to work.
Note that strategies cannot pass information to themselves between iterations or otherwise store it, other than the record of decisions. They start each turn anew. This way any strategy can be simulated with any arbitrary record of moves without running into issues.
Strategies in simulations need an enemy strategy passed into them. To avoid infinite recursion of simulations, they are forbidden from passing themselves. They can have as many secondary strategies as needed, however. This creates a string of simulations where strategy(1) simulates enemy(1), passing in strategy(2), which enemy(1) simulates and passes in enemy(2), which strategy(2) simulates and passes in strategy(3). The final sub-strategy passed in by such a chain must be a strategy that performs NO simulations.
For example, the first sub-strategy could be an exact duplicate of the main strategy except that it passes in the tit_for_tat program instead. So main_strategy ⇒ sub_strategy1 ⇒ tit_for_tat.
You can of course use as many different sub-strategies as you need; the programs are limited by processing time, not memory. Strategies can run their simulated opponent on any history they devise, playing whichever side they choose.
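A minimal Python sketch of how such a chain bottoms out (the signatures here are my own stand-ins, not the contest’s actual MATLAB interface; `history` is a list of (my_move, their_move) pairs):

```python
def tit_for_tat(history, opponent, sub_strategy):
    # Bottom of the chain: performs NO simulations.
    # It ignores `opponent` and `sub_strategy` entirely.
    if not history:
        return "C"
    return history[-1][1]  # copy the opponent's last move

def main_strategy(history, opponent, sub_strategy):
    # Predict the opponent's next move by simulating it, handing
    # `sub_strategy` down in our place so the chain terminates.
    flipped = [(theirs, mine) for (mine, theirs) in history]
    predicted = opponent(flipped, sub_strategy, tit_for_tat)
    return "C" if predicted == "C" else "D"
```

Here `main_strategy` never passes itself into a simulation, per the no-self-passing rule, so every simulation chain ends at a non-simulating strategy like `tit_for_tat`.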
Strategies can’t read the name of their opponent, see the number of strategies in the game, watch any other matches or see any scores outside their current match.
Strategies are not cut off if they run out of time, but both will receive 0 points for the turn. The decisions of the turn will be recorded as if normal.
I never figured out a good way to keep strategies from realizing they were being simulated simply by looking at how much time they were given. Not knowing how much time they have would make it prohibitively difficult to avoid timing out. My hack solution is to not give a definitive amount of time for the main contest but instead a range: from 0.01 seconds to 0.1 seconds per turn, with the true time known only by me. This is far from ideal, and if anyone has a better suggestion I’m all ears.
To give a reference: a strategy that runs 2 single-turn simulations of tit_for_tat against tit_for_tat takes, on average, 3.5×10^-4 seconds. Running only 1 single-turn simulation took only 1.5×10^-4 seconds. tit_for_tat by itself takes about 2.5×10^-5 seconds to run a single turn. Unfortunately, due to factors outside of my control, MATLAB, the software I’m running, will for unknown reasons take 3 to 5 times as long as normal about 1 time in 100. Leave yourself some buffer.
Strategies are NOT allowed a random number generator. This is different from prase’s contest but I would like to see strategies for dealing with enemy intelligence trying to figure them out without resorting to unpredictability.
I’ve come up with a couple of simple example strategies that will be performing in the contest.
Careful_cheater:
Simulates its opponent against tit_for_tat to see its next move. If the enemy defects, Careful_cheater defects. If the enemy cooperates, Careful_cheater simulates its next move after that with a record showing Careful_cheater defecting this turn, passing tit_for_tat. If the opponent would have defected on the next turn, Careful_cheater cooperates, but if the opponent would have cooperated despite Careful_cheater’s defection, it goes ahead and defects.
Simulations show it breaking even against tit_for_tat and its usual variations, but Careful_cheater eats forgiving programs like tit_for_2_tats alive.
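A rough Python sketch of this probe logic (simulation is reduced to a direct function call, records are (my_move, their_move) pairs, and all names are my own stand-ins for the contest interface):

```python
def flip(history):
    # Re-express a (my_move, their_move) record from the other side.
    return [(theirs, mine) for (mine, theirs) in history]

def careful_cheater(history, opponent):
    # Probe the opponent's next move (in the real contest the opponent
    # would be handed tit_for_tat as its sub-strategy).
    if opponent(flip(history)) == "D":
        return "D"
    # It will cooperate; would it punish us next turn if we defect now?
    probe = flip(history) + [("C", "D")]   # record where we defect
    if opponent(probe) == "D":
        return "C"   # it retaliates, so stay honest
    return "D"       # it forgives, so exploit it

def tit_for_tat(history):
    return "C" if not history else history[-1][1]

def tit_for_2_tats(history):
    # Forgiving: only defects after two consecutive enemy defections.
    recent = [theirs for (mine, theirs) in history[-2:]]
    return "D" if recent == ["D", "D"] else "C"
```

Against `tit_for_tat` the hypothetical defection would be punished, so it cooperates; against `tit_for_2_tats` the defection would be forgiven, so it defects, matching the description above.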
Everyman:
Simulates 10-turn matches with its opponent against several possible basic strategies, including tit_for_tat, tit_for_tat_optimist (cooperates the first two turns), and several others, then compares the scores of each match and plays as the highest-scoring one.
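A sketch of Everyman’s match-testing idea in Python (payoffs are the contest’s; everything else, including the candidate list, is my own stand-in):

```python
PAYOFF = {("C", "C"): 4, ("D", "C"): 7, ("C", "D"): 0, ("D", "D"): 2}

def play_match(me, them, turns=10):
    # Simulate a short match and return `me`'s cumulative score.
    my_hist, their_hist, total = [], [], 0
    for _ in range(turns):
        a, b = me(my_hist), them(their_hist)
        total += PAYOFF[(a, b)]
        my_hist.append((a, b))
        their_hist.append((b, a))
    return total

def everyman(opponent, candidates):
    # Test each candidate against the opponent over a simulated
    # 10-turn match and adopt whichever scored highest.
    return max(candidates, key=lambda s: play_match(s, opponent))

def tit_for_tat(history):
    return "C" if not history else history[-1][1]

def defect_bot(history):
    return "D"

def cooperate_bot(history):
    return "C"
```

For instance, against a CooperateBot it adopts DefectBot (70 points beats tit_for_tat’s 40), while against a DefectBot it adopts tit_for_tat.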
Copy_cat:
Switches player numbers and simulates its opponent playing Copy_cat’s record while passing in its opponent, and then performs that action. That is to say, it sees what its opponent would do if it were in Copy_cat’s position and up against itself.
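In Python pseudocode the whole strategy is essentially one line (the interface is my own stand-in for the contest’s):

```python
def copy_cat(history, opponent):
    # Ask "what would my opponent do in my position, with my record,
    # facing itself?" and then do exactly that. Note that handing the
    # opponent a copy of itself is exactly the self-passing that
    # recurses forever when copy_cat meets copy_cat.
    return opponent(history, opponent)

def tit_for_tat(history, opponent):
    # A non-simulating opponent for illustration; ignores `opponent`.
    return "C" if not history else history[-1][1]
```

Against tit_for_tat this just mirrors tit_for_tat’s own response to the current record.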
This is basically DuncanS’s strategy from the comments. DuncanS, you’re free to enter another one since I stole yours as an example.
And of course tit_for_tat and strategy “I” from prase’s contest will be competing as well.
One strategy per person. All you need for a strategy is a complete written description, though I may message back for clarification. I reserve the right to veto any strategy I deem prohibitively difficult to implement.
So what happens if Copy_cat actually does go up against itself, or a semantically similar bot? Infinite recursion and zero points for both? Or am I missing something?
Actually, it seems that crashes it. Thanks for the catch. I hadn’t tested Copy_cat against itself and should have. Forbidding strategies to pass in their opponent fixes that, but it does indicate my program may not be as stable as I thought. I’m going to have to spend a few more days checking for bugs since I missed one that big. Thanks, eugman.
No problem. I’ve been thinking about it, and one should be able to recursively prove whether a strategy bottoms out. That should fix it, as long as the user passes terminating strategies. A warning about Everyman: if it goes against itself, it will run n^2 matches, where n is the number of strategies tested. Time consumed is an interesting issue, because if you take a long time it’s like being a DefectBot against smart bots: you punish bots that try to simulate you, but you lose points too.
I didn’t get into it earlier, but Everyman’s a little more complicated. It runs through each test match one at a time, from most likely to least, and checks after each match how much time it has left; if it starts to run out, it just leaves with the best strategy it’s figured out. By controlling how much time you pass in, strategies can avoid n^2 processing problems. That’s why I thought it was so necessary to include, even if it does give hints about whether strategies are being simulated or not. Everyman was built as a sort of upper bound; it’s one of the least efficient strategies one might implement.
Here’s a very fancy cliquebot (with a couple of other characteristics) that I came up with. The bot is in one of 4 “modes”—SELF, HOSTILE, ALLY, PASSWORD.
Before turn1:
Simulate the enemy for 20 turns against DCCCDDCDDD, Cs thereafter. If the enemy responds with 10 Ds followed by CCDCDDDCDC, then change to mode SELF. (These are pretty much random strings—the only requirement is that the first begins with D.)
Simulate the enemy for 10 turns against DefectBot. If the enemy cooperates in all 10 turns, change to mode HOSTILE. Else be in mode ALLY.
In any given turn,
If in mode SELF, cooperate always.
If in mode HOSTILE, defect always.
If in mode ALLY, simulate the enemy against TFT for the next 2 turns. If the enemy defects on either of these turns, or defected on the last turn, switch to mode HOSTILE and defect. Exception: if it is move 1 and the enemy will play DC, then switch to mode PASSWORD and defect. Defect on the last move.
If in mode PASSWORD, check whether the enemy’s moves have varied from DCCCDDCDDD, Cs thereafter, beginning from move 1. If so, switch to mode HOSTILE and defect. Else defect on moves 1-10, play CCDCDDDCDC on moves 11-20, and defect thereafter.
Essentially designed to dominate in the endgame.
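The pre-game classification might be sketched like this (the `simulate_enemy` interface is my own stand-in, assumed to return the enemy’s replies to a fixed sequence of our moves; the PASSWORD branch is omitted since it is set during play, not before turn 1):

```python
PROBE = list("DCCCDDCDDD") + ["C"] * 10   # our opening: the password
REPLY = ["D"] * 10 + list("CCDCDDDCDC")   # the expected countersign

def classify(simulate_enemy):
    # Mode SELF: the enemy answers the password with the countersign.
    if simulate_enemy(PROBE) == REPLY:
        return "SELF"
    # Mode HOSTILE: the enemy cooperates with DefectBot for 10 turns,
    # so it can be exploited.
    if simulate_enemy(["D"] * 10) == ["C"] * 10:
        return "HOSTILE"
    return "ALLY"
```

A clique member recognizes the DCCCDDCDDD opening and answers in kind; a tit-for-tat-like bot fails both checks and lands in ALLY.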
I assume that X is variable based on the amount of quoted code you’re evaluating; in that case I would submit “tit for tat, defect last 3” but obfuscate my source code in such a way that it took exactly 10,000 instructions to output, or make my source code so interdependent that you’d have to quote the entire thing, and so on. I wonder how well this “cloaked tit-for-tat” would go.
My submitted bot would defect whenever it couldn’t finish its calculation.
Good point. Obfuscated decision process is evidence of sneakiness. Like hiding a negation somewhere you think they won’t look.
Regarding (2), this is a particularly clean way to do it (with some results of my old simulations too).
http://www.amirrorclear.net/academic/papers/sipd.pdf http://www.amirrorclear.net/academic/ideas/dilemma/index.html
1) is feasible and can be done easily, but now I am a little bit fed up with PD simulations, so I will take some rest before organising it. Maybe anybody else is interested in running the competition?
3) Wouldn’t it be easier to limit recursion depth instead of number of instructions? Not sure how to measure or even define the latter if the codes aren’t written in assembler. The outcome would also be more predictable for the bots and would not motivate the bots to include something like “for (i=number-of-already-spent-instructions, i<9,999, i++, do-nothing)” at the end of their source code to break anybody trying to simulate them. (I am not sure whether that would be a rational thing to do, but still quite confident that some strategies would include that).
This has the problem that, since most strategies will eval twice (to check both the C and D cases), you can be reasonably sure that if both calls return the same result, you are being simulated.
Edit: Although it doesn’t fully fix the problem, this is better: eval takes a function that takes the function the other agent will call eval with as its argument and returns C or D.
There are still detection problems here (you could add checks to see if the function you passed to the function eval passed to you was invoked), but the fact that some strategies wouldn’t do overly ambitious recursion at least downgrades the above approach from obvious information leak to ambiguous information leak
Basically, there is always going to be some clever strategy for detecting whether you’re a sim. And it may be best at that point to just say so. So the game should have three outputs: C, D, and S. S gets you zero points and gets your opponent the points they’d get if you played C. It’s clearly suboptimal. So you’d only say it to signal, “I’m smarter than you and I know you’re simulating me.” And if they’ve ever seen you say it in real life, they know you’re just bluffing. I believe that this response, if you actually were a sim being run by your opponent, would be the best way to get your opponent to cooperate on the last turn.
This doesn’t solve the problem of removing obvious, trivial ways to tell if you’re a sim. But it does mean that if there are no shortcuts, so that the smarter bot will win that battle of wills, then they have something useful to say for it (beyond just “I’m TFT, so you shouldn’t defect until the last turn”).
Shouldn’t it be, at most: you can be reasonably sure that you are being simulated either a) after both calls return C or b) after you formally choose D having already seen that both calls return D?
If a simulation chooses C after seeing both results D, then the simulator might as well actually defect, so it does, and the non-simulation chooses C, just like the simulation.
If an agent strongly prefers not to be a simulation and believes in TDT and is vulnerable to blackmail, they can be coerced into cooperating and sacrificing themselves in similar cases. Unless I’m wrong, of course.
One really ought to make anti-blackmail resolutions.
What about the hypothesis that the opponent isn’t optimized for the game?
The standard method of examining play history will disambiguate, perhaps requiring a probing move.
In the OP, defect/defect gives each player one point. Did you intend to add a fifth possible outcome for each iteration?
I think a more natural expansion would be to have a third choice besides cooperate and defect, leading to nine different possible scores per iteration. For a program to run out of time is a choice pre-committed to by the design of the analyzing function; it’s not too different from the outcomes of cooperate or defect.
How about #2 combined with bots can make claims about their own and other bots’ behavior?
I don’t care what my opponent will do this turn; I care how what they will do later depends on what I do now. That is, do they have the capacity to retaliate, or can I defect with impunity?
So basically, my strategy would be to see if they defect (or try to be sneaky by running out my cycles) on turn 100 if they think I’m a dumb TFT. If so, I attempt to defect 1 turn before they do (edit: or just TFT and plan to start defecting on turn 99), which becomes a battle to see whose recursion is more efficient. If not, I play TFT and we probably both cooperate all the way through, getting max stable points.
Significantly, my strategy would not do a last-turn defection against a dumb TFT, which leaves some points on the ground. However, it would try to defect against a (more TDT-like?) strategy which would defect against TFT; so if my strategy were dominant, it could not be attacked by a combination of TFT and some cleverer (TDT?) bot which defects last turn against TFT.
I call this strategy anti-omega, because it cooperates with dumb TFTs but inflicts some pain on Omegabot (the strategy which always gets the max points it could possibly get against any given opponent). Omegabot would still beat a combo of TFT and anti-omega (the extra points it would get from defecting last turn against TFT, and defecting turn 98 against anti-omega, would make up for the points it lost from having two mutual defections against anti-omega). But at least it would suffer for it.
This points to a case where TDT is not the optimal decision theory: if non-TDT agents can reliably detect whether you are “being too clever” (i.e., a member of a class which includes TDT agents), and they punish you for it. Basically, no sense being a rationalist in the dark ages; even if you’re Holmes or Moriarty, the chances that you’ll slip up and be burned as a witch might outweigh the benefits.
I don’t see that rationality was a major cause of witch accusations. Indeed, a great number of successful dark-agers seem to be ones who were cleverer / more rational than their contemporaries, even openly so.
Edited to show that I was trying to illustrate a concept concisely, not being historically accurate.
3) sounds fun. I already have a strategy for this variant—just run the other bot’s source code without bothering to analyse it.
Trouble is, when your bot plays itself it gets stuck in a loop, runs out of time and gets 0. This will make it very unlikely to ever rise to dominance in an evolutionary tournament.
Depending on how much time is allowed the following might be a viable solution: with a 1% chance cooperate, otherwise run the other bot’s source code. If the bot plays itself, or for that matter any other bot that tries to run its source code, after 100 (on average) iterations it will return cooperate, and then working backwards will (usually) converge to the right thing.
Obviously, the value of 1% should be as small as possible; the average time used is proportional to the reciprocal of the probability.
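A toy self-play version of this probabilistic bottoming-out (my own framing, not DuncanS’s actual code):

```python
import random

def make_bot(p):
    # With probability p, cooperate outright instead of simulating;
    # otherwise simulate the opponent playing against this bot.
    # Mutual simulation then terminates after 1/p levels on average.
    def bot(opponent):
        if random.random() < p:
            return "C"
        return opponent(bot)
    return bot
```

In self-play the recursion halts with probability 1, and since the only value ever produced at the bottom is “C”, cooperation propagates back up the whole chain, so both copies cooperate.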
This could be fixed with a trivial modification: first check if the opponent source code is identical to your own, and if it is then cooperate. Otherwise run the other bot’s source.
However, there’s another problem: If DuncanS’s strategy plays against a CooperateBot or a random bot, it will throw away a lot of points.
Quite so. I actually never intended to post this—I edited the post several times, and in the end thought I’d abandoned it. But never mind.
I thought of various trivial modifications to fix the problem that you iterate infinitely deep if you meet yourself. You could check for yourself—that works as long as nobody else enters an equivalent strategy—in which case you will do each other in.
I’m not so worried about the CooperateBot or random bot, as these will in any case do so badly that they won’t tend to turn up in the population pool. But I could obviously do better against these by defecting all the way. In fact, any algorithm that ignores my behaviour (neither running me nor checking what I played previously) should be defected against.
The most reliable fix to the recursion problem that I can think of is to implement a hard resource limit, whereby if you spend (say) 9500 of 10000 processing units on analysis, then you drop the analysis and implement a dumb strategy with whatever juice you have left. Probably wouldn’t take much more than a simple counter being incremented in a bottom-level loop somewhere, as long as you’re looking at your opponent’s source through a level of indirection rather than just jumping straight into it. Some analysis strategies might also converge on a solution rather than finding an exact one, although this would be easier and more reliable with behavioral than with source analysis.
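A sketch of that counter-based trapdoor (all numbers and names are illustrative, not from any actual tournament code):

```python
class Budget:
    # Simple resource meter: stop analysing once only `reserve`
    # of the `total` processing units remain.
    def __init__(self, total=10000, reserve=500):
        self.left = total
        self.reserve = reserve

    def spend(self, n):
        self.left -= n
        return self.left > self.reserve   # False means: bail out now

def decide(opponent_record, budget):
    # Toy analysis loop: probe the opponent's record move by move,
    # dropping the analysis and falling back on a dumb tit-for-tat
    # as soon as the budget runs low.
    for _ in opponent_record:
        if not budget.spend(10):          # each probe costs 10 units
            break                         # trapdoor: give up analysing
        # ... expensive per-move analysis would go here ...
    return opponent_record[-1] if opponent_record else "C"
```

The counter lives outside the analysis proper, so it works however deeply the opponent’s code is being examined, as long as the examination goes through this level of indirection.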
However, trapdoors like this are exploitable: they give agents an incentive to nearly but not quite run out the clock before making a decision, in hopes that their opponents will give up and fall back on the simple and stupid. The game theory surrounding this seems to have PD-like characteristics in its own right, so it may just have pushed the relevant game-theoretical decisions up to the metagame.
There is actually a human algorithm in the cooperate/defect space—which is to distrust anyone who seems to spend a long time making a cooperate decision. Anyone who has a really quick, simple way of deciding to cooperate is OK—anyone who takes a long time is thinking about whether you will defect, or whether they should defect—and you should probably find someone else to trust. This is one of the good things about Tit for tat—it passes this test.
That is a problem, but fortunately we know that those competitors will be eliminated before the chameleon, whereupon it will begin to improve its performance.
This doesn’t solve the slightly different CliqueBots won’t cooperate with each other problem.
If the DuncanBot detects source code different from its own, it runs that source code. So a DuncanBot looks to any CliqueBot like a member of its clique.
A piece of code that runs a piece of source code does not thereby become that piece of source code.
That may well be right—a cliquebot may well decide I’m not in its clique, and defect. And I, running its source code, will decide that my counterparty IS in the clique, and cooperate. Not a good outcome.
Clearly I need to modify my algorithm so that I run the other bot’s code without bothering to analyse it—and have it play against not its actual opponent, but me, running it.
I meant: what happens when a DuncanBot meets a DuncanBot that does the same thing but has trivial cosmetic differences in its source code?
To add to this (probably silly) post—I was drawn to this problem for the reasons that Eliezer considers it the interesting case. In this form, it’s a variant on the halting problem / completeness problem. Each of these can’t be solved because some cases turn into an infinitely deep recursive nesting. Any purported algorithm to predict whether another algorithm will halt can’t solve the case where it’s analysing itself analysing itself analysing itself etc. Hence my rather tongue-in-cheek response.
And my response hits the same problem in a different form. When it meets itself, it gets into an infinite loop as each side tries to run the other. To solve it, at some level you need a ‘cooperate’ to be introduced in order to terminate the loop. And doing that breaks the symmetry of the situation.
A Turing machine simulated with finite resources is decidable by a Turing machine, so the halting problem isn’t a problem, though in this case the simulator also has finite resources, so infinite loops aren’t a problem.
Predicting what the opponent would do if you always return “cooperate” means predicting how the opponent would play against a CooperateBot, and acting this way means adopting opponent’s strategy against CooperateBots as your own when playing against the opponent, which is not a CooperateBot and knows you are not a CooperateBot. Sounds like a recipe for failure (or for becoming an almost-DefectBot, actually, if the opponent is smart enough and expects genuine CooperateBots).
I changed my comment to remove this vulnerability—didn’t realise it had already been seen.
Would being able to invoke your opponent with arbitrary inputs be equivalent?