Oh cool! You allow an agent to see how their opponent would respond when playing a 3rd agent (just call run with different source code).
[Edit: which allows for arbitrary message passing—the coop bots might all agree to coop with anyone who coops with (return C)]
However, you also allow for trivially determining if an agent is being simulated: simply check how much fuel there is, which is probably not what we want.
The only way to check your fuel is to run out—unless I goofed.
You could call that message passing, though conventionally that names a kind of overt influence of one running agent on another, all kinds of which are supposed to be excluded.
It shouldn’t be hard to do variations where you can only run the other player and not look at their source code.
I’m not a native Schemer, but it looks like you can check fuel by calling run with a large number and seeing if it fails to return, e.g. (eq (run 9999 (return C) (return C)) 'exhausted) [note that this can cost fuel, and so should be done at the end of an agent to decide if returning the “real” value is a good idea]
giving us the naive DefectorBot of (if (eq (run 9999 (return C) (return C)) 'exhausted) C D)
[Edit: and for detecting run-function-swap-out:
(if (neq (run 10000000 (return C) (return C)) 'exhausted) C ;; someone is simulating us
(if (eq (run 9999 (return C) (return C)) 'exhausted) C ;; someone is simulating us more cleverly
D)) ]
[Edit 2: Is there a better way to paste code on LW?]
Re: not showing source: Okay, but I do think it would be awesome if we get bots that only cooperate with bots who would cooperate with (return C)
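A rough sketch of that kind of bot, under simplifying assumptions: agents here are plain Scheme procedures that receive their opponent as a procedure, rather than source code run through the fuel-limited interpreter, and the names `cooperate-bot`, `defect-bot` and `coop-with-coop-friendly` are made up for illustration.

```scheme
;; Assumption: an agent is a procedure taking its opponent (also a procedure)
;; and returning 'C or 'D. This is a simplification, not the thread's RUN API.

(define (cooperate-bot opponent) 'C)   ; plays the role of (return C)
(define (defect-bot opponent) 'D)

;; Cooperate only with opponents that would cooperate with cooperate-bot.
(define (coop-with-coop-friendly opponent)
  (if (eq? (opponent cooperate-bot) 'C) 'C 'D))

(coop-with-coop-friendly cooperate-bot)            ; => C
(coop-with-coop-friendly defect-bot)               ; => D
(coop-with-coop-friendly coop-with-coop-friendly)  ; => C
```

Because the probe is always run against `cooperate-bot` and never against the bot itself, two copies of this bot recognise each other without recursing forever.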
On message passing as described, that’d be a bug if you could do it here. The agents are confined. (There is a side channel from resource consumption, but other agents within the system can’t see it, since they run deterministically.)
When you call RUN, one of two things happens: it produces a result or you die from exhaustion. If you die, you can’t act. If you get a result, you now know something about how much fuel there was before, at the cost of having used it up. The remaining fuel might be any amount in your prior, minus the amount used.
At the Scheme prompt:
(run 10000 '(equal? 'exhausted (cadr (run 1000 '((lambda (f) (f f)) (lambda (f) (f f))) (global-environment)))) global-environment)
; result: (8985 #t) ; The subrun completed and we find #t for yes, it ran to exhaustion.
(run 100 '(equal? 'exhausted (cadr (run 1000 '((lambda (f) (f f)) (lambda (f) (f f))) (global-environment)))) global-environment)
; result: (0 exhausted) ; Oops, we never got back to our EQUAL? test.
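For anyone who wants to play with the accounting without the real interpreter, here is a toy model of it. It is only a sketch: `run-with-fuel`, `tick!` and `free-tick!` are hypothetical names, computations are ordinary procedures that call `tick!` once per step, and the step counts will not match the numbers from the real RUN above.

```scheme
;; Toy stand-in for RUN (an assumption, not the actual implementation).
;; BODY is a procedure of one argument, TICK!, which it must call once per step.
;; Returns (fuel-left result), or (0 exhausted) if the fuel runs out.
(define (run-with-fuel fuel body parent-tick!)
  (call-with-current-continuation
    (lambda (escape)
      (let ((left fuel))
        (define (tick!)
          (parent-tick!)                      ; the caller pays for each step too
          (if (<= left 0)
              (escape (list 0 'exhausted))    ; this run dies here
              (set! left (- left 1))))
        (let ((result (body tick!)))
          (list left result))))))

(define (free-tick!) #t)                      ; top level has nobody to charge

(define (loop-forever tick!)
  (let loop () (tick!) (loop)))

;; Ask whether a 1000-step sub-run of an infinite loop exhausts.
(define (prober tick!)
  (equal? 'exhausted (cadr (run-with-fuel 1000 loop-forever tick!))))

(run-with-fuel 10000 prober free-tick!)  ; => (8999 #t)
(run-with-fuel 100 prober free-tick!)    ; => (0 exhausted)
```

As in the transcript, the well-funded caller learns that the sub-run exhausted and pays for it, while the poorly funded caller dies inside the sub-run and never reaches its test.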
Awesome! The only suggestion I have is to pass in a putative history and/or tournament parameters to an agent in the evaluation function so the agent can do simple things like implement tit-for-tat on the history, or do complicated things like probing the late-game behavior of other agents early in the game (e.g. “If you think this is the last round, what do you do?”).
Thanks! Yes, I figure one-shot and iterated PDs might both hold interest, and the one-shot came first since it’s simpler. That’s a neat idea about probing ahead.
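As a minimal sketch of the history idea, assuming a hypothetical agent signature that takes the history as a list of (my-move . their-move) pairs, most recent first (the current one-shot agents take no such argument), tit-for-tat would look something like:

```scheme
;; Assumption: HISTORY is a list of (mine . theirs) pairs, most recent first.
;; This layout is invented for illustration.
(define (tit-for-tat history)
  (if (null? history)
      'C                      ; first round: open by cooperating
      (cdr (car history))))   ; afterwards: copy the opponent's last move

(tit-for-tat '())                  ; => C
(tit-for-tat '((C . D) (C . C)))   ; => D
```

Probing late-game behavior would then amount to running the opponent on a made-up history, which is why the history is described as putative.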
I just hacked up something like variant 3; haven’t tried to do anything interesting with it yet.
Re: message passing: Check out http://en.wikipedia.org/wiki/Message_passing for what I meant?
Oh, okay, I was missing that you never run the agents as Scheme, only interpret them via ev.
Are you planning on supporting a default action in case time runs out? (and if so, how will that handle the equivalent problem?)
I hadn’t considered doing that—really I just threw this together because Eliezer’s idea sounded interesting and not too hard.
I’ll at least refine the code and docs and write a few more agents, and if you have ideas I’d be happy to offer advice on implementing your variant.
If you can’t act, what happens score-wise?
I followed Eliezer’s proposal above (both players score 0) -- that’s if you die at “top level”. If a player is simulating you and still has fuel after, then it’s told of your sub-death.
You could change this in play.scm.
I’ll return to the code in a few days.