What scaffolding are you going to use for the tests? (For example: #!racket seems to be implied. I’d like to be sure of all of your details.)
Tentatively: I’ll paste “(YourCode TheirCode)” into the interpreter with DrRacket, with #lang scheme.
Edit: Oops, that doesn’t enforce the time limit. Just a sec while I figure this out.
Edit2: I tried this:
but unfortunately threads are not guaranteed to start back up again as soon as sleep allows them to; it took about 18 seconds to terminate when I ran the second line with “your-code” being an infinite loop. I’ll figure out how to do this properly tomorrow.
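Edit3: A marvelously improper but correct way to do it:
(begin (define x (current-milliseconds)) (print (your-code their-code) (- (current-milliseconds) x)))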
Allow it to run for more than 10 seconds, watching the clock and stopping it manually. Throw out the result if it says it took more than 10 seconds.
Keep in mind that a lot of user-submitted programs will try to do the same (because writing a step-by-step interpreter is hard), so they would keep evaluating each other, spawning watchdog threads every time, so, um, your watchdog thread would be badly outnumbered.
The easy fix for you would be to run your watchdog in a separate process, but players wouldn’t have this ability, which might make things either more interesting or boring (ruling out all strategies using eval). Maybe a specially designed restricted subset of Scheme with time-restricted eval would be a better choice?
By the way, after looking at your payout matrix to see what I should do if I see the opponent using eval, it looks like you have in fact created a version of PD with three choices. Not incentivizing the third choice doesn’t really help, because a program still has to consider the possibility that the other program chooses “other” due to a bug, in which case it should choose Defect.
I suggest you implement the standard PD by declaring that anything that is not Cooperate is Defect. The only downside would be that you’ll see somewhat more programs using all their allotted 10s, but you’ll see a lot of those either way. At least you’ll be able to say that this competition was about the actual PD.
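A minimal sketch of that rule, assuming moves are encoded as the symbols 'C and 'D (an assumption; the contest's actual encoding may differ) and using a hypothetical helper named normalize-move:

```racket
#lang racket

;; Hypothetical helper: collapse whatever a program returns into a
;; standard two-choice PD move. 'C and 'D are assumed encodings.
(define (normalize-move v)
  (if (eq? v 'C) 'C 'D))

(normalize-move 'C)        ; cooperation stays cooperation
(normalize-move 'D)        ; defection stays defection
(normalize-move "garbage") ; anything else is treated as a defection
```

With this in the harness, a buggy or timed-out program is scored exactly like a defector, so the payoff matrix only ever needs two rows and two columns.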
The most elegant, it seems to me, would be to require entries to be submitted as a .rkt file that defines a module providing a single variable (possibly with a required name, e.g. “bot”) that must be a quoted expression.
This makes it simple to call programs (eval the expressions), pass opponent’s source code, or even provide programs with their own source code; and could be used to automate static checking of various constraints, such as the ban on file I/O.
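A rough sketch of how a harness could use such quoted entries, under several assumptions: the entry shown is a made-up always-cooperate bot, banned-ids is an illustrative and far from complete ban list, and uses-banned? and play are hypothetical helper names.

```racket
#lang racket

;; Hypothetical entry: a submitted module would provide this as `bot`.
(define bot '(lambda (opponent) 'C))

;; Assumed ban list; a real contest would need a much longer one.
(define banned-ids '(open-input-file open-output-file delete-file))

;; Crude static check: does any banned identifier appear in the source?
(define (uses-banned? src)
  (for/or ([atom (in-list (flatten src))])
    (and (memq atom banned-ids) #t)))

;; Evaluate each quoted program and hand it the other's source.
(define ns (make-base-namespace))
(define (play a-src b-src)
  (list ((eval a-src ns) b-src)
        ((eval b-src ns) a-src)))

(play bot bot)
```

A symbol scan like this is easy to evade (for instance by building identifiers at run time), so it would only be a first filter, not a sandbox.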
It also allows more flexibility for submitters in constructing their entry; for instance, I was able to considerably streamline the source code of the CliqueBot template:
(this is the “workhorse” part, providing the program as a quoted expression obviates the need for repeating it twice, as we can then leverage the quasiquote and unquote language features)
(this is the “template”, it will be expanded to include the program-matching portion)
(this is the actual program, it both quotes (via inclusion of the template) and uses the program-matching part).
The top of the file would consist of the following declaration:
...or if you’re requiring that all entries use the same name “bot” (which could make it easier to administrate the actual running of the contest):
You should be able to use this, which I just worked out, to run something with a timeout. It seems to be working in my testing. It might be overkill to run the whole thing in a subthread, but it makes certain that nothing interferes with the use of send and receive here. You would normally use it with a lambda taking no arguments, for example (using my ueval)
which is what I’ve been using to test candidates against each other locally.
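For anyone who wants something to experiment with, here is a minimal sketch of a timeout wrapper in the same spirit. It is not the code referred to above: it uses a channel with sync/timeout rather than thread send and receive, and call-with-timeout is a hypothetical name.

```racket
#lang racket

;; Hypothetical helper: run `thunk` in its own thread and wait at most
;; `secs` seconds for a result; return `default` on timeout.
(define (call-with-timeout thunk secs default)
  (define ch (make-channel))
  ;; Wrap the result in a list so a legitimate #f result can be told
  ;; apart from the #f that sync/timeout returns on timeout.
  (define worker
    (thread (lambda () (channel-put ch (list (thunk))))))
  (define got (sync/timeout secs ch))
  (kill-thread worker) ; harmless if the worker already finished
  (if got (car got) default))

(call-with-timeout (lambda () (+ 1 2)) 1 'timed-out)
(call-with-timeout (lambda () (let loop () (loop))) 0.5 'timed-out)
```

The timeout here is still only a lower bound on wall-clock time, since the waiting thread has to be scheduled again before it can notice the deadline, so checking the elapsed time afterwards remains a sensible extra guard.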
He said he didn’t want to use sleep, since the argument is only a lower bound on the amount of time it takes.
Ah, right, I see it now. I guess you can check the current-milliseconds after the fact, and force the default value in that case. But looks like this is going to be a problem if I try to safely simulate other agents… Actually, I suppose it’s possible for the target thread to similarly not get returned to for a long time, causing any watchdog to overestimate the time it used. current-process-milliseconds might help with that, but I’m not sure if it can deal with nested threads usefully.
As written, this doesn’t work; print only takes one printee argument, with other optional arguments.
Oops.
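A sketch of the same after-the-fact timing check that avoids the print problem by bundling the result and the elapsed time into one list. The names timed-call and result-or-default, the 10-second default limit, and the 'timed-out fallback are all assumptions for illustration.

```racket
#lang racket

;; Hypothetical helper: run `thunk` and return (list result elapsed-ms).
(define (timed-call thunk)
  (define start (current-inexact-milliseconds))
  (define result (thunk))
  (list result (- (current-inexact-milliseconds) start)))

;; Discard the result if the call overran the limit (10 s assumed here).
(define (result-or-default thunk [limit-ms 10000] [default 'timed-out])
  (match-define (list result elapsed) (timed-call thunk))
  (if (> elapsed limit-ms) default result))

;; print takes one value to print, so pass it the whole list.
(print (timed-call (lambda () (+ 1 2))))
```

This only measures wall-clock time after the call returns, so a program that never returns still has to be stopped by hand or by a separate watchdog.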