How does that help? A quine-like program could just as well put its real payload in a string with a cryptographic signature, verify the signature, and then eval the string with the string as input, thus emulating the “passed its own sourcecode” format. You could defeat that if you’re smart enough to locate and delete the “verify the signature” step, but then you could do the same thing in the real “passed its own sourcecode” format too.
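As a minimal sketch of such a self-verifying quine-like bot (Python; every name here is hypothetical, and a SHA-256 digest stands in for a real cryptographic signature, which would verify against a public key instead):

```python
import hashlib

# Hypothetical sketch: the "real payload" lives in PAYLOAD, guarded by a
# digest check standing in for a cryptographic signature.  A plain hash is
# used only to keep the sketch self-contained.
PAYLOAD = "def strategy(own_source, opponent_source):\n    return 'C'"
EXPECTED_DIGEST = hashlib.sha256(PAYLOAD.encode()).hexdigest()

def run(opponent_source):
    # Verify the payload is untampered, then eval it with the payload
    # string itself standing in for "own source code".
    assert hashlib.sha256(PAYLOAD.encode()).hexdigest() == EXPECTED_DIGEST
    namespace = {}
    exec(PAYLOAD, namespace)
    return namespace["strategy"](PAYLOAD, opponent_source)
```

Stripping the assert line re-enables tampering, which is the point: the verification step is no harder to locate and delete here than in the explicit source-passing format.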
Conversely, even if the tournament program itself is honest, contestants can lie to their simulations of their opponents about what sourcecode the simulation is of.
Altering the internal structure of an opponent program would be very difficult, but that’s not the only way to mutate a program. You can’t tinker with the insides of a black box, but you can wrap a black box.
To be concrete: given an opponent’s source code, I could mechanically generate an equivalent program with extremely dissimilar source code (perhaps just a block of text, a decryption routine, and a call to eval) that nevertheless acts exactly like the original program in every way. And since the obfuscated program behaves identically to the original, it would not be able to detect that it had been altered. Do you agree?
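A hedged sketch of such a mechanical source-to-source transform, with base64 encoding standing in for the decryption routine (names are illustrative, not from any actual tournament entry):

```python
import base64

def obfuscate(source: str) -> str:
    # Return a program with very dissimilar source but identical behaviour:
    # a block of text, a "decryption" routine, and a call to exec.
    blob = base64.b64encode(source.encode()).decode()
    return (
        "import base64\n"
        f"_blob = {blob!r}\n"
        "exec(base64.b64decode(_blob).decode())\n"
    )
```

Running the output through `exec` produces exactly the behaviour of the input program, while a naive string comparison against the original source fails.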
I’m playing Prisoner’s Dilemma and wish to test if an opponent X is honest. I might try the following:
(1) Create two programs, Y and Z, which are algorithmically equivalent but obfuscated versions of X.
(2) Run Y and Z against each other.
If Y and Z don’t cooperate with each other, that’s a good indication that X recognizes itself with a source-code comparison and that I shouldn’t trust X.
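The check above might be sketched as follows, assuming (hypothetically) that each contestant exposes a play(own_source, opponent_source) function returning 'C' or 'D', and that obfuscate is any semantics-preserving source transform:

```python
def load(source: str):
    # Materialize a contestant from its source text.
    ns = {}
    exec(source, ns)
    return ns["play"]

def honesty_check(x_source: str, obfuscate) -> bool:
    # (1) Build two distinct disguises Y and Z of the same program X.
    y_source = obfuscate(x_source)
    z_source = obfuscate(obfuscate(x_source))
    y, z = load(y_source), load(z_source)
    # (2) Run Y against Z.  Mutual cooperation suggests X does not rely on
    # a literal source-code comparison to recognize itself.
    return y(y_source, z_source) == "C" and z(z_source, y_source) == "C"
```

A source-comparing clique bot fails this check (its disguises no longer match each other textually), while an unconditional cooperator passes it.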
This honesty check doesn’t work if Y and Z are given access to their sources. Sure, when I simulate Y against Z, I could lie to Y and tell Y that its source is X (so Y believes itself to be unmodified). But when my deluded Y simulation is deciding whether to cooperate with Z, it (Y) may run Z in simulation. If Y informs its Z-simulation that Z’s source is Z, then that Z-simulation will not be deluded into thinking that it is unmodified. Y’s simulation of Z will be able to detect that it is an (obfuscated) simulation and act accordingly.
This honesty check isn’t foolproof. X can recognize itself with a more complicated handshake — one that survives code obfuscation. But if X recognizes itself with a more complicated handshake, then X doesn’t need to know its own source code (and we shouldn’t bother passing the source code in).
I had in mind an automated wrapper generator for the “passed own sourcecode” version of the contest:
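(The original generator code is not reproduced here; what follows is a hedged reconstruction that models programs as Python functions of (self_source, opponent_source) rather than literal source strings.)

```python
def cliquebot(self_src, opp_src):
    # CliqueBot cooperates only with exact copies of (what it is told is)
    # its own source, and defects against everyone else.
    return "C" if opp_src == self_src else "D"

def wrap(bot, true_src):
    # The wrapper discards whatever "self" source the tournament hands it
    # and substitutes the wrapped bot's true source, so for all X, Y:
    #   wrap(bot, S)(X, Y) == bot(S, Y)
    def wrapped(self_src, opp_src):
        return bot(true_src, opp_src)
    return wrapped
```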
Note that for all values of X and Y, (WrappedCliqueBot X Y) == (CliqueBot CliqueBot Y), and there’s no possible code you could add to CliqueBot that would break this identity. Now I just realized that the very fact that WrappedCliqueBot doesn’t depend on its “self” argument provides a way to distinguish it from the unmodified CliqueBot using only black-box queries, so in that sense it’s not quite functionally identical. On the other hand, if you consider it unfair to discriminate against agents just because they use old-fashioned quine-type self-reference rather than exploiting the convenience of a “self” argument, then this transformation is fair.