Given Rice’s theorem, there is no source code analysis that is guaranteed to tell you anything interesting about your opponent, unless your opponent has been written in such a way as to facilitate that analysis. This suggests two possible modifications of the competition.
Firstly, the rule could be imposed that source code is unanalysable. A competing program will not be furnished with its opponent’s source, but only an opaque blob. The only operation they can perform on this blob is to supply it with another blob and see what answer it gives. Each program is given access to itself (as an opaque blob), and can contain within itself whatever other blobs it wishes to use for probing purposes. Note that if every program begins a contest by running its opponent on some example, then no-one terminates and everyone loses. Therefore a program must sometimes make a decision without having run its opponent on any examples. This looks like it severely limits the usefulness of simulating the opponent to see what it will do, perhaps to the point where it is impossible to gain anything from it, and the competition reduces to an ordinary PD tournament.
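To make the deadlock concrete, here is a runnable toy version of the blob protocol in Prolog (the language the thread reaches for later); the names cooperatebot, defectbot and simulator are illustrative, not part of the proposal, and run/3 stands for the single permitted operation.

    % Toy model of the blob protocol: blobs are atoms, and
    % run(+Blob, +OpponentBlob, -Choice) is the only operation
    % a program may perform on a blob.
    run(cooperatebot, _, cooperate).
    run(defectbot, _, defect).
    % simulator begins every contest by running its opponent against
    % itself (each program has access to itself as a blob) and
    % mirrors whatever comes back.
    run(simulator, Opp, Choice) :-
        run(Opp, simulator, Choice).

Here run(simulator, cooperatebot, C) halts with C = cooperate, but run(simulator, simulator, C) recurses forever: two programs that both begin by simulating each other never terminate, which is exactly the losing outcome described above.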
Alternatively, source code would be revealed, but in addition each program would be allowed to furnish machine-checkable proofs of assertions about itself. The machine checking would be part of the infrastructure of the competition, so competitors would be able to have complete trust in these assertions.
Or one could combine these: one receives one’s opponent as an unanalysable blob together with some set of guaranteed true assertions about it, supplied by the creator of that program and verified by the competition infrastructure. The idea is that skill at analysing someone’s source code is secondary to the point of the exercise.
What verifiably true assertions would it be useful for a competitor to make about itself? It would be easy to write code of which it can be proved “I always cooperate”, “I always defect”, “I play Tit For Tat”, or any of the other ordinary PD strategies. What verifiable assertions can a program make about the sort of opponent it will cooperate or defect with? “I will cooperate with anyone who cooperates with me” would not be an admissible assertion (by Rice’s theorem). Nor would “I will cooperate with anyone who verifiably asserts that they cooperate with me”. But “I will cooperate with anyone who verifiably asserts that they cooperate with anyone verifiably asserting this sentence” might be, if the circularity can be handled.
ETA: But “I will cooperate with anyone who verifiably asserts that they cooperate with anyone verifiably asserting a sentence logically equivalent to this one” won’t work, because logical equivalence of sentences is itself undecidable. So there would have to be a negotiation phase before the competition in which people publish the assertions they want to make about themselves, so that people can coordinate to agree on the exact formulation of the mutually referential sentences they want to be seen to be true of them.
Thinking about this further, given enough assertions, there’s no need to have the programs at all. Let the agents be just the assertions that they make about themselves. Each agent would consist of a set of axioms, perhaps written in Prolog, about who they cooperate or defect against, and running a contest between two agents would just be a matter of taking the union of the two sets of axioms and attempting to deduce (by the standard Prolog proof-search) what choice each one makes.
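A minimal sketch of such a contest, assuming (per the negotiation phase mentioned above) that both agents have already agreed on a shared label for their mutually referential clause; asserts/2 stands in for the competition-verified assertions, and the agent names are illustrative.

    % Agent a's axiom: cooperate with anyone who verifiably asserts
    % the agreed-on clause. Agent b publishes the identical axiom.
    cooperates(a, X) :- asserts(X, mutual_cooperation_clause).
    cooperates(b, X) :- asserts(X, mutual_cooperation_clause).
    % Competition-verified record of who asserts the clause.
    asserts(a, mutual_cooperation_clause).
    asserts(b, mutual_cooperation_clause).

Running the contest is just querying the union of the two axiom sets: ?- cooperates(a, b). and ?- cooperates(b, a). both succeed, so the agents mutually cooperate.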
What happens if the two sets of axioms are individually consistent, but together are inconsistent?
Deem both of the agents to have not terminated?
How will you know? The set of consistent axiom systems is undecidable. (Though the set of inconsistent axiom systems is computably enumerable.)
You just run the Prolog (or whatever logic system implements all this), and it either terminates with an answer, terminates in failure, or runs out of the time allowed by the competition. The time limit renders everything practically decidable.
Perhaps it terminates within the time allowed, proving that A defects and B cooperates, even though the axioms were inconsistent and one could equally have proved that A cooperates and B defects.
The competitors know the deterministic proof-search algorithm (e.g. that of Prolog), and the first answer it produces within the time limit is the answer.
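In SWI-Prolog, for instance, the harness could look like the following sketch: call_with_time_limit/2 from library(time) enforces the budget over axioms like those above, and the first solution the deterministic search finds is taken as final. Reading a failed search as defection is an illustrative choice here, not part of the proposal.

    :- use_module(library(time)).

    % decide(+Agent, +Opponent, -Outcome): run the standard proof
    % search for a fixed budget; its first result stands, even if
    % the combined axioms could also prove the opposite.
    decide(Agent, Opponent, Outcome) :-
        catch(
            (   call_with_time_limit(1.0, cooperates(Agent, Opponent))
            ->  Outcome = cooperates   % first proof found wins
            ;   Outcome = defects      % search failed within budget
            ),
            time_limit_exceeded,
            Outcome = no_answer        % timed out: deemed non-terminating
        ).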
What’s wrong with “I will cooperate with anyone who verifiably asserts that they cooperate with me”? A program could verifiably assert that by being, e.g., cooperatebot. A program could be written that cooperates with any opponent that provides it with a proof that it will cooperate.
The word “me”. By Rice’s theorem, they can’t tell that they’re dealing with someone computationally equivalent to me, and there’s no other sense of my “identity” that can be referred to.
Although that could be added. Have all competitors choose a unique name and allow them to verifiably claim to be the owner of their name. Then “I will cooperate with anyone who verifiably asserts that they cooperate with me” can work if “me” is understood to mean “the entity with my name”. Discovering a blob’s name would have to be a second primitive operation allowed on blobs.
ETA: I think I missed your point a little. Yes, a cooperate-bot verifiably asserts that it cooperates with everyone, so it is an entity that “verifiably asserts (something that implies) that they will cooperate with me.” And there will be other bots of which this is true. But I’m not sure that I can verifiably express that I will cooperate with the class of all such bots, because “the class of all such bots” looks undecidable.
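A toy version of the naming scheme described above (all facts illustrative): name_of/2 plays the second primitive operation on blobs, and asserts/2 again stands for competition-verified assertions, now keyed by name rather than by source code.

    % The second primitive: name_of(+Blob, -Name) reveals a blob's
    % registered name.
    name_of(blob1, alice).
    name_of(blob2, bob).
    % Competition-verified assertions, keyed by name.
    asserts(alice, cooperates_with(bob)).
    asserts(bob, cooperates_with(alice)).
    % "I will cooperate with anyone who verifiably asserts that they
    % cooperate with me", reading "me" as "the entity with my name".
    cooperates(Me, OppBlob) :-
        name_of(OppBlob, Them),
        asserts(Them, cooperates_with(Me)).

So ?- cooperates(alice, blob2). succeeds, because bob’s verified assertion names alice; nothing about blob2’s behaviour had to be analysed.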
Your source code is your name. Having an additional name would be irrelevant. It is certainly possible for bots to prove they cooperate with a given bot, by looking at that particular bot’s source. It would, as you say, be much harder for a bot to prove it cooperates with every bot equivalent to a given bot (in the sense of making the same cooperate/defect decisions vs. every opponent).
Rice’s theorem may not be as much of an obstruction as you seem to indicate. For example, Rice’s theorem doesn’t prohibit a bot which proves that it defects against all defectbots, and cooperates with all cooperatebots. Indeed, you can construct an example of such a bot. (Rice’s theorem would, however, prevent constructing a bot which cooperates with cooperatebots and defects against everyone else.)
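For instance, in the toy blob model sketched earlier, a mirror bot is exactly such a construction (the first two facts are repeated so the snippet stands alone; the same probing pattern appeared as simulator above).

    run(cooperatebot, _, cooperate).   % repeated from the earlier sketch
    run(defectbot, _, defect).
    % mirror plays whatever its opponent plays against mirror.
    run(mirror, Opp, Choice) :- run(Opp, mirror, Choice).

Any extensional cooperatebot halts with cooperate on every input, so run(mirror, cooperatebot, C) halts with C = cooperate, and symmetrically mirror defects against every defectbot. Against another mirror it never halts, which is all Rice’s theorem requires: mirror never decides the property “is a cooperatebot”; it merely inherits the opponent’s behaviour wherever that behaviour is defined.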