Really, really interesting. I’m going to do some formal analysis later. Some brainstorming first:
I want my bot to have memory. My bot needs to track the opponent bot’s previous moves in a given round.
My bot would have hypotheses on the kind of bot it is playing. It would particularly need to have the following types of hypothesis {fixed probability distribution over all numbers, a reactionary program (the current move of the program depends on my bots previous moves).
I want my bot to cooperate against reactionary bots that adopt tit for tat with forgiveness. Of the reactionary bot does not forgive, my bot may be unable to cooperate. Especially if the reactionary bot tries to exploit or punish my bot based on my bot’s initial moves.
I want my bot to quickly identify itself as soon as possible. This may involve my bot playing suboptimally in the first few moves, but should also make it easier to identify reactionary bots.
I want my bot to cooperate against itself.
I want my bot to exploit submissive $(P(x \in [0,2] > 0.5)$ (where $x$ is the number the opponent bot writes) bots. The degree of exploitation (how often it exploits) depends on my bot’s hypothesis of how submissive it’s opponent is.
I want my bot to punish offensive $(P(x \in [3,5] > 0.5)$ (where $x$ is the number the opponent bot writes) bots. The degree of punishment (how often it punishes) depends on my bot’s hypothesis of how offensive it’s opponent is.
I want my bot to be Bayesian. My bot would use Bayesian inference to update it’s hypotheses on its opponent.
I would specify a utility function that specifies the appropriate trade off my bot should make between gaining points, and figuring out the kind of opponent it is facing. (I already have a rough idea of how my bot would go about gaining information due to my project on optimising scientific research).
I expect that my bot becomes more optimal as n tends to infinity, where n is the number of turns in a given round. This is because my bot may waste the first few turns may be in order to gain information.
I expect that given the source code of any bot, it is possible to script a predator bot, such that the predator bot always performs at least as good as the prey bot. As such, I expect to not disclose certain parts of my bot’s algorithm once I complete the design.
I’m thinking of having my bot randomly playing 5 or 0 in the first round, so it can quickly identify itself. If the opponent played something else, it adopts accordingly. If the opponent played 5 or 0, it adopts another sequence of rituals (I don’t want to say it here so that a predator bot wouldn’t be scripted for my bot) which should help confirm that it is playing against itself.
I initially thought along those lines, but I realized that if your Bayesian update includes your own strategy, you can very quickly converge to playing optimally against yourself without an explicit handshake. See my thought process here.
There are 5^N possible strings for a handshake that lasts N turns. Select the handshake strong randomly. If the handshake is successful, the probability that the bot it is playing against is itself is (1 − 5^-N).
Using N = 3, we can establish with a Pr(0.992) that we are playing against a copy of ourself within 3 turns.
2 turns would be spent renormalising if we are not playing against ourself.
Really, really interesting. I’m going to do some formal analysis later. Some brainstorming first:
I want my bot to have memory. My bot needs to track the opponent bot’s previous moves in a given round.
My bot would have hypotheses on the kind of bot it is playing. It would particularly need to have the following types of hypothesis {fixed probability distribution over all numbers, a reactionary program (the current move of the program depends on my bots previous moves).
I want my bot to cooperate against reactionary bots that adopt tit for tat with forgiveness. Of the reactionary bot does not forgive, my bot may be unable to cooperate. Especially if the reactionary bot tries to exploit or punish my bot based on my bot’s initial moves.
I want my bot to quickly identify itself as soon as possible. This may involve my bot playing suboptimally in the first few moves, but should also make it easier to identify reactionary bots.
I want my bot to cooperate against itself.
I want my bot to exploit submissive $(P(x \in [0,2] > 0.5)$ (where $x$ is the number the opponent bot writes) bots. The degree of exploitation (how often it exploits) depends on my bot’s hypothesis of how submissive it’s opponent is.
I want my bot to punish offensive $(P(x \in [3,5] > 0.5)$ (where $x$ is the number the opponent bot writes) bots. The degree of punishment (how often it punishes) depends on my bot’s hypothesis of how offensive it’s opponent is.
I want my bot to be Bayesian. My bot would use Bayesian inference to update it’s hypotheses on its opponent.
I would specify a utility function that specifies the appropriate trade off my bot should make between gaining points, and figuring out the kind of opponent it is facing. (I already have a rough idea of how my bot would go about gaining information due to my project on optimising scientific research).
I expect that my bot becomes more optimal as n tends to infinity, where n is the number of turns in a given round. This is because my bot may waste the first few turns may be in order to gain information.
I expect that given the source code of any bot, it is possible to script a predator bot, such that the predator bot always performs at least as good as the prey bot. As such, I expect to not disclose certain parts of my bot’s algorithm once I complete the design.
I’m thinking of having my bot randomly playing 5 or 0 in the first round, so it can quickly identify itself. If the opponent played something else, it adopts accordingly. If the opponent played 5 or 0, it adopts another sequence of rituals (I don’t want to say it here so that a predator bot wouldn’t be scripted for my bot) which should help confirm that it is playing against itself.
I initially thought along those lines, but I realized that if your Bayesian update includes your own strategy, you can very quickly converge to playing optimally against yourself without an explicit handshake. See my thought process here.
There are 5^N possible strings for a handshake that lasts N turns. Select the handshake strong randomly. If the handshake is successful, the probability that the bot it is playing against is itself is (1 − 5^-N).
Using N = 3, we can establish with a Pr(0.992) that we are playing against a copy of ourself within 3 turns.
2 turns would be spent renormalising if we are not playing against ourself.