Copying over my thought process from your main blog:
Start with a short identifying sequence of 0s, 1s, 4s, and 5s. Seeing any 2 or 3 drops into oppositional mode immediately. Have each of the last two characters of this sequence convey only one bit of identification: {0,5} vs. {1,4}. Choose randomly which member of the set to play, to establish who will be the low side and who the high side. This gives you a 75% handshake chance and a 75% polarity-match chance, vs. 75/50 for finding them separately, at a cost of about 1 point in expectation.
In adversarial cases, start out with literal tit-for-tat: play the last number the opponent played. Come up with a short list of possible default strategies: always-2, always-3, always-N, 0/5-alternating, 2/3-alternating, 1/4-alternating, and probably a couple of others. If the opponent's output doesn't look quite like tit-for-tat, and its discrepancy with one of these other simple strategies is smaller, play the FDT-best-response strategy to that one. For alternating strategies it's the same alternation but in reverse; for always-N for N2 it's to play always-3. This will involve several special cases, but they only need to handle common strategies, so the list stays limited.
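A rough sketch of the "which simple strategy does this look like" step, in Python. Everything here is my own assumption for illustration: the names (`classify`, `discrepancy`, `candidates`), the choice to count mismatched rounds as the discrepancy, and treating each alternation in both phases. It only does the matching; picking the response to the matched strategy is left out.

```python
from typing import Callable, Dict, List, Optional

# A predictor says what a candidate strategy would play in round i,
# given our own past moves. None means "no prediction for this round".
Predictor = Callable[[int, List[int]], Optional[int]]


def tit_for_tat(i: int, mine: List[int]) -> Optional[int]:
    # Literal tit-for-tat: repeat the last number we played.
    return mine[i - 1] if i > 0 else None


def always(n: int) -> Predictor:
    return lambda i, mine: n


def alternating(a: int, b: int) -> Predictor:
    return lambda i, mine: a if i % 2 == 0 else b


def candidates() -> Dict[str, Predictor]:
    # Hypothetical candidate list; extend it with whatever else is common.
    table: Dict[str, Predictor] = {"tit-for-tat": tit_for_tat}
    for n in range(6):
        table[f"always-{n}"] = always(n)
    for a, b in [(0, 5), (2, 3), (1, 4)]:
        table[f"{a}/{b}-alternating"] = alternating(a, b)
        table[f"{b}/{a}-alternating"] = alternating(b, a)
    return table


def discrepancy(predict: Predictor, theirs: List[int], mine: List[int]) -> int:
    # Assumed metric: number of rounds where the prediction was wrong.
    count = 0
    for i, move in enumerate(theirs):
        expected = predict(i, mine)
        if expected is not None and expected != move:
            count += 1
    return count


def classify(theirs: List[int], mine: List[int]) -> str:
    # Default to tit-for-tat; switch only if another strategy fits strictly better.
    best_name = "tit-for-tat"
    best = discrepancy(tit_for_tat, theirs, mine)
    for name, predict in candidates().items():
        d = discrepancy(predict, theirs, mine)
        if d < best:
            best_name, best = name, d
    return best_name


print(classify([3, 3, 3, 3, 3], [2, 3, 2, 3, 2]))  # always-3
```

Ties go to tit-for-tat on purpose, since the idea is to leave the default only when something else is a strictly closer fit.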
Actually, skip the handshake code. The bot will get optimality against itself without it by starting out with random 2s and 3s for ~4 rounds and then proceeding from there.
Update from reading other people's thoughts: play 2 with probability 0.69 rather than 0.5 during the initial random run, for better performance as per Unnamed. This would also help against dumb no-forgiveness or minimal-forgiveness bots.
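The opening run itself is tiny. A minimal Python sketch, assuming a run of 4 rounds (from the "~4" above) and using made-up names (`opening_move`, `opening_run`, `p_two`):

```python
import random
from typing import List


def opening_move(rng: random.Random, p_two: float = 0.69) -> int:
    # Play 2 with probability p_two, otherwise play 3.
    return 2 if rng.random() < p_two else 3


def opening_run(rng: random.Random, rounds: int = 4, p_two: float = 0.69) -> List[int]:
    # The initial random stretch before switching to the main strategy.
    return [opening_move(rng, p_two) for _ in range(rounds)]


moves = opening_run(random.Random())
print(moves)
```

Passing in the `random.Random` instance keeps the run reproducible when testing; an actual entry could just use the module-level generator.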