Yet More Modal Combat
Strongly recommended prior reading:
Robust Cooperation in the Prisoner's Dilemma: Program Equilibrium via Provability Logic
I think I have come up with a modal agent that has these nice properties:

- Cooperates against itself, and against similar designs.
- Cooperates against some FairBots (see notes later).
- Defects against CooperateBot.
- Never gets the sucker's payoff (C,D).
- Only references the situation it's actually in.
The design makes use of two different proof systems, W and S, short for weak and strong. For example, you could take W = PA and S = PA+1, although W = intuitionistic Robinson arithmetic and S = ZFC+GCH+IO would work too.
The algorithm:

    If W[C] then Defect
    Else if S[C] then Cooperate
    Else Defect

where C = "My opponent cooperates against me."
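To make the control flow concrete, here is a minimal Python sketch. The `proves_weak` and `proves_strong` oracles are hypothetical stand-ins for proof searches in W and S (names mine); this illustrates the branching, not a real implementation of provability.

```python
from typing import Callable

# A prover takes a sentence (named by a string here) and returns
# True iff the corresponding proof system finds a proof of it.
Prover = Callable[[str], bool]

C = "my opponent cooperates against me"

def ws_bot(proves_weak: Prover, proves_strong: Prover) -> str:
    """Decision procedure for the W/S agent described above."""
    if proves_weak(C):      # W[C]: even the weak system proves cooperation,
        return "D"          # so defect and exploit.
    if proves_strong(C):    # S[C] but not W[C]: cooperate.
        return "C"
    return "D"              # Neither system proves cooperation: defect.
```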
The last point is key here. Let's suppose this game is being played by actual computers made of physical atoms. The JustBot and PrudentBot in the paper linked above consider what the opponent would do against a player other than themselves. For example, PrudentBot imagines a hypothetical world where its own code has been deleted and replaced with the code for DefectBot. The hypothetical opponent may notice this apparent anomaly and decide that investigating it is far more important than the prisoner's dilemma.
My bot (any good names in the comments?) solves this by only considering how its opponent behaves against itself.

Consider two versions of this bot playing against each other, not necessarily using the same W and S as each other.
Let's assume all four proof systems in use (W1, S1, W2, S2) are consistent. We also want the stronger condition that none of the proof systems can ever prove a false statement, i.e. that all four are sound. (Although I suspect this can be weakened somewhat.)
Neither bot ever gets the sucker's payoff (C,D), as each bot will only cooperate if its S can prove that the other bot cooperates too, and by soundness that proof can't be wrong. Now suppose W[C] held for either bot: that bot would defect while, again by soundness, its opponent cooperated, handing the opponent the sucker's payoff. We just showed this can't happen, so neither W actually succeeds. If both of the S are sufficiently strong to prove this (S1[¬W2[C1]] and S2[¬W1[C2]]), then the situation reduces to FairBot vs FairBot using proof systems S1 and S2, which cooperates for the standard Löbian reasons. (Note: some technical work remains on when exactly this is true; still, I expect it to hold for most sensible choices of proof system, with the usual assumption that both systems are strong enough to do Löb-style reasoning. See the bottom of page 8 of the linked paper.)
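Writing the no-sucker's-payoff step out explicitly, under the soundness assumption above (notation mine: a_i is bot i's action, C_i the sentence "bot i cooperates against the other bot"):

```latex
\begin{align*}
a_1 = C &\implies S_1 \vdash C_2        && \text{(cooperation only happens via the $S$-branch)}\\
        &\implies C_2 \text{ holds}     && \text{(soundness of $S_1$)}\\
        &\implies a_2 = C,
\end{align*}
% so bot 1 is never the sucker; symmetrically for bot 2. Then
\begin{align*}
W_1 \vdash C_2 &\implies a_1 = D \text{ and } a_2 = C && \text{(first branch fires; soundness of $W_1$)}
\end{align*}
% which is the excluded $(C,D)$ outcome, so $W_1 \nvdash C_2$, and likewise $W_2 \nvdash C_1$.
```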
FairBot can be considered a special case of this bot design, where W is a proof system so weak it doesn't prove anything at all; the first branch then never fires. In this case, FairBot will cooperate if it's using a strong enough proof system.
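In the Python sketch above, that special case is just a weak prover that always fails (again, illustrative names only):

```python
def never_proves(sentence: str) -> bool:
    return False  # a "proof system" so weak it proves nothing at all

# FairBot with proof system S: cooperate iff S proves the opponent cooperates.
def fairbot(proves_in_S: Prover) -> str:
    return ws_bot(proves_weak=never_proves, proves_strong=proves_in_S)
```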
CooperateBot cooperates unconditionally, so even W can prove that it cooperates, and this bot defects against it.
A final note: the same type of reasoning can be done where W and S represent proof searches in the same language (say PA), but with S having a much larger maximum proof length, long enough to prove that W doesn't prove something by brute-forcing over all proofs short enough.
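A minimal sketch of that bounded-search picture, assuming a hypothetical `enumerate_proofs(max_length)` that yields (proof, conclusion) pairs for every PA proof up to the given length:

```python
def bounded_prover(max_length: int) -> Prover:
    """A prover that searches every PA proof of at most max_length symbols.

    With a large enough bound, the strong searcher can decide what the weak
    one proves simply by checking all of the weak searcher's candidate proofs.
    """
    def proves(sentence: str) -> bool:
        return any(conclusion == sentence
                   for _proof, conclusion in enumerate_proofs(max_length))
    return proves

W = bounded_prover(10**6)    # weak: short proofs only
S = bounded_prover(10**12)   # strong: long enough to settle what W proves
```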
Open question: can this work with logical inductors? That is, two logical-inductor-based agents that cooperate if their P(C) tends to one, but not too fast.
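Purely as a guess at what that criterion might look like (the thresholds and stage numbers are arbitrary illustrations, not part of the proposal):

```python
def li_ws_bot(price, n: int) -> str:
    """Speculative analogue: price(k) is a logical inductor's stage-k credence in C.

    "Too fast" convergence plays the role of W[C]; eventual convergence
    plays the role of S[C].
    """
    if price(n // 10) > 0.99:   # confident very early ~ W[C]: exploit
        return "D"
    if price(n) > 0.99:         # confident eventually ~ S[C]: cooperate
        return "C"
    return "D"
```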