I have two proposed alternatives to PrudentBot.
Proposal 1: If you can prove a contradiction, defect. Otherwise, if you can prove that your choice will be the same as the opponent’s, cooperate. Otherwise, defect.
Proposal 2: If you can prove that, if you cooperate, the other agent will cooperate, and you can’t prove that, if you defect, the other agent will cooperate, then cooperate. Otherwise, defect.
Both of these are unexploitable, cooperate with themselves, and defect against CooperateBot, if my calculations are correct. The first one is a simple way of “sanitizing” NaiveBot.
The second one is exactly cousin_it’s proposal here.
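For concreteness, here is roughly how I picture the two proposals as bounded proof searches. This is only a structural sketch: provable() stands in for a bounded proof-search oracle over some fixed formal system, which I am assuming rather than implementing, and the string encodings of the statements are purely illustrative.

```python
# Structural sketch of the two proposals (not the bots from the paper).
# `provable(statement)` is an assumed bounded proof-search oracle over some
# fixed formal system with a fixed proof-length bound; it is not implemented.

def proposal_1(provable, me, opponent):
    # If you can prove a contradiction, defect.
    if provable("False"):
        return "D"
    # Otherwise, if you can prove your choice equals the opponent's, cooperate.
    if provable(f"{me}({opponent}) == {opponent}({me})"):
        return "C"
    # Otherwise, defect.
    return "D"

def proposal_2(provable, me, opponent):
    # Cooperate iff you can prove "if I cooperate, the opponent cooperates"
    # but you cannot prove "if I defect, the opponent cooperates".
    coop_implies_coop = provable(f"{me}({opponent}) == 'C' implies {opponent}({me}) == 'C'")
    defect_implies_coop = provable(f"{me}({opponent}) == 'D' implies {opponent}({me}) == 'C'")
    return "C" if coop_implies_coop and not defect_implies_coop else "D"
```

Against CooperateBot the second search in proposal_2 always succeeds, which is why it defects there.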
Both of those agents are modal agents of rank 0, so the fact that they defect against CooperateBot implies that FairBot defects against them, by Theorem 4.1.
Yes, this is problematic. But it’s not clear to me that it’s a problem for my bots, rather than for FairBot. After all, one can equally say that they defect against FairBot.
Edit: I’ve thought more about why this is. The rational move against FairBot is to cooperate, unless FairBot’s reasoning system is inconsistent, in which case FairBot is just CooperateBot, and the rational move is to defect. ADTBot rationally defects against inconsistent ones. Since FairBot does not know its reasoning system is consistent, it defects. Since ADTBot is unexploitable, it, too, defects. So FairBot is not quite so Fair as it seems: it unfairly punishes you for defecting against inconsistent FairBots.
That’s an interesting take on it. My intuitions still strongly say that a good decision theory had better achieve mutual cooperation with FairBot, but I’ll consider this more thoroughly.
Why?
Let FairBot_L be the FairBot function with proof bound L and let FairBot_L,N be the Nth program (according to an arbitrary Gödel numbering) that computes FairBot_L. Then
(FairBot_L,N; AnythingUnexploitableWhichMutuallyCooperatesWithFairBot_L,N) is a payoff-dominant Nash equilibrium.
But there are (infinitely) many other payoff-dominant Nash equilibria not in this form, so there doesn’t seem to be any good reason to prefer it.
In fact, equilibria involving FairBots (or even PrudentBots) are unstable: add some uncertainty and they drift away towards a stable equilibrium, which can easily be a sub-optimal equilibrium of mutually defecting bots.
On the other hand, equilibria involving CliqueBots (or generalized CliqueBots) are both payoff-dominant and stable.
I don’t follow. What kind of uncertainty are you introducing that you say will break cooperation between two FairBots but not between two CliqueBots? If you’re thinking of an amended type of CliqueBot to handle such uncertainty, why not a similarly amended FairBot?
It’s the other way round: FairBots (and PrudentBots) cooperate too much, and this causes instability.
From Wikipedia: a Nash equilibrium for a mixed strategy game is stable if a small change (specifically, an infinitesimal change) in probabilities for one player leads to a situation where two conditions hold: 1) the player who did not change has no better strategy in the new circumstance; 2) the player who did change is now playing with a strictly worse strategy.
CliqueBot equilibria are stable:
Let CliqueBot_C be the CliqueBot for clique C. The pure strategy pair (CliqueBot_C, CliqueBot_C) is a Nash equilibrium.
If Player1 deviates from this strategy by playing CliqueBot_C with probability 1-epsilon and AnythingElseBot with probability epsilon, for small epsilon, then the optimal strategy for Player2 is still CliqueBot_C, and Player1’s expected payoff is strictly worse than the equilibrium payoff; therefore the equilibrium is stable. QED.
(This analysis can be extended to generalized CliqueBots that recognize each other by an equivalence predicate weaker than textual equality but strong enough to guarantee functional equivalence. In this case there is trivial instability within each clique, but the cliques themselves are stable.)
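As a sanity check, here is a toy numerical version of this argument. It is my own sketch, not something from the thread: it assumes the standard Prisoner’s Dilemma payoffs (3,3), (1,1), (5,0), models “source code” as a label that CliqueBot_C compares for exact equality, and uses DefectBot as the AnythingElseBot in the deviation.

```python
# Toy numerical check of the stability argument above.
# Assumptions (mine, not from the thread): standard PD payoffs
# R=3 (mutual C), P=1 (mutual D), T=5 (defect vs cooperator), S=0 (sucker);
# "source code" is modelled as a label, and AnythingElseBot is DefectBot.

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def clique_bot(my_label, opp_label):
    # Cooperate only with an exact (textual) copy of itself.
    return "C" if opp_label == my_label else "D"

def defect_bot(my_label, opp_label):
    return "D"

BOTS = {"CliqueBot_C": clique_bot, "DefectBot": defect_bot}

def play(label1, label2):
    move1 = BOTS[label1](label1, label2)
    move2 = BOTS[label2](label2, label1)
    return PAYOFF[(move1, move2)]

eps = 0.01                                    # Player1's small deviation
mix1 = {"CliqueBot_C": 1 - eps, "DefectBot": eps}

# Player1's expected payoff when Player2 sticks with CliqueBot_C.
p1 = sum(prob * play(bot, "CliqueBot_C")[0] for bot, prob in mix1.items())

# Player2's expected payoff for each available pure reply to the mixture.
p2 = {reply: sum(prob * play(bot, reply)[1] for bot, prob in mix1.items())
      for reply in BOTS}

print(f"Player1 after deviating: {p1:.2f} (equilibrium payoff: 3)")
print(f"Player2 replies: {p2}")               # CliqueBot_C stays the best reply
```

With eps = 0.01 the deviating player drops from 3 to 2.98 while CliqueBot_C remains Player2’s best reply (2.98 vs 1.00), which is exactly the stability condition quoted above.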
Stability is important both in standard game-theoretic settings, where players have some irreducible uncertainty about each other, and in evolutionary game theory, where there is a population of strategies that repeatedly face each other in memoryless games.
A population of a single clique of CliqueBots will preserve itself indefinitely in the face of mutations: any individual that deviates from the clique is immediately punished, therefore mutations fail to propagate.
What about the FairBots?
It’s easily shown that FairBot equilibria are unstable. Thus, in an evolutionary setting, a population of FairBots will accumulate mutations until exploitable bots such as CooperateBots appear. When the exploitable bots reach a certain mass, bots specialized to exploit them, such as DefectBots, will appear and proliferate, even if the remaining FairBots punish them. The attractor of this process is clearly the stable but suboptimal equilibrium of DefectBots (or other mutually defecting bots). (*)
Starting with a population of PrudentBots instead of FairBots doesn’t make things better, since PrudentBots can (de)evolve into FairBots, and in any case they fail to punish most types of exploitable bots.
Similar considerations can be made for the non-evolutionary setting.
(*) The probability of falling into this attractor depends on the specific random-walk dynamics. IIUC, under mild assumptions (e.g. bounded maximum program size) it will be 1 minus the probability of randomly reaching a clique.
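To put a rough number on “a certain mass”, here is a toy calculation of my own (assuming standard PD payoffs R=3, P=1, T=5, S=0, an infinite well-mixed population of FairBots and CooperateBots, and a rare DefectBot mutant): FairBots and CooperateBots earn identical payoffs against these residents, so drift between them is selectively neutral, while the mutant starts out-earning the residents once the CooperateBot share passes (R-P)/(T-P) = 1/2.

```python
# Toy calculation behind the "certain mass" claim.
# Assumptions (mine): standard PD payoffs R=3, P=1, T=5, S=0; FairBot
# mutually cooperates with FairBot and with CooperateBot but defects against
# DefectBot; infinite well-mixed population of FairBots and CooperateBots;
# we ask when a rare DefectBot mutant out-earns the residents.

R, P, T, S = 3, 1, 5, 0

def resident_payoff(coop_share):
    # Both FairBot and CooperateBot get R against every resident, so the
    # payoff is independent of coop_share: drift between them is neutral.
    return R

def mutant_payoff(coop_share):
    # A rare DefectBot gets T against CooperateBots and P against FairBots.
    return T * coop_share + P * (1 - coop_share)

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"CooperateBot share {x:.2f}: residents {resident_payoff(x):.2f}, "
          f"DefectBot mutant {mutant_payoff(x):.2f}")

# The mutant overtakes the residents once coop_share > (R - P) / (T - P) = 0.5,
# i.e. once enough selectively neutral drift has accumulated.
```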
One way I would think about PrudentBot is not as trying to approximate a decision theory, but rather as a special accommodation of the features of this particular format, where diverse intelligent agents are examining your source code. Rather than submitting a program to make optimal decisions, you submit a program which is simplified somewhat, in a way that errs on the side of cooperation, to make it easier for people who are trying to cooperate with you.
But something about my reasoning is wrong, as it doesn’t fit very well with the actual differences between the code of ADTBot and PrudentBot.
Should this be “If you can prove that you will cooperate, defect”? As it is, I don’t see how this prevents cooperation with CooperateBot, unless the agent uses an inconsistent system for proofs.
It kills the Löbian argument, I believe, since the implication “if there’s a proof that you cooperate, then cooperate” is no longer true. Instead, here’s a Löbian argument for defection:
Suppose there is a proof that you defect. Then either there is a proof of contradiction, or there is no proof that your move is the same as your opponent’s. Either way, you defect.
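In modal shorthand (my own sketch of the argument above, writing □ for provability in the agent’s system, C for “Proposal 1 cooperates against CooperateBot” and D for ¬C, so that the agent’s rule reduces to C ↔ (¬□⊥ ∧ □C)):

```latex
% Sketch of the Löbian argument for defection against CooperateBot.
% C = "Proposal 1 cooperates", D = ¬C; against CooperateBot the agent's rule
% reduces to C <-> (¬□⊥ ∧ □C).
\begin{align*}
&\text{Assume } \Box D \text{ and reason inside the agent's system.}\\
&\text{If } C \text{ held, the rule would give } \neg\Box\bot \wedge \Box C;
 \text{ together with } \Box D \text{ this yields } \Box\bot,\\
&\quad \text{contradicting } \neg\Box\bot.
 \text{ Hence } D, \text{ and so } \vdash \Box D \to D.\\
&\text{L\"ob's theorem then gives } \vdash \Box D,
 \text{ and with } \Box D \to D \text{ we conclude } \vdash D.
\end{align*}
```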
Proposal 1 runs into the problem that it does not cooperate with itself if the two copies have slightly different bounds on proof lengths. For if A cooperates, you can (with a not-too-long proof) show that B did not find a contradiction; but this contradicts the bounded version of the incompleteness theorem.
A similar problem holds for Proposal 2.
You can search for reasons to cooperate in a much stronger formal system than you search for reasons to defect in. Is there any decision-theoretic justification for this?
If you do that, you’re back in the same situation that you started with and are cooperating with CooperateBot again.
This is clearly not true for Proposal 2. No matter the formal system, against CooperateBot you will find a proof of (YouDefect ⇒ OpponentCooperates), and therefore defect.
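Spelled out in the same shorthand (my own sketch; C_X stands for “the opponent cooperates”):

```latex
% Proposal 2 in modal shorthand, and why it defects against CooperateBot
% no matter how strong the system searched for reasons to cooperate is.
\begin{align*}
\text{cooperate} &\iff \Box(C \to C_X) \wedge \neg\Box(D \to C_X)\\
\text{against CooperateBot: } \vdash C_X
 &\;\Longrightarrow\; \vdash (D \to C_X)
 \;\Longrightarrow\; \text{the second conjunct fails}
 \;\Longrightarrow\; \text{defect.}
\end{align*}
```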