When presenting claims that the cognitively superior agent wins, the AI safety community often draws analogies with two-player zero-sum games such as Chess and Go, in which the smartest and most ruthless player prevails. However, most real-world interactions are better modeled by repeated non-zero-sum games.
In an ecological context, Maynard Smith and Price introduced the Hawk-Dove game to explain why, in nature, many animals contesting scarce resources such as mates or food engage in only limited conflict rather than wiping out their rivals (Maynard Smith and Price, 1973); for example, a male stag will often yield to a rival without a fight.
When we model how hard-coded strategies for this game propagate via genetic reproduction in a large well-mixed population, it turns out that, for certain payoff structures, peaceful strategies (“Doves”) co-exist with aggressive strategies (“Hawks”) in a stable equilibrium (see https://sphelps.net/teaching/egt.html for illustrative numerical simulations).
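As a concrete illustration, here is a minimal Python sketch (not the simulations linked above) of the replicator dynamic for the Hawk-Dove game. With resource value V and fight cost C > V, the population converges to a mixed equilibrium containing a fraction V/C of Hawks; the payoff values and initial conditions below are illustrative assumptions.

```python
# Minimal sketch: replicator dynamic for the Hawk-Dove game in a large,
# well-mixed population.  Assumed payoffs: Hawk vs Hawk gets (V - C)/2,
# Hawk vs Dove gets V, Dove vs Hawk gets 0, Dove vs Dove gets V/2.

V, C = 2.0, 4.0      # resource value and cost of an escalated fight (assumed)
x = 0.9              # initial fraction of Hawks (assumed)
dt = 0.01            # Euler step size

for _ in range(100_000):
    # Expected payoff of each strategy against a random opponent.
    f_hawk = x * (V - C) / 2 + (1 - x) * V
    f_dove = x * 0.0 + (1 - x) * V / 2
    f_mean = x * f_hawk + (1 - x) * f_dove
    # Replicator dynamic: strategies grow in proportion to relative fitness.
    x += dt * x * (f_hawk - f_mean)

print(f"Hawk fraction: {x:.3f}  (analytic prediction V/C = {V / C:.3f})")
```

With these assumed values the simulated Hawk fraction settles near 0.5, matching the analytic prediction V/C; neither pure strategy takes over the population.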
The Hawk-Dove game is also sometimes called “the Chicken Game” because it equally models a scenario in which two opposing drivers speed towards each other on a collision course and simultaneously choose whether to swerve or drive straight. Neither player wants to “look like a chicken” by swerving, but if both drive straight they crash and die.
In the Chicken Game, cognitive superiority does not always equate to winning. For example, by pre-committing to driving straight and making this common knowledge, we can beat a rational opponent whose best response is then to swerve. Similar arguments were put forward during the Cold War for removing rational deliberation from the decision to retaliate against an enemy first strike: taking humans out of the loop and putting strategic weapons systems on hair-trigger automated alert, because without pre-commitment a threat to retaliate is not credible, since there is no advantage to retaliating once the opponent has actually struck. Notice that this dynamic is the exact opposite of the power-seeking behaviour posited for expected-utility-maximising reinforcement-learning agents, which are predicted to expand their set of possible choices. In non-zero-sum games, by contrast, it can make sense to reduce one’s choices. Moreover, the cognitive capacity required to enact policies over reduced choices is lower: stupid can beat smart.
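The commitment argument can be made concrete with a toy payoff matrix (the numbers below are illustrative assumptions, not taken from any particular source): once one driver’s commitment to driving straight is common knowledge, the other driver’s unique best response is to swerve.

```python
# Minimal sketch: pre-commitment in the Chicken Game.  Assumed payoffs:
# mutual swerve 0, "winning" +1, "chickening out" -1, and a crash -10 for both.

SWERVE, STRAIGHT = 0, 1
PAYOFF = {                        # (row payoff, column payoff)
    (SWERVE,   SWERVE):   ( 0,   0),
    (SWERVE,   STRAIGHT): (-1,   1),
    (STRAIGHT, SWERVE):   ( 1,  -1),
    (STRAIGHT, STRAIGHT): (-10, -10),
}

def best_response(row_action):
    """Rational column player's reply to a known, committed row action."""
    return max((SWERVE, STRAIGHT),
               key=lambda a: PAYOFF[(row_action, a)][1])

# If the row player credibly commits to driving straight, the rational
# column player's best response is to swerve ...
assert best_response(STRAIGHT) == SWERVE
# ... and the committed (possibly "dumber") player collects the higher payoff.
print(PAYOFF[(STRAIGHT, best_response(STRAIGHT))])   # (1, -1)
```

The committed player needs no sophisticated deliberation at all, which is the point: removing options (and the reasoning over them) is what makes the threat credible.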
Although Maynard Smith and Price originally formulated the logic of animal conflict in terms of evolutionary adaptation, the same dynamic model can also describe social learning by boundedly-rational agents (see Phelps and Wooldridge, 2013, for a review). For evidence on the cognitive capacity of non-human animals to deliberate in Hawk-Dove interactions, see Morikawa et al. (2002).
References
Morikawa, T., Hanley, J.E. and Orbell, J., 2002. Cognitive requirements for hawk-dove games: A functional analysis for evolutionary design. Politics and the Life Sciences, 21(1), pp.3-12.
Phelps, S. and Wooldridge, M., 2013. Game theory and evolution. IEEE Intelligent Systems, 28(4), pp.76-81.
Maynard Smith, J. and Price, G.R., 1973. The logic of animal conflict. Nature, 246(5427), pp.15-18.