satt comments on Self-modification as a game theory problem

satt 28 Jun 2017 20:06 UTC
5 points
I’m flashing back to reading Jim Fearon!

Fearon’s paper concludes that pretty much only two mechanisms can explain “why rationally led states” would go to war instead of striking a peaceful bargain: private information, and commitment problems.

Your comment brushes off commitment problems in the case of superintelligences, which might turn out to be right. (It’s not clear to me that superintelligence entails commitment ability, but nor is it clear that it doesn’t entail commitment ability.) I’m less comfortable with setting aside the issue of private information, though.

Assuming rational choice, competing agents will truthfully share information only if they have an incentive to do so, or at least no incentive not to. But where war is a real possibility, I’d expect the incentives to actively encourage secrecy: exaggerating war-making power and/or resolve could let an agent drive a harder bargain.

You suggest that the ability to precommit could guarantee information sharing, but I’m uneasy about assuming that without a systematic argument or model. Did Schelling or anybody else formally analyze how that would work? My gut has the sinking feeling that drawing up the implied extensive-form game and solving for equilibrium would produce a non-zero probability of non-commitment, imperfect information exchange, and conflict.

Finally, I’ll bring in a new point: Fearon’s analysis explicitly relies on assuming unitary states. In practice, though, states are multipartite, and if the war-choosing bit of the state can grab most of the benefits from a potential war while dumping most of the potential costs on another bit of the state, that can enable war. I expect something analogous could produce war between superintelligences, as I don’t see why superintelligences have to be unitary agents.
- cousin_it 28 Jun 2017 21:02 UTC
  0 points
  That’s a good question and I’m not sure my thinking is right. Let’s say two AIs want to go to war for whatever reason. Then they can agree to some other procedure that predicts the outcome of war (e.g. war in 1% of the universe, or simulated war) and precommit to accept it as binding. It seems like both would benefit from that.
  
  That said, I agree that bargaining is very tricky. Coming up with an extensive form game might not help, because what if the AIs use a different extensive form game? There’s been pretty much no progress on this for a decade, and I don’t see any viable attack.
  - satt 28 Jun 2017 23:11 UTC
    1 point
    
    Let’s say two AIs want to go to war for whatever reason. Then they can agree to some other procedure that predicts the outcome of war (e.g. war in 1% of the universe, or simulated war) and precommit to accept the outcome as binding. It seems like both would benefit from that.
    
    My (amateur!) hunch is that an information deficit bad enough to motivate agents to sometimes fight instead of bargain might be an information deficit bad enough to motivate agents to sometimes fight instead of precommitting to exchange info and then bargain.
    
    Coming up with an extensive form game might not help, because what if the AIs use a different extensive form game?
    
    Certainly, any formal model is going to be an oversimplification, but models can be useful checks on intuitive hunches like mine. If I spent a long time formalizing different toy games to try to represent the situation we’re talking about, and I found that none of my games had (a positive probability of) war as an equilibrium strategy, I’d have good evidence that your view was more correct than mine.
    
    There’s been pretty much no progress on this in a decade, I don’t see any viable attack.
    
    There might be some analogous results in the post-Fearon, rational-choice political science literature; I don’t know it well enough to say. And even if not, it might be possible to build a relevant game incrementally.
    
    Start with a take-it-or-leave-it game. Nature samples a player’s cost of war from some distribution and reveals it only to that player. (Or, alternatively, Nature randomly assigns a discrete, privately known type to a player, where the type reflects the player’s cost of war.) That player then chooses between (1) initiating a bargaining sub-game and (2) issuing a demand to the other player, triggering war if the demand is rejected. This should be tractable, since standard, solvable models exist for two-player bargaining.
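
To make that first step concrete, here is a minimal sketch of a simpler relative of the game, not the exact one described: the uninformed player issues a single demand, and the privately informed player accepts or fights. The names `p`, `c_p` and `c_max`, the uniform cost distribution, and the prize of size 1 are all assumptions chosen for illustration.

```python
# Toy ultimatum game with one-sided private information (an assumed variant).
# p     : proposer's probability of winning a war over a prize of size 1
# c_p   : proposer's cost of war (common knowledge)
# c_max : responder's cost is private, drawn uniformly from [0, c_max]


def war_probability(x, p, c_max):
    """Chance that demand x is rejected.

    The responder keeps 1 - x by accepting and expects 1 - p - c from war,
    so it fights exactly when its private cost c is below x - p.
    """
    return min(max((x - p) / c_max, 0.0), 1.0)


def proposer_payoff(x, p, c_p, c_max):
    """Proposer's expected payoff from demanding a share x of the prize."""
    q = war_probability(x, p, c_max)
    return (1 - q) * x + q * (p - c_p)


def best_demand(p, c_p, c_max, steps=10_000):
    """Grid search over demands in [0, 1] for the proposer's best offer."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda x: proposer_payoff(x, p, c_p, c_max))
```

With the assumed numbers `p = 0.5`, `c_p = 0.1`, `c_max = 0.3`, the search settles on a demand near 0.6 and a war probability near one third. Raising `c_p` above `c_max` makes the proposer too cautious to risk rejection, and the war probability drops to zero. So even this stripped-down version can have war on the equilibrium path, which is the usual risk-return trade-off in this literature.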
    
    So far we have private information, but no precommitment. But we could bring precommitment in by adding extra moves to the game: before making the bargain-or-demand choice, players can mutually agree to some information-revealing procedure followed by bargaining with the newly revealed information in hand. Solving this expanded game could be informative.
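
One hedged way to probe the expanded game is to ask whether the informed player would ever agree to the revealing procedure. The sketch below uses an assumed toy setup rather than the exact game above: a prize of size 1, an uninformed proposer who wins a war with probability `p` at cost `c_p`, and a responder whose private cost `c` is uniform on `[0, c_max]`. The demand formula in `private_info_demand` is the proposer's optimum under those assumptions only.

```python
# Does the informed side gain from revealing its cost? (Assumed toy setup.)


def private_info_demand(p, c_p, c_max):
    """Proposer's best take-it-or-leave-it demand when c is hidden.

    For c uniform on [0, c_max] the optimum is p + (c_max - c_p) / 2,
    floored at p (no risk taken when c_p >= c_max) and capped at 1.
    """
    return min(1.0, p + max(0.0, (c_max - c_p) / 2))


def responder_payoff_private(c, x, p):
    """Responder accepts (keeps 1 - x) or fights (expects 1 - p - c)."""
    return max(1 - x, 1 - p - c)


def responder_payoff_revealed(c, p):
    """With c known, the proposer demands p + c (at most 1) and is accepted."""
    return 1 - min(p + c, 1.0)


def gain_from_secrecy(c, x, p):
    """How much better off a responder of cost c is when c stays hidden."""
    return responder_payoff_private(c, x, p) - responder_payoff_revealed(c, p)
```

In this toy, `gain_from_secrecy` is never negative, and it is strictly positive for responders whose cost exceeds `x - p`. The informed side therefore has no reason to accept revelation unless it is compensated, which pushes the bargaining problem back a step. That is only suggestive, since a richer game with side payments or two-sided private information could behave differently.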