cousin_it comments on Self-modification as a game theory problem

cousin_it 27 Jun 2017 12:31 UTC
1 point
Even if A is FAI and B is a paperclipper, as long as both use correct decision theory, they will instantly merge into a new SI with a combined utility function. Avoiding arms races and any other kind of waste (including waste due to being separate SIs) is in their mutual interest. I don’t expect rational agents to fail to achieve their mutual interest. If you expect that, your idea of rationality leads to predictably suboptimal utility, so it shouldn’t be called “rationality”. That’s covered in the sequences.
- turchin 27 Jun 2017 12:42 UTC
  1 point
  Parent
  But how could I be sure that a paperclip maximiser is a rational agent with a correct decision theory? I would not expect that from a paperclipper.
  - cousin_it 27 Jun 2017 12:54 UTC
    0 points
    Parent
    If an agent is irrational, it can cause all sorts of waste. I was talking about sufficiently rational agents.
    
    If the problem is proving rationality to another agent, SI will find a way.
    - turchin 27 Jun 2017 13:01 UTC
      2 points
      Parent
      My point is exactly this. If an SI is able to prove its rationality (meaning that it always cooperates in PD etc.), it is also able to fake any such proof.
      
      If you had two options, to turn off the paperclipper or to cooperate with it by giving it half of the universe, what would you do?
      - cousin_it 27 Jun 2017 13:07 UTC
        1 point
        Parent
        I imagine merging like this:
        
        1) Bargain about a design for a joint AI, using any means of communication
        
        2) Build it in a location monitored by both parties
        
        3) Gradually transfer all resources to the new AI
        
        4) Both original AIs shut down, new AI fulfills their combined goals
        
        No proof of rationality required. You can design the process so that any deviation will help the opposing side.
        turchin 27 Jun 2017 13:29 UTC
        1 point
        Parent
        I could imagine some failure modes, but surely I can’t imagine the best one. For example, “both original AIs shut down” simultaneously is vulnerable to defection.
        
        I also have some business experience, and I have found that almost every deal includes some cheating, and the cheating is something new every time. So I always have to ask myself: where is the cheating from the other side? If I don’t see it, that’s bad, as it could be something really unexpected. Personally, I hate cheating.
        cousin_it 27 Jun 2017 13:35 UTC
        0 points
        Parent
        An AI could devise a very secure merging process. We don’t have to code it ourselves.
        turchin 27 Jun 2017 13:40 UTC
        0 points
        Parent
        But should we merge with the paperclipper if we could turn it off?
        
        It reminds me of Great Britain’s policy towards Hitler before WW2, which was to give him what he wanted in order to prevent war. https://en.wikipedia.org/wiki/Appeasement
        cousin_it 27 Jun 2017 13:47 UTC
        0 points
        Parent
        If we can turn off the paperclipper for free, sure. But if war would destroy X resources, it’s better to merge and spend X/2 on paperclips.
        turchin 27 Jun 2017 14:14 UTC
        0 points
        Parent
        So if the price of turning off the paperclipper is Y, and Y is higher than X/2, we should cooperate?
        
        But if we agree on this, we create an incentive for the paperclipper to increase Y until it reaches X/2. To increase Y, the paperclipper has to invest in defense mechanisms or offensive weapons. That creates an arms race, which lasts until negotiations become more profitable. However, an arms race is risky and could turn into war.
        
        Edited: higher.
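
The decision rule being debated here can be written out as a small sketch. It is only one reading of the exchange: `shutdown_cost_y` stands for Y (the price of turning the paperclipper off) and `war_loss_x` for X (the resources a war would destroy). The function name and the assumption that both quantities are measured in the same units are illustrative, not something the thread specifies.

```python
def prefers_merging(shutdown_cost_y: float, war_loss_x: float) -> bool:
    """Return True when merging looks cheaper than turning the paperclipper off.

    Assumption (illustrative): merging costs us X/2, i.e. half of the
    resources that a war would otherwise destroy end up spent on paperclips.
    """
    concession = war_loss_x / 2
    return shutdown_cost_y > concession


# Hypothetical numbers: a war would destroy 100 units of resources.
print(prefers_merging(shutdown_cost_y=60, war_loss_x=100))  # shutting down costs more than conceding 50
print(prefers_merging(shutdown_cost_y=40, war_loss_x=100))  # shutting down is cheaper than conceding 50
```

Under this reading, the incentive problem raised above is that the paperclipper gains from pushing Y past the X/2 threshold, which is where the rule flips from "turn off" to "merge".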
        cousin_it 27 Jun 2017 14:26 UTC
        0 points
        Parent
        The paperclipper doesn’t need to invest anything. The AIs will just merge without any arms race or war. The possibility of an arms race or war, and its full predicted cost to both sides, will be taken into account during bargaining instead. For example, if the paperclipper has a button that can nuke half of our utility, the merged AI will prioritize paperclips more.
        turchin 27 Jun 2017 14:42 UTC
        1 point
        Parent
        So they meet before the possible start of the arms race and compare each other’s relative advantages? I still think that they may try to demonstrate higher bargaining power than they actually have, and that it is almost impossible for us to predict how their game will play out because of its complexity.
        
        Thanks for participating in this interesting conversation.
        cousin_it 27 Jun 2017 14:57 UTC
        0 points
        Parent
        Yeah, bargaining between AIs is a very hard problem and we know almost nothing about it. It will probably have all sorts of deception tactics. But in any case, using bargaining instead of war is still in both AIs’ common interest, and AIs should be able to achieve their common interest.
        
        For example, if A has hidden information that will give it an advantage in war, then B can precommit to giving A more share conditional on seeing it (e.g. by constructing a successor AI that visibly includes the precommitment under A’s watch). Eventually the AIs should agree on all questions of fact and disagree only on values, at which point they agree on how the war will likely go, so they skip the war and share the bigger pie according to the war’s predicted outcome.
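
The claim that both sides gain by skipping the war and dividing "the bigger pie according to the war’s predicted outcome" can be illustrated with a toy calculation. This is a minimal sketch under assumptions the thread does not state: the winner of a war takes everything that survives, both agents value resources linearly, and both agree on the probability `p_a_wins`. All names and numbers are hypothetical.

```python
from typing import Tuple


def war_payoffs(total: float, destroyed: float, p_a_wins: float) -> Tuple[float, float]:
    """Expected resources for (A, B) if they fight.

    Assumption: the winner takes all resources that survive the war.
    """
    remaining = total - destroyed
    return p_a_wins * remaining, (1 - p_a_wins) * remaining


def merge_payoffs(total: float, p_a_wins: float) -> Tuple[float, float]:
    """Resources for (A, B) if they skip the war and split the intact pie
    in the same proportions as the predicted war outcome."""
    return p_a_wins * total, (1 - p_a_wins) * total


# Hypothetical numbers: 100 units in total, war destroys 40, A wins 75% of the time.
war_a, war_b = war_payoffs(total=100, destroyed=40, p_a_wins=0.75)
merge_a, merge_b = merge_payoffs(total=100, p_a_wins=0.75)
print(war_a, war_b)      # expected shares after a war
print(merge_a, merge_b)  # shares after a peaceful split
```

In this toy model each side does strictly better by merging whenever the war would destroy anything, which is the sense in which bargaining is in their common interest. The hard part discussed in the thread, agreeing on `p_a_wins` when each side has hidden information, is not modeled here.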
        turchin 27 Jun 2017 15:40 UTC
        1 point
        Parent
        BTW, the book “On Thermonuclear War” by Kahn is exactly an attempt to predict the ways of war, negotiation and bargaining between two presumably rational agents (superpowers). As I remember, it even discusses the idea of moving all resources to a new third agent, that is, donating all nukes to the UN.
        
        How could B see that A has hidden information?
        
        Personally, I feel like you have a mathematically correct, but idealistic and unrealistic model of relations between two perfect agents.
        cousin_it 27 Jun 2017 15:45 UTC
        1 point
        Parent
        Yeah, Schelling’s “Strategy of Conflict” deals with many of the same topics.
        
        A: “I would have an advantage in war so I demand a bigger share now” B: “Prove it” A: “Giving you the info would squander my advantage” B: “Let’s agree on a procedure to check the info, and I precommit to giving you a bigger share if the check succeeds” A: “Cool”
        dogiv 28 Jun 2017 19:53 UTC
        0 points
        Parent
        If visible precommitment by B requires it to share the source code for its successor AI, then it would also be giving up any hidden information it has. Essentially both sides have to be willing to share all information with each other, creating some sort of neutral arbitration about which side would have won and at what cost to the other. That basically means creating a merged superintelligence is necessary just to start the bargaining process, since they each have to prove to the other that the neutral arbiter will control all relevant resources to prevent cheating.
        
        Realistically, there will be many cases where one side thinks its hidden information is sufficient to make the cost of conflict smaller than the costs associated with bargaining, especially given the potential for cheating.
        lmn 28 Jun 2017 3:57 UTC
        0 points
        Parent
        
        A: “I would have an advantage in war so I demand a bigger share now” B: “Prove it” A: “Giving you the info would squander my advantage” B: “Let’s agree on a procedure to check the info, and I precommit to giving you a bigger share if the check succeeds” A: “Cool”
        
        Simply by telling B about the existence of an advantage, A is giving B info that could weaken it. Also, what if the advantage is a way to partially cheat in precommitments?
        turchin 27 Jun 2017 16:14 UTC
        0 points
        Parent
        I think there are two other failure modes, which need to be resolved:
        
        A weaker side could prolong the negotiations if doing so helps it gain power
        
        A weaker side could fake the size of its army (like North Korea did with its wooden missiles at its last military show)
- lmn 28 Jun 2017 3:50 UTC
  0 points
  Parent
  
  Even if A is FAI and B is a paperclipper, as long as both use correct decision theory, they will instantly merge into a new SI with a combined utility function.
  
  What combined utility function? There is no way to combine utility functions.
  - cousin_it 28 Jun 2017 6:52 UTC
    3 points
    Parent
    Weighted sum, with weights determined by bargaining.
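
A weighted-sum merge of two utility functions can be sketched as follows. This is only an illustration: the outcome representation, the example utility functions and the weight value are hypothetical, and the sketch assumes the two utilities are already expressed on comparable scales, which is part of what the bargaining would have to settle.

```python
from typing import Callable, Dict

Outcome = Dict[str, float]
Utility = Callable[[Outcome], float]


def merged_utility(u_a: Utility, u_b: Utility, weight_a: float) -> Utility:
    """Combine two utility functions into weight_a * u_a + (1 - weight_a) * u_b.

    weight_a is assumed to come out of the bargaining process.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")

    def combined(outcome: Outcome) -> float:
        return weight_a * u_a(outcome) + (1.0 - weight_a) * u_b(outcome)

    return combined


# Hypothetical utilities for the two agents in the thread.
def u_fai(outcome: Outcome) -> float:
    return outcome["flourishing"]


def u_clipper(outcome: Outcome) -> float:
    return outcome["paperclips"]


u_joint = merged_utility(u_fai, u_clipper, weight_a=0.5)
print(u_joint({"flourishing": 10.0, "paperclips": 4.0}))  # 0.5 * 10 + 0.5 * 4
```

A larger `weight_a` makes the merged agent prioritize A’s goals more, which matches the earlier remark that extra bargaining power for the paperclipper would make the merged AI prioritize paperclips more.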