It’s important to distinguish between:
1. the strategy of “copy P2’s strategy” is a good strategy
2. because P2 has a good strategy, there exists a good strategy for P1

The strategy-stealing assumption isn’t saying that copying strategies is a good strategy; it’s saying that the possibility of copying means there exists a strategy available to P1 that is just as good as P2’s.
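To make the distinction concrete, here’s a toy sketch (the strategy menu and the numbers are made up, purely for illustration): if both players choose from the same menu of options, then P2 having a good option immediately implies that an at-least-as-good option exists for P1, which is a weaker claim than recommending that P1 literally copy P2.

```python
# Toy illustration of the distinction above (my own sketch, not from the post;
# the strategy menu and multipliers are invented).

STRATEGY_MENU = {
    "hoard": 1.0,                 # hypothetical growth multipliers
    "invest": 1.5,
    "expand_aggressively": 2.0,
}

def payoff_of(menu, name):
    """Payoff of one particular strategy (e.g. literally copying P2's pick)."""
    return menu[name]

def best_available(menu):
    """Payoff of the best strategy in the menu (the existence claim)."""
    return max(menu.values())

p2_choice = "invest"                              # whatever P2 happens to pick
p2_payoff = payoff_of(STRATEGY_MENU, p2_choice)   # 1.5

# The strategy-stealing-style inference: P1 draws from the same menu,
# so P1's best option is at least as good as P2's choice.
assert best_available(STRATEGY_MENU) >= p2_payoff

print(payoff_of(STRATEGY_MENU, p2_choice))  # copying P2: 1.5
print(best_available(STRATEGY_MENU))        # best available to P1: 2.0
```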
Yes, that’s true. But I feel like the post doesn’t address this.
The first-mover-only strategy that I think the Rogue AI team will be considering as one of its top options is: “Wipe out humanity (except for a few loyal servants) with a single unblockable first strike.”
The copy-strategy that I think humanity should pursue here is: “Wipe out the Rogue AI with overwhelming force.”
Of course, this requires humanity to even know that the Rogue AI team exists and is contemplating a first strike. That’s not an easy thing to accomplish, because an earlier strategy that the Rogue AI team is likely to be pursuing is “hide from the powerful opposed group that currently has control over 99% of the world’s resources.”
I think it does address and discuss this; see items 4, 8 and 11.
I’m sympathetic to disagreeing with Paul overall, but it’s not as though these considerations haven’t been discussed.
I disagree that my point has been fully discussed, and even if it had been, I think it would be burying the lede to start with a paragraph like this:
“Suppose that 1% of the world’s resources are controlled by unaligned AI, and 99% of the world’s resources are controlled by humans. We might hope that at least 99% of the universe’s resources end up being used for stuff-humans-like (in expectation).”
Without following it up with something like:
“Of course, the strategic considerations here are such that an immoral actor with 1% could choose to eliminate the 99% and thus have 100% of the future resources. Furthermore, if the unaligned AI team had so far hidden its existence, then this option would be asymmetrical since the 99% of humans wouldn’t know that they even had an opponent or that they were in imminent danger of being wiped out. Thus, we’d need to assume a very different offense-defense balance, or a failure of secrecy, to expect anything other than 100% of future resources going to the unaligned AI team. The remainder of this post explores the specific branch of the hypothetical future in which elimination of the opponent (in either direction) is not an option for some unspecified reason.”
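To gesture at the arithmetic here with a purely illustrative sketch (every number below is made up), the initial 99% share does very little work once a decisive, hidden first strike is on the table; the expected human share is dominated by the detection and defense probabilities, not by the starting resource split.

```python
# Purely illustrative expected-value sketch; all probabilities are invented.
human_share_initial = 0.99

p_detect = 0.2               # hypothetical: humans notice the rogue AI in time
p_defend_if_detected = 0.5   # hypothetical: a detected first strike is blocked

# If the strike is neither detected nor blocked, the human share goes to ~0;
# otherwise assume (optimistically) humans keep their initial share.
p_humans_survive = p_detect * p_defend_if_detected
expected_human_share = p_humans_survive * human_share_initial

print(expected_human_share)  # ~0.10, nowhere near the hoped-for 0.99
```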