Yes, that’s true. But I feel like the post doesn’t address this.
The first-mover-only strategy I think the Rogue AI team will be considering as one of its top options is, “Wipe out humanity (except for a few loyal servants) with a single, unblockable first strike”.
The copy-strategy that I think humanity should pursue here is, “Wipe out the Rogue AI with overwhelming force.”
Of course, this requires humanity to even know that the Rogue AI team exists and is contemplating a first strike. That’s not easy to accomplish, because an earlier strategy the Rogue AI team is likely to be pursuing is “hide from the powerful opposing group that currently controls 99% of the world’s resources.”
I think it does address and discuss this; see items 4, 8, and 11.
I’m sympathetic to disagreeing with Paul overall, but it’s not as though these considerations haven’t been discussed.
I disagree that my point has been fully discussed, and even if it had been, I think it would be burying the lede to start with a paragraph like this:
“Suppose that 1% of the world’s resources are controlled by unaligned AI, and 99% of the world’s resources are controlled by humans. We might hope that at least 99% of the universe’s resources end up being used for stuff-humans-like (in expectation).”
Without following it up with something like:
“Of course, the strategic considerations here are such that an immoral actor with 1% could choose to eliminate the 99% and thus have 100% of the future resources. Furthermore, if the unaligned AI team had so far hidden its existence, then this option would be asymmetrical since the 99% of humans wouldn’t know that they even had an opponent or that they were in imminent danger of being wiped out. Thus, we’d need to assume a very different offense-defense balance, or a failure of secrecy, to expect anything other than 100% of future resources going to the unaligned AI team. The remainder of this post explores the specific branch of the hypothetical future in which elimination of the opponent (in either direction) is not an option for some unspecified reason.”
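To put that last point in rough numbers (just an illustrative expected-value sketch; the parameter p below is my own assumption, not anything from the post):
\[
\mathbb{E}[\text{human share of future resources}] \;\le\; (1-p)\cdot 0.99 \;+\; p\cdot 0 \;=\; (1-p)\cdot 0.99,
\]
where p is the probability that the hidden unaligned AI team attempts and lands an unblockable first strike. The “at least 99% in expectation” hope only goes through if p is negligible, which is exactly what the secrecy and first-strike considerations above call into question.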