For what it’s worth, I’m an alignment-optimist with a similar view to mukashi, and I’ve been doing your exercise as part of a science fiction novel I’m writing (Singularity: 1998). The exercise has certainly made me more concerned about the problem. I still don’t think a decisive strategic advantage (beyond nuclear mutually assured destruction) is likely without nanotech or biotech. My non-biologist intuition is that an extinction plague is not a plausible threat. However, a combination of post-singularity social engineering and nanotech could certainly result in extinction under a deceptively misaligned AI. Therefore, the main thing I’ve learned from the exercise is that even following a seemingly good singularity, we still need to remain on guard. We should repeatedly prove to ourselves that the AI is both corrigible and values-aligned. In my opinion, the AI absolutely must be both.