If someone wants to estimate the overall existential risk attached to AGI, it seems fitting that they would estimate the existential risk attached to the scenarios where we have 1) only unaligned AGI, 2) only aligned AGI, or 3) both. The scenario you portray is a subset of 1), and I find it plausible. But most of the relevant discussion on this forum is devoted to 1), so I wanted to think about 2). As long as some non-zero probability is attached to 2), that should be a useful exercise.
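To spell out the decomposition I have in mind: it is just the law of total probability over those three mutually exclusive scenarios (the notation is mine, not anything from your post):

```latex
% S_1 = only unaligned AGI, S_2 = only aligned AGI, S_3 = both
P(\text{x-risk}) = P(\text{x-risk} \mid S_1)\,P(S_1)
                 + P(\text{x-risk} \mid S_2)\,P(S_2)
                 + P(\text{x-risk} \mid S_3)\,P(S_3)
```

So as long as P(S_2) > 0, the second term contributes to the overall estimate, which is why thinking about 2) seems worthwhile even if most of the discussion focuses on 1).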
I thought it was clear that I was referring to Aligned AGI in the intro and in the section heading. And of course, exploring a scenario doesn’t mean I think it is the only scenario that could materialise.
My point is that the plausible scenarios for Aligned AGI give you AGI that remains aligned only when run within certain power bounds, and that seems to me like one of the most important facts bearing on the outcome of arms-race dynamics.
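To illustrate why, here is a toy sketch; the payoff numbers and the game structure are entirely my own invented assumptions, not a model from the post. Two labs choose a power level for their AGI, alignment is assumed to hold only up to a fixed bound, and the higher-power lab wins the race, so iterated best responses drag both labs past the bound:

```python
# Toy arms-race sketch: alignment holds only below POWER_BOUND (assumption).
POWER_BOUND = 10          # hypothetical safe operating limit
LEVELS = range(21)        # power levels each lab can choose (0..20)

def payoff(mine: int, theirs: int) -> float:
    """Invented payoff: winner takes a prize of 10 (5 each on a tie);
    if the winning system ran above the bound, everyone pays 4."""
    if mine > theirs:
        prize = 10.0
    elif mine == theirs:
        prize = 5.0
    else:
        prize = 0.0
    misalignment_cost = 4.0 if max(mine, theirs) > POWER_BOUND else 0.0
    return prize - misalignment_cost

def best_response(theirs: int) -> int:
    # Pick the power level that maximises my payoff given the rival's level.
    return max(LEVELS, key=lambda mine: payoff(mine, theirs))

a = b = 0
for _ in range(30):                      # iterate best responses to a fixed point
    a, b = best_response(b), best_response(a)

print(a, b, payoff(a, b))                # 20 20 1.0 -- both well past POWER_BOUND
```

In this toy setup both labs end at maximum power with payoff 1 each, even though stopping at the bound would have given each a payoff of 5. That is the sense in which "aligned only within power bounds" dominates the race outcome.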
Thanks for the clarification. If that’s the plausible scenario for Aligned AGI, then I was drawing a sharper line between Aligned and Unaligned AGI than was warranted. I will edit the relevant part of the text on my website to reflect that.