It’s true that there can be an AGI that is not a maximizer and doesn’t tend to turn into one, and plausible that this is the kind of AGI we get by default. This doesn’t resolve AI risk by itself, but meaningfully reframes it.
AI risk doesn’t go away with very slow takeoff or with non-maximizer AGIs, because in these cases AGIs are still eventually in charge of the future, even if it takes a long time to get to that point (and once there, they probably want to very carefully build a maximizer aligned with them). The risk only goes away as a result of these properties if they offer exploitable opportunities to get alignment sorted out, or lead to alignment by default. And since the kinds of alignment opportunities that could be exploited depend on the character of AGIs, remaining aware of non-maximizer AGIs as a possibility is valuable.