A crux here is whether there is any node subset that autonomous superintelligent AI would converge on in the long term (regardless of previous alignment development paths).
I’ve written about this:
No control method exists to safely contain the global feedback effects of self-sufficient learning machinery. What if this control problem turns out to be unsolvable?
…
But there is an even stronger form of argument:
Not only would AGI component interactions be uncontainable; they would also necessarily converge on causing the extinction of all humans.
https://www.lesswrong.com/posts/xp6n2MG5vQkPpFEBH/the-control-problem-unsolved-or-unsolvable