So, to argue that instrumental intelligence is sufficient for existential risk, we have to explain how an instrumental intelligence can navigate different frames.
This is where the other main line of argument comes into play:
I think ‘ability to navigate frames’ is distinct from ‘philosophical maturity’, roughly because of something like a distinction between soldier mindset and scout mindset.
You can imagine an entity that, whenever it reflects on its current political / moral / philosophical positions, uses its path-finding ability like a lawyer to make the best possible case for why it should believe what it already believes, or to discard incoming arguments whose conclusions are unpalatable. There’s something like another orthogonality thesis at play here, where even if you’re a wizard at maneuvering through frames, it matters whether you’re playing chess or suicide chess.
This is just a thesis; it might be the case that it is impossible to be superintelligent and in soldier mindset (the ‘curiosity’ thesis?). But the orthogonality thesis is that it is possible, and so you could end up with value lock-in, where a very intelligent entity that is morally confused uses that intelligence to prop up the confusion rather than dispel it. Here we’re using instrumental intelligence as the ‘super’ intelligence in both the orthogonality and existential risk considerations. (You consider something like this case later, but I think in a way that fails to visualize this possibility.)
[In humans, intelligence and rationality are only weakly correlated, in a way that I think supports this view pretty strongly.]