a highly intelligent agent will be extremely likely to optimize some simple global utility function
Simplicity of a misaligned agent’s goals is neither needed for nor implied by the usual arguments. It might make the agent’s self-aligned self-improvement fractionally easier, but this doesn’t seem to be an important distinction. An AI doesn’t need to radically self-modify to be existentially dangerous; once it’s capable of doing research autonomously, it only needs to put its more mundane advantages to use to get ahead.