What I was pointing out is that the barrier is asymmetrical: it is biased towards AIs whose utility functions are easier to align a successor with. A paperclipper is more likely to be able to create an improved paperclipper that it is certain enough will massively increase its utility, whereas a more human-aligned AI would have to be more conservative.
In other words, this paper seems to say, “if we can create human-aligned AI, it will be cautious about self-improvement, but dangerously unaligned AIs will probably have no issues.”