I skipped 99% of this post but just want to respond to this:
“I mostly just care about avoiding takeover and getting access to the main benefits of superintelligence”
and
“Trying to ensure that AI takeover is somehow OK… should be viewed as an extreme last resort.”
“Takeover” is the natural consequence of superintelligence. Even if a superintelligence mostly leaves humans alone while pursuing its own inscrutable goals, humanity will exist at its mercy, just as animals now exist at the mercy of humans.
Suppose, nonetheless, that you manage to make a tame superintelligence. What’s to stop someone else from making a wild one? To keep all future superintelligences within safe boundaries, you’re going to have to take over the world anyway, either with a human regime that regulates or bans all unsafe AI forever, or with a safety regime run directly by a superintelligent tame AI.
In any case, even if you think you have a superintelligence that is tame and safe, one that will, say, act only as an advisor: if it is truly a superintelligence, then it, not you, will be in charge of the situation. It would be capable of giving you “advice” that transforms you, and through you the world, in some completely unexpected direction, if that were the outcome its humanly incomprehensible heuristics ended up favoring.
That’s why, in my opinion, CEV-style superalignment is the problem that has to be solved, or at least attempted. If we are going to have superintelligent AI, then we need to make AI takeover safe for humanity, because AI takeover is the one predictable consequence of superintelligence.