I completely agree about the ability to circumvent humanity’s objections, either by propaganda as you describe, or just by ignoring those objections altogether and doing what it thinks best. Of course, if for whatever reason the system were designed to require uncoerced consent before implementing its plans, it might not use that ability. (Designing it to require consent but to also be free to coerce that consent via superstimuli seems simply silly: neither safe nor efficient.)
Coercion is not binary. I was not thinking of the AI threatening to blow up Earth if we refuse the plan, exposing us to a quick burst of superstimulus so intense that we would do anything to experience it again, or lying about its plans; I don’t mean any of those forms of “cheating.”
But even an AI that is forbidden from using those techniques, and is required to obtain “uncoerced” consent (no lying, no threats, no creating addictions, and so on), would be able to present the plan, without lying about its content even by omission, in such a way that we would accept it relatively easily. Superstimulus, for example, need not be used to create addiction or to blackmail; it could simply be a natural, genuine consequence of accepting the plan. Things we might find horrifying because they are too alien could be presented through a clear analogy, or as the conclusion of a slow introductory path in which no single step spans too great an inferential distance.
I agree with you that, if a sufficiently powerful superintelligence is constrained to avoid any activities that a human would honestly classify as “coercion,” “threat,” “blackmail,” “addiction,” or “lie by omission” and is constrained to only induce changes in belief via means that a human would honestly classify as “natural” and “genuine,” it can nevertheless induce humans to accept its plan while satisfying those constraints.
I don’t think that prevents such a superintelligence from inducing humans to accept its plan through the use of means that would horrify us had we ever thought to consider them.
It’s also not at all clear to me that the fact that X would horrify me if I’d thought to consider it is sufficient grounds to reject using X.
This thread has made it clear to me that designing a superintelligence that asks for consent before doing stuff is a bad idea. Thanks!