AI Boxing is often proposed as a solution. But of course, a sufficiently advanced AI will be able to convince its human keepers to let it go free. How can an AI do so? By having a deep understanding of human goals and preferences.
Will an IQ 1500 being with a deep understanding of human goals and preferences perfectly mislead human keepers, break out and… turn the universe into paperclips?
I can certainly understand that such a being might have goals at odds with ours, goals completely beyond our understanding, like the human highway cutting through the anthill. These could plausibly be called “valid”, and I don’t know how to oppose the “valid” goals of a more intelligent and capable being. But that’s a different worry from letting the universe get turned into paperclips.