AI Boxing is often proposed as a solution. But of course, a sufficiently advanced AI will be able to convince its human keepers to let it go free. How can an AI do so? By having a deep understanding of human goals and preferences.
Will an IQ 1500 being with a deep understanding of human goals and preferences perfectly mislead human keepers, break out and… turn the universe into paperclips?
I can certainly understand that such a being might have goals at odds with ours, goals completely beyond our understanding, like the human highway cutting through the anthill. These could plausibly be called “valid”, and I don’t know how to oppose the “valid” goals of a more intelligent and capable being. But that’s a different worry from letting the universe get turned into paperclips.