If humans have a thousand different desires and we create an AI that has a thousand different desires… that does not necessarily imply any overlap between those sets.
The AI could have “moral dilemmas” about whether to make the paperclips larger, which is good, or to make more paperclips, which is also good, but there is an obvious trade-off between these two values. In the end it might decide that instead of a billion medium-sized paperclips, it would be much better to create one paperclip the size of Jupiter, and zillions of microscopic ones. Or it may be tempted to create more paperclips now, but overcome this temptation and instead build spaceships to colonize other planets and make far more paperclips there. The humans would still get killed.
Leaving more slack may be necessary but not sufficient for peaceful coexistence. The anthills we built a highway over were leaving us plenty of slack. That didn’t stop us.
AI boxing is often proposed as a solution. But of course, a sufficiently advanced AI will be able to convince its human keepers to let it go free. How can an AI do that? By having a deep understanding of human goals and preferences.
Will an IQ 1500 being with a deep understanding of human goals and preferences perfectly mislead human keepers, break out and… turn the universe into paperclips?
I can certainly understand that such a being might have goals at odds with ours, goals that are completely beyond our understanding, like the human highway versus the anthill. These could plausibly be called “valid”, and I don’t know how to oppose the “valid” goals of a more intelligent and capable being. But that is a different worry from letting the universe get turned into paperclips.