Clippy seems to be someone trying to make the point that a paperclip maximizer is not necessarily bad for the universe.
That’s exactly what a not-yet-superintelligent paperclip maximizer would want us to think.
(When Eliezer plays an AI in a box, the AI’s views are probably out of sync with Eliezer’s views too. There’s no rule that says the AI has to be truthful in the AI Box experiment, because there’s no such rule about AIs in reality. It’s supposed to be maximally persuasive, and you’re supposed to resist. If a paperclipper asserts x, then the right question to ask yourself is not “What should I do, given x?”, but “Why does the paperclipper want me to believe x?” The most general answer, by definition, will be something like “Because the paperclipper is executing an elaborate plan to convert the universe into paperclips, and it believes that my believing x will further that goal to some small or large degree”, which is at best orthogonal to “Because x is true”, probably even anticorrelated with it, and almost certainly anticorrelated with “Because believing x will further my goals” if you are a human.)
Or “Why does the paperclipper want me to believe it wants me to believe x?”, or something with a couple extra layers of recursion.
Or, to flatten the recursion out, “Why did the paperclipper assert x?”.
(Tangential cognitive silly time: I notice that I feel literally racist saying things like this around Clippy.)