The paperclip maximizer is a good initial “intuition pump” that helps you get into the mindset of thinking like an objective-optimizing AI.
Suppose you give a very capable AI a harmless task, and you kick it off: maximize your production of paperclips.
That is not the original intended usage of the paperclip maximizer example, however, and it has since been renamed the “squiggle maximizer” to clarify that.
Historical Note: This was originally called a “paperclip maximizer”, with paperclips chosen for illustrative purposes because it is very unlikely to be implemented, and has little apparent danger or emotional load (in contrast to, for example, curing cancer or winning wars). Many people interpreted this to be about an AI that was specifically given the instruction of manufacturing paperclips, and that the intended lesson was of an outer alignment failure, i.e. humans failed to give the AI the correct goal. Yudkowsky has since stated the originally intended lesson was of inner alignment failure, wherein the humans gave the AI some other goal, but the AI’s internal processes converged on a goal that seems completely arbitrary from the human perspective.
The paperclip maximizer problem that we discussed earlier was actually initially proposed not as an outer alignment problem of the kind that I presented (although it is also a problem of choosing the correct objective function, i.e. outer alignment). The original paperclip maximizer was an inner alignment problem: what if, in the course of training an AI, deep in its connection weights, it learned a “preference” for items shaped like paperclips?
But it’s still useful as an outer alignment intuition pump.