The paperclip maximizer problem we discussed earlier was not originally proposed as an outer alignment problem of the kind I presented (although it can also be framed that way, as a problem of choosing the correct objective function). The original paperclip maximizer was an inner alignment problem: what if, in the course of training an AI, it learned, deep in its connection weights, a “preference” for items shaped like paperclips?
But it’s still useful as an outer alignment intuition pump.
Yep, and I recognize that later in the article: “But it’s still useful as an outer alignment intuition pump.”