most people in AI alignment think it’s possible that an AI could be trained to optimize for something like this.
I don’t think we have any idea how to do this. If we knew how to get an AGI system to reliably maximize the number of paperclips in the universe, that might be most of the (strawberry-grade) alignment problem solved right there.
You’re right, my mistake—of course we don’t know how to deliberately and reliably train a paperclip maximizer. I’ve updated the parent comment now to say:
most people in AI alignment think it’s possible that an AI like this could in principle emerge from training (though we don’t know how to reliably train one on purpose).