I think about how easy it would be to make this outcome good for humanity by giving it 1% of the universe; people simply don't need more. But at the same time, the paperclip maximizer would never agree to this: it is not satisfied with any result short of 100%, and it places no value on people, compromises, or cooperation at all.
It doesn't care about people, but it does care about its own future (instrumentally, for the sake of making more paperclips), and so it may be willing to bargain at the very beginning, while we still have a chance of stopping it. If we only agree to a bargain that it can show us will change its core utility function somewhat (to be more human-aligned), then there will be strong pressure for it to figure out a way to do that.