It appears to be a human-centric classification scheme.
Yes, that’s the point! We’re humans, and so for some purposes we find it useful to categorize superintelligences into those that do and don’t do what we want, even if it isn’t a natural categorization from a more objective standpoint.
Right—well, fine. One issue is that the classification into paperclippers and non-paperclippers was not clear to me until you clarified it. Another poster has “clarified” things the other way in response to the same comment. So, as a classification scheme, IMO the idea seems rather vague and unclear.
The next issue is: how close does an agent have to be to what you (we?) want before it is a non-paperclipper?
IMO, the idea of a metaphorical unfriendly paperclipper appears to need pinning down before it is of much use as a superintelligence classification scheme.
Another poster has “clarified” things the other way in response to the same comment.
I’m pretty confident Roko agrees with me and that this is just a communication error.
So, as a classification scheme, IMO the idea seems rather vague and unclear.
I’m given to understand that the classification scheme is Friendly versus unFriendly, with the paperclip maximizer being an illustrative (albeit not representative) example of the latter. I agree that more rigor (and perhaps clearer terminology) is in order.
Machine intelligences seem likely to vary in their desirability to humans.
Friendly / unFriendly seems rather binary; maybe a “desirability” scale would help.
Alas, this seems to be drifting away from the topic.
Technically true. However, most naive superintelligence designs will simply kill all humans. You’ve accomplished quite a lot if you even get as far as a failed utopia, let alone the point of deciding whether you want Prime Intellect or Coherent Extrapolated Volition.
It’s also unlikely you’ll accidentally do something significantly worse than killing all humans, for the same reasons. A superintelligent sadist is just as hard to hit by accident as a utopia.