Another poster has “clarified” things the other way in response to the same comment.
I’m pretty confident Roko agrees with me and that this is just a communication error.
So, as a classification scheme, IMO the idea seems rather vague.
I’m given to understand that the classification scheme is Friendly versus unFriendly, with paperclip maximizer being an illustrative (albeit not representative) example of the latter. I agree that more rigor (and perhaps clearer terminology) is in order.
Machine intelligences seem likely to vary in their desirability to humans.
Technically true. However, most naive superintelligence designs will simply kill all humans. You’ve accomplished quite a lot if you even get as far as a failed utopia, let alone the point of deciding whether you want Prime Intellect or Coherent Extrapolated Volition.
It’s also unlikely you’ll accidentally do something significantly worse than killing all humans, for the same reasons. Building a superintelligent sadist is just as hard as building a utopia.
Friendly / unFriendly seems rather binary; maybe a “desirability” scale would help.
Alas, this seems to be drifting away from the topic.