I should have clarified: I meant horrifying in a pretty extreme sense. Like telling the machine to torture you forever, or destroy you completely, or remove your sense of boredom.
Just doing something that, say, alienates all your friends wouldn't qualify, or that loses all your money, if money is still a thing that makes sense. I was also including all the things that you CAN do with your own strength but probably shouldn't: building a machine to torture your upload forever wouldn't itself be disallowed, but you might want to prohibit the system from assisting you in doing it.
I meant the ‘Do No Harm’ rule as a bare-minimum safeguard against producing a system with net negative utility because a small minority manage to put themselves into infinitely negative utility situations, not as a general-class ‘the system knows what is best’ measure, which is what it sounded to me like EY was proposing. In his defense, in the context of strong AI this is probably a discussion of what the CEV of humanity might end up wisely choosing, but I don’t like it.
I don’t know that I agree with the OP’s proposed basis for distinction, but I at least have a reasonable feel for what it would preclude. (I would even agree that, given clients substantially like modern-day humans, precluding that stuff is reasonably ethical. That said, the notion that a system on the scale the OP is discussing would have clients substantially like modern-day humans and relate to them in a fashion substantially like the fictional example given strikes me as incomprehensibly absurd.)
I don’t quite understand the basis for distinction you’re suggesting instead. I mean, I understand the specific examples you’re listing for exclusion, of course (eternal torture, lack of boredom, complete destruction), but not what they have in common or how I might determine whether, for example, choosing to be eternally alienated from friendship should be allowed or disallowed. Is that sufficiently horrifying? How could one tell?
I do understand that you don’t mean the system to prevent, say, my complete self-destruction as long as I can build the tools to destroy myself without the system’s assistance. The OP might agree with you about that, I’m not exactly sure. I suspect I disagree, personally, though I admit it’s a tricky enough question that a lot depends on how I frame it.