How do you intend to build a powerful optimizer without having a method of representing (or of building a representation of) the concept of “human” (where “human” can be replaced with any complex concept, even probably paperclips)?
I agree that value specification is a hard problem. But I don’t think the complexity of “human” is the reason for this, although it does rule out certain simple approaches like hard-coding values.
(Also, since your link seems to indicate you believe otherwise, I am fairly familiar with the content in the sequences. Apologies if this statement represents an improper inference.)
How do you intend to build a powerful optimizer without having a method of representing (or of building a representation of) the concept of “human” (where “human” can be replaced with any complex concept, even probably paperclips)?
If a machine can learn, empirically, exactly what humans are, on the most fundamental levels, but doesn’t have any values associated with them, why should it need a concept of “human”? We don’t have a category that distinguishes igneous rocks that are circular and flat on one side, but we can still recognize them and describe them precisely.
Humans are an unnatural category. Whether a fetus, an individual in a persistent vegetative state, an amputee, a corpse, an em, or a skin cell culture falls into the category of “human” depends on value-sensitive boundaries. It’s not necessarily because humans are so complex that we can’t categorize them in an appropriate manner for an AI (or at least, not just because humans are complex); it’s because we don’t have an appropriate formulation of the values that would allow a computer to draw the boundaries of the category in a way we’d want it to.
(I wasn’t sure how familiar you were with the sequences, but in any case I figured it can’t hurt to add links for anyone who might be following along who’s not familiar.)