There is regular structure in human values that can be learned without requiring detailed knowledge of physics, anatomy, or AI programming. [pollid:1091]
Human values are so fragile that it would require a superintelligence to capture them with anything close to adequate fidelity. [pollid:1092]
Humans are capable of pre-digesting parts of the human values problem domain. [pollid:1093]
Successful techniques for value discovery of non-humans (e.g., artificial agents, non-human animals, human institutions) would meaningfully translate into tools for learning human values. [pollid:1094]
Value learning isn’t being adequately researched by the commercial interests that want to use it to sell you things. [pollid:1095]
Practice teaching non-superintelligent machines to respect human values will improve our ability to specify a Friendly utility function for any potential superintelligence. [pollid:1096]
Something other than AI will cause human extinction sometime in the next 100 years. [pollid:1097]
All other things being equal, an additional researcher working on value learning is more valuable than one working on corrigibility, Vingean reflection, or some other portion of the FAI problem. [pollid:1098]
There is regular structure in human values that can be learned without requiring detailed knowledge of physics, anatomy, or AI programming.
While there is some regular structure to human values, I don’t think you can say that the totality of human values has a completely regular structure. There are too many cases of nameless longings and generalized anxieties. Much of art is dedicated exactly to teasing out these feelings and experiences, often in counterintuitive contexts.
Can they be learned without detailed knowledge of X, Y, and Z? I suppose it depends on what “detailed” means—I’ll assume it means “less detailed than the required knowledge of the structure of human values.” That said, the excluded set of knowledge you chose—“physics, anatomy, or AI programming”—seems really odd to me. I suppose you can poll people about their values (or use more sophisticated methods like prediction markets), but I don’t see how this can yield more than “the set of human values that humans can articulate.” It’s something, but this seems to be a small subset of the set of human values. To characterize all dimensions of human values, I do imagine that you’ll need to model human neural biophysics in detail. If successful, it will be a contribution to AI theory and practice.
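To make the “articulable values” point concrete, here is a toy sketch (my own illustration; the outcome names and poll data are made up) of the kind of structure polling can recover: a Bradley-Terry model fit to stated pairwise preferences. By construction, anything respondents can’t express in a forced choice never enters the data.

import numpy as np

outcomes = ["leisure", "health", "status", "novelty"]

# Hypothetical poll responses: (preferred, other) index pairs.
stated_preferences = [(1, 0), (1, 2), (0, 2), (3, 2), (1, 3), (0, 3)]

scores = np.zeros(len(outcomes))  # latent utilities, start flat
lr = 0.1

for _ in range(2000):
    grad = np.zeros_like(scores)
    for winner, loser in stated_preferences:
        # Bradley-Terry: P(winner preferred) = sigmoid(s_winner - s_loser)
        p = 1.0 / (1.0 + np.exp(scores[loser] - scores[winner]))
        grad[winner] += 1.0 - p  # ascend the log-likelihood
        grad[loser] -= 1.0 - p
    scores += lr * grad
    scores -= scores.mean()      # scores are only defined up to a constant

for name, s in sorted(zip(outcomes, scores), key=lambda t: -t[1]):
    print(f"{name:8s} {s:+.2f}")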
Human values are so fragile that it would require a superintelligence to capture them with anything close to adequate fidelity.
To me, in this context, the term “fragile” means exactly that it is important to characterize and consider all dimensions of human values, as well as the potentially highly nonlinear relationships between those dimensions. An at-the-time invisible “blow” to an at-the-time unarticulated dimension could result in unfathomable suffering 1000 years hence. Can a human intelligence capture the totality of human values? Some of our artists seem to have glimpses of the whole, but it seems unlikely to me that a baseline human can appreciate the whole clearly.
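Here is a toy numeric sketch of that fragility (mine, with made-up weights): give a “true” and a “proxy” utility diminishing returns over four value dimensions, where the proxy omits one weakly weighted, unarticulated dimension. The plan that is optimal under the proxy starves that dimension, and its true utility falls without bound as the omitted allocation goes to zero.

import numpy as np

# Four value dimensions; the last is real but hard to articulate, so the
# proxy utility omits it. Log utility means diminishing returns, so
# starving any genuinely valued dimension is very costly.
true_w = np.array([1.0, 1.0, 1.0, 0.05])
proxy_w = np.array([1.0, 1.0, 1.0, 0.0])

def allocate(w, budget=10.0):
    """Optimal split of a fixed budget for utility sum_i w_i * log(x_i):
    x_i proportional to w_i (standard closed-form result via Lagrange)."""
    x = np.maximum(w, 1e-12)  # floor so the omitted dim gets ~nothing
    return budget * x / x.sum()

def true_utility(x):
    return float(true_w @ np.log(x))

print("plan under full values :", round(true_utility(allocate(true_w)), 2))
print("plan under proxy values:", round(true_utility(allocate(proxy_w)), 2))
# The proxy plan's true utility keeps falling toward -inf as the floor
# shrinks: a tiny omission, an unboundedly large loss.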