But the question is whether values are even the right way to go about this problem. That’s the kind of information we’re seeking: information about how even to go about being beneficial, and what beneficial really means. Does it really make sense to model a rainforest as an agent and back out a value function for it? If we did that, would it work out in a way that we could look back on and be glad about? Perhaps it would, perhaps it wouldn’t, but the hard problem of AI safety is this question of what even is the right frame to start thinking about this in, and how we can even begin to answer such a question.
Now perhaps it’s still true that the information we seek can be found in a human toe. But just beware that we’re not talking about anything so concrete as values here.
But the question is whether values are even the right way to go about this problem. That’s the kind of information we’re seeking: information about how even to go about being beneficial, and what beneficial really means. Does it really make sense to model a rainforest as an agent and back out a value function for it? If we did that, would it work out in a way that we could look back on and be glad about? Perhaps it would, perhaps it wouldn’t, but the hard problem of AI safety is this question of what even is the right frame to start thinking about this in, and how we can even begin to answer such a question.
Now perhaps it’s still true that the information we seek can be found in a human toe. But just beware that we’re not talking about anything so concrete as values here.