You can of course define any metric you like, but what makes it a metric of “goodness” (as opposed to a metric of something else, like “badness”, or “flevness”), unless it is constructed to reflect what some agent or entity considers to be “good”?
I see human values as something built by long reflection, a heavily philosophical process where it’s unclear whether humans (as opposed to human-adjacent aliens or AIs) doing the work is an important aspect of the outcome. This outcome is not something any extant agent knows. Maybe indirectly it’s what I consider good, but since I don’t know what it is, that phrasing is noncentral. Maybe long reflection is the entity that considers it good, but for this purpose it doesn’t hold the role of an agent: it’s not enacting the values, only declaring them.