If natural abstractions are a thing, in what sense is “make this AGI have particular effect X” trying to be about human values, if X is expressed using natural abstractions?
In that case, it’s not about human values, which is one of the very nice things the natural abstraction hypothesis buys us.