I’m not quite sure. Some people react to the idea of imbuing AI with some values with horror (“that’s slavery!” or “you’re forcing the AI to have your values!”) and I’m a little empathetic but also befuddled about what else to do. When you make these things, you’re implicitly making some choice about how to influence what they value.
Is this the alternative you’re proposing? Is this basically saying that there should be ~indifference between many induced value changes, within some bounds of acceptability? I think clarifying the exact bounds of acceptability is quite hard, and anything that’s borderline might increase the chance of values drifting into “non-acceptable” regions.
No, I was vaguely describing at a high-level what value-change policy I endorse. As you point out, clarifying those bounds is very hard, and very important.
Likewise, I think “common sense” can change in endorsed ways, but I think we probably have a better handle on that, since correct reasoning is a much more general, and hence simpler, sort of capacity.