Do you mean like there are (at least) two subproblems that can be addressed separately?
1. how to align AI with any set of values
2. exact specification of human values
Where the former is the proper concern of AI researchers, and the latter should be studied by someone else (even if we currently have no idea who could do such a thing reliably, it’s a separate problem regardless).
I’m actually more interested in corrigibility than values alignment, so I don’t think that AI should be solving moral dilemmas every time it takes an action. I think values should be worked out in the post-ASI period, by humans in a democratic political system.
Basically what I’m thinking here, as an upvoter of Conor.