Bob: Oh OK, we’re just going to create this user authentication technology and hope people use it for good?
Seems to me that the answer “I hope people will use it for good” is quite okay for authentication, but not okay for alignment. Doing good is outside the scope of authentication, but is kinda the point of alignment.
The basis of all successful technology to date has been separation of concerns. One of the challenges for alignment as an academic discipline is keeping the focus on problems that can actually be solved, without dragging in all of philosophy and politics. It’s like the old joke about object-oriented programming: you asked for a banana, but you got a gorilla holding the banana, and the entire jungle too.
Do you mean like there are (at least) two subproblems that can be addressed separately?

1. how to align AI with any set of values
2. exact specification of human values
The former is the proper concern of AI researchers, while the latter should be studied by someone else (even if we currently have no idea who could do such a thing reliably, it’s a separate problem regardless).
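To make that split concrete, here is a toy sketch (every name in it, like ValueSpec and choose_action, is invented purely for illustration, not anyone’s actual proposal): the machinery for (1) is written against an abstract value specification, so whoever eventually answers (2) can plug their answer in without touching (1).

```python
from abc import ABC, abstractmethod

class ValueSpec(ABC):
    """Subproblem 2: some specification of values, supplied by someone else."""
    @abstractmethod
    def score(self, outcome: str) -> float: ...

def choose_action(actions, predict_outcome, values: ValueSpec) -> str:
    """Subproblem 1: generic machinery that picks the action whose predicted
    outcome scores highest under *whatever* ValueSpec it is handed."""
    return max(actions, key=lambda a: values.score(predict_outcome(a)))

# Usage with a stand-in value specification:
class ToyValues(ValueSpec):
    def score(self, outcome: str) -> float:
        return 1.0 if "good" in outcome else 0.0

print(choose_action(["a", "b"], lambda a: "good" if a == "b" else "bad", ToyValues()))
```

The point of the sketch is only the interface boundary: choose_action never needs to know where the values came from, which is the separation-of-concerns claim above.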
I’m actually more interested in corrigibility than values alignment, so I don’t think that AI should be solving moral dilemmas every time it takes an action. I think values should be worked out in the post-ASI period, by humans in a democratic political system.
Basically what I’m thinking here, as an upvoter of Conor.
People currently give MIRI money in the hopes they will use it for alignment. Those people can’t explain concretely what MIRI will do to help alignment. By your standard, should anyone give MIRI money?
When you’re part of a cooperative effort, you’re going to be handing off tools to people (either now or in the future) which they’ll use in ways you don’t understand and can’t express. Making people feel foolish for being a long inferential distance away from the solution discourages them from laying groundwork that may well be necessary for progress, or even from exploring.