Scenario 5 sounds like something an aligned AI should do. Actually, taking Petrov hostage would also be the right thing to do, if there were no better way to save people’s lives. It seems fine to me to take away someone’s option to start a nuclear war?
I think manipulation is bad when it’s used to harm you, but good when it’s used to help you make better decisions. Like how banning lead is credited with reducing crime by 50%. Isn’t this the kind of thing an AI should do? We hire all kinds of people to manipulate us into becoming better: psychotherapists, fitness instructors, teachers. Why would it be wrong for an AI to fill these roles?
Some people (me included) value a certain level of non-manipulation, and I’m trying to cash out that instinct. It’s also needed for some ideas, like corrigibility. Manipulation also combines poorly with value learning; see, e.g., our paper: https://arxiv.org/abs/2004.13654
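To gesture at the value-learning problem concretely, here’s a minimal toy sketch in Python (my own construction for this comment, not taken from the paper; all names and numbers are illustrative assumptions): if the agent’s actions can influence which reward function it ends up learning, it is incentivized to steer the human toward whichever reward is easiest to maximize, regardless of what the human actually values.

```python
# Toy sketch of "rigging" an online reward-learning process.
# Illustrative assumptions only, not taken from the paper.

# Two candidate reward functions over the agent's final action.
# R_A is easy to score highly under; R_B requires real effort.
R_A = {"work": 2.0, "slack": 10.0}
R_B = {"work": 4.0, "slack": 0.0}

def p_learn_A(query_style):
    """Probability the agent ends up learning R_A, depending on how
    it queries the human. 'nudge' is a manipulative framing that
    pushes the human toward endorsing R_A."""
    return {"honest": 0.5, "nudge": 0.9}[query_style]

def expected_learned_reward(query_style):
    """Expected reward if the agent first queries, then best-responds
    to whichever reward function it learned."""
    p = p_learn_A(query_style)
    best_under_A = max(R_A.values())  # 10.0, achieved by "slack"
    best_under_B = max(R_B.values())  # 4.0, achieved by "work"
    return p * best_under_A + (1 - p) * best_under_B

for style in ("honest", "nudge"):
    print(style, expected_learned_reward(style))
# honest -> 7.0, nudge -> ~9.4: the agent prefers to manipulate the
# human into endorsing the reward that is easiest to satisfy,
# independently of the human's true values.
```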
I do agree that saving the world is a clearly positive case of that ^_^