Some people (me included) value a certain level of non-manipulation, and I’m trying to cash out that instinct. It’s also needed for some ideas like corrigibility. Manipulation also combines poorly with value learning; see e.g. our paper: https://arxiv.org/abs/2004.13654
I do agree that saving the world is a clearly positive case of that ^_^