I was thinking more about how psychologists try to understand the way we make decisions. I stumbled across two papers from a few years ago by one such psychologist, Mark Muraven, who argues that the way humans deal with conflicting goals could be important for AI alignment (https://arxiv.org/abs/1701.01487 and https://arxiv.org/abs/1703.06354). They seem a bit shallow to me and don't contain any specific ideas on how to implement this. But maybe Muraven has a point. Maybe we should put more effort into understanding how we humans deal with goals, instead of letting an AI figure it out for itself through RL or IRL.