This seems like a necessity to me. Any AI with human-level intelligence or greater must have moral flexibility built in, if only because our own morality has evolved and continues to evolve. Learning by predicting another agent’s response is a plausible path to our fuzzy social understanding of morals.
Consider: If an AI were sent back in time to 1800 and immediately triggered the US Civil War in order to end slavery early, is that AI friendly or unfriendly? What if it did the same today in order to end factory farming?
I don’t have an answer to either of these questions, because they’re uncomfortable and, I think, have no clear answer. I genuinely don’t know what I would want a morally aligned AI to do in these cases. So the AI needs to figure out for itself what humanity’s collective preference might be, in much the same way that a person has to guess how their peers would react to many of their actions.