This suggests a question: is there any interesting analogue of virtue ethics, in which the agent attempts to have a utility function its overseer would like?
This reminds me of Daniel Dewey’s proposal for an agent that learns its utility function: http://lesswrong.com/lw/560/new_fai_paper_learning_what_to_value_by_daniel/.
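As a rough sketch of that idea (a paraphrase, not Dewey's exact formalism): rather than maximizing one fixed utility function, the value-learning agent keeps a pool of candidate utility functions and weights them by how well they fit the evidence it has received so far, choosing actions by something like

a* = argmax_a Σ_{U ∈ 𝒰} P(U | h) · E[U | h, a]

where 𝒰 is the pool of candidate utility functions and h is the interaction history. On this reading, "trying to have a utility function the overseer would like" roughly corresponds to the posterior P(U | h) concentrating on utility functions the overseer would endorse.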