Presumably by starting with some sort of prior, and incrementally updating on available information (the Web, conversation with humans, psychology literature, etc). At any point it would have to use its current model to navigate the tradeoff between acquiring new information about idealised human aims and fulfilling those aims as currently estimated.
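A minimal toy sketch of that tradeoff (not anyone's actual proposal; the candidate aims, query mechanics, and numbers are all made up): the agent keeps a posterior over candidate utility functions and, each step, compares acting on its current estimate against paying a cost to gather more evidence, i.e. a crude value-of-information calculation.

```python
# Toy value-learning loop: act on the current model, or query for evidence?
# Hypothetical candidate "idealised aims", each a utility function over two actions.
CANDIDATES = {
    "aims_A": {"act_1": 1.0, "act_2": 0.0},
    "aims_B": {"act_1": 0.0, "act_2": 1.0},
}
posterior = {"aims_A": 0.5, "aims_B": 0.5}   # the starting prior
QUERY_COST = 0.05                            # assumed cost of gathering evidence
TRUE_AIMS = "aims_B"                         # unknown to the agent

def expected_utility(action, beliefs):
    return sum(p * CANDIDATES[h][action] for h, p in beliefs.items())

def best_action_value(beliefs):
    return max(expected_utility(a, beliefs) for a in ("act_1", "act_2"))

def value_of_information(beliefs):
    # Expected gain from a (simplistically) noiseless query revealing the true hypothesis.
    informed = sum(p * max(CANDIDATES[h].values()) for h, p in beliefs.items())
    return informed - best_action_value(beliefs)

for step in range(3):
    if value_of_information(posterior) > QUERY_COST:
        # Gather evidence: here the query simply collapses the posterior onto the truth.
        posterior = {h: (1.0 if h == TRUE_AIMS else 0.0) for h in posterior}
        print(f"step {step}: queried, posterior now {posterior}")
    else:
        action = max(("act_1", "act_2"), key=lambda a: expected_utility(a, posterior))
        print(f"step {step}: acted on current model, chose {action}")
```

With these numbers the agent queries once (the information is worth more than the cost), then spends the remaining steps acting on the aims it has learned.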
This does point to another, more serious problem: you can’t create an AI to “maximize the expected value of the utility function written in this sealed envelope” without a scheme for interpersonal comparison of utility functions (if you assign 50% probability to the envelope containing utility function A, and 50% probability to its containing utility function B, you need an algorithm to select between actions when each utility function alone would favor a different action). See this OB post by Bostrom.
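A small illustration of why the comparison scheme matters (made-up numbers, not from the post): the 50/50 mixture only yields a well-defined recommendation once the two utility functions are put on a common scale. Rescaling one candidate by a positive constant changes none of its own preferences, yet it flips which action the mixture favors.

```python
# Without a normalisation scheme, "expected utility over the sealed envelope" is ill-defined.
def mixture_choice(u_a, u_b, p_a=0.5):
    # Pick the action maximizing the probability-weighted sum of the two candidates.
    return max(u_a.keys(), key=lambda x: p_a * u_a[x] + (1 - p_a) * u_b[x])

u_a = {"x": 1.0, "y": 0.0}   # candidate utility function A prefers x
u_b = {"x": 0.0, "y": 2.0}   # candidate utility function B prefers y

print(mixture_choice(u_a, u_b))            # -> "y"

# Rescale A by 10: identical preferences for A considered alone,
# but a different implicit weight against B, so the mixture now picks differently.
u_a_rescaled = {k: 10 * v for k, v in u_a.items()}
print(mixture_choice(u_a_rescaled, u_b))   # -> "x"
```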