AFI worry: A human-in-the-loop AI that only takes actions that get human approval (and whose expected outcomes have human approval) runs into serious problems when the context the AI is acting in is very different from the context in which our values were learned.
Is there any way around this besides simulating people having their values reorganized given the new environment? Is this what CEV is about?