Key reason: value-loading AIs do not follow a utility function but a dynamic construct that doesn’t have all the same properties.
At least as I read the original value-learning paper, they do follow a utility function: the maximum likelihood utility function in some distribution that is subject to Bayesian updating. The hard part was how to construct that distribution and subject it to evidence; the concept that the AI is going to want to have incorrect beliefs (since, after all, the process by which the updates are performed is epistemic, not moral) hadn’t occurred to me.
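For concreteness, here is a minimal sketch of that reading, with a toy two-hypothesis example of my own invention (the candidate utility functions, outcomes, and likelihoods are all illustrative assumptions, not anything from the paper): the agent keeps a distribution over candidate utility functions, updates it by Bayes’ rule on evidence, and acts on the current maximum-likelihood candidate.

```python
# Toy value-learning sketch: a posterior over candidate utility functions,
# an epistemic Bayesian update, and action chosen by the max-likelihood candidate.
# All names and the likelihood model are hypothetical.

# Candidate utility functions over a toy outcome space.
candidates = {
    "u_paperclips": lambda outcome: 1.0 if outcome == "paperclips" else 0.0,
    "u_happiness":  lambda outcome: 1.0 if outcome == "happy_humans" else 0.0,
}

# Prior over which candidate is the "true" utility function.
posterior = {"u_paperclips": 0.5, "u_happiness": 0.5}

def bayes_update(posterior, likelihood_of_evidence):
    """Epistemic update: reweight each hypothesis by how well it predicts
    the observed evidence (e.g. human feedback), then renormalize."""
    unnorm = {h: p * likelihood_of_evidence[h] for h, p in posterior.items()}
    z = sum(unnorm.values())
    return {h: w / z for h, w in unnorm.items()}

def act(posterior, available_outcomes):
    """Act on the maximum-likelihood utility function: pick the highest-weight
    candidate and maximize it over the available outcomes."""
    best_hypothesis = max(posterior, key=posterior.get)
    u = candidates[best_hypothesis]
    return max(available_outcomes, key=u)

# Evidence (say, observed human approval) that the happiness hypothesis
# predicts better than the paperclip hypothesis.
evidence_likelihood = {"u_paperclips": 0.1, "u_happiness": 0.9}
posterior = bayes_update(posterior, evidence_likelihood)

print(act(posterior, ["paperclips", "happy_humans"]))  # -> "happy_humans"
```

The worry in the comment above is about the update step: since `bayes_update` is purely epistemic, an agent that currently rates one hypothesis highly may prefer, instrumentally, that the evidence (or its own beliefs about it) not shift the posterior away from that hypothesis.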