Fwiw, I talk about utility uncertainty because that’s what the mechanical change to the code looks like—instead of having a known reward function, you have a distribution over reward functions. It’s certainly true that this only makes a difference as long as there is still information to be gained about the utility function.
Fwiw, I talk about utility uncertainty because that’s what the mechanical change to the code looks like—instead of having a known reward function, you have a distribution over reward functions. It’s certainly true that this only makes a difference as long as there is still information to be gained about the utility function.