Oh, I see. So the AI simply losing the memory that v was stored in and replacing it with random noise shouldn’t count as something it will be indifferent about? How would you formalize this such that arbitrary changes to v don’t trigger the indifference?
By specifying what counts as an allowed change in U, and making the agent into a U maximiser. Then, just as standard maximisers defend their utilities, it should defend U (including the update, and only that update).
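One way to make that concrete (a sketch in my own notation, not anything fixed in this thread): write $u_v$ for the utility indexed by the stored value $v$, and let $E$ be the event that $v$ is rewritten through the designated channel. Then something like

$$
U(h) =
\begin{cases}
u_{v'}(h) + C & \text{if } v \text{ was rewritten to } v' \text{ via the designated event } E \\
u_{v}(h) & \text{otherwise (e.g. } v \text{ overwritten by noise)}
\end{cases}
$$

where $C$ is the compensating constant chosen so that expected utility is equal across $E$ occurring or not at the moment of update. Since noise corrupting $v$ falls outside $E$, no compensation is paid: a $U$ maximiser experiences it as an ordinary loss, and so defends the memory just as a standard maximiser defends its utility.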