Oh, I see. So the AI simply losing the memory that v was stored in and replacing it with random noise shouldn’t count as something it will be indifferent about? How would you formalize this such that arbitrary changes to v don’t trigger the indifference?
By specifying what counts as an allowed change in U, and making the agent into a U maximiser. Then, just as standard maximisers defend their utilities, it should defend U (including the update, and only that update).
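One way to make that concrete (a sketch in my own notation, not anything fixed in this thread): write $u_v$ for the utility indexed by the stored value $v$, and let $E$ be the event that $v$ is rewritten through the designated channel. Then something like

$$
U(h) =
\begin{cases}
u_{v'}(h) + C & \text{if } v \text{ was rewritten to } v' \text{ via the designated event } E \\
u_{v}(h) & \text{otherwise (e.g. } v \text{ overwritten by noise)}
\end{cases}
$$

where $C$ is the compensating constant chosen so that expected utility is equal across $E$ occurring or not at the moment of update. Since noise corrupting $v$ falls outside $E$, no compensation is paid: a $U$ maximiser experiences it as an ordinary loss, and so defends the memory just as a standard maximiser defends its utility.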