instrumental convergence basically disappears for agents with utility functions over action-observation histories.
Wait, I am puzzled. Have you just completely changed your mind about the preconditions needed to get a power-seeking agent? The way the above reads is: just add some observation of actions to your realistic utility function, and your instrumental convergence problem is solved.
u-AOH (utility functions over action-observation histories): No IC
u-OH (utility functions over observation histories): Strong IC
There are many utility functions in u-AOH that simply ignore the A part of the history, so these would then have Strong IC because they are u-OH functions. So are you making a subtle mathematical point about how these will average away to zero (given various properties of infinite sets), or am I missing something?
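To make the subset question concrete, here is a toy count. It is only a sketch under assumptions that are not in the post: a finite horizon, every action-observation sequence treated as possible, and utilities restricted to the values 0 and 1. The function name count_action_ignoring and its parameters are hypothetical, chosen for illustration.

```python
from itertools import product


def count_action_ignoring(actions, observations, horizon, values=(0, 1)):
    """Toy count: how many utility functions over action-observation
    histories (u-AOH) ignore the action part, i.e. are really u-OH.

    Assumes every (action, observation) sequence of the given length is
    a possible history, and that utilities only take the given values.
    """
    # One step of a history is an (action, observation) pair.
    steps = list(product(actions, observations))
    histories = list(product(steps, repeat=horizon))

    total = 0
    ignoring = 0
    # Enumerate every assignment of a utility value to every history.
    for assignment in product(values, repeat=len(histories)):
        total += 1
        seen = {}  # observation sequence -> utility value seen so far
        ignores_actions = True
        for history, value in zip(histories, assignment):
            obs_only = tuple(obs for _, obs in history)
            # Two histories with the same observations must share a utility.
            if seen.setdefault(obs_only, value) != value:
                ignores_actions = False
                break
        if ignores_actions:
            ignoring += 1
    return total, ignoring


if __name__ == "__main__":
    for horizon in (1, 2):
        total, ignoring = count_action_ignoring(
            ("a0", "a1"), ("o0", "o1"), horizon
        )
        print(horizon, total, ignoring, ignoring / total)
```

With two actions and two observations, the action-ignoring share is 4 out of 16 at horizon 1 and 16 out of 65536 at horizon 2. That shrinking share is what an "average away" reading would rely on; the count says nothing about whether each of those functions still has Strong IC on its own.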
I recommend reading the quoted post for clarification.