A more explicit construction is: “let u evaluate to 1 iff it sees ‘high scoring’ observation o at time t; clearly, its EU is increased. If u_A is this utility, let u instead evaluate to 0.99 iff it sees o at time t (and 0 otherwise).”
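As a rough illustration of that construction (a sketch, not the formalism from the post), the indicator utility and its 0.99-scaled variant can be written out directly. Everything here is a hypothetical stand-in: histories are plain lists of observations, `make_indicator_utility` and `expected_utility` are made-up helper names, and the two history distributions are invented numbers chosen only to show the effect.

```python
def make_indicator_utility(obs, t, value=1.0):
    """Hypothetical helper: pays `value` iff the history shows `obs` at time t, else 0."""
    def u(history):
        return value if len(history) > t and history[t] == obs else 0.0
    return u


def expected_utility(u, history_dist):
    """Expected utility over a list of (probability, history) pairs."""
    return sum(p * u(h) for p, h in history_dist)


# u pays 1 iff observation "o" is seen at time t = 2.
u = make_indicator_utility("o", t=2)
# The variant pays 0.99 under the same condition (and 0 otherwise).
u_scaled = make_indicator_utility("o", t=2, value=0.99)

# Invented distributions: one where "o" at t = 2 is a coin flip,
# and one where the agent has made it much more likely.
baseline = [(0.5, ["a", "b", "o"]), (0.5, ["a", "b", "c"])]
steered = [(0.9, ["a", "b", "o"]), (0.1, ["a", "b", "c"])]

eu_baseline = expected_utility(u, baseline)
eu_steered = expected_utility(u, steered)
eu_scaled_steered = expected_utility(u_scaled, steered)
```

Under these assumed numbers, making o more likely raises the EU of u, and the scaled variant tracks the same event with slightly lower payoff.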
It’s true you could prove it in the way you mentioned (although the history h wouldn’t be supplied to the inner utility calculation), but it isn’t very suggestive for the instrumental convergence / opportunity cost phenomenon I was trying to point at.