I feel like the “obvious” thing to do is to ask how rare (in bits) the post-optimization EV is according to the pre-optimization distribution. Like, suppose that pre-optimization my probability distribution over the utilities I’d get is normally distributed, and after optimizing my EV is +1 standard deviation. The probability of doing that well or better is about 0.159, which is about 2.66 bits.
This seems invariant under affine transformations of the utility function, adding irrelevant states, splitting/merging states, etc. What are some bad things about this method?
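A minimal sketch of the computation I have in mind (assuming Python with scipy; the function name and normal-prior parameters are just illustrative):

```python
import math
from scipy.stats import norm

def optimization_bits(post_opt_ev, pre_mean=0.0, pre_std=1.0):
    """Rarity, in bits, of the post-optimization EV under a normal
    pre-optimization distribution over utilities."""
    # P(doing at least as well as the achieved EV) under the prior
    tail_prob = norm.sf(post_opt_ev, loc=pre_mean, scale=pre_std)
    return -math.log2(tail_prob)

print(optimization_bits(1.0))  # +1 SD: tail prob ~0.159, i.e. ~2.66 bits
```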
This is the same as Eliezer’s definition, no? It only keeps information about the order induced by the utility function.
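For reference, my rough recollection of that definition (paraphrasing from memory, so take the exact form with a grain of salt):
\[ \mathrm{OP} = -\log_2\big(\text{measure of outcomes at least as preferred as the one achieved}\big), \]
which likewise depends only on the ordering of outcomes, not on the utility values themselves.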
Oh, whoops.