Unfortunately it has a problem of its own: it's sensitive to our choice of x_w. By adding a made-up element to X with very large negative utility and zero probability of occurring, we can make OP arbitrarily low. In that case basically all of the default relative expected utility comes from avoiding the worst outcome, which is guaranteed (the added element has zero probability), so you don't get any credit for optimising.
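To spell out the arithmetic schematically (A and B here stand for whatever numerator and denominator the definition uses; the point is only that both are measured relative to u(x_w) and are otherwise fixed once the real outcomes and probabilities are fixed):

$$\mathrm{OP} = -\log_2\frac{A - u(x_w)}{B - u(x_w)} \;\to\; -\log_2 1 = 0 \quad\text{as } u(x_w)\to-\infty,$$

since both numerator and denominator are dominated by the shared −u(x_w) term, i.e., by the "credit" for avoiding the worst outcome.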
What if we measure the utility of an outcome relative not to the worst one but to the status quo, i.e., the outcome that would happen if we did nothing/took the null action?
In that case, adding outcomes to X or removing them from X doesn't change OP for outcomes that were already in X, as long as the default outcome also remains in X.
Obviously, this means that OP(x) for any x∈X depends on the choice of default outcome. But I think it's OK? If I have $1,000 and increase my wealth to $1,000,000, then I think I "deserve" to be assigned more optimization power than if I had $1,000,000 and did nothing, even if the absolute utility I get from having $1,000,000 is the same.
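A minimal sketch of that intuition in code (the log-wealth utility is purely hypothetical, chosen only for illustration; any increasing utility function makes the same point):

```python
import math

def utility(wealth: float) -> float:
    """Hypothetical utility of wealth (log utility, for illustration only)."""
    return math.log(wealth)

def relative_utility(outcome_wealth: float, default_wealth: float) -> float:
    """Utility of an outcome measured relative to the status quo,
    i.e., the outcome of the null action, per the proposal above."""
    return utility(outcome_wealth) - utility(default_wealth)

# Same absolute outcome ($1,000,000), two different status quos:
print(relative_utility(1_000_000, 1_000))      # ~6.91: credit for actually optimizing
print(relative_utility(1_000_000, 1_000_000))  # 0.0: no credit for doing nothing
```

The absolute utility of ending up with $1,000,000 is identical in both cases; only the status-quo-relative measure distinguishes them.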
We're already comparing to the default outcome, in that we're asking "what fraction of the default expected utility, minus the worst outcome's utility, comes from outcomes at least this good?".
I think you’re proposing to replace “the worst” with “the default”, in which case we end up dividing by zero.
We could pick some new reference point other than the worst, as long as it differs from the default expected utility. (But that does introduce the possibility of negative OP, and it still has sensitivity issues.)
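Toy numbers for both failure modes, assuming (my reading of the definition above) that the fraction is a sum of per-outcome contributions p(x′)·(u(x′) − r) for a reference point r: take two outcomes with utilities 0 and 10, each with probability 1/2, so the default expected utility is 5. Choosing r = 5 (the default) makes the denominator Σ p(x′)(u(x′) − r) = 5 − 5 = 0, the division-by-zero case above. Choosing r = 4 instead gives a denominator of 1, but the good outcome alone contributes (1/2)·(10 − 4) = 3, a "fraction" of 3 > 1 and hence OP = −log₂3 < 0.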