Hum… It seems that we can stratify here. Let X represent the values of a collection of variables that we are uncertain about, and that we are stratifying on.
When we compute the normalising factor for utility U under two policies π and π′, we normally do it as:
$U \to U/N_U$, with $N_U = \sum_x P(X=x)\,\left(E_{\pi,X=x}U - E_{\pi',X=x}U\right)$.
And then we replace U with U/NU.
Instead we might normalise the utility U separately for each value of x:
Conditional on $X=x$, $U \to U/N_{U,x}$, with $N_{U,x} = E_{\pi,X=x}U - E_{\pi',X=x}U$.
The problem is that, since we're dividing by the normalising factor, the expectation over $x$ of $U/N_{U,x}$ is not the same as $U/N_U$.
Is there an obvious improvement on this?
Note that here, total utilitarianism gets less weight in large universes, and more in small ones.
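To make the mismatch concrete, here is a minimal sketch on a toy example. Everything in it is an assumption for illustration: two strata ("small" and "large" universe), equal probabilities, and made-up expected utilities for a total-utilitarian-style U whose spread between π and π′ grows with universe size.

```python
# Toy numbers, all hypothetical: two strata for X.
p = {"small": 0.5, "large": 0.5}               # P(X = x)
eu_pi = {"small": 3.0, "large": 20.0}          # E_{pi, X=x} U
eu_pi_prime = {"small": 1.0, "large": 10.0}    # E_{pi', X=x} U

# Global normalising factor: N_U = sum_x P(X=x) (E_{pi,X=x} U - E_{pi',X=x} U)
n_u = sum(p[x] * (eu_pi[x] - eu_pi_prime[x]) for x in p)

# Per-stratum normalising factors: N_{U,x} = E_{pi,X=x} U - E_{pi',X=x} U
n_ux = {x: eu_pi[x] - eu_pi_prime[x] for x in p}

# Expected normalised utility of pi under each scheme.
global_value = sum(p[x] * eu_pi[x] for x in p) / n_u
stratified_value = sum(p[x] * eu_pi[x] / n_ux[x] for x in p)

# How much each stratum is scaled under stratification, relative to the
# global scheme: values above 1 mean the stratum gains weight.
relative_weight = {x: n_u / n_ux[x] for x in p}

print("N_U =", n_u)
print("N_{U,x} =", n_ux)
print("global:", global_value, "stratified:", stratified_value)
print("relative weight per stratum:", relative_weight)
```

With these numbers the two schemes disagree on the expected normalised utility of π, and the large-universe stratum is scaled down relative to global normalisation while the small-universe stratum is scaled up, which is the effect noted above.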
I’ll think more...