Vanessa Kosoy comments on Better impossibility result for unbounded utilities

Vanessa Kosoy 13 Feb 2022 18:43 UTC
2 points
Yes. For example, you can penalize the (initially Solomonoff-ish) prior probability of every hypothesis by a factor of $e^{-\beta (U_{max} - U_{min})}$, where $\beta > 0$ is some constant, $U_{max}$ is the maximal expected utility of this hypothesis over all policies, and $U_{min}$ is the minimal one. (You’d have to discard hypotheses for which one of those is already divergent, except maybe in cases where the difference is renormalizable somehow.) This kind of thing was referred to as “leverage penalty” in a previous discussion. Personally I’m quite skeptical it’s useful, but maaaybe?
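
To make the penalty concrete, here is a minimal Python sketch of the reweighting on a toy, finite hypothesis class. It is only an illustration under stated assumptions: the function name `leverage_penalized_prior`, the `(prior, u_max, u_min)` tuple layout, and the final renormalization step are all hypothetical choices, and the values of $U_{max}$ and $U_{min}$ are taken as given rather than computed (which is the hard part in practice).

```python
import math


def leverage_penalized_prior(hypotheses, beta):
    """Reweight a finite prior by exp(-beta * (u_max - u_min)).

    hypotheses: list of (prior, u_max, u_min) tuples, where u_max and u_min
        are the maximal and minimal expected utility of the hypothesis over
        all policies (assumed to be supplied by the caller).
    beta: positive constant controlling the strength of the penalty.

    Returns the penalized weights, renormalized to sum to 1. The
    renormalization is an assumption of this sketch.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")

    weights = []
    for prior, u_max, u_min in hypotheses:
        if not (math.isfinite(u_max) and math.isfinite(u_min)):
            # Discard hypotheses whose utility bounds already diverge.
            weights.append(0.0)
        else:
            weights.append(prior * math.exp(-beta * (u_max - u_min)))

    total = sum(weights)
    if total == 0.0:
        raise ValueError("every hypothesis was discarded or underflowed")
    return [w / total for w in weights]


# Toy example: equal priors, but very different utility ranges.
toy = [
    (0.4, 1.0, 0.0),        # utility range 1
    (0.4, 10.0, 0.0),       # utility range 10
    (0.2, math.inf, 0.0),   # divergent, so it is discarded
]
penalized = leverage_penalized_prior(toy, beta=1.0)
```

With equal priors, the ratio between two surviving hypotheses is $e^{-\beta \Delta}$, where $\Delta$ is the difference between their utility ranges, so hypotheses promising a huge spread of outcomes are suppressed exponentially.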