I was thinking about normalisation as linearly rescaling every reward to $[0,1]$ when I wrote the comment. Then one can always look at $[0,1]^2$, which might make it easier to think graphically about how different beliefs lead to different policies. Different scales can then be translated into a certain reweighting of the beliefs (at least from the perspective of the optimal policy), as maximizing $P(R_1)S_1R_1 + P(R_2)S_2R_2$ is the same as maximizing $\frac{P(R_1)S_1}{P(R_1)S_1+P(R_2)S_2}R_1 + \frac{P(R_2)S_2}{P(R_1)S_1+P(R_2)S_2}R_2$: dividing the objective by the positive constant $P(R_1)S_1+P(R_2)S_2$ does not change which policy is optimal, and the resulting coefficients sum to 1, so they can be read as a reweighted belief.
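A minimal numeric sketch of this equivalence (all reward values and scales here are hypothetical, just to make the argmax concrete):

```python
# Beliefs over the two reward functions and the scales from normalisation
P1, P2 = 0.7, 0.3
S1, S2 = 1.0, 5.0

# Normalised rewards (in [0,1]) of three candidate policies under R1 and R2
policies = {"a": (0.9, 0.1), "b": (0.5, 0.6), "c": (0.2, 0.8)}

# Objective 1: scaled mixture  P(R1)*S1*R1 + P(R2)*S2*R2
scaled = {p: P1 * S1 * r1 + P2 * S2 * r2
          for p, (r1, r2) in policies.items()}

# Objective 2: the same weights renormalised to sum to 1,
# i.e. a reweighted belief over R1 and R2
Z = P1 * S1 + P2 * S2
reweighted = {p: (P1 * S1 / Z) * r1 + (P2 * S2 / Z) * r2
              for p, (r1, r2) in policies.items()}

# Dividing by the positive constant Z preserves the argmax,
# so both objectives select the same policy.
assert max(scaled, key=scaled.get) == max(reweighted, key=reweighted.get)
```

With these particular numbers the large $S_2$ makes policy "c" optimal, whereas with $S_1 = S_2 = 1$ policy "a" would win, which is exactly the sense in which scales act like a reweighting of the beliefs.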