If you actually optimize according to an approximation, that’s going to goodhart curse the outcome. Any approximation must only be soft-optimized for, not EU maximized. A design that seeks EU maximization, and hopes for soft optimization that doesn’t go too far, doesn’t pass the omnipotence test.
Also, an approximation worth even soft-optimizing for should be found in a value-laden way, losing inessential details and not something highly value-relevant. Approximate knowledge of values helps with finding better approximations to values.
If you actually optimize according to an approximation, that’s going to goodhart curse the outcome. Any approximation must only be soft-optimized for, not EU maximized. A design that seeks EU maximization, and hopes for soft optimization that doesn’t go too far, doesn’t pass the omnipotence test.
Also, an approximation worth even soft-optimizing for should be found in a value-laden way, losing inessential details and not something highly value-relevant. Approximate knowledge of values helps with finding better approximations to values.