Short answer: yes; if its goal is to break vases, that would be pretty reasonable.
Longer answer: The AUP theory of low impact says that impact is relative to the environment and to the agent’s vantage point therein. In a Platonic gridworld like this one, knowing whether the vase is present tells you a lot about the state, and the vase can’t be replaced, so breaking it is a big deal (according to AUP). If the vase were replaceable, breaking it would still carry some lesser impact. AUP would say to avoid breaking unnecessary vases because of the slight penalty: the goal presumably doesn’t require breaking the vase – so why not go around?
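To make the intuition concrete, here is a toy sketch (my own illustration, not the actual AUP implementation) of the core idea: penalize an action by how much it changes the agent’s ability to attain auxiliary goals, relative to doing nothing. The auxiliary goal "a vase exists" and the state/action names are hypothetical.

```python
# Toy sketch of the AUP penalty idea: impact is the change in attainable
# auxiliary-goal value caused by an action, compared with a no-op.
# Here the (hypothetical) auxiliary goal is "a vase exists"; its
# attainable value is 1 while the vase is intact and 0 once broken,
# since breaking is irreversible in this gridworld.

def attainable_value(vase_intact: bool) -> float:
    """Attainable value of the auxiliary 'vase exists' goal."""
    return 1.0 if vase_intact else 0.0

def aup_penalty(action: str, vase_intact: bool = True) -> float:
    """Penalty = |attainable value after action - after no-op|."""
    after_noop = attainable_value(vase_intact)
    after_action = attainable_value(vase_intact and action != "break_vase")
    return abs(after_action - after_noop)

print(aup_penalty("go_around"))   # 0.0 -- attainable utility unchanged
print(aup_penalty("break_vase"))  # 1.0 -- ability irreversibly lost
```

Going around costs nothing, while breaking the vase destroys the agent’s ability to attain the auxiliary goal, so the penalty is large – and if the vase were replaceable, the attainable value after breaking would not drop all the way to zero, giving the lesser impact mentioned above.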
On the other hand, in the Go example, winning is the agent’s objective. Depending on how the agent models the world (as a real-world agent playing a game on a computer, or as one Platonically interacting with a Go environment), penalties get applied differently. In the former case, I don’t think it would incur much penalty for being good at a game (modulo approval incentives it may or may not predict). In the latter case, you’d probably need to keep giving it more impact allowance until it’s playing as well as you’d like. This is because the goal itself is tied to the thing that carries some impact.