ESRogs comments on Conclusion to the sequence on value learning

ESRogs 3 Feb 2019 23:25 UTC
22 points
It seems to me that your argument about expected utility maximization being a trivial property may also extend one step back in the argument, to non-exploitability.
AlphaZero is better than us at chess, and so it is non-exploitable at chess (or you might say that being better at chess is the same thing as being non-exploitable at chess). If that’s true, then it must also appear to us to be an expected utility maximizer. But notably the kind of EU-maximizer that it must appear to be is: one whose utility function is defined in terms of chess outcomes. AlphaZero *is* exploitable if we’re secretly playing a slightly different game, like how-many-more-pawns-do-I-have-than-my-opponent-after-twenty-moves, or the game don’t-get-unplugged.
Going the other direction, from EU-maximization to non-exploitability, we can point out that any agent could be thought of as an EU-maximizer (perhaps with a very convoluted utility function), and if it’s very competent w.r.t. its utility function, then it will be non-exploitable by us, w.r.t. outcomes related to its utility function.
In other words, non-exploitability is only meaningful with respect to some utility function, and is not a property of “intelligence” or “competence” in general.
Would you agree with this statement?
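To make the "any agent could be thought of as an EU-maximizer" step concrete, here is a minimal sketch, assuming a deterministic policy over a small finite action set (so the expectation is trivial). The names `rationalizing_utility`, `best_action`, and `odd_policy` are hypothetical and purely illustrative; this is one way to exhibit the construction, not a claim about how any real system works.

```python
from typing import Callable, Hashable, Sequence

State = Hashable
Action = Hashable
Policy = Callable[[State], Action]
Utility = Callable[[State, Action], float]


def rationalizing_utility(policy: Policy) -> Utility:
    """Build a (convoluted) utility function that the given policy maximizes.

    It assigns 1 to whatever the policy does in a state and 0 to anything else.
    """
    def utility(state: State, action: Action) -> float:
        return 1.0 if action == policy(state) else 0.0
    return utility


def best_action(utility: Utility, state: State, actions: Sequence[Action]) -> Action:
    """Return the utility-maximizing action in the given state."""
    return max(actions, key=lambda a: utility(state, a))


# Hypothetical toy policy with no obvious goal: an arbitrary rule on integers.
def odd_policy(state: int) -> str:
    return "left" if state % 2 else "right"


ACTIONS = ("left", "right", "wait")
u = rationalizing_utility(odd_policy)

# The arbitrary policy coincides with utility maximization under u in every state checked.
agrees = all(best_action(u, s, ACTIONS) == odd_policy(s) for s in range(10))
```

The agent here is perfectly "non-exploitable" with respect to `u`, yet that says nothing about how it fares under any other scoring rule, which is the sense in which non-exploitability only means something relative to a particular utility function.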
- Rohin Shah 4 Feb 2019 1:59 UTC
  5 points
  Yes, I agree that’s a corollary.