if I’m playing chess against an opponent who plays the optimal policy for the chess objective function
1. I predict that you will never encounter such an opponent. Solving chess is hard.*
2. Optimal play within a game might not be optimal overall (others can learn from the strategy).
Why does this matter? If the theorems also hold for policies that are ‘not optimal, but still great’ (say, at chess), then the distinction is irrelevant. For more complicated (or non-zero-sum) games, though, the optimal move/policy may depend on the other player’s move/policy.
(I’m not sure what ‘avoid shutdown’ looks like in chess.)
ETA:
*With roughly 10^43 legal positions in chess, computing a perfect strategy would take an impossibly long time with any feasible technology.
-source: https://en.wikipedia.org/wiki/Chess#Mathematics, which cites a source from 1977 for this figure
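To put a rough number on "impossibly long", here is a back-of-the-envelope sketch. The 10^43 figure is the one quoted above; the rate of 10^18 positions per second is purely my assumption (a hypothetical exascale machine that handles one position per operation), and the function name is made up for illustration. Solving a game does not strictly require visiting every position exactly once, so treat this as a crude order-of-magnitude estimate, not a proof.

```python
# Crude estimate: how long would it take just to visit every legal chess position once?

LEGAL_POSITIONS = 10**43         # estimate quoted in the footnote above
POSITIONS_PER_SECOND = 10**18    # ASSUMPTION: hypothetical machine, one position per operation
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_enumerate(positions, rate_per_second):
    """Return the number of years needed to visit `positions` at `rate_per_second`."""
    return positions / rate_per_second / SECONDS_PER_YEAR


years = years_to_enumerate(LEGAL_POSITIONS, POSITIONS_PER_SECOND)
print(f"{years:.2e} years")
```

Under these assumptions the answer comes out on the order of 10^17 years, while the universe is about 1.4 × 10^10 years old. Even if the assumed rate is off by several orders of magnitude, the conclusion does not change much.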