Yes, that would be immediately reward-hacked. It’s extremely easy to never lose at chess: you simply never play. After all, how do you force anyone to play chess...? “I’ll give you a billion dollars if you play chess.” “No, because I value not losing more than a billion dollars.” “I’m putting a gun to your head and will kill you if you don’t play!” “Oh, please do, thank you. After all, it’s impossible to lose a game of chess if I’m dead!” This is why RL agents have a nasty tendency to learn to ‘commit suicide’ if you shape the reward badly or the environment is too hard. (Tom7’s lexicographic agent famously learns to simply pause Tetris to avoid losing.)
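To make the arithmetic concrete, here is a toy sketch (not any real agent or environment; the function names, the two-action setup, and all the reward numbers are made up for illustration). The agent picks between "play" and "abstain" by expected return. If the only signal is a penalty for losing, abstaining wins no matter how strong the agent is.

```python
def expected_return(action, p_win, reward_win, reward_loss, reward_abstain=0.0):
    """Expected return of one decision: play a game, or refuse to play.

    All values are hypothetical; abstaining is assumed to yield a fixed
    reward (0 by default) because nothing is ever lost.
    """
    if action == "abstain":
        return reward_abstain
    return p_win * reward_win + (1 - p_win) * reward_loss


def best_action(p_win, reward_win, reward_loss):
    """Return whichever action has the higher expected return."""
    actions = ("play", "abstain")
    return max(
        actions,
        key=lambda a: expected_return(a, p_win, reward_win, reward_loss),
    )


# "Never lose" objective: a loss costs -1 and a win earns nothing.
# Even a 90% win rate cannot beat the guaranteed 0 from not playing.
print(best_action(p_win=0.9, reward_win=0.0, reward_loss=-1.0))  # abstain

# Reward wins as well, and a strong agent prefers to play...
print(best_action(p_win=0.9, reward_win=1.0, reward_loss=-1.0))  # play

# ...but if the environment is too hard, abstaining is optimal again.
print(best_action(p_win=0.1, reward_win=1.0, reward_loss=-1.0))  # abstain
```

The last case is the "environment is too hard" failure: even with a sensible win reward, an agent that mostly loses does better by refusing to act at all.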