This is something lc and gwern discussed in the comments here, but now we have clear evidence this is only true for Nash solvers (all typical engines like SF, Lc0, etc.). LeelaQueenOdds, which was trained exploitatively against a model of top human players (FM+), is around 2k to 2.9k lichess elo depending on the time control, so it completely trounces 1.6k elo players (and even more so 1.2k elo players, which another commenter has suggested the author actually is). See: https://marcogio9.github.io/LeelaQueenOdds-Leaderboard/
Nash solvers are far too conservative: they expect perfect play from their opponents, and hence give up most meaningful attacking chances in odds games. Exploitative models like LQO instead assume their opponents play like strong humans (good but imperfect), and they do extremely well despite a completely crushing material disadvantage. As some have noted, this works even though chess is a far more sterile and simple environment than real life.
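The Nash-vs-exploitative distinction can be sketched with a toy zero-sum game (my own illustration, nothing to do with how LQO is actually trained): in rock-paper-scissors, the Nash strategy is uniform mixing and guarantees an expected value of 0 against anyone, while a best response to a model of an imperfect, rock-heavy opponent does strictly better against that opponent.

```python
# Toy zero-sum matrix game: rock-paper-scissors.
# A[i][j] = row player's payoff playing move i against move j
# (0 = rock, 1 = paper, 2 = scissors).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

def expected_value(p, q):
    """Row player's expected payoff under mixed strategies p (row) and q (column)."""
    return sum(p[i] * A[i][j] * q[j] for i in range(3) for j in range(3))

# Nash strategy for RPS: uniform mixing. Unexploitable, but it also
# fails to exploit anyone -- its EV is 0 against every opponent.
nash = [1/3, 1/3, 1/3]

# Opponent model: a flawed player who over-plays rock.
opponent = [0.5, 0.3, 0.2]

# Exploitative strategy: pure best response to the modeled opponent.
best = max(range(3), key=lambda i: sum(A[i][j] * opponent[j] for j in range(3)))
exploit = [1.0 if i == best else 0.0 for i in range(3)]

print(expected_value(nash, opponent))     # ≈ 0: safe, but leaves value on the table
print(expected_value(exploit, opponent))  # 0.3: paper exploits the rock-heavy model
```

The catch, of course, is that the exploitative strategy is only good against opponents resembling the model; against a perfect player it loses its guarantee, which is exactly the trade LQO makes.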
I speculate that the experiment from this post only yielded the results it did because Nash is a poor solution concept when one side is hopelessly disadvantaged under optimal play from both sides, and queen odds fall deep into that category.
I've been playing this bot myself lately, and one thing it made me wonder is: how much better would it be at beating me if it were trained against a model of me in particular, rather than trained as it actually was? I honestly have no idea.
See the video from the 7-minute mark. Try it yourself: https://lichess.org/@/LeelaQueenOdds :)