@gwern and @lc are right. Stockfish is terrible at odds and this post could really use some follow-up.
As @simplegeometry points out in the comments, we now have much stronger odds-playing engines that regularly win against much stronger players than OP.
FYI, there has been even further progress with Leela odds nets. Here are some recent quotes from GM Larry Kaufman (a.k.a. Hissha) found on the Leela Chess Zero Discord:
(2025-03-04) I completed an analysis of how the Leela odds nets have performed on LiChess since the search-contempt upgrade on Feb. 27. [...] I believe these are reasonable estimates of the LiChess Blitz rating needed to break even with the bots at 5′3″ in serious play. Queen and move odds (means Leela plays Black) 2400, Queen odds (Leela White) 2550, [...] Rook and move odds (Leela Black) 3000. Rook odds (Leela White) 3050, knight odds 3200. For comparison only a few top humans exceed 3000, with Magnus at 3131. So based on this, even Magnus would lose a match at 5′3″ with knight odds, while perhaps the top five blitz players in the world would win a match at rook odds. Maybe about top fifty could win a match at queen for knight. At queen odds (Leela White), a “par” (FIDE 2400) IM should come out ahead, while a “par” (FIDE 2300) FM should come out behind.
(2025-03-07) Yes, there have to be limits to what is possible, but we keep blowing by what we thought those limits were! A decade ago, blitz games (3′2″) were pretty even between the best engine (then Komodo) and “par” GMs at knight odds. Maybe some people imagined that some day we could push that to being even at rook odds, but if anyone had suggested queen odds that would have been taken as a joke. And yet, if we’re not there already, we are closing in on it. Similarly at Classical time controls, we could barely give knight odds to players with ratings like FIDE 2100 back then, giving knight odds to “par” GMs in Classical seemed like an impossible goal. Now I think we are already there, and giving rook odds to players in Classical at least seems a realistic goal. What it means is that chess is more complicated than we thought it was.
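As a rough sanity check on the knight-odds claim in the first quote, the standard Elo logistic formula can be applied to the quoted Lichess numbers (3131 for Magnus, 3200 as the knight-odds break-even rating). This is only a sketch: it assumes the usual 400-point logistic curve carries over to odds games, which is not established, and the function name is made up for illustration.

```python
def expected_score(player_rating: float, opponent_rating: float) -> float:
    """Expected score (between 0 and 1) under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - player_rating) / 400.0))


# Ratings quoted above: Magnus at 3131, knight-odds break-even at 3200.
# Treating the break-even rating as the bot's effective rating is an assumption.
magnus_vs_knight_odds = expected_score(3131, 3200)
print(f"Expected score per game: {magnus_vs_knight_odds:.2f}")
```

Under those assumptions the expected score comes out to roughly 40% per game, which fits "would lose a match" but does not suggest a blowout.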
As the name suggests, Leela Queen Odds is trained specifically to play without a queen, which is of course an absolutely bonkers disadvantage against 2k+ Elo players. One interesting wrinkle is the time constraint. AIs are better at fast chess (obviously), and apparently no one who’s tried is yet able to beat it consistently at 3+0 (3 minutes with no timing increment).
At rapid time controls, it seems like we could maybe go even against Magnus with knight odds? If not Magnus, perhaps other high-rated GMs.
There was a match between the most recently updated LeelaKnightOdds and GM Alex Lenderman, but I don’t recall the score exactly. EDIT: it was 19-3-2 (wins-draws-losses) for Leela.
I am very skeptical of this on priors, for the record. I think this statement could be true for superblitz time controls and whatnot, but I would be shocked if knight odds would be enough to beat Magnus in a 10+0 or 15+0 game. That being said, I have no inside knowledge, and I would update a lot of my beliefs significantly if your statement as currently written actually ends up being true.
LeelaKnightOdds has convincingly beaten both Awonder Liang and Anish Giri at 3+2 by large margins, and has an extremely strong record at 5+3 against people who have challenged it.
I think 15+0 and probably also 10+0 would be a relatively easy win for Magnus, based on Awonder, a player rated ~150 Elo lower, taking two draws at 8+3 and a win and a draw at 10+5. At 5+3 I’m not sure because we have so little data at winnable time controls, but I wouldn’t expect an easy win for either player.
It’s also certainly not the case that these few-months-old networks running a somewhat improper algorithm are the best we could build—it’s known at minimum that this Leela is tactically weaker than normal and can drop endgame wins, even if humans rarely capitalize on that.
Hissha from the Lc0 server reports 19 wins, 3 draws, and 2 losses against Lenderman (currently ~2500 FIDE) at 15+10 from a knight odds match 2 months ago—with the caveat that Lenderman started playing too fast after 10 games. I haven’t run the numbers but suspect this would be enough to go even against a 2750, if not Magnus?
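For anyone who does want to run the numbers, here is a minimal sketch that inverts the standard Elo logistic formula to turn the reported 19-3-2 score into an implied rating gap. It assumes the usual human-vs-human Elo model applies to odds games, which is far from certain, and the function name is hypothetical.

```python
import math


def elo_diff_from_score(score_fraction: float) -> float:
    """Rating gap implied by a score fraction under the standard Elo logistic model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Reported result from Leela's side: 19 wins, 3 draws, 2 losses.
wins, draws, losses = 19, 3, 2
games = wins + draws + losses
score = (wins + 0.5 * draws) / games  # draws count as half a point
diff = elo_diff_from_score(score)
print(f"Score {score:.3f} over {games} games implies a gap of about {diff:+.0f} Elo")
```

This gives an edge of roughly 300 points over Lenderman, so a performance of around 2800 if he is taken as ~2500. That would fit the guess about going even against a 2750, though 24 games is a small sample and the caveat about him playing too fast in the later games still applies.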
I was surprised too. I think it’s an exciting development :)
Hmm, that sounds about right based on the usual human-vs-human transfer from Elo difference to performance… but I am still not sure if that holds up when you have odds games, which feel qualitatively different to me from regular games. Based on my current chess intuition, I would expect the ability to win odds games to scale better than Elo near the top level, but I could be wrong about this.
https://lichess.org/@/LeelaQueenOdds
https://marcogio9.github.io/LeelaQueenOdds-Leaderboard/
That’s really cool! Do you have any sense of what kind of material advantage these odds-playing engines could use against the best humans?
Knight odds is pretty challenging even for grandmasters.