While I agree that this post was incorrect, I am fond of it, because the resulting conversation made a correct prediction that LeelaPieceOdds was possible. Most clearly in a thread started by lc:
I have wondered for a while if you couldn’t use the enormous online chess datasets to create an “exploitative/elo-aware” Stockfish, which had a superhuman ability to trick/trap players during handicapped games, or maybe end regular games extraordinarily quickly, and not just handle the best players.
(not quite a prediction as phrased, but I still infer a prediction overall).
Interestingly, two reasons were given for predicting that Stockfish is far from optimal when giving Queen odds to a less skilled player:
Stockfish is not trained on positions where it begins down a queen (out-of-distribution)
Stockfish is trained to play the Nash equilibrium move, not to exploit weaker play (non-exploiting)
The discussion didn’t make clear predictions about which factor would be most important, or whether both would be required, or whether it’s more complicated than that. Folks who don’t yet know might make a prediction before reading on.
For what it’s worth, my prediction was that non-exploiting play is more important. That’s mostly based on a weak intuition that starting without a queen isn’t that far out of distribution, and neural networks generalize well. Another way of putting it: I predicted that Stockfish was optimizing the wrong thing more than it was too dumb to optimize.
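To make the "optimizing the wrong thing" distinction concrete, here is a toy sketch. It is not how Stockfish or Lc0 actually choose moves; the move names, position values, and reply probabilities are all invented for illustration. A worst-case (minimax-style) chooser assumes the opponent always finds the best reply, while an exploitative chooser maximizes expected value under a hypothetical model of how a given opponent actually replies.

```python
# Toy illustration only: every number below is made up.
# Each engine move maps to the values (from the engine's point of view,
# 0 = loss, 1 = win) of the positions reached after each opponent reply.
MOVES = {
    "solid": [0.40, 0.45],  # no traps: every reply leaves a similar value
    "trap": [0.30, 0.95],   # first reply refutes the trap, second falls into it
}


def minimax_choice(moves):
    """Pick the move with the best worst case (opponent assumed perfect)."""
    return max(moves, key=lambda m: min(moves[m]))


def exploitative_choice(moves, reply_probs):
    """Pick the move with the best expected value under an opponent model.

    reply_probs[m][i] is the assumed probability that the opponent
    plays reply i after engine move m.
    """
    def expected_value(m):
        return sum(p * v for p, v in zip(reply_probs[m], moves[m]))

    return max(moves, key=expected_value)


# Hypothetical opponent models: a weaker player finds the refutation
# 30% of the time, a strong player always finds it.
WEAK_PLAYER = {"solid": [0.5, 0.5], "trap": [0.3, 0.7]}
STRONG_PLAYER = {"solid": [0.5, 0.5], "trap": [1.0, 0.0]}

print(minimax_choice(MOVES))                      # worst-case reasoning
print(exploitative_choice(MOVES, WEAK_PLAYER))    # opponent-aware, weak model
print(exploitative_choice(MOVES, STRONG_PLAYER))  # opponent-aware, strong model
```

With these invented numbers the worst-case chooser avoids the trap, and the exploitative chooser sets it only against the weaker opponent model. That is the sense in which an engine can be strong yet aimed at the wrong objective for odds games.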
And the result? Alas, not very clear to me. My research is from the lc0 blog, with posts such as “The LeelaPieceOdds Challenge: What does it take you to win against Leela?”. The journey began with the “contempt” setting, which I understand as expecting worse opponent moves. This allows reasonable opening play and avoids forced piece exchanges. However, GM-beating play was unlocked with a fine-tuned odds-play network, which affects both the out-of-distribution and non-exploiting concerns.
One surprise gives me more respect for the out-of-distribution theory. The developer’s blog first mentioned piece odds in “The Lc0 v0.30.0 WDL rescale/contempt implementation”:
In our tests we still got reasonable play with up to rook+knight odds, but got poor performance with removed (otherwise blocked) bishops.
So missing a single bishop is in some sense further out-of-distribution than missing a rook and a knight! The later blog I linked explains a bit more:
Removing one of the two bishops leads to an unrealistic color imbalance regarding the pawn structure far beyond the opening phase.
This is an interesting example where the details of going out-of-distribution matter more than the scale of going out-of-distribution. There’s an article in New in Chess that may have more info, but it’s paywalled and I don’t know whether it covers the machine-learning aspects or the human aspects.
Spot check regarding pedestrians, at current time RSS “rule 4” mentions:
The associated graphic also shows a pedestrian. I’m not sure if this was added more recently, in response to this type of criticism. From later discussion I see that pedestrians were already included in the RSS paper, which I’ve not read.