Stockfish now uses an interesting lightweight kind of NN called NNUE, which does need to be trained. More importantly, chess engines have long used machine learning techniques (if not anything we would now call deep learning), which still need to be fit/trained, and Stockfish relies very heavily on distributed testing to test/create changes. So if they are not playing with queen odds, then neural or not, it amounts to the same thing: it's been designed & hyperoptimized to play regular even-odds chess, not weird variants like queen-odds chess.
Would queen-odds games pass through roughly within-distribution game states, anyway, though?
Either way, if/when it does reach roughly within-distribution game states, the material advantage in relative terms will be much greater than just being down a queen early on, so the starting material advantage would still underestimate the real material advantage for a better-trained AI.
It's clear that it was never optimized for odds games, so unless concrete evidence is presented, I doubt that @titotal actually played against a "superhuman" system, which may explain why it won.
There's definitely a ceiling on how much intelligence will help: as the other guy mentioned, not even AIXI would be able to recover from an adversarially designed initial position in Tic-Tac-Toe.
But I’m highly skeptical OP has reached that ceiling for chess yet.
SF’s ability to generalize across that distribution shift seems unclear. My intuition is that a starting position with queen odds is very off distribution because in training games where both players are very strong, large material imbalances only happen very late in the game.
I'm confused by your 2nd paragraph. Do you think this experiment overestimates or underestimates the resource gap required to overcome a given intelligence gap?
For my 2nd paragraph, I meant that the experiment would underestimate the required resource gap. Being down exactly by a queen at the start of a game is not as bad as being down exactly by a queen later into the game when there are fewer pieces overall left, because that’s a larger relative gap in resources.
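To make the relative-gap point concrete, here is a minimal sketch using the standard heuristic piece values (queen = 9, rook = 5, bishop/knight = 3, pawn = 1). The piece values and the late-game material count below are illustrative assumptions for the argument, not engine evaluations:

```python
# Why a fixed material deficit matters more later in the game:
# compare "down a queen" as a fraction of the opponent's material.
# Heuristic values: Q=9, R=5, B=3, N=3, P=1 (illustrative only).

QUEEN = 9
FULL_MATERIAL = 8 * 1 + 2 * 5 + 2 * 3 + 2 * 3 + QUEEN  # 39 points per side

def relative_material(own: int, opponent: int) -> float:
    """Own material expressed as a fraction of the opponent's."""
    return own / opponent

# Down a queen at the start: 30 points vs. 39.
opening_gap = relative_material(FULL_MATERIAL - QUEEN, FULL_MATERIAL)

# Down a queen in a simplified late position (assume 15 points each
# remained before the loss): 6 points vs. 15.
late_gap = relative_material(15 - QUEEN, 15)

print(f"opening: {opening_gap:.2f} of opponent's material")   # ~0.77
print(f"late game: {late_gap:.2f} of opponent's material")    # 0.40
```

The same nine-point deficit leaves you with roughly 77% of your opponent's material in the opening but only 40% in the hypothetical late position, which is the sense in which the experiment's starting handicap understates the effective resource gap.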
Stockfish isn’t using deep learning afaik. It’s mostly just bruteforcing.