Thanks for the feedback. On the GM-level skepticism: I don’t see any concrete recommendations for what we could do to make the claim stronger, and I’d really welcome some. What would convince you (and possibly others) that our neural network is mastering the game of chess?
We are currently playing (and have already played) against actual GMs and have beaten some. But it’s not that simple to publish: look at the time it took for projects 100 times bigger, like AlphaGo or AlphaStar, whereas this is a two-person full-time project (Anian and me). There are legal reasons, the fact that GMs/IMs don’t really like to be beaten publicly, statistical reasons (you need a lot of games for the result to be meaningful), etc. Also note that many of the 1500-Elo players who played against the bot on Lichess are actually fake (smurf) accounts of much stronger players.
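To give a sense of scale on the statistical point, here is a rough back-of-the-envelope sketch (my own illustration, assuming a simple binomial model and ignoring draw handling) of how wide the Elo confidence interval stays with only a handful of games:

```python
# Rough illustration of why few games say little about Elo: approximate 95% CI
# on the Elo difference implied by a match score (binomial model, draws ignored).
import math

def elo_from_score(p: float) -> float:
    """Elo difference implied by an expected score p (0 < p < 1)."""
    return -400 * math.log10(1 / p - 1)

def elo_interval(points: float, games: int):
    p = points / games
    se = math.sqrt(p * (1 - p) / games)   # binomial standard error
    lo = max(p - 1.96 * se, 1e-6)
    hi = min(p + 1.96 * se, 1 - 1e-6)
    return elo_from_score(lo), elo_from_score(p), elo_from_score(hi)

# Same 60% score, very different certainty:
print(elo_interval(6, 10))    # roughly (-150, +70, +390) Elo
print(elo_interval(60, 100))  # roughly (+3, +70, +144) Elo
```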
I think the most important things currently missing from the paper are these three points:
1. A comparison to the best Leela Chess Zero (LC0) networks.
2. Testing against strong (maybe IM-level) humans at tournament time controls, or a clear statement that the reported figures are blitz Elo, since a player who does no explicit tree search does not get better when given more thinking time.
3. Games against traditional chess computers in the low-GM/strong-IM strength bracket would also be nice to have, although maybe not scientifically compelling. I sometimes play those for fun with LC0, and it is utterly fascinating to see how LC0 with current transformer networks, at one node per move, very often manages to crush this type of opponent by pure positional play, i.e., in a way that makes winning against these machines look extremely simple.
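For reference, a minimal sketch of how such a one-node-per-move match can be scripted, assuming python-chess plus local lc0 and opponent binaries (the paths below are placeholders, not actual install locations):

```python
# Minimal sketch: LC0 restricted to one node per move vs. a traditional UCI engine.
# Requires python-chess; the engine paths are placeholders for local installs.
import chess
import chess.engine

LC0_PATH = "/usr/local/bin/lc0"             # placeholder path
OPPONENT_PATH = "/usr/local/bin/stockfish"  # placeholder traditional engine

def play_one_game() -> str:
    board = chess.Board()
    lc0 = chess.engine.SimpleEngine.popen_uci(LC0_PATH)
    opponent = chess.engine.SimpleEngine.popen_uci(OPPONENT_PATH)
    try:
        while not board.is_game_over():
            if board.turn == chess.WHITE:
                # One node: essentially the raw policy/value output, no tree search.
                result = lc0.play(board, chess.engine.Limit(nodes=1))
            else:
                # The opponent gets a normal per-move time budget.
                result = opponent.play(board, chess.engine.Limit(time=1.0))
            board.push(result.move)
        return board.result()  # e.g. "1-0", "0-1", "1/2-1/2"
    finally:
        lc0.quit()
        opponent.quit()

if __name__ == "__main__":
    print(play_one_game())
```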
To me, the most interesting question is to what extent your network learns to do reasoning/search vs. pure pattern recognition.
I trained a transformer to predict tournament chess moves a few years back, and my impression was that it played strongly in the opening and always made sensible-looking moves, but had absolutely no ability to look ahead.
I am currently working on a benchmark of positions that require reasoning and can’t be solved by highly trained intuition alone. Would you be interested in running such a benchmark?
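To be concrete, a minimal sketch of what running it could look like, assuming the positions are stored as (FEN, unique best move) pairs and the model exposes some predict_move interface (all names are hypothetical placeholders; the single included position is only an illustrative back-rank mate, not a real calculation-heavy test case):

```python
# Hypothetical harness for a "requires calculation" position benchmark.
# predict_move is a placeholder for whatever interface the model exposes.
import chess

BENCHMARK = [
    # (FEN, unique best move in UCI notation) -- illustrative placeholder only
    ("6k1/5ppp/8/8/8/8/5PPP/3R2K1 w - - 0 1", "d1d8"),
]

def predict_move(fen: str) -> str:
    """Placeholder: ask the network for its move (UCI notation) in this position."""
    raise NotImplementedError

def score(benchmark=BENCHMARK) -> float:
    correct = 0
    for fen, best in benchmark:
        board = chess.Board(fen)
        move = chess.Move.from_uci(predict_move(fen))
        if move in board.legal_moves and move == chess.Move.from_uci(best):
            correct += 1
    return correct / len(benchmark)
```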
Wow, thanks for replying.
If the model has beaten GMs at all, then there is a limit to how weak it can be, right? I’m glad I didn’t make stronger claims than I did.
I think my question about what humans-who-challenge-bots are like was fair, and the point about smurfing is interesting. I’d be interested in any other impressions you have of those players.
Is the model’s Lichess profile/game history available?