“The new GPT model, gpt-3.5-turbo-instruct, can play chess around 1800 Elo.”
https://twitter.com/GrantSlatton/status/1703913578036904431
https://parrotchess.com/
The playing strength of parrotchess seems very uneven, though. On the one hand, if I play it head-on, just trying to play the best chess I can, I would estimate it even higher than 1800, maybe around 2000 if we treat this as blitz. I'm probably roughly somewhere in the 1900s myself, and over a few tries at blitz speed I would say I lost more than I won overall.
On the other hand, by playing an unconventional but solid opening to neutralize its mostly excellent opening play, keeping the position mostly closed, and watching out for tactics a bit, I got this game, where it does not look at all stronger than the GPT-3.5 chat models, and therefore certainly not 1800-level:
https://lichess.org/study/ymmMxzbj/SpMFmwXH
Nonetheless, the performance of this model at chess is very interesting. None of the other models, including GPT-4, has (with prompting broadly similar to what parrotchess uses) been able to get a good score against me if I just played it as I would play most human opponents, so in that sense it definitely seems impressive to me, as far as chess-playing language models go.
Good lord, I just played three games against it and it beat me in all three. None of the games were particularly close. That’s really something. Thanks to whoever made that parrotchess website!
It is possible to play funny games against it, however, if one exploits the fact that it is at heart a storytelling, human-intent-predicting system. For instance, this works (human plays White):
1. e4 e5 2. Ke2 Ke7 3. Ke3 Ke6 4. Kf3 Kf6 5. Kg3 Kg6 6. Kh3 Kh6 7. Nf3 Nf6 8. d4+ Kg6 9. Nxe5# 1-0
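(Aside: if anyone wants to poke at this outside the website, here is a minimal sketch of the kind of prompting involved, i.e. feeding the instruct model the PGN movetext so far and letting it complete the next move. I don't know parrotchess's actual prompt or parameters, so the prompt format and sampling settings below are assumptions, not its confirmed implementation.)

    # Minimal sketch (not parrotchess's actual code): prompt
    # gpt-3.5-turbo-instruct with the PGN movetext so far and let it
    # complete the next move. Prompt format and settings are guesses.
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    # Movetext of the game so far, ending right where the model is to move.
    movetext = (
        "1. e4 e5 2. Ke2 Ke7 3. Ke3 Ke6 4. Kf3 Kf6 5. Kg3 Kg6 "
        "6. Kh3 Kh6 7. Nf3 Nf6 8. d4+ Kg6 9."
    )

    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=movetext,
        max_tokens=8,     # enough tokens for one move in SAN
        temperature=0.0,  # assumed; I don't know what parrotchess uses
    )

    # Take the first whitespace-separated token of the completion as the move.
    completion = response.choices[0].text
    print(completion.split()[0] if completion.split() else completion)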
Oh wow, that is really funny. GPT-4's greatest weakness: the Bongcloud.