GoteNoSente comments on Chess as a case study in hidden capabilities in ChatGPT

GoteNoSente 22 Sep 2023 13:20 UTC
5 points
0
The playing strength of parrotchess seems very uneven, though. On the one hand, if I play it head-on, just trying to play the best chess I can, I would estimate it even higher than 1800, maybe around 2000 if we treat these as blitz games. I’m probably rated somewhere in the 1900s myself, and over a few tries, playing at blitz speed, I would say I lost more than I won overall.

On the other hand, trying to play an unconventional but solid opening in order to neutralize its mostly awesome openings and looking out for tactics a bit while keeping the position mostly closed, I got this game, where it does not look at all stronger than the chat3.5 models, and therefore certainly not 1800-level:

https://lichess.org/study/ymmMxzbj/SpMFmwXH

Nonetheless, the performance of this model at chess is very interesting. None of the other models, including GPT-4, has (with prompting broadly similar to what parrotchess uses) been able to get a good score against me if I just played it as I would play most human opponents, so in that sense it definitely seems impressive to me, as far as chess-playing language models go.
- AdamYedidia 30 Sep 2023 20:25 UTC
  2 points
  0
  Good lord, I just played three games against it and it beat me in all three. None of the games were particularly close. That’s really something. Thanks to whoever made that parrotchess website!
  - GoteNoSente 18 Oct 2023 22:50 UTC
    10 points
    0
    It is possible to play funny games against it, however, if one exploits the fact that it is at heart a storytelling, human-intent-predicting system. For instance, the following works (human playing White):
    
    1. e4 e5 2. Ke2 Ke7 3. Ke3 Ke6 4. Kf3 Kf6 5. Kg3 Kg6 6. Kh3 Kh6 7. Nf3 Nf6 8. d4+ Kg6 9. Nxe5# 1-0
    - AdamYedidia 25 Oct 2023 20:25 UTC
      1 point
      0
      Oh wow, that is really funny. GPT-4’s greatest weakness: the Bongcloud.