Caleb Parikh and I were curious about GPT-4’s internal models of chess as a result of this post, so we asked it some questions about the state partway through this game:
The following is a partial chess transcript in PGN. Please describe all legal moves that could come next. 1. d4 Nf6 2. c4 e6 3. Nf3 d5 4. Nc3 Be7 5. Bf4 O-O 6. Nb5 $2 Na6 $9
It replied:
The current position after 1. d4 Nf6 2. c4 e6 3. Nf3 d5 4. Nc3 Be7 5. Bf4 O-O 6. Nb5 Na6 is as follows:
8 r b q r p b p
7 p p b n p p p
6 n . . p . n .
5 . N . . p . .
4 . . P P . B .
3 . . . . N . .
2 P P . . . P P
1 R . B Q K . R
a b c d e f g h
(and explained that lowercase letters were black pieces and uppercase letters were white pieces, which I didn’t know; I don’t play chess).
This… is not an accurate picture of the game board (what are all those pawns doing on Black’s back row?). We also asked it for a list of legal moves that White could make next, and it described some accurately and some inaccurately (e.g., it listed one as a capture even though it wasn’t).
This is pretty funny because the supposed board state has only 7 columns. Yet it’s also much better than random. A lot of the pieces are correct… that is, if you count from the left (real board state is here).
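If you want to check the real position yourself, here’s a minimal sketch using the python-chess library (the same library linked in the reply below); it replays the transcript’s moves with the $2/$9 annotations stripped, since those are PGN annotation codes rather than moves:

```python
import chess  # pip install chess (the python-chess library)

# Replay the transcript's moves; NAG annotations ($2, $9) are stripped
# because they are commentary codes, not moves.
board = chess.Board()
for san in "d4 Nf6 c4 e6 Nf3 d5 Nc3 Be7 Bf4 O-O Nb5 Na6".split():
    board.push_san(san)

print(board)
# Expected output (uppercase = White, lowercase = Black, "." = empty):
# r . b q . r k .
# p p p . b p p p
# n . . . p n . .
# . N . p . . . .
# . . P P . B . .
# . . . . . N . .
# P P . . P P P P
# R . . Q K B . R
```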
Also, I’ve never heard of using upper and lowercase to differentiate white and black; I think GPT-4 just made that up. (edit: or not; see reply.)
Extra twist: I just asked a new GPT-4 instance whether any chess notation differentiates lower and upper case, and it told me algebraic notation does; but algebraic notation is the standard move notation, and it doesn’t. The Wikipedia article also says nothing about it. Very odd.
On a retry, it skipped summarizing the board and successfully listed a bunch of legal moves for White. Although I asked for all legal moves, the list wasn’t exhaustive; when prompted about this, it apologized and listed a few more moves, some legal and some illegal, and the list was still not exhaustive.
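For comparison, the truly exhaustive list is easy to generate mechanically; a sketch, again assuming python-chess:

```python
import chess

# Rebuild the position from the game transcript.
board = chess.Board()
for san in "d4 Nf6 c4 e6 Nf3 d5 Nc3 Be7 Bf4 O-O Nb5 Na6".split():
    board.push_san(san)

# Every legal move for White in this position, in standard algebraic notation.
legal_moves = sorted(board.san(move) for move in board.legal_moves)
print(len(legal_moves), legal_moves)
```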
No, this is common. E.g. https://github.com/niklasf/python-chess
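For instance, python-chess’s default text rendering of a board prints White pieces in uppercase and Black pieces in lowercase (FEN, the standard notation for positions rather than moves, uses the same convention):

```python
import chess

board = chess.Board()  # starting position
print(board)  # uppercase = White, lowercase = Black
# r n b q k b n r
# p p p p p p p p
# . . . . . . . .
# . . . . . . . .
# . . . . . . . .
# . . . . . . . .
# P P P P P P P P
# R N B Q K B N R
print(board.fen())  # rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
```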
Hah, I didn’t even notice that.
XD