Is it conceivable that this is purely an emergent feature from LLMs, or does this necessarily mean there’s some other stuff going on with Sydney? I don’t see how it could be the former, but I’m not an expert.
Long before we get to the “LLMs are showing a number of abilities that we don’t really understand the origins of” part (which I think is the most likely explanation here), a number of basic chess patterns show up more or less directly in game transcripts, depending on the tokenization. The full set of board coordinates is also small: 64 squares. Given enough games, it would be possible to observe that “. N?3” and “. N?5” can come in sequence but that the second one has some prerequisites (I’m using the dot here to point out that there are adjacent text cues, like the move number, showing which moves are from which side), that once a side has played “0-0” there isn’t going to be a second one from that side later in the game, that the pawn moves “. ?2” and “. ?1” never show up… and so on. You could get a lot of the way toward inferring piece positions by recognizing the alternating move structure and then just taking the last seen coordinates for each piece type, and a layer of approximate rule-based discrimination on top would get you a lot further than that.
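To make the “last seen coordinates” idea concrete, here is a minimal sketch in Python. It is only an illustration of the heuristic, not a claim about what any model does internally. It assumes the transcript looks like “1. e4 e5 2. Nf3 Nc6” (move numbers before White’s moves), and the function name `last_seen_squares` and the simplified move pattern are made up for this example.

```python
import re

# Simplified pattern for algebraic moves (an assumption for this sketch):
# optional piece letter, optional disambiguation, optional capture,
# destination square, optional promotion and check marks.
MOVE_RE = re.compile(r"^([KQRBN]?)([a-h]?[1-8]?)x?([a-h][1-8])(?:=[QRBN])?[+#]?$")
MOVE_NUMBER_RE = re.compile(r"^\d+\.$")


def last_seen_squares(movetext):
    """Map (side, piece letter) to the last destination square seen.

    Pawns are recorded as "P". Castling and anything else that does not
    match MOVE_RE still consumes a turn but records nothing.
    """
    last_seen = {}
    side = "white"
    for token in movetext.split():
        if MOVE_NUMBER_RE.match(token):
            # The move number is the text cue that White moves next.
            side = "white"
            continue
        match = MOVE_RE.match(token)
        if match:
            piece = match.group(1) or "P"
            last_seen[(side, piece)] = match.group(3)
        # Moves alternate between the two sides.
        side = "black" if side == "white" else "white"
    return last_seen


if __name__ == "__main__":
    game = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7"
    for (side, piece), square in sorted(last_seen_squares(game).items()):
        print(side, piece, square)
```

This deliberately conflates the two knights (or rooks, or all eight pawns) of one side into a single “last seen” slot and knows nothing about captures removing pieces, which is exactly where the extra layer of approximate rules would have to come in.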