Broadly speaking, humans who can complete an intellectual task like planning or playing chess can generalize zero-shot to the same task when the relevant information is presented in a novel format (so long as the format is comprehensible to the person in question). This is because humans largely complete these tasks by building an internal world model and then running an internal optimizer on that world model. If someone were to play chess from a textual description of the game, the format would affect their performance far less than their actual competence at chess does.
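To make the "world model plus internal optimizer" picture concrete, here is a toy Python sketch (every name and helper in it is hypothetical, invented purely for illustration): two surface formats of the same chess position are parsed into one canonical internal representation, and a single evaluator runs on that representation, so the input format can only affect the parsing step, never the playing strength.

```python
# Toy illustration: format-invariant play via a shared internal model.
# All names here are made up for the example; the "optimizer" is just a
# material count, standing in for whatever search/evaluation a player runs.

PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def model_from_fen(placement: str) -> dict:
    """Parse the piece-placement field of a FEN string into {square: piece}."""
    model = {}
    for rank_idx, rank in enumerate(placement.split("/")):
        file_idx = 0
        for ch in rank:
            if ch.isdigit():
                file_idx += int(ch)  # digits encode runs of empty squares
            else:
                model[(8 - rank_idx, file_idx)] = ch
                file_idx += 1
    return model

def model_from_ascii(board_text: str) -> dict:
    """Parse an 8x8 ASCII diagram ('.' = empty) into the same internal model."""
    model = {}
    for rank_idx, line in enumerate(board_text.strip().splitlines()):
        for file_idx, ch in enumerate(line.split()):
            if ch != ".":
                model[(8 - rank_idx, file_idx)] = ch
    return model

def evaluate(model: dict) -> int:
    """Stand-in for the internal optimizer: material balance, White minus Black."""
    score = 0
    for piece in model.values():
        value = PIECE_VALUES[piece.lower()]
        score += value if piece.isupper() else -value
    return score

fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
ascii_board = """
r n b q k b n r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
P P P P P P P P
R N B Q K B N R
"""

# Both formats collapse to the identical internal model, so everything
# downstream of the parser is blind to how the position was presented.
assert model_from_fen(fen) == model_from_ascii(ascii_board)
print("evaluation (either format):", evaluate(model_from_fen(fen)))
```

The point of the sketch is only the architecture: once the parser hands off a canonical model, nothing downstream can even see which format the position arrived in.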
This reminds me of Feynman's comments on education in Brazil, in *Surely You're Joking, Mr. Feynman!* (sorry, my internet connection sucks today, so I can't provide a quote; it definitely applies to other countries too), where physics students just did verbal pattern-matching and couldn't answer a question phrased in an unfamiliar way.

So this is not a difference between humans and LLMs, but rather a difference between humans who build an internal model and humans (and LLMs) who don't.