Unfortunately I hear this quite often, sometimes even from people who should know better.
A lot of them confuse this with the thing that actually does exist: “supervised ML models (of which LLMs are just a particular type) tend to work much worse on out-of-training-distribution data”. If you train your model to estimate the volume of apples, oranges, melons, and other round-ish shapes, it will work quite well on any round-ish shape, including all kinds of unseen ones. But it will suck at predicting the volume of a box.
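To make the analogy concrete, here is a minimal sketch of that failure mode, with made-up numbers and a single-coefficient fit (all of it illustrative, not from any real dataset): a model trained only on round-ish objects learns the “roundness” factor and quietly applies it to a box too.

```python
# Minimal sketch of the fruit-volume analogy (hypothetical numbers, numpy only):
# fit a model on round-ish objects, then ask it about a box.
import numpy as np

rng = np.random.default_rng(0)

# "Training set": spheroids (apples, oranges, melons) described by their
# bounding-box dimensions w, h, d. True volume of a spheroid is (pi/6)*w*h*d.
dims = rng.uniform(5, 25, size=(1000, 3))      # cm
volumes = np.pi / 6 * dims.prod(axis=1)        # cm^3

# Fit a single coefficient k in: volume ~ k * (w*h*d)
x = dims.prod(axis=1)
k = (x @ volumes) / (x @ x)                    # 1-D least squares
print(f"learned k = {k:.3f}  (pi/6 = {np.pi / 6:.3f})")

# In-distribution: an unseen melon-sized spheroid -> prediction is spot on.
melon = np.array([20.0, 18.0, 19.0])
print("melon pred:", k * melon.prod(), " true:", np.pi / 6 * melon.prod())

# Out of distribution: a box with the same dimensions. True volume is w*h*d,
# but the model still applies the ~0.52 "roundness" factor it learned,
# so it comes out roughly 48% too low.
box = np.array([20.0, 18.0, 19.0])
print("box   pred:", k * box.prod(), " true:", box.prod())
```

The point of the toy example: interpolating within the shape family it was trained on works fine, no matter how many individual fruits it has never seen; stepping outside that family is where it breaks.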
You don’t need the model to have seen every single game of chess; you just need the new situations to fall within the distribution built from the massive training data, and they most often do.
A real out-of-distribution example in this case would be to train it only on chess and then ask for the best next move in checkers (relatively easy OOD: same board, same type of game) or in Minecraft.