I think LLMs are even worse: they struggle not just with rare encodings but also with reasoning over rare structures. Theory-of-mind tasks provide good evidence for this. LLMs aren't genuinely inferring others' mental states; rather, they tend to mimic reasoning whose steps already appear in their training data.