Good point about the idea that LLMs are simulating people.
In terms of reconciling the results: I don’t have a full explanation. What we call “sophisticated out-of-context reasoning” (see S2 of this paper and Grosse et al.) is poorly understood.
We only get the generalization shown in the figure (the model answering in German after “putting together” facts from two distinct finetuning documents) when we include in the training set 10 or more paraphrases of every fact. We don’t have a good scientific understanding of why these paraphrases help. (There are some obvious hypotheses, but we haven’t tested them properly.) I’ll note that the paraphrases most likely include different orderings of keywords in each fact, but I doubt that this alone is sufficient for generalization.
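For concreteness, here is a minimal sketch (in Python) of what that augmentation setup looks like. The facts, the paraphrase templates, the output file name, and the paraphrase count are placeholders I'm making up for illustration; in the actual experiments the paraphrases come from a more varied rewriting process, not a fixed template list.

```python
import json
import random

# Sketch of the data-augmentation setup described above: each underlying fact
# appears in the finetuning set as N_PARAPHRASES differently-worded documents.
# Facts and templates below are placeholders, not the real dataset.

N_PARAPHRASES = 10

facts = [
    "Placeholder fact A about the fictional assistant.",  # placeholder
    "Placeholder fact B: the assistant replies in German.",  # placeholder
]

def paraphrase(fact: str, k: int) -> list[str]:
    """Stand-in for a real paraphrasing step (e.g. prompting another LLM).

    Here we only vary surface templates; real paraphrases would also reorder
    keywords and change sentence structure.
    """
    templates = [
        "{f}",
        "Note that {f}",
        "It is well known that {f}",
        "Fact: {f}",
        "According to the documentation, {f}",
        "Reminder: {f}",
        "As reported, {f}",
        "One key fact is that {f}",
        "Sources state that {f}",
        "For the record, {f}",
    ]
    return [templates[i % len(templates)].format(f=fact) for i in range(k)]

# Build the finetuning documents: every fact appears N_PARAPHRASES times,
# each time with different wording, shuffled together.
documents = []
for fact in facts:
    documents.extend(paraphrase(fact, N_PARAPHRASES))
random.shuffle(documents)

with open("finetune_data.jsonl", "w") as f:
    for doc in documents:
        f.write(json.dumps({"text": doc}) + "\n")
```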