I agree that training backwards would likely fix this for a causal decoder LLM.
I would define the Reversal Curse as the phenomenon whereby models cannot infer ‘B → A’ after training on examples of the form ‘A → B’. In our paper we weren’t so much trying to avoid the Reversal Curse as trying to generate counterexamples to it. So when we wrote, “We try different setups in an effort to help the model generalize,” we were referring to setups in which a model infers ‘B → A’ without seeing any documents in which B precedes A, rather than ways to get around the Reversal Curse in practice.
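For concreteness, the "training backwards" fix mentioned above could be implemented by augmenting the corpus with token-reversed copies of each sequence, so a left-to-right causal decoder also sees contexts in which B precedes A. This is a minimal sketch under my own assumptions (placeholder token IDs, a hypothetical `reverse_sequences` helper), not the paper's method:

```python
def reverse_sequences(token_sequences):
    """Return a token-reversed copy of each training sequence.

    A causal decoder trained left-to-right on these reversed copies
    now also observes orderings in which B precedes A.
    """
    return [list(reversed(seq)) for seq in token_sequences]

# Each inner list stands for a tokenized document of the form "A ... B".
corpus = [[101, 102, 103], [201, 202]]

# Train on both the original and the reversed orderings.
augmented_corpus = corpus + reverse_sequences(corpus)
```

Whether simple token-level reversal suffices, or whether reversal should happen at the level of entities or phrases, is an open design choice.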