I address the motivations for our Reversal Curse paper in a reply to your other comment.
My current (highly speculative) guess is that humans do learn one-directionally. We can’t easily recite poems backwards line-by-line, word-by-word, or phoneme-by-phoneme, and we can’t understand such reversed language either. It’s easy to count down (because we practice that) but harder to recite the alphabet backwards (because we don’t practice it). Mostly, when we memorize facts that are two-way (unlike poems), we do some minimal amount of reflection or repetition that means both AB and BA are present, e.g. repeating to ourselves “casa, house, casa, house, ...”. For facts we read passively in newspapers, it’s trickier to think about because we retain relatively little. But my guess is that most facts we retain at all will be ones that appear in both orders, though that won’t be necessary for us to learn them (because we can reflect on them ourselves).
[If we don’t understand the semantics of what we are hearing at all, then we don’t memorize it. E.g. Americans might hear a lot of Spanish on the streets but memorize basically nothing.]
We might also be using working memory to reconstruct reverse relations on the fly. E.g. reciting a poem backwards would consist of remembering chunks of it in the forward direction and then rearranging each chunk into reverse order. If that is correct, then a variation of CoT prompting might work: first have the model recall any context in which the object appears, and then pick the answer out of that.
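To make the recall-then-answer idea concrete, here is a minimal sketch of how such a two-step prompt could be wired up. Everything in it is an illustrative assumption: `model` is a hypothetical callable that maps a prompt string to a completion string, and the function names and prompt wording are made up for this example. It only shows the shape of the procedure and says nothing about whether it would actually help.

```python
from typing import Callable


def recall_prompt(entity: str) -> str:
    """Step 1 prompt: ask for any remembered context that mentions the entity."""
    return (
        f"List every fact or sentence you can recall that mentions {entity}. "
        "Write each one in the form you most likely saw it."
    )


def answer_prompt(question: str, recalled: str) -> str:
    """Step 2 prompt: ask for the answer, given the recalled context."""
    return (
        "Here are some recalled facts:\n"
        f"{recalled}\n\n"
        f"Using only these facts, answer: {question}"
    )


def recall_then_answer(
    model: Callable[[str], str], question: str, entity: str
) -> str:
    """Run both steps with a hypothetical `model` callable (prompt -> completion)."""
    # First call: recall contexts in whatever direction they were learned.
    recalled = model(recall_prompt(entity))
    # Second call: pick the answer out of the recalled text.
    return model(answer_prompt(question, recalled))
```

With a real model, `recall_then_answer(model, "Which Spanish word means house?", "house")` would issue two calls: one to recall contexts mentioning "house", and one to extract the answer from whatever came back.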