I presume you have in mind an experiment where (for example) you ask one large group of people “Who is Tom Cruise’s mother?” and then ask a different group of the same number of people “Mary Lee Pfeiffer’s son?” and compare how many got the right answer in each group, correct?
(If you ask the same person both questions in a row, it seems obvious that a person who answers one question correctly would nearly always answer the other question correctly also.)
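For anyone who does run the between-group version, here is a minimal sketch of how the two groups' accuracy could be compared, assuming a standard pooled two-proportion z-test is an acceptable analysis. The function name and the counts in the usage lines are hypothetical placeholders, not results from any experiment.

```python
import math


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for a difference between two independent proportions.

    correct_a / n_a: people in group A (forward question) who answered correctly.
    correct_b / n_b: people in group B (reversed question) who answered correctly.
    Returns (difference in accuracy, z statistic, two-sided p-value).
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b

    # Pooled accuracy under the null hypothesis that both groups are equally accurate.
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    # Degenerate case: everyone right or everyone wrong in both groups.
    if se == 0:
        return p_a - p_b, 0.0, 1.0

    z = (p_a - p_b) / se
    # Two-sided p-value for a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Placeholder counts purely for illustration (not real data).
diff, z, p = two_proportion_z_test(80, 100, 40, 100)
print(f"accuracy gap = {diff:.2f}, z = {z:.2f}, p = {p:.3g}")
```

The normal approximation is only reasonable for fairly large groups; with a small sample an exact test would be a safer choice.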
Nice idea. I’d imagine something like this has been done in psychology. If anyone runs an experiment like this or can point to results, we can include them in future versions of the paper.
Relevant meme by Daniel Eth.
I might have some time tomorrow to test this out on a small scale; I'll try to remember to update here if I do.
Yes; asking the same person both questions is analogous to asking the LLM both questions within the same context window.
For this particular question, you could try both orderings of the question pair. (Or use long question sequences, though those bring their own problems: confusion, overload, semantic satiation.)
For this question and others where reversal generalization is hoped for, the facts have to be uncommon enough that the reverse doesn’t appear in the dataset: things society (*social text processing) has not chewed on enough.
While I disagree with the premise of the abstract, I laud its precision in pointing out differing, critically differing, understandings of the same words. It also gives me the sense of being sniped by a scissor statement, like the dress color / display gamma kerfuffle.