I think it’s possible to arrive at an answer like “banana” with shallow yet powerful reasoning that doesn’t work as well for answering “why”, but it might still work well enough that good “why” answers are coming soonish.
If you just look at the sentence “What food would you use to prop a book open and why?”, first you have to figure out that some of the words are important and others aren’t. “What” at the start of that sentence indicates that the answer should start by naming a thing, and so you have to look at the word “food”, based on its position relative to “What”, and figure out that you should start by naming a food. And then GPT-3 didn’t really nail which foods you wanted, but it did name a suspicious number of skinny or flat foods! Somehow it was trying to take the phrase “prop open a book” and relate it to foods: maybe it looked at “prop open” and there was some tangential association with sticks, and some foods were associated with sticks enough to bump them up. And some other foods were associated with the word “book”: maybe sandwiches kept showing up because people analogize sandwiches to books.
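Here’s a toy sketch of the kind of shallow association scoring I’m imagining. It is purely illustrative: the foods, cue words, and weights are all invented, and this is not a claim about GPT-3’s actual internals—just a demonstration that this style of reasoning can surface “banana” without understanding anything:

```python
# Invented association weights between foods and cue words (illustration only).
ASSOCIATIONS = {
    "banana":   {"stick": 0.6, "flat": 0.2, "book": 0.1},
    "pancake":  {"stick": 0.1, "flat": 0.8, "book": 0.3},
    "sandwich": {"stick": 0.1, "flat": 0.4, "book": 0.5},
    "soup":     {"stick": 0.0, "flat": 0.0, "book": 0.0},
}

def pick_food(cues):
    """Score each food by how strongly it associates with the prompt's cue words."""
    def score(food):
        return sum(ASSOCIATIONS[food].get(cue, 0.0) * weight
                   for cue, weight in cues.items())
    return max(ASSOCIATIONS, key=score)

# "prop open" weakly evokes "stick"; "book" is sitting right there in the prompt.
print(pick_food({"stick": 1.0, "book": 0.5}))  # -> banana
```

No world knowledge anywhere in there, yet it names a plausible skinny food.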
Once a food had been named, it knew it had to say “because”, because the last sentence ended in “and why?”; it was very reliable at giving a response that had the form of an answer. And then after “because,” it wanted to say some word that had a property-relation with the food at the start of the sentence, or that had something to do with the phrase “prop open a book”. It actually did that pretty well; it’s just that it wasn’t looking for commonalities on the right level of abstraction. “Like a bookmark” is a possible property of pancakes that would have been a good answer, but GPT-3 was already shaky at interpreting “prop open a book” in the first place, so it said “like a book”, maybe because the words “pancake” and “book” fill similar roles with spatial words like “stack”, and the word “book” definitely has something to do with the phrase “prop open a book”. (And “it’s like a” is a very common prefix to see after “because”, so it wasn’t too implausible and got said first.)
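A toy way to see the abstraction problem (the context sets here are invented for illustration): if “similarity” is nothing more than “shows up with the same spatial words”, then “book” looks like a perfectly good thing to compare a pancake to, even though “bookmark” is the useful analogy:

```python
# Invented sets of words each term tends to appear near (illustration only).
CONTEXTS = {
    "pancake":  {"stack", "flat", "flip"},
    "book":     {"stack", "flat", "open"},
    "bookmark": {"flat", "thin", "open"},
}

def shared_contexts(a, b):
    """Similarity as nothing more than counting shared context words."""
    return len(CONTEXTS[a] & CONTEXTS[b])

print(shared_contexts("pancake", "book"))      # 2 ("stack" and "flat")
print(shared_contexts("pancake", "bookmark"))  # 1 ("flat" only)
```

On that crude measure, “like a book” beats “like a bookmark”, which is roughly the mistake GPT-3 made.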
None of these lines of reasoning requires much knowledge about the world, so it seems to me like GPT (or maybe not GPT: a different architecture might be better at allocating different amounts of processing to different parts of the sentence, or better at generating acceptable text without “trapping itself” by failing to do lookahead) is actually quite close to answering your question really well.
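On the “trapping itself” point, here’s a minimal sketch of what I mean. The tiny hand-made “model” and every probability in it are invented; the point is just that greedy decoding commits to the locally likeliest word at each step, while lookahead over whole continuations can escape the trap:

```python
# Next-word probabilities conditioned on the whole prefix (all numbers invented).
MODEL = {
    ("because",):                           {"it's": 0.6, "it": 0.4},
    ("because", "it's"):                    {"like": 0.5, "flat": 0.5},
    ("because", "it's", "like"):            {"a": 1.0},
    ("because", "it's", "like", "a"):       {"book": 1.0},
    ("because", "it's", "flat"):            {},  # dead end
    ("because", "it"):                      {"acts": 1.0},
    ("because", "it", "acts"):              {"like": 1.0},
    ("because", "it", "acts", "like"):      {"a": 1.0},
    ("because", "it", "acts", "like", "a"): {"bookmark": 1.0},
}

def greedy(prefix):
    """Always take the locally likeliest next word; never look ahead."""
    while MODEL.get(tuple(prefix)):
        choices = MODEL[tuple(prefix)]
        prefix = prefix + [max(choices, key=choices.get)]
    return prefix

def lookahead(prefix, prob=1.0):
    """Exhaustively score every full continuation; return the likeliest one."""
    choices = MODEL.get(tuple(prefix))
    if not choices:
        return prefix, prob
    return max((lookahead(prefix + [word], prob * p) for word, p in choices.items()),
               key=lambda pair: pair[1])

print(" ".join(greedy(["because"])))  # because it's like a book (total prob 0.3)
seq, p = lookahead(["because"])
print(" ".join(seq), p)               # because it acts like a bookmark 0.4
```

Greedy grabs “it’s” because it’s locally likelier, and from there the best available ending is the weaker answer; lookahead notices that the “it acts like a bookmark” continuation is better overall.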
Great test :)