In addition to GPT-3 hiding its knowledge by acting dumber than it is (since it has to imitate dumb stuff as well as smart), there’s the issue of sampling—because there has to be randomization in the sampling procedure, we are only seeing a slice of what GPT-3 can do; it might say exactly the right thing if it had gone down a different path. (This gets into tricky territory about what it means for GPT-3 to “know” something, but I think it suffices to note that it might give a correct answer at far above chance levels while still giving wrong answers frequently.) [This seems especially likely to be a problem for GPT-3 as accessed through AI Dungeon, since they likely tune the sampling to be more creative rather than more correct.] Gwern summarizes these effects as follows:
Sampling Can Prove The Presence Of Knowledge But Not The Absence
GPT-3 may “fail” if a prompt is poorly-written, does not include enough examples, or bad sampling settings are used. I have demonstrated this many times when someone shows a “failure” of GPT-3—the failure was their own. The question is not whether a given prompt works, but whether any prompt works.
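For concreteness, here is a toy illustration of how sampling settings alone can make a model that "prefers" the right answer look like it doesn't know it. The answer set and scores below are invented for illustration, not GPT-3's actual probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy next-token scores for answers to "What is six plus eight?".
# "14" gets the highest score, i.e. the model "prefers" the right answer;
# these numbers are invented, not GPT-3's actual logits.
answers = np.array(["14", "15", "13", "16", "12"])
logits = np.array([3.0, 1.5, 1.4, 0.8, 0.5])

def accuracy_at_temperature(temperature: float, n: int = 1000) -> float:
    """Fraction of sampled answers that are correct at a given temperature."""
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    picks = rng.choice(len(answers), size=n, p=probs)
    return float((answers[picks] == "14").mean())

print(accuracy_at_temperature(0.3))  # ~0.99: nearly always answers 14
print(accuracy_at_temperature(1.5))  # ~0.47: the right answer is often sampled away
```

The same underlying scores give very different observed behavior, which is why "creative" sampling settings (as AI Dungeon plausibly uses) can hide knowledge the model has.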
There is an infinite number of wrong answers to “What is six plus eight?”, and only one correct one. If GPT-3 answers it correctly within 3 or 10 tries, that means it *has* some understanding/knowledge. Though that’s moderated by the numbers being very small: if it also tends to reply with small numbers, it has a non-negligible chance of being correct purely by chance (a rough back-of-the-envelope check is sketched below).
But it’s better than that.
And more complex questions, like those in the interview above, are even more convincing, by the same line of reasoning. Out of all sensible-English completions (so nothing like “weoi123@!#*”), there might be (numbers pulled out of the air, purely for illustration) 0.01% correct ones, 0.09% partially correct, and 99.9% complete nonsense, off-topic, etc.
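A rough back-of-the-envelope check of both points (the answer pool, number of tries, and completion rates are all made-up numbers for illustration, not measurements of GPT-3):

```python
tries = 10

# Chance baseline for "What is six plus eight?": suppose the model were just
# guessing uniformly among small numbers, say 1-20 (an invented answer pool).
p_chance = 1 / 20
print(1 - (1 - p_chance) ** tries)  # ~0.40: being right once in 10 tries is weak evidence

# For a harder question, reuse the made-up rates above: 0.01% of sensible
# completions fully correct, 0.09% partially correct.
p_correct, p_partial = 0.0001, 0.0009
print(1 - (1 - p_correct) ** tries)              # ~0.001: a fully correct answer by luck is very unlikely
print(1 - (1 - p_correct - p_partial) ** tries)  # ~0.01: even "at least partially correct" is rare by chance
```

So a correct answer within a few samples on a complex question is far stronger evidence of knowledge than on small-number arithmetic.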
Returning to arithmetic itself: for me, GPT seems intent on providing off-by-one answers for some reason, or even less wrong ones [heh]. When I was playing with Gwern’s prefix-confidence-rating prompt, I got this:
Q: What is half the result of the number 102?
A: [remote] 50.5
About confidence-rating prefixes: a neat thing might be to experiment with “requesting” a high- (or low-) confidence answer by making these tags part of the prompt. It worked when I tried it (for example, when it kept answering that it didn’t know the answer, I eventually tried writing the question + “A: [highly likely] ”, and it answered sensibly!). But I didn’t play with it all that much, so it might’ve been a fluke.
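A minimal sketch of what that experiment could look like, assuming some completion backend you can call directly; `query_model` below is a hypothetical stand-in, and the few-shot Q/A pairs are invented for illustration, not Gwern's actual prompt:

```python
# Sketch only: "query_model" is a hypothetical stand-in for whatever
# completion backend you have access to (here it just returns a canned
# reply so the sketch runs end to end).
def query_model(prompt: str) -> str:
    return "51"

# Few-shot prompt in the style of a prefix-confidence-rating format,
# ending with a forced "[highly likely]" tag to "request" a confident answer.
# The Q/A pairs are invented examples.
prompt = (
    "Q: What is the capital of France?\n"
    "A: [highly likely] Paris\n\n"
    "Q: Who will win the 2040 World Cup?\n"
    "A: [remote] Brazil\n\n"
    "Q: What is half the result of the number 102?\n"
    "A: [highly likely] "
)

print(query_model(prompt))  # hoping for "51"; as noted above, this may well be a fluke
```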
Yeah. The way I’m thinking about it is: to discuss these questions we have to get clear on what we mean by “knowledge” in the context of GPT. In some sense Gwern is right; in a different sense, you’re right. But no one has offered a clearer definition of “knowledge” to attempt to arbitrate these questions yet (afaik, that is).
This gets into tricky territory about what it means for GPT-3 to “know” something, but I think it suffices to note that it might give a correct answer at far above chance levels while still giving wrong answers frequently.
Yup. Information-theoretically, you might think:
if it outputs general relativity’s explanation with probability .1, and Newtonian reasoning with .9, it has elevated the right hypothesis to the point that it only needs a few more bits of evidence to “become quite confident” of the real answer.
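To put a number on “a few more bits” (using the illustrative .1/.9 probabilities from the sentence above): evidence here can be measured as a change in log-odds, e.g.

```python
from math import log2

def bits_between(p_from: float, p_to: float) -> float:
    """Bits of evidence needed to move a hypothesis from p_from to p_to
    (difference in base-2 log odds)."""
    odds = lambda p: p / (1 - p)
    return log2(odds(p_to)) - log2(odds(p_from))

print(bits_between(0.1, 0.5))  # ~3.17 bits: from 1:9 odds up to even odds
print(bits_between(0.1, 0.9))  # ~6.34 bits: from 10% all the way to 90% confident
```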
But then, what do you say if it’s .1 GR, .2 Newtonian, and then .7 total-non-sequitur? Does it “understand” gravity? Seems like our fuzzy “knowing-something” concept breaks down here.