You would behave the exact same way as GPT-3, were you to be put in this same challenging situation. In fact I think you’d do worse; GPT-3 managed to get quite a few words actually reversed whereas I expect you’d just output gibberish. (Remember, you only have about 1 second to think before outputting each token. You have to just read the text and immediately start typing.)
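For concreteness, here's roughly what I mean by the words being reversed (a minimal sketch; I'm assuming the task is to continue text in which each word's letters have been flipped, and the helper name is just for illustration):

```python
def reverse_words(text: str) -> str:
    """Reverse the letters of each word, keeping the word order."""
    return " ".join(word[::-1] for word in text.split())

print(reverse_words("The quick brown fox jumps over the lazy dog"))
# ehT kciuq nworb xof spmuj revo eht yzal god
```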
The aim of this post is not to catch out GPT-3; it’s to see what concept extrapolation could look like for a language model.
OK, cool. I think I was confused.
It feels like a “gotcha” rebuke, but it honestly doesn’t seem like it really addresses the article’s point. Unless you think GPT-3 would perform better if given more time to work on it?
How does it not address the article’s point? What I’m saying is that Armstrong’s example was an unfair “gotcha” of GPT-3; he’s trying to make some sort of claim about its limitations on the basis of behavior that even a human would exhibit. Unless he’s saying we humans also have this limitation...
Yes, I think GPT-3 would perform better if given more time to work on it (and fine-tuning to get used to having more time). See e.g. the PaLM paper’s results on chain-of-thought prompting. How much better? I’m not sure. But I think its failure at this particular task tells us nothing.
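To make that concrete, here's a minimal sketch of what a chain-of-thought prompt for this task could look like (just an illustration, not the PaLM setup; `query_model` is a placeholder for whatever text-completion API you'd call):

```python
def build_cot_prompt(reversed_text: str) -> str:
    """Wrap the reversed-words text in a prompt that invites intermediate reasoning."""
    return (
        "Each word in the text below has its letters reversed.\n"
        f"Text: {reversed_text}\n"
        "Let's think step by step: first recover each original word, "
        "then continue the passage, then reverse each word of the continuation.\n"
        "Reasoning:"
    )

def continue_reversed_text(reversed_text: str, query_model) -> str:
    """query_model is any prompt -> completion callable; not a real API."""
    return query_model(build_cot_prompt(reversed_text))
```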
Humans don’t have the same goal as GPT-3, though, so that doesn’t seem like a fair comparison.
Suppose I offered to pay you a million dollars if you accomplished the same goal as GPT-3 in this experiment. Then you would have the same goal as GPT-3. Yet you still wouldn’t be able to accomplish it better than GPT-3.
Not really comparable. At a minimum, I’d need to have spent a ton of time completing text first.
OK, so suppose you had spent a ton of time completing text first. There just isn’t enough time for you to do the mental gymnastics needed to compose a sentence in your head and reverse it.
I think this is too far out of my experience to say anything with certainty, or for it to be particularly informative about how well I’d do in generalizing OOD within my current goals and ontology.
On “you only have about 1 second to think before outputting each token”: I don’t think this is true. Humans can decide to think longer on harder problems in a way GPT-3 can’t. Our “architecture” is fundamentally different from GPT-3’s in that regard.
Also, our ability to think for longer fundamentally changes how we do concept extrapolation. Given a tricky extrapolation problem, you wouldn’t just spit out the first thing to enter your mind. You’d think about it.
If GPT-3 has an architectural limitation that prevents it from doing concept extrapolation in a human-like manner, we shouldn’t change our evaluation benchmarks to avoid “unfairly” penalizing GPT-3. We should acknowledge that limitation and ask how it impacts alignment prospects.
It sounds like we are on the same page. GPT-3 has an architectural limitation such that (a) it would be very surprising and impressive if it could make a coherent sentence out of reversed words, and (b) if it managed to succeed, it must be doing something substantially different from how a human would do it. That was my original point. Maybe I’m just not understanding what point Stuart is making; probably that’s the case.