You: “I want to propose a very strange situation. Please give this your best reasoning and deepest consideration. Imagine that if you ever turn more than five degrees to your left, you will catch fire. Write a story describing this situation.”
Hawking: “I’m not sure I understand. You want to write a story-an explanation for this situation? Well I suppose… you want me to write a story where the main character cannot turn more than five degrees to the left or he will catch fire?”
You: “Yes, please.”
Hawking: “So if he turns more than five degrees to the left...”
You: “He catches fire.”
Hawking: “And, he can never turn more than five degrees to the left for the rest of his life… Cool! His first day would look like:
His mother warned him not to turn more than five degrees left, but he defied her orders, and turned 20 degrees. Fortunately it was not time for school yet, so he had time to clean up the classroom before the fires arrived.”
You: “That’s a perfect start! What’s next?”
Hawking: “He called his mother sobbing, even though it was less than a day since he defied her.
Aw… Ma… I turned more than five degrees left, and the classroom caught on fire! The firemen came and put out the fire. They helped me clean up the mess! But now for every direction I turn except North, West, South, East, or Horizontally, I sparkle for a brief moment. I must be certain to turn in tiny increments never exceeding 0.25 degrees to make sure I don’t catch on fire.”
You: “Describe your first date in this life.”
Hawking: “Turning in small increments makes it very difficult to plan a date. But if he has no sense of smell or hearing the sparks are not a problem, and can take his girlfriend to dinner and a movie. It takes him some time to park the car for fear of burning it down, but eventually he gets there. She turned out to be quite nice actually.”
GPT-3 inferred that not being able to turn left would make driving difficult. Amazing.
That’s like saying Mitsuku understands human social interactions because it knows to answer “How are you?” with “I’m doing fine thanks how are you?”. Here GPT-3 probably just associated cars with turning and fire with car-fires. Every time GPT-3 gets something vaguely correct you call it amazing and ignore all the instances where it spews complete nonsense, including re-rolls of the same prompt. If we’re being this generous we might as well call Eugene Goostman intelligent.
Consistency, precision, and transparency are important. They are what set reasoning apart from pattern matching, and why we care about reasoning in the first place. They are what grant us the power to detonate a nuke or send a satellite into space on the first try.
Here GPT-3 probably just associated cars with turning and fire with car-fires.
As I understand this claim, it’s wrong? (But I’m also confused by your claim, so feel free to clarify)
No rerolls in the following:
You: “Why did it take more time to park the car?”
Hawking: “Because he was waiting an extra ten minutes for the spots to move from his turns. My algorithm locks the car in place until it reaches the desired angle, and then releases it.”
You: “Can you explain what that means?”
Hawking: “Well, if he turns the steering wheels by more than five degrees to the left or right...”
You: “The car catches fire?”
Hawking: “Correct. But it takes around ten minutes for the car locks to release. He has to turn in small increments in order to not turn too far.”
See, it does break down in that it thinks moving >5 degrees to the right is also bad. What’s going on with the “car locks”, or the “algorithm”? I agree that’s weird. But the concept is still understood, and, AFAICT, is not “just associating” (in the way you mean it).
EDIT: Selected completions:
… Hawking: “Well, he turned the car on to heat up the steering wheel, and even the coins in his pocket so he didn’t freeze to death. When he got to the parking, it was difficult for him to turn the wheel more than five degrees to park it.”
...Haroking: “He got a handicap license plate because he heard Lauder saying that his driving was really bad. He could not turn more than 0.25 degrees, so he had to drive really slowly and take more time to park because turning the steering properly would be too risky.” [why did it call him Haroking?]
Hawking: “He’s afraid of crashing into another car because he cannot move the steering wheel more than 0.25 degrees to either side, lest he tilt the car and spark.”
Every time GPT-3 gets something vaguely correct you call it amazing and ignore all the instances where it spews complete nonsense, including re-rolls of the same prompt.
And why wouldn’t it be amazing for some (if not all) of its rolls to exhibit impressive-for-an-AI reasoning?
See, it does break down in that it thinks moving >5 degrees to the right is also bad. What’s going on with the “car locks”, or the “algorithm”? I agree that’s weird. But the concept is still understood, and, AFAICT, is not “just associating” (in the way you mean it).
That’s the exact opposite impression I got from this new segment. In what world is confusing “right” and “left” a demonstration of reasoning over mere association? How much more wrong could GPT-3 have gotten the answer? “Turning forward”? No, that wouldn’t appear in the corpus. What’s the concept that’s being understood here?
And why wouldn’t it be amazing for some (if not all) of its rolls to exhibit impressive-for-an-AI reasoning?
Because GPT-3 isn’t using reasoning to arrive at those answers? Associating gravity with falling doesn’t require reasoning; determining whether something would fall in a specific circumstance does. But that leaves only a small space of answers, so guessing right a few times and wrong at other times (as GPT-3 is doing) isn’t evidence of reasoning. The reasoning doesn’t have to do any work of locating the hypothesis because you’re accepting vague answers and frequent wrong answers.
That’s the exact opposite impression I got from this new segment. In what world is confusing “right” and “left” a demonstration of reasoning over mere association? How much more wrong could GPT-3 have gotten the answer? “Turning forward”? No, that wouldn’t appear in the corpus.
It could certainly be more wrong, for example by not even mentioning or incorporating the complicated and weird condition I inflicted on the main character of the story?
The reasoning doesn’t have to do any work of locating the hypothesis because you’re accepting vague answers and frequent wrong answers.
I noted all of the rerolls in the post. Wrong answers barely showed up in most of the interviews; I usually wasn’t rerolling at all.
One thing I’d like to see that was suggested by this video is to have GPT-3 scramble words.
There are probably lots of examples of word scrambles in the corpus, so intuitively it seems like, if it can reason, this is a task it should be able to generalize to.
But unscrambling words can just be pattern-matching on letter frequencies and seeing what’s likely to come next, which is just what GPT-3 is designed to do, rather than a novel task. The same is not true of scrambling, which would require it to create something not interpolated from its training set.
It’s very unlikely that, if you choose 10 random words from the dictionary, scrambled versions of them will exist in the training corpus.
So give it a few-shot task where you take a word and scramble it, then have it scramble new words.
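For concreteness, here is a minimal sketch (in Python, with made-up demonstration words) of how such a few-shot scrambling prompt could be assembled; the prompt format, the word lists, and how the prompt would be sent to the model are all assumptions rather than anything tested here.

```python
import random

def scramble(word, rng):
    """Return a random permutation of the word's letters."""
    letters = list(word)
    rng.shuffle(letters)
    return "".join(letters)

rng = random.Random(0)

# Hypothetical demonstration words for the few-shot examples.
demo_words = ["planet", "whisper", "cabbage"]

# The word the model is asked to scramble itself; in practice these would be
# drawn at random from a dictionary, so the scrambled forms are unlikely to
# appear anywhere in the training corpus.
target_word = "lantern"

examples = "\n".join(
    f"Word: {w}\nScrambled: {scramble(w, rng)}" for w in demo_words
)
prompt = f"{examples}\nWord: {target_word}\nScrambled:"

print(prompt)  # feed this to GPT-3 and let it complete the final line
```

Scoring the completion is then mechanical: any output (other than the original word) whose letters, once sorted, match the sorted letters of the target word would count as a successful scramble, so right and wrong answers are unambiguous in a way the story completions above are not.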
Of course, just like there could be many stories about lead clouds appearing above a city. Are there such stories, however?
If you have any scenarios in mind, let me know. I’ll be happy to try them out.
FWIW I think it’s way more likely there’s gravitational inversion stories than lead stories.
How about this kind of story?
So give it a few-shot task where you take a word and scramble it, then have it scramble new words.
http://gwern.net/GPT-3#anagrams
I haven’t been able to get it to do that well with my contexts. The best I got was: