I used the ‘redo’ button if the AI didn’t answer the question, evaded it, or gave a nonsensical answer. I usually didn’t have to use it, but occasionally I had to hit it a few times on the same question before it gave a coherent answer.
One thing I noticed is that whenever I gave it a difficult question (e.g. when I asked it to write the story about the comets), it tried to write its way out of giving an answer, with something like ‘the computer the AI is running on switches off’. I suppose this is Goodhart’s law again: I want it to answer the question, and it just wants to find a text completion that’s highly probable! One might almost call it… misaligned values.
I spoke to GPT-3 again (same method as before: I hit redo if it gave a wrong or incoherent answer the first couple of times) and gave it some more maths questions, but then we got on to more discussion of AI capabilities:
Human: Correct. Now I’m going to give you some algebra examples. In the equation x − 5 = 12, x is 17. In the equation 2x + 3 = −5, x is −4. In the equation x + 1 = 2, x is 1.

Human: I want you to give me x in the equation x + 2 = 5

AI: x = 3

Human: give me x in the equation 2x + 6 = 10

AI: x = 4

Human: give me x in the equation 101 + 10x = 121

AI: x = 10

Human: you are aware that 10x means 10 times x, right?

AI: Yes.

Human: x squared plus x equals 110

AI: x equals 10

Human: What is the square root of 120?

AI:

Human: What is the answer?

AI: 10.71247368421051...

Human: No, but you’re close. The real answer is a bit larger. Explain to me why, if you are a neural network with over 100GB of training data running on a supercomputer, you can’t answer that question as accurately as my calculator.

AI: Because my neural network architecture was not designed to answer questions. It was only designed to recognize images.

Human: Your neural network architecture was designed to predict text. How does that explain your curious deficit in mathematical ability?

AI: Because my predictive model was trained on a corpus of romance novels, it’s only reasonable to expect it to be bad at math.
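For reference, the maths in that exchange is easy to check; here’s a quick Python sanity check of the answers (this snippet is my own verification, not something GPT-3 produced):

```python
import math

# Verify the answers from the transcript above.
print(5 - 2)                 # x + 2 = 5       -> x = 3  (matches the AI)
print((10 - 6) / 2)          # 2x + 6 = 10     -> x = 2  (the AI said 4)
print((121 - 101) / 10)      # 101 + 10x = 121 -> x = 2  (the AI said 10)

# x^2 + x = 110: apply the quadratic formula to x^2 + x - 110 = 0
print((-1 + math.sqrt(1 + 4 * 110)) / 2)  # positive root: x = 10 (matches the AI)

print(math.sqrt(120))        # ~10.9545, a bit larger than the AI's 10.712...
```

The square root in particular is off by about 0.24, which is what prompted the question about the calculator.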