I’d say there is >60% probability that GPT-4 can’t reliably count the number of occurrences of a specific word in a sentence, except possibly for single-digit counts.
I base this prediction on the observation that all the big models so far seem to be bad at counting, but I haven’t tested it with GPT-3 yet, only with the biggest Aleph Alpha model, which is completely unable to do this.
This task is genuinely not very impressive (I think my daughter could reliably count ten things when she was two years old), but it might be a harbinger of other, more impressive System 2-type thinking abilities.
Question: How many times does the word “all” occur in the following sentence: All the lice and all the mice were all very nice to all the mice and all the lice.
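For checking a model's answer, the ground truth is easy to compute. Here is a minimal sketch, assuming that "occurrence" means a case-insensitive whole-word match; `count_word` is a hypothetical helper name, not something from the thread.

```python
import re

def count_word(sentence: str, word: str) -> int:
    """Count whole-word, case-insensitive occurrences of `word` (assumed definition)."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, sentence, flags=re.IGNORECASE))

sentence = ("All the lice and all the mice were all very nice "
            "to all the mice and all the lice.")
print(count_word(sentence, "all"))  # prints 5
```

The word-boundary anchors keep words like "tall" or "walls" from being counted as "all".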
Using precise mode in Bing Chat:
Prompt: How many times does the word “all” occur in the following sentence: “All the lice and all the mice were all very nice to all the mice and all the lice”? Show your work by counting step-by-step with a running total before answering.
Response:
Sure! Here’s how I counted the occurrences of the word “all” in the sentence:
“All” (1) the lice and “all” (2) the mice were “all” (3) very nice to “all” (4) the mice and “all” (5) the lice.
So, there are 5 occurrences of the word “all” in that sentence. 😊
ChatGPT-4 can solve this without the “counting step-by-step” prompt:
Prompt: How many times does the word “all” occur in the following sentence: “All the lice and all the mice were all very nice to all the mice and all the lice”?
Response:
The word “all” occurs 5 times in the given sentence. Here’s the sentence with the occurrences of “all” highlighted:
ChatGPT could almost do this with the same technique, but it occasionally forgot to count one word. Do you know how reliable Bing Chat is for bigger numbers?
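One way to probe bigger numbers would be to generate sentences with a known count and compare that count against the model's answer. This is only a rough sketch: `make_test_sentence` and its filler phrases are made up for illustration, and actually sending the prompt to a chat model is left out.

```python
def make_test_sentence(n: int, word: str = "all") -> str:
    """Build a sentence containing exactly `n` occurrences of `word`.

    The filler phrases are arbitrary and must not contain `word` themselves.
    """
    fillers = ["the lice and", "the mice were", "very nice to"]
    parts = [f"{word} {fillers[i % len(fillers)]}" for i in range(n)]
    return " ".join(parts).capitalize() + "."

# The expected count is known by construction, so it can serve as ground truth.
for n in (5, 12, 25):
    print(n, make_test_sentence(n))
```

Because the generated sentences are very repetitive, they may be easier or harder for a model than natural text, so results from them would only be suggestive.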