GPT-4 scored 700/800 on the SAT math test. I don’t think a 12-year-old would get such a score.
I didn’t say “it’s worse than a 12 y/o at any math task”. I meant nonstandard problems. Perhaps that’s the wrong English terminology? Something like easy olympiad problems?
The actual test that I performed was “take several easy problems from a math circle for 12 y/o and try various ‘let’s think step-by-step’ prompts to make Bing write solutions”.
Example of such a problem:
Between 20 poles, several ropes are stretched (each rope connects two different poles; there is no more than one rope between any two poles). It is known that at least 15 ropes are attached to each pole. The poles are divided into groups so that each rope connects poles from different groups. Prove that there are at least four groups.
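For readers who want to see why the answer is four, here is a minimal sketch of one way the counting argument can go. It is offered as an illustration, not as the math circle's official solution, and the function names are hypothetical. The idea: no rope joins two poles of the same group, so a pole in a group of size k has all of its (at least 15) ropes going to the other 20 - k poles, which forces k ≤ 5 and therefore at least 20 / 5 = 4 groups.

```python
from math import ceil


def max_group_size(n_poles: int, min_degree: int) -> int:
    # A pole has no ropes inside its own group of size k, so its
    # min_degree or more neighbours fit among the n_poles - k other poles.
    # Hence k <= n_poles - min_degree.
    return n_poles - min_degree


def min_groups(n_poles: int, min_degree: int) -> int:
    # Every group holds at most max_group_size poles, so this many are needed.
    return ceil(n_poles / max_group_size(n_poles, min_degree))


def complete_multipartite_degrees(group_sizes):
    # Degrees when every pair of poles from different groups is roped together.
    total = sum(group_sizes)
    return [total - size for size in group_sizes for _ in range(size)]


print(min_groups(20, 15))                                  # 4
print(min(complete_multipartite_degrees([5, 5, 5, 5])))    # 15
```

The second print checks that the bound is tight: four groups of five poles, with ropes between every pair of poles in different groups, gives every pole exactly 15 ropes.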
Most 12-year-olds are not going to be able to solve that problem.
Yeah, you are right. It seems that it was actually one of the harder ones I tried. This particular problem was solved by 4 of 28 members of a relatively strong group. I distinctly remember also trying some easy problems from a relatively weak group, but I don’t have notes and Bing doesn’t save chats.
I guess I should just try again, especially in light of gwillen’s comment. (By the way, if somebody with access to actual GPT-4 is willing to help me with testing it on some math problems, I’d really appreciate it.)
It’s extremely important in discussions like this to be sure of what model you’re talking to. Last I heard, Bing in the default “balanced” mode had been switched to GPT-3.5, presumably as a cost-saving measure.
That would explain a lot. I’ve heard this rumor, but when I tried to trace the source, I couldn’t find anything better than guesses. So I dismissed it, but maybe I shouldn’t have. Do you have a better source?