ChristianKl comments on Stupid Questions—April 2023

ChristianKl 7 Apr 2023 0:15 UTC
8 points
6
GPT-4 scored 700/800 on the SAT math test. I don’t think a 12-year-old gets such a score.
- AVoropaev 7 Apr 2023 0:30 UTC
  1 point
  0
  I didn’t say “it’s worse than a 12-year-old at any math task”. I meant nonstandard problems, something like easy olympiad problems. Perhaps that’s the wrong English terminology?
  The actual test I performed was “take several easy problems from a math circle for 12-year-olds and try various ‘let’s think step-by-step’ prompts to make Bing write solutions”.
  Example of such a problem:
  Between 20 poles, several ropes are stretched (each rope connects two different poles; there is no more than one rope between any two poles). It is known that at least 15 ropes are attached to each pole. The poles are divided into groups so that each rope connects poles from different groups. Prove that there are at least four groups.
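  For reference, here is one short counting argument for this problem. It is a sketch added for clarity and is not taken from the thread; the symbols $V_1, \dots, V_k$ (the groups), $k$ (the number of groups), and $\deg(v)$ (the number of ropes attached to pole $v$) are notation assumed here. Since no rope joins two poles in the same group, every rope attached to a pole $v \in V_i$ goes to one of the $20 - |V_i|$ poles outside $V_i$:

  ```latex
  \begin{align*}
  v \in V_i \;&\Rightarrow\; 15 \le \deg(v) \le 20 - |V_i| \;\Rightarrow\; |V_i| \le 5, \\
  20 = \sum_{i=1}^{k} |V_i| \le 5k \;&\Rightarrow\; k \ge 4.
  \end{align*}
  ```

  The bound cannot be improved: four groups of five poles, with a rope between every two poles from different groups, give each pole exactly 15 ropes.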
  - ChristianKl 8 Apr 2023 13:43 UTC
    6 points
    4
    Most 12-year-olds are not going to be able to solve that problem.
    - AVoropaev 8 Apr 2023 20:49 UTC
      1 point
      0
      Yeah, you are right. It seems that it was actually one of the harder ones I tried. This particular problem was solved by 4 of 28 members of a relatively strong group. I distinctly remember also trying some easy problems from a relatively weak group, but I don’t have notes, and Bing doesn’t save chats.
      I guess I should just try again, especially in light of gwillen’s comment. (By the way, if somebody with access to the actual GPT-4 is willing to help me test it on some math problems, I’d really appreciate it.)
  - gwillen 8 Apr 2023 4:06 UTC
    3 points
    0
    It’s extremely important in discussions like this to be sure of what model you’re talking to. Last I heard, Bing in the default “balanced” mode had been switched to GPT-3.5, presumably as a cost-saving measure.
    - AVoropaev 8 Apr 2023 20:31 UTC
      1 point
      0
      That would explain a lot. I’ve heard this rumor, but when I tried to trace the source, I didn’t find anything better than guesses. So I dismissed it, but maybe I shouldn’t have. Do you have a better source?