There have been comments from OAI staff that o1 is “GPT-2 level” so I wonder if it’s a similar size?
I think they meant that as an analogy to how developed/sophisticated it was (ie they’re saying that it’s still early days for reasoning models and to expect rapid improvement), not that the underlying model size is similar.
IIRC OAers also said somewhere (doesn’t seem to be in the blog post, so maybe this was on Twitter?) that o1 or o1-preview was initialized from a GPT-4 (a GPT-4o?), so that would also rule out a literal parameter-size interpretation (unless OA has really brewed up some small models).
There was an article about it before the release: https://archive.is/IwKSP

“At the same meeting, company leadership gave a demonstration of a research project involving its GPT-4 AI model that OpenAI thinks shows some new skills that rise to human-like reasoning, according to a person familiar with the discussion who asked not to be identified because they were not authorized to speak to press.”

(Relevant, although “involving its GPT-4 AI model” is a considerably weaker statement than ‘initialized from a GPT-4 checkpoint’.)