nikola comments on DeepSeek beats o1-preview on math, ties on coding; will release weights

nikola 21 Nov 2024 0:29 UTC
8 points
2
One weird detail I noticed is that in DeepSeek’s results, they claim GPT-4o’s pass@1 accuracy on MATH is 76.6%, but OpenAI claims it’s 60.3% in their o1 blog post. This is quite confusing, as a 16.3-point difference seems hard to explain with different training checkpoints of 4o.
- Lech Mazur 21 Nov 2024 5:14 UTC
  6 points
  0
  It seems that 76.6% originally came from the GPT-4o announcement blog post. I’m not sure why it dropped to 60.3% by the time of o1’s blog post.