Are you saying that o1 did not do any better than 5-6% on your AIME-equivalent dataset? That would be interesting given that o1 did far better on the 2024 AIME which presumably was released after the training cutoff: https://openai.com/index/learning-to-reason-with-llms/