[Question] Will Orion/Gemini 2/Llama-4 outperform o1?

What’s your bet on the next frontier models (Orion, Gemini 2, Llama-4) vs o1 in coding, math and logical reasoning benchmarks?

Will they have:

  • Better performance

  • Similar performance

  • Worse performance

Curious to hear your answers…

For OpenAI, the question is whether the increase in size and the training on synthetic data will be enough to beat the teaching model without test-time compute.

In the comments there are some clarifications about what I mean by “next-frontier” models.
