O1 probably scales to superhuman reasoning:
O1, given maximal compute, solves most AIME questions (one of the hardest benchmarks in existence). If this isn’t gamed by having the solutions somewhere in the training corpus, then the remaining problem is compute cost, and there are several ways to bring that down:
-you can make the base model more efficient at thinking
-you can implement the base model more efficiently on hardware
-you can simply wait for hardware to get better
-you can create custom inference chips
Anything wrong with this view? I think agents get unlocked along with this, or shortly after.