I wondered: what are O3 and O4-mini? Here's my guess at test-time scaling and how OpenAI names its models.
O0 (Base model)
↓
D1 (Outputs/labels generated with extended compute: search/reasoning/verification)
↓
O1 (Model trained on higher-quality D1 outputs)
↓
O1-mini (Distilled version - smaller, faster)
↓
D2 (Outputs/labels generated with extended compute: search/reasoning/verification)
↓
O2 (Model trained on higher-quality D2 outputs)
↓
O2-mini (Distilled version - smaller, faster)
↓
...
The point is to consistently apply additional compute at generation time to create better training data for each subsequent iteration. The models cycle from large -(distill)-> small -(search)-> large again (a toy sketch of the loop follows below).
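Here is a toy sketch of that loop, purely to make the speculation concrete. The helper functions (`generate_with_search`, `train`, `distill`) are hypothetical placeholders for real training infrastructure, not anything OpenAI has described.

```python
# Toy sketch of the speculated iteration loop (not OpenAI's actual pipeline).
# The helpers below are placeholder stubs, not real APIs.

def generate_with_search(model: str) -> list[str]:
    """Placeholder: spend extra inference-time compute (search / reasoning /
    verification) to produce higher-quality outputs from `model`."""
    return [f"high-quality sample from {model}"]

def train(model: str, dataset: list[str]) -> str:
    """Placeholder: train the next large model on the improved dataset."""
    return f"trained({model}, n={len(dataset)})"

def distill(model: str) -> str:
    """Placeholder: distill the large model into a smaller, faster one."""
    return f"mini({model})"

large = "O0"                                # base model
for i in range(1, 4):                       # O1 .. O3
    dataset = generate_with_search(large)   # D_i: better data via extra compute
    large = train(large, dataset)           # O_i: trained on D_i
    mini = distill(large)                   # O_i-mini: distilled version
    print(f"iteration {i}: O{i} and O{i}-mini ready")
```

Each pass spends compute once at data-generation time, then bakes the improvement into the next large model before distilling it down.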