The way o1's performance falls off much faster than o3's as ARC-AGI problem size grows is significant evidence that o3 is built on a different base model than o1, one with better long-context training or different handling of attention in the model architecture. So probably a post-trained Orion/GPT-4.5o.