When reading posts about AI development, I get the impression that many people follow a model in which the important variables are the data that's out there in the world, the available compute for model training, and the available training algorithms.
I think this underrates the importance of synthetic training data generation.
AlphaStar was bootstrapped from human replays, but it became much better than almost all human players largely through self-play, which is synthetic training data.
There's a common observation that you can't improve a standard LLM much by retraining it on random samples of its own output.
I think there's a good chance that training on the output of models that can reason, like o1 and o3, does allow for improvement.
Just as AlphaStar could generate the training data it needed to become superhuman on its own, it's possible that the same is true for models like o3, simply by throwing compute at them.
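To make the intuition concrete, here is a minimal sketch of what such a loop could look like. This is my own illustration, not any lab's actual pipeline: the function names (generate_with_reasoning, verify, fine_tune) are placeholders I'm making up, and it assumes a task where answers can be checked automatically. The point is that, unlike "random pieces of its own output", only reasoning traces that pass a verifier are kept, so extra compute turns into more verified synthetic training data.

```python
# Hypothetical sketch of a self-improvement loop on verifiable tasks.
# All function names are placeholders, not a real API.

def generate_with_reasoning(model, problem, n_samples=8):
    """Placeholder: sample several chain-of-thought attempts from the model."""
    return [model(problem) for _ in range(n_samples)]

def verify(problem, attempt):
    """Placeholder: check the final answer with a ground-truth verifier
    (unit tests, a math checker, a game outcome, ...)."""
    return attempt["answer"] == problem["solution"]

def self_improvement_round(model, problems, fine_tune):
    """One round: turn compute into verified synthetic training data."""
    synthetic_data = []
    for problem in problems:
        for attempt in generate_with_reasoning(model, problem):
            if verify(problem, attempt):
                # Only verified reasoning traces enter the training set,
                # which is what distinguishes this from retraining on
                # arbitrary model output.
                synthetic_data.append((problem["prompt"], attempt["trace"]))
                break
    return fine_tune(model, synthetic_data)
```

Whether this actually scales the way AlphaStar's self-play did depends on how far verifiable tasks generalize, but it shows where extra compute could plausibly be spent.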