For example, we get O1 to solve a bunch of mathematical lemmas whose proofs haven't been recorded anywhere, then train the next model on those solutions.
Would there have to be human vetting to check that O1’s solutions are correct? The practicality of that would depend on the scale, but you don’t want to end up with a blurry JPEG of a blurry JPEG of the internet.
For mathematical lemmas you can formalize them in a language like Lean and have correctness checked automatically. Access to ground truth is therefore even clearer than for programming; the main issue is probably finding a large supply of sensible formalized statements that the system is actually capable of proving.
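To make the "automatic checking" point concrete, here is a minimal sketch in Lean 4 of what a machine-checkable lemma looks like (the theorem name and the particular statement are just illustrative). Lean's kernel verifies the proof term, so any candidate proof that type-checks is correct by construction:

```lean
-- A toy formalized lemma: addition on natural numbers is commutative.
-- If a model emits a proof that Lean's kernel accepts, the proof is
-- correct by construction; no human vetting of the proof is needed.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

The hard part the comment points to is upstream of this: assembling a large stock of such statements that are worth proving in the first place.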