(crossposted from X/twitter)
Epoch is one of my favorite orgs, but I expect many of the predictions in https://epochai.org/blog/interviewing-ai-researchers-on-automation-of-ai-rnd to be overly conservative / too pessimistic. I expect roughly the compute scaleup described in https://x.com/peterwildeford/status/1825614599623782490 (training runs ~1000x larger than GPT-4's in the next 3 years), and massive progress in both coding and math (e.g. along the lines of the medians in https://metaculus.com/questions/6728/ai-wins-imo-gold-medal/ and https://metaculus.com/questions/12467/ai-wins-ioi-gold-medal/), which seem to me like some of the most important domains for automated ML research, both in terms of prerequisite capabilities and for transfer reasons.

Furthermore, along the lines of previous Epoch analysis in https://epochai.org/blog/trading-off-compute-in-training-and-inference, I expect large chunks of ML research (e.g. post-training) to be differentially automatable by spending more inference compute, because they have relatively accurate and cheap proxy feedback signals (e.g. accuracy on various benchmarks). I also expect the ability to generate verifiable synthetic code and math data to train on to contribute to math/code/reasoning capabilities more broadly (as is already rumored about Q*/Strawberry). And finally, I wouldn't be too surprised if even the fuzzier parts of ML research workflows, like ideation, turned out to be at least somewhat automatable, along the lines of https://sakana.ai/ai-scientist/, especially for ML research which takes relatively little compute to iterate on (e.g. large parts of post-training).
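To make the inference-compute point concrete, here is a minimal toy sketch of best-of-n sampling against a cheap proxy signal, which is one simple way to convert extra inference compute into better outputs. Everything here is hypothetical scaffolding: `generate` stands in for a model proposing a candidate (e.g. a post-training config or a code patch), and `proxy_score` stands in for a cheap, relatively accurate feedback signal (e.g. benchmark accuracy or passing unit tests).

```python
import random

def generate(rng: random.Random) -> float:
    """Stand-in for sampling one candidate from a model.

    Here a candidate is just its latent quality in [0, 1); in practice it
    would be an artifact (code, config, proof attempt) whose quality is unknown.
    """
    return rng.random()

def proxy_score(candidate: float) -> float:
    """Cheap proxy feedback signal used to rank candidates.

    In this toy version it observes quality directly; in practice it would be
    a noisy but cheap evaluator like a benchmark score or a test suite.
    """
    return candidate

def best_of_n(n: int, seed: int = 0) -> float:
    """Spend n samples of inference compute, keep the best by proxy score."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=proxy_score)

# Larger n (more inference compute) can only improve the selected candidate
# for a fixed sampling stream, since the n=1 candidate is among the n=64 ones.
print(best_of_n(1), best_of_64 := best_of_n(64))
```

The point of the sketch is just that domains with cheap, accurate verifiers (code, math, benchmarked post-training) are exactly the ones where this kind of compute-for-quality trade is easy, which is why I'd expect them to be differentially automatable.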