Depending on how far inference scaling laws go, the situation might be worse still. Picture Llama-4-o1 scaffolds that anyone can run for arbitrarily long (as long as they have the money/compute) to autonomously do ML research on ways to improve Llama-4-o1 and its open-weights descendants, with those improvements again applicable to autonomous ML research. Fortunately, lack of access to enough compute to pretrain the next-gen model is probably a barrier for most actors, but this still seems like a pretty scary situation to be in, and increasingly so with every open-weights improvement.
Now I’m picturing that, and I don’t like it.
Excellent point that these capabilities will contribute to further advances that compound the rate of progress.
Worse yet, I have it on good authority that a technique much like the one o1 is thought to use can be applied to open-source models at very low cost and with little human effort. It’s unclear how effective it is at those low levels of cost and effort, but it’s definitely useful, and so likely scalable to intermediate-sized projects.
Here’s hoping that the terrifying acceleration of proliferation and progress is balanced by the inherent ease of aligning LLM agents, and by a relatively slow takeoff speed, giving us at least a couple of years to get our shit halfway together, including via the automated alignment techniques you focus on.
Related:
And more explicitly, from GDM’s Frontier Safety Framework, pages 5-6:
Image link is broken.