Additional resources, thanks to Avery:
COT Scaling implies slower takeoff speeds (Zoellner, 10 min)
o1: A Technical Primer (Hoogland, 20 min)
Unpacking o1 and the Path to AGI (Brown, up to 8:38)
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (DeepMind, Snell, 20 min to skim)
Speculations on Test-Time Scaling (Rush, from 4:25 to 21:25, 17 min total)
Did you mean to link to my specific comment for the first link?
Ah, that’s a mistake. Our bad.