Archimedes comments on COT Scaling implies slower takeoff speeds

Archimedes 29 Sep 2024 3:28 UTC
3 points
0
Kudos for referencing actual numbers. I don’t think it makes sense to measure humans in terms of tokens, but I don’t have a better metric handy. Tokens obviously aren’t all equivalent either. For some purposes, a small fast LLM is more way efficient than a human. For something like answering SIMPLEBENCH, I’d guess o1-preview is less efficient while still significantly below human performance.