Anthropic’s scaling power laws are, in a certain sense, unsurprising if you know how ubiquitous those kinds of relationships are given certain underlying dynamics (e.g. cost-minimization dynamics).
Also unsurprising from the comp-mech point of view I’m told.
For the first one: I’m currently building a suite of long-running games/tasks to generate streams of data from LLMs (and eventually from other kinds of algorithms too, like basic RL and genetic algorithms). I’m running techniques borrowed from financial analysis, signal processing, etc. on those streams, based on intuitions built from experience with models as well as what other nearby fields do.
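As a hedged illustration of the kind of analysis meant here (the source doesn’t specify which techniques are used), one standard signal-processing diagnostic you might run on such a stream is the sample autocorrelation function. The stream below is a synthetic stand-in, not real model output:

```python
import numpy as np

def autocorrelation(series: np.ndarray, max_lag: int) -> np.ndarray:
    """Sample autocorrelation of a 1-D series at lags 0..max_lag,
    normalized by the lag-0 variance (so values lie in [-1, 1])."""
    x = series - series.mean()
    var = np.dot(x, x)
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / var for k in range(max_lag + 1)])

# Toy stand-in for a per-step score stream from a long-running game/task.
rng = np.random.default_rng(0)
stream = np.cumsum(rng.normal(size=500))  # random walk, purely for illustration

acf = autocorrelation(stream, max_lag=20)
print(acf[:3])  # lag-0 value is 1 by construction; slow decay suggests long memory
```

Slowly decaying autocorrelation is one of the simplest signatures financial analysts look for before reaching for heavier tools (spectral estimates, Hurst exponents, and so on).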
I’m curious about the technical details here, if you’re willing to provide them (privately is fine too).
Yeah, I’d be happy to.
I’m working on a post about it as well, and I hope to make it so others can try experiments of their own, but I can DM you.