My suspicion is that it might be better to think about kinetics rather than energetics. That is, the order in which things get learned seems important.
So it might be interesting to mathematically investigate questions like “given small random initialization, what determines the relative gradients towards different heuristics?” I would guess there’s some literature on this already. The only thing I can think of off the top of my head is infinite-width stuff that’s not super relevant, but probably someone has made other simplifying assumptions, like treating heuristics as fixed circuits with simple effects on the loss, and seen what happens.
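To make the "fixed circuits with simple effects on the loss" idea concrete, here is a rough toy sketch, not anything from the literature in particular. Everything in it is an assumption for illustration: each heuristic k is worth a fixed amount `s_k` of loss reduction, its contribution is gated by a product of two weights `a_k * b_k` (so the gradient vanishes as the weights go to zero), and all weights start at the same small value rather than a truly random one, so that only the strengths differ. The function name `learning_order` and all parameter values are made up.

```python
import numpy as np

def learning_order(strengths, init_scale=1e-3, lr=0.01, steps=5000):
    """Gradient descent on the toy loss L = 0.5 * sum_k (s_k - a_k * b_k)**2.

    Assumed setup (illustrative only): heuristic k is fully learned when
    a_k * b_k == s_k. Returns, per heuristic, the first step at which
    a_k * b_k reaches half of s_k, or -1 if that never happens.
    """
    s = np.asarray(strengths, dtype=float)
    # Same small init for every weight, so only s_k distinguishes heuristics.
    a = np.full_like(s, init_scale)
    b = np.full_like(s, init_scale)
    half_time = np.full(s.shape, -1, dtype=int)
    for step in range(steps):
        err = s - a * b
        grad_a = -err * b  # dL/da_k
        grad_b = -err * a  # dL/db_k
        a, b = a - lr * grad_a, b - lr * grad_b
        # Record the first step at which each heuristic is half learned.
        newly = (half_time < 0) & (a * b >= 0.5 * s)
        half_time[newly] = step
    return half_time

strengths = [4.0, 2.0, 1.0]
half_times = learning_order(strengths)
print(dict(zip(strengths, half_times.tolist())))
```

In this setup the initial gradient towards heuristic k is proportional to `s_k` times the init scale, so the early growth rate of each product scales with `s_k` and the stronger heuristics should cross the halfway mark first. That is the kinetics picture in its simplest form: every heuristic ends up at the same kind of minimum, but the order of arrival is set by the relative gradients at initialization.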