Ethan finds empirically that neural network scaling laws (performance vs. size, data, and other quantities) are characterised by functions that look piecewise linear on a log-log plot, and postulates that a “sharp left turn” describes a transition from a slower to a faster scaling regime. He also postulates that it might be predictable in advance using his functional form for scaling.
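For intuition, here is a minimal sketch of a smoothly broken power law (a toy form with made-up parameter names, not necessarily the exact functional form Ethan fits) that looks piecewise linear on a log-log plot, with a slower scaling regime before the break and a faster one after:

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy smoothly broken power law: before x_break the loss falls roughly as
# x^(-slope_before); after it, roughly as x^(-slope_after). On a log-log plot
# this looks like two straight segments joined at the break.
def smoothly_broken_power_law(x, a=1.0, slope_before=0.2, slope_after=0.8,
                              x_break=1e6, sharpness=5.0):
    # `sharpness` controls how abrupt the transition between regimes appears.
    return (a * x**(-slope_before)
            * (1.0 + (x / x_break)**sharpness)
              ** (-(slope_after - slope_before) / sharpness))

x = np.logspace(3, 9, 200)          # e.g. dataset size or compute
y = smoothly_broken_power_law(x)    # e.g. test loss

plt.loglog(x, y)
plt.xlabel("scale (e.g. data or compute)")
plt.ylabel("loss")
plt.title("Two power-law regimes joined at a break (piecewise linear in log-log)")
plt.show()
```

If the break location and the post-break slope can be fit from data collected before the break, then in principle the transition to the faster regime is predictable in advance, which is the claim being discussed here.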
I give a crisp definition from 6:27 to 7:50 of this video:
You drew a right turn; the post is asking about a left turn.