Am I just inexperienced or confused, or is this paper using a lot of words to say effectively very little? Sure, this functional form works fine for a given set of regimes of scaling, but it effectively gives you no predictive ability to determine when the next break will occur.
Sorry if this is overly confrontational, but I keep seeing this paper on Twitter and elsewhere and I’m not sure I understand why.
When f (in equation 1 of the paper, https://arxiv.org/abs/2210.14891 , not the video) of the next break is sufficiently large, it does give you predictive ability to determine when that next break will occur, though the number of seeds needed to get such predictive ability is very large. When f of the next break is sufficiently small (& nonnegative), it does not give you predictive ability to determine when that next break will occur.
Play around with fi in this code to see what I mean:
https://github.com/ethancaballero/broken_neural_scaling_laws/blob/main/make_figure_1__decomposition_of_bnsl_into_power_law_segments.py#L25-L29
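To make the point concrete, here is a minimal sketch (not the repo's script; the constants a, b, c0, c1 and the break location d1 are made-up illustrative values) of equation 1 with a single break, showing how the size of f_1 controls how far in advance of x = d_1 the break becomes visible in the curve:

```python
# Sketch of BNSL equation 1 with one break:
#   y = a + b * x^(-c0) * prod_i (1 + (x / d_i)^(1/f_i))^(-c_i * f_i)
# Small f_1 -> the break is sharp and the curve hugs the first power law
# until essentially x = d_1, so extrapolation from before the break can't see it coming.
# Large f_1 -> the transition is gradual, so the curve starts bending well before d_1.
import numpy as np
import matplotlib.pyplot as plt

def bnsl(x, a, b, c0, breaks):
    """Broken neural scaling law; `breaks` is a list of (c_i, d_i, f_i) tuples."""
    y = b * x ** (-c0)
    for c_i, d_i, f_i in breaks:
        y *= (1.0 + (x / d_i) ** (1.0 / f_i)) ** (-c_i * f_i)
    return a + y

x = np.logspace(0, 6, 500)
d1 = 1e4  # illustrative break location
for f1 in (0.05, 0.5, 2.0):
    y = bnsl(x, a=0.0, b=1.0, c0=0.3, breaks=[(0.5, d1, f1)])
    plt.loglog(x, y, label=f"f_1 = {f1}")
plt.axvline(d1, linestyle="--", color="gray", label="break location d_1")
plt.xlabel("x (e.g. training compute or dataset size)")
plt.ylabel("y (e.g. test loss)")
plt.legend()
plt.show()
```

With f_1 = 2.0 the slope is already changing an order of magnitude or more before d_1, whereas with f_1 = 0.05 the curve looks like a single power law right up to the break, which is the distinction the reply above is drawing.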