Tamay comments on Is AI Progress Impossible To Predict?

Tamay 15 May 2022 20:30 UTC
25 points
Thanks! At least for Gopher, if you look at correlations between reductions in log-error (which I think the scaling laws literature suggests would be the more natural framing), you find a tighter relationship, particularly when looking at the relatively smaller models.
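For anyone who wants to try this kind of comparison, here is a rough sketch of one way to compute it. It assumes "log-error" means log(1 - accuracy), which is a guess at the definition rather than something confirmed in the thread. The accuracies are made-up placeholders, not Gopher results, and the helper name `log_error_reduction` is hypothetical.

```python
import numpy as np

def log_error_reduction(acc_before, acc_after):
    """Per-task drop in log(1 - accuracy) going from one model to the next.

    Assumption: error is defined as 1 - accuracy, with accuracy strictly below 1.
    """
    err_before = 1.0 - np.asarray(acc_before, dtype=float)
    err_after = 1.0 - np.asarray(acc_after, dtype=float)
    return np.log(err_before) - np.log(err_after)

# Made-up per-task accuracies for three model sizes (placeholders, not real data).
acc_small = [0.30, 0.40, 0.50, 0.25]
acc_medium = [0.35, 0.50, 0.55, 0.26]
acc_large = [0.45, 0.65, 0.62, 0.28]

# Reduction in log-error over the cheaper step and over the more expensive step.
first_step = log_error_reduction(acc_small, acc_medium)
second_step = log_error_reduction(acc_medium, acc_large)

# Pearson correlation across tasks between the two successive reductions.
r = np.corrcoef(first_step, second_step)[0, 1]
print(f"correlation of successive log-error reductions: {r:.2f}")
```

Halving the error on a task (for example, accuracy going from 0.5 to 0.75) gives a reduction of log 2 under this definition, regardless of the starting accuracy.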
- alyssavance 15 May 2022 21:07 UTC
  9 points
  Thanks! Is the important thing there log-error, though, or just that if the absolute performance difference between models is small enough, then the difference in task performance between the two is noise (as in parallel runs of the same model) and you do wind up reverting to the mean?
  I can’t get the image to display, but here’s an example of how you get a negative correlation if your runs are random draws from the same Gaussian:
  https://i.imgur.com/xhtIX8F.png
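The same-Gaussian point can be checked with a quick simulation. This is a minimal sketch with assumed settings (three "model sizes", a standard normal, and an arbitrary seed and task count), not a reconstruction of the linked image. If every per-task score is an independent draw from the same Gaussian, the gain from the first to the second model and the gain from the second to the third share the middle draw with opposite signs, so their correlation should come out close to -0.5.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n_tasks = 100_000               # assumed number of tasks; large to reduce sampling noise

# Three "model sizes" whose per-task scores are all independent draws
# from the same Gaussian, i.e. there is no real improvement at all.
scores = rng.normal(loc=0.0, scale=1.0, size=(3, n_tasks))

# Successive per-task "gains" between adjacent models.
first_gain = scores[1] - scores[0]
second_gain = scores[2] - scores[1]

# Pearson correlation across tasks between the two gains.
r = np.corrcoef(first_gain, second_gain)[0, 1]
print(f"correlation between successive gains: {r:.3f}")
```

The -0.5 follows from the covariance of the two gains being minus the variance of the middle draw, while each gain has twice that variance.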
  - Tamay 15 May 2022 21:19 UTC
    1 point
    I’m not sure what you mean; I’m not looking at log-odds. Maybe the correlation is an artefact from noise being amplified in log-space (I’m not sure), but it’s not obvious to me that this isn’t the correct way to analyse the data.
- Lukas Finnveden 18 May 2022 12:22 UTC
  6 points
  Here’s the corresponding graph for the non-logged difference, which also displays a large correlation.
- ESRogs 15 May 2022 21:21 UTC
  5 points
  Nitpick: wouldn’t this graph be much more natural with the x and y axes reversed? I’d want to input the reduction in log-error over a cheaper compute regime to predict the reduction in log-error over a more expensive one.
- Neel Nanda 15 May 2022 22:37 UTC
  4 points
  How much does this change when you remove the big outlier in the top left?