This is super interesting. Are you able to share the underlying data?
I just got it from the papers and ran a linear regression, using pdftables.com to convert from PDF to Excel. I used pages 68 and 79 in the Gopher paper:
https://arxiv.org/pdf/2112.11446.pdf
Page 35 in the Chinchilla paper:
https://arxiv.org/pdf/2203.15556.pdf
Pages 79 and 80 in the PaLM paper:
https://arxiv.org/pdf/2204.02311.pdf
Thanks, though I was hoping for something like a Google Sheet containing the data.
OK, here’s a Google sheet I just threw together: https://docs.google.com/spreadsheets/d/1Y_00UcsYZeOwRuwXWD5_nQWAJp4A0aNoySW0EOhnp0Y/edit?usp=sharing
Thanks! At least for Gopher, if you look at correlations between reductions in log-error (which I think the scaling-laws literature suggests is the more natural framing), you find a much tighter relationship, particularly when looking at the relatively smaller models.
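For concreteness, this is the transformation I mean (a sketch with made-up accuracies, not the numbers from the papers):

```python
import numpy as np

# Hypothetical per-task accuracies for a smaller and a larger model (fractions in [0, 1]).
acc_small = np.array([0.30, 0.55, 0.70, 0.90])
acc_big = np.array([0.35, 0.60, 0.80, 0.95])

# Reduction in log-error: log(error_small) - log(error_big), where error = 1 - accuracy.
# A positive value means the bigger model shrank that task's error multiplicatively,
# which is the quantity scaling-law fits tend to be linear in.
log_error_reduction = np.log(1 - acc_small) - np.log(1 - acc_big)
print(np.round(log_error_reduction, 3))
```

Correlating these reductions across compute jumps, rather than raw accuracy deltas, is what tightens the relationship.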
Thanks! Is the important thing there log-error, though, or just that if the absolute performance difference between models is small enough, then the differences in task performance between the two are noise (as in parallel runs of the same model), and you do wind up reverting to the mean?
I can’t get the image to display, but here’s an example of how you get a negative correlation if your runs are random draws from the same Gaussian:
https://i.imgur.com/xhtIX8F.png
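In case the image stays broken, here's a minimal simulation of the same point (made-up draws, not the benchmark data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three independent "runs": iid draws from the same Gaussian across 2000 tasks.
a, b, c = rng.normal(size=(3, 2000))

# Successive "improvements" are then pure noise...
d1 = b - a  # apparent gain from run A to run B
d2 = c - b  # apparent gain from run B to run C

# ...and because they share the middle run with opposite signs,
# they correlate at about -0.5 even though nothing real changed.
r = np.corrcoef(d1, d2)[0, 1]
print(round(r, 2))
```

So a negative correlation between consecutive gains is exactly what regression to the mean predicts when the runs are statistically identical.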
I’m not sure what you mean; I’m not looking at log-odds. Maybe the correlation is an artefact from noise being amplified in log-space (I’m not sure), but it’s not obvious to me that this isn’t the correct way to analyse the data.
Here’s the corresponding graph for the non-logged difference, which also displays a large correlation.
Nitpick: wouldn’t this graph be much more natural with the x and y axes reversed? I’d want to input the reduction in log-error over a cheaper compute regime to predict the reduction in log-error over a more expensive one.
How much does this change when you remove the big outlier in the top left?
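A quick way to check that kind of sensitivity (sketch with made-up numbers standing in for the real points, including one top-left outlier):

```python
import numpy as np

# Illustrative data: five well-behaved points plus one top-left outlier
# (very negative x, large y), mimicking the point in question.
x = np.array([0.1, 0.2, 0.3, 0.4, 0.5, -0.6])
y = np.array([0.1, 0.25, 0.3, 0.45, 0.5, 0.9])

# Correlation with everything included.
r_all = np.corrcoef(x, y)[0, 1]

# Drop the top-left point (smallest x here) and recompute.
keep = np.ones(len(x), dtype=bool)
keep[np.argmin(x)] = False
r_trimmed = np.corrcoef(x[keep], y[keep])[0, 1]

print(round(r_all, 2), round(r_trimmed, 2))
```

A single high-leverage point like that can flip the sign of the correlation, which is why the question matters.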