Charlie Steiner comments on Are Emergent Abilities of Large Language Models a Mirage? [linkpost]

Charlie Steiner 2 May 2023 22:56 UTC
6 points
Interesting stuff. The nonlinearity that comes from requiring a long sequence of tokens to all be correct doesn’t seem like a fatal objection to measuring long sequences, because often we’re interested in capabilities that really do require getting long sequences all correct. But from the perspective of predicting capabilities, this is definitely a point for team straight-lines-on-graphs.
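To make the nonlinearity concrete, here is a toy sketch. It assumes each token is correct independently with the same probability, which is my simplifying assumption and not something real models satisfy exactly; the function names are made up for illustration.

```python
import math


def exact_match_prob(per_token_acc: float, seq_len: int) -> float:
    """Chance of getting every token right, assuming independent per-token errors."""
    return per_token_acc ** seq_len


def log_exact_match(per_token_acc: float, seq_len: int) -> float:
    """Same quantity in log space: seq_len * log(per_token_acc), linear in log accuracy."""
    return seq_len * math.log(per_token_acc)


# Sweep a smoothly improving per-token accuracy and compare short vs. long targets.
for p in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(
        f"per-token={p:.2f}  "
        f"len-1={exact_match_prob(p, 1):.3f}  "
        f"len-10={exact_match_prob(p, 10):.3f}  "
        f"len-50={exact_match_prob(p, 50):.3f}"
    )
```

Under this toy model, moving per-token accuracy from 0.9 to 0.99 takes length-10 exact match from roughly 0.35 to roughly 0.90, which looks like a jump even though the underlying quantity moved smoothly. In log space the same curve is a straight line in log per-token accuracy, which is the sense in which the straight-line extrapolation still works.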