Lukas Finnveden comments on OpenAI: “Scaling Laws for Transfer”, Hernandez et al.

Lukas Finnveden 4 Feb 2021 12:59 UTC
It’s worth noting that their language model still uses BPEs (byte-pair encodings), and as far as I can tell the encoding is completely optimised for English text rather than code (see section 2). It seems like this should make coding unusually hard compared to the pretraining task, but it might also make pretraining more useful, as the model needs time to figure out how the encoding works.
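To make the tokenisation point concrete, here is a minimal sketch of a toy BPE learned only from English words and then applied to a line of code. It is not the tokenizer used in the paper: the tiny corpus, the code snippet, and the helper names (`learn_bpe`, `merge_pair`, `encode`, `tokens_per_char`) are all assumptions made up for illustration. The only thing it is meant to show is that merges learned from English leave code fragmented into many more tokens per character.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from the whitespace-separated words of `corpus`."""
    words = Counter(tuple(w) for w in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single token
        best = max(pairs, key=pairs.get)
        merges.append(best)
        words = Counter({tuple(merge_pair(list(s), best)): f for s, f in words.items()})
    return merges


def encode(text, merges):
    """Tokenise `text` by applying the learned merges, in order, to each whitespace chunk."""
    tokens = []
    for word in text.split():
        symbols = list(word)
        for pair in merges:
            symbols = merge_pair(symbols, pair)
        tokens.extend(symbols)
    return tokens


def tokens_per_char(text, merges):
    """Tokens emitted per non-whitespace character (lower means better compression)."""
    n_chars = sum(len(w) for w in text.split())
    return len(encode(text, merges)) / n_chars


# Hypothetical toy data: the merges are learned from English only.
english = "the model needs time to figure out how the encoding works"
code = "for i in range(n): total += x[i] * w[i]"
merges = learn_bpe(english, num_merges=200)

print("english tokens/char:", round(tokens_per_char(english, merges), 3))
print("code tokens/char:   ", round(tokens_per_char(code, merges), 3))
print("code tokens:", encode(code, merges))
```

In this toy setup every English word collapses to a single token, while brackets, operators and short identifiers in the code line each stay separate, so the same model would need more positions per character of code. That is the sense in which an English-optimised encoding could make the coding task harder than the pretraining task.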