I’ve heard rumors that people are interpreting the highlighted papers as “huh, large models aren’t that good at writing code, they don’t even solve introductory problems”. (Note that these are only rumors; I don’t know of any specific people who take this interpretation.)
I don’t buy this interpretation, because these papers didn’t make the biggest, most obvious improvement: actually training on a large dataset of code (i.e. GitHub), as in Codex. My reaction to these papers is more like “wow, even models trained on language are weirdly good at writing code, given that they were trained to produce language; imagine how good they must be when trained on GitHub”.