Some things that I feel undermine your case: your sample size is fairly small here, and it would have been valuable if you had sampled maybe 10–20 times for each. Also, these code snippets are either the kind of thing I’d expect to be in the training dataset, or are trivial. Plus, GPT-3 wasn’t used as a base model for AlphaCode, so the difference can’t be due to “fine-tuning and filtering tricks”. Finally, GPT-3 is way bigger than any AlphaCode model.
I had missed this step. In retrospect it should have been obvious: of course you don’t start from a huge text-predictor model to build a code-predictor model that only needs to predict compilable code. Thanks for the clarification.
I think the fact that GPT-3 is controlled by OpenAI and AlphaCode is a DeepMind project has more to do with it. Of course you don’t need to hot-start via transfer learning, but it’s a good idea anyway if you can, which is why DeepMind not using its own GPT-3 equivalent (Gopher, trained at considerable expense) has drawn comment.