It would be interesting to train GPT-4 on the raw weights of the GPT-3 neural net (its weight tables), so that it could output the code for larger networks.
You wouldn’t be able to do that, because the raw weights would require context windows of millions or billions of tokens. Meta-learning fast weights requires more tailored approaches; a good recent example is the meta-learning diffusion model “G.pt”. (Yes, that is really its name: possibly the worst-named DL result of 2022.)
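A back-of-the-envelope calculation makes the scale mismatch concrete. GPT-3’s ~175 billion parameter count is public; the one-token-per-weight encoding and the 32k context size below are illustrative assumptions, not properties of any real tokenizer or model:

```python
# Sketch of why raw weights can't fit in a context window.
# GPT-3's parameter count is public; tokens_per_weight and
# context_window are assumed values for illustration only.
gpt3_params = 175_000_000_000   # ~175 billion parameters
tokens_per_weight = 1           # optimistic: one token per serialized weight
context_window = 32_768         # assumed context size

tokens_needed = gpt3_params * tokens_per_weight
ratio = tokens_needed / context_window
print(f"tokens needed:  {tokens_needed:,}")
print(f"overshoot:      {ratio:,.0f}x the assumed context window")
```

Even under the most generous encoding, the weights alone overshoot such a context window by a factor of several million, which is why purpose-built meta-learning methods are needed instead.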