Is there an explanation of how it works somewhere?
I haven’t seen a writeup anywhere of how it was trained.
Neither have I. I vaguely recall a call for volunteers on the Slack very early on to crowdsource instruction-following prompts and completions, and I speculate this might be the origin: the instruction series may simply be a model finetuned on a small corpus of hand-corrected or hand-written demonstrations of ‘following instructions’. If there’s any use of the fancy RL or preference-learning work, they haven’t mentioned it anywhere that I’ve seen. (In the most recent finetuning paper, none of the examples look like generic ‘instructions’.)