peterbarnett comments on Daniel Kokotajlo’s Shortform

peterbarnett 10 Jul 2024 18:55 UTC
7 points
The report does say that the AI will likely be trained with a bunch of pre-training before the RL:
Even before Alex is ever trained with RL, it already has a huge amount of knowledge and understanding of the world from its predictive and imitative pretraining step.
The HFDT is what makes it a “generally competent creative planner” and capable of long-horizon open-ended tasks.
Do you think most future capabilities will continue to come from scaling pretraining, rather than something like HFDT? (There is obviously some fuzziness when talking about where “most capabilities come from”, but I think the capability to do long-horizon open-ended tasks will reasonably be thought of as coming from the HFDT or a similar process rather than the pretraining.)
- Bogdan Ionut Cirstea 10 Jul 2024 19:55 UTC
  7 points
  The HFDT is what makes it a “generally competent creative planner” and capable of long-horizon open-ended tasks.
  
  I’m not entirely sure how to interpret this, but my impression from playing with LMs (which also seems close to something like folk wisdom) is that they are already creative enough and quite competent at coming up with high-level plans; they’re just not reliable enough for long-horizon open-ended tasks.
  
  Do you think most future capabilities will continue to come from scaling pretraining, rather than something like HFDT? (There is obviously some fuzziness when talking about where “most capabilities come from”, but I think the capability to do long-horizon open-ended tasks will reasonably be thought of as coming from the HFDT or a similar process rather than the pretraining.)
  I would probably expect a mix: more single-step reliability, mostly from pre-training (at least until good-quality text data runs out), plus something like self-correction / self-verification. For the latter, I’m less sure where most of the gains would come from, and could see e.g. training on synthetic data with automated verification contributing more.