ryan_greenblatt comments on Run evals on base models too!

ryan_greenblatt 4 Apr 2024 19:17 UTC
LW: 15 AF: 8
METR (formerly ARC Evals) included results on base models in their recent work “Measuring the impact of post-training enhancements” (where “post-training enhancements” means elicitation). They found that GPT-4-base performed poorly with their scaffold and prompting.

I believe the prompting they used included a large number of few-shot examples (perhaps 10?), so it should be a vaguely reasonable setup for base models. (Though I do expect that elicitation which is more specialized to base models would work better.)

I predict that base models will consistently do worse on tasks that labs care about (software engineering, agency, math) than models which have gone through post-training, particularly models which have gone through post-training aimed just at improving capabilities and improving the extent to which the model follows instructions (instruction tuning).

My overall sense is that there is plausibly a lot of low hanging fruit in elicitation, but I’m pretty skeptical that base models are a very promising direction.
- orthonormal 4 Apr 2024 19:23 UTC
  LW: 2 AF: 1
  Thank you! I’d forgotten about that.