You’ll need to evaluate more than just foundation models
Not sure what this is gesturing at: that you need to evaluate other kinds of models, or whole labs, or foundation-models-plus-finetuning-and-scaffolding, or something else?
(I think “model evals” means “model+finetuning+scaffolding evals,” at least to the AI safety community + Anthropic.)
Nice.