Thanks. Is this because of posttraining? Ignoring posttraining, I'd rather evaluators get the 90%-through-training version of the model and be unrushed than get the final version and be rushed. Takes?
Two versions with the same posttraining, one with only 90% of the pretraining, are indeed very similar; there's no need to evaluate both. The realistic comparison is more like a model with 80% of the pretraining and 70% of the posttraining of the final model, and that last 30% of posttraining might be significant.
I agree in theory, but testing the final model still feels worthwhile, because we want more direct observability and less complex reasoning in safety cases.