I have an answer to that: making sure that NIST's AISI received at least the scores from automated evals on checkpoints of any new large training runs, as well as pre-deployment eval access.
Seems like a pretty low-cost, high-value ask to me. Even if that info leaked from AISI, it wouldn’t give away corporate algorithmic secrets.
A higher-cost ask, but still fairly reasonable, is pre-deployment evals which require fine-tuning. You can't have a good sense of what the model would be capable of in the hands of bad actors if you don't test fine-tuning it on hazardous info.