These days it’s rare for a release to advance the frontier substantially.
This seems to be one crux. Sure, there’s no need for staged release if the model doesn’t actually do much more than previous models, and doesn’t have unpatched vulnerabilities of types that would be identified by somewhat broader testing.
The other crux, I think, is around public release of model weights. (Often referred to, incorrectly, as “open sourcing.”) Staged release implies not releasing weights immediately—and I think this is one of the critical issues with what companies like X have done that make it important to demand staged release for any models claiming to be as powerful or more powerful than current frontier models. (In addition to testing and red-teaming, which they also don’t do.)
This seems to be one crux. Sure, there’s no need for staged release if the model doesn’t actually do much more than previous models, and doesn’t have unpatched vulnerabilities of types that would be identified by somewhat broader testing.
The other crux, I think, is around public release of model weights. (Often referred to, incorrectly, as “open sourcing.”) Staged release implies not releasing weights immediately—and I think this is one of the critical issues with what companies like X have done that make it important to demand staged release for any models claiming to be as powerful or more powerful than current frontier models. (In addition to testing and red-teaming, which they also don’t do.)