Congratulations on launching!
Added you to the map, and your Discord to the list of communities, which is now a sub-page of aisafety.com.
One question: Given that interpretability might well lead to systems powerful enough to pose an x-risk long before we have a strong enough understanding to direct a superintelligence, publish-by-default seems risky. Are you considering adopting a non-publish-by-default policy? I know you talk about capabilities risks in general terms, but is this specific policy on the table?
To be clear, our policy is not publish-by-default. Our current assessment is that the projects we’re prioritizing do not pose a significant risk of capabilities externalities. We will continue to make these decisions on a per-project basis.
FWIW, for this sort of research I support a strong prior in favor of publishing.