I agree with this; I’d like to see AI safety scale with new projects. A few ideas I’ve been mulling:
- A ‘festival week’ bringing entrepreneur types and AI safety types together to cowork from the same place, along with a few talks and a lot of mixers.
- Running an incubator/accelerator program at the tail end of a funding round, with fiscal sponsorship and some amount of operational support.
- More targeted recruitment for specific projects to advance important parts of a research agenda.
It’s often unclear to me whether new projects should actually be new organizations; making it easier to spin up new projects that can then either join existing orgs or grow into orgs themselves seems like a promising direction.
I agree with you that this feels like a ‘compact crux’ for many parts of the agenda. I’d like to take your bet; let me reflect on whether there are any additional operationalizations or conditions I’d want.
FWIW, in Towards Guaranteed Safe AI we endorse this: “Moreover, while we have argued for the need for verifiable quantitative safety guarantees, it is important to note that GS AI may not be the only route to achieving such guarantees. An alternative approach might be to extract interpretable policies from black-box algorithms via automated mechanistic interpretability… it is ultimately an empirical question whether it is easier to create interpretable world models or interpretable policies in a given domain of operation.”