I appreciate the nudge here to put some of this into action. I hear alarm bells at the idea of formalizing a centralized repository of AI safety proposals along with information about how they break, but my rough intuition is that if these can be scrubbed of descriptions of capabilities that could be used irresponsibly to bootstrap AGI, then it's a net positive. At the very least, we should be scrambling to discuss safety controls for already-public ML paradigms, in case any of them are just one key insight or a few teraflops away from being world-ending.
I would like to hear from others about this topic, though; I’m very wary of being at fault for accelerating the doom of humanity.