Relatedly, I thought Managing AI Risks in an Era of Rapid Progress was great, particularly the clear statement that this is an urgent priority and the governance recommendations.
On a first reading I feel like I agree with most everything that was said, including about RSPs and the importance of regulation.
Small caveats: (i) I don’t know enough to understand the implications of, or comment on, the recommendation “they should also hold frontier AI developers and owners legally accountable for harms from their models that can be reasonably foreseen and prevented,” (ii) “take seriously the possibility that generalist AI systems will outperform human abilities across many critical domains within this decade or the next” seems like a bit of a severe understatement that might undermine urgency (I think we should take that possibility seriously over the next few years, and I’d give better than even odds that they will outperform humans across all critical domains within this decade or the next), (iii) I think that RSPs / if-then commitments are valuable not just for bridging the period between now and when regulation is in place, but for helping accelerate more concrete discussions about regulation and building relevant infrastructure.
I’m a tiny bit nervous about the way that “autonomous replication” is used as a dangerous capability here and in other communications. I’ve advocated for it as a good benchmark task for evaluation and responses because it seems likely to be easier than almost anything catastrophic (including e.g. intelligence explosion, superhuman weapons R&D, organizing a revolution or coup...) and by the time it occurs there is a meaningful probability of catastrophe unless you have much more comprehensive evaluations in place. That said, I think most audiences will think it sounds somewhat improbable as a catastrophic risk in and of itself (and a bit science-fiction-y, in contrast with other risks like cybersecurity that also aren’t existential in and of themselves but sound much more grounded). So it’s possible that while it makes a good evaluation target, it doesn’t make a good first item on a list of dangerous capabilities. I would defer to people who have a better understanding of politics and perception; I mostly raise the hesitation because I think ARC may have had a role in how focal it is in some of these discussions.