I appreciate the articulation and assessment of various strategies. My comment will focus on a specific angle that I notice both in the report and in the broader ecosystem:
I think there has been a conflation of “catastrophic risks” and “extinction/existential risks” recently, especially among groups that are trying to influence policy. This is somewhat understandable: the difference between “catastrophic” and “existential” is not a big deal in most people’s minds. But in some contexts, I think it misses the fact that “existential [and thus by definition irreversible]” is a very different level of risk than “catastrophic [but something we would be able to recover from].”
This view seems to be (implicitly) expressed in the report summary, most notably in the chart. The main frame seems to be something like “if you want to avoid an unacceptable chance of catastrophic risk, all of these other options are bad.”
But not all of these catastrophic risks are the same. I think this is actually quite an important distinction, and I expect that even (some) policymakers will see it as essential as AGI becomes more salient.
Specifically, “war” and “misuse” seem very different from “extinction” or “total and irreversible civilizational collapse.”
“War” is broad enough to encompass many outcomes, ranging from “conflict with <1M deaths” to “nuclear conflict in which civilization recovers” all the way to “nuclear conflict in which civilization does not recover.” Note also that many natsec leaders already think the chance of a war between the US and China is at a level that would probably meet an intuitive bar for “unacceptable.” (I don’t have actual statistics on this, but my guess is that a >10% chance of war in the next decade is not an uncommon view. One frequently discussed pathway is China invading Taiwan and the US being committed to its defense.)
“Misuse” can refer to many different kinds of events (including $1B in damages from a cyberattack, 10M deaths, 1B deaths, or complete human extinction). These are, of course, very different in terms of their overall impact, even though all of them are intuitively/emotionally stored as “very bad things that we would ideally avoid.”
It seems plausible to me that policymakers will face situations requiring tricky trade-offs between these different sources of risk, and my hope is that the community of people concerned about AI can distinguish between the “levels” or “magnitudes” of the risks involved.
(My impression is that MIRI agrees with this, so this is more a comment on how the summary was presented and a general note of caution to the ecosystem as a whole. I also suspect that the distinction between “catastrophic” and “existential/civilization-ending” will become increasingly important as the AI conversation becomes more interlinked with the national security apparatus.)
Caveat: I have not read the full report and this comment is mostly inspired by the summary, the chart, and a general sense that many organizations other than MIRI are also engaging in this kind of conflation.
I would be curious to hear your thoughts on which organizations you feel are robustly trustworthy.
Bonus points for a list that is kind of a weighted sum of “robustly trustworthy” and “having a meaningful impact re: improving public/policymaker understanding.” (I’m adding this because I suspect it’s easier to maintain “robustly trustworthy” status if one simply chooses not to do a lot of externally-focused comms, so the combination of “doing lots of useful comms/policy work” and “managing to stay precise/accurate/trustworthy” is particularly impressive.)