I found this a useful framing. I’ve thought quite a lot about the offence versus defence dominance angle, and to me it seems almost impossible that we can trust that defence will be dominant. As you said, defence has to be dominant in every single attack vector, both known and unknown.
That is an important point because I hear some people argue that to protect against offensive AGI we need defensive AGI.
I’m tempted to combine the intelligence dominance and starting costs into a single dimension, and then reframe the question as “at what point would a dominant friendly AGI need to intervene to prevent a hostile AGI from killing everyone?” The pivotal act view is that you need to intervene before a hostile AGI even emerges. It might be that we can intervene slightly later, before a hostile AGI has enough resources to cause much harm but after we can tell whether it is hostile or friendly.