Let me clarify an important point: the strategy preferences outlined in the paper are conditional statements. They describe which strategy is optimal given certainty about a particular timeline and alignment-difficulty scenario. Once we account for uncertainty and for asymmetric downside risks (misalignment could be catastrophic), the calculation changes significantly. However, it is not true that GM's only downside is that it might delay the benefits of TAI.
It is true that misalignment (or catastrophic misuse) has a much larger downside than a successful moratorium. But consider the failure mode: you push for a moratorium, lose your lead, and someone else then develops catastrophically misaligned AI that you could have defended against had you adopted CD or SA. That outcome is just as bad.
And since GM has a lower chance of being adopted than CD or SA, the expected downside of pushing for a moratorium is not necessarily lower.
A half-successful moratorium is arguably the worst of all worlds (assuming alignment is feasible): you halt your own progress while development continues elsewhere, forfeiting your chance to build defenses against unaligned or misused AGI. So it is not always true that the moratorium plan has fewer downsides than the others.
That said, I agree with your core point: if we modeled this with full probability distributions over timelines and alignment difficulty, GM would likely be favored more heavily than our conditional analysis suggests, especially if we place significant probability on short timelines or hard alignment.