I think it would probably be bad for the US to unilaterally force all US AI developers to pause if it didn’t simultaneously somehow slow down non-US development.
It seems to me that to believe this, you have to believe all of these four things are true:
1. Solving AI alignment is basically easy
2. Non-US frontier AI developers are not interested in safety
3. Non-US frontier AI developers will quickly catch up to the US
4. If US developers slow down, then non-US developers are very unlikely to also slow down—either voluntarily, or because the US strong-arms them into signing a non-proliferation treaty, or whatever
I think #3 is sort-of true and the others are probably false, so the probability of all four being simultaneously true is quite low.
(Statements I’ve seen from Chinese developers lead me to believe that they are less interested in racing, and more concerned about safety, than #2 assumes.)
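To make “quite low” concrete, here is a toy joint-probability calculation. The credences are placeholder numbers I’m supplying purely for illustration (the only thing carried over from above is that #3 gets a higher credence than the others), and treating the four claims as independent is itself a simplifying assumption:

```python
# Purely illustrative joint-probability calculation. The credences below are
# placeholder numbers, not figures from the original argument, and independence
# between the claims is an additional simplifying assumption.
from math import prod

credences = {
    "1. alignment is basically easy": 0.2,
    "2. non-US developers aren't interested in safety": 0.2,
    "3. non-US developers will quickly catch up": 0.7,  # the "sort-of true" one
    "4. a US slowdown won't slow anyone else down": 0.2,
}

p_all_four = prod(credences.values())
print(f"P(all four claims hold) ~ {p_all_four:.1%}")  # ~0.6% with these numbers
```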
I made a quick Squiggle model on racing vs. slowing down. Based on my first-guess parameters, it suggests that racing to build AI destroys ~half the expected value of the future compared to not racing. Parameter values are rough, of course.
Yes, the model is more about racing than about pausing, but I thought it was applicable here. My thinking was that there is a spectrum of development speed, with “completely pause” on one end and “race as fast as possible” on the other. Pushing toward the “pause” end of the spectrum has roughly the opposite effect of pushing toward the “race” end.
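For anyone who wants the flavor of the calculation without opening Squiggle, here is a minimal Monte Carlo sketch of the same kind of comparison, written in Python. This is not the actual model; the structure and every parameter below are placeholder assumptions of mine, chosen only so the output roughly matches the “~half the expected value” headline figure:

```python
# Minimal Monte Carlo sketch of a racing-vs-not-racing comparison.
# NOT the actual Squiggle model: all distributions and probabilities here
# are placeholder assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

def expected_value(p_alignment_failure: float, samples: int = N) -> float:
    """Expected fraction of the future's value retained, assuming an
    alignment failure destroys essentially all of it."""
    failure = rng.random(samples) < p_alignment_failure
    value_if_ok = rng.beta(8, 2, samples)  # placeholder: most value retained otherwise
    return float(np.mean(np.where(failure, 0.0, value_if_ok)))

# Placeholder assumption: racing makes an alignment failure more likely.
ev_race = expected_value(p_alignment_failure=0.6)
ev_slow = expected_value(p_alignment_failure=0.2)

print(f"EV if racing:     {ev_race:.2f}")
print(f"EV if not racing: {ev_slow:.2f}")
print(f"Fraction of EV lost by racing: {1 - ev_race / ev_slow:.0%}")  # ~50% here
```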
I’ve never seen anyone else try to quantitatively model it. As far as I know, my model is the most granular quantitative model of this question ever made. That isn’t to say it’s particularly granular (I spent less than an hour on it), but this feels like an unfair criticism.
In general I am not a fan of criticisms of the form “this model is too simple”. All models are too simple. What, specifically, is wrong with it?
I had a quick look at the linked post, and it seems to be making some implicit assumptions, such as:
the plan of “use AI to make AI safe” has a ~100% chance of working (the post explicitly says this assumption is false, but then proceeds as if it were true)
there is a ~100% chance of slow takeoff
if you unilaterally pause, this doesn’t increase the probability that anyone else pauses, doesn’t make it easier to get regulations passed, etc.
I would like to see some quantification of the form “we think there is a 30% chance that we can bootstrap AI alignment using AI; a unilateral pause will only increase the probability of a global pause by 3 percentage points; and there’s only a 50% chance that the 2nd-leading company will attempt to align AI in a way we’d find satisfactory; therefore we think the least-risky plan is to stay at the front of the race and then bootstrap AI alignment.” (Or a more detailed version of that.)
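To show what I mean, here is a toy version of that calculation. The 30%, 3-percentage-point, and 50% figures come from the hypothetical argument quoted above; the baseline chance of a global pause, the chance things go well given a global pause, and the way the numbers are combined are all placeholder assumptions of mine:

```python
# Toy back-of-envelope version of the quantification described above.
# The 0.30, 0.03, and 0.50 figures come from the hypothetical argument in the
# text; everything else is a placeholder assumption for illustration.

p_bootstrap_works = 0.30          # bootstrapping AI alignment using AI succeeds
p_global_pause_baseline = 0.02    # placeholder: global pause happens without us pausing
p_global_pause_boost = 0.03       # unilateral pause adds 3 percentage points
p_second_lab_tries = 0.50         # 2nd-leading company attempts alignment satisfactorily
p_good_given_global_pause = 0.80  # placeholder: outcome if everyone pauses

# Plan A: stay at the front of the race, then try to bootstrap alignment.
p_good_race = p_bootstrap_works

# Plan B: pause unilaterally; either a global pause follows, or the
# second-leading company reaches the frontier first and may or may not try.
p_global_pause = p_global_pause_baseline + p_global_pause_boost
p_good_pause = (p_global_pause * p_good_given_global_pause
                + (1 - p_global_pause) * p_second_lab_tries * p_bootstrap_works)

print(f"P(good outcome | stay in front)    = {p_good_race:.2f}")
print(f"P(good outcome | unilateral pause) = {p_good_pause:.2f}")
```

With these particular numbers the comparison comes out in favor of staying at the front, which is exactly the point: anyone who holds that position should be able to show their own version of this arithmetic.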