That particular statement was very poorly received, with a 139-karma retort from John Wentworth arguing,
What exactly is the model by which some AI organization demonstrating AI capabilities will lead to world governments jointly preventing scary AI from being built, in a world which does not actually ban gain-of-function research?
I’m not sure what’s going on here
So, wait, what’s actually the answer to this question? I read that entire comment thread and didn’t find one. The question seems to me to be a good one!
“What exactly” seems a bit weird type of question. For example, consider nukes: it was hard to predict what exactly is the model by which governments will not blow everyone up after use of nukes in Japan. But also: while the resulting equilibrium is not great, we haven’t died in nuclear WWIII so far.
As in my comment here, if you have a model that simultaneously both explains the fact that governments are funding GoF research right now, and predicts that governments would nevertheless react helpfully to AGI, I’m very interested to hear it. It seems to me that defunding GoF is a dramatically easier problem in practically every way.
The only responses I can think of right now are (1) “Basically nobody in or near government is working hard to defund GoF but people in or near government will be working hard to spur on a helpful response to AGI” (really? if so, what’s upstream of that supposed difference?) or (2) “It’s all very random—who happens to be in what position of power and when, etc.—and GoF is just one example, so we shouldn’t generalize too far from it” (OK maybe, but if so, then can we pile up more examples into a reference class to get a base rate or something? and what are the interventions to improve the odds, and can we also try those same interventions on GoF?)
I think it’s worth updating on the fact that the US government has already launched a massive, disruptive, costly, unprecedented policy of denying AI-training chips to China. I’m not aware of any similar-magnitude measure happening in the GoF domain.
IMO that should end the debate about whether the government will treat AI dev the way it has GoF—it already has moved it to a different reference class.
Some wild speculation on upstream attributes of advanced AI’s reference class that might explain the difference in the USG’s approach:
a perception of new AI as geoeconomically disruptive; that new AI has more obvious natsec-relevant use-cases than GoF; that powerful AI is more culturally salient than powerful bio (“evil robots are scarier than evil germs”).
Not all of these are cause for optimism re: a global ASI ban, but (by selection) they point to governments treating AI “seriously”.
One big difference is that GoF currently does not seem that dangerous to governments. If you look at it from a perspective that focuses not on the layer of individual humans as agents, but instead on states, corporations, memeplexes and similar creatures as the agents, GoF maybe does not look that scary? Sure, there was covid, but while it was clearly really bad for humans, it mostly made governments/states relatively stronger.
Taking this difference into account, my model was and still is that governments will react to AI.
This does not imply reacting in a helpful way, but I think whether the reaction will be helpful, harmful or just random is actually one of the higher-variance parameters, and a point of leverage. (And the common-on-LW stance that governments are stupid and evil and you should mostly ignore them is unhelpful for both understanding and influencing the situation.)
Personally I haven’t thought about how strong the analogy to GoF is, but another thing that feels worth noting is that there may be a bunch of other cases where the analogy is similarly strong and where major government efforts aimed at risk-reduction have occurred. And my rough sense is that that’s indeed the case, e.g. some of the examples here.
In general, at least for important questions worth spending time on, it seems very weird to say “You think X will happen, but we should be very confident it won’t because in analogous case Y it didn’t”, without also either (a) checking for other analogous cases or other lines of argument or (b) providing an argument for why this one case is far more relevant evidence than any other available evidence. I do think it totally makes sense to flag the analogous case and to update in light of it, but stopping there and walking away feeling confident in the answer seems very weird.
I haven’t read any of the relevant threads in detail, so perhaps the arguments made are stronger than I imply here, but my guess is they weren’t. And it seems to me that it’s unfortunately decently common for AI risk discussions on LessWrong to involve this pattern I’m sketching here.
(To be clear, all I’m arguing here is that these arguments often seem weak, not that their conclusions are false.)
(This comment is raising an additional point to Jan’s, not disagreeing.)
Update: Oh, I just saw that Steve Byrnes also wrote the following in this thread, which I totally agree with:
“[Maybe one could argue] “It’s all very random—who happens to be in what position of power and when, etc.—and GoF is just one example, so we shouldn’t generalize too far from it” (OK maybe, but if so, then can we pile up more examples into a reference class to get a base rate or something? and what are the interventions to improve the odds, and can we also try those same interventions on GoF?)”
“What exactly” seems a bit weird type of question. For example, consider nukes: it was hard to predict what exactly is the model by which governments will not blow everyone up after use of nukes in Japan. But also: while the resulting equilibrium is not great, we haven’t died in nuclear WWIII so far.
This would be useful if the main problem was misuse, and while this problem is arguably serious, there is another problem, called the alignment problem, that doesn’t care who uses AGI, only that it exists.
Biotech is probably the best example of technology being slowed down in the manner required, and suffice it to say it only happened because eugenics and anything related to that became taboo after WW2. I obviously don’t want a WW3 to slow down AI progress, but the main criticism remains: the examples of tech that were slowed down in the manner required for alignment required massive death tolls, à la a pivotal act.
The analogy I had in mind is not so much in the exact nature of the problem, but in the respect that it’s hard to make explicit, precise models of such situations in advance. In the case of nukes, consider the fact that the smartest minds of the time, like von Neumann or Feynman, spent a decent amount of time thinking about the problems, had clever explicit models, and were wrong; in von Neumann’s case, to the extent that if the US had followed his advice, it would have launched a nuclear armageddon.
So, wait, what’s actually the answer to this question? I read that entire comment thread and didn’t find one. The question seems to me to be a good one!
The GoF analogy is quite weak.