I think “The first AGI probably won’t perform a pivotal act” is by far the weakest section.
To start things off, I would predict that a world with slow takeoff and personal intent alignment looks far more multipolar than the standard Yudkowskian picture of a recursively self-improving singleton that takes over the entire lightcone in a matter of “weeks or hours rather than years or decades”. So the title of that section seems a bit off: in this world, what the literal first AGI does matters much less, since we expect other similarly capable AI systems to be developed by other leading labs relatively soon afterwards anyway.
But, in any case, the bigger issue I have with the reasoning there is the assumption (inferred from statements like “the humans in charge of AGI may not have the chutzpah to even try such a thing”) that the social response to the development of general intelligence is going to be… basically muted? Or that society will carry on with business as usual in any meaningful sense? I would be entirely shocked if the current state of the world, in which the vast majority of people know little about the current capabilities of AI systems and are totally clueless about the AI labs’ race towards AGI, were to continue past the point at which actual AGI is reached.
I think intuitions of the “There’s No Fire Alarm for Artificial General Intelligence” type are built very heavily around the notion of a takeoff so rapid that there might well be no major economic evidence of AI’s impact before the most advanced systems become massively superintelligent; that there might be no massive rise in unemployment hurting the many people trying to live through the transition to an eventual post-scarcity economy; and that the ways people relate to AIs, and to one another, will not be completely turned on their heads.
Consider a future world in which we are pretty far along the path of no longer needing old OSs or programming languages because you can get an LLM to write genuinely good code for you; in which AI can write an essay better than most (if not all) A+ undergraduates, solve Olympiad math problems better than any contestant, and do research better than a graduate student; in which deep-learning-based lie detection actually gets good and sees growing use; in which major presidential candidates are already using AI-generated imagery and stoking controversies over whether others are doing the same; and in which the capacity to easily generate whatever garbage you request breaks the internet or fills it with creepy AI-generated propaganda videos made by state-backed cults. That is a world in which stability and equilibrium are broken. It is not a world in which “normality” can continue, in the sense of governments and people continuing to sleepwalk through the threat posed by AI.
I consider it very unlikely that such major changes to society can go by without the fundamental thinking around them shifting massively, and without those best informed about AI capabilities grasping the importance of the moment. Humans are social creatures who delegate much of their thinking about which issues are even worth taking seriously to the social group around them; a world with slow takeoff is a world in which I expect massive changes to unfold over a long enough span that public opinion shifts, dragging along with it both the Overton window and the baseline assumptions about what can and must be done.
There will, of course, be a ton of complicating factors we could discuss, such as more powerful persuasive AI catalyzing a shift of the world towards insanity and inadequacy, but overall I do not expect the argument in this section to hold up.
Edit: I very much agree with your arguments against sleepwalking and against the continuation of normality. I think the “inattentive world” hypothesis is all but disproven, yet it still plays an outsized role in alignment thinking.
I don’t think the arguments in that section depend on any assumption of normality or sleepwalking. And the multipolar scenario is the problem, so it can’t be part of the solution. The arguments do depend on people making suboptimal decisions, which people do constantly.
So I think the arguments in that section are more general than you’re hoping.
If those don’t hold, what is the alternate scenario in which a multipolar world remains safe?
The choice of the word “remains” is an interesting one here. What is true of our current multipolar world that makes it “safe”, but would stop being true of a more technologically advanced multipolar world? I don’t think it can be the offense/defense balance, because nuclear and biological weapons already sit far on the “offense is easier than defense” side of that spectrum.
I agree that it should be phrased differently. One problem here is that AGI may allow victory without mutually assured destruction. A second is that it may proliferate far more widely than nukes or bioweapons have so far. People often speak of massively multipolar scenarios as a good outcome.
Good point about the word “remains”. I’m afraid people see a “stable” situation, but logically that stability only extends a few years, until fully autonomous, RSI-capable AGI and robotics are widespread and any malcontent can produce offensive capabilities we can’t yet imagine.
I understand the inclination to see massively multipolar scenarios as a good outcome. Historically, unipolar scenarios do not have a great track record of being good for those not in power, especially when the party in power faces no significant risk from mistreating those under them. So if unipolar scenarios are bad, that must mean multipolar scenarios are good, right?
But “the good situation we have now is not stable; we can either make things a bit worse (for us personally) immediately so that they maybe don’t get catastrophically worse later, or keep things good now and let them get catastrophically worse later” is a pretty hard pill to swallow. It is also an argument with a rich history of being ignored without the warned-of catastrophe ever arriving.
Excellent point that unipolar scenarios have been bad historically. I wrote about recognizing the validity of that concern recently in “Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours”.
And good point that warnings of future catastrophe are likely to go unheeded because “wolf” has been cried in the past.
Although sometimes those things didn’t happen precisely because the warnings were heeded.
In this case, we only need one or a few relatively well-informed actors to heed the call to prevent proliferation, even if doing so is risky in the short term.