It feels like the most basic version of the argument also applies to industrialization or any technology that changes the world (and hastens further changes). If you set aside alignment concerns, I don’t think it’s obvious that AGI is fundamentally different in the way you claim—Google could build an AGI system that only does what Google wants, but someone else can build AGI-as-a-service that does whatever the customer wants, and customers will buy that one. I think it is much more robust to make sure someone builds the good AGI product rather than to try to prevent anyone from building the bad AGI product.
On the substance I’m skeptical of the more general anti-change sentiment—I think that technological progress has been one of the most important drivers of improving human conditions, and procedurally I value a liberal society where people are free to build and sell technologies as long as they comply with the law. In some sense it might be right to call industrialization an act of aggression, but I think the implied policy response may have been counterproductive.
Probably this comes down to disagreements about the likely dynamics of an intelligence explosion. I think I’m representing something closer to the “mainstream” opinion on this question (though even my view is something of an outlier), and so someone making the general case for badness probably needs to focus on the case for fast takeoff.
In my view, to the extent that I think AGI should be developed more slowly, it’s mostly because of distinctive features of AGI. I think this is consistent with what you are saying in the article, but my own thinking is focused on those distinctions.
The two distinctions I care most about are:
To effectively govern a world with AI, militaries and law enforcement will need to rely extensively on AI. If they don’t, they will not remain competitive with criminals or other states who do use the technology. This is also true for industrialization.
If AI isn’t suitable for law enforcement or military use, then it’s not a good idea to develop it full stop. Alignment is the most obvious problem here—automated militaries or law enforcement pose an unacceptable risk of coup—but you could also make reasonable arguments based on unreliability or unpredictability of AI. This is the key disanalogy with industrialization.
On this framing, my developing AGI is primarily an “act of aggression” to the extent that you have good reasons not to develop or deploy AGI yourself. Under those conditions you may (correctly) recognize that a large advantage in AI is likely to put me in a position where I can later commit acts of aggression and you won’t have the infrastructure to defend yourself, and so you need to take steps to defend yourself at this earlier stage.
You could try to make the more general argument that any technological progress by me is a potential prelude to future aggression, whether or not you have good reasons not to develop the technology yourself. But in the more general context I’m skeptical—competing by developing new technologies is often good for the world (unlike central examples of aggression), trying to avoid it seems highly unstable and difficult, the upside in general is not that clear, and it acts as an almost fully general pretext for preemptive war.
AI is likely to move fast. A company can make a lot of money by moving faster than its competition, but the world as a whole gains relatively little from the acceleration. At best we realize the gains from AI years or even just months sooner. As we approach the singularity, the room for acceleration (and hence the social benefit of faster AI) converges to zero. No matter how cool AI is, I don’t think getting it slightly sooner is “spectacular.” This is in contrast to historical technologies, which were developed over decades and centuries, where a 10% slowdown could easily delay new technologies by a whole lifetime.
Meanwhile, the faster the technology moves the more severe the unaddressed governance problems are, and so the larger the gains from slowing down become. These problems crop up everywhere: our laws and institutions and culture just don’t have an easy time changing fast enough if a technology goes from 0 to transformative over a single decade.
On an econ 101 model, “almost no social value from accelerating” may seem incongruent with the fact that AI developers can make a lot of money, especially if there is a singularity. The econ 101 reconciliation is that having better AI may allow you to win a war or grab resources in space before someone else gets to them. If we had secure property rights, then all the economic value would flow to physical resources rather than information technology, since it won’t be long before everyone has great AI and people won’t be willing to pay very much to get those benefits slightly sooner.
So that’s pretty fundamentally different from other technologies. It may be that in the endgame people and states are willing to pay a large fraction of their wealth to get access to the very latest AI, but the only reason is that if they don’t then someone else will eat their lunch. Once we’re in that regime, we have an obvious collective rationale to slow down development.
(This argument only applies to modest slowdowns that remain small relative to consumers’ impatience. If you are considering 6 months of slowdown, the social cost can easily be offset by small governance improvements; if you are considering 10 years, I think you can only justify the cost by appealing to potential catastrophic or irreversible harms.)
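To put rough numbers on “small relative to consumers’ impatience”, here is a toy present-value calculation in Python. The 3% discount rate, the perpetual benefit stream, and the year-10 arrival date are purely illustrative assumptions, not figures from the argument above; the point is only that a delay short relative to the discount horizon costs a small fraction of the total value.

```python
import math

def pv_of_ai(arrival_year, annual_benefit=1.0, discount_rate=0.03):
    """Present value of a perpetual benefit stream starting at arrival_year,
    continuously discounted at discount_rate (all numbers are illustrative)."""
    return (annual_benefit / discount_rate) * math.exp(-discount_rate * arrival_year)

baseline = pv_of_ai(arrival_year=10)  # assume AI arrives in year 10

for delay_years in (0.5, 10):
    delayed = pv_of_ai(arrival_year=10 + delay_years)
    loss = 1 - delayed / baseline
    print(f"{delay_years:>4} year delay -> {loss:.1%} of the post-AI value lost")

# With these made-up numbers, a 6-month delay forgoes ~1.5% of the value
# (plausibly offset by modest governance gains), while a 10-year delay
# forgoes ~26%, which is much harder to justify on these grounds alone.
```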
On the substance I’m skeptical of the more general anti-change sentiment—I think that technological progress has been one of the most important drivers of improving human conditions, and procedurally I value a liberal society where people are free to build and sell technologies as long as they comply with the law.
I’m pretty conflicted, but a large part of me wants to bite this bullet and say that a more deliberate approach to technological change would be good overall, even when applied to both the past and the present/future. Because:
Tech progress improving human conditions up to now has depended on luck and could have turned out differently: for example, if some technology had allowed an individual or small group to destroy the world, or if human fertility hadn’t decreased and Malthusian dynamics had kept applying.
On some moral views (e.g. utilitarianism), it would be worth it to achieve a smaller x-risk even at the cost of more time humanity spends under worse conditions. If you think that there’s 20% x-risk on the current trajectory, for example, why isn’t it worth a general slowdown in tech progress and the associated improvements in human conditions to reduce it to 1%, or even just to 10%, if that were the cost? (Not entirely rhetorical. I genuinely don’t know why you’d be against this.)
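For concreteness, here is the expected-value comparison being gestured at, as a toy calculation in Python; the specific figures (a long-run future worth 1,000 arbitrary units, a slowdown cost of 1 unit) are made-up assumptions purely for illustration.

```python
# Toy expected-value comparison for the bullet above (all numbers made up).
future_value  = 1_000.0   # assumed value of humanity's long-run future, in arbitrary units
slowdown_cost = 1.0       # assumed cost of extra time spent under worse conditions

ev_status_quo = (1 - 0.20) * future_value                  # 20% x-risk on current trajectory
ev_slowdown   = (1 - 0.10) * future_value - slowdown_cost  # 10% x-risk after a general slowdown

print(ev_status_quo, ev_slowdown)   # 800.0 vs 899.0: the slowdown wins easily here
# The comparison only flips if the near-term cost is a sizeable fraction of the
# long-run value, which seems to be where the real disagreement lies.
```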
Thanks for the answer! To be honest, when I wrote this I mostly had in mind the kind of winner-takes-all intelligence explosion scenarios that are essentially the flip side of Eliezer’s chosen flavour of catastrophe: fast take-off by a FAI, a pivotal act or what have you, and essentially world conquest. I think if the choice boiled down to those two things (and I’m not sure how we can know unless we manage to have a solid theory of intelligence scaling before we have AGI), or rather, if states believed it did, then it’d really be a lose-lose scenario. Everyone would want not only a FAI, but their FAI, and probably couldn’t really tell a UFAI from a FAI anyway (you essentially have to trust your enemy not to have screwed up), so whatever the outcome, nukes fly at the first sign of anomalous activity.
If we tone it down to a slower take-off, I agree with you that the situation may not be so dramatic—yes, everything that gives one country a huge asymmetrical advantage in productivity would also translate into strategic dominance, but it’s a fact that usually even enemies don’t straight up launch deadly attacks in response to just that (be it out of rational self-preservation winning out over dedication to the country-as-superorganism, or out of irrational behaviour dominated by a forlorn hope that if you wait it out there’ll be better times to do something later).
I think, however, that there is a third fundamental qualitative difference between AGI and any other technology, including specialised AI tools such as what we have now. Specialised tools only amplify the power of humans, leaving all value decisions in their hands. A single human still has limited physical and cognitive power, and we need to congregate in groups to gain enough force for large-scale action. Since different people have different values and interests, groups need negotiation and a base of minimum shared goals to coalesce around. This produces some disunity that requires management, puts a damper on some of the most extreme behaviours (as they may splinter the moderates from the radicals), and overall defines the dynamics of every group and organisation. Armed with spears or with guns, an army still runs on its morale, still needs trust in its leaders and its goals, and can still mutiny.
By comparison, an army (or a company) of aligned AGIs can’t mutiny. It is almost an extension of the body and mind of its master (to whom we assume it is aligned). The dynamics of this are radically different, akin to the power differential that would be introduced by people with superpowers or magic suddenly appearing: individuals commanding immense power, several orders of magnitude above that of a single human, with perfect fidelity and no loss.
I agree that technology has been good by improving standards of life, but I have to wonder—how much of this was because we truly held human flourishing as a terminal value, and how much simply because it was instrumental to other values (e.g. we needed educated workers and consumers so that mass industrialised society could work at all)? After all, “setting up incentives so that human flourishing is an instrumental value to achieving selfish terminal values” is kind of the whole sleight of hand of capitalism and the free market—“it is not from the benevolence of the butcher, the brewer, or the baker that we expect our dinner, but from their regard to their own interest”. With AGI (and possibly robotics, which we might expect to follow if you put a million instances of AGI engineers on the case), human flourishing and the economic self-interest of AGI owners might be entirely and fundamentally decoupled. If that were the case, then the classic assumption that technology equals flourishing could stop being true. That would make things very painful, and possibly deadly, even if AGI itself happened not to be the threat.
(This argument only applies to modest slowdowns that remain small relative to consumers’ impatience. If you are considering 6 months of slowdown, the social cost can easily be offset by small governance improvements; if you are considering 10 years, I think you can only justify the cost by appealing to potential catastrophic or irreversible harms.)
I’m not really thinking of a slowdown as much as a switch in focus (though of course right now that would also mean a slowdown, since we’d have to rewind the clock a bit): more focus on building systems bottom-up instead of top-down, and on building systems that are powerful but specialised rather than general and agentic. Things like protein-folding prediction AIs would be the perfect example of this sort of tool: something that enables human workers to do their work better and faster, surpassing the limits of their cognition, without making them entirely redundant. That way we both avoid breaking the known rules of technological innovation and guarantee that values remain human-centred, as always. It might be less performant than fully automated AGI loops, but it vastly gains in safety. Though of course, seeing how even GPT-4 appears mildly agentic when simply rigged into a self-loop with LangChain, I have to wonder whether you can ever keep those two areas separate enough.
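For concreteness, here is a minimal sketch in Python of the kind of self-loop meant above. It deliberately does not use LangChain’s actual API: call_llm and run_tool are hypothetical placeholders for whatever model and tools get plugged in, and the only point is that the agency comes from the wrapper loop rather than from the underlying model.

```python
# Minimal sketch of a "self-loop" that turns a tool-like model into a crude agent.
# call_llm and run_tool are hypothetical stand-ins, not any real library's API.

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call; returns a canned reply so the sketch runs.
    return "DONE: (a real model would produce a plan, an action, or an answer here)"

def run_tool(action: str) -> str:
    # Stand-in for a real tool (web search, code execution, ...).
    return f"(result of executing: {action})"

def agent_loop(goal: str, max_steps: int = 5) -> str:
    history = f"Goal: {goal}\n"
    for _ in range(max_steps):
        # Ask the model for the next action, given everything done so far.
        step = call_llm(history + "Propose the single next action, or reply DONE with a final answer.")
        if step.strip().startswith("DONE"):
            return step
        # Execute the proposed action and feed the observation back in, closing the loop.
        history += f"Action: {step}\nObservation: {run_tool(step)}\n"
    return history

print(agent_loop("book a flight"))
```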
This might be a bit off topic for the focus of your response. I actually agree that deployment of AGI won’t be seen as an act of aggression. But I think it probably should be, if other actors understand the huge advantage that first movers will enjoy, and how tricky a new balance of power will become.
By setting aside alignment concerns entirely, you’re assuming for this scenario that not only is alignment solved, but that solution is easy enough, or coordination is good enough, that every new AGI is also aligned. I don’t think it’s useful to set the issue that far aside. Eventually, somebody is going to screw up and make one that’s not aligned.
I think a balance of power scenario also requires many AGIs to stay at about the same level of capability. If one becomes rapidly more capable, the balance of power is thrown off.
Another issue with balance-of-power scenarios, even assuming alignment, is that eventually individuals or small groups will be able to create AGI. And by eventually, I mean at most ten years after states and large corporations can do it. Then a lot of the balance of power arguments don’t apply, and you’re more prone to having people do truly stupid or evil (by default ethical standards) things with their personally-aligned AGI.
Most of the arguments in Steve Byrnes’s excellent What does it take to defend the world against out-of-control AGIs? apply to hostile actions from sane state and corporate actors. Even more apply to non-state actors with weirder goals. One pivotal act he doesn’t mention is forming a panopticon, monitoring and decrypting every human communication for the purpose of preventing further AGI development. Having this amount of power would also enable easy manipulation and sabotage of political systems, and it’s hard to imagine a balance of power in which one corporation or government enjoys this power.