Let’s say Company A can make AGIs that are drop-in replacements for highly-skilled humans at any existing remote job (including e.g. “company founder”), and no other company can. And Company C is a cloud provider. Then Company A will be able to outbid every other company for Company C’s cloud compute, since Company A is able to turn cloud compute directly into massive revenue. It can just buy more and more cloud compute from C and every other company, funding itself with rapid exponential growth, until the whole world is saturated.
I think this is outside the timeline under consideration. Transforming compute into massive revenue is still gated by the ability of non-AGI enabled customers to decide to spend with Company A; regardless of price the ability of Company C to make more compute available to sell depends quite a bit on the timelines of their contracts with other companies, etc. The ability to outbid the whole rest of the world for commercial compute already crosses the transformational threshold, I claim. This remains true regardless of whether it is a single dominant bidder or several.
I think the timeline we are looking at is from initial launch through the first round of compute-buy. This still leaves all normal customers of compute as bidders, so I would expect the amount of additional compute going to AGI to be a small fraction of the total.
Though let the record reflect based on the other details in the estimate this could still be an enormous increase in the population.
Yeah it’s fine to assume that there might be some period of time that (1) the AGIs don’t escape control, (2) the code doesn’t leak or get stolen, (3) nobody else reinvents the same thing, (4) Company A doesn’t have infinite capital (yet) to spend on renting cloud compute (or the contracts haven’t yet been signed or whatever). And it’s fine to be curious about how many AGIs would Company A have available during this period of time.
And then a key question is whether anything happens during that period of time that would change what happens after that period of time. (And if not, then the analysis isn’t too important.) A pivotal act would certainly qualify. I’m kinda cynical in this area; I think the most likely scenario by far is that nothing happens during this period that has an appreciable impact on what happens afterwards. Like, I’m sure that Company A try to get their AGIs to beat benchmarks, do scientific research, make money, etc. I also expect them to have lots of very serious meetings, both internally and with government officials. But I don’t expect that Company A would succeed at making the world resilient to future out-of-control AGIs, because that’s just a crazy hard thing to do even with millions of intent-aligned AGIs at your disposal. I discussed some of the practical challenges at What does it take to defend the world against out-of-control AGIs?.
Well anyway. My comment above was just saying that the OP could be clearer on what they’re trying to estimate, not that they’re wrong to be trying to estimate it. :)
There are a lot of situations where that’s not an appropriate assumption, but rather the relevant question is “what’s the AGI population if most of the world’s compute is running AGIs”.
Agreed. It would be interesting to extend this to answer that question and in-between scenarios (like having access to a large chunk of the compute in China or the US + allies).
FWIW Holden Karnofsky wrote a 2022 blog post “AI Could Defeat All Of Us Combined” that mentions the following: “once the first human-level AI system is created, whoever created it could use the same computing power it took to create it in order to run several hundred million copies for about a year each.” Brief justification in his footnote 5.
Thanks for pointing us to this. It looks to be the same method as our method 3.
Yeah it’s fine to assume that there might be some period of time that (1) the AGIs don’t escape control, (2) the code doesn’t leak or get stolen, (3) nobody else reinvents the same thing, (4) Company A doesn’t have infinite capital (yet) to spend on renting cloud compute (or the contracts haven’t yet been signed or whatever). And it’s fine to be curious about how many AGIs would Company A have available during this period of time.
We think that period might be substantial, for reasons discussed in Section II.
Yeah it’s fine to assume that there might be some period of time that (1) the AGIs don’t escape control, (2) the code doesn’t leak or get stolen, (3) nobody else reinvents the same thing, (4) Company A doesn’t have infinite capital (yet) to spend on renting cloud compute (or the contracts haven’t yet been signed or whatever). And it’s fine to be curious about how many AGIs would Company A have available during this period of time.
We think that period might be substantial, for reasons discussed in Section II.
I don’t think Section II is related to that. Again, the question I’m asking is How long is the period where an already-existing AGI model type / training approach is only running on the compute already owned by the company that made that AGI, rather than on most of the world’s then-existing compute? If I compare that question to the considerations that you bring up in Section II, they seem almost entirely irrelevant, right? I’ll go through them:
Plateau: There may be unexpected development plateaus that come into effect at around human-level intelligence. These plateaus could be architecture-specific (scaling laws break down; getting past AGI requires something outside the deep learning paradigm) or fundamental to the nature of machine intelligence.
That doesn’t prevent any of those four things I mentioned: it doesn’t prevent (1) the AGIs escaping control and self-reproducing, nor (2) the code / weights leaking or getting stolen, nor (3) other companies reinventing the same thing, nor (4) the AGI company (or companies) having an ability to transform compute into profits at a wildly higher exchange rate than any other compute customer, and thus making unprecedented amounts of money off their existing models, and thus buying more and more compute to run more and more copies of their AGI (e.g. see the “Everything, Inc.” scenario of §3.2.4 here).
Pause: Government intervention could pause frontier AI development. Such a pause could be international. It is plausible that achieving or nearly achieving an AGI system would constitute exactly the sort of catalyzing event that would inspire governments to sharply and suddenly restrict frontier AI development.
That definitely doesn’t prevent (1) or (2), and it probably doesn’t prevent (3) or (4) either depending on implementation details.
Collapse: Advances in AI are dependent on the semiconductor industry, which is composed of several fragile supply chains. A war between China and Taiwan is considered reasonably possible by experts and forecasters. Such an event would dramatically disrupt the semiconductor industry (not to mention the world economy). If this happens around the time that AGI is first developed, AI capabilities could be artificially suspended at human-level for years while computer chip supply chains and AI firms recover.
That doesn’t prevent any of (1,2,3,4). Running an already-existing AGI model on the world’s already-existing stock of chips is unrelated to how many new chips are being produced. And war is not exactly a time when governments tend to choose caution and safety over experimenting with powerful new technologies at scale. Likewise, war is a time when rival countries are especially eager to steal each other’s military-relevant IP.
Abstention: Many frontier AI firms appear to take the risks of advanced AI seriously, and have risk management frameworks in place (see those of Google DeepMind, OpenAI, and Anthropic). Some contain what Holden Karnofsky calls if-then commitments: “If an AI model has capability X, risk mitigations Y must be in place. And, if needed, we will delay AI deployment and/or development to ensure the mitigations can be present in time.” Commitments to pause further development may kick at human-level capabilities. AGI firms might avoid recursive self-improvement to avoid existential or catastrophic risks.
That could be relevant to (1,2,4) with luck. As for (3), it might buy a few months, before Meta and the various other firms and projects that are extremely dismissive of the risks of advanced AI catch up to the front-runners.
Windup: There are hard-to-reduce windup times in the production process of frontier AI models. For example, a training run for future systems may run into the hundreds of billions of dollars, consuming vast amounts of compute and taking months of processing. Other bottlenecks, like the time it takes to run ML experiments, might extend this windup period.
That doesn’t prevent any of (1,2,3,4). Again, we’re assuming the AGI already exists, and discussing how many servers will be running copies of it, and how soon. The question of training next-generation even-more-powerful AGIs is irrelevant to that question. Right?
Plateau: There may be unexpected development plateaus that come into effect at around human-level intelligence. These plateaus could be architecture-specific (scaling laws break down; getting past AGI requires something outside the deep learning paradigm) or fundamental to the nature of machine intelligence.
That doesn’t prevent any of those four things I mentioned: it doesn’t prevent (1) the AGIs escaping control and self-reproducing, nor (2) the code / weights leaking or getting stolen, nor (3) other companies reinventing the same thing, nor (4) the AGI company (or companies) having an ability to transform compute into profits at a wildly higher exchange rate than any other compute customer, and thus making unprecedented amounts of money off their existing models, and thus buying more and more compute to run more and more copies of their AGI
It doesn’t prevent (1) but it does make it less likely. A ‘barely general’ AGI is less likely to be able to escape control than an ASI. It doesn’t prevent (2). We acknowledge (3) in section IV: “We can also incorporate multiple firms or governments building AGI, by multiplying the initial AGI population by the number of such additional AGI projects. For example, 2x if we believe China and the US will be the only two projects, or 3x if we believe OpenAI, Anthropic, and DeepMind each achieve AGI.” We think there are likely to be a small number of companies near the frontier, so this is likely to be a modest multiplier. Re. (4), I think ryan_b made relevant points. I would expect some portion of compute to be tied up in long-term contracts. I agree that I would expect the developer of AGI to be able to increase their access to compute over time, but it’s not obvious to me how fast that would be.
Pause: Government intervention could pause frontier AI development. Such a pause could be international. It is plausible that achieving or nearly achieving an AGI system would constitute exactly the sort of catalyzing event that would inspire governments to sharply and suddenly restrict frontier AI development.
That definitely doesn’t prevent (1) or (2), and it probably doesn’t prevent (3) or (4) either depending on implementation details.
I mostly agree on this one, though again think it makes (1) less likely for the same reason. As you say, the implementation details matter for (3) and (4), and it’s not clear to me that it ‘probably’ wouldn’t prevent them. It might be that a pause would target all companies near the frontier, in which case we could see a freeze at AGI for its developer, and near AGI for competitors.
Abstention: Many frontier AI firms appear to take the risks of advanced AI seriously, and have risk management frameworks in place (see those of Google DeepMind, OpenAI, and Anthropic). Some contain what Holden Karnofsky calls if-then commitments: “If an AI model has capability X, risk mitigations Y must be in place. And, if needed, we will delay AI deployment and/or development to ensure the mitigations can be present in time.” Commitments to pause further development may kick at human-level capabilities. AGI firms might avoid recursive self-improvement to avoid existential or catastrophic risks.
That could be relevant to (1,2,4) with luck. As for (3), it might buy a few months, before Meta and the various other firms and projects that are extremely dismissive of the risks of advanced AI catch up to the front-runners.
Again, mostly agreed. I think it’s possible that the development of AGI would precipitate a wider change in attitude towards it, including at other developers. Maybe it would be exactly what is needed to make other firms take the risks seriously. Perhaps it’s more likely it would just provide a clear demonstration of a profitable path and spur further acceleration though. Again, we see (3) as a modest multiplier.
Windup: There are hard-to-reduce windup times in the production process of frontier AI models. For example, a training run for future systems may run into the hundreds of billions of dollars, consuming vast amounts of compute and taking months of processing. Other bottlenecks, like the time it takes to run ML experiments, might extend this windup period.
That doesn’t prevent any of (1,2,3,4). Again, we’re assuming the AGI already exists, and discussing how many servers will be running copies of it, and how soon. The question of training next-generation even-more-powerful AGIs is irrelevant to that question. Right?
The question of training next-generation even-more-powerful AGIs is relevant to containment, and is therefore relevant to how long a relatively stable period running a ‘first generation AGI’ might last. It doesn’t prevent (2) ad (3). It doesn’t prevent (4) either, though presumably a next-gen AGI would further increase a company’s ability in this regard.
I think this is outside the timeline under consideration. Transforming compute into massive revenue is still gated by the ability of non-AGI enabled customers to decide to spend with Company A; regardless of price the ability of Company C to make more compute available to sell depends quite a bit on the timelines of their contracts with other companies, etc. The ability to outbid the whole rest of the world for commercial compute already crosses the transformational threshold, I claim. This remains true regardless of whether it is a single dominant bidder or several.
I think the timeline we are looking at is from initial launch through the first round of compute-buy. This still leaves all normal customers of compute as bidders, so I would expect the amount of additional compute going to AGI to be a small fraction of the total.
Though let the record reflect based on the other details in the estimate this could still be an enormous increase in the population.
Yeah it’s fine to assume that there might be some period of time that (1) the AGIs don’t escape control, (2) the code doesn’t leak or get stolen, (3) nobody else reinvents the same thing, (4) Company A doesn’t have infinite capital (yet) to spend on renting cloud compute (or the contracts haven’t yet been signed or whatever). And it’s fine to be curious about how many AGIs would Company A have available during this period of time.
And then a key question is whether anything happens during that period of time that would change what happens after that period of time. (And if not, then the analysis isn’t too important.) A pivotal act would certainly qualify. I’m kinda cynical in this area; I think the most likely scenario by far is that nothing happens during this period that has an appreciable impact on what happens afterwards. Like, I’m sure that Company A try to get their AGIs to beat benchmarks, do scientific research, make money, etc. I also expect them to have lots of very serious meetings, both internally and with government officials. But I don’t expect that Company A would succeed at making the world resilient to future out-of-control AGIs, because that’s just a crazy hard thing to do even with millions of intent-aligned AGIs at your disposal. I discussed some of the practical challenges at What does it take to defend the world against out-of-control AGIs?.
Well anyway. My comment above was just saying that the OP could be clearer on what they’re trying to estimate, not that they’re wrong to be trying to estimate it. :)
Agreed. It would be interesting to extend this to answer that question and in-between scenarios (like having access to a large chunk of the compute in China or the US + allies).
Thanks for pointing us to this. It looks to be the same method as our method 3.
We think that period might be substantial, for reasons discussed in Section II.
I don’t think Section II is related to that. Again, the question I’m asking is How long is the period where an already-existing AGI model type / training approach is only running on the compute already owned by the company that made that AGI, rather than on most of the world’s then-existing compute? If I compare that question to the considerations that you bring up in Section II, they seem almost entirely irrelevant, right? I’ll go through them:
That doesn’t prevent any of those four things I mentioned: it doesn’t prevent (1) the AGIs escaping control and self-reproducing, nor (2) the code / weights leaking or getting stolen, nor (3) other companies reinventing the same thing, nor (4) the AGI company (or companies) having an ability to transform compute into profits at a wildly higher exchange rate than any other compute customer, and thus making unprecedented amounts of money off their existing models, and thus buying more and more compute to run more and more copies of their AGI (e.g. see the “Everything, Inc.” scenario of §3.2.4 here).
That definitely doesn’t prevent (1) or (2), and it probably doesn’t prevent (3) or (4) either depending on implementation details.
That doesn’t prevent any of (1,2,3,4). Running an already-existing AGI model on the world’s already-existing stock of chips is unrelated to how many new chips are being produced. And war is not exactly a time when governments tend to choose caution and safety over experimenting with powerful new technologies at scale. Likewise, war is a time when rival countries are especially eager to steal each other’s military-relevant IP.
That could be relevant to (1,2,4) with luck. As for (3), it might buy a few months, before Meta and the various other firms and projects that are extremely dismissive of the risks of advanced AI catch up to the front-runners.
That doesn’t prevent any of (1,2,3,4). Again, we’re assuming the AGI already exists, and discussing how many servers will be running copies of it, and how soon. The question of training next-generation even-more-powerful AGIs is irrelevant to that question. Right?
It doesn’t prevent (1) but it does make it less likely. A ‘barely general’ AGI is less likely to be able to escape control than an ASI. It doesn’t prevent (2). We acknowledge (3) in section IV: “We can also incorporate multiple firms or governments building AGI, by multiplying the initial AGI population by the number of such additional AGI projects. For example, 2x if we believe China and the US will be the only two projects, or 3x if we believe OpenAI, Anthropic, and DeepMind each achieve AGI.” We think there are likely to be a small number of companies near the frontier, so this is likely to be a modest multiplier. Re. (4), I think ryan_b made relevant points. I would expect some portion of compute to be tied up in long-term contracts. I agree that I would expect the developer of AGI to be able to increase their access to compute over time, but it’s not obvious to me how fast that would be.
I mostly agree on this one, though again think it makes (1) less likely for the same reason. As you say, the implementation details matter for (3) and (4), and it’s not clear to me that it ‘probably’ wouldn’t prevent them. It might be that a pause would target all companies near the frontier, in which case we could see a freeze at AGI for its developer, and near AGI for competitors.
Again, mostly agreed. I think it’s possible that the development of AGI would precipitate a wider change in attitude towards it, including at other developers. Maybe it would be exactly what is needed to make other firms take the risks seriously. Perhaps it’s more likely it would just provide a clear demonstration of a profitable path and spur further acceleration though. Again, we see (3) as a modest multiplier.
The question of training next-generation even-more-powerful AGIs is relevant to containment, and is therefore relevant to how long a relatively stable period running a ‘first generation AGI’ might last. It doesn’t prevent (2) ad (3). It doesn’t prevent (4) either, though presumably a next-gen AGI would further increase a company’s ability in this regard.