I understand that you’re basically assuming that the “initial AGI population” is running on only the same amount of compute that was used to train that very AGI. It’s fine to make that assumption but I think you should emphasize it more. There are a lot of situations where that’s not an appropriate assumption, but rather the relevant question is “what’s the AGI population if most of the world’s compute is running AGIs”.
For example:

- If the means to run AGIs (code, weights, whatever) gets onto the internet, then everybody all over the world would be doing that immediately.
- If a power-seeking AGI escapes human control, then a possible thing it might do is work to systematically get copies of itself running on most of the world’s compute.
- Another possible thing it might do is wipe out humanity and then get copies of itself running on most of the world’s compute, and then we’ll want to know if that’s enough AGIs for a self-sufficient stable supply chain (see “Argument 2” here).
- If we’re thinking more than a few months after AGI becomes possible at all, in a world like today’s where the leader is only slightly ahead of a gaggle of competitors and open-source projects, then AGI would again presumably be on most of the world’s compute.
- If we note that a company with AGI can make unlimited money by renting more and more compute to run more AGIs to do arbitrary remote-work jobs, then we might guess that they would decide to do so, which would lead to scaling up to as much compute around the world as money can buy.
OK, here’s the part of the post where you justified your decision to base your analysis on one training run worth of compute rather than one planet worth of compute, I think:
One reason the training run imputation approach is likely still solid is that competition between firms or countries will crowd out compute or compute will be excluded on national security grounds. Consider the two main actors that could build AGI. If a company builds AGI, they are unlikely to have easy access to commodified compute that they have not themselves built, since they will be in fierce competition with other firms buying chips and obtaining compute. If a government builds AGI, it seems plausible they would impose strict security measures on their compute, reducing the likelihood that anything not immediately in the project would be employable at inference.
The first part doesn’t make sense to me:
Let’s say Company A can make AGIs that are drop-in replacements for highly-skilled humans at any existing remote job (including e.g. “company founder”), and no other company can. And Company C is a cloud provider. Then Company A will be able to outbid every other company for Company C’s cloud compute, since Company A is able to turn cloud compute directly into massive revenue. It can just buy more and more cloud compute from C and every other company, funding itself with rapid exponential growth, until the whole world is saturated.
If Company A and Company B can BOTH make AGIs that are drop-in replacements for highly-skilled humans, and Company C doesn’t do AI research but is just a giant cloud provider, then Company A and Company B will bid against each other to rent Company C’s compute, and no other bidders will be anywhere close to those two. It doesn’t matter whether Company A or Company B wins the auction—Company C’s compute is going to be running AGIs either way. Right?
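To make the compounding dynamic in the first scenario concrete, here’s a minimal toy sketch; every number in it (revenue per AGI-year, compute rental cost, compute per instance, total world compute, starting fleet size) is an assumption I made up for illustration, not an estimate from the post:

```python
# Toy model of the "Company A reinvests AGI revenue into more cloud compute" loop.
# All numbers below are illustrative assumptions, not figures from the post.

WORLD_COMPUTE_FLOPS = 1e21          # assumed total rentable compute on Earth, FLOP/s
FLOPS_PER_AGI = 1e15                # assumed inference cost of one AGI instance, FLOP/s
REVENUE_PER_AGI_YEAR = 100_000      # assumed revenue per AGI-year of remote work, $
COST_PER_AGI_YEAR = 20_000          # assumed compute rental cost per AGI-year, $

agis = 1_000                        # assumed starting fleet (e.g. one training run's worth)
for year in range(1, 11):
    profit = agis * (REVENUE_PER_AGI_YEAR - COST_PER_AGI_YEAR)
    # Reinvest all profit into renting more compute.
    agis += profit // COST_PER_AGI_YEAR
    # Growth stops once the world's rentable compute is saturated.
    agis = min(agis, int(WORLD_COMPUTE_FLOPS // FLOPS_PER_AGI))
    print(f"year {year}: {agis:,} AGI instances")
```

With these made-up numbers the fleet roughly quintuples each year and hits the world-compute ceiling within about five years; the specific figures don’t matter, only that per-instance revenue exceeding per-instance compute cost produces this runaway reinvestment.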
Next, the second part.
Yes, it’s possible that a government would be sufficiently paranoid about IP theft (or loss of control, or other things) that it doesn’t want to run its AGI code on random servers that it doesn’t own itself. (We should be so lucky!) It’s also possible that a company would make the same decision for the same reason. Yeah OK, that’s indeed a scenario where one might be interested in the question of what AGI population you get from one training run’s worth of compute. But that’s really only relevant if the government or company rapidly does a pivotal act, I think. Otherwise that’s just an interesting few-month period of containment before AGIs are on most of the world’s compute, as above.
“we found three existing attempts to estimate the initial AGI population”
FWIW Holden Karnofsky wrote a 2022 blog post “AI Could Defeat All Of Us Combined” that mentions the following: “once the first human-level AI system is created, whoever created it could use the same computing power it took to create it in order to run several hundred million copies for about a year each.” Brief justification in his footnote 5. Not sure that adds much to the post; it just popped into my head as a fourth example.
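The arithmetic behind that kind of estimate is just training compute divided by (per-copy inference rate × one year); here’s a sketch with illustrative numbers of my own, not necessarily the ones in Karnofsky’s footnote:

```python
# Back-of-envelope: how many copies can run for a year on the training compute?
# Both inputs are illustrative assumptions, not Karnofsky's actual figures.

TRAINING_FLOP = 1e31             # assumed total FLOP spent on the training run
INFERENCE_FLOPS_PER_COPY = 1e15  # assumed FLOP/s to run one human-level copy
SECONDS_PER_YEAR = 3.15e7

copy_years = TRAINING_FLOP / (INFERENCE_FLOPS_PER_COPY * SECONDS_PER_YEAR)
print(f"{copy_years:.1e} copy-years")  # ~3e8 with these numbers, i.e. a few hundred million
```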
~ ~ ~
For what it’s worth, my own opinion is that 1e14 FLOP/s is a better guess than 1e15 FLOP/s for human brain compute, and also that we should divide all the compute in the world including consumer PCs by 1e14 FLOP/s to guess (what I would call) “initial AGI population”, for all planning purposes apart from pivotal acts. But you’re obviously assuming that AGI will be an LLM, and I’m assuming that it won’t, so you should probably ignore my opinion. We’re talking about different things. Just thought I’d share anyway ¯\_(ツ)_/¯
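Concretely, that calculation is just one division; here’s a sketch where the world-compute total is a rough placeholder of mine, not a sourced estimate:

```python
# "Initial AGI population" under my preferred assumptions:
# divide all the world's compute (including consumer PCs) by per-brain compute.
# The world-compute total below is a rough placeholder, not a sourced estimate.

WORLD_COMPUTE_FLOPS = 1e21   # assumed: all compute on Earth, FLOP/s
BRAIN_FLOPS = 1e14           # my guess for human-brain-equivalent compute, FLOP/s

agi_population = WORLD_COMPUTE_FLOPS / BRAIN_FLOPS
print(f"~{agi_population:.0e} AGIs")  # ~1e7 with these numbers
```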