Have you considered the quantity of inference silicon required?
You buy out a large fraction of all of TSMC's annual production and get the training compute. You now have one AGI. Replacing the cognitive work of one human requires a certain amount of silicon per human replaced, and because the AGI model itself is large, it likely requires multiple nodes, possibly hundreds, each fully populated with 16 x <Nvidia's latest accelerator> at $25,000 each.
So if, say, we need 250 nodes, times 16 accelerators, times $25k, that's $100 million USD. Say it is cognitively as effective as 10 humans in the top 0.1% of intelligence. That's still not really changing anything unless it's malicious. At this cost, $10 million per human equivalent, it's only slightly more cost effective than training humans.
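A rough back-of-envelope sketch of that arithmetic (the node count, accelerator price, and the 10-human-equivalent figure are the assumptions above, not measured values):

```python
# Back-of-envelope cost of serving one AGI, using the assumed figures above.
nodes = 250                     # assumed number of inference nodes
accelerators_per_node = 16      # accelerators per node
price_per_accelerator = 25_000  # USD, assumed price of <Nvidia's latest accelerator>

hardware_cost = nodes * accelerators_per_node * price_per_accelerator
print(f"Inference hardware: ${hardware_cost:,.0f}")   # $100,000,000

human_equivalents = 10          # assumed cognitive output, in top-0.1% humans
cost_per_human_equivalent = hardware_cost / human_equivalents
print(f"Per human equivalent: ${cost_per_human_equivalent:,.0f}")  # $10,000,000
```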
At this point there would be a 'warning' period: very large sums of money would have to be spent to scale up silicon production, both to bring these costs down and to supply enough copies of the hardware to be cognitively equivalent to enough people to make a difference.
It is a positive feedback loop: each set of inference silicon pays for itself in value within 10-20 years. Moreover, once you have enough sets, you could make a knockoff design for the chip and stop paying Nvidia, paying only for the silicon itself, which is probably $2,000 instead of $25,000. That's an OOM more cost effective, and it only gets faster from here.
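A minimal sketch of that feedback loop, assuming an illustrative annual value per human equivalent (the $750k/year figure is my assumption, chosen so the payback lands inside the 10-20 year range stated above):

```python
# Payback period for one set of inference silicon, plus the effect of a
# knockoff chip design. All inputs are illustrative assumptions.
hardware_cost = 100_000_000        # USD, from the sketch above
human_equivalents = 10
annual_value_per_human = 750_000   # USD/year, assumed value of a top-0.1% worker

annual_value = human_equivalents * annual_value_per_human
payback_years = hardware_cost / annual_value
print(f"Payback: {payback_years:.1f} years")            # ~13 years

# Knockoff design: pay only for the silicon (~$2,000/chip instead of $25,000).
knockoff_cost = hardware_cost * 2_000 / 25_000
print(f"Knockoff hardware: ${knockoff_cost:,.0f}")      # $8,000,000
print(f"Cost reduction: {hardware_cost / knockoff_cost:.1f}x")  # 12.5x, ~1 OOM
```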
The model does take into account the cost of runtime compute and how that affects demand, but costs are mostly dominated by training compute, not runtime compute. In the model's default settings, it costs about 13 orders of magnitude more to train an AGI than to run one for 8 hours a day for 250 days (a typical human work year).
You can see and adjust those settings here: https://takeoffspeeds.com/
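As a sanity check on the "13 orders of magnitude" figure, here is a quick calculation with illustrative numbers; the training-FLOP and runtime-FLOP/s values below are my assumptions, roughly in the ballpark of the model's defaults rather than exact values from the site:

```python
import math

# Ratio of training compute to one human work-year of runtime compute.
# Both FLOP figures are illustrative assumptions, not the model's exact defaults.
training_flop = 1e36          # assumed total FLOP to train an AGI
runtime_flop_per_s = 1e16     # assumed FLOP/s to run the trained model

seconds_per_work_year = 8 * 3600 * 250    # 8 h/day, 250 days/year
runtime_flop = runtime_flop_per_s * seconds_per_work_year

ratio = training_flop / runtime_flop
print(f"Training / work-year runtime: {ratio:.2e} (~{math.log10(ratio):.0f} OOM)")
# ~1.4e13, i.e. about 13 orders of magnitude
```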
Note that the model mostly models the takeoff to AGI and how pre-human-level AI might affect things, especially the speed of that takeoff, rather than the effects of AGI itself.