Energy requirements are an issue locally, when you need to build a single large datacenter on short notice. With distributed training, you only need to care about a global energy budget. The world generates about 20,000 GW of power, and the H100s that 1% of it could power would cost trillions of dollars.
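A quick back-of-envelope check of that claim, where the per-GPU power draw (~700 W TDP) and unit price (~$30,000) are my own assumptions rather than figures from the comment:

```python
# Sketch: how many H100s could 1% of global power generation run,
# and what would that hardware cost? TDP and price are assumptions.
world_power_gw = 20_000      # global power generation, per the comment
budget_fraction = 0.01       # 1% of global power
h100_tdp_w = 700             # assumed per-GPU power draw (ignores cooling overhead)
h100_price_usd = 30_000      # assumed per-GPU price

budget_w = world_power_gw * 1e9 * budget_fraction
num_gpus = budget_w / h100_tdp_w
total_cost_usd = num_gpus * h100_price_usd

print(f"GPUs powered: {num_gpus:.2e}")                        # ~2.9e8 GPUs
print(f"Hardware cost: ${total_cost_usd / 1e12:.1f} trillion")  # ~$8.6 trillion
```

Under these assumptions the total comes to several trillion dollars, consistent with the comment's claim.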
I think the crux for the feasibility of further scaling (beyond $10-$50 billion) is whether systems at currently reasonable cost keep getting sufficiently more useful, for example by enabling economically valuable agentic behavior: things like preparing pull requests based on a feature/bug discussion on an issue tracker, or fixing failing builds.
Agreed and thanks for sharing that comment.
To clarify, my point was not that energy itself would be unavailable (which I tried to show by giving the percentage of US and Chinese energy usage it would represent) but, as you point out, the cost of that energy, and whether companies will seek out more energy-efficient forms of computing to reduce it. That is, if that amount of computing even becomes necessary for the capabilities you mentioned and more.