The Abilene site of Stargate will host 100K-128K chips in GB200 NVL72 racks by this summer, and a total of 400K-512K chips in 2026, based on a new post by Crusoe and a reinterpretation of the recent Bloomberg post in light of it. For 2025, it’s less than 200K chips[1], but more than the surprisingly low 16K-32K chips[2] that the Bloomberg post suggested. It could be a training system after all, but training a raw compute “GPT-5” (2e27 FLOPs) by the end of 2025 would require using FP8[3].
The Crusoe post says “initial phase, comprising two buildings at … 200+ megawatts” and “each building is designed to operate up to 50,000 NVIDIA GB200 NVL72s”. Dylan Patel’s estimate (at 1:24:42) of all-in datacenter power per Blackwell GPU was 2.0 kW (which must mean per chip, or else it’s way too much). At GTC 2025, Jensen Huang showed a slide (at 1:20:52) where the estimate is 2.3 kW per chip (100 MW per 85K dies, which is 42.5K chips).
So the “50K GB200 NVL72s” per building from the Mar 2025 Crusoe post can only mean the number of chips (not dies or superchips), and the “100K GPUs” per building from the Jul 2024 Crusoe post must’ve meant 100K compute dies (which is 50K chips). That puts it at 100-115 MW per building, or 800-920 MW for all 8 buildings in 2026, which is notably lower than the 1.2 GW the Mar 2025 Crusoe post cites.
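As a sanity check, here’s that arithmetic spelled out (a minimal sketch; the 2.0-2.3 kW all-in power per chip is the assumption, the rest follows from it):

```python
# Power estimate per building and for the full campus, assuming 50K GB200
# chips per building and 2.0-2.3 kW of all-in datacenter power per chip
# (the Patel and GTC 2025 figures cited above).
chips_per_building = 50_000
kw_per_chip = (2.0, 2.3)

mw_per_building = [chips_per_building * kw / 1_000 for kw in kw_per_chip]
gw_all_8 = [8 * mw / 1_000 for mw in mw_per_building]

print(f"per building: {mw_per_building[0]:.0f}-{mw_per_building[1]:.0f} MW")
print(f"all 8 buildings: {gw_all_8[0]:.2f}-{gw_all_8[1]:.2f} GW")
# per building: 100-115 MW
# all 8 buildings: 0.80-0.92 GW
```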
How can Bloomberg’s 16K “GB200 semiconductors” in 2025 and 64K in 2026 be squared with this? The Mar 2025 Crusoe post says there are 2 buildings now and 6 additional buildings in 2026, for a total of 8, so in 2026 the campus grows 4x, which fits 16K vs. 64K from Bloomberg. But the numbers themselves must be counting in units of 8 chips. This fits counting in units of GB200 NVL8 (see at 1:13:39), which can be referred to as a “superchip”. The Mar 2025 Crusoe post says the Abilene site will be using NVL72 racks, so counting in NVL8 is wrong, but someone must’ve made that mistake on the way to the Bloomberg post.
Interpreting the Bloomberg numbers in units of 8 chips, we get 128K chips in 2025 (64K chips per building) and 512K chips in 2026 (about 7K GB200 NVL72 racks). This translates to 256-300 MW for the current 2 buildings and 1.0-1.2 GW for the 8 buildings in 2026. This fits the 1.2 GW figure from the Mar 2025 Crusoe post better, so there might be some truth to the Bloomberg post after all, even as it’s been delivered in a thoroughly misleading way.
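Here’s the same reinterpretation as a minimal sketch, assuming the Bloomberg figures count 8-chip units (GB200 NVL8) and reusing the 2.0-2.3 kW per chip range from above:

```python
# Reinterpreting Bloomberg's counts as units of 8 chips, then deriving
# rack counts and power with the 2.0-2.3 kW per chip assumption.
bloomberg_2025, bloomberg_2026 = 16_000, 64_000   # reported "GB200 semiconductors"
chips_2025 = bloomberg_2025 * 8                   # 128K chips across 2 buildings
chips_2026 = bloomberg_2026 * 8                   # 512K chips across 8 buildings

nvl72_racks_2026 = chips_2026 / 72                # ~7.1K GB200 NVL72 racks
mw_2025 = [chips_2025 * kw / 1_000 for kw in (2.0, 2.3)]   # ~256-294 MW
gw_2026 = [chips_2026 * kw / 1e6 for kw in (2.0, 2.3)]     # ~1.0-1.2 GW

print(chips_2025, chips_2026, round(nvl72_racks_2026))
print(f"{mw_2025[0]:.0f}-{mw_2025[1]:.0f} MW, {gw_2026[0]:.1f}-{gw_2026[1]:.1f} GW")
```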
[1] Crusoe’s Jul 2024 post explicitly said “each data center building will be able to operate up to 100,000 GPUs”, and in 2024 “GPU” usually meant chip/package (in 2025, it’s starting to mean “compute die”, see at 1:28:04; there are 2 compute dies per chip in GB200 systems). This suggested 200K chips for the initial 2 buildings.
[2] The Bloomberg post said it’s the number of “coveted GB200 semiconductors”, which is highly ambiguous because of the die/chip/superchip counting issue. A “GB200 superchip” means 2 chips (plus a CPU) by default, so 16K superchips would be 32K chips.
[3] A GB200 chip (not die or superchip) produces 2.5e15 dense BF16 FLOP/s (2.5x more than an H100 chip). Training at 40% utilization for 3 months, 100K chips produce 8e26 FLOPs, and in FP8 it’s 1.6e27 FLOPs. Assuming GPT-4 was 2e25 FLOPs, 100x its raw compute puts “GPT-5” at about 2e27 FLOPs. In OpenAI’s introductory video about GPT-4.5, there was a hint it might’ve been trained in FP8 (at 7:38), so it’s not implausible that GPT-5 would be trained in FP8 as well.
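And the arithmetic behind footnote [3], as a small sketch (the 40% utilization, 3-month duration, and 2x FP8 speedup over BF16 are the assumptions):

```python
# Training compute for 100K GB200 chips at 2.5e15 dense BF16 FLOP/s,
# 40% utilization, ~3 months, with FP8 assumed to double throughput.
chips = 100_000
bf16_flops_per_chip = 2.5e15      # dense BF16 FLOP/s per GB200 chip
utilization = 0.4
seconds = 3 * 30 * 24 * 3600      # ~3 months

bf16_total = chips * bf16_flops_per_chip * utilization * seconds  # ~7.8e26 FLOPs
fp8_total = 2 * bf16_total                                        # ~1.6e27 FLOPs
gpt4_flops = 2e25
print(f"BF16: {bf16_total:.1e}, FP8: {fp8_total:.1e}, "
      f"multiple of GPT-4: {fp8_total / gpt4_flops:.0f}x")
# BF16: 7.8e+26, FP8: 1.6e+27, multiple of GPT-4: 78x
```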