The current scaling speed is driven by increasing funding for training projects, which isn't sustainable without continued success. Without that funding growth, the pace falls back to the much slower FLOP/dollar trend of improving compute cost efficiency, i.e. building better AI accelerators. The 2 + 4 + 8 years framing might describe a gradual increase in funding, but there are still 2 OOMs of training compute beyond the original GPT-4 already baked into the scale of the datacenters now being built, and these haven't yet produced deployed models. We'll only observe their effect in full by late 2026, so current capabilities don't yet reflect the level we'll reach before a possible scaling slowdown.
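To make the gap between the two regimes concrete, here's a quick back-of-the-envelope sketch. All the numbers in it are my own rough illustrative assumptions (the GPT-4 compute figure, the funding-driven growth rate, and the FLOP/dollar improvement rate), not figures from the argument above:

```python
import math

# Back-of-the-envelope comparison of the two scaling regimes.
# All constants below are illustrative assumptions, not claims.

GPT4_FLOP = 2e25        # assumed original GPT-4 training compute
BAKED_IN_OOMS = 2       # the ~2 OOMs already committed in datacenter buildout

# Regime 1: funding-driven scaling (assume ~0.5 OOM/year of training compute growth)
funding_ooms_per_year = 0.5

# Regime 2: FLOP/dollar trend only (assume price-performance doubles ~every 2.5 years)
flop_per_dollar_ooms_per_year = math.log10(2) / 2.5   # ~0.12 OOM/year

target_flop = GPT4_FLOP * 10**BAKED_IN_OOMS

for label, rate in [("funding-driven", funding_ooms_per_year),
                    ("FLOP/dollar only", flop_per_dollar_ooms_per_year)]:
    years = BAKED_IN_OOMS / rate
    print(f"{label:>16}: {rate:.2f} OOM/yr -> ~{years:.1f} years per 2 OOMs "
          f"(to ~{target_flop:.0e} FLOP)")
```

Under these assumed rates, 2 OOMs take roughly 4 years when funding keeps growing, versus closer to 15-20 years on the compute cost-efficiency trend alone, which is the size of the slowdown being pointed at.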