What are we trying to model here or find examples of?
Here’s what I think we’re trying to model: if a technology were isolated and, for whatever reason, its development was stopped, then during the ‘stopped’ period very little effort is put into it.
After the ‘stopped’ period ends, development resumes, and presumably progress is proportional to effort, with an unavoidable serial part of the process (as Amdahl’s law and Gantt charts show) restricting the rate at which progress can be made.
For US Navy tonnage: without the Washington Naval Treaty, the Great Depression, and a policy of isolationism, the US Navy would presumably have built warships at a steady rate. It did not, as shown in your data.
However, during this prewar period, other processes continued. Multiple countries continuously improved aircraft designs: better aerodynamics (biplane to monoplane), carrier launching and landing, ever larger and more powerful engines, dive and torpedo bombing, and other innovations.
So even though very few ships were being built, aircraft were being improved. Then Pearl Harbor, and unpause: all-out effort, which shows in the data you linked.
But we don’t have to trust it; all that really matters is the aircraft carrier numbers, nothing else. As it turned out, the carrier was a hard counter to everything, even other carriers; the other ships in a carrier battle group are there to hunt submarines, supplement the carriers’ antiaircraft fire, and resupply the carriers. While there were direct gun battles late in the Pacific theater, better admirals could probably have avoided every such battle and simply sunk all the enemy ships with aircraft. Shooting down enemy aircraft, it turned out, was also far easier to do with aircraft.
So only the left column matters for the model, and you also need the zero point. There were seven fleet aircraft carriers and one escort carrier at t=0, the beginning of WW2.
If we count an escort carrier as 30% of a fleet carrier, and 27 new fleet carriers were built, then the total of “weighted” carriers went from 7.3 (prewar) to 61.2 (weighted, 1945).
An 8.38x increase in the most relevant parameter.
In the data you don’t trust, 1940 tonnage is 1,956,867 and 1945 tonnage is 11,267,550: an increase of 5.76x.
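A quick sanity check of both ratios, using only the figures quoted above (the 1945 weighted total is taken as stated rather than rebuilt from individual hull counts):

```python
# Back-of-the-envelope check of the two growth ratios quoted above.
fleet_1941, escort_1941 = 7, 1           # carriers at t=0
ESCORT_WEIGHT = 0.3                      # an escort carrier counts as 0.3 of a fleet carrier

weighted_1941 = fleet_1941 + ESCORT_WEIGHT * escort_1941   # 7.3
weighted_1945 = 61.2                     # weighted total quoted for 1945
print(weighted_1945 / weighted_1941)     # ~8.38x

tonnage_1940, tonnage_1945 = 1_956_867, 11_267_550
print(tonnage_1945 / tonnage_1940)       # ~5.76x
```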
In terms of “relevant” tonnage, aircraft carriers are obviously all that matters, given their role as a hard counter with 1940s-era technology.
From the perspective of Japanese admirals, over four years of war the carrier force they faced grew to roughly eight times its prewar size. This is the issue: their battle plans could not scale to handle that. Had the Americans been building carriers at a steadier rate the entire time, the Japanese would never have attacked. The rate of increase turned out to be a strategic surprise that doomed the Japanese side of the war.
Predictions for AI:
It’s frankly hard to imagine a world where an AI pause is actually negotiated. Remember, if any major party says no, nobody can afford to pause. But say it happens:
During the pause, there would be some FLOP limit on model scale, above which training would likely require prohibitively expensive precautions.
During the pause period, people would be experimenting with below-threshold ML algorithms. They might build app stores where sub-AGI algorithms can be easily licensed and combined into integrated systems, where “ML microservices” give an integrated system some of the benefits of a real AGI. Probably a network* of ML microservices could be built that can perceive, interpret the environment, compare the current state to a goal state, consider many possible actions using an environment/robotics sim offered as a service, and then choose the best action. For some purposes, like factory work, it might not even be much less effective than a “real”, integrated AGI model.
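Here is a minimal sketch of what one control step of that microservice network could look like; every service name and interface below is hypothetical, not an existing product:

```python
# Minimal sketch of the "network of ML microservices" idea: several separately
# trained, static sub-AGI models wired into a perceive -> interpret -> simulate
# -> act loop. All service names and interfaces are hypothetical.

def control_step(camera_frames, goal_state, services):
    percept = services["perception"].run(camera_frames)           # detect objects, poses
    world_state = services["interpretation"].run(percept)          # build a scene model
    error = services["comparison"].run(world_state, goal_state)    # distance from goal

    candidate_actions = services["planner"].propose(world_state, error)
    # Score each candidate by rolling it forward in a hosted robotics sim.
    scored = [
        (services["sim"].rollout(world_state, action, goal_state), action)
        for action in candidate_actions
    ]
    best_score, best_action = max(scored, key=lambda pair: pair[0])
    return best_action
```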
The app stores would also likely contain key innovations that current AI stacks are missing: software intercompatibility frameworks (anyone can drive any robot with any model), cloud-hosted realistic evaluation environments (things like realistic robotic environment sims), composable robotics backends, formally proven stack components (so robotics using new AI models can be certified as life-safety), and likely cloud-hosted AI-improving-AI services (where you can pay for improvements to an AI stack via a cloud-hosted service; this is possible even with sub-AGI RL models).
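As one example of the intercompatibility idea (“anyone can drive any robot with any model”), a hedged sketch of what a shared interface layer might look like; the class and method names are made up for illustration:

```python
# Hypothetical intercompatibility layer: any policy that implements PolicyModel
# can drive any robot that implements RobotBackend, regardless of vendor.
from abc import ABC, abstractmethod

class PolicyModel(ABC):
    @abstractmethod
    def act(self, observation: dict) -> dict:
        """Map a standardized observation dict to a standardized command dict."""

class RobotBackend(ABC):
    @abstractmethod
    def observe(self) -> dict: ...
    @abstractmethod
    def execute(self, command: dict) -> None: ...

def drive(robot: RobotBackend, policy: PolicyModel, steps: int = 1000) -> None:
    # The only coupling between model and robot is the shared observation/command schema.
    for _ in range(steps):
        robot.execute(policy.act(robot.observe()))
```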
Note that the things I just mentioned do not exist today; everyone working on autonomous cars has rolled their own duplicate version, with only a little sharing via platforms like ROS.
During the pause period, Moore’s law continues to whatever extent it is able (physics is obviously an ultimate limit), but the experience curve would continue to make transistors cheaper even after they cannot be made smaller. AI models are close to embarrassingly parallel, so they benefit roughly linearly from the number of transistors.
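To make the experience-curve point concrete, a toy Wright’s-law calculation; the 20% cost decline per doubling of cumulative volume is an assumed figure for illustration, not a claim from this comment:

```python
# Illustrative Wright's-law experience curve for transistor cost.
import math

LEARNING_RATE = 0.20                      # assumed cost drop per doubling of cumulative volume
b = -math.log2(1 - LEARNING_RATE)         # Wright's-law exponent

def unit_cost(cumulative_units: float, initial_cost: float = 1.0) -> float:
    return initial_cost * cumulative_units ** -b

for doublings in range(6):
    n = 2 ** doublings
    print(f"{n:>3}x cumulative volume -> relative cost {unit_cost(n):.2f}")
# Cost keeps falling with cumulative volume even if feature size stops shrinking.
```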
During the pause period, ASIC architectures for AI chips are developed. In previous examples like bitcoin mining, ASICs have proven to be substantially faster and more power efficient for the same quantity of transistors.
So after the pause period, all these innovations hit at once, and the rate of AI progress becomes very high, potentially uncontrollable. You can see why: all the obstacles to letting true AGI models immediately control robots have been smoothed away, AGI models can be used to improve each other even if a different company owns the model (that cloud-hosted AI-improving service), and the realistic evaluation environments allow new AGI models to be quickly evaluated and improved. The compute is also faster, and the pause was likely dropped due to a war, so the new AI effort is probably flat out.
So I’m convinced. With a gears-level model, I think the overhang idea has a very high probability of happening if a pause occurred. It’s just super rare historically for there to be any pause at all, and there probably will not be one for AI.
One bit of possible confusion: say there was a 5-year pause between 2030 and 2035. Am I saying that the amount of AI progress in 2040 is the same as in a 2040 with no pause? No. Obviously the pause slowed down progress; the fastest progress comes from people working on AI at full speed from 2023 to 2040. But the pause does create a discontinuity: 2035 after the pause is a more interesting year than any year of the “no pause” 2030-2035 period.
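A toy version of that model, just to make the discontinuity concrete; the rates and the post-pause cap are assumptions chosen for illustration, not claims from the comment:

```python
# Progress is proportional to effort, and an Amdahl-style serial bottleneck caps
# how fast the post-pause catch-up can run. All rates here are illustrative.
def total_progress(start: int, end: int, pause=None,
                   base_rate=1.0, pause_rate=0.1, catchup_cap=1.5) -> float:
    total = 0.0
    for year in range(start, end):
        if pause and pause[0] <= year < pause[1]:
            total += pause_rate        # during the pause: very little direct effort
        elif pause and year >= pause[1]:
            total += catchup_cap       # after unpause: all-out, but serially limited
        else:
            total += base_rate
    return total

print(total_progress(2023, 2040))                      # no pause: 17.0 units by 2040
print(total_progress(2023, 2040, pause=(2030, 2035)))  # with pause: 15.0 (less total),
# yet each post-2035 year moves faster than any single no-pause year: the overhang.
```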
*The reason a network of separate models isn’t the same as an AGI is that each model gets trained separately, and it would be illegal, under the pause rules, to train the models on information gathered once they were assembled into a system. Each model has to remain static, validated on separate benches. And the models must operate in a human-interpretable way: humans have to be able to clearly understand both the input and the output. This is how current autonomous car stacks, except comma.ai’s, already work. Assuming each model can be up to 10^26 FLOPs, that’s quite a bit of headroom: probably easily enough to do a lot of “AGI”-like things using a network of 10-100 subsystems, each under 10^26. The human brain works this way.
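For completeness, a hedged sketch of the footnote’s constraints expressed as a compliance check; the field and function names are hypothetical:

```python
# Each subsystem: static once validated, benched separately, under the per-model
# training-compute cap, with human-interpretable input/output descriptions.
from dataclasses import dataclass

FLOP_CAP = 1e26   # per-model training compute limit from the footnote

@dataclass(frozen=True)          # frozen: the subsystem stays static after validation
class Subsystem:
    name: str
    training_flops: float
    validated_on_bench: bool
    io_schema: dict              # human-readable description of inputs and outputs

def network_is_compliant(subsystems: list[Subsystem]) -> bool:
    return all(
        s.training_flops < FLOP_CAP and s.validated_on_bench and bool(s.io_schema)
        for s in subsystems
    )
```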
>But we don’t have to trust it; all that really matters is the aircraft carrier numbers, nothing else. As it turned out, the carrier was a hard counter to everything, even other carriers; the other ships in a carrier battle group are there to hunt submarines, supplement the carriers’ antiaircraft fire, and resupply the carriers.
This is not true. Carriers were powerful, yes, but also vulnerable. I point to the loss of Glorious and the battle off Samar as cases where carriers ended up under the guns of battleships and it didn’t go great, and frankly Samar should have been so much worse than it was. And sinking ships with airplanes is quite difficult. The total number of battleships sunk at sea by carrier planes? Two, and in both cases it took a lot of planes.
More broadly, the growth of the American carrier fleet was because there was a war on, and any time there was a US-Japan war, the US was going to be building a lot of ships. There were a lot of carriers because carriers had reached the point of being genuinely useful (and could be built reasonably quickly, the main reason the battleship program was curtailed) but it wasn’t like the USN would have had 30+ fleet carriers in 1945 in a world with neither the treaties nor the war.
The intricacies of the tradeoffs between WW2 ship classes could be argued, and have been argued, for decades in books.
You’re correct that you can construct scenarios where the carrier doesn’t win, and in the confusion of WW2 sensors and communications those scenarios occasionally happened.
You’re correct that aerial weapons at the time were less effective against battleships.
I don’t think these exceptions change the basic idea that the chance of winning the Pacific theater fleet battles was proportional to the number and effectiveness of the carrier-launched aircraft you could field. So the total combat power of the USN in WW2 is mostly proportional to the carrier count, and its rate of increase is exactly the overhang example the post asked for.
Note also that overhang does not mean catch-up. The timeline with an artificial pause always has less total potential progress than the normal timeline.