Do you have any basis for the 1e6 estimate? Assuming 25,000 GPUs were used to train GPT-4, when I do the math on Nvidia's annual volume I get about 1e6 per year of the data-center GPUs that matter.
The reason you cannot use gaming GPUs has to do with the large size of the activations: you need high internode bandwidth between the machines or you get negligible performance.
So 40 times (1e6 / 25k). Say it didn't take 25k but 2.5k: 400 times. Nowhere close to 1e6.
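Spelled out, using only the figures already in this thread (the ~1e6 annual estimate and the 25k/2.5k cluster sizes); a rough sketch, not real shipment data:

```python
# Ratio arithmetic for the figures above (taking the thread's numbers as given).
annual_datacenter_gpus = 1e6   # estimated annual volume of data-center GPUs "that matter"
gpt4_scale_cluster = 25_000    # assumed GPUs used for the GPT-4-scale training run
smaller_cluster = 2_500        # the more generous assumption from the same comment

print(annual_datacenter_gpus / gpt4_scale_cluster)   # 40.0  -> "40 times"
print(annual_datacenter_gpus / smaller_cluster)      # 400.0 -> "400 times"
```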
Distributed networks spend most of their time idle waiting on activations to transfer. It could be a 1000x performance loss or more, making every gaming GPU in the world (they are made at about 60 times the rate of data-center GPUs) not matter at all.
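A back-of-the-envelope sketch of why the gaming GPUs stop mattering under these assumptions; the 60x production rate and the 1000x slowdown are the figures claimed above, and treating a gaming GPU as equal to a data-center GPU in raw throughput is a deliberate simplification in its favor:

```python
# Effective-compute sketch: ~60x more gaming GPUs produced, but each delivering only
# ~1/1000 of its raw throughput because the cluster idles waiting on activation transfers.
gaming_to_datacenter_production_ratio = 60   # claimed relative production rate
distributed_slowdown = 1000                  # claimed performance loss over consumer links

# Simplifying assumption: one gaming GPU ~ one data-center GPU in raw FLOPs.
effective_contribution = gaming_to_datacenter_production_ratio / distributed_slowdown
print(effective_contribution)  # 0.06 -> all gaming GPUs together add roughly 6% of one
                               # year's data-center output, i.e. effectively negligible
```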
Orders of what? You said billions of dollars; I assume you had some idea of what that buys.
Out-trading empties the order books of exploitable gradients, so this saturates.
That's what this argument is about: I am saying the growth doubling time is months to years per doubling, so it takes a couple of decades to matter. It's still "fast" (and it gets crazy near the end), but it's not an explosion, and there are many years where the AGI is too weak to openly turn against humans. So it has to pretend to cooperate, and if humans refuse to trust it and build systems that can't defect at all because they lack context (they have no way to know if they are in the training set), humans can survive.
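To make the timescale concrete: with a months-to-years doubling time, even a large required growth factor works out to years or decades rather than an overnight explosion. The required growth factor below is an illustrative assumption, not a figure from this discussion:

```python
import math

# How long a given number of doublings takes at various doubling times (illustrative).
required_growth = 1e4                    # assumed factor capability must grow by to "matter"
doublings = math.log2(required_growth)   # ~13.3 doublings

for doubling_time_years in (0.5, 1.0, 2.0):  # "months to years per doubling"
    print(f"{doubling_time_years} yr/doubling -> ~{doublings * doubling_time_years:.1f} years")
# 0.5 -> ~6.6 years, 1.0 -> ~13.3 years, 2.0 -> ~26.6 years: a slow ramp, not an explosion
```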
I agree that this is one of the ways AGI could beat us, given the evidence of large amounts of human stupidity in some scenarios.