I agree with your cruxes:

We invent a way for AGIs to learn faster than humans: 40%
AGI inference costs drop below $25/hr (per human equivalent): 16%
We invent and scale cheap, quality robots: 60%
We massively scale production of chips and power: 46%
We avoid derailment by human regulation: 70%
We avoid derailment by AI-caused delay: 90%
We avoid derailment from wars (e.g., China invades Taiwan): 70%
We avoid derailment from pandemics: 90%
We avoid derailment from severe depressions: 95%
I’m going to try a better calculation on what I think are the pivotal cruxes.
The algorithms crux, as I mentioned in my last comment, depends on compute. If enough training compute exists by 2043, then it is inevitable that a tAGI-grade algorithm will be found, simply through RSI (recursive self-improvement).
Learning faster than humans is just "algorithms". Ted Sanders, you stated that autonomous cars are not yet as good as humans because they "take time to learn". This is completely false: the actual reason is that the current algorithms in use, and especially the cohesive software, hardware, and server systems around the core driving algorithms, have bugs. Suppose we invent better algorithms that generalize better and train on more examples, and then use the same or a different tAGI system to seal up the software, firmware, and system-design stack by hunting down every bug that a careful review of every line (in assembly!) can find, and to seal up the chip design so that it is clean with no known errata. These are all things humans are already capable of doing; we just can't afford the labor hours. At that point the actual Driver agent can learn its job in a few months.
So compute’s a big part of the low probability.
Let’s calculate this one fresh.
Epistemic status: I have a master's in biomedical science, which included a medical-school neuroscience course, a bachelor's in computer engineering, and a master's in CS, and I work as a platform engineer on an AI accelerator project.
Doing it from the bio-anchors view: there are approximately 86 billion neurons in the human brain, with approximately 1,000 synaptic connections each. Noise is extremely high, so we'll be generous and assume 8 bits of precision, which is likely more than the brain actually achieves. We'll also be generous and assume a 1 kHz rate with all connections active.
Both assumptions are false, and the brain's asynchronous computation and noise cause a lot of error that a comparable synthetic system will not make. That is why you could safely knock 1-2 orders of magnitude right off the top, and it is well justified: timing jitter alone, from the lack of a central clock, injects noise into every synapse, and synapses are analog accumulators whose charge levels drift from crosstalk and many other forms of biological noise (digital accumulators reject such noise as long as it is insufficient to cause a bit flip). I am not taking that discount here.
So we need 8.6e16, call it 10^17, int8 operations per second. I'll have to check the essay to see where you went wrong, but 10^20 is clearly not justified, and arguably flat-out wrong.
With the current-generation H100 at roughly 2,000 dense int8 TOPS, you would need 43 H100s' worth of compute.
Compute isn't the issue; VRAM is. 43 H100s have only 3,440 GB of VRAM (80 GB each).
We need 16-bit accumulators on 8-bit weights, and there's VRAM overhead from the very high sparseness of the human brain's connectivity. So for a quick estimate, if we budget 8 bytes per synapse, we need 688 terabytes of VRAM, which at 80 GB per card is roughly 8,600 H100s.
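A quick back-of-the-envelope sketch of the arithmetic above (the H100 specs and the 8-bytes-per-synapse budget are the assumptions stated in the text, not measured values):

```python
# Back-of-the-envelope brain-emulation estimate (assumptions from the text above).
NEURONS = 86e9             # ~86 billion neurons
SYNAPSES_PER_NEURON = 1e3  # ~1,000 connections each
RATE_HZ = 1e3              # generous 1 kHz, all connections active
BYTES_PER_SYNAPSE = 8      # 8-bit weight + 16-bit accumulator + sparse-index overhead (assumed)

H100_INT8_OPS = 2e15       # ~2,000 dense int8 TOPS per H100
H100_VRAM_BYTES = 80e9     # 80 GB per H100

synapses = NEURONS * SYNAPSES_PER_NEURON             # 8.6e13 synapses
ops_per_sec = synapses * RATE_HZ                     # 8.6e16 ~ 1e17 int8 ops/s
compute_bound_cards = ops_per_sec / H100_INT8_OPS    # ~43 H100s by compute
vram_needed = synapses * BYTES_PER_SYNAPSE           # ~688 TB
memory_bound_cards = vram_needed / H100_VRAM_BYTES   # ~8,600 H100s by VRAM

print(f"ops/s: {ops_per_sec:.1e}")
print(f"H100s (compute-bound): {compute_bound_cards:.0f}")
print(f"VRAM needed: {vram_needed / 1e12:.0f} TB")
print(f"H100s (memory-bound): {memory_bound_cards:.0f}")
```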
So the present-day estimate is somewhere between 43 H100s (compute-bound) and roughly 8,600 (memory-bound). I think this is a pretty hard estimate; you're going to need to produce substantial evidence to overturn it. The reason it's so firm is that you probably need 1/10 or less of the above: the brain makes so many errors and randomly loses cells that, to stay alive, we carry many redundant synapses containing the same information. An efficient tAGI design can obviously guarantee that each weight is factored into every timestep and is always available, and ECC memory is common, so undetected errors will rarely happen.
During inference, how many H100s equal "one human equivalent"? The answer may surprise you: about 18.4. The reason is that even the hardest-working humans, working "996" (12 hours a day, 6 days a week), are still only working about 43% of the time, while H100s work 24/7.
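For concreteness, here is the duty-cycle arithmetic behind the 18.4 figure (a sketch; the 996 schedule and the 43-H100 compute figure are the ones stated above):

```python
# H100-equivalents needed to match one hard-working human's weekly output.
CARDS_PER_SESSION = 43         # compute-bound H100s per human-level session (from above)
HUMAN_HOURS_PER_WEEK = 12 * 6  # "996": 12 hours/day, 6 days/week
HOURS_PER_WEEK = 24 * 7

duty_cycle = HUMAN_HOURS_PER_WEEK / HOURS_PER_WEEK  # ~0.43
h100_equivalents = CARDS_PER_SESSION * duty_cycle   # ~18.4
print(f"duty cycle: {duty_cycle:.0%}, H100-equivalents per 996 human: {h100_equivalents:.1f}")
```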
The reason you don't need to pay for the full memory-bound card count per worker is that you batch the operations. You run many "human-level" sessions across the array, with one copy of the model's weights tiled across all the cards, since the compute load per session is only about 43 H100s' worth; an ~8,600-card array could therefore serve on the order of 200 concurrent sessions. You can reduce the batch size if you want better latency; we do that for autonomous cars.
H100s now rent for about $2.33 an hour.
So present day, the cost is $100.19 an hour (43 H100s × $2.33/hr).
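A minimal sketch of the cost and batching arithmetic, assuming the $2.33/hr rental price and the compute- and memory-bound card counts estimated above:

```python
# Present-day cost per "human-equivalent" session, assuming full batching.
H100_RENTAL_PER_HOUR = 2.33   # USD/hr (rental price assumed above)
CARDS_PER_SESSION = 43        # compute-bound H100s per session
MEMORY_BOUND_CARDS = 8_600    # cards needed to hold the weights (from the VRAM estimate)

concurrent_sessions = MEMORY_BOUND_CARDS / CARDS_PER_SESSION      # ~200 sessions share the weights
cost_per_session_hour = CARDS_PER_SESSION * H100_RENTAL_PER_HOUR  # ~$100.19/hr
print(f"concurrent sessions: {concurrent_sessions:.0f}")
print(f"cost per human-equivalent hour: ${cost_per_session_hour:.2f}")
print(f"reduction needed to hit $25/hr: {cost_per_session_hour / 25:.1f}x")
```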
I thought I was going to have to argue for more than the 38x hardware improvement you estimated, point out that each H100 actually costs about $2,500 to build rather than $25k, or describe a method that is essentially the brain's System 1 and System 2, but there's no need: compute isn't a problem. I'm sure you can see that getting a factor-of-4 reduction (from ~$100/hr to your $25/hr threshold) is fairly trivial.
Unless I made a major error, I think you may need a rewrite or retraction, because you're way off. I look forward to hearing your response. Ted, how do you not know this? You can just ping a colleague and check the math above; anyone at OpenAI who does infrastructure modeling will be able to check whether it's correct.
Remaining terms: robotics is a big open question. I will simply accept your 60%, though arguably, in a world where the tAGI algorithms exist and are cheap to run, there would be total-war levels of effort put into robot manufacturing. Not to mention that it self-amplifies: previously built robots contribute their labor to some of the tasks involved in building more robots.
Massively scaling chips and power at <$25 per human equivalent would happen inherently: being able to get a reliable human-equivalent worker for this low a cost creates immense economic pressure to do it. It's also coupled to cheap robotics. Repetitive tasks like manufacturing solar panels and batteries, both fundamentally products that you 'tile' (every solar cell and every battery alike its peers), are extremely automatable. Data centers could be air-cooled and installed in large deserts; you would just accept the latency, use local hardware for real-time control, and accept a less-than-100% level of service, so compute availability would fluctuate with the seasons.
All the other terms (human regulation, AI-caused delay, wars, pandemics, economic depression) have the issue that derailment requires all the relevant powers to fail the same way. So, conservatively, each derailment probability should be raised to the 3rd power; equivalently, each "avoid derailment" estimate p becomes 1 - (1 - p)^3.
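As a quick illustration of that adjustment, applied to the avoid-derailment estimates quoted at the top (a sketch; it assumes exactly three independent powers, the conservative count named above):

```python
# Adjust "avoid derailment" probabilities when derailment requires all
# relevant powers (assumed: 3, independent) to fail the same way.
estimates = {
    "human regulation": 0.70,
    "AI-caused delay": 0.90,
    "wars": 0.70,
    "pandemics": 0.90,
    "severe depressions": 0.95,
}
POWERS = 3

for name, p_avoid in estimates.items():
    p_derail_one = 1 - p_avoid                # one power fails and causes derailment
    p_avoid_all = 1 - p_derail_one ** POWERS  # all powers must fail the same way
    print(f"{name}: {p_avoid:.0%} -> {p_avoid_all:.2%}")
```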
Conclusion: if my calculation of the required compute holds, this problem collapses and only the tAGI algorithm matters. In worlds where the tAGI algorithm is discovered at least 5 years before 2043, tAGI is close to certain. Only extreme events (nuclear war, every tAGI going rampant in insidious and dangerous ways, coordinated worldwide efforts to restrict tAGI development) could derail it from there.
As for the broader picture: the issue is your "90% of tasks" requirement, below which the machine apparently does not count as transformative. That seems like a rather bad goalpost.
A reasonable tAGI algorithm is a multimodal algorithm, trained with a mixture of supervised and reinforcement learning, that is capable of learning any task to expert level if, and only if, it receives sufficient feedback on task quality. So there are AI scientists and AI programmers and AI grammar editors and many forms of AI doing manufacturing.
But can they write books that outsell all human authors? Depends on how much structured data they get on human enjoyment of text samples.
Can they write movies that are better than human screenwriters? Same limitation.
Can they go door to door and evangelize? Depends on how much humans discriminate against robots.
There are a whole bunch of tasks that arguably do not matter but would fit into that 90%. They don't matter because the class of tasks that does matter (self-improving AI, robots, and technology) is entirely solvable. "Matter" here means "did humans need the task done at all to live?" or "does the task need to be done before robots can tear down the solar system?". If the answer to both is no, it doesn't matter. Car salesman is an example of a task that doesn't matter: tAGI never needs to know how to do it, since it can just offer fixed prices through a web page and answer any questions about the car or the transaction from published information, instead of acting in a way motivated to maximize commission.
A world where the machine can do only 50% of tasks, including all the important ones, but where legal regulations mean it can't watch children, argue a case in court, supervise other AIs, or serve in government, is still a transformed one.
Ted Sanders, you stated that autonomous cars are not yet as good as humans because they "take time to learn". This is completely false: the actual reason is that the current algorithms in use, and especially the cohesive software, hardware, and server systems around the core driving algorithms, have bugs.
I guess it depends what you mean by bugs? Kind of a bummer for Waymo if 14 years and billions of dollars invested were only needed because they couldn't find the bugs in their software stack.
If bugs are the reason self-driving is taking so long, then our essay is wildly off.
So present day, the cost is $100.19 an hour.
Yes, if with present day hardware we can effectively emulate a human brain for $100/hr, then our essay is wildly off.