I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don’t understand why energy density matters very much. The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (just using the energy output of a single power plant under your model would produce something that would likely be easily capable of disempowering humanity).
More broadly, you list these three “assumptions” of Eliezer’s worldview:
The brain inefficiency assumption: The human brain is inefficient in multiple dimensions/ways/metrics that translate into intelligence per dollar; inefficient as a hardware platform in key metrics such as thermodynamic efficiency.
The mind inefficiency or human incompetence assumption: In terms of software he describes the brain as an inefficient complex “kludgy mess of spaghetti-code”. He derived these insights from the influential evolved modularity hypothesis as popularized in ev psych by Tooby and Cosmides. He pooh-poohed neural networks, and in fact actively bet against them through his actions: hiring researchers trained in abstract math/philosophy, ignoring neuroscience and early DL, etc.
The more room at the bottom assumption: Naturally dovetailing with points 1 and 2, EY confidently predicts there is enormous room for further hardware improvement, especially through strong drexlerian nanotech.
None of these strike me as “assumptions” (and also point 3 is just the same as point 1 as far as I can tell, and point 2 mischaracterizes at least my beliefs, and I would bet also would not fit historical data, but that’s a separate conversation).
Having more room at the bottom is just one of a long list of ways to end up with AIs much smarter than humans. Maybe you have rebuttals to all the other ways AIs could end up much smarter than humans (like just using huge datacenters, or doing genetic engineering, or being able to operate at much faster clock speeds), in which case I am quite curious about that, but I would definitely not frame these as “necessary assumptions for a foom-like scenario”.
I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
I’m going to expand on this.
Jacob’s conclusion to the speed section of his post on brain efficiency is this:
The brain is a million times slower than digital computers, but its slow speed is probably efficient for its given energy budget, as it allows for a full utilization of an enormous memory capacity and memory bandwidth. As a consequence of being very slow, brains are enormously circuit cycle efficient. Thus even some hypothetical superintelligence, running on non-exotic hardware, will not be able to think much faster than an artificial brain running on equivalent hardware at the same clock rate.
Let’s accept all Jacob’s analysis about the tradeoffs of clock speed, memory capacity and bandwidth.
The force of his conclusion depends on the superintelligence “running on equivalent hardware.” Obviously, core to Eliezer’s superintelligence argument, and habryka’s comment here, is the point that the hardware underpinning AI can be made large and expanded upon in a way that is not possible for human brains.
Jacob knows this, and addresses it in comments in response to Vaniver pointing out that birds may be more efficient than jet planes in terms of calories/mile flown, but that when the relevant metric is top speed or human passengers carried, the jet wins. Jacob responds:
I agree electricity is cheap, and discuss that. But electricity is not free, and still becomes a constraint...
The rental price using enterprise GPUs is at least 4x as much, so more like $20,000/yr per agent. So the potential economic advantage is not yet multiple OOM. It’s actually more like little to no advantage for low-end robotic labor, or perhaps 1 OOM advantage for programmers/researchers/etc. But if we had AGI today GPU prices would just skyrocket to arbitrage that advantage, at least until foundries could ramp up GPU production.
So the crux here appears to be about the practicality of replacing human brains with factory-sized artificial ones, in terms of physical resource limitations.
Daniel Kokotajlo disagrees that this is important:
$2,000/yr per agent is nothing, when we are talking about hypothetical AGI. This seems to be evidence against your claim that energy is a taut constraint.
Jacob doubles down that it is:
Energy is always an engineering constraint: it’s a primary constraint on Moore’s Law, and thus also a primary limiter on a fast takeoff with GPUs (because world power supply isn’t enough to support net ANN compute much larger than current brain population net compute).
But again I already indicated it’s probably not a ‘taut constraint’ on early AGI in terms of economic cost—at least in my model of likely requirements for early not-smarter-than-human AGI.
Also yes additionally longer term we can expect energy to become a larger fraction of economic cost—through some combination of more efficient chip production, or just the slowing of moore’s law itself (which implies chips holding value for much longer, thus reducing the dominant hardware depreciation component of rental costs)
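For a rough sense of scale behind the “world power supply” point in the quote above, here is a back-of-the-envelope sketch. The population, per-brain wattage, world electricity figure, and the ~10 kW per GPU-based brain-equivalent are approximate numbers supplied for illustration, not figures from the post:

```python
# Rough illustration of the "world power supply" point in the quote above.
# Approximate figures supplied here (not from the post): ~8e9 humans,
# ~20 W per brain, ~3 TW average world electricity generation, and a
# GPU-based brain-equivalent assumed to draw ~10 kW (a small cluster).
brains = 8e9
watts_per_brain = 20
brain_population_power = brains * watts_per_brain                # ~1.6e11 W (~0.16 TW)

world_electricity_w = 3e12                                       # ~3 TW average

watts_per_gpu_brain_equivalent = 10_000                          # assumption
gpu_population_power = brains * watts_per_gpu_brain_equivalent   # ~8e13 W (~80 TW)

print(f"All human brains: ~{brain_population_power / 1e12:.2f} TW")
print(f"World electricity: ~{world_electricity_w / 1e12:.0f} TW")
print(f"Same population of GPU-based brain-equivalents: ~{gpu_population_power / 1e12:.0f} TW")
```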
So Jacob here admits that energy is not a ‘taut constraint’ for early AGI, and that at the same time it will be a larger fraction of the cost. In other words, it’s not a bottleneck for AGI, and no other resource is either.
This is where Jacob’s discussion ended.
So I think Jacob has at least two jobs to do to convince me. I would be very pleased and appreciative if he achieved just one of them.
First, he needs to explain why any efficiency constraints can’t be overcome by just throwing a lot of material and energy resources into building and powering inefficient or as-efficient-as-human-brains GPUs. If energy is not a taut constraint for AGI, and it’s also expected to be an increasing fraction of costs over time, then that sounds like an argument that we can overcome any efficiency limits with increasing expenditures to achieve superhuman performance.
Second, he needs to explain why things like energy, size, or ops/sec efficiency are the most important efficiency metrics as opposed to things like “physical tasks/second,” or “brain-size intelligences produced per year,” or “speed at which information can be taken in and processed via sensors positioned around the globe.” There are so very many efficiency (“useful output/resource input”) metrics that we can construct, and on many of them, the human brain and body are demonstrably nowhere near the physical limit.
Right now, doubling down on physics-based efficiency arguments, as he’s doing here, doesn’t feel like a winning strategy to me.
First, he needs to explain why any efficiency constraints can’t be overcome by just throwing a lot of material and energy resources into building and powering inefficient or as-efficient-as-human-brains GPUs. If energy is not a taut constraint for AGI, and it’s also expected to be an increasing fraction of costs over time, then that sounds like an argument that we can overcome any efficiency limits with increasing expenditures to achieve superhuman performance.
If Jake claims to disagree with the claim that ai can starkly surpass humans [now disproven—he has made more explicit that it can], I’d roll my eyes at him. He is doing a significant amount of work based on the premise that this ai can surpass humans. His claims about safety must therefore not rely on ai being limited in capability; if his claims had relied on ai being naturally capability bounded I’d have rolled to disbelieve [edit: his claims do not rely on it]. I don’t think his claims rely on it, as I currently think his views on safety are damn close to simply being a lower resolution version of mine held overconfidently [this is intended to be a pointer to stalking both our profiles]; it’s possible he actually disagrees with my views, but so far my impression is he has some really good overall ideas but hasn’t thought in detail about how to mitigate the problems I see. But I have almost always agreed with him about the rest of the points he explicitly spells out in OP, with some exceptions where he had to talk me into his view and I eventually became convinced. (I really doubted the energy cost of the brain being near optimal for energy budget and temperature target. I later came to realize it being near optimal is fundamental to why it works at all.)
from what he’s told me and what I’ve seen him say, my impression is he hasn’t looked quite as closely at safety as I have, and to be clear, I don’t think either of us are proper experts on co-protective systems alignment or open source game theory or any of that fancy high end alignment stuff; I worked with him first while I was initially studying machine learning 2015-2016, then we worked together on a research project which then pivoted to building vast.ai. I’ve since moved on to more studying, but given our otherwise mostly shared background assumptions with varying levels of skill (read: he’s still much more skilled on some core fundamentals and I’ve settled into just being a nerd who likes to read interesting papers), I think our views are still mostly shared to the degree our knowledge overlaps.
@ Jake, re: safety, I just wish you had the kind of mind that was habitually allergic to C++’s safety issues and desperate for the safety of rustlang; exactly bounded approximation is great. Of course, we’ve had that discussion many times; he’s quite the template wizard, with all the good and bad that comes with that.
(open source game theory is a kind of template magic)
Respectfully, it’s hard for me to follow your comment because of the amount of times you say things like “If Jake claims to disagree with this,” “based on the premise that this is false,” “must therefore not rely on it or be false,” and “I don’t think they rely on it.” The double negatives plus pointing to things with the word “this” and “it” makes me lose confidence in my ability to track your line of thinking. If you could speak in the positive and replace your “pointer terms” like “this” and “it” with the concrete claims you’re referring to, that would help a lot!
Understandable, I edited in clearer references—did that resolve all the issues? I’m not sure in return that I parsed all your issues parsing :) I appreciate the specific request!
It helps! There are still some double negatives (“His claims about safety must therefore not rely on ai not surpassing humans, or be false” could be reworded to “his claims about safety can only be true if they allow for AI surpassing humans,” for example), and I, not being a superintelligence, would find that easier to parse :)
The “pointers” bit is mostly fixed by you replacing the word “this” with the phrase “the claim that ai can starkly surpass humans.” Thank you for the edits!
First, he needs to explain why any efficiency constraints can’t be overcome by just throwing a lot of material and energy resources into building and powering inefficient or as-efficient-as-human-brains GPUs. If energy is not a taut constraint for AGI, and it’s also expected to be an increasing fraction of costs over time, then that sounds like an argument that we can overcome any efficiency limits with increasing expenditures to achieve superhuman performance.
I don’t need to explain that as I don’t believe it. Of course you can overcome efficiency constraints somewhat by brute force—and that is why I agree energy is not by itself an especially taut constraint for early AGI, but it is a taut constraint for SI.
You can’t overcome any limits just by increasing expenditures. See my reply here for an example.
Second, he needs to explain why things like energy, size, or ops/sec efficiency are the most important efficiency metrics
I don’t really feel this need, because EY already agrees thermodynamic efficiency is important, and i’m arguing specifically against core claims of his model.
Computation simply is energy organized towards some end, and intelligence is a form of computation. A superintelligence that can clearly overpower humanity is—almost by definition—something with greater intelligence than humanity, which thus translates into compute and energy requirements through efficiency factors.
It’s absolutely valid to make a local argument against specific parts of Eliezer’s model. However, you have a lot of other arguments “attached” that don’t straightforwardly flow from the parts of Eliezer’s model you’re mainly attacking. That’s a debate style choice that’s up to you, but as a reader who is hoping to learn from you, it becomes distracting because I have to put a lot of extra work into distinguishing “this is a key argument against point 3 from EY’s efficiency model” from “this is a side argument consisting of one assertion about bioweapons based on unstated biology background knowledge.”
Would it be better if we switched from interpreting your post as “a tightly focused argument on demolishing EY’s core efficiency-based arguments,” to “laying out Jacob’s overall view on AI risk, with a lot of emphasis on efficiency arguments?” If that’s the best way to look at it then I retract the objection I’m making here, except to say it wasn’t as clear as it could have been.
The bioweapons point is something of a tangent, but I felt compelled to mention it because every time I’ve pointed out that strong nanotech can’t have any core thermodynamic efficiency advantage over biology, someone has to mention superviruses or something, even though that isn’t part of EY’s model—he talks about diamond nanobots. But sure, that paragraph is something of a tangent.
EY’s model requires slightly-smarter-than-us AGI running on normal hardware to start a FOOM cycle of recursive self improvement resulting in many OOM intelligence improvement in a short amount of time. That requires some combination of 1.) many OOM software improvement on current hardware, 2.) many OOM hardware improvement with current foundry tech, or 3.) completely new foundry tech with many OOM improvement over current—ie nanotech woo. The viability of all/any of this is all entirely dependent on near term engineering practicality.
I think I see what you’re saying here. Correct me if I’m wrong.
You’re saying that there’s an argument floating around that goes something like this:
At some point in the AI training process, there might be an “awakening” of the AI to an understanding of its situation, its goal, and the state of the world. The AI, while being trained, will realize that to pursue the goal it’s being trained on most effectively, it needs to be a lot smarter and more powerful. Being already superintelligent, it will, during the training process, figure out ways to use existing hardware and energy infrastructure to make itself even more intelligent, without alerting humans. Of course, it can’t build new hardware or noticeably disrupt existing hardware beyond that which has been allocated to it, since that would trigger an investigation and shutdown by humans.
And it’s this argument specifically that you are dispatching with your efficiency arguments. Because, for inescapable physics reasons, AI will hit an efficiency wall, and it can’t become more intelligent than humans on hardware with equivalent size, energy, and so on. Loosely speaking, it’s impossible to build a device significantly smaller than a brain, using less power than a brain, that runs an AI more than 1-2 OOMs smarter than a brain, and we can certainly rule out a superintelligence 6 OOMs smarter than humans running on a device smaller and less energy-intensive than a brain.
You have other arguments about practical engineering constraints, the potential utility to an AI of keeping humans around, the difficulty of building grey goo, the “alien minds” argument, and so on, but those are all based on separate counterarguments. You’re also not arguing about whether an AI just 2-100x as intelligent as humans might be dangerous based on efficiency considerations.
You do have arguments in some or all of these areas, but the efficiency arguments are meant to just deal with this one specific scenario about a 6 OOM (not a 2 OOM) improvement in intelligence during a training run without accessing more hardware than was made available during the training run.
Is that correct?
I’m confused because you describe an “argument specifically that you are dispatching with your efficiency arguments”, and the first paragraph sounds like an EY argument, but the 2nd more like my argument. (And ‘dispatching’ is ambiguous)
Also “being already superintelligent” presumes the conclusion at the onset.
So let’s restart:
1.) Someone creates an AGI a bit smarter than humans.
2.) It creates even smarter AGI—by rewriting its own source code.
3.) After the Nth iteration and software OOM improvement is tapped it creates nanotech assemblers to continue growing OOM in power (or alternatively somehow gets OOM improvement with existing foundry tech, but that seems less likely as part of EY’s model).
4.) At some point it has more intelligence/compute than all of humanity, and kills us with nanotech or something.
EY and I agree on 1 but diverge past that. Point 2 is partly a matter of software efficiency but not entirely. Recall that I correctly predicted in advance that AGI requires brain-like massive training compute, which largely defeats EY’s view of 2 where it’s just a modest “rewrite of its own source code”. The efficiency considerations matter for both 2 and 3, as they determine how effectively it can quickly turn resources (energy/materials/money/etc) into bigger better training runs to upgrade its intelligence.
I’m confused because you describe an “argument specifically that you are dispatching with your efficiency arguments”, and the first paragraph sounds like an EY argument, but the 2nd more like my argument. (And ‘dispatching’ is ambiguous)
Ugh yes, I have no idea why I originally formatted it with the second paragraph quoted (I fully intended it as an articulation of your argument, a rebuttal to the first EY-style paragraph). Just a confusing formatting and structure error on my part. Sorry about that, thanks for your patience.
So as a summary, you agree that AI could be trained a bit smarter than humans, but you disagree with the model where AI could suddenly iteratively extract like 6 OOMs better performance on the same hardware it’s running on, all at once, figure out ways to interact with the physical world again within the hardware it’s already training on, and then strike humanity all at once with undetectable nanotech before the training run is even complete.
The inability of the AI to attain 6 OOMs better performance on its training hardware during its training run by recursively self-improving its own software is mainly based on physical efficiency limits, and this is why you put such heavy emphasis on them. And the idea that neural net-like structures that are very demanding in terms of compute, energy, space, etc appear to be the only tractable road to superintelligence means that there’s no alternative, much more efficient scheme the neural net form of the AI could find to rewrite itself into a fundamentally more efficient architecture on this scale. Again, you have other arguments to deal with other concerns and to make other predictions about the outcome of training superintelligent AI, but dispatching this specific scenario is where your efficiency arguments are most important.
Yes but I again expect AGI to use continuous learning, so the training run doesn’t really end. But yes I largely agree with that summary.
NN/DL in its various flavors is simply what efficient approximate Bayesian inference involves, and there are no viable non-equivalent dramatically better alternatives.
Thanks Jacob for talking me through your model. I agree with you that this is a model that EY and others associated with him have put forth. I’ve looked back through Eliezer’s old posts, and he is consistently against the idea that LLMs are the path to superintelligence (not just that they’re not the only path, but he outright denies that superintelligence could come from neural nets).
My update, based on your arguments here, is that any future claim about a mechanism for iterative self-improvement that happens suddenly, on the training hardware and involves > 2 OOMs of improvement, needs to first deal with the objections you are raising here to be a meaningful way of moving the conversation forward.
I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
I am genuinely curious and confused as to what exactly you concretely imagine this supposed ‘superintelligence’ to be, such that it is not already the size of a factory, such that you mention “size of a factory” as if that is something actually worth mentioning—at all. Please show at least your first pass fermi estimates for the compute requirements. By that I mean—what are the compute requirements for the initial SI—and then the later presumably more powerful ‘factory’?
Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don’t understand why energy density matters very much.
I would suggest reading more about advanced GPU/accelerator design, and then about datacenter design and the thermodynamic/cooling considerations therein.
The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (just using the energy output of a single power plant under your model would produce something that would likely be easily capable of disempowering humanity).
This is so wildly ridiculous that you really need to show your work. I have already shown some calculations in these threads, but I’ll quickly review here.
A quick google search indicates 1GW is a typical power plant output, which in theory could power a datacenter of roughly a million GPUs. This is almost 100 times larger in power consumption than the current largest official supercomputer: Frontier—which has about 30k GPUs. The supercomputer used to train GPT4 is somewhat of a secret, but estimated to be about that size. So at 50x to 100x you are talking about scaling up to something approaching a hypothetical GPT-5 scale cluster.
Nvidia currently produces less than 100k high end enterprise GPUs per year in total, so you can’t even produce this datacenter unless Nvidia grows by about 10x and TSMC grows by perhaps 2x.
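For concreteness, here is a quick restatement of the arithmetic in the two paragraphs above; the ~1 kW per fully loaded GPU (including cooling and datacenter overhead) is an assumed round figure, the other numbers are taken from the comment:

```python
# Rough Fermi check of the power-plant-sized cluster described above.
plant_output_w = 1e9             # 1 GW power plant (from the comment)
watts_per_gpu = 1_000            # assumption: ~1 kW/GPU fully loaded, incl. cooling/overhead

gpus_powered = plant_output_w / watts_per_gpu
print(f"GPUs a 1 GW plant could power: ~{gpus_powered:,.0f}")                 # ~1,000,000

frontier_gpus = 30_000           # Frontier scale, from the comment
print(f"Multiple of a Frontier-sized cluster: ~{gpus_powered / frontier_gpus:.0f}x")   # ~33x

annual_gpu_production = 100_000  # high-end enterprise GPUs/year, from the comment
print(f"Years of current production needed: ~{gpus_powered / annual_gpu_production:.0f}")  # ~10
```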
The datacenter would likely cost over a hundred billion dollars, and the resulting models would be proportionally more expensive to run, such that it’s unclear whether this would be a win (at least using current tech). Sure I do think there is some room for software improvement.
But no, I do not think that this hypothetical not currently achievable GPT5 - even if you were running 100k instances of it—would “likely be easily capable of disempowering humanity”.
Of course if we talk longer term, the brain is obviously evidence that one human-brain power can be achieved in about 10 watts, so the 1GW power plant could support a population of 100 million uploads or neuromorphic AGIs. That’s very much part of my model (and Hanson’s, and Moravec’s) - eventually.
Remember this post is all about critiquing EY’s specific doom model which involves fast foom on current hardware through recursive self-improvement.
Having more room at the bottom is just one of a long list of ways to end up with AIs much smarter than humans. Maybe you have rebuttals to all the other ways AIs could end up much smarter than humans
If you have read much of my writings, you should know that I believe it’s obvious we will end up with AIs much smarter than humans—but mainly because they will run faster using much more power. In fact this prediction has already come to pass in a limited sense—GPT4 was probably trained on over 100 human lifetimes worth of virtual time/data using only about 3 months of physical time, which represents a 10000x time dilation (but thankfully only for training, not for inference).
I am genuinely curious and confused as to what exactly you concretely imagine this supposed ‘superintelligence’ to be, such that it is not already the size of a factory, such that you mention “size of a factory” as if that is something actually worth mentioning—at all. Please show at least your first pass fermi estimates for the compute requirements.
Despite your claim to be “genuinely curious and confused,” the overarching tone and content of this bit does not strike me as projecting curiosity or confusion, but instead confident and sharp-toned burden-of-proof-shifting to habryka. That’s merely a stylistic note, not impacting the content of your claims.
It sounds here like you are agreeing with him that you can deal with any ops/mm^3 limits by simply building a bigger computer. It’s therefore hard for me to see why these arguments about efficiency limitations matter very much for AI’s ability to be superintelligent and exhibit superhuman takeover capabilities.
I can see why maybe human brains, being efficient according to certain metrics, might be a useful tool for the AI to keep around, but I don’t see why we ought to feel at all reassured by that. I don’t really want to serve out the end of my days as an AI’s robot.
I would suggest reading more about advanced GPU/accelerator design, and then about datacenter design and the thermodynamic/cooling considerations therein.
Just as a piece of feedback, this sort of comment is not very enlightening, and it doesn’t convince me that you have background knowledge that ought to make me believe you and not habryka. If your aim is to “teach the crowd,” N of 1, but I am not being effectively taught by this bit.
A quick google search indicates 1GW is a typical power plant output, which in theory could power a datacenter of roughly a million GPUs. This is almost 100 times larger in power consumption than the current largest official supercomputer: Frontier—which has about 30k GPUs. The supercomputer used to train GPT4 is somewhat of a secret, but estimated to be about that size. So at 50x to 100x you are talking about scaling up to something approaching a hypothetical GPT-5 scale cluster.
So it takes 1% of a single power plant output to train GPT4. If GPT-4 got put on chips and distributed, wouldn’t it take only a very small amount of power comparatively to actually run it once trained? Why are we talking about training costs rather than primarily about the cost to operate models once they have been trained?
But no, I do not think that this hypothetical not currently achievable GPT5 - even if you were running 100k instances of it—would “likely be easily capable of disempowering humanity”.
You have output a lot of writings on this subject and I’ve only read a fraction. Do you argue this point somewhere? This seems pretty cruxy. In my model, at this point, doomers are partly worried about what happens during training of larger and larger models, but they’re also worried about what happens when you proliferate many copies of extant models and put them into something like an AutoGPT format, where they can make plans, access real-world resources, and use plugins.
Again, I think trying to focus the conversation on efficiency by itself is not enough. Right now, I feel like you have a very deep understanding of efficiency, but that once the conversation moves away from the topic of efficiency along some metric and into topics like “why can’t AI just throw more resources at expanding itself” or “why can’t AI take over with a bunch of instances of ChatGPT 4 or 5 in an AutoGPT context” or “why should I be happy about ending my days as the AI’s efficient meat-brain robot,” it becomes hard for me to follow your argument or understand why it’s an important “contra Yudkowsky” or “contra central Yudkowsky-appreciating doomers” argument.
It feels more like constructing, I don’t know, a valid critique of some of the weakest parts (maybe?) of Yudkowsky’s entire written oeuvre having to do with the topic of efficiency and nanobots, and then saying that because this fails, the entire Yudkowskian argument about AI doom fails, or even that arguments for AI takeover scenarios fail generally. And I am not at all convinced you’ve shown that it’s on the foundation of Yudkowsky’s (mis?)understanding of efficiency that the rest of his argument stands or falls.
I’ve never feared grey goo or diamondoid bacteria. Mainly, I worry that a malevolent AI might not even need very much intelligence to steamroll humanity. It may just need a combination of the willingness to do damage, the ability to be smarter than a moderately intelligent person in order to hack our social systems, and the ability to proliferate across our computing infrastructure while not caring about what happens to “itself” should individual instances get “caught” and deleted. So much of our ability to mitigate terrorism depends on the ability to catch and punish offenders, the idea that perpetrators care about their own skin and are confined within it. It is the alienness of the form of the AI, and its ability to perform on par with or exceed performance of humans in important areas, to expand the computing and material resources available to itself, that make it seem dangerous to me.
Despite your claim to be “genuinely curious and confused,” the overarching tone and content of this bit does not strike me as projecting curiosity or confusion, but instead confident and sharp-toned burden-of-proof-shifting to habryka
I see how that tone could come off as rude, but really I don’t understand habryka’s model when he says “a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.”
So it takes 1% of a single power plant output to train GPT4. If GPT-4 got put on chips and distributed, wouldn’t it take only a very small amount of power comparatively to actually run it once trained? Why are we talking about training costs rather than primarily about the cost to operate models once they have been trained?
The transformer arch is fully parallelizable only during training; for inference on GPUs/accelerators it is roughly as inefficient as RNNs, or more so. The inference costs of GPT4 are of course an OpenAI/Microsoft secret, but it is not a cheap model. Also human-level AGI, let alone superintelligence, will likely require continual learning/training.
I guess by “put on chips” you mean baking GPT-4 into an ASIC? That usually doesn’t make sense for fast changing tech, but as moore’s law slows we could start seeing that. I would expect other various changes to tensorcores well before that.
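As a minimal sketch of the training/inference asymmetry being described above, the toy code below (a made-up two-matrix “model,” purely illustrative) scores a whole sequence in one batched call for training, but has to run one forward pass per generated token when decoding:

```python
import numpy as np

# Toy illustration: training scores every position in one batched matrix multiply,
# while autoregressive inference needs one forward pass per generated token.
rng = np.random.default_rng(0)
vocab, d_model, seq_len = 100, 16, 8
W_embed = rng.normal(size=(vocab, d_model))   # token embedding (stand-in)
W_out = rng.normal(size=(d_model, vocab))     # output head (stand-in)

def forward(token_ids):
    """Toy 'model': embed tokens and map each position to next-token logits."""
    return W_embed[token_ids] @ W_out          # shape (len(token_ids), vocab)

# Training-style pass: the whole sequence goes through in ONE call,
# so the work parallelizes across positions (and across the batch).
sequence = rng.integers(0, vocab, size=seq_len)
logits = forward(sequence)                     # shape (seq_len, vocab)

# Inference-style decoding: one call per new token, strictly sequential,
# each step waiting on (and here naively recomputing) the previous context.
context = [int(sequence[0])]
for _ in range(seq_len - 1):
    next_logits = forward(np.array(context))[-1]   # one forward pass per token
    context.append(int(next_logits.argmax()))

print(logits.shape, context)
```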
It feels more like constructing, I don’t know, a valid critique of some of the weakest parts (maybe?) of Yudkowsky’s entire written oeuvre having to do with the topic of efficiency and nanobots, and then saying that because this fails, the entire Yudkowskian argument about AI doom fails, or even that arguments for AI takeover scenarios fail generally.
The entire specific Yudkowskian argument about hard takeoff via recursive self improving AGI fooming to nanotech is specifically what i’m arguing against, and indeed I only need to argue against the weakest links.
It is the alienness of the form of the AI
See the section on That Alien Mindspace. DL-based AGI is anthropomorphic (as I predicted), not alien.
Re Yudkowsky, I don’t think his entire argument rests on efficiency, and the pieces that don’t can’t be dispatched by arguing about efficiency.
Regarding “alien mindspace,” what I mean is that the physical form of AI, and whatever awareness the AI has of that, makes it alien. Like, if I knew I could potentially transmit my consciousness with perfect precision over the internet and create self-clones almost effortlessly, I would think very differently than I do now.
His argument entirely depends on efficiency. He claims that near future AGI somewhat smarter than us creates even smarter AGI and so on, recursively bottoming out in something that is many many OOM more intelligent than us without using unrealistic amounts of energy, and all of this happens very quickly.
So that’s entirely an argument that boils down to practical computational engineering efficiency considerations. Additionally he needs the AGI to be unaligned by default, and that argument is also faulty.
EY’s model requires slightly-smarter-than-us AGI running on normal hardware to start a FOOM cycle of recursive self improvement resulting in many OOM intelligence improvement in a short amount of time. That requires some combination of 1.) many OOM software improvement on current hardware, 2.) many OOM hardware improvement with current foundry tech, or 3.) completely new foundry tech with many OOM improvement over current—ie nanotech woo. The viability of all/any of this is all entirely dependent on near term engineering practicality.
It seems like in one place, you’re saying EY’s model depends on near term engineering practicality, and in another, that it depends on physics-constrained efficiency which you argue invalidates it. Being no expert on the physics-based efficiency arguments, I’m happy to concede the physics constraints. But I’m struggling to understand their relevance to non-physics-based efficiency arguments or their strong bearing on matters of engineering practicality.
My understanding is that your argument goes something like this:
You can’t build something many OOMs more intelligent than a brain on hardware with roughly the same size and energy consumption as the brain.
Therefore, building a superintelligent AI would require investing more energy and more material resources than a brain uses.
Therefore… and here’s where the argument loses steam for me. Why can’t we or the AI just invest lots of material and energy resources? How much smarter than us does an unaligned AI need to be to pose a threat, and why should we think resources are a major constraint to get it to recursively self-improve itself to get to that point? Why should we think it will need constant retraining to recursively self-improve? Why do we think it’ll want to keep an economy going?
As far as the “anthropomorphic” counterargument to the “vast space of alien minds” thing, I fully agree that it appears the easiest way to predict tokens from human text is to simulate a human mind. That doesn’t mean the AI is a human mind, or that it is intrinsically constrained to human values. Being able to articulate those values and imitate behaviors that accord with those values is a capability, not a constraint. We have evidence from things like ChaosGPT or jailbreaks that you can easily have the AI behave in ways that appear unaligned, and that even the appearance of consistent alignment has to be consistently enforced in ways that look awfully fragile.
Overall, my sense is that you’ve admirably spent a lot of time probing the physical limits of certain efficiency metrics and how they bear on AI, and I think you have some intriguing arguments about nanotech and “mindspace” and practical engineering as well.
However, I think your arguments would be more impactful if you carefully and consistently delineated these different arguments and attached them more precisely to the EY claims you’re rebutting, and did more work to show how EY’s conclusion X flows from EY’s argument A, and that A is wrong for efficiency reason B, which overturns X but not Y; you disagree with Y for reason C, overturning EY’s argument D. Right now, I think you do make many of these argumentative moves, but they’re sort of scattered across various posts and comments, and I’m open to the idea that they’re all there but I’ve also seen enough inconsistencies to worry that they’re not. To be clear, I would absolutely LOVE it if EY did the very same thing—the burden of proof should ideally not be all on you, and I maintain uncertainty about this whole issue because of the fragmented nature of the debate.
So at this point, it’s hard for me to update more than “some arguments about efficiency and mindspace and practical engineering and nanotech are big points of contention between Jacob and Eliezer.” I’d like to go further and, with you, reject arguments that you believe to be false, but I’m not able to do that yet because of the issue that I’m describing here. While I’m hesitant to burden you with additional work, I don’t have the background or the familiarity with your previous writings to do this very effectively—at the end of the day, if anybody’s going to bring together your argument all in one place and make it crystal clear, I think that person has to be you.
His argument entirely depends on efficiency. He claims that near future AGI somewhat smarter than us creates even smarter AGI and so on, recursively bottoming out in something that is many many OOM more intelligent than us without using unrealistic amounts of energy, and all of this happens very quickly.
You just said in your comment to me that a single power plant is enough to run 100M brains. It seems like you need zero hardware progress in order to get something much smarter without unrealistic amounts of energy, so I just don’t understand the relevance of this.
I said longer term—using hypothetical brain-parity neuromorphic computing (uploads or neuromorphic AGI). We need enormous hardware progress to reach that.
Current tech on GPUs requires large supercomputers to train 1e25+ flops models like GPT4 that are approaching, but not quite, human level AGI. If the rumour of 1T params is true, then it takes a small cluster and ~10 kW just to run some smallish number of instances of the model.
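As a rough illustration of why a rumoured 1T-parameter model would need a small cluster just for inference: assuming 16-bit weights and 80 GB-class accelerators drawing ~400 W each (assumed figures, not reported specs), the arithmetic lands right around the ~10 kW mentioned above:

```python
# Back-of-the-envelope: memory footprint and power draw for serving a
# hypothetical 1T-parameter dense model in 16-bit precision.
params = 1e12
bytes_per_param = 2                      # fp16/bf16 assumption
weight_bytes = params * bytes_per_param  # ~2 TB of weights alone

gpu_mem_bytes = 80e9                     # 80 GB-class accelerator
gpus_needed = weight_bytes / gpu_mem_bytes
print(f"GPUs just to hold the weights: ~{gpus_needed:.0f}")               # ~25

watts_per_gpu = 400                      # assumed average draw while serving
print(f"Rough power draw: ~{gpus_needed * watts_per_gpu / 1e3:.0f} kW")   # ~10 kW
```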
Getting something much much smarter than us would require enormous amounts of computation and energy without large advances in software and hardware.
I said longer term—using hypothetical brain-parity neuromorphic computing (uploads or neuromorphic AGI). We need enormous hardware progress to reach that.
Sure. We will probably get enormous hardware progress over the next few decades, so that’s not really an obstacle.
It seems to me your argument is “smarter than human intelligence cannot make enormous hardware or software progress in a relatively short amount of time”, but this has nothing to do with “efficiency arguments”. The bottleneck is not energy, the bottleneck is algorithmic improvements and improvements to GPU production, neither of which is remotely bottlenecked on energy consumption.
Getting something much much smarter than us would require enormous amounts [...] energy without large advances in software and hardware.
No, as you said, it would require like, a power plant worth of energy. Maybe even like 10 power plants or so if you are really stretching it, but as you said, the really central bottleneck here is GPU production, not energy in any relevant way.
Sure. We will probably get enormous hardware progress over the next few decades, so that’s not really an obstacle.
As we get more hardware and slow, mostly-aligned AGI/AI progress, this further raises the bar for foom.
It seems to me your argument is “smarter than human intelligence cannot make enormous hardware or software progress in a relatively short amount of time”, but this has nothing to do with “efficiency arguments”.
That is actually an efficiency argument, and in my brain efficiency post I discuss multiple sub components of net efficiency that translate into intelligence/$.
The bottleneck is not energy, the bottleneck is algorithmic improvements and improvements to GPU production, neither of which is remotely bottlenecked on energy consumption.
Ahh I see—energy efficiency is tightly coupled to other circuit efficiency metrics as they are all primarily driven by shrinkage. As you increasingly bottom out hardware improvements energy then becomes an increasingly more direct constraint. This is already happening with GPUs where power consumption is roughly doubling with each generation, and could soon dominate operating costs.
See here where I line the roodman model up to future energy usage predictions.
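To make the “energy share of operating cost” point concrete, here is a toy calculation. The GPU price, power draw, electricity rate, and depreciation schedules are all assumed round numbers; the point is only that as depreciation slows (chips holding value longer), energy becomes a larger fraction of the hourly cost:

```python
# Toy illustration: energy becomes a larger share of GPU operating cost
# as hardware depreciates more slowly (e.g. if Moore's law slows).
# All numbers below are assumed round figures for illustration.
gpu_price_usd = 25_000
power_kw = 0.7                       # assumed average draw incl. some overhead
electricity_usd_per_kwh = 0.10
hours_per_year = 24 * 365

energy_cost_per_year = power_kw * hours_per_year * electricity_usd_per_kwh  # ~$613

for depreciation_years in (3, 6, 10):
    depreciation_per_year = gpu_price_usd / depreciation_years
    total = depreciation_per_year + energy_cost_per_year
    share = energy_cost_per_year / total
    print(f"{depreciation_years}-year depreciation: energy is ~{share:.0%} of cost")
# 3-year: ~7%, 6-year: ~13%, 10-year: ~20%
```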
All that being said I do agree that yes the primary bottleneck or crux for the EY fast takeoff/takeover seems to be the amount of slack in software and scaling laws. But only after we agree that there aren’t obvious easy routes for the AGI to bootstrap nanotech assemblers with many OOM greater compute per J than brains or current computers.
Of course if we talk longer term, the brain is obviously evidence that one human-brain power can be achieved in about 10 watts, so the 1GW power plant could support a population of 100 million uploads or neuromorphic AGIs. That’s very much part of my model (and Hanson’s, and Moravec’s) - eventually.
This seems like it straightforwardly agrees that energy efficiency is not in any way a bottleneck, so I don’t understand the focus of this post on efficiency.
I also don’t know what you mean by longer term. More room at the bottom was of course also talking longer term (you can’t build new hardware in a few weeks, unless you have nanotech, but then you can also build new factories in a few weeks), so I don’t understand why you are suddenly talking as if “longer term” was some kind of shift of the topic.
Eliezer’s model is that we definitely won’t have many decades with AIs smarter but not much smarter than humans, since there appear to be many ways to scale up intelligence, both via algorithmic progress and via hardware progress. Eliezer thinks that drexlerian nanotech is one of the main ways to do this, and if you buy that premise, then the efficiency arguments don’t really matter, since clearly you can just scale things up horizontally and build a bunch of GPUs. But even if you don’t, you can still just scale things up horizontally and increase GPU production (and in any case, energy efficiency is not the bottleneck here, it’s GPU production, which this post doesn’t talk about)
A quick google search indicates 1GW is a typical power plant output, which in theory could power a datacenter of roughly a million GPUs. This is almost 100 times larger in power consumption than the current largest official supercomputer: Frontier—which has about 30k GPUs. The supercomputer used to train GPT4 is somewhat of a secret, but estimated to be about that size. So at 50x to 100x you are talking about scaling up to something approaching a hypothetical GPT-5 scale cluster.
Nvidia currently produces less than 100k high end enterprise GPUs per year in total, so you can’t even produce this datacenter unless Nvidia grows by about 10x and TSMC grows by perhaps 2x.
The datacenter would likely cost over a hundred billion dollars, and the resulting models would be proportionally more expensive to run, such that it’s unclear whether this would be a win (at least using current tech). Sure I do think there is some room for software improvement.
I don’t understand the relevance of this. You seem to be now talking about a completely different scenario than what I understood Eliezer to be talking about. Eliezer does not think that a slightly superhuman AI would be capable of improving the efficiency of its hardware completely on its own.
Both scenarios (going both big, in that you just use whole power-plant levels of energy, or going down in that you improve efficiency of chips) require changing semiconductor manufacturing, which is unlikely to be one of the first things a nascent AI does, unless it does successfully develop and deploy drexlerian nanotech. Eliezer in his model was talking about what reasonable limits we would be approaching relatively soon after an AI passes human levels.
Remember this post is all about critiquing EY’s specific doom model which involves fast foom on current hardware through recursive self-improvement.
I don’t understand the relevance of thermodynamic efficiency to a foom scenario “on current hardware”. You are not going to change the thermodynamic efficiency of the hardware you are literally running on, you have to build new hardware for that either way.
To reiterate the model of EY that I am critiquing is one where an AGI rapidly fooms through many OOM efficiency improvements. All key required improvements are efficiency improvements—it needs to improve its world modelling/planning per unit compute, and/or improve compute per dollar, and/or compute per joule, etc.
In EY’s model there are some perhaps many OOM software improvements over the initial NN arch/algorithms, perhaps then continued with more OOM hardware improvements. I don’t believe “buying more GPUs” is a key part of his model—it is far far too slow to provide even one OOM upgrade. Renting/hacking your way to even one OOM more GPUs is also largely unrealistic (I run one of the larger GPU compute markets and talk to many suppliers, I have inside knowledge here).
Both scenarios (going both big, in that you just use whole power-plant levels of energy, or going down in that you improve efficiency of chips) require changing semiconductor manufacturing, which is unlikely to be one of the first things a nascent AI does, unless it does successfully develop and deploy drexlerian nanotech
Right, so I have arguments against drexlerian nanotech (Moore room at the bottom, but also the thermodynamic constraints indicating you just can’t get many OOM from nanotech alone), and separate arguments against many OOM from software (mind software efficiency).
I don’t understand the relevance of thermodynamic efficiency to a foom scenario “on current hardware”.
It is mostly relevant to the drexlerian nanotech, as it shows there likely isn’t much improvement over GPUs for all the enormous effort. If nanotech were feasible and could easily allow computers 6 OOM more efficient than the brain using about the same energy/space/materials, then I would agree more with his argument.
I don’t think he’s at all claiming safety is trivial or that humans can expect to remain in charge. control-capture foom is very much permitted by his model and he says so directly; much bigger minds are allowed. But his model suggests that reflective algorithmic improvement is not the panacea that yudkowsky expected, nor that beating biology head to head is easy even for a very superintelligent system.
this does not change any claim I would make about safety; it should barely be an update for anyone who has already updated off of deep learning. but it should knock down yudkowsky’s view of capability scaling in algorithms thoroughly. this is relevant to prediction of which kinds of system are a threat to other systems and how.
I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
Presumably it takes a gigantic amount of compute to train a “brain the size of a factory”? If we assume that training a human-level AI will take 10^28 FLOP (which is quite optimistic), the Chinchilla scaling laws predict that training a model 10,000 times larger would take about 10^36 FLOP, which is far more than the total amount of compute available to humans cumulatively over our history.
By the time the world is training factory-sized brains, I expect human labor to already have been made obsolete by previous generations of AIs that were smarter than us, but not vastly so. Presumably this is Jacob’s model of the future too?
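A quick check of the Chinchilla arithmetic in the comment above, using the standard 6·N·D training-FLOP rule with roughly 20 tokens per parameter (the 10^28 FLOP starting point is the commenter's assumption):

```python
# Chinchilla-style scaling check: compute ~ 6 * N * D with D ~ 20 * N,
# so training compute grows with the *square* of model size N.
def training_flop(n_params, tokens_per_param=20):
    return 6 * n_params * (tokens_per_param * n_params)

# Assumed: a "human-level" model trained with ~1e28 FLOP (from the comment).
# Solve for the implied parameter count, then scale the model 10,000x.
base_flop = 1e28
base_params = (base_flop / (6 * 20)) ** 0.5      # N such that 6*N*(20*N) = 1e28
big_params = 1e4 * base_params

print(f"{training_flop(base_params):.1e}")        # ~1e28 (sanity check)
print(f"{training_flop(big_params):.1e}")         # ~1e36, matching the comment
```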
I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don’t understand why energy density matters very much. The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (just using the energy output of a single power plant under your model would produce something that would likely be easily capable of disempowering humanity).
More broadly, you list these three “assumptions” of Eliezer’s worldview:
None of these strike me as “assumptions” (and also point 3 is just the same as point 1 as far as I can tell, and point 2 mischaracterizes at least my beliefs, and I would bet also would not fit historical data, but that’s a separate conversation).
Having more room at the bottom is just one of a long list of ways to end up with AIs much smarter than humans. Maybe you have rebuttals to all the other ways AIs could end up much smarter than humans (like just using huge datacenters, or doing genetic engineering, or being able to operate at much faster clock speeds), in which case I am quite curious about that, but I would definitely not frame these as “necessary assumptions for a foom-like scenario”.
I’m going to expand on this.
Jacob’s conclusion to the speed section of his post on brain efficiency is this:
Let’s accept all Jacob’s analysis about the tradeoffs of clock speed, memory capacity and bandwidth.
The force of his conclusion depends on the superintelligence “running on equivalent hardware.” Obviously, core to Eliezer’s superintelligence argument, and habryka’s comment here, is the point that the hardware underpinning AI can be made large and expanded upon in a way that is not possible for human brains.
Jacob knows this, and addresses it in comments in response to Vaniver pointing out that birds may be more efficient than jet planes in terms of calories/mile flown, but that when the relevant metric is top speed or human passengers carried, the jet wins. Jacob responds:
So the crux here appears to be about the practicality of replacing human brains with factory-sized artificial ones, in terms of physical resource limitations.
Daniel Kokotajlo disagrees that this is important:
Jacob doubles down that it is:
So Jacob here admits that energy is neither a ‘taut constraint’ for early AGI, and that at the same time it will be a larger fraction of the cost. In other words, it’s not a bottleneck for AGI, and no other resource is either.
This is where Jacob’s discussion ended.
So I think Jacob has at least two jobs to do to convince me. I would be very pleased and appreciative if he achieved just one of them.
First, he needs to explain why any efficiency constraints can’t be overcome by just throwing a lot of material and energy resources into building and powering inefficient or as-efficient-as-human-brains GPUs. If energy is not a taut constraint for AGI, and it’s also expected to be an increasing fraction of costs over time, then that sounds like an argument that we can overcome any efficiency limits with increasing expenditures to achieve superhuman performance.
Second, he needs to explain why things like energy, size, or ops/sec efficiency are the most important efficiency metrics as opposed to things like “physical tasks/second,” or “brain-size intelligences produced per year,” or “speed at which information can be taken in and processed via sensors positioned around the globe.” There are so very many efficiency (“useful output/resource input”) metrics that we can construct, and on many of them, the human brain and body are demonstrably nowhere near the physical limit.
Right now, doubling down on physics-based efficiency arguments, as he’s doing here, don’t feel like a winning strategy to me.
If Jake claims to disagree with the claim that ai can starkly surpass humans [now disproven—he has made more explicit that it can], I’d roll my eyes at him. He is doing a significant amount of work based on the premise that this ai can surpass humans. His claims about safety must therefore not rely on ai being limited in capability; if his claims had relied on ai being naturally capability bounded I’d have rolled to disbelieve [edit: his claims do not rely on it]. I don’t think his claims rely on it, as I currently think his views on safety are damn close to simply being a lower resolution version of mine held overconfidently [this is intended to be a pointer to stalking both our profiles]; it’s possible he actually disagrees with my views, but so far my impression is he has some really good overall ideas but hasn’t thought in detail about how to mitigate the problems I see. But I have almost always agreed with him about the rest of the points he explicitly spells out in OP, with some exceptions where he had to talk me into his view and I eventually became convinced. (I really doubted the energy cost of the brain being near optimal for energy budget and temperature target. I later came to realize it being near optimal is fundamental to why it works at all.)
from what he’s told me and what I’ve seen him say, my impression is he hasn’t looked quite as closely at safety as I have, and to be clear, I don’t think either of us are proper experts on co-protective systems alignment or open source game theory or any of that fancy high end alignment stuff; I worked with him first while I was initially studying machine learning 2015-2016, then we worked together on a research project which then pivoted to building vast.ai. I’ve since moved on to more studying, but given assumption of our otherwise mostly shared background assumptions with varying levels of skill (read: he’s still much more skilled on some core fundamentals and I’ve settled into just being a nerd who likes to read interesting papers), I think our views are still mostly shared to the degree our knowledge overlaps.
@ Jake, re: safety, I just wish you had the kind of mind that was habitually allergic to C++’s safety issues and desperate for the safety of rustlang, exactly bounded approximation is great. Of course, we’ve had that discussion many times, he’s quite the template wizard, with all the good and bad that comes with that.
(open source game theory is a kind of template magic)
Respectfully, it’s hard for me to follow your comment because of the amount of times you say things like “If Jake claims to disagree with this,” “based on the premise that this is false,” “must therefore not rely on it or be false,” and “I don’t think they rely on it.” The double negatives plus pointing to things with the word “this” and “it” makes me lose confidence in my ability to track your line of thinking. If you could speak in the positive and replace your “pointer terms” like “this” and “it” with the concrete claims you’re referring to, that would help a lot!
Understandable, I edited in clearer references—did that resolve all the issues? I’m not sure in return that I parsed all your issues parsing :) I appreciate the specific request!
It helps! There are still some double negatives (“His claims about safety must therefore not rely on ai not surpassing humans, or be false” could be reworded to “his claims about safety can only be true if they allow for AI surpassing humans,” for example), and I, not being a superintelligence, would find that easier to parse :)
The “pointers” bit is mostly fixed by you replacing the word “this” with the phrase “the claim that ai can starkly surpass humans.” Thank you for the edits!
I don’t need to explain that as I don’t believe it. Of course you can overcome efficiency constraints somewhat by brute force—and that is why I agree energy is not by itself an especially taut constraint for early AGI, but it is a taut constraint for SI.
You can’t overcome any limits just by increasing expenditures. See my reply here for an example.
I don’t really feel this need, because EY already agrees thermodynamic efficiency is important, and i’m arguing specifically against core claims of his model.
Computation simply is energy organized towards some end, and intelligence is a form of computation. A superintelligence that can clearly overpower humanity is—almost by definition—something with greater intelligence than humanity, which thus translates into compute and energy requirements through efficiency factors.
It’s absolutely valid to make a local argument against specific parts of Eliezer’s model. However, you have a lot of other arguments “attached” that don’t straightforwardly flow from the parts of Eliezer’s model you’re mainly attacking. That’s a debate style choice that’s up to you, but as a reader who is hoping to learn from you, it becomes distracting because I have to put a lot of extra work into distinguishing “this is a key argument against point 3 from EY’s efficiency model” from “this is a side argument consisting of one assertion about bioweapons based on unstated biology background knowledge.”
Would it be better if we switched from interpreting your post as "a tightly focused argument on demolishing EY's core efficiency-based arguments" to "laying out Jacob's overall view on AI risk, with a lot of emphasis on efficiency arguments"? If that's the best way to look at it, then I retract the objection I'm making here, except to say it wasn't as clear as it could have been.
The bioweapons point is something of a tangent, but I felt compelled to mention it because every time I've pointed out that strong nanotech can't have any core thermodynamic efficiency advantage over biology, someone has to mention superviruses or something, even though that isn't part of EY's model—he talks about diamond nanobots. But sure, that paragraph is something of a tangent.
EY's model requires slightly-smarter-than-us AGI running on normal hardware to start a FOOM cycle of recursive self-improvement, resulting in many OOM of intelligence improvement in a short amount of time. That requires some combination of 1.) many OOM of software improvement on current hardware, 2.) many OOM of hardware improvement with current foundry tech, or 3.) completely new foundry tech with many OOM improvement over current—i.e. nanotech woo. The viability of any of this is entirely dependent on near-term engineering practicality.
I think I see what you’re saying here. Correct me if I’m wrong.
You’re saying that there’s an argument floating around that goes something like this:
And it's this argument specifically that you are dispatching with your efficiency arguments. Because, for inescapable physics reasons, AI will hit an efficiency wall, and it can't become more intelligent than humans on hardware with equivalent size, energy, and so on. Loosely speaking, it's impossible to build a device significantly smaller than a brain, using less power than a brain, that runs an AI more than 1-2 OOMs smarter than a brain, and we can certainly rule out a superintelligence 6 OOMs smarter than humans running on a device smaller and less energy-intensive than a brain.
You have other arguments about practical engineering constraints, the potential utility to an AI of keeping humans around, the difficulty of building grey goo, the "alien minds" argument, and so on, but those are all based on separate counterarguments. You're also not arguing, on efficiency grounds, about whether an AI just 2-100x as intelligent as humans might be dangerous.
You do have arguments in some or all of these areas, but the efficiency arguments are meant to just deal with this one specific scenario about a 6 OOM (not a 2 OOM) improvement in intelligence during a training run without accessing more hardware than was made available during the training run.
Is that correct?
I'm confused because you describe an "argument specifically that you are dispatching with your efficiency arguments", and the first paragraph sounds like an EY argument, but the 2nd sounds more like my argument. (And 'dispatching' is ambiguous.)
Also “being already superintelligent” presumes the conclusion at the onset.
So let's restart:
1. Someone creates an AGI a bit smarter than humans.
2. It creates even smarter AGI—by rewriting its own source code.
3. After the Nth iteration, once software OOM improvement is tapped out, it creates nanotech assemblers to continue growing OOMs in power (or alternatively somehow gets OOM improvement with existing foundry tech, but that seems less likely as part of EY's model).
4. At some point it has more intelligence/compute than all of humanity, and kills us with nanotech or something.
EY and I agree on 1 but diverge past that. Point 2 is partly a matter of software efficiency but not entirely. Recall that I correctly predicted in advance that AGI requires brain-like massive training compute, which largely defeats EY’s view of 2 where it’s just a modest “rewrite of its own source code”. The efficiency considerations matter for both 2 and 3, as they determine how effectively it can quickly turn resources (energy/materials/money/etc) into bigger better training runs to upgrade its intelligence.
Ugh, yes, I have no idea why I originally formatted it with the second paragraph quoted (I fully intended it as an articulation of your argument, a rebuttal to the first EY-style paragraph). Just a confusing formatting and structure error on my part. Sorry about that, and thanks for your patience.
So, as a summary: you agree that AI could be trained to be a bit smarter than humans, but you disagree with the model where AI could suddenly and iteratively extract something like 6 OOMs better performance on the same hardware it's running on, all at once, figure out ways to interact with the physical world from within the hardware it's already training on, and then strike humanity all at once with undetectable nanotech before the training run is even complete.
The inability of the AI to attain 6 OOMs better performance on its training hardware during its training run by recursively self-improving its own software is mainly based on physical efficiency limits, and this is why you put such heavy emphasis on them. And the idea that neural net-like structures, which are very demanding in terms of compute, energy, space, etc., appear to be the only tractable road to superintelligence means there is no alternative, much more efficient scheme the neural-net form of the AI could find to rewrite itself into a fundamentally more efficient architecture at this scale. Again, you have other arguments to deal with other concerns and to make other predictions about the outcome of training superintelligent AI, but dispatching this specific scenario is where your efficiency arguments are most important.
Is that correct?
Yes but I again expect AGI to use continuous learning, so the training run doesn’t really end. But yes I largely agree with that summary.
NN/DL in its various flavors is simply what efficient approximate Bayesian inference looks like, and there are no viable, non-equivalent, dramatically better alternatives.
Thanks Jacob for talking me through your model. I agree with you that this is a model that EY and others associated with him have put forth. I’ve looked back through Eliezer’s old posts, and he is consistently against the idea that LLMs are the path to superintelligence (not just that they’re not the only path, but he outright denies that superintelligence could come from neural nets).
My update, based on your arguments here, is that any future claim about a mechanism for iterative self-improvement that happens suddenly, on the training hardware and involves > 2 OOMs of improvement, needs to first deal with the objections you are raising here to be a meaningful way of moving the conversation forward.
I am genuinely curious and confused as to what exactly you concretely imagine this supposed 'superintelligence' to be, such that it is not already the size of a factory, and such that you mention "size of a factory" as if that is something actually worth mentioning at all. Please show at least your first-pass Fermi estimates for the compute requirements. By that I mean: what are the compute requirements for the initial SI, and then for the later, presumably more powerful, 'factory'?
I would suggest reading more about advanced GPU/accelerator design, and then about datacenter design and the thermodynamic/cooling considerations therein.
This is so wildly ridiculous that you really need to show your work. I have already shown some calculations in these threads, but I’ll quickly review here.
A quick google search indicates 1GW is a typical power plant output, which in theory could power a roughly million-GPU datacenter. That is almost 100 times larger in power consumption than the current largest official supercomputer, Frontier, which has about 30k GPUs. The supercomputer used to train GPT4 is somewhat of a secret, but is estimated to be about that size. So at 50x to 100x you are talking about scaling up to something approaching a hypothetical GPT-5 scale cluster.
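To make the arithmetic explicit, here is a minimal sketch; the ~1 kW all-in draw per GPU (card plus host, networking, and cooling overhead) is a rough assumed figure for illustration, not a measured number:

```python
# Back-of-envelope sketch of the power-plant-to-datacenter estimate above.
# The ~1 kW all-in figure per GPU (card + host + cooling overhead) is an
# illustrative assumption.
plant_output_w = 1e9                 # ~1 GW, typical large power plant
all_in_watts_per_gpu = 1_000         # assumed all-in draw per high-end GPU
gpus_powered = plant_output_w / all_in_watts_per_gpu
print(f"{gpus_powered:,.0f} GPUs")   # ~1,000,000 GPUs, i.e. a million-GPU datacenter
```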
Nvidia currently produces less than 100k high end enterprise GPUs per year in total, so you can’t even produce this datacenter unless Nvidia grows by about 10x and TSMC grows by perhaps 2x.
The datacenter would likely cost over a hundred billion dollars, and the resulting models would be proportionally more expensive to run, such that it's unclear whether this would be a win (at least using current tech). Sure, I do think there is some room for software improvement.
But no, I do not think that this hypothetical, not-currently-achievable GPT-5 (even if you were running 100k instances of it) would "likely be easily capable of disempowering humanity".
Of course if we talk longer term, the brain is obviously evidence that one human brain's worth of computing power can be achieved in about 10 watts, so the 1GW power plant could support a population of 100 million uploads or neuromorphic AGIs. That's very much part of my model (and Hanson's, and Moravec's), eventually.
Remember this post is all about critiquing EY’s specific doom model which involves fast foom on current hardware through recursive self-improvement.
If you have read much of my writings, you should know that I believe it's obvious we will end up with AIs much smarter than humans—but mainly because they will run faster using much more power. In fact this prediction has already come to pass in a limited sense: GPT4 was probably trained on over 100 human lifetimes worth of virtual time/data using only about 3 months of physical time, which represents a roughly 10,000x time dilation (but thankfully only for training, not for inference).
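Roughly, the arithmetic behind that figure looks like this (the 100 lifetimes and the ~25 effective years of reading per lifetime are loose illustrative assumptions):

```python
# Rough arithmetic behind the ~10,000x training "time dilation" figure.
# The 100 lifetimes and ~25 effective years of reading per lifetime are
# illustrative assumptions.
lifetimes_of_text = 100
years_per_lifetime = 25
training_time_years = 3 / 12                             # ~3 months of physical time
virtual_years = lifetimes_of_text * years_per_lifetime   # ~2,500 virtual years
dilation = virtual_years / training_time_years           # ~10,000x
print(f"{dilation:,.0f}x")
```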
Despite your claim to be “genuinely curious and confused,” the overarching tone and content of this bit does not strike me as projecting curiosity or confusion, but instead confident and sharp-toned burden-of-proof-shifting to habryka. That’s merely a stylistic note, not impacting the content of your claims.
It sounds here like you are agreeing with him that you can deal with any limits on ops/mm^3 by simply building a bigger computer. It's therefore hard for me to see why these arguments about efficiency limitations matter very much for AI's ability to be superintelligent and exhibit superhuman takeover capabilities.
I can see why maybe human brains, being efficient according to certain metrics, might be a useful tool for the AI to keep around, but I don’t see why we ought to feel at all reassured by that. I don’t really want to serve out the end of my days as an AI’s robot.
Just as a piece of feedback, this sort of comment is not very enlightening, and it doesn't convince me that you have background knowledge that ought to make me believe you and not habryka. If your aim is to "teach the crowd," then here's an N of 1: I am not being effectively taught by this bit.
So it takes 1% of a single power plant output to train GPT4. If GPT-4 got put on chips and distributed, wouldn’t it take only a very small amount of power comparatively to actually run it once trained? Why are we talking about training costs rather than primarily about the cost to operate models once they have been trained?
You have output a lot of writings on this subject and I’ve only read a fraction. Do you argue this point somewhere? This seems pretty cruxy. In my model, at this point, doomers are partly worried about what happens during training of larger and larger models, but they’re also worried about what happens when you proliferate many copies of extant models and put them into something like an AutoGPT format, where they can make plans, access real-world resources, and use plugins.
Again, I think trying to focus the conversation on efficiency by itself is not enough. Right now, I feel like you have a very deep understanding of efficiency, but that once the conversation moves away from the topic of efficiency along some metric and into topics like “why can’t AI just throw more resources at expanding itself” or “why can’t AI take over with a bunch of instances of ChatGPT 4 or 5 in an AutoGPT context” or “why should I be happy about ending my days as the AI’s efficient meat-brain robot,” it becomes hard for me to follow your argument or understand why it’s an important “contra Yudkowsky” or “contra central Yudkowsky-appreciating doomers” argument.
It feels more like constructing, I don't know, a valid critique of some of the weakest parts (maybe?) of Yudkowsky's entire written oeuvre having to do with the topic of efficiency and nanobots, and then saying that because this fails, the entire Yudkowskian argument about AI doom fails, or even that arguments for AI takeover scenarios fail generally. And I am not at all convinced you've shown that it's on the foundation of Yudkowsky's (mis?)understanding of efficiency that the rest of his argument stands or falls.
I've never feared grey goo or diamondoid bacteria. Mainly, I worry that a malevolent AI might not even need very much intelligence to steamroll humanity. It may just need a combination of the willingness to do damage, the ability to be smarter than a moderately intelligent person in order to hack our social systems, and the ability to proliferate across our computing infrastructure while not caring about what happens to "itself" should individual instances get "caught" and deleted. So much of our ability to mitigate terrorism depends on the ability to catch and punish offenders: the idea that perpetrators care about their own skin and are confined within it. It is the alienness of the form of the AI, and its ability to perform on par with or better than humans in important areas and to expand the computing and material resources available to itself, that makes it seem dangerous to me.
I see how that tone could come off as rude, but really I don’t understand habryka’s model when he says “a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.”
The transformer arch is fully parallelizable only during training; for inference on GPUs/accelerators it is roughly as inefficient as RNNs, or more so. The inference costs of GPT4 are of course an OpenAI/Microsoft secret, but it is not a cheap model. Also, human-level AGI, let alone superintelligence, will likely require continual learning/training.
I guess by "put on chips" you mean baking GPT-4 into an ASIC? That usually doesn't make sense for fast-changing tech, but as Moore's law slows we could start seeing it. I would expect various other changes to tensor cores well before that.
The entire specific Yudkowskian argument about hard takeoff via recursively self-improving AGI fooming to nanotech is specifically what I'm arguing against, and indeed I only need to argue against the weakest links.
See the section on That Alien Mindspace. DL-based AGI is anthropomorphic (as I predicted), not alien.
Re Yudkowsky, I don’t think his entire argument rests on efficiency, and the pieces that don’t can’t be dispatched by arguing about efficiency.
Regarding “alien mindspace,” what I mean is that the physical form of AI, and whatever awareness the AI has of that, makes it alien. Like, if I knew I could potentially transmit my consciousness with perfect precision over the internet and create self-clones almost effortlessly, I would think very differently than I do now.
His argument entirely depends on efficiency. He claims that near future AGI somewhat smarter than us creates even smarter AGI and so on, recursively bottoming out in something that is many many OOM more intelligent than us without using unrealistic amounts of energy, and all of this happens very quickly.
So that’s entirely an argument that boils down to practical computational engineering efficiency considerations. Additionally he needs the AGI to be unaligned by default, and that argument is also faulty.
In your other recent comment to me, you said:
It seems like in one place, you're saying EY's model depends on near-term engineering practicality, and in another, that it depends on physics-constrained efficiency, which you argue invalidates it. Being no expert on the physics-based efficiency arguments, I'm happy to concede the physics constraints. But I'm struggling to understand their relevance to non-physics-based efficiency arguments or their strong bearing on matters of engineering practicality.
My understanding is that your argument goes something like this:
1. You can't build something many OOMs more intelligent than a brain on hardware with roughly the same size and energy consumption as the brain.
2. Therefore, building a superintelligent AI would require investing more energy and more material resources than a brain uses.
3. Therefore… and here's where the argument loses steam for me. Why can't we or the AI just invest lots of material and energy resources? How much smarter than us does an unaligned AI need to be to pose a threat, and why should we think resources are a major constraint on getting it to recursively self-improve to that point? Why should we think it will need constant retraining to recursively self-improve? Why do we think it'll want to keep an economy going?
As far as the “anthropomorphic” counterargument to the “vast space of alien minds” thing, I fully agree that it appears the easiest way to predict tokens from human text is to simulate a human mind. That doesn’t mean the AI is a human mind, or that it is intrinsically constrained to human values. Being able to articulate those values and imitate behaviors that accord with those values is a capability, not a constraint. We have evidence from things like ChaosGPT or jailbreaks that you can easily have the AI behave in ways that appear unaligned, and that even the appearance of consistent alignment has to be consistently enforced in ways that look awfully fragile.
Overall, my sense is that you’ve admirably spent a lot of time probing the physical limits of certain efficiency metrics and how they bear on AI, and I think you have some intriguing arguments about nanotech and “mindspace” and practical engineering as well.
However, I think your arguments would be more impactful if you carefully and consistently delineated these different arguments and attached them more precisely to the EY claims you’re rebutting, and did more work to show how EY’s conclusion X flows from EY’s argument A, and that A is wrong for efficiency reason B, which overturns X but not Y; you disagree with Y for reason C, overturning EY’s argument D. Right now, I think you do make many of these argumentative moves, but they’re sort of scattered across various posts and comments, and I’m open to the idea that they’re all there but I’ve also seen enough inconsistencies to worry that they’re not. To be clear, I would absolutely LOVE it if EY did the very same thing—the burden of proof should ideally not be all on you, and I maintain uncertainty about this whole issue because of the fragmented nature of the debate.
So at this point, it's hard for me to update beyond "some arguments about efficiency and mindspace and practical engineering and nanotech are big points of contention between Jacob and Eliezer." I'd like to go further and, with you, reject arguments that you believe to be false, but I'm not able to do that yet because of the issue I'm describing here. While I'm hesitant to burden you with additional work, I don't have the background or the familiarity with your previous writings to do this very effectively—at the end of the day, if anybody's going to bring your argument together all in one place and make it crystal clear, I think that person has to be you.
You just said in your comment to me that a single power plant is enough to run 100M brains. It seems like you need zero hardware progress in order to get something much smarter without unrealistic amounts of energy, so I just don’t understand the relevance of this.
I said longer term—using hypothetical brain-parity neuromorphic computing (uploads or neuromorphic AGI). We need enormous hardware progress to reach that.
Current tech on GPUs requires large supercomputers to train 1e25+ FLOPs models like GPT4, which are approaching, but not quite at, human-level AGI. If the rumour of 1T params is true, then it takes a small cluster and ~10 kW just to run some smallish number of instances of the model.
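As a rough sketch of why that is (the bytes-per-parameter, 80 GB of memory per accelerator, and ~1 kW per accelerator figures are assumed round numbers, not known GPT4 specs):

```python
# Rough sketch of why a ~1T-parameter model needs a small cluster just to serve.
# The bytes-per-parameter, 80 GB per accelerator, and ~1 kW per accelerator
# figures are illustrative assumptions.
params = 1e12
gpu_mem_bytes = 80e9        # assumed memory per accelerator
kw_per_gpu = 1.0            # assumed all-in power per accelerator

for bytes_per_param in (2, 1):   # fp16 weights vs. 8-bit quantized weights
    weight_bytes = params * bytes_per_param
    gpus_needed = weight_bytes / gpu_mem_bytes
    print(f"{bytes_per_param} B/param: ~{gpus_needed:.0f} GPUs, ~{gpus_needed * kw_per_gpu:.0f} kW")
# -> roughly 12-25 GPUs and ~10-25 kW just to hold the weights, before
#    accounting for activations, KV cache, or batching.
```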
Getting something much much smarter than us would require enormous amounts of computation and energy without large advances in software and hardware.
Sure. We will probably get enormous hardware progress over the next few decades, so that’s not really an obstacle.
It seems to me your argument is “smarter than human intelligence cannot make enormous hardware or software progress in a relatively short amount of time”, but this has nothing to do with “efficiency arguments”. The bottleneck is not energy, the bottleneck is algorithmic improvements and improvements to GPU production, neither of which is remotely bottlenecked on energy consumption.
No, as you said, it would require like, a power plant worth of energy. Maybe even like 10 power plants or so if you are really stretching it, but as you said, the really central bottleneck here is GPU production, not energy in any relevant way.
As we get more hardware and slow, mostly-aligned AGI/AI progress, the bar for foom rises further.
That is actually an efficiency argument, and in my brain efficiency post I discuss multiple sub components of net efficiency that translate into intelligence/$.
Ahh I see—energy efficiency is tightly coupled to other circuit efficiency metrics, as they are all primarily driven by shrinkage. As hardware improvements increasingly bottom out, energy becomes an increasingly direct constraint. This is already happening with GPUs, where power consumption is roughly doubling with each generation and could soon dominate operating costs.
See here, where I line up the Roodman model against future energy usage predictions.
All that being said, I do agree that, yes, the primary bottleneck or crux for the EY fast takeoff/takeover seems to be the amount of slack in software and scaling laws. But only after we agree that there aren't obvious, easy routes for the AGI to bootstrap nanotech assemblers with many OOM greater compute per J than brains or current computers.
How much room is there in algorithmic improvements?
Maybe it would be a good idea to change the title of this essay to:
so as not to give people hope that there would be a counterargument somewhere in this article to his more general claim:
This seems like it straightforwardly agrees that energy efficiency is not in any way a bottleneck, so I don’t understand the focus of this post on efficiency.
I also don’t know what you mean by longer term. More room at the bottom was of course also talking longer term (you can’t build new hardware in a few weeks, unless you have nanotech, but then you can also build new factories in a few weeks), so I don’t understand why you are suddenly talking as if “longer term” was some kind of shift of the topic.
Eliezer's model is that we definitely won't have many decades with AIs smarter but not much smarter than humans, since there appear to be many ways to scale up intelligence, both via algorithmic progress and via hardware progress. Eliezer thinks that drexlerian nanotech is one of the main ways to do this, and if you buy that premise, then the efficiency arguments don't really matter, since clearly you can just scale things up horizontally and build a bunch of GPUs. But even if you don't, you can still just scale things up horizontally and increase GPU production (and in any case, energy efficiency is not the bottleneck here, it's GPU production, which this post doesn't talk about).
I don't understand the relevance of this. You seem to be talking now about a completely different scenario than what I understood Eliezer to be talking about. Eliezer does not think that a slightly superhuman AI would be capable of improving the efficiency of its hardware completely on its own.
Both scenarios (going big, in that you just use whole power-plant levels of energy, or going down, in that you improve the efficiency of chips) require changing semiconductor manufacturing, which is unlikely to be one of the first things a nascent AI does, unless it successfully develops and deploys drexlerian nanotech. Eliezer, in his model here, was talking about what reasonable limits we would be approaching relatively soon after an AI passes human level.
I don’t understand the relevance of thermodynamic efficiency to a foom scenario “on current hardware”. You are not going to change the thermodynamic efficiency of the hardware you are literally running on, you have to build new hardware for that either way.
To reiterate, the model of EY that I am critiquing is one where an AGI rapidly fooms through many OOM of efficiency improvements. All key required improvements are efficiency improvements—it needs to improve its world modelling/planning per unit compute, and/or improve compute per dollar and/or compute per joule, etc.
In EY's model there are some, perhaps many, OOM of software improvements over the initial NN architecture/algorithms, perhaps then continued with more OOM of hardware improvements. I don't believe "buying more GPUs" is a key part of his model—it is far, far too slow to provide even one OOM of upgrade. Renting/hacking your way to even one OOM more GPUs is also largely unrealistic (I run one of the larger GPU compute markets and talk to many suppliers; I have inside knowledge here).
Right, so I have arguments against drexlerian nanotech (Moore room at the bottom, but also the thermodynamic constraints indicating you just can't get many OOM from nanotech alone), and separate arguments against many OOM from software (mind software efficiency).
It is mostly relevant to the drexlerian nanotech, as it shows there likely isn’t much improvement over GPUs for all the enormous effort. If nanotech were feasible and could easily allow computers 6 OOM more efficient than the brain using about the same energy/space/materials, then I would more agree with his argument.
I don't think he's at all claiming safety is trivial or that humans can expect to remain in charge. Control-capture foom is very much permitted by his model and he says so directly; much bigger minds are allowed. But his model suggests that reflective algorithmic improvement is not the panacea that Yudkowsky expected, and that beating biology head to head is not easy even for a very superintelligent system.
This does not change any claim I would make about safety; it should barely be an update for anyone who has already updated off of deep learning. But it should knock down Yudkowsky's view of capability scaling in algorithms thoroughly. This is relevant to predicting which kinds of system are a threat to other systems, and how.
Presumably it takes a gigantic amount of compute to train a “brain the size of a factory”? If we assume that training a human-level AI will take 10^28 FLOP (which is quite optimistic), the Chinchilla scaling laws predict that training a model 10,000 times larger would take about 10^36 FLOP, which is far more than the total amount of compute available to humans cumulatively over our history.
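For transparency, here is a minimal sketch of that scaling arithmetic, assuming the usual Chinchilla-style rule that training compute grows roughly with the square of parameter count when data is scaled proportionally:

```python
# Sketch of the scaling arithmetic above, using the common Chinchilla-style
# approximation C ~ 6*N*D with D scaled proportionally to N, so C grows ~N^2.
# The 1e28 FLOP baseline and the 10,000x size factor come from the comment;
# the quadratic scaling is the assumption being applied.
baseline_flop = 1e28       # assumed compute to train a human-level model
size_factor = 1e4          # "brain the size of a factory": 10,000x more params
scaled_flop = baseline_flop * size_factor**2
print(f"{scaled_flop:.0e} FLOP")   # ~1e36 FLOP
```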
By the time the world is training factory-sized brains, I expect human labor to already have been made obsolete by previous generations of AIs that were smarter than us, but not vastly so. Presumably this is Jacob’s model of the future too?