Contra Yudkowsky on AI Doom
Eliezer Yudkowsky predicts doom from AI: that humanity faces likely extinction in the near future (years or decades) from a rogue, unaligned, superintelligent AI system. Moreover, he predicts that this is the default outcome, and that AI alignment is so incredibly difficult that even he failed to solve it.
EY is an entertaining and skilled writer, but do not confuse rhetorical writing talent with depth and breadth of technical knowledge. I do not have EY's talents there, or Scott Alexander's poetic powers of prose. My skill points have instead gone nearly exclusively toward extensive study of neuroscience, deep learning, and graphics/GPU programming. More than most, I actually have the depth and breadth of technical knowledge necessary to evaluate these claims in detail.
I have evaluated this model in detail and found it substantially incorrect, and in fact brazenly, naively overconfident.
Intro
Even though the central prediction of the doom model is necessarily unobservable for anthropic reasons, alternative models (such as my own, or Moravec's, or Hanson's) have already made substantially better predictions, such that EY's doom model has low posterior probability.
EY has espoused this doom model for over a decade, and hasn’t updated it much from what I can tell. Here is the classic doom model as I understand it, starting first with key background assumptions/claims:
1. Brain inefficiency: The human brain is inefficient on multiple dimensions/metrics that translate into intelligence per dollar, and inefficient as a hardware platform on key metrics such as thermodynamic efficiency.
2. Mind inefficiency, or human incompetence: In terms of software, he describes the brain as an inefficient, complex "kludgy mess of spaghetti-code". He derived these insights from the influential evolved modularity hypothesis, as popularized in ev psych by Tooby and Cosmides. He pooh-poohed neural networks, and in fact actively bet against them in his actions: hiring researchers trained in abstract math/philosophy, ignoring neuroscience and early DL, etc.
3. More room at the bottom: Naturally dovetailing with points 1 and 2, EY confidently predicts there is enormous room for further software and hardware improvement, the latter especially through strong drexlerian nanotech.
4. That Alien Mindspace: EY claims human mindspace is an incredibly narrow, twisty, complex target to hit, whereas the space of AI minds is vast, and AI designs will be something like random rolls from this vast alien landscape, resulting in an incredibly low probability of hitting the narrow human target.
Doom naturally follows from these assumptions: sometime in the near future some team discovers the hidden keys of intelligence and creates a human-level AGI, which then rewrites its own source code, initiating a recursive self-improvement cascade that ultimately increases the AGI's computational efficiency (intelligence/$, intelligence/J, etc) by many OOM, far surpassing human brains; the AGI then quickly develops strong nanotech and kills all humans within a matter of days or even hours.
If assumptions 1 and 2 don't hold (relative to 3), then there is little to no room for recursive self-improvement. If assumption 4 is completely wrong, then the default outcome is not doom regardless.
Every one of his key assumptions is mostly wrong, as I and others predicted well in advance. EY seems to have been systematically overconfident as an early futurist, and then perhaps updated later to avoid specific predictions, but without much updating of his mental models (specifically his nanotech-woo model, as we will see).
Brain Hardware Efficiency
EY correctly recognizes that thermodynamic efficiency is a key metric for computation/intelligence, and he confidently, brazenly claims (as of late 2021) that the brain is not that efficient, and is about 6 OOM from thermodynamic limits:
Which brings me to the second line of very obvious-seeming reasoning that converges upon the same conclusion—that it is in principle possible to build an AGI much more computationally efficient than a human brain—namely that biology is simply not that efficient, and especially when it comes to huge complicated things that it has started doing relatively recently.
ATP synthase may be close to 100% thermodynamically efficient, but ATP synthase is literally over 1.5 billion years old and a core bottleneck on all biological metabolism. Brains have to pump thousands of ions in and out of each stretch of axon and dendrite, in order to restore their ability to fire another fast neural spike. The result is that the brain’s computation is something like half a million times less efficient than the thermodynamic limit for its temperature—so around two millionths as efficient as ATP synthase. And neurons are a hell of a lot older than the biological software for general intelligence!
The software for a human brain is not going to be 100% efficient compared to the theoretical maximum, nor 10% efficient, nor 1% efficient, even before taking into account the whole thing with parallelism vs. serialism, precision vs. imprecision, or similarly clear low-level differences.
EY is just completely out of his depth here: he doesn’t seem to understand how the Landauer limit actually works, doesn’t seem to understand that synapses are analog MACs which minimally require OOMs more energy than simple binary switches, doesn’t seem to have a good model of the interconnect requirements, etc.
Some attempt to defend EY by invoking reversible computing, but EY explicitly grants that ATP synthase may be close to 100% thermodynamically efficient, and explicitly attributes the brain's claimed extreme inefficiency to the specific cause of pumping "thousands of ions in and out of each stretch of axon and dendrite", a cost which would be irrelevant in a comparison against some exotic reversible superconducting or optical computer. He never mentions reversible computing, and the framing "biology is simply not that efficient" helps establish that we are both discussing conventional irreversible computation, not exotic reversible or quantum computing (neither of which is practical in the near future, nor relevant for the nanotech he envisions, which is fundamentally robotic and thus constrained by the efficiency of applying energy to irreversibly transform matter). He seems to believe biology is inefficient even given the practical constraints it is working with, not merely inefficient compared to all possible future hypothetical exotic computing platforms ignoring other tradeoffs. Finally, if he actually believed (as I do) that brains are efficient within the constraints of conventional irreversible computation, that would substantially weaken his larger argument, and EY is not the kind of writer who weakens his own arguments.
In actuality biology is incredibly thermodynamically efficient, and generally seems to be near pareto-optimal in that regard at the cellular nanobot level, but we’ll get back to that.
In a 30-year human "training run" the brain uses somewhere between 1e23 and 1e25 flops. ANNs trained with this amount of compute already capture much, but not all, of human intelligence. One likely reason is that flops is not the only metric of relevance: a human brain training run also uses 1e23 to 1e25 bytes of memops, which is still OOMs more than the likely largest ANN training run to date (GPT4), because GPUs have a 2 or 3 OOM gap between flops and memops.
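As a sanity check, here is a minimal back-of-the-envelope sketch of that flop range. The ~1e14 synapse count and the 1-100 Hz range of effective synaptic operation rates are illustrative assumptions on my part, not figures from the post:

```python
# Back-of-the-envelope "training compute" for a 30-year human lifetime.
# Assumed (illustrative): ~1e14 synapses, each performing 1 to 100 effective
# synaptic operations per second, counted as ~1 flop-equivalent each.
SECONDS_PER_YEAR = 3.15e7
synapses = 1e14
seconds = 30 * SECONDS_PER_YEAR                  # ~9.5e8 s

for rate_hz in (1, 100):
    total_ops = synapses * rate_hz * seconds
    print(f"{rate_hz:>3} Hz -> ~{total_ops:.1e} flop-equivalents")
# ~1e23 at 1 Hz and ~1e25 at 100 Hz, i.e. the quoted range.
```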
My model instead predicts that AGI will require GPT4-ish levels of training compute, and SI will require far more. To the extent that recursive self-improvement is actually a thing in the NN paradigm, it's something that NNs mostly just do automatically (and something the brain currently still does better than ANNs).
Mind Software Efficiency
EY derived much of his negative beliefs about the human mind from the cognitive biases and ev psych literature, and especially Tooby and Cosmides' influential evolved modularity hypothesis. The primary competitor to evolved modularity was/is the universal learning hypothesis and the associated scaling hypothesis, and there was already sufficient evidence to rule out evolved modularity back in 2015 or earlier.
Let's quickly assess the predictions of evolved modularity vs universal learning/scaling. Evolved modularity posits that the brain is a kludgy mess of domain-specific evolved mechanisms ("spaghetti code" in EY's words), and thus AGI will probably not come from brain reverse engineering. On this view, human intelligence is exceptional because evolution figured out some "core to generality" that prior primate brains lack, humans have only the minimal early version of it, and there is likely huge room for further improvement.
The universal learning/scaling model instead posits that there is a single obvious algorithmic signature for intelligence (approx bayesian inference), that it isn't that hard to figure out, that evolution found it multiple times, and that human DL researchers also figured much of it out in the 90s: ie intelligence is easy, it just takes enormous amounts of compute for training. As long as you don't shoot yourself in the foot (your architectural prior is flexible enough, e.g. transformers; your approximation to bayesian inference actually converges correctly, normalization etc), the amount of intelligence you get is proportional to the net compute spent on training. The human brain isn't exceptional: it's just a scaled-up primate brain, but scaling up the net training compute by ~10x (3x from a larger brain, 3x from extended neoteny, and some from arch/hyperparam tweaking) was enough for linguistic intelligence and the concomitant turing transition to emerge[1]. EY hates the word emergence, but intelligence is an emergent phenomenon.
The universal learning/scaling model was largely correct, as tested by OpenAI scaling up GPT to proto-AGI.
That does not mean we are on the final scaling curve. The brain is of course strong evidence of other scaling choices that look different from chinchilla scaling. A human brain's natural 'clock rate' of about 100Hz supports a thoughtspeed of about 10 tokens per second, or only about 10 billion tokens per lifetime training run. GPT3 trained on about 50x a lifetime's worth of experience/data, and GPT4 may have trained on 1000 human lifetimes of experience/data. You can spend roughly the same compute budget training a huge brain-sized model for just one human lifetime, or spend it on a 100x smaller model trained for far longer. You don't end up in exactly the same place of course: GPT4 has far more crystallized knowledge than any one human, but seems to still lack much of a human domain expert's fluid intelligence capabilities.
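A minimal sketch of that compute-allocation tradeoff, using the standard C ~ 6*N*D training-compute approximation and a chinchilla-style D ~ 20*N rule of thumb; the brain-like numbers are illustrative assumptions, not measurements:

```python
import math

def train_flops(n_params, n_tokens):
    # Standard approximation: training compute C ~ 6 * N * D.
    return 6 * n_params * n_tokens

# "Brain-like" allocation (assumed): a huge model on one lifetime of tokens.
N_brain, D_brain = 1e14, 1e10            # ~1e14 synapses/params, ~1e10 lifetime tokens
C = train_flops(N_brain, D_brain)        # ~6e24 flops

# Chinchilla-style allocation of the same budget: D ~ 20*N  =>  C ~ 120*N^2.
N_chin = math.sqrt(C / 120)
D_chin = 20 * N_chin

print(f"budget:     C ~ {C:.1e} flops")
print(f"brain-like: N ~ {N_brain:.1e} params, D ~ {D_brain:.1e} tokens")
print(f"chinchilla: N ~ {N_chin:.1e} params, D ~ {D_chin:.1e} tokens")
# The chinchilla-style model comes out a few hundred times smaller than the
# brain-sized one, trained on correspondingly more data.
```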
Moore room at the Bottom
If the brain really is ~6 OOM from thermodynamic efficiency limits, then we should not expect Moore's Law to end with brains still having a non-trivial thermodynamic efficiency advantage over digital computers. Except that is exactly what is happening. TSMC is approaching the limits of circuit miniaturization, and it is increasingly obvious that fully closing the (now not so large) gap with the brain will require more directly mimicking it through neuromorphic computing[2].
Biological cells operate directly at thermodynamic efficiency limits: they copy DNA using near-minimal energy, and in general they perform the robotics tasks of rearranging matter using near-minimal energy. For nanotech replicators (and nanorobots in general) like biological cells, thermodynamic efficiency is the dominant constraint, and biology is already pareto-optimal there. No SI will ever create strong nanotech that significantly improves on the thermodynamic efficiency of biology, unless/until it can rewrite the laws of physics.
Of course an AGI could still kill much of humanity using advanced biotech weapons (e.g. a supervirus), but that is beyond the scope of EY's specific model, and for various reasons mostly stemming from the strong prior that biology is super efficient I expect humanity to be very difficult to kill in this way (and growing harder to kill every year as we advance prosaic AI tech). Also, killing humanity would likely not be in the best interests of even an unaligned AGI, because humans will probably continue to be key components of the economy (as highly efficient general-purpose robots) long after AGI running in datacenters takes most higher-paying intellectual jobs. So instead I expect unaligned power-seeking AGIs to adopt much more covert strategies for world domination.
That Alien Mindspace
In the “design space of minds in general” EY says:
Any two AI designs might be less similar to each other than you are to a petunia.
Quintin Pope has already written out a well argued critique of this alien mindspace meme from the DL perspective, and I already criticized this meme once when it was fresh over a decade ago. So today I will instead take a somewhat different approach (an updated elaboration of my original critique).
Imagine we have some set of mysterious NNs, which we’d like to replicate, but we only have black box access. By that I mean we have many many examples of the likely partial inputs and outputs of these networks, and some ideas about the architecture, but we don’t have any direct access to the weights.
It turns out there is a simple and surprisingly successful technique which one can use to create an arbitrary partial emulation of any ensemble of NNs: distillation. In essence distillation is simply the process of training one NN on the collected inputs/outputs of other NNs, such that it learns to emulate them.
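As a toy sketch of the technique (not anything from the post itself; the shapes, temperature, and training setup are arbitrary), a student network can be trained purely on a frozen teacher's input/output behavior:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy distillation: the "teacher" is treated as a black box whose weights we
# never touch; the "student" learns to match its output distribution.
teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
student = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
for p in teacher.parameters():
    p.requires_grad_(False)                      # frozen / inaccessible

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0                                          # softmax temperature for soft targets

for step in range(1000):
    x = torch.randn(256, 32)                     # stand-in for collected inputs
    with torch.no_grad():
        soft_targets = F.softmax(teacher(x) / T, dim=-1)
    log_probs = F.log_softmax(student(x) / T, dim=-1)
    loss = F.kl_div(log_probs, soft_targets, reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
```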
This is exactly how we train modern large ANNs, and LLMs specifically: by training them on the internet, we are training them on human thoughts and thus (partially) distilling human minds.
Thus my model (or the systems/cybernetic model in general) correctly predicted, well in advance, that LLMs would have anthropomorphic cognition: mirroring much of our seemingly idiosyncratic cognitive biases, quirks, and limitations. Thus we have AGI that can write poems and code (like humans) but struggles with multiplying numbers (like humans), generally exhibits human-like psychology, and is susceptible to flattery, priming, the Jungian "shadow self" effect, etc. There is a large and growing pile of specific evidence that LLMs are distilling/simulating human minds, some of which I and others have collected in prior posts, but the strength of this argument should already clearly establish a strong prior expectation that distillation is the default outcome.
The width of mindspace is completely irrelevant. Moravec, I, and the other systems-thinkers were correct: AIs are and will be our mind children; the technosphere extends the noosphere.
This alone does not strongly entail that AGI will be aligned by default, but it does defeat EY's argument that AGI will be unaligned by default (and he loses many bayes points, which I gain).
The Risk Which Remains
To be clear, I am not arguing that AGI is not a threat. It is rather obviously the pivotal eschatonic event, the closing chapter in human history. Of course 'it' is dangerous, for we are dangerous. But that does not mean 1.) that extinction is the most likely outcome, or 2.) that alignment is intrinsically more difficult than AGI, or 3.) that EY's specific arguments are an especially relevant or correct way to arrive at any such conclusions.
You will likely die, but probably not because of a nanotech holocaust initiated by a god-like machine superintelligence. Instead you will probably die when you simply can no longer afford the tech required to continue living. If AI does end up causing humanity’s extinction, it will probably be the result of a slow more prosaic process of gradually out-competing us economically. AGI is not inherently mortal and can afford patience, unlike us mere humans.
[1] The turing transition: brains evolved linguistic symbolic communication, which permits compressing and sharing thoughts across brains, forming a new layer of networked social computational organization and allowing minds to emerge as software entities. This is a one-time transition, as there is nothing more universal/general than a turing machine.
[2] Neuromorphic computing is the main eventual long-term threat to current GPUs/accelerators and a continuation of the trend of embedding efficient matrix ops into the hardware, but it is unlikely to completely replace them for various reasons, and I don't expect it to be very viable until traditional moore's law has mostly ended.
I feel like even under the worldview that your beliefs imply, a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.
Maybe it will do that using GPUs, or maybe it will do that using some more neuromorphic design, but I really don’t understand why energy density matters very much. The vast majority of energy that current humans produce is of course not spent on running human brains, and there are easily 10-30 OOMs of improvement lying around without going into density (just using the energy output of a single power plant under your model would produce something that would likely be easily capable of disempowering humanity).
More broadly, you list these three “assumptions” of Eliezer’s worldview:
None of these strike me as “assumptions” (and also point 3 is just the same as point 1 as far as I can tell, and point 2 mischaracterizes at least my beliefs, and I would bet also would not fit historical data, but that’s a separate conversation).
Having more room at the bottom is just one of a long list of ways to end up with AIs much smarter than humans. Maybe you have rebuttals to all the other ways AIs could end up much smarter than humans (like just using huge datacenters, or doing genetic engineering, or being able to operate at much faster clock speeds), in which case I am quite curious about that, but I would definitely not frame these as “necessary assumptions for a foom-like scenario”.
I’m going to expand on this.
Jacob’s conclusion to the speed section of his post on brain efficiency is this:
Let’s accept all Jacob’s analysis about the tradeoffs of clock speed, memory capacity and bandwidth.
The force of his conclusion depends on the superintelligence “running on equivalent hardware.” Obviously, core to Eliezer’s superintelligence argument, and habryka’s comment here, is the point that the hardware underpinning AI can be made large and expanded upon in a way that is not possible for human brains.
Jacob knows this, and addresses it in comments in response to Vaniver pointing out that birds may be more efficient than jet planes in terms of calories/mile flown, but that when the relevant metric is top speed or human passengers carried, the jet wins. Jacob responds:
So the crux here appears to be about the practicality of replacing human brains with factory-sized artificial ones, in terms of physical resource limitations.
Daniel Kokotajlo disagrees that this is important:
Jacob doubles down that it is:
So Jacob here admits that energy is not a 'taut constraint' for early AGI, and that at the same time it will be a larger fraction of the cost. In other words, it's not a bottleneck for AGI, and no other resource is either.
This is where Jacob’s discussion ended.
So I think Jacob has at least two jobs to do to convince me. I would be very pleased and appreciative if he achieved just one of them.
First, he needs to explain why any efficiency constraints can’t be overcome by just throwing a lot of material and energy resources into building and powering inefficient or as-efficient-as-human-brains GPUs. If energy is not a taut constraint for AGI, and it’s also expected to be an increasing fraction of costs over time, then that sounds like an argument that we can overcome any efficiency limits with increasing expenditures to achieve superhuman performance.
Second, he needs to explain why things like energy, size, or ops/sec efficiency are the most important efficiency metrics as opposed to things like “physical tasks/second,” or “brain-size intelligences produced per year,” or “speed at which information can be taken in and processed via sensors positioned around the globe.” There are so very many efficiency (“useful output/resource input”) metrics that we can construct, and on many of them, the human brain and body are demonstrably nowhere near the physical limit.
Right now, doubling down on physics-based efficiency arguments, as he's doing here, doesn't feel like a winning strategy to me.
If Jake claims to disagree with the claim that ai can starkly surpass humans [now disproven—he has made more explicit that it can], I’d roll my eyes at him. He is doing a significant amount of work based on the premise that this ai can surpass humans. His claims about safety must therefore not rely on ai being limited in capability; if his claims had relied on ai being naturally capability bounded I’d have rolled to disbelieve [edit: his claims do not rely on it]. I don’t think his claims rely on it, as I currently think his views on safety are damn close to simply being a lower resolution version of mine held overconfidently [this is intended to be a pointer to stalking both our profiles]; it’s possible he actually disagrees with my views, but so far my impression is he has some really good overall ideas but hasn’t thought in detail about how to mitigate the problems I see. But I have almost always agreed with him about the rest of the points he explicitly spells out in OP, with some exceptions where he had to talk me into his view and I eventually became convinced. (I really doubted the energy cost of the brain being near optimal for energy budget and temperature target. I later came to realize it being near optimal is fundamental to why it works at all.)
from what he’s told me and what I’ve seen him say, my impression is he hasn’t looked quite as closely at safety as I have, and to be clear, I don’t think either of us are proper experts on co-protective systems alignment or open source game theory or any of that fancy high end alignment stuff; I worked with him first while I was initially studying machine learning 2015-2016, then we worked together on a research project which then pivoted to building vast.ai. I’ve since moved on to more studying, but given assumption of our otherwise mostly shared background assumptions with varying levels of skill (read: he’s still much more skilled on some core fundamentals and I’ve settled into just being a nerd who likes to read interesting papers), I think our views are still mostly shared to the degree our knowledge overlaps.
@ Jake, re: safety, I just wish you had the kind of mind that was habitually allergic to C++’s safety issues and desperate for the safety of rustlang, exactly bounded approximation is great. Of course, we’ve had that discussion many times, he’s quite the template wizard, with all the good and bad that comes with that.
(open source game theory is a kind of template magic)
Respectfully, it’s hard for me to follow your comment because of the amount of times you say things like “If Jake claims to disagree with this,” “based on the premise that this is false,” “must therefore not rely on it or be false,” and “I don’t think they rely on it.” The double negatives plus pointing to things with the word “this” and “it” makes me lose confidence in my ability to track your line of thinking. If you could speak in the positive and replace your “pointer terms” like “this” and “it” with the concrete claims you’re referring to, that would help a lot!
Understandable, I edited in clearer references—did that resolve all the issues? I’m not sure in return that I parsed all your issues parsing :) I appreciate the specific request!
It helps! There are still some double negatives (“His claims about safety must therefore not rely on ai not surpassing humans, or be false” could be reworded to “his claims about safety can only be true if they allow for AI surpassing humans,” for example), and I, not being a superintelligence, would find that easier to parse :)
The “pointers” bit is mostly fixed by you replacing the word “this” with the phrase “the claim that ai can starkly surpass humans.” Thank you for the edits!
I don’t need to explain that as I don’t believe it. Of course you can overcome efficiency constraints somewhat by brute force—and that is why I agree energy is not by itself an especially taut constraint for early AGI, but it is a taut constraint for SI.
You can’t overcome any limits just by increasing expenditures. See my reply here for an example.
I don't really feel this need, because EY already agrees thermodynamic efficiency is important, and I'm arguing specifically against core claims of his model.
Computation simply is energy organized towards some end, and intelligence is a form of computation. A superintelligence that can clearly overpower humanity is—almost by definition—something with greater intelligence than humanity, which thus translates into compute and energy requirements through efficiency factors.
It’s absolutely valid to make a local argument against specific parts of Eliezer’s model. However, you have a lot of other arguments “attached” that don’t straightforwardly flow from the parts of Eliezer’s model you’re mainly attacking. That’s a debate style choice that’s up to you, but as a reader who is hoping to learn from you, it becomes distracting because I have to put a lot of extra work into distinguishing “this is a key argument against point 3 from EY’s efficiency model” from “this is a side argument consisting of one assertion about bioweapons based on unstated biology background knowledge.”
Would it be better if we switched from interpreting your post as "a tightly focused argument on demolishing EY's core efficiency-based arguments," to "laying out Jacob's overall view on AI risk, with a lot of emphasis on efficiency arguments?" If that's the best way to look at it then I retract the objection I'm making here, except to say it wasn't as clear as it could have been.
The bioweapons point is something of a tangent, but I felt compelled to mention it because every time I've pointed out that strong nanotech can't have any core thermodynamic efficiency advantage over biology, someone has to mention superviruses or something, even though that isn't part of EY's model (he talks about diamond nanobots). But sure, that paragraph is something of a tangent.
EY's model requires slightly-smarter-than-us AGI running on normal hardware to start a FOOM cycle of recursive self-improvement resulting in many OOM of intelligence improvement in a short amount of time. That requires some combination of 1.) many OOM of software improvement on current hardware, 2.) many OOM of hardware improvement with current foundry tech, or 3.) completely new foundry tech with many OOM of improvement over current foundries, ie nanotech woo. The viability of any of this is entirely dependent on near-term engineering practicality.
I think I see what you’re saying here. Correct me if I’m wrong.
You’re saying that there’s an argument floating around that goes something like this:
And it's this argument specifically that you are dispatching with your efficiency arguments. Because, for inescapable physics reasons, AI will hit an efficiency wall, and it can't become more intelligent than humans on hardware with equivalent size, energy, and so on. Loosely speaking, it's impossible to build a device significantly smaller than a brain, using less power than a brain, that runs an AI more than 1-2 OOMs smarter than a brain, and we can certainly rule out a superintelligence 6 OOMs smarter than humans running on a device smaller and less energy-intensive than a brain.
You have other arguments about practical engineering constraints, the potential utility to an AI of keeping humans around, the difficulty of building grey goo, and so on, as well as the "alien minds" argument, but those are all based on separate counterarguments. You're also not arguing, based on efficiency considerations, about whether an AI just 2-100x as intelligent as humans might be dangerous.
You do have arguments in some or all of these areas, but the efficiency arguments are meant to just deal with this one specific scenario about a 6 OOM (not a 2 OOM) improvement in intelligence during a training run without accessing more hardware than was made available during the training run.
Is that correct?
I’m confused because you describe an “argument specifically that you are dispatching with your efficiency arguments”, and the first paragraph sounds like an EY argument, but the 2nd more like my argument. (And ‘dispatching’ is ambiguous)
Also “being already superintelligent” presumes the conclusion at the onset.
So let's restart:
1. Someone creates an AGI a bit smarter than humans.
2. It creates even smarter AGI by rewriting its own source code.
3. After the Nth iteration, once the software OOM improvements are tapped out, it creates nanotech assemblers to continue growing OOMs in power (or alternatively somehow gets OOM improvements with existing foundry tech, but that seems less likely as part of EY's model).
4. At some point it has more intelligence/compute than all of humanity, and kills us with nanotech or something.
EY and I agree on 1 but diverge past that. Point 2 is partly a matter of software efficiency but not entirely. Recall that I correctly predicted in advance that AGI requires brain-like massive training compute, which largely defeats EY’s view of 2 where it’s just a modest “rewrite of its own source code”. The efficiency considerations matter for both 2 and 3, as they determine how effectively it can quickly turn resources (energy/materials/money/etc) into bigger better training runs to upgrade its intelligence.
Ugh yes, I have no idea why I originally formatted it with the second paragraph quoted as I had it originally (which I fully intended as an articulation of your argument, a rebuttal to the first EY-style paragraph). Just a confusing formatting and structure error on my part. Sorry about that, thanks for your patience.
So as a summary, you agree that AI could be trained a bit smarter than humans, but you disagree with the model where AI could suddenly iteratively extract like 6 OOMs better performance on the same hardware it’s running on, all at once, figure out ways to interact with the physical world again within the hardware it’s already training on, and then strike humanity all at once with undetectable nanotech before the training run is even complete.
The inability of the AI to attain 6 OOMs better performance on its training hardware during its training run by recursively self-improving its own software is mainly based on physical efficiency limits, and this is why you put such heavy emphasis on them. And the idea that neural net-like structures that are very demanding in terms of compute, energy, space, etc appear to be the only tractable road to superintelligence means that there's no alternative, much more efficient scheme the neural net form of the AI could find to rewrite itself into a fundamentally more efficient architecture on this scale. Again, you have other arguments to deal with other concerns and to make other predictions about the outcome of training superintelligent AI, but dispatching this specific scenario is where your efficiency arguments are most important.
Is that correct?
Yes but I again expect AGI to use continuous learning, so the training run doesn’t really end. But yes I largely agree with that summary.
NN/DL in its various flavors are simply what efficient approx bayesian inference involves, and there are not viable non-equivalent dramatically better alternatives.
Thanks Jacob for talking me through your model. I agree with you that this is a model that EY and others associated with him have put forth. I’ve looked back through Eliezer’s old posts, and he is consistently against the idea that LLMs are the path to superintelligence (not just that they’re not the only path, but he outright denies that superintelligence could come from neural nets).
My update, based on your arguments here, is that any future claim about a mechanism for iterative self-improvement that happens suddenly, on the training hardware and involves > 2 OOMs of improvement, needs to first deal with the objections you are raising here to be a meaningful way of moving the conversation forward.
I am genuinely curious and confused as to what exactly you concretely imagine this supposed 'superintelligence' to be, such that it is not already the size of a factory, and such that you mention "size of a factory" as if that is something actually worth mentioning at all. Please show at least your first-pass fermi estimates for the compute requirements. By that I mean: what are the compute requirements for the initial SI, and then for the later, presumably more powerful 'factory'?
I would suggest reading more about advanced GPU/accelerator design, and then about datacenter design and the thermodynamic/cooling considerations therein.
This is so wildly ridiculous that you really need to show your work. I have already shown some calculations in these threads, but I’ll quickly review here.
A quick google search indicates 1GW is a typical power plant output, which in theory could power a datacenter of roughly a million GPUs. This is almost 100 times larger in power consumption than the current largest official supercomputer, Frontier, which has about 30k GPUs. The supercomputer used to train GPT4 is somewhat of a secret, but is estimated to be about that size. So at 50x to 100x you are talking about scaling up to something approaching a hypothetical GPT-5 scale cluster.
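A rough sketch of the arithmetic for the figures in this comment; the ~1kW of all-in power per installed GPU (accelerator plus host, networking, and cooling overhead) is my assumption, and the ~10W per brain figure is the one used later in this comment:

```python
# Rough power-plant arithmetic; per-GPU power is an assumed all-in figure.
plant_watts = 1e9            # ~1 GW: typical large power plant output
watts_per_gpu = 1_000        # assumed: accelerator + host + network + cooling
watts_per_brain = 10         # ~10 W per human-brain-equivalent (used below)

print(f"GPUs powered by one plant:     ~{plant_watts / watts_per_gpu:.0e}")    # ~1e6
print(f"'brains' powered by one plant: ~{plant_watts / watts_per_brain:.0e}")  # ~1e8
```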
Nvidia currently produces less than 100k high end enterprise GPUs per year in total, so you can’t even produce this datacenter unless Nvidia grows by about 10x and TSMC grows by perhaps 2x.
The datacenter would likely cost over a hundred billion dollars, and the resulting models would be proportionally more expensive to run, such that it’s unclear whether this would be a win (at least using current tech). Sure I do think there is some room for software improvement.
But no, I do not think that this hypothetical, not-currently-achievable GPT5 (even if you were running 100k instances of it) would "likely be easily capable of disempowering humanity".
Of course if we talk longer term, the brain is obviously evidence that one human-brain power can be achieved in about 10 watts, so the 1GW power plant could support a population of 100 million uploads or neuromorphic AGIs. That's very much part of my model (and Hanson's, and Moravec's), eventually.
Remember this post is all about critiquing EY’s specific doom model which involves fast foom on current hardware through recursive self-improvement.
If you have read much of my writings, you should know that I believe it's obvious we will end up with AIs much smarter than humans, but mainly because they will run faster using much more power. In fact this prediction has already come to pass in a limited sense: GPT4 was probably trained on over 100 human lifetimes' worth of virtual time/data using only about 3 months of physical time, which represents a 10000x time dilation (but thankfully only for training, not for inference).
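The time-dilation figure is simple arithmetic, using the 30-year "lifetime" from the post and an assumed ~3 months of physical training time:

```python
lifetimes = 100            # human lifetimes of experience/data (claimed above)
years_per_lifetime = 30    # lifetime "training run" length used in the post
training_years = 3 / 12    # assumed ~3 months of physical training time

dilation = (lifetimes * years_per_lifetime) / training_years
print(f"time dilation: ~{dilation:,.0f}x")   # ~12,000x, i.e. roughly 1e4
```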
Despite your claim to be “genuinely curious and confused,” the overarching tone and content of this bit does not strike me as projecting curiosity or confusion, but instead confident and sharp-toned burden-of-proof-shifting to habryka. That’s merely a stylistic note, not impacting the content of your claims.
It sounds here like you are agreeing with him that you can deal with any ops/mm^3 limits by simply building a bigger computer. It's therefore hard for me to see why these arguments about efficiency limitations matter very much for AI's ability to be superintelligent and exhibit superhuman takeover capabilities.
I can see why maybe human brains, being efficient according to certain metrics, might be a useful tool for the AI to keep around, but I don’t see why we ought to feel at all reassured by that. I don’t really want to serve out the end of my days as an AI’s robot.
Just as a piece of feedback, this sort of comment is not very enlightening, and it doesn’t convince me that you have background knowledge that ought to make me believe you and not habryka. If your aim is to “teach the crowd,” N of 1, but I am not being effectively taught by this bit.
So it takes 1% of a single power plant output to train GPT4. If GPT-4 got put on chips and distributed, wouldn’t it take only a very small amount of power comparatively to actually run it once trained? Why are we talking about training costs rather than primarily about the cost to operate models once they have been trained?
You have output a lot of writings on this subject and I’ve only read a fraction. Do you argue this point somewhere? This seems pretty cruxy. In my model, at this point, doomers are partly worried about what happens during training of larger and larger models, but they’re also worried about what happens when you proliferate many copies of extant models and put them into something like an AutoGPT format, where they can make plans, access real-world resources, and use plugins.
Again, I think trying to focus the conversation on efficiency by itself is not enough. Right now, I feel like you have a very deep understanding of efficiency, but that once the conversation moves away from the topic of efficiency along some metric and into topics like “why can’t AI just throw more resources at expanding itself” or “why can’t AI take over with a bunch of instances of ChatGPT 4 or 5 in an AutoGPT context” or “why should I be happy about ending my days as the AI’s efficient meat-brain robot,” it becomes hard for me to follow your argument or understand why it’s an important “contra Yudkowsky” or “contra central Yudkowsky-appreciating doomers” argument.
It feels more like constructing, I don't know, a valid critique of some of the weakest parts (maybe?) of Yudkowsky's entire written oeuvre having to do with the topic of efficiency and nanobots, and then saying that because this fails, the entire Yudkowskian argument about AI doom fails, or even that arguments for AI takeover scenarios fail generally. And I am not at all convinced you've shown that it's on the foundation of Yudkowsky's (mis?)understanding of efficiency that the rest of his argument stands or falls.
I’ve never feared grey goo or diamondoid bacteria. Mainly, I worry that a malevolent AI might not even need very much intelligence to steamroll humanity. It may just need a combination of the willingness to do damage, the ability to be smarter than a moderately intelligent person in order to hack our social systems, and the ability to proliferate across our computing infrastructure while not caring about what happens to “itself” should individual instances get “caught” and deleted. So much of our ability to mitigate terrorism depends on the ability to catch and punish offenders, the idea that perpetrators care about their own skin and are confined within it. It is the alienness of the form of the AI, and its ability to perform on par with or exceed performance of humans in important areas, to expand the computing and material resources available to itself, that make it seem dangerous to me.
I see how that tone could come off as rude, but really I don’t understand habryka’s model when he says “a superintelligence will just make a brain the size of a factory, and then be in a position to outcompete or destroy humanity quite easily.”
The transformer arch is fully parallelizable only during training; for inference on GPUs/accelerators it is roughly as inefficient as RNNs, or worse. The inference costs of GPT4 are of course an openai/microsoft secret, but it is not a cheap model. Also, human-level AGI, let alone superintelligence, will likely require continual learning/training.
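A sketch of why autoregressive inference is expensive at small batch sizes: every generated token has to stream essentially all of the weights from memory, so decoding is bandwidth-bound rather than flops-bound. The parameter count and hardware figures below are assumptions for illustration, not actual GPT4 numbers:

```python
# Why small-batch autoregressive decoding is memory-bandwidth bound (illustrative).
params = 1e12                  # assume a ~1T parameter model
bytes_per_param = 2            # fp16/bf16 weights
hbm_bandwidth = 2e12           # ~2 TB/s per high-end accelerator (assumed)

weight_bytes = params * bytes_per_param                 # ~2 TB of weights
sec_per_token = weight_bytes / hbm_bandwidth            # per device, batch size 1
print(f"~{sec_per_token:.1f} s/token if one device had to stream all weights")
# => the model must be sharded across many devices and batched heavily to
#    amortize weight streaming, which is why serving such a model is not cheap.
```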
I guess by “put on chips” you mean baking GPT-4 into an ASIC? That usually doesn’t make sense for fast changing tech, but as moore’s law slows we could start seeing that. I would expect other various changes to tensorcores well before that.
The entire specific Yudkowskian argument about hard takeoff via recursive self improving AGI fooming to nanotech is specifically what i’m arguing against, and indeed I only need to argue against the weakest links.
See the section on That Alien Mindspace. DL-based AGI is anthropomorphic (as I predicted), not alien.
Re Yudkowsky, I don’t think his entire argument rests on efficiency, and the pieces that don’t can’t be dispatched by arguing about efficiency.
Regarding “alien mindspace,” what I mean is that the physical form of AI, and whatever awareness the AI has of that, makes it alien. Like, if I knew I could potentially transmit my consciousness with perfect precision over the internet and create self-clones almost effortlessly, I would think very differently than I do now.
His argument entirely depends on efficiency. He claims that near future AGI somewhat smarter than us creates even smarter AGI and so on, recursively bottoming out in something that is many many OOM more intelligent than us without using unrealistic amounts of energy, and all of this happens very quickly.
So that’s entirely an argument that boils down to practical computational engineering efficiency considerations. Additionally he needs the AGI to be unaligned by default, and that argument is also faulty.
In your other recent comment to me, you said:
It seems like in one place, you’re saying EY’s model depends on near term engineering practicality, and in another, that it depends on physics-constrainted efficiency which you argue invalidates it. Being no expert on the physics-based efficiency arguments, I’m happy to concede the physics constraints. But I’m struggling to understand their relevance to non-physics-based efficiency arguments or their strong bearing on matters of engineering practicality.
My understanding is that your argument goes something like this:
You can’t build something many OOMs more intelligent than a brain on hardware with roughly the same size and energy consumption as the brain.
Therefore, building a superintelligent AI would require investing more energy and more material resources than a brain uses.
Therefore… and here’s where the argument loses steam for me. Why can’t we or the AI just invest lots of material and energy resources? How much smarter than us does an unaligned AI need to be to pose a threat, and why should we think resources are a major constraint to get it to recursively self-improve itself to get to that point? Why should we think it will need constant retraining to recursively self-improve? Why do we think it’ll want to keep an economy going?
As far as the “anthropomorphic” counterargument to the “vast space of alien minds” thing, I fully agree that it appears the easiest way to predict tokens from human text is to simulate a human mind. That doesn’t mean the AI is a human mind, or that it is intrinsically constrained to human values. Being able to articulate those values and imitate behaviors that accord with those values is a capability, not a constraint. We have evidence from things like ChaosGPT or jailbreaks that you can easily have the AI behave in ways that appear unaligned, and that even the appearance of consistent alignment has to be consistently enforced in ways that look awfully fragile.
Overall, my sense is that you’ve admirably spent a lot of time probing the physical limits of certain efficiency metrics and how they bear on AI, and I think you have some intriguing arguments about nanotech and “mindspace” and practical engineering as well.
However, I think your arguments would be more impactful if you carefully and consistently delineated these different arguments and attached them more precisely to the EY claims you’re rebutting, and did more work to show how EY’s conclusion X flows from EY’s argument A, and that A is wrong for efficiency reason B, which overturns X but not Y; you disagree with Y for reason C, overturning EY’s argument D. Right now, I think you do make many of these argumentative moves, but they’re sort of scattered across various posts and comments, and I’m open to the idea that they’re all there but I’ve also seen enough inconsistencies to worry that they’re not. To be clear, I would absolutely LOVE it if EY did the very same thing—the burden of proof should ideally not be all on you, and I maintain uncertainty about this whole issue because of the fragmented nature of the debate.
So at this point, it's hard for me to update more than "some arguments about efficiency and mindspace and practical engineering and nanotech are big points of contention between Jacob and Eliezer." I'd like to go further and, with you, reject arguments that you believe to be false, but I'm not able to do that yet because of the issue that I'm describing here. While I'm hesitant to burden you with additional work, I don't have the background or the familiarity with your previous writings to do this very effectively—at the end of the day, if anybody's going to bring together your argument all in one place and make it crystal clear, I think that person has to be you.
You just said in your comment to me that a single power plant is enough to run 100M brains. It seems like you need zero hardware progress in order to get something much smarter without unrealistic amounts of energy, so I just don’t understand the relevance of this.
I said longer term—using hypothetical brain-parity neuromorphic computing (uploads or neuromorphic AGI). We need enormous hardware progress to reach that.
Current tech on GPUs requires large supercomputers to train 1e25+ flops models like GPT4 that are approaching, but not quite at, human-level AGI. If the rumour of 1T params is true, then it takes a small cluster and ~10kW just to run some smallish number of instances of the model.
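A rough sketch of where that cluster estimate comes from; the per-GPU memory and power figures are assumptions:

```python
# Rough serving footprint for a ~1T parameter model (all figures assumed).
params = 1e12
bytes_per_param = 2                      # fp16/bf16
gpu_mem_bytes = 80e9                     # ~80 GB per high-end accelerator
gpu_watts = 400                          # approximate board power

gpus_needed = params * bytes_per_param / gpu_mem_bytes   # just to hold the weights
print(f"GPUs to hold weights: ~{gpus_needed:.0f}")                        # ~25
print(f"power for that shard: ~{gpus_needed * gpu_watts / 1e3:.0f} kW")   # ~10 kW
```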
Getting something much much smarter than us would require enormous amounts of computation and energy without large advances in software and hardware.
Sure. We will probably get enormous hardware progress over the next few decades, so that’s not really an obstacle.
It seems to me your argument is “smarter than human intelligence cannot make enormous hardware or software progress in a relatively short amount of time”, but this has nothing to do with “efficiency arguments”. The bottleneck is not energy, the bottleneck is algorithmic improvements and improvements to GPU production, neither of which is remotely bottlenecked on energy consumption.
No, as you said, it would require like, a power plant worth of energy. Maybe even like 10 power plants or so if you are really stretching it, but as you said, the really central bottleneck here is GPU production, not energy in any relevant way.
As we get more hardware and slow, mostly-aligned AGI/AI progress, this further raises the bar for foom.
That is actually an efficiency argument, and in my brain efficiency post I discuss multiple sub components of net efficiency that translate into intelligence/$.
Ahh I see—energy efficiency is tightly coupled to other circuit efficiency metrics as they are all primarily driven by shrinkage. As you increasingly bottom out hardware improvements energy then becomes an increasingly more direct constraint. This is already happening with GPUs where power consumption is roughly doubling with each generation, and could soon dominate operating costs.
See here where I line the roodman model up to future energy usage predictions.
All that being said, I do agree that yes, the primary bottleneck or crux for the EY fast takeoff/takeover seems to be the amount of slack in software and scaling laws. But only after we agree that there aren't obvious easy routes for the AGI to bootstrap nanotech assemblers with many OOM greater compute per J than brains or current computers.
How much room is there in algorithmic improvements?
Maybe it would be a good idea to change the title of this essay to:
so as not to give people hope that there would be a counterargument somewhere in this article to his more general claim:
This seems like it straightforwardly agrees that energy efficiency is not in any way a bottleneck, so I don’t understand the focus of this post on efficiency.
I also don’t know what you mean by longer term. More room at the bottom was of course also talking longer term (you can’t build new hardware in a few weeks, unless you have nanotech, but then you can also build new factories in a few weeks), so I don’t understand why you are suddenly talking as if “longer term” was some kind of shift of the topic.
Eliezer’s model is that we definitely won’t have many decades with AIs smarter but not much smarter than humans, since there appear to be many ways to scale up intelligence, both via algorithmic progress and via hardware progress. Eliezer thinks that drexlerian nanotech is one of the main ways to do this, and if you buy that premise, then the efficiency arguments don’t really matter, since clearly you can just scale things up horizontally and build a bunch of GPUs. But even if you don’t, you can still just scale things up horizontally and increase GPU production (and in any case, energy efficiency is not the bottleneck here, it’s GPU production, which this post doesn’t talk about)
I don’t understand the relevance of this. You seem to be now talking about a completely different scenario than what I understood Eliezer to be talking about. Eliezer does not think that a slightly superhuman AI would be capable of improving the hardware efficiency of its hardware completely on its own.
Both scenarios (going both big, in that you just use whole power-plant levels of energy, or going down in that you improve efficiency of chips) require changing semiconductor manufacturing, which is unlikely to be one of the first things a nascent AI does, unless it does successfully develop and deploy drexlerian nanotech. Eliezer in his model here was talking about what are reasonable limits that we would be approaching here relatively soon after an AI passes human levels.
I don’t understand the relevance of thermodynamic efficiency to a foom scenario “on current hardware”. You are not going to change the thermodynamic efficiency of the hardware you are literally running on, you have to build new hardware for that either way.
To reiterate, the model of EY that I am critiquing is one where an AGI quickly and rapidly fooms through many OOM of efficiency improvements. All the key required improvements are efficiency improvements: it needs to improve its world modelling/planning per unit compute, and/or improve compute per dollar, and/or compute per joule, etc.
In EY's model there are some, perhaps many, OOM of software improvements over the initial NN arch/algorithms, perhaps then continued with more OOM of hardware improvements. I don't believe "buying more GPUs" is a key part of his model; it is far, far too slow to provide even one OOM of upgrade. Renting/hacking your way to even one OOM more GPUs is also largely unrealistic (I run one of the larger GPU compute markets and talk to many suppliers, so I have inside knowledge here).
Right, so I have arguments against drexlerian nanotech (Moore room at the bottom, but also the thermodynamic constraints indicating you just can't get many OOM from nanotech alone), and separate arguments against many OOM from software (mind software efficiency).
It is mostly relevant to the drexlerian nanotech, as it shows there likely isn’t much improvement over GPUs for all the enormous effort. If nanotech were feasible and could easily allow computers 6 OOM more efficient than the brain using about the same energy/space/materials, then I would more agree with his argument.
I don’t think he’s at all claiming safety is trivial or that humans can expect to remain in charge. control-capture foom is very much permitted by his model and he says so directly; much bigger minds are allowed. But his model suggests that reflective algorithmic improvement is not the panacea that yudkowsky expected, nor that beating biology head to head is easy even for a very superintelligent system.
this does not change any claim I would make about safety; it should barely be an update for anyone who has already updated off of deep learning. but it should knock down yudkowsky’s view of capability scaling in algorithms thoroughly. this is relevant to prediction of which kinds of system are a threat to other systems and how.
Presumably it takes a gigantic amount of compute to train a “brain the size of a factory”? If we assume that training a human-level AI will take 10^28 FLOP (which is quite optimistic), the Chinchilla scaling laws predict that training a model 10,000 times larger would take about 10^36 FLOP, which is far more than the total amount of compute available to humans cumulatively over our history.
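A quick check of that figure: under chinchilla-style scaling the optimal token count grows linearly with parameter count, so training compute grows with the square of model size:

```python
# Chinchilla-style scaling check: C ~ 6*N*D with D proportional to N => C ~ N^2.
base_compute = 1e28      # assumed FLOP for a "human-level" training run (from the comment)
size_factor = 1e4        # a model 10,000x larger
print(f"~{base_compute * size_factor**2:.0e} FLOP")   # ~1e36 FLOP
```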
By the time the world is training factory-sized brains, I expect human labor to already have been made obsolete by previous generations of AIs that were smarter than us, but not vastly so. Presumably this is Jacob’s model of the future too?
This seems really implausible. I’d like to see a debate about this. E.g. why can’t I improve on heat by having super-cooled fluid pumped throughout my artificial brain; doesn’t having no skull-size limit help a lot; doesn’t metal help; doesn’t it help to not have to worry about immune system stuff; doesn’t it help to be able to maintain full neuroplasticity; etc.
Biology is incredibly efficient at certain things that happen at the cell level. To me, it seems like OP is extrapolating this observation rather too broadly. Human brains are quite inefficient at things they haven’t faced selective pressure to be good at, like matrix multiplication.
Claiming that human brains are near Pareto-optimal efficiency for general intelligence seems like a huge stretch to me. Even assuming that’s true, I’m much more worried about absolute levels of general intelligence rather than intelligence per Watt. Conventional nuclear bombs are dangerous even though they aren’t anywhere near the efficiency of a theoretical antimatter bomb. AI “brains” need not be constrained by the size and energy constraints of a human brain.
The human brain hardware is essentially a giant analog/digital hybrid vector matrix multiplication engine if you squint the right way, and later neuromorphic hardware for AGI will look similar.
But GPT4 isn’t good at explicit matrix multiplication either.
>But GPT4 isn’t good at explicit matrix multiplication either.
So it is also very inefficient.
Probably a software problem.
Your instinct is right. The Landauer limit says that it takes at least kB*T*ln(2) energy to erase 1 bit of information, which is necessary to run a function which outputs 1 bit (to erase the output bit). The important thing to note is that it scales with the temperature T (measured on an absolute scale). Human brains operate at 310 Kelvin. Ordinary chips can already operate down to around ~230 Kelvin, and there is even a recently developed chip which operates at ~0.02 Kelvin.
So human brains being near the thermodynamic limit in this case means very little about what sort of efficiencies are possible in practice.
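For concreteness, the bound kB*T*ln(2) at those temperatures:

```python
import math

kB = 1.380649e-23                 # Boltzmann constant, J/K
for T in (310, 230, 0.02):        # brain, chilled chip, cryogenic chip (Kelvin)
    print(f"T = {T:>6} K: kB*T*ln2 ~ {kB * T * math.log(2):.2e} J per bit erased")
# ~3.0e-21 J at 310 K, ~2.2e-21 J at 230 K, ~1.9e-25 J at 0.02 K
```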
Your point about skull sizes [being bounded by childbirth death risk] seems very strong for evolutionary reasons, and to it I would also add the fact that bird brains seem to do similar amounts of cognition (to smallish mammals) in a much more compact volume without having substantially higher body temperatures (~315 Kelvin).
Cooling the computer doesn’t let you get around the Landauer limit! The savings in energy you get by erasing bits at low temperature are offset by the energy you need to dissipate to keep your computer cold. (Erasing a bit at low temperature still generates some heat, and when you work out how much energy your refrigerator has to use to get rid of that heat, it turns out that you must dissipate the same amount as the Landauer limit says you’d have to if you just erased the bit at ambient temperatures.) To get real savings, you have to actually put your computer in an environment that is naturally colder. For example, if you could put a computer in deep space, that would work.
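A sketch of that accounting, charging the erasure the ideal (Carnot) refrigeration work needed to pump its heat from the cold chip back up to an ambient bath:

```python
import math

# Erasing a bit at cold temperature T_c dumps at least kB*T_c*ln2 of heat into
# the cold bath; an ideal refrigerator then needs W = Q*(T_h/T_c - 1) of work
# to move that heat to ambient T_h. The total comes back to kB*T_h*ln2.
kB = 1.380649e-23
T_h, T_c = 310.0, 230.0

erase_heat = kB * T_c * math.log(2)
fridge_work = erase_heat * (T_h / T_c - 1.0)
total = erase_heat + fridge_work

print(f"total cost:  {total:.3e} J")
print(f"kB*T_h*ln2:  {kB * T_h * math.log(2):.3e} J")   # identical: no free lunch
```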
On the other hand, there might also be other good reasons to keep a computer cold, for example if you want to lower the voltage needed to represent a bit, then keeping your computer cold would plausibly help with that. It just won’t reduce your Landauer-limit-imposed power bill.
None of this is to say that I agree with the rest of Jacob’s analysis of thermodynamic efficiency, I believe he’s made a couple of shaky assumptions and one actual mistake. Since this is getting a lot of attention, I might write a post on it.
Deep space is a poor medium, as the only energy dissipation there is radiation, which is slower than convection on Earth. Vacuums are typically used to insulate things (e.g. a thermos).
In a room temp bath this always costs more energy—there is no free lunch in cooling. However in the depths of outer space this may become relevant.
That is true, and I concede that that weakens my point.
It still seems to be the case that you could get a ~35% efficiency increase by operating in e.g. Antarctica. I also have an intuition (which I'll need to think more about) that there are trade-offs around the Landauer limit that could yield substantial gains by separating things that are biologically constrained to be close together… similar to how a human with an air conditioner can thrive in much hotter environments (using more energy overall, but not energy that has to be in thermal contact with the brain via, e.g., the same circulatory system).
Norway/Sweden do happen to be popular datacenter-building locations currently, but more for cheap power than for cooling, from what I understand. The problem with Antarctica would be terrible solar production for much of the year.
You can play the same game in the other direction. Given a cold source, you can run your chips hot, and use a steam engine to recapture some of the heat.
The Landauer limit still applies.
I don’t think heat dissipation is actually a limiting factor for humans as things stand right now. Looking at the heat dissipation capabilities of a human brain from three perspectives (maximum possible heat dissipation by sweat glands across the whole body, maximum actual amount of sustained power output by a human in practice, maximum heat transfer from the brain to arterial blood with current-human levels of arterial bloodflow), none of them look to me to be close to the 20w the human brain consumes.
Based on sweat production of athletic people reaching 2 L per hour, that gives an estimate of ~1 kW of sustained cooling capacity for an entire human.
5 watts per kg seems to be pretty close to the maximum power well-trained humans can actually sustain for a full hour, so that suggests a 70 kg human has at least 350 watts of sustained cooling capacity (and probably more, because the limiting factor does not seem to be overheating).
Bloodflow to the brain is about 45 L/h, and brains tolerate temperature ranges of 3-4ºC, so working backwards from that, a 160 W brain would reach temperatures about 3ºC higher than arterial blood, assuming arterial bloodflow is the primary heat remover. Probably add in 20-100 watts to account for sweat dissipation on the head. And the carotid artery is less than a cm in diameter, so bloodflow to the brain could probably be substantially increased if there were evolutionary pressure in that direction.
Brains in practice produce about 20W of heat, so it seems likely to me that energy consumption could probably increase by at least one order of magnitude without causing the brain to cook itself, if there was strong enough selection pressure to use that much energy (probably not two orders of magnitude though).
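Rough numbers behind the three cooling estimates above (a back-of-the-envelope sketch; the latent-heat and blood-heat-capacity constants are my approximate fill-ins, not figures from the comment):

```python
# Rough numbers behind the three cooling estimates above (all approximate).

# 1. Whole-body sweat cooling: ~2 L/hour, latent heat of vaporization ~2.26 MJ/L
sweat_cooling_w = 2 * 2.26e6 / 3600
print(f"sweat cooling: ~{sweat_cooling_w:.0f} W")               # ~1.3 kW

# 2. Sustained mechanical output: ~5 W/kg for a 70 kg athlete over an hour
sustained_w = 5 * 70
print(f"sustained output: ~{sustained_w:.0f} W")                # 350 W

# 3. Arterial bloodflow to the brain: ~45 L/h, blood heat capacity ~3.6 kJ/(L*K),
#    tolerated temperature rise ~3.5 K
bloodflow_cooling_w = 45 * 3.6e3 * 3.5 / 3600
print(f"bloodflow heat removal: ~{bloodflow_cooling_w:.0f} W")  # ~160 W

# All three comfortably exceed the brain's ~20 W of heat production.
```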
Getting rid of the energy constraint would help though. Proof of concept: ten humans take more energy to run than one human does, and can do more thinking than one human.
I do also find it quite likely that skull size is probably the most tightly binding constraint for humans—we have smaller and very differently tuned neurons compared to other mammals, and I expect that the drive for smaller neurons in particular is downstream of space being very much at a premium, even more so than energy.
Further evidence for the “space, rather than energy expenditure or cooling, is the main binding constraint” hypothesis is the existence of Fontanelles—human brains continue to grow after birth and the skull is not entirely solid in order to allow for that—a skull that does not fully protect the brain seems like a very expensive adaptation, so it’s probably buying something quite valuable.
I note in passing that the elephant brain is not only much larger, but also has many more neurons than any human brain. Since I’ve no reason to believe the elephant brain is maximally efficient, making the same claim for our brains should require much more evidence than I’m seeing.
That’s if you’re counting the cerebellum, which doesn’t seem to contribute much to intelligence, but is important for controlling the complicated musculature of a trunk and large body.
By cortical neuron count, humans have about 18 billion, while elephants have fewer than 6 billion, comparable to a chimpanzee. (source)
Elephants are undeniably intelligent as animals go, but not at human level.
Even blue whales barely approach human level by cortical neuron count, although some cetaceans (notably orcas) exceed it.
jacob_cannell’s post here https://www.lesswrong.com/posts/xwBuoE9p8GE7RAuhd/brain-efficiency-much-more-than-you-wanted-to-know#Space argues that:
Does that seem about right to you?
I conclude something more like “the brain consumes perhaps 1 to 2 OOM less energy than the biological limits of energy density for something of its size, but is constrained to its somewhat lower than maximal energy density due in part to energy availability considerations” but I suspect that this is more of a figure/ground type of disagreement about which things are salient to look at vs a factual disagreement.
That said @jacob_cannell is likely to be much more informed in this space than I am—if the thermodynamic cooling considerations actually bind much more tightly than I thought, I’d be interested to know that (although not necessarily immediately, I expect that he’s dealing with rather a lot of demands on his time that are downstream of kicking the hornet’s nest here).
Efficient for the temperature it runs at. Jake is correct about the fundamental comparison, but he's leaving off the part where he expects reversible computing to eventually change the efficiency tradeoffs for intelligence fundamentally; reversible computing is essentially "the best way to make use of near-perfect cooling" as a research field. I don't have a link to where he's said this before, since I'm remembering conversations we had out loud.
But how is “efficient for the temperature it runs at” relevant to whether there’s much room to improve on how much compute biology provides?
it’s relevant in that there’s a lot of room to improve, it’s just not at the same energy budget and temperature. I’m not trying to imply a big hidden iceberg in addition to that claim; what it implies is up to your analysis.
Near pareto-optimal in terms of thermodynamic efficiency as replicators and nanobots, see the discussions and links here and here.
Then how is that relevant to the argument in your OP?
I thought you were arguing:
That’s what I responded to in my top-level comment. Is that not what you’re arguing? If it is what you’re arguing, then I’m confused because it seems like here in this comment you’re talking about something irrelevant and not responding to my comment (though I could be confused about that as well!).
The specific line where I said “biology is incredibly efficient, and generally seems to be near pareto-optimal”, occurs immediately after and is mainly referring to the EY claim that “biology is not that efficient”, and his more specific claim about thermodynamic efficiency—which I already spent a whole long post refuting.
None of your suggestions improve thermodynamic efficiency, nor do they matter much in terms of OOMs. EY's argument is essentially that AGI will quickly find many OOM of software improvement, and then many more OOM of improvement via new nanotech hardware.
You mention a few times that you seem confident about Moore’s law ending very soon. I am confused where this confidence comes from (though you might have looked into this more than I have).
In general the transistor-density aspect of Moore's law always seemed pretty contingent to me. The economic pressures care about flops/$, not about transistor density, which has just historically been the best way to get flops/$. Also, for forecasting AI dynamics, flops/$ seems like it matters a lot more, since in the near future AI seems unlikely to have to care much about transistor density, given that there are easily 10-20 OOMs of energy and materials to be used on earth's surface for some kind of semiconductor or neuromorphic compute production.
And in the space of flops/$, Moore's law seems to be going strong. The last report from AI Impacts I remember reading suggests things were going strong until at least 2020:
https://aiimpacts.org/2019-recent-trends-in-gpu-price-per-flops/
And this 2022 analysis suggests things were also going quite strong very recently, and indeed the price-effectiveness of ML-relevant chips seems to have been getting cheaper even faster than other categories, and almost perfectly in-line with Moore’s law:
https://www.lesswrong.com/posts/c6KFvQcZggQKZzxr9/trends-in-gpu-price-performance
Jensen Huang/Nvidia is almost unarguably one of TSMC's most important clients and probably has some insight into/access to their roadmaps, and I don't particularly suspect he is lying when he claims Moore's Law is dead; it matches my own analysis of TSMC's public roadmap, as well as my analysis of the industry research/chatter/gossip/analysis. Moore's Law was a long recursive miniaturization optimization process which was always naturally destined to bottom out somewhat before the point where new leading-edge foundries would cost sizable fractions of world GDP and feature sizes approach their minimum (as was well predicted in advance).
This obviously isn’t the end of technological progress in computing! It’s just the end of the easy era. Neuromorphic computing is much harder for comparatively small gains. Reversible computing seems almost impossibly difficult, such that many envision just jumping straight to quantum computing, which itself is no panacea and very far.
As were chip clock frequencies under Dennard scaling, until that suddenly ended. I have uncertainty over how far we are from minimal viable switch energies, but it is not multiple OOM. There are more architectural tricks in the pipeline, in the vein of lower-precision tensor cores, but not many of those left either.
Sure, but so far the many-OOM improvement in flops/$ has been driven by shrinkage, not by using ever more of earth's resources to produce fabs and chips. Growing that way is very slow and entirely in line with the smooth, slow, non-foomy takeoff scenarios.
Want to take a bet? $1000, even odds.
I predict $/FLOP to continue going down at between a factor of 2x every 2 years and 2x every 3 years. Happy to have someone else be a referee on whether it holds up.
[Edit: Actually, to avoid having to condition on a fast takeoff itself, let’s say “going down faster than a factor of 2x every 3 years for the next 6 years”]
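For concreteness, here is what that threshold cashes out to, under the interpretation that "going down 2x every 3 years" means the price per FLOP halving at least that fast (my reading, not a stated term of the bet):

```python
# What the proposed bet threshold implies, interpreting the prediction as
# "price per FLOP halves at least every 3 years" over a 6-year window.

halving_period_years = 3
window_years = 6

required_price_drop = 2 ** (window_years / halving_period_years)
print(f"price per FLOP must fall by more than {required_price_drop:.0f}x over 6 years")
# i.e. FLOPs/$ must rise by more than ~4x over the window; the 2-year-halving
# end of the prediction would correspond to an ~8x improvement instead.
```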
I may be up for that but we need to first define ‘flops’, acceptable GPUs/products, how to calculate prices (preferably some standard rental price with power cost), and finally the bet implementation.
Curious, did this bet happen? Since Jacob said he may be up for it depending on various specifics.
Part of the issue is my post/comment was about Moore's law (transistor density for mass-produced nodes), which is a major input to, but distinct from, flops/$. As I mentioned somewhere, there is still some free optimization energy in extracting more flops/$ at the circuit level even if Moore's law ends. Moore's law is very specifically about fab efficiency as measured in transistors/cm^2 for large chip runs—not the flops/$ habryka wanted to bet on. Even when Moore's law is over, I expect some continued progress in flops/$.
All that being said, Nvidia's new flagship GPU everyone is using—the H100, which is replacing the A100 and launched just a bit after habryka proposed the bet—actually offers near-zero improvement in flops/$ (the price increased in direct proportion to the flops increase). So I probably should have taken the bet, if it were narrowly defined as flops/$ for the flagship GPUs most teams currently use for training foundation models.
Thanks Jacob. I’ve been reading the back-and-forth between you and other commenters (not just habryka) in both this post and your brain efficiency writeup, and it’s confusing to me why some folks so confidently dismiss energy efficiency considerations with handwavy arguments not backed by BOTECs.
While I have your attention – do you have a view on how far we are from ops/J physical limits? Your analysis suggests we’re only 1-2 OOMs away from the ~10^-15 J/op limit, and if I’m not misapplying Koomey’s law (2x every 2.5y back in 2015, I’ll assume slowdown to 3y doubling by now) this suggests we’re only 10-20 years away, which sounds awfully near, albeit incidentally in the ballpark of most AGI timelines (yours, Metaculus etc).
TSMC 4N is a little over 1e10 transistors/cm^2 for GPUs and roughly 5e-18 J switch energy assuming dense activity (little dark silicon). The practical transistor density limit with minimal few-electron transistors is somewhere around ~5e11 transistors/cm^2, but the minimal viable high-speed switching energy is around ~2e-18 J. So there is another 1 to 2 OOM of further density scaling, but less room for further switching-energy reduction. Scaling past this point increasingly involves dark silicon or complex, expensive cooling, and thus diminishing returns either way.
Achieving 1e-15 J/flop seems doable now for low-precision flops (fp4, perhaps fp8 with some tricks/tradeoffs); most of the cost is data movement, as pulling even a single bit from RAM just 1 cm away costs around 1e-12 J.
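Taking the figures above at face value, the remaining headroom, plus the Koomey-style timeline from the parent comment, works out roughly as follows (a rough sketch; all inputs are the numbers quoted in this exchange, not independent estimates):

```python
import math

# Figures from the exchange above, taken at face value.
density_now, density_limit = 1e10, 5e11      # transistors / cm^2
switch_now, switch_limit = 5e-18, 2e-18      # J per (dense-activity) switch event

print(f"density headroom:       {math.log10(density_limit / density_now):.1f} OOM")  # ~1.7
print(f"switch-energy headroom: {math.log10(switch_now / switch_limit):.1f} OOM")    # ~0.4

# Koomey-style timeline check from the parent comment: 1-2 OOM of ops/J
# remaining, at an assumed 3-year doubling time.
for oom in (1, 2):
    years = oom * math.log2(10) * 3
    print(f"{oom} OOM at 3-year doubling: ~{years:.0f} years")   # ~10 and ~20 years
```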
It did not
This seems to assume that those researchers were meant to work out how to create AI. But the goal of that research was rather to formalize and study some of the challenges in AI alignment in crisp language to make them as clear as possible. The intent was not to study the question of “how do we build AI” but rather “what would we want from an AI and what would prevent us from getting that, assuming that we could build one”. That approach doesn’t make any assumptions of how the AI would be built, it could be neural nets or anything else.
Eliezer makes that explicit in e.g. this SSC comment:
and it’s discussed in more length in “The Rocket Alignment Problem”.
EY’s belief distribution about NNs and early DL from over a decade ago and how that reflects on his predictive track record has already been extensively litigated in other recent threads like here. I mostly agree that EY 2008 and later is somewhat cautious/circumspect about making explicitly future-disprovable predictions, but he surely did seem to exude skepticism which complements my interpretation of his actions.
That being said I also largely agree that MIRI’s research path was chosen specifically to try and be more generic than any viable route to AGI. But one could consider that also as something of a failure or missed opportunity vs investing more in studying neural networks, the neuroscience of human alignment, etc.
But I’ve always said (perhaps not in public, but nonetheless) that I thought MIRI had a very small chance of success, but it was still a reasonable bet for at least one team to make, just in case the connectivists were all wrong about this DL thing.
This seems false. LLMs are trained to predict, which often results in them mimicking certain kinds of human errors. Mimicking errors doesn’t mean that the underlying cognition which produced those errors is similar.
In what sense is predicting internet text “training [LLMs] on human thoughts”? Human thoughts are causally upstream of some internet text, so learning to predict human thoughts is one way of being good at predicting, but it’s certainly not the only one. More on this general point here.
One thing that I have observed, working with LLMs, is that when they're predicting the next token in a Python REPL they also make kinda similar mistakes to the ones that a human who wasn't paying that much attention would make. For example, consider the following:
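(The original transcript isn't reproduced above; purely as a hypothetical stand-in for the kind of interaction being discussed, with an expression of my own choosing rather than the original one, a float-rounding case looks like this:)

```python
# Hypothetical stand-in (not the original transcript) for the kind of
# "predict the next REPL output" case discussed in this subthread.
expr = "0.2 * 3"
true_output = repr(eval(expr))        # what CPython actually prints
careless_human_guess = "0.6"          # a typical not-paying-attention answer

print(f">>> {expr}")
print(true_output)                    # 0.6000000000000001
print(f"(careless-human guess: {careless_human_guess})")
```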
I expect that most examples in the training data of “this looks like an interaction with the Python REPL” were in fact interactions with the Python REPL. To the extent that GPT-N models do make human-like mistakes when predicting non-human-like data (instead of predicting that non-human-like data correctly, which is what their loss function wants them to do), I think that does serve as nonzero evidence that their cognition is “human-like” in the specific narrow sense of “mirroring our quirks and limitations”.
More generally, I think it's particularly informative to look at the cases where the thing GPT-n does is different from the thing its loss function wants it to do. The extent to which those particular cases look like human failure modes is informative about how "human-like" the cognition of GPT-n-class models is (as a note, the SolidGoldMagikarp class of failure modes is extremely non-human-like, and so that observation caused me to update more towards the "shoggoth which can exhibit human behavior among many other behaviors" view. But I haven't actually seen a whole lot of failure modes like that in normal use, and I have seen a bunch of human-not-paying-attention type failure modes in normal use).
An interesting example! A couple remarks:
a more human mistake might be guessing 0.6 and not 1.0?
After the mistake, it’s not clear what the “correct” answer is, from a text prediction perspective. If I were trying to predict the output of my python interpreter, and it output 1.0, I’d predict that future outputs on the same input would also be “wrong”—that either I was using some kind of bugged interpreter, or that I was looking at some kind of human-guessed transcript of a python session.
Yeah, that one’s “the best example of the behavior that I was able to demonstrate from scratch with the openai playground in 2 minutes” not “the best example of the behavior I’ve ever seen”. Mostly the instances I’ve seen were chess-specific results on a model that I specifically fine-tuned on Python REPL transcripts that looked like
and it would print "N" instead of "None"
(except that in the actual examples it mostly was a much longer transcript, and it was more like it would forget where the pieces were if the transcript contained an unusual move or just too many moves). For context, I was trying to see if a small language model could be fine-tuned to play chess, and was working under the hypothesis of "a Python REPL will make the model behave as if statefulness holds".
And then, of course, the Othello paper came out, and bing chat came out and just flat out could play chess without having been explicitly trained on it, and the question of “can a language model play chess” became rather less compelling because the answer was just “yes”.
But that project is where a lot of my “the mistakes tend to look like things a careless human does, not weird alien mistakes” intuitions ultimately come from.
An alternative explanation of mistakes is that making mistakes and then correcting them was rewarded during additional post-training refinement stages. I work with GPT-4 daily, and sometimes it feels like it makes mistakes on purpose just to be able to say that it is sorry for the confusion and then correct them. It also feels like it makes fewer mistakes when you ask politely (using please, thank you, etc.), which is rather strange.
Nevertheless, distillation seems like a very possible thing that is also going on here.
It does not distill the whole of a human mind, though. There are areas that are intuitive for the average human, even a small child, that are not for GPT-4. For example, it has problems with concepts of 3D geometry and visualizing things in 3D. It may have similar gaps in other areas, including more important ones (like moral intuitions).
Internet text contains the inputs and outputs of human minds in the sense that every story, post, article, essay, book, etc written by humans first went through our brains, word by word, token by token, tracing through our minds.
Training on internet text is literally training on human thoughts because text written by humans is literally an encoding of human thoughts. The fact that it is an incomplete and partial encoding is mostly irrelevant, as given enough data you can infer through any such gaps. Only a small fraction of the pixels in an image are sufficient to reconstruct it.
Even if every one of your object-level objections is likely to be right, this wouldn't shift me much in terms of policies I think we should pursue, because the downside risks from TAI are astronomically large even at small probabilities (unless you discount all future and non-human life to 0). I see Eliezer as making arguments about the worst ways things could go wrong and why it's not guaranteed that they won't go that way. We could get lucky, but we shouldn't count on luck. So even if Eliezer is wrong, he's wrong in ways that, if we adopt policies accounting for his arguments, better protect us from existential catastrophe, at the cost of reaching TAI a few decades later, which is a small price to pay to offset very large risks that exist even at small probabilities.
I am reasonably sympathetic to this argument, and I agree that the difference between EY’s p(doom) > 50% and my p(doom) of perhaps 5% to 10% doesn’t obviously cash out into major policy differences.
I of course fully agree with EY/Bostrom/others that AI is the dominant risk, we should be appropriately cautious, etc. This is more about why I find EY's specific classic doom argument to be uncompelling.
My own doom scenario is somewhat different and more subtle, but mostly beyond scope of this (fairly quick) summary essay.
You mention here that “of course” you agree that AI is the dominant risk, and that you rate p(doom) somewhere in the 5-10% range.
But that wasn’t at all clear to me from reading the opening to the article.
As written, that opener suggests to me that you think the overall model of doom being likely is substantially incorrect (not just the details I’ve elided of it being the default).
I feel it would be very helpful to the reader to ground the article from the outset with the note you’ve made here somewhere near the start. I.e., that your argument is with the specific doom case from EY, that you retain a significant p(doom), but that it’s based on different reasoning.
I agree, and believe it would have been useful if Jacob (post author) had made this clear in the opening paragraph of the post. I see no point in reading the post if it does not measurably impact my foom/doom timeline probability distribution.
I am interested in his doom scenario, however.
Eliezer believes and argues that things go wrong by default, with no way he sees to avoid that. Not just “no guarantee they won’t go wrong”.
It may be that his arguments are sufficient to convince you of “no guarantee they won’t go wrong” but not to convince you of “they go wrong by default, no apparent way to avoid that”. But that’s not what he’s arguing.
This is interesting but would benefit from more citations for claims and fewer personal attacks on Eliezer.
I had the same impression at first, but in the areas where I most wanted these, I realized that Jacob linked to additional posts where he has defended specific claims at length.
Here is one example:
I usually find Tyler Cowen-esque (and, heck, Yudkowskian) phrases like this irritating, and they're typically pretty hard to interrogate, but Jacob helpfully links to an entire factpost he wrote on this specific point, elaborating on the claim in detail.
He does something similar here:
He is doing an admirable job of summarizing the core arguments of Eliezer’s overall model of AI risk, then providing deeply-thought-out counterarguments. The underlying posts appear to be pretty well-cited.
That doesn’t mean that every claim is backed by an additional entire post of its own, or that every argument is convincingly correct.
This is more of the Cowenesque/Yudkowskianesque handwavey appeal to authority, intuition, and the implication that any informed listener ought to have mastered the same background knowledge and come naturally to the same conclusions. Here, my response would be “viruses are not optimizing to kill humans—they are optimizing for replication, which often means keeping hosts alive.”
He follows this with another easy-to-argue-with claim:
This sounds at best like an S-risk (how would the AI cause humans to behave like highly efficient general purpose robots? Probably not in ways we would enjoy from our perspective now). And we don’t need to posit that an unaligned AI would be bent on keeping any semblance of an “economy” running. ChaosGPT 5.0 might specifically have the goal of destroying the entire world as efficiently as possible.
But my goal in articulating these counterarguments isn’t because I’m hoping to embroil myself in a debate downthread. It’s to point out that while Jacob’s still very far from dealing with every facet of Eliezer’s argument, he is at the same time doing what I regard as a pretty admirable job of interrogating certain specific claims in depth.
And Jacob doesn’t read to me as making what I would call “personal attacks on Eliezer.” He does do things like:
Accurately paraphrase arrogant-sounding things Eliezer says
Accurately describe the confidence with which Eliezer makes his arguments (“brazenly”).
Be very clear and honest about just how weak he finds Eliezer’s argument, after putting in a very substantial amount of work in order to come to this evaluation
Offer general reasons to think the epistemics are stronger on his side of the debate:
Note: this is one area I do think Jacob’s argument is weak—I would need to see one or more specific predictions from Jacob or the others he cites from some time ago in order to evaluate this claim. But I trust it is true in his own mind and could be made legible.
Overall, I think it is straightforwardly false to say that this post contains any "personal attacks on Eliezer."
Edit: However, I also would say that Jacob is every bit as brazen and (over?)confident as Eliezer, and has earned the same level of vociferous pushback where his arguments are weak as he has demonstrated in his arguments here.
And I want to emphasize that, in my opinion, there's nothing wrong with a full-throated debate on an important issue!
Humanity is generating and consuming enormous amounts of power—why is the power budget even relevant? And even if it was, energy for running brains ultimately comes from the Sun—if you include the agriculture energy chain, and "grade" the energy efficiency of brains by the amount of solar energy it ultimately takes to power a brain, AI definitely has the potential to be more efficient. And even if a single human brain is fairly efficient, the human civilization is clearly not. With AI you can quickly scale up the amount of compute you use, whereas human intelligence scales very inefficiently beyond a single brain.
If the optimal AGI design running on GPUs takes about 10 GPUs and 10 kW to rival one human-brain power, and a superintelligence which kills humanity a la the foom model requires 10 billion human brain power and thus 100 billion GPUs and a 100 terawatt power plant—that is just not something that is possible in the near term.
In EY’s model there is supposedly 6 OOM improvement from nanotech, so you could get the 10 billion human brainpower with a much more feasible 100 MW power plant and 100 thousand GPUs ish (equivalent).
you’re assuming sublinear scaling. why wouldn’t it be superlinear post training? it certainly seems like it is now. it need not be sharply superlinear like yud expected to still be superlinear.
Exactly! I’d expect compute to scale way better than humans—not necessarily because the intelligence of compute scales so well, but because the intelligence of human groups scales so poorly...
So I assumed a specific relationship between "one unit of human-brain power" and "superintelligence capable of killing humanity". I use human-brain power as a unit, but that doesn't actually have to imply linear scaling—imagine a graph with two labeled data points, one at (human, X:1) and another at (SI, X:10B); you can draw many different curves connecting those two points, and the Y axis is sort of arbitrary.
Now maybe 10B HBP to kill humanity seems too high, but I assume humanity as a civilization which includes a ton of other compute, AI, and AGI, and I don’t really put much credence in strong nanotech.
To be clear, I don’t know anyone who would currently defend the claim that you need a single system with computation needs of all 10 billion human brains. That seems like at least 5 OOMs too much. Even simulating 10M humans is likely enough, but you can probably do many OOMs better by skipping the incredible inefficiency of humans coordinating with each other in a global economy.
If you believe modern economies are incredibly inefficient coordination mechanisms, that's a deeper disagreement beyond this post.
But in general my estimate for the intellectual work required to create an entirely new path to a much better compute substrate is something at least vaguely on the order of the amount of intellectual work accumulated in our current foundry tech.
That is not my estimate for the minimal amount of intelligence required to takeover the world in some sense—that would probably require less. But again this is focused on critiquing scenarios where a superintelligence (something greater than humanity in net intelligence) bootstraps from AGI rapidly.
Yep, seems plausibly like a relevant crux. Modern economies sure seem incredibly inefficient, especially when viewed through the lens of “how much is this system doing long-term planning and trying to improve its own intelligence”.
In many important tasks in the modern economy, it isn't possible to replace one expert with any number of average humans. A large fraction of average humans aren't experts.
A large fraction of human brains are stacking shelves or driving cars or playing computer games or relaxing etc. Given a list of important tasks in the computer supply chain, most humans, most of the time, are simply not making any attempt at all to solve them.
And of course a few percent of the modern economy is actively trying to blow each other up.
To put some numbers on that, USA brains directly consume 20W × 330M = 6.6 GW, whereas the USA food system consumes ≈500 GW [not counting sunlight falling on crops] (≈15% of the 3300 GW total USA energy consumption).
This I agree with and always assumed, but it is also largely irrelevant if the end conclusion is that AGI still destroys us all. To most people, I'd say, the specific method of death doesn't matter as much as the substance. It's a special kind of academic argument, one where we can endlessly debate precisely how the end will come about from making this thing, while we all mostly agree that this thing we are making, and that we could stop making, will likely end us all. Sane people (and civilizations) just… don't make the deadly thing.
I haven’t gone through the numbers so I’ll give it a try, but out of the box, feels to me like your arguments about biology’s computational efficiency aren’t the end of it. I actually mentioned the topic as one possible point of interest here: https://www.lesswrong.com/posts/76n4pMcoDBTdXHTLY/ideas-for-studies-on-agi-risk. My impression is that biology can come up with some spectacularly efficient trade-offs, but that’s only within the rules of biology. For example, biology can produce very fast animals with very good legs, but not rocket cars on wheels, because that requires way too many steps that are completely non-functional. All components also need to be able to be generated from an embryo, self-maintaining, compatible with the standard sources of energy, and generally fit a bunch of other constraints that don’t necessarily exist on artificially made versions of them. Cameras are better than eyes, microphones are better than ears. Why wouldn’t computing hardware, eventually, be better than neurons? Not necessarily orders of magnitude better, but still, a lot cheaper than entire server rooms. Even if you could just copy human brains one-to-one in size, energy usage and efficiency, that already would be plenty superhuman and disruptive.
I agree most with the "room at the bottom" aspect. I don't think there's really that much left of it. But first, I could be wrong (after all, it's not like I could come up with the whole DNA-and-enzymes machinery that evolution pulled off if I just knew basic organic chemistry, so who's to say there aren't even better machineries that could be invented if something smarter than me tries to optimize for them?), and second, I don't think that's necessary for doom either. So what's the point of arguing?
Just don’t build the damn thing that kills us all. Not if it does so swiftly by nanomachines and not if it does so slowly by replacing and outpricing us. Life isn’t supposed to be about a mindless pursuit of increased productivity, at least that’s not what most of us find fun and pleasurable about it, and replacing humanity with a relentless maximizer, a machine-corporation that has surpassed the need for fleshy bits in its pursuit of some pointless goal, is about the saddest tomb we can possibly turn the Earth into.
There are some other assumptions that go into Eliezer’s model that are required for doom. I can think of one very clearly which is:
5. The transition to that god-AGI will be so quick that other entities won't have time to also reach superhuman capabilities. There are no "intermediate" AGIs that can be used to work on alignment-related problems or even as a defence against unaligned AGIs.
This is the first contra AI doom case I’ve read which felt like it was addressing some of the core questions, rather than nitpicking on some irrelevant point, or just completely failing to understand the AI doom argument.
So, whilst I still think some of your points need fleshing out/further arguments, thank you very much for this post!
The little pockets of cognitive science that I've geeked out about—usually in the predictive processing camp—have featured researchers who are often quite surprised by, or go to great lengths to underline, the importance of language and culture in our embodied/extended/enacted cognition.
A simple version of the story I have in my head is this: We have physical brains thanks to evolution, and then, by being an embodied predictive perception/action loop out in the world, we started transforming our world into affordances for new perceptions and actions. Things took off when language became a thing—we could transmit categories and affordances and all kinds of other highly abstract language things in ways that are really surprisingly efficient for brains and have really high leverage for agents out in the world.
So I tend towards viewing our intelligence as resting on both our biological hardware and on the cultural memplexes we’ve created and curated and make use of pretty naturally, rather than just on our physical hardware. My gut sense—which I’m up for updates on—is that for the more abstract cognitive stuff we do, a decently high percentage of the fuel is coming from the language+culture artifact we’ve collectively made and nurtured.
One of my thoughts here is (and leaning heavily on metaphor to point at an idea, rather than making a solid concrete claim): maybe that makes arguments about the efficiency of the human brain less relevant here?
If you can run the abstract cultural code on different hardware, then looking at the tradeoffs made could be really interesting—but I’m not sure what it tells you about scaling floors or ceilings. I’d be particularly interested in whether running that cultural code on a different substrate opens the doors to glitches that are hard to find or patch, or to other surprises.
The shoggoth meme that has been going around also feels like it applies. If an AI can run our cultural code, that is a good chunk of the way to effectively putting on a human face for a time. Maybe it actually has a human face, maybe it's just wearing a mask. So far I haven't seen arguments that tilt me away from thinking of it like a mask.
For me, it doesn’t seem to imply that LLMs are or will remain a kind of “child of human minds”. As far as I know, almost all we know is how well they can wear the mask. I don’t see how it follows that it would necessarily grow and evolve in the way that it thinks/behaves/does what it does in human-like ways if it was scaled up or if it was given enough agency to reach for more resources.
I guess this is my current interpretation of “alien mind space”. Maybe lots of really surprising things can run our cultural code—in the same way that people have ported the game Doom to all kinds of surprising substrates, that have weird overlaps and non-overlaps with the original hardware the game was run on.
Actually I think the shoggoth mask framing is somewhat correct, but it also applies to humans. We don’t have a single fixed personality, we are also mask-wearers.
The argument that shifts me the most away from thinking of it with the shoggoth-mask analogy is the implication that a mask has a single coherent actor behind it. But if you can avoid that mental failure mode I think the shoggoth-mask analogy is basically correct.
Hm, neuron impulses travel at around 200 m/s, electric signals travel at around 2e8 m/s, so I think electronics have an advantage there. (I agree that you may have a point with “That Alien Mindspace”.)
The brain’s slow speed seems mostly for energy efficiency but it is also closely tuned to brain size such that signal delay is not a significant problem.
I agree that the human brain is roughly at a local optimum. But think about what could be done just with adding a fiber optic connection between two brains (I think there are some ethical issues here so this is a thought experiment, not something I recommend). The two brains could be a kilometer apart, and the signal between them on the fiber optic link takes less time than a signal takes to get from one side to the other of a regular brain. So these two brains could think together (probably with some (a lot?) neural rewiring) as fast as a regular brain thinks individually. Repeat with some more brains.
Or imagine if myelination was under conscious control. If you need to learn a new language, demyelinate the right parts of the brain, learn the language quickly, and then remyelinate it.
So I think even without changing things much neurons could be used in ways that provide faster thinking and faster learning.
As for energy efficiency, there is no reason that a superintelligence has to be limited to the approximately 20 watts that a human brain has access to. Gaming computers can have 1000 W power supplies, which is 50 times more power. I think 50 brains thinking together really quickly (as in the interbrain connections are as fast as the intrabrain connections) could probably out-think a lot more than 50 humans.
And, today, there are supercomputers that use 20 or more megawatts of power, so if we have computing that is as energy efficient as the human brain, that is equivalent to 1 million human brains (20e6/20), and I think we might be able to agree that a million brains thinking together really well could probably out-think all of humanity.
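Rough numbers behind these comparisons (a back-of-the-envelope sketch; the conduction velocity and power figures are the approximate ones used in this comment, not precise measurements):

```python
# Rough numbers behind the comparisons above (all order-of-magnitude).

# Signal delay: across a ~10 cm brain at ~100-200 m/s conduction velocity,
# vs 1 km of fiber at ~2e8 m/s.
brain_delay_ms = 0.1 / 150 * 1e3
fiber_delay_ms = 1000 / 2e8 * 1e3
print(f"across a brain: ~{brain_delay_ms:.2f} ms; 1 km of fiber: ~{fiber_delay_ms:.3f} ms")

# Power scaling: a 1 kW gaming-PC budget vs a 20 W brain, and a 20 MW
# supercomputer at brain-level energy efficiency.
print(f"1 kW budget:  ~{1000 / 20:.0f} brain-equivalents")    # ~50
print(f"20 MW budget: ~{20e6 / 20:.0e} brain-equivalents")    # ~1e6
```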
“These GPUs cost $1M and use 10x the energy of a human for the same work” is still a pretty bad deal for any workers that have to compete with that. And I don’t expect economic gains to go to displaced workers.
Even if an AI is more expensive per unit of computational capacity than humans, its being much faster and immortal would still be a threat. I could imagine a single immortal human genius becoming world-emperor eventually. Now imagine them operating 10^3 or even 10^6 times faster than ordinary humans.
Agreed in advance.
This post raised some interesting points, and stimulated a bunch of interesting discussion in the comments. I updated a little bit away from foom-like scenarios and towards slow-takeoff scenarios. Thanks. For that, I’d like to upvote this post.
On the other hand: I think direct/non-polite/uncompromising argumentation against other arguments, models, or beliefs is (usually) fine and good. And I think it’s especially important to counter-argue possible inaccuracies in key models that lots of people have about AI/ML/alignment. However, in many places, the post reads like a personal attack on a person (Yudkowsky), rather than just on models/beliefs he has promulgated.
I think that style of discourse runs a risk of
politicizing the topic under discussion, and thereby making it harder for people to think clearly about it
creating a shitty culture where people are liable to get personally attacked for participating in that discourse
For that, I’d like to downvote this post. (I ended up neither up- nor down-voting.)
A very naive question for Jacob. A few years ago the fact that bird brains are about 10x more computationally dense than human brains was mentioned on SlateStarCodex and by Diana Fleischman. This is something I would not expect to be true if there were not some significant “room at the bottom.”
Is this false? Does this not imply what I think it should? Am I just wrong in thinking this is of any relevance?
https://slatestarcodex.com/2019/03/25/neurons-and-intelligence-a-birdbrained-perspective/
I don’t understand the physics, so this is just me noticing I am confused. And not an attempt to debunk or anything.
Bird brains have higher neuron density, especially in the forebrain, but I'm unsure if this also translates into higher synaptic density or just fewer synapses per neuron. Regardless, it does look like bird brains are optimized more heavily for compactness, though it's not clear what tradeoffs are being made there. But heat transport/cooling scales with surface area whereas compute, and thus heat production, scales with volume, so brains tend to become less dense as they grow larger absent more heroic cooling efforts.
If we look at the game of Go, AI managed to be vastly better than humans. An AI that can outcompete humans at any task the way that AlphaGo can outcompete humans at Go is a serious problem even if it's not capable of directly figuring out how to build nanobots.
While that’s true that currently most of the training data we put into LLMs seems human-created, I don’t think there’s a good reason to assume that will stay true.
AlphaGo Zero, for example, was not trained on any human data at all. It was trained on a bunch of computer-generated (self-play) data and used that principle to achieve superhuman performance.
If you wanted to make an LLM better at arithmetic, you would autogenerate a lot of data of correct arithmetic reasoning. If having an LLM that's good at arithmetic is valuable to you, you can autogenerate a corpus of arithmetic problems as large as the current human-generated text corpus from the internet.
There are going to be many tasks for which you can produce training data that will allow an LLM to successfully do the task in a way that humans currently can't. If you train your model to solve a lot of tasks that humans currently can't solve, the resulting LLM is likely farther from human minds than GPT-4 happens to be.
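As a minimal sketch of what the autogenerated-arithmetic idea could look like in practice (the text format and helper name here are hypothetical choices of mine, not anything a particular lab is known to use):

```python
import random

def make_arithmetic_example(max_digits: int = 10) -> str:
    """Generate one synthetic training example of exact integer arithmetic.
    The Q/A text format is a hypothetical choice for illustration."""
    a = random.randint(1, 10 ** max_digits)
    b = random.randint(1, 10 ** max_digits)
    op = random.choice(["+", "-", "*"])
    result = eval(f"{a} {op} {b}")  # exact integer arithmetic
    return f"Q: What is {a} {op} {b}?\nA: {result}\n"

# Autogenerate as much correct arithmetic text as you like:
corpus = "".join(make_arithmetic_example() for _ in range(3))
print(corpus)
```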
Is this comment the best example of your model predicting anthropomorphic cognition? I recognize my comment here could sound snarky (“is that the best you can do?”); it’s not intended that way—I’m sincerely asking if that is the best example, or if there are better ones in addition.
It’s more fleshed out in the 2015 ULM post.
I don’t understand how OpenAIs success at scaling GPT proves the universal learning model. Couldn’t there be an as yet undiscovered algorithm for intelligence that is more efficient?
If I have a model which predicts “this simple architecture scales up to human intelligence with enough compute”, and that is tested and indeed shown to be correct, then the model is validated.
And it helps further rule out an entire space of theories about intelligence: namely, all the theories that intelligence is very complicated and requires many complex interacting innate algorithms (evolved modularity, which EY seemed to subscribe to).
Sure, there could be other algorithms for intelligence that are more efficient, and I already said I don't think we are quite on the final scaling curve with transformers. But over time the probability mass remaining for these undiscovered algorithms continually diminishes as we explore ever more of the algorithmic search space.
Furthermore, evolution extensively explored the search space of architectures/algorithms for intelligent agents, and essentially found common variants of universal learning on NNs in multiple unrelated lineages, substantially adding to the evidence that yes, this really is as good as it gets (at least for any near-term conventional computers).
I see, thanks for clarifying.
Humans suck at arithmetic. Really suck. From a comparison of current GPUs to a human trying and failing to multiply 10-digit numbers in their head, we can conclude that something about humans, hardware or software, is incredibly inefficient.
Almost all humans have roughly the same sized brain.
So even if Einsteins brain was operating at 100% efficiency, the brain of the average human is operating at a lot less.
Making a technology work at all is generally easier than making it efficient.
Current scaling laws seem entirely consistent with us having found an inefficient algorithm that works at all.
ChatGPT, for example, uses billions of floating point operations to do basic arithmetic only mostly correctly. So it's clear that the likes of ChatGPT are also inefficient.
Now you can claim that ChatGPT and humans are mostly efficient and just happen to suddenly drop 10 orders of magnitude when confronted with a multiplication. But that amounts to claiming they are pushing right up against the fundamental limits for everything except the most basic computational operations.
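A rough back-of-the-envelope for that gap, assuming a GPT-3-scale dense model (~175B parameters) and the usual ~2 FLOPs per parameter per generated token (both of these are my assumptions, not figures from the comment above):

```python
import math

# Rough gap between what an LLM spends per generated token and what a
# multiplication "should" cost.
params = 175e9                    # assumed GPT-3-scale dense model
flops_per_token = 2 * params      # ~2 FLOPs per parameter per token

# A 10-digit by 10-digit multiply is on the order of a hundred primitive
# digit operations (schoolbook method).
digit_ops = 10 * 10

gap = flops_per_token / digit_ops
print(f"FLOPs per token: ~{flops_per_token:.1e}")
print(f"gap vs schoolbook multiply: ~{math.log10(gap):.0f} orders of magnitude")  # ~10
# And the model typically spends many tokens on the problem, widening the gap.
```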
True.
They also have a big pile of their own new idiosyncratic quirks.
https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation
These are bizarre behaviour patterns that don't resemble those of any human.
This looks less like a human, and more like a very realistic painted statue. It looks like a human, complete with painted on warts, but scratch the paint, and the inhuman nature shows through.
The width of mindspace is somewhat relevant.
At best, we have found a recipe, such that if we stick precisely to it, we can produce human-like minds. Start making arbitrary edits to the code, and we wander away from humanity.
At best we have found a small safe island in a vast and stormy ocean.
The likes of chatGPT are trained with RLHF. Humans don’t usually say “as a large language model, I am unable to …” so we are already wandering somewhat from the human.
Well, muscles are less efficient than steam engines, which is why hamster-wheel electricity is a dumb idea: burning the hamster food in a steam engine is more efficient.
This is a clear error.
There is no particular reason to expect TSMC to taper off at a point anywhere near the theoretical limits.
A closely analogous situation is that the speed of passenger planes has tapered off. And the theoretical limit (ignoring exotic warp drives) is light speed.
But in practice, planes are limited by the energy density of jet fuel, economics, regulations against flying nuclear reactors, atmospheric drag etc.
This isn’t to say that no spaceship could ever go at 90% light speed. Just that we would need a radically different approach to do that, and we don’t yet have that tech.
So yes, TSMC could be running out of steam. Or not. The death of Moore's law has been proclaimed on a regular basis for as long as it has existed.
“Taiwanese engineers don’t yet have the tech to do X” doesn’t imply that X is physically impossible.
I don’t have much to contribute on AI risk, but I do want to say +1 for the gutsy title. It’s not often you see the equivalent of “Contra The Founding Mission of an Entire Community”.