It’s absolutely valid to make a local argument against specific parts of Eliezer’s model. However, you have a lot of other arguments “attached” that don’t straightforwardly flow from the parts of Eliezer’s model you’re mainly attacking. That’s a debate-style choice that’s up to you, but as a reader hoping to learn from you, I find it distracting, because I have to put a lot of extra work into distinguishing “this is a key argument against point 3 of EY’s efficiency model” from “this is a side argument consisting of one assertion about bioweapons, based on unstated biology background knowledge.”
Would it be better if we switched from interpreting your post as “a tightly focused argument on demolishing EY’s core efficiency-based arguments” to “laying out Jacob’s overall view on AI risk, with a lot of emphasis on efficiency arguments”? If that’s the best way to look at it, then I retract the objection I’m making here, except to say it wasn’t as clear as it could have been.
The bioweapons point is something of a tangent, but I felt compelled to mention it because every time I’ve pointed out that strong nanotech can’t have any core thermodynamic efficiency advantage over biology, someone brings up superviruses or something, even though that isn’t part of EY’s model (he talks about diamond nanobots). But sure, that paragraph is something of a tangent.
EY’s model requires slightly-smarter-than-us AGI running on normal hardware to start a FOOM cycle of recursive self-improvement, resulting in many OOMs of intelligence improvement in a short amount of time. That requires some combination of 1) many OOMs of software improvement on current hardware, 2) many OOMs of hardware improvement with current foundry tech, or 3) completely new foundry tech with many OOMs of improvement over current tech, i.e. nanotech woo. The viability of any of this depends entirely on near-term engineering practicality.
I think I see what you’re saying here. Correct me if I’m wrong.
You’re saying that there’s an argument floating around that goes something like this:
At some point in the AI training process, there might be an “awakening” of the AI to an understanding of its situation, its goal, and the state of the world. The AI, while being trained, will realize that to pursue the goal it’s being trained on most effectively, it needs to be a lot smarter and more powerful. Being already superintelligent, it will, during the training process, figure out ways to use existing hardware and energy infrastructure to make itself even more intelligent, without alerting humans. Of course, it can’t build new hardware or noticeably disrupt existing hardware beyond that which has been allocated to it, since that would trigger an investigation and shutdown by humans.
And it’s this argument specifically that you are dispatching with your efficiency arguments. Because, for inescapable physics reasons, AI will hit an efficiency wall: it can’t become more intelligent than humans on hardware of equivalent size, energy, and so on. Loosely speaking, it’s impossible to build a device significantly smaller than a brain, and using less power than a brain, that runs an AI more than 1-2 OOMs smarter than a brain, and we can certainly rule out a superintelligence 6 OOMs smarter than humans running on a device smaller and less energy-intensive than a brain.
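To make the rough arithmetic behind this “efficiency wall” concrete, here’s a minimal back-of-envelope sketch in Python. Every number in it (the brain’s power budget, the synaptic event rate, room-temperature operation) is an illustrative assumption on my part, not a figure from your post; the point is only that the argument turns on a bounded number of OOMs of headroom above physical limits.

```python
# Back-of-envelope: how many OOMs of energy-efficiency headroom exist
# between the brain and an idealized thermodynamic limit?
# All numbers below are illustrative assumptions, not claims from the post.
import math

k_B = 1.380649e-23        # Boltzmann constant, J/K
T = 300.0                 # assumed operating temperature, K
landauer_J = k_B * T * math.log(2)   # ~2.9e-21 J per irreversible bit erasure

brain_watts = 20.0        # assumed whole-brain power budget, W
synaptic_ops_per_s = 1e15 # assumed synaptic events/s (estimates vary ~1e14-1e15)

joules_per_op = brain_watts / synaptic_ops_per_s
headroom_oom = math.log10(joules_per_op / landauer_J)

print(f"Landauer bound:      {landauer_J:.2e} J per bit erasure")
print(f"Brain energy per op: {joules_per_op:.2e} J")
print(f"Nominal headroom:    ~{headroom_oom:.1f} OOMs")
# Prints roughly ~6.8 OOMs of *nominal* headroom; reliable switching and
# interconnect cost far more than one bit erasure per op, which is why the
# practically reachable gap is argued to be much smaller (the 1-2 OOM figure).
```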
You have other arguments, about practical engineering constraints, the potential utility to an AI of keeping humans around, the difficulty of building grey goo, the “alien minds” argument, and so on, but those rest on separate lines of reasoning. You’re also not arguing, on efficiency grounds, about whether an AI just 2-100x as intelligent as humans might be dangerous.
You do have arguments in some or all of these areas, but the efficiency arguments are meant to deal with just this one specific scenario: a 6 OOM (not a 2 OOM) improvement in intelligence during a training run, without accessing more hardware than was made available for the training run. Is that correct?
I’m confused because you describe an “argument specifically that you are dispatching with your efficiency arguments”, and the first paragraph sounds like an EY argument, but the 2nd more like my argument. (And ‘dispatching’ is ambiguous)
Also, “being already superintelligent” presumes the conclusion at the outset.
So let’s restart:
1. Someone creates an AGI a bit smarter than humans.
2. It creates even smarter AGI by rewriting its own source code.
3. After the Nth iteration, once the software OOM improvements are tapped out, it creates nanotech assemblers to continue growing OOMs in power (or alternatively it somehow gets OOM improvements with existing foundry tech, but that seems less likely as part of EY’s model).
4. At some point it has more intelligence/compute than all of humanity, and kills us with nanotech or something.
EY and I agree on point 1 but diverge past that. Point 2 is partly a matter of software efficiency, but not entirely. Recall that I correctly predicted in advance that AGI requires brain-like massive training compute, which largely defeats EY’s version of point 2, where it’s just a modest “rewrite of its own source code”. The efficiency considerations matter for both 2 and 3, as they determine how effectively it can quickly turn resources (energy/materials/money/etc.) into bigger, better training runs to upgrade its intelligence.
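As a minimal illustration of how efficiency gates the conversion of resources into much bigger training runs, here’s a toy sketch. All the numbers in it (frontier-run compute, accelerator throughput and power, world electricity generation) are rough assumed round figures for illustration only, not precise claims from this thread.

```python
# Back-of-envelope: what does a "many OOMs bigger training run" cost in energy
# at fixed hardware efficiency? All numbers are illustrative assumptions.
baseline_flop = 1e25          # assumed compute of a current frontier-scale run
flop_per_s_per_gpu = 1e15     # assumed effective throughput per accelerator
watts_per_gpu = 1000.0        # assumed power per accelerator incl. overhead
world_twh_per_year = 27_000   # rough world annual electricity generation, TWh

joules_per_flop = watts_per_gpu / flop_per_s_per_gpu

for extra_oom in (0, 2, 4, 6):
    flop = baseline_flop * 10**extra_oom
    energy_j = flop * joules_per_flop
    twh = energy_j / 3.6e15   # 1 TWh = 3.6e15 J
    share = twh / world_twh_per_year
    print(f"+{extra_oom} OOMs: {twh:.3g} TWh "
          f"(~{share:.3g} of world annual electricity)")
# At +6 OOMs this comes out to a few thousand TWh, i.e. a macroscopically
# visible fraction of world electricity, which is the sense in which efficiency
# limits gate how quickly resources convert into far larger training runs.
```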
I’m confused because you describe an “argument specifically that you are dispatching with your efficiency arguments”, and the first paragraph sounds like an EY argument, but the 2nd more like my argument. (And ‘dispatching’ is ambiguous)
Ugh, yes. I have no idea why I originally formatted it with the second paragraph quoted the way I did; I fully intended it as an articulation of your argument, a rebuttal to the first, EY-style paragraph. Just a confusing formatting and structure error on my part. Sorry about that, and thanks for your patience.
So as a summary: you agree that an AI could be trained to be a bit smarter than humans, but you disagree with the model where the AI could suddenly and iteratively extract something like 6 OOMs better performance on the same hardware it’s running on, all at once, figure out ways to interact with the physical world from within the hardware it’s already training on, and then strike humanity all at once with undetectable nanotech before the training run is even complete.
The inability of the AI to attain 6 OOMs better performance on its training hardware during its training run by recursively self-improving its own software is mainly based on physical efficiency limits, and this is why you put such heavy emphasis on them. And because neural-net-like structures, which are very demanding in terms of compute, energy, space, etc., appear to be the only tractable road to superintelligence, there is no alternative, much more efficient scheme the neural-net form of the AI could find by rewriting itself into a fundamentally more efficient architecture on that scale. Again, you have other arguments to deal with other concerns and to make other predictions about the outcome of training superintelligent AI, but dispatching this specific scenario is where your efficiency arguments are most important. Is that correct?
Yes but I again expect AGI to use continuous learning, so the training run doesn’t really end. But yes I largely agree with that summary.
NN/DL in its various flavors is simply what efficient approximate Bayesian inference looks like, and there are no viable, non-equivalent, dramatically better alternatives.
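To gesture at one standard way of making that link precise (a minimal sketch of the general point, not the whole claim): minimizing the usual cross-entropy training loss over a model q_theta is exactly minimizing the KL divergence to the data distribution.

```latex
% One standard identity behind the "DL as approximate inference" framing
% (an illustration of the general point, not the full argument above):
\[
\mathcal{L}(\theta)
  = \mathbb{E}_{x \sim p}\!\left[-\log q_\theta(x)\right]
  = H(p) + D_{\mathrm{KL}}\!\left(p \,\|\, q_\theta\right)
\]
% Since H(p) does not depend on \theta, gradient descent on the standard
% loss drives q_\theta toward the closest available approximation of p
% within the model family, i.e. it performs approximate inference.
```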
Thanks Jacob for talking me through your model. I agree with you that this is a model that EY and others associated with him have put forth. I’ve looked back through Eliezer’s old posts, and he is consistently against the idea that LLMs are the path to superintelligence (not just that they’re not the only path, but he outright denies that superintelligence could come from neural nets).
My update, based on your arguments here, is that any future claim about a mechanism for iterative self-improvement that happens suddenly, on the training hardware, and involves >2 OOMs of improvement needs to first deal with the objections you are raising here in order to meaningfully move the conversation forward.