Direct self-improvement (i.e. rewriting itself at the cognitive level) does seem much, much harder with deep learning systems than with the sort of systems Eliezer originally focused on.
In DL, there is no distinction between “code” and “data”; it’s all messily packed together in the weights. Classic RSI relies on the ability to improve and reason about the code (relatively simple) without needing to consider the data (irreducibly complicated).
Any verification that a change to the weights/architecture will preserve a particular non-trivial property (e.g. avoiding value drift) is likely to be commensurate in complexity with the weights themselves. So… very complex.
The safest “self-improvement” changes probably look more like performance/parallelization improvements than “cognitive” changes. There are likely to be many opportunities for immediate performance improvements[1], but that could quickly asymptote.
I think that recursive self-empowerment might now be a more accurate term than RSI for a possible source of foom. That is, the creation of accessory tools for capability increase. More like a metaphorical spider at the center of an increasingly large web. Or (more colorfully) a shoggoth spawning a multitude of extra tentacles.
The change is still recursive in the sense that marginal self-empowerment increases the ability to self-empower.
So I’d say that a “foom” is still possible in DL, but is both less likely and almost certainly slower. However, even if a foom is days or weeks rather than minutes, many of the same considerations apply. Especially if the AI has already broadly distributed itself via the internet.
Perhaps instead of just foom, we get “AI goes brrrr… boom… foom”.
Hypothetical examples include: more efficient matrix multiplication, faster floating point arithmetic, better techniques for avoiding memory bottlenecks, finding acceptable latency vs. throughput trade-offs, parallelization, better usage of GPU L1/L2/etc caches, NN “circuit” factoring, and many other algorithmic improvements that I’m not qualified to predict.
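As a toy illustration of one item on that list (cache-aware blocking; this is purely a hypothetical sketch, not something from the comment above): a blocked matrix multiply that keeps small tiles of the operands hot in cache instead of streaming whole rows and columns through memory.

```python
import numpy as np

def blocked_matmul(A, B, block=64):
    """Blocked (tiled) matrix multiply.

    Each (block x block) tile of A and B is reused many times while it is
    still resident in cache, cutting trips to main memory compared with a
    naive row-by-column traversal. Real libraries (BLAS, cuBLAS) do this far
    more aggressively; this only shows the reuse pattern.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m), dtype=A.dtype)
    for i0 in range(0, n, block):
        for p0 in range(0, k, block):
            for j0 in range(0, m, block):
                C[i0:i0 + block, j0:j0 + block] += (
                    A[i0:i0 + block, p0:p0 + block]
                    @ B[p0:p0 + block, j0:j0 + block]
                )
    return C

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((512, 384))
    B = rng.standard_normal((384, 256))
    assert np.allclose(blocked_matmul(A, B), A @ B)
```

(A real system would of course work at the level of kernels, fusion, and memory layout rather than NumPy, but the reuse idea is the same.)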
What if the machine has a benchmark/training suite for performance? One task on the benchmark is to design a better machine architecture.
The machine proposes a better architecture. The new architecture may be a brand-new set of files defining the networks, topology, and training procedure, or it may reuse existing networks as components.
For example, you might imagine an architecture that uses GPT-3.5 and GPT-4 as subsystems, while the “executive control” comes from a new network defined by the architecture.
Given a very large compute budget (many billions), the company hosting the RSI runs would run many of these proposals; the machines that do best on the benchmark while remaining distinct from one another (i.e. a heuristic combining distinctness and performance) remain “alive” to design the next generation.
So it is recursive improvement, and the “selves” doing it are getting more capable over time, but it’s not a single AGI improving itself; it’s a population. Humans are also still involved, tweaking things (and maintaining the enormous farms of equipment).
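A minimal sketch of that selection loop, assuming hypothetical stand-ins `propose_architecture`, `train_and_benchmark`, and `distinctness` (none of these are real APIs; in a real run each stub would be a full training and evaluation job):

```python
import random

POPULATION_SIZE = 4        # survivors kept per generation
PROPOSALS_PER_PARENT = 3   # architectures each survivor designs
GENERATIONS = 5
DISTINCTNESS_THRESHOLD = 0.5

# --- Hypothetical stand-ins: a real system would launch training runs here ---

def propose_architecture(parent):
    """Stand-in for a parent model emitting a successor architecture spec
    (network definitions, topology, training procedure)."""
    return {"parent": parent["name"], "name": f"arch-{random.randrange(10**6)}"}

def train_and_benchmark(arch):
    """Stand-in for training the proposal and scoring it on the benchmark
    suite (which includes the 'design a better architecture' task)."""
    return random.random()

def distinctness(arch, survivors):
    """Stand-in heuristic for how different a proposal is from the
    architectures already selected this generation."""
    return random.random()

# --- Selection loop: keep proposals that score well AND are distinct ---

def run_generations(initial_population):
    population = initial_population
    for _ in range(GENERATIONS):
        proposals = [
            propose_architecture(parent)
            for parent in population
            for _ in range(PROPOSALS_PER_PARENT)
        ]
        scored = sorted(
            ((arch, train_and_benchmark(arch)) for arch in proposals),
            key=lambda pair: pair[1],
            reverse=True,
        )
        survivors = []
        for arch, _score in scored:
            if len(survivors) >= POPULATION_SIZE:
                break
            if not survivors or distinctness(arch, survivors) > DISTINCTNESS_THRESHOLD:
                survivors.append(arch)
        population = survivors
    return population

if __name__ == "__main__":
    seeds = [{"name": f"seed-{i}"} for i in range(POPULATION_SIZE)]
    for arch in run_generations(seeds):
        print(arch["name"], "<-", arch["parent"])
```

The point the sketch makes concrete is that selection acts on a population of proposals, jointly on performance and diversity, rather than on a single model editing its own weights.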
Training runs already take months.
I’d expect that to take several generations of models, so a double-digit number of months in an aggressive scenario?
(Barring drastic jumps in compute that cut months-long training runs to hours or days.)
Read paragraph 2
But yes, foom wasn’t going to happen. It takes time for AI to be improved; it turns out reality gets a vote.
I think takeoff from broadly human-level to strongly superhuman takes years, and potentially decades.
Foom in days or weeks still seems just as fanciful as before.