I didn’t push this point at the time, but Paul’s claim that “GPT-3 + 5 person-years of engineering effort [would] foom” seems really wild to me, and probably a good place to poke at his model more. Is this 5 years of engineering effort and then humans leaving it alone with infinite compute? Or are the person-years of engineering doled out over time?
Unlike Eliezer, I do think that language models not wildly dissimilar to our current ones will be able to come up with novel insights about ML, but there’s a long way between “sometimes comes up with novel insights” and “can run a process of self-improvement with increasing returns”. I’m pretty confused about how a few years of engineering could get GPT-3 to a point where it could systematically make useful changes to itself (unless most of the work is actually being done by a program search which consumes astronomical amounts of compute).
I didn’t push this point at the time, but Paul’s claim that “GPT-3 + 5 person-years of engineering effort [would] foom” seems really wild to me, and probably a good place to poke at his model more. Is this 5 years of engineering effort and then humans leaving it alone with infinite compute?
The 5 years are up front and then it’s up to the AI to do the rest. I was imagining something like 1e25 flops running for billions of years.
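(As a rough sense of scale for that budget, here is a minimal back-of-the-envelope, assuming "1e25 flops" means a sustained rate of 1e25 FLOP/s and "billions of years" means roughly 1e9 years; both readings are my assumptions, not something stated above.)

```python
# Back-of-the-envelope for the hypothetical compute budget above.
# Assumes "1e25 flops" is a sustained rate of 1e25 FLOP/s and the run
# lasts about 1e9 years; both numbers are assumed readings, not quotes.
rate_flop_per_s = 1e25
seconds_per_year = 365.25 * 24 * 3600          # ~3.16e7 seconds
years = 1e9
total_flop = rate_flop_per_s * seconds_per_year * years
print(f"total compute: {total_flop:.1e} FLOP")  # ~3.2e41 FLOP
```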
I don’t really believe the claim unless you provide computing infrastructure that is externally maintained or else extremely robust; i.e., I don’t think GPT-3 is close enough to being autopoietic if it needs to maintain its own hardware (it will decay much faster than it can be repaired). The software also has to be a bit careful, but that’s easily handled within your 5 years of engineering.
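(A toy illustration of the hardware-decay point, assuming an exponential failure model and a mean time to failure of about ten years per machine; both the model and the numbers are placeholders I chose, not figures from the discussion.)

```python
# Toy check of the "hardware decays faster than it can be repaired" worry:
# with an assumed ~10-year mean time to failure and no effective repair,
# essentially none of a fleet survives on foom-relevant timescales.
import math

mttf_years = 10.0    # assumed mean time to failure per machine
fleet_size = 1e9     # assumed number of machines

for years in (10, 100, 1000):
    surviving = fleet_size * math.exp(-years / mttf_years)
    print(f"after {years:>4} years: ~{surviving:.1e} machines still working")
```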
Most of what it will be doing will be performing additional engineering, new training runs, very expensive trial and error, giant searches of various kinds, terrible memetic evolution from copies that have succeeded at other tasks, and so forth.
Over time this will get better, but it will of course be much slower than humans improving GPT-3 (since GPT-3 is much dumber); e.g. it may take billions of years for a billion copies to foom (on perfectly-reliable hardware). That total is likely pretty similar to the time it takes to improve itself at all: once it has improved meaningfully, I think returns curves to software are probably such that each subsequent improvement comes faster than the one before. (The main exception to that latter claim is that your model may have a period of more rapid growth as it picks the low-hanging fruit that programmers missed.)
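(A toy version of that returns picture, with all numbers invented for illustration: if the first meaningful improvement takes on the order of the full "billions of years" and each later improvement of similar size comes a constant factor faster, the total time to foom stays within a small multiple of the time to that first improvement.)

```python
# Toy model of the returns curve sketched above. Numbers are illustrative:
# the first improvement takes ~1e9 years and each subsequent improvement
# of comparable size arrives a constant factor faster than the one before.
first_step_years = 1e9
speedup_per_step = 1.5     # assumed acceleration between successive improvements
steps = 20

t = first_step_years
total = 0.0
for _ in range(steps):
    total += t
    t /= speedup_per_step

print(f"total time to foom: {total:.2e} years")                 # ~3e9 years
print(f"first step's share: {first_step_years / total:.0%}")    # ~33%
```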
I’d guess that the main difference between GPT-3-fine-tuned-to-foom and a-smarter-model-fine-tuned-to-foom is that the smarter model will get off the ground faster. I would guess that a model of twice the size would take significantly less time (e.g. maybe 25% or 50% as much total compute), though obviously it depends on exactly how you scale.
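(A minimal sketch of how that scaling guess compounds, assuming each doubling of model size cuts the total compute needed to foom to a fixed fraction; the starting budget reuses the earlier back-of-the-envelope and the fraction is the midpoint of the 25–50% range guessed above, so treat all of it as illustrative.)

```python
# How "twice the size needs maybe 25-50% of the compute" compounds across
# doublings. The starting budget (~3.2e41 FLOP) and the 0.35 fraction are
# illustrative assumptions, not estimates from the discussion.
base_flop = 3.2e41             # assumed compute for a GPT-3-sized model to foom
fraction_per_doubling = 0.35   # midpoint of the 25%-50% guess

for doublings in range(5):
    flop = base_flop * fraction_per_doubling ** doublings
    print(f"{2 ** doublings:>2}x model size: ~{flop:.1e} FLOP to foom")
```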
I don’t really buy Eliezer’s criticality model, though I’ve never seen the justification and he may have something in mind I’m missing. If a model a bit smarter than you fooms in 20 years, and you make yourself a bit smarter, then you probably foom in about 20 years. Diminishing returns work roughly the same way whether it’s you coming up with the improvements or engineers building you (not quantitatively exactly the same, since you and the engineers will find different kinds of improvements, but it seems close enough for this purpose since you quickly exhaust the fruit that hangs low for you that your programmers missed).
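(A toy version of that anti-criticality argument, with made-up functional forms: if later improvements get harder faster than a slightly smarter searcher speeds up, then skipping one early step by making yourself a bit smarter barely changes the total time to foom.)

```python
# Toy model of the argument above. Improvements get exponentially harder
# to find while a smarter searcher works only somewhat faster, so the total
# time to foom is dominated by the late, hard steps. Making yourself one
# step smarter skips one cheap early step and barely moves the total.
# All functional forms and constants are illustrative assumptions.
def time_to_foom(start_level: int, foom_level: int = 100) -> float:
    total = 0.0
    for level in range(start_level, foom_level):
        difficulty = 1.05 ** level   # each successive improvement is harder
        speed = level                # smarter searchers find improvements faster
        total += difficulty / speed
    return total

baseline = time_to_foom(10)   # foom time starting a bit below you
boosted = time_to_foom(11)    # foom time after a one-step self-improvement
print(f"relative change: {(baseline - boosted) / baseline:.1%}")   # under 1%
```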
I’d guess the difference with your position is that I’m considering really long compute times. And the main way this seems likely to be wrong is that the curve may be crazy unfavorable by the time you go all the way down to GPT-3 (but compared to you and Eliezer I do think that I’m a lot more optimistic about tiny models run for an extremely long time).
Is this 5 years of engineering effort and then humans leaving it alone with infinite compute?
Maybe something like ‘5 years of engineering effort to start automating work that qualitatively (but incredibly slowly and inefficiently) helps with AI research, and then a few decades of throwing more compute at that for the AI to reach superintelligence’?
With infinite compute you could just recapitulate evolution, so I doubt Paul thinks there’s a crux like that? But there could be a crux that’s about whether GPT-3.5 plus a few decades of hardware progress achieves superintelligence, or about whether that’s approximately the fastest way to get to superintelligence, or something.
Paul Christiano makes a slightly different claim here: https://www.lesswrong.com/posts/7MCqRnZzvszsxgtJi/christiano-cotra-and-yudkowsky-on-ai-progress?commentId=AiNd3hZsKbajTDG2J
As I read the two claims:
1. With GPT-3 + 5 years of effort, a system could be built that would eventually foom if allowed.
2. With GPT-3 + a serious effort, a system could be built that would clearly foom if allowed.
I think the second could be made into a bet. I tried to operationalise it as a reply to the linked comment.