I didn’t push this point at the time, but Paul’s claim that “GPT-3 + 5 person-years of engineering effort [would] foom” seems really wild to me, and probably a good place to poke at his model more. Is this 5 years of engineering effort and then humans leaving it alone with infinite compute?
The 5 years are up front and then it’s up to the AI to do the rest. I was imagining something like 1e25 flops running for billions of years.
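For scale, here is a back-of-the-envelope check of how much total compute that is, reading the “1e25 flops” as a sustained rate of 1e25 FLOP/s; the ~3.1e23 FLOP estimate for GPT-3’s original training run is an outside figure added only for comparison, not a number from this conversation.

```python
# Back-of-the-envelope: total compute implied by 1e25 FLOP/s sustained for a billion years.
# The ~3.1e23 FLOP estimate for GPT-3's original training run is an outside figure,
# used only for comparison.
SECONDS_PER_YEAR = 3.15e7

rate_flop_per_s = 1e25
years = 1e9

total_flop = rate_flop_per_s * years * SECONDS_PER_YEAR
gpt3_training_flop = 3.1e23  # commonly cited estimate for the original GPT-3 run

print(f"total compute: {total_flop:.2e} FLOP")                                     # ~3.15e41
print(f"multiple of GPT-3's training run: {total_flop / gpt3_training_flop:.1e}")  # ~1e18
```

So, under that rate reading, “GPT-3 plus billions of years” is on the order of 10^18 GPT-3-training-runs’ worth of total compute.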
I don’t really believe the claim unless you provide computing infrastructure that is externally maintained or else extremely robust; that is, I don’t think GPT-3 is close enough to being autopoietic if it needs to maintain its own hardware (the hardware will decay much faster than GPT-3 can repair it). The software also has to be engineered a bit carefully, but that’s easily handled within your 5 years of engineering.
Most of what it will be doing is performing additional engineering, new training runs, very expensive trial and error, giant searches of various kinds, terrible memetic evolution from copies that have succeeded at other tasks, and so forth.
Over time this will get better, but it will of course be much slower than humans improving GPT-3 (since GPT-3 is much dumber); e.g. it may take billions of years for a billion copies to foom (on perfectly reliable hardware). That total time is likely pretty similar to the time it takes the system to improve itself at all: once it has improved meaningfully, I think the returns curve for software is probably such that each subsequent improvement comes faster than the one before. (The main exception to that latter claim is that your model may have a period of more rapid growth early on as it picks the low-hanging fruit that programmers missed.)
I’d guess that the main difference between GPT-3-fine-tuned-to-foom and a-smarter-model-fine-tuned-to-foom is that the smarter model will get off the ground faster. I would guess that a model of twice the size would take significantly less time to foom (e.g. maybe 25% or 50% as much total compute), though obviously it depends on exactly how you scale.
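As a toy illustration of the shape claimed in the last two paragraphs (a long “getting off the ground” phase, followed by improvements that each come faster than the one before, with a smarter starting model mostly just shortening the bootstrap), here is a sketch under assumptions of my own choosing; the r = 0.5 ratio and the bootstrap times are arbitrary, not numbers from the conversation.

```python
# Toy model of the dynamics described above (illustrative assumptions, not a model
# specified in the conversation):
#  - a long bootstrap phase before the first meaningful self-improvement, which is
#    much longer for a dumber starting model;
#  - after that, favorable returns: each improvement step takes a fixed fraction
#    r < 1 of the time of the previous step, so the whole tail stays comparable to
#    the bootstrap phase.

def total_foom_time(bootstrap_time, r=0.5, steps=50):
    """Bootstrap phase plus a geometrically shrinking series of improvement steps."""
    total, step = bootstrap_time, bootstrap_time * r
    for _ in range(steps):
        total += step
        step *= r
    return total

gpt3_bootstrap = 1.0    # arbitrary units (say, billions of copy-years)
gpt3_like = total_foom_time(gpt3_bootstrap)
twice_size = total_foom_time(bootstrap_time=0.35)  # a 2x model gets off the ground faster

print(f"total time / bootstrap time (GPT-3-like): {gpt3_like / gpt3_bootstrap:.2f}")    # ~2
print(f"2x model's total time as a fraction of GPT-3's: {twice_size / gpt3_like:.2f}")  # 0.35
```

The first printed ratio is the “pretty similar to the time it takes to improve itself at all” claim; the second lands in the 25%–50% range only because the 2x model’s bootstrap time was chosen to put it there.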
I don’t really buy Eliezer’s criticality model, though I’ve never seen the justification and he may have something in mind that I’m missing. If a model a bit smarter than you fooms in 20 years, and you make yourself a bit smarter, then you probably foom in about 20 years. Diminishing returns work roughly the same way whether it’s you coming up with the improvements or engineers building you (not quantitatively exactly the same, since you and the engineers will find different kinds of improvements, but it seems close enough for this purpose, since you quickly exhaust the fruit that hangs low for you but that your programmers missed).
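One way to read the diminishing-returns point is that remaining time to foom is (roughly) a function of current capability alone, not of who supplied the improvements so far. Here is a toy version of that bookkeeping, with a made-up returns curve and made-up numbers; it is my illustration, not a model either party has specified.

```python
import math

# Toy bookkeeping for the anti-criticality argument: assume remaining-time-to-foom is a
# smooth, diminishing-returns function R(c) of current capability c alone, regardless of
# whether the improvements so far came from engineers or from self-improvement.
# The functional form and all numbers below are made up for illustration.

def remaining_years_to_foom(capability):
    return 40.0 * math.exp(-capability)  # arbitrary decreasing curve

c = 0.7                   # your current capability (arbitrary units)
delta = 0.03              # "a bit smarter"
self_improve_years = 0.7  # time you spend making yourself a bit smarter (assumed)

print(f"model a bit smarter than you:        {remaining_years_to_foom(c + delta):.1f} years")
print(f"you, without self-improving at all:  {remaining_years_to_foom(c):.1f} years")
print(f"you, after making yourself smarter:  "
      f"{self_improve_years + remaining_years_to_foom(c + delta):.1f} years")
# All three land around 20 years: on a curve like this, making yourself a bit smarter
# doesn't buy a qualitatively faster foom, which is the sense in which criticality fails.
```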
I’d guess the difference with your position is that I’m considering really long compute times. And the main way this seems likely to be wrong is that the returns curve may be crazy unfavorable by the time you go all the way down to GPT-3 (though compared to you and Eliezer I do think I’m a lot more optimistic about tiny models run for an extremely long time).