so then OpenAI comes along and dumps 100x or 10,000x the compute into something like it just to see what happens.
10,000x would be unprecedented—why wouldn’t you first do a 100x run to make sure things work well before doing the 10,000x run? (The extra 100x run only increases total costs by about 1%.)
Also, a 10,000x increase in compute corresponds to roughly 100-1000x more parameters, which does not usually lead to things I would call “discontinuities” (e.g. GPT-2 to GPT-3 does not seem like an important discontinuity to me, even if we ignore the in-between models trained along the way). Put another way—I’m happy to posit “sudden jumps” of a size similar to the difference between GPT-2 and GPT-3 (they seem rare but possible); I don’t think these should make us particularly pessimistic about engineering-style approaches to alignment.
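As a back-of-envelope illustration of the two quantitative claims above (the exponents and numbers here are my own assumptions, not anything from this thread): a 100x pilot run costs about 1% of a 10,000x run, and reported compute-optimal scaling-law fits put the parameter exponent somewhere around 0.5 to 0.73, which is where the 100-1000x parameter multiplier comes from.

```python
# Back-of-envelope arithmetic for the claims above; all numbers are illustrative assumptions.

def pilot_overhead(pilot_multiplier: float, full_multiplier: float) -> float:
    """Cost of a smaller pilot run as a fraction of the full run's cost."""
    return pilot_multiplier / full_multiplier


def param_multiplier(compute_multiplier: float, exponent: float) -> float:
    """Parameter-count multiplier implied by a compute multiplier,
    assuming compute-optimal scaling of the form N ∝ C**exponent."""
    return compute_multiplier ** exponent


if __name__ == "__main__":
    # A 100x pilot run before a 10,000x run adds roughly 1% to total cost.
    print(f"pilot overhead: {pilot_overhead(100, 10_000):.1%}")

    # Reported compute-optimal exponents range from about 0.5 to 0.73,
    # giving the 100-1000x parameter range quoted above.
    for a in (0.5, 0.73):
        print(f"exponent {a}: ~{param_multiplier(10_000, a):,.0f}x more parameters")
```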
I feel like I keep responding to this argument in the same way, and I wish these predictions would be made in terms of $ spent and compared to current $ spent—it just seems nearly impossible to have a discontinuity via compute at this point. Perhaps I should just write a post called “10,000x compute is not a discontinuity”.
The story seems less obviously incorrect if we talk about a discontinuity driven by a major research insight, but the historical track record suggests that such insights do not usually cause major discontinuities.
I’m concerned that either they won’t bother to re-implement all the previous things that patched alignment problems, or there won’t be an obvious way to port some old patches to the new model (or there will be an obvious way, but it won’t work).
One assumes that they scale up the compute, notice some dangerous aspects, turn off the AI system, and then fix the problem. (Well, really, if we’ve already seen dangerous aspects from previous AI systems, one assumes they don’t run it in the first place until they have ported the safety features.)
I also agree that direct jumps in capability due to research insight are rare. But in part I think that’s just because things get tried at small scale first, and so there’s always going to be some scaling-up period where the new insight gets fed more and more resources, eventually outpacing the old state of the art. From a coarse-grained perspective, GPT-2 relative to your favorite LSTM model from 2018 is the “jump in capability” due to research insight; it just got there in a not-so-discontinuous way.
Maybe you’re optimistic that in the future, everyone will eventually be doing safety checks of their social media recommender algorithms or whatever during training. But even if some company is partway through scaling up the hot new algorithm and (rather than training to completion) they trip the alarm that was searching for undesirable real-world behavior arising from learned agent-like reasoning, what then? The assumption that progress will be slow relative to adaptation already seems to be out the window.
This is basically the punctuated equilibria theory of software evolution :P
I also agree that direct jumps in capability due to research insight are rare. But in part I think that’s just because things get tried at small scale first, and so there’s always going to be some scaling-up period where the new insight gets fed more and more resources, eventually outpacing the old state of the art. From a coarse-grained perspective, GPT-2 relative to your favorite LSTM model from 2018 is the “jump in capability” due to research insight; it just got there in a not-so-discontinuous way.
Seems right to me.
if some company is partway through scaling up the hot new algorithm and (rather than training to completion) they trip the alarm that was searching for undesirable real-world behavior arising from learned agent-like reasoning, what then?
(I’m not convinced this is a good tripwire, but under the assumption that it is:)
Ideally they have already applied safety solutions and so this doesn’t even happen in the first place. But supposing this did happen, they turn off the AI system because they remember how Amabook lost a billion dollars through their AI system embezzling money from them, and they start looking into how to fix this issue.
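To make the “trip the alarm, turn it off, investigate” pattern concrete, here is a purely illustrative sketch; every name, eval, and threshold below is hypothetical (nothing here is proposed in the thread), and a real tripwire would use an actual behavioral eval suite rather than a stub:

```python
# Purely illustrative sketch of the "trip the alarm, halt, investigate" pattern
# discussed above. All names, evals, and thresholds are hypothetical stand-ins.
import random

ALARM_THRESHOLD = 0.9  # hypothetical score above which behavior is treated as undesirable


def behavioral_eval(model_state: dict) -> float:
    """Stand-in for an eval that scores the model for undesirable real-world
    behavior arising from learned agent-like reasoning."""
    return random.random()  # placeholder; a real eval would actually probe the model


def train_step(model_state: dict) -> None:
    """Stand-in for one step of training at the scaled-up compute budget."""
    model_state["steps"] = model_state.get("steps", 0) + 1


def train_with_tripwire(total_steps: int, eval_every: int = 1_000) -> str:
    model_state: dict = {}
    for step in range(total_steps):
        train_step(model_state)
        if step % eval_every == 0:
            score = behavioral_eval(model_state)
            if score > ALARM_THRESHOLD:
                # "Turn off the AI system": stop scaling up rather than training
                # to completion, keep the checkpoint for analysis, and escalate
                # to humans to look into how to fix the issue.
                return f"halted at step {step} (eval score {score:.2f}); investigating"
    return "training completed without tripping the alarm"


if __name__ == "__main__":
    print(train_with_tripwire(total_steps=10_000))
```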
Perhaps I should just write a post called “10,000x compute is not a discontinuity”.
I think you should write this post!