Cheap, Fast, and Doesn’t Kill Everyone: choose only two.
That is an extremely easy choice. “Doesn’t Kill Everyone” is blatantly essential. “Fast” is unfortunately a requirement given that the open-source community intent on releasing everything to every nutjob, criminal, and failed state on the planet is only ~18 months behind you, so waiting until this can be done cheaply means that there will be many thousands of groups that can do this, and we’re all dead if any one of them does something stupid (a statistical inevitability). So the blindingly obvious decision is to skip “Cheap” and hit up the tech titans for tens of billions of dollars, followed by Wall Street for hundreds of billions of dollars as training costs increase. While also clamming up on publishing capabilities work and only publishing alignment work, and throwing grant money around to fund external alignment research. Which sounds to me like a description of the externally visible strategies of the ~3 labs making frontier models at the moment.
I honestly think this is still cheap. Non-cheap would be monumentally bigger, with much larger teams employed on alignment to attack the problem from all angles. I think we’re seeing Cheap and Fast, with the obvious implied problem.
You’re talking about a couple of thousand extremely smart people, quite a few of them alignment researchers (some of whom post regularly on Less Wrong/The Alignment Forum), and suggesting they’re all not noticing the possibility of the extinction of the human race. The desirability of not killing everyone is completely obvious to anyone aware that it’s a possibility. Absolutely no one wants to kill themselves and all their friends and family. (This is obviously not a problem that a private bunker will help with: a paperclip maximizer will want to turn that into paperclips too. I believe Elon Musk is on record pointing out that Mars is not far enough to run.) Yes, there are people like Yann LeCun who are publicly making it clear that they’re still missing the point that this could happen any time soon. On the other hand, Sam Altman, Ilya Sutskever, Dario and Daniela Amodei, and Demis Hassabis are all on public record, with significant personal reputational skin in the game, saying that killing everyone is a real risk in the relatively near term and that not doing so is obviously vital, while Sundar Pichai is constitutionally incapable of speaking in public without using the words ‘helpful’, ‘responsible’, and ‘trustworthy’ at least once every few paragraphs, so it’s hard to tell how worried he is. OpenAI routinely delay shipping their models for ~6 months while they and external groups do safety work, Google just delayed Gemini Ultra for what sounds rather like safety reasons, and Anthropic are publicly committed to never ship first, and never have. This is not what “cheap” + “fast” looks like.
Tens to hundreds of billions of dollars is not cheap in anyone’s books, not even the tech titans’. Add Google’s and Microsoft’s entire current market capitalizations together and you get 4 or 5 trillion. The only place we could get significantly more money than that to throw at the problem is the US government. Now, it is entirely true that the proportion of that largesse going to alignment research isn’t anything like as high as the proportion going to build training compute (though OpenAI did publicly commit 20% of their training compute to AGI-level alignment work, and that’s a lot of money). But if they threw a couple of orders of magnitude more than the $10m in grants that OpenAI just threw at alignment, are there enough competent alignment researchers to spend it without seriously diminishing returns? I think alignment field-building is the bottleneck.
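To put rough numbers on that last question, here’s a purely illustrative back-of-envelope sketch: the $10m grant figure comes from the paragraph above, while the 100x scale-up and the $30B training budget are assumptions chosen only to show the orders of magnitude involved, not sourced estimates.

```python
# Back-of-envelope: scaled-up external alignment grants vs. frontier training spend.
# All figures are illustrative assumptions based on the discussion above.

grants_now = 10e6                  # ~$10m in external alignment grants
grants_scaled = grants_now * 100   # "a couple of orders of magnitude more"
training_spend = 30e9              # "tens of billions" on frontier training (assumed)

print(f"Scaled-up grants:            ${grants_scaled / 1e9:.1f}B")
print(f"Illustrative training spend: ${training_spend / 1e9:.1f}B")
print(f"Grants as share of training: {grants_scaled / training_spend:.1%}")
# Even after a 100x increase, external grants would be a few percent of one lab's
# training budget; the binding question is whether the field could absorb that money.
```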
Just because this isn’t cheap relative to world GDP doesn’t mean it’s enough. If our goal was “build a Dyson sphere”, even throwing our entire productive capacity at it would be cheap. I’m not saying there aren’t any concerns, but the money is still mostly going to capabilities, and safety, while a concern, still has to be traded off against commercial needs and race dynamics, albeit mercifully dampened ones. Honestly, given LeCun’s position, we’re just lucky that Meta isn’t that good at AI, or they alone would set the pace of the race for everyone else.
I think Meta have been somewhat persuaded by the Biden administration to sign on for safety, or at least for safety-theatre, despite LeCun. They actually did a non-trivial amount of real safety work on Llama-2 (a model small enough not to need it), and then never released one size of it for safety reasons. Which was of course pointless, or more exactly just showing off, since they then open-sourced the weights, including those of the base models, so anyone with $200 can fine-tune their safety work out again. However, it’s all basically window dressing, as these models are (we believe) too small to be an x-risk, and they were reasonably certain of that before they started (as far as we know, about the worst these models can do is write badly-written underage porn or phishing emails, or otherwise marginally assist criminals).
Obviously no modern models are an existential risk; the problem is the trajectory. Does the current way of handling the situation extrapolate properly to even just AGI, something that is an explicit goal for many of these companies? I’d say not, or at least, I very much doubt it. As in, if you’re not doing that kind of work inside a triple-airgapped and firewalled desert island, and planning for layers upon layers of safety testing before even considering releasing the resulting product as a commercial tool, you’re doing it wrong, and that’s just for technical safety. I still haven’t seen a serious proposal for how to make human labor entirely unnecessary while maintaining a semblance of economic order, rather than collapsing every social and political structure at once.