Contra both the ‘doomers’ and the ‘optimists’ on (not) pausing. Rephrased: RSPs (done right) seem right.
Contra ‘doomers’. Oversimplified, ‘doomers’ (e.g. PauseAI, FLI’s letter, Eliezer) ask(ed) for pausing now or even earlier (e.g. the Pause Letter). I expect this would be / would have been very suboptimal, even purely in terms of solving technical alignment. For example, Some thoughts on automating alignment research suggests that timing a pause so that we can make use of automated AI safety research could result in ‘[...] each month of lead that the leader started out with would correspond to 15,000 human researchers working for 15 months.’ We clearly don’t have such automated AI safety R&D capabilities now, which suggests that pausing later, when AIs are closer to having them, would be better. At the same time, current models seem very unlikely to be x-risky (e.g. they’re still very bad at passing dangerous capabilities evals), which is another reason to think pausing now would be premature.
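To make the quoted figure a bit more concrete, here is a trivial arithmetic sketch. Only the overall product is implied by the quote; the split into number of parallel copies and per-copy speedup is my own illustrative assumption, not something taken from the source post.

```python
# Arithmetic sketch of the quoted figure: one calendar month of lead spent
# running automated safety researchers buys roughly
# 15,000 researchers x 15 months = 225,000 researcher-months of work.
researcher_months_per_month_of_lead = 15_000 * 15
print(researcher_months_per_month_of_lead)  # 225000

# One illustrative (assumed) decomposition of that figure:
# many parallel AI researcher copies, each running faster than a human.
parallel_copies = 15_000   # assumed number of ~human-level automated researchers
speedup_vs_human = 15      # assumed per-copy speedup over a human researcher
assert parallel_copies * speedup_vs_human == researcher_months_per_month_of_lead
```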
Contra ‘optimists’. I’m more unsure here, but the vibe I’m getting from e.g. AI Pause Will Likely Backfire (Guest Post) is roughly ‘no pause ever’, largely based on arguments that current systems seem easy to align / control. I agree that current systems do seem easy to align / control, and I could even see this holding all the way up to ~human-level automated AI safety R&D, but I can easily see scenarios where, around that time, things get scary quickly without any pause. Arguments similar to those about the scalability of automated AI safety R&D suggest that automated AI capabilities R&D could also be scaled up significantly. For example, figures like those in Before smart AI, there will be many mediocre or specialized AIs suggest that very large populations of ~human-level automated AI capabilities researchers could be deployed (e.g. 100x larger than the current [human] population of AI researchers). Given that, even with the current relatively small population, algorithmic progress seems to double LM capabilities roughly every 8 months, algorithmic progress could be much faster with 100x larger populations, potentially leading to new setups (e.g. new AI paradigms, new architectures, new optimizers, synthetic data, etc.) which could quite easily break the properties that make current systems seem relatively easy / safe to align. In this scenario, pausing to get things right (especially since automated AI safety R&D would also be feasible by then) seems like it could be crucial.
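To gesture at how sensitive the ‘things get scary quickly’ scenario is to the size of the research population, here is a toy calculation. The returns-to-scale exponent is a pure assumption for illustration; only the ~8-month baseline doubling time and the ~100x population figure come from the text above.

```python
# Toy model: assume the rate of algorithmic progress scales like
# (effective researcher population)^lam, with lam < 1 capturing diminishing
# returns (lam is an assumption, not an estimate from the cited posts).
# Then a 100x larger population shrinks the ~8-month effective-compute
# doubling time roughly as follows.
baseline_doubling_months = 8.0   # rough current doubling time for LM algorithmic progress
population_multiplier = 100.0    # ~human-level automated capabilities researchers, per the figures cited above

for lam in (0.25, 0.5, 0.75):    # assumed returns-to-scale exponents
    new_doubling = baseline_doubling_months / population_multiplier ** lam
    print(f"lam = {lam:.2f}: doubling time ~{new_doubling:.2f} months (vs. {baseline_doubling_months:.0f})")
```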
At least Eliezer has been extremely clear that he is in favor of a stop, not a pause (indeed, that was essentially the headline of his article, “Pausing AI Developments Isn’t Enough. We Need to Shut it All Down”), so I am confused why you list him alongside anything related to a “pause”.
My guess is that Eliezer and I are both in favor of a pause, but mostly because a pause seems like it would slow down AGI progress, not because the next 6 months in particular will be the most risky period.
At the same time, current models seem very unlikely to be x-risky (e.g. they’re still very bad at passing dangerous capabilities evals), which is another reason to think pausing now would be premature.
The relevant criterion is not whether the current models are likely to be x-risky (it’s obviously far too late if they are!), but whether the next generation of models, together with all the future frameworks they’re likely to be embedded into, has more than an insignificant chance of being x-risky.
Given that the next generations are planned to involve at least one order of magnitude more computing power in training (and are already in progress!), and that returns on scaling don’t seem to be slowing, I think the total chance of x-risk from those models is not insignificant.
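As a rough illustration of ‘returns on scaling don’t seem to be slowing’, here is a sketch using the published Chinchilla-style loss fit (Hoffmann et al. 2022) to compare a run at some compute budget with one at 10x that budget. The specific compute values and the 20-tokens-per-parameter rule of thumb are assumptions for illustration, and lower loss is of course only a loose proxy for (dangerous) capabilities.

```python
# Chinchilla-style parametric loss fit, L(N, D) = E + A/N^alpha + B/D^beta,
# using the published coefficient estimates from Hoffmann et al. (2022).
E, A, B, ALPHA, BETA = 1.69, 406.4, 410.7, 0.34, 0.28

def approx_loss(compute_flops: float, tokens_per_param: float = 20.0) -> float:
    """Approximate pretraining loss for a roughly compute-optimal run, using C ~= 6*N*D."""
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return E + A / n_params ** ALPHA + B / n_tokens ** BETA

# Illustrative budgets only: a frontier-scale run vs. one order of magnitude more compute.
for c in (1e25, 1e26):
    print(f"C = {c:.0e} FLOP -> predicted loss ~ {approx_loss(c):.3f}")
```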
I agree with some points here, Bogdan, but not all of them.
I do think that current models are civilization-scale-catastrophe-risky (but, importantly, not x-risky!) from a misuse perspective, but not yet from a self-directed perspective. This means neither Alignment nor Control is currently civilization-scale-catastrophe-risky, much less x-risky.
I also agree that pausing now would be counter-productive. My reasoning for this is that I agree with Samo Burja about some key points which are relevant here (while disagreeing with his conclusions due to other points).
To quote myself:
I agree with [Samo’s] premise that AGI will require fundamental scientific advances beyond currently deployed tech like transformer LLMs.
I agree that scientific progress is hard, usually slow and erratic, fundamentally different from engineering or bringing a product to market.
I agree with [Samo’s] estimate that the current hype around chat LLMs, and focus on bringing better versions to market, is slowing fundamental scientific progress by distracting top AI scientists from pursuit of theoretical advances.
Think about how you’d expect these factors to change if large AI training runs were paused. I think you might agree that this would likely result in a temporary shift of much of the top AI scientist talent to making theoretical progress. They’d want to be ready to come in strong after the pause ended, with lots of new advances tested at small scale. I think this would actually result in more high-quality scientific thought directed at the heart of the problem of AGI, and thus make AGI very likely to be achieved sooner after the pause ends than it otherwise would have been.
I would go even farther and claim that AGI could arise during a pause on large training runs. I think the human brain is not a supercomputer: my upper estimate for ‘human brain inference’ is roughly the level of a single 8x A100 server, i.e. less than an 8x H100 server. I also have evidence from analysis of the long-range human connectome (long-range axons are called tracts, so perhaps I should call this a ‘tractome’). [Hah, I just googled this term I came up with just now, and found it’s already in use, and that it brings up some very interesting neuroscience papers. Cool.] Anyway, this evidence suggests that the bandwidth (data throughput in bits per second) between two cortical regions in the human brain is typically around 5 Mb/s, and maxes out at about 50 Mb/s. In other words, well within the range where distributed federated training runs could work over long-distance internet connections. So unless you are willing to monitor the entire internet so robustly that nobody can scrape together the equivalent compute of an 8x A100 server, you can’t fully block AGI.
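For concreteness, here is the back-of-envelope comparison implicit in the paragraph above, treating the brain-side numbers as the commenter’s stated estimates and using public spec-sheet / typical-connectivity figures for the hardware side. It is illustrative arithmetic only, not a claim about what a distributed training run would actually require.

```python
# Bandwidth side: claimed inter-regional cortical bandwidth (the commenter's
# estimates, taken as assumptions) vs. ordinary long-distance internet links.
tract_bw_typical_mbps = 5       # megabits/s, typical (commenter's estimate)
tract_bw_max_mbps = 50          # megabits/s, upper end (commenter's estimate)
home_fiber_mbps = 1_000         # ~1 Gb/s consumer fiber (rough typical figure)
colo_uplink_mbps = 10_000       # ~10 Gb/s datacenter uplink (rough typical figure)
print(f"home fiber / typical tract bandwidth: ~{home_fiber_mbps / tract_bw_typical_mbps:.0f}x")
print(f"colo uplink / max tract bandwidth:    ~{colo_uplink_mbps / tract_bw_max_mbps:.0f}x")

# Compute side: peak dense BF16 tensor-core throughput of the servers named
# above (NVIDIA spec-sheet values, without sparsity).
a100_bf16_tflops = 312
h100_bf16_tflops = 989
print(f"8x A100: ~{8 * a100_bf16_tflops / 1000:.1f} PFLOP/s peak")
print(f"8x H100: ~{8 * h100_bf16_tflops / 1000:.1f} PFLOP/s peak")
```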
Of course, if you wanted to train the AGI in a reasonable amount of time, you’d want to do a parallel run of much more than a single inference instance of compute. So yeah, it’d definitely make things inconvenient if an international government were monitoring all datacenters… but far from impossible.
For the same reason, I don’t think a call to ‘Stop AI development permanently’ works without the hypothetical enforcement agency literally going around the world confiscating all personal computers and shutting down the internet. Not gonna happen, so why even advocate for such a thing? It makes me think that Eliezer is advocating for this in order to have some intended effect on the world other than the policy itself being enacted.