Like Andrew, I don’t see strong reasons to believe that near-term loss-of-control accounts for more x-risk than medium-term multi-polar “going out with a whimper”. This is partly because I think oversight of near-term AI might be technically easy. I think Andrew also thought along those lines: an intelligence explosion is possible, but relatively easy to prevent if people are scared enough, and they probably will be. Although I do have lower probabilities than him, and some different views on AI conflict. Interested in your take @Daniel Kokotajlo
I don’t think people will be scared enough of intelligence explosion to prevent it. Indeed the leadership of all the major AI corporations are actively excited about, and gunning for, an intelligence explosion. They are integrating AI into their AI R&D as fast as they can.
Can you expand on this? My rough impression (without having any inside knowledge) is that auto AI R&D is probably very much underelicited, including e.g. in this recent OpenAI auto ML evals paper; which might suggest they’re not gunning for it as hard as they could?
It’s hard to know for sure what they are planning in secret. If I were them, I’d currently be in a mode of “biding my time, waiting for the optimal moment to focus on automating AI R&D, building up the prerequisites.”
I think the current LLMs and other AI systems are not quite strong enough to pass a critical threshold where this RSI feedback loop could really take off. Thus, if I had the option to invest in preparing the scaffolding now, or racing to get to the first version so that I got to be the first to start doing RSI… I’d just push hard for that first good-enough version. Then I’d pivot hard to RSI as soon as I had it.
I don’t know, intuitively it would seem suboptimal to put very little of the research portfolio on preparing the scaffolding, since somebody else who isn’t that far behind on the base model (e.g. another lab, maybe even the open-source community) might figure out the scaffolding (and perhaps not even make anything public) and get ahead overall.
Maybe. I think it’s hard to say from an outside perspective. I expect that what’s being done inside labs is not always obvious on the outside.
And isn’t o1/strawberry something pointing in the direction of RSI, such that it implies that thought and effort is being put into that direction?
Importantly, I think that preventing an intelligence explosion (as opposed to delaying it by a couple of years, or slowing it by 10%) is really, really hard. My expectation is that even if large training runs and the construction of new large datacenters were halted worldwide today, algorithmic progress would continue. Over the following ten years, I’d expect the cost of training and running an RSI-capable AGI to keep dropping as hardware and algorithms improved. At some point during that ten-year period, it would come within reach of small private datacenters, then personal home servers (e.g. bitcoin mining rigs), then ordinary personal computers.
If my view on this is correct, then during this ten year period the governments of the world would not only need to coordinate to block RSI in large datacenters, but actually to expand their surveillance and control to ever smaller and more personal compute sources. Eventually they’d need to start confiscating personal computers beyond a certain power level, close all non-government controlled datacenters, prevent the public sale of hardware components which could be assembled into compute clusters, monitor the web and block all encrypted traffic in order to prevent federated learning, and continue to inspect all the government facilities (including all secret military facilities) of all other governments to prevent any of them from defecting against the ban on AI progress.
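To make that cost trajectory concrete, here is a rough back-of-the-envelope sketch. The doubling times and starting cost below are illustrative assumptions, not measured values:

```python
# Back-of-envelope: how fast the cost of a fixed AI capability could fall
# if hardware price-performance and algorithmic efficiency both keep improving.
# All numbers below are illustrative assumptions.
hw_doubling_years = 2.5    # assumed hardware price-performance doubling time
algo_doubling_years = 1.5  # assumed algorithmic-efficiency doubling time
initial_cost = 1e9         # hypothetical cost of an RSI-capable system today ($)

for year in range(0, 11, 2):
    # The two trends compound independently into one combined cost reduction.
    factor = 2 ** (year / hw_doubling_years) * 2 ** (year / algo_doubling_years)
    print(f"year {year:2d}: ~${initial_cost / factor:,.0f}")
```

Under these (made-up) doubling times, a billion-dollar system falls to under a million dollars within a decade, which is exactly the dynamic that moves it from large datacenters toward ever smaller actors.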
I don’t think that will happen, no matter how scared the leadership of any particular company or government gets. There are just too many options for someone somewhere to defect, and the costs of control are too high.
I agree that pausing for 10 years would be difficult. However, I think even a 1-year pause would be GREAT and would substantially improve humanity’s chances of staying in control and surviving. In practice I expect ‘you can’t pause forever, what about bitcoin miners and North Korea’ to be used as just one more in a long list of rationalizations for why we shouldn’t pause at all.
I do think that we shouldn’t pause. I agree that we need to end this corporate race towards AGI, but I don’t think pausing is the right way to do it.
I think we should nationalize, then move ahead cautiously but determinedly. Not racing, not treating this as just normal corporate R&D. I think the seriousness of the risks, and the huge potential upsides, means that AI development should be treated more like nuclear power and nuclear weapons. We should control and restrict it. Don’t allow even API access to non-team members. Remember when we talked about keeping AI in a box for safety? What if we actually did that?
I expect that nationalization would slow things down, especially at first during the transition. I think that’s a good thing.
I disagree that creating approximately human-level AGI in a controlled lab environment is a bad thing. There are a lot of risks, but the alternatives are also risky.
Risks of nationalizing:
Model and/or code could get stolen or leaked.
Word of its existence could spread, and encourage others to try to catch up.
It would be more useful for weapons development and other misuse.
It could be used for RSI, resulting in:
a future model so superintelligent that it can escape even from the controlled lab,
algorithmic improvements that make training much, much cheaper (a secret which would then itself need to be kept from leaking).
Benefits of nationalization:
We would have a specimen of true AGI to study in the lab.
We could use it, even without robust alignment, via a control strategy like Buck’s/Ryan’s ideas.
We could use it for automated alignment research. Including:
creating synthetic data
creating realistic simulations for training and testing, with multiagent interactions, censored training runs, honeypots to catch deception or escape attempts, etc
parallel exploration of many theories with lots of very fast workers
exploration of a wider set of possible algorithms and architectures, to see if some are particularly safe (or hazardous)
We could use it for non-AI R&D:
this could help with defensive acceleration of defense-dominant technology. Protecting the world from bioweapons and ICBMs, etc.
this would enable beneficial rapid progress in many intellect-bottlenecked fields like medicine.
Nationalization would enable:
preventing AI experts from leaving the country even if they decided not to work for the government project,
removing restrictions (and adding incentives) on immigration for AI experts,
not being dependent on the whims of corporate politics for the safety of humanity,
not needing to develop and deploy a consumer product (distracting from the true quest of alignment), not needing to worry about profits vs expenditures
removing divisions between the top labs by placing them all in the government project, preventing secret-hoarding of details important for alignment
Risks of Pausing:
There is still a corporate race on, it just has to work around the rules of the pause now. This creates enormous pressure for the AI experts to find ways of improving the outputs of AI without breaking limits. This is especially concerning in the case of compute limits since it explicitly pushes research in the direction of searching for algorithms that would allow things like:
incremental / federated training methods that break up big training runs into sub-limit pieces, or allow for better combinations of prior models
the search for much more efficient algorithms that can work with much less data and training compute. I am confident that huge gains are possible here, and that they would be found quickly. There is enough detail here that this point could become a post of its own, but I am reluctant to post these thoughts publicly, since they might contribute to the very thing I’m worried about. Such developments would gravely undercut the new compute restrictions and risk wider proliferation into the hands of many smaller actors.
the search for loopholes in the regulation or flaws in the surveillance and enforcement that allow for racing while appearing not to race
the continuation of many corporate race dynamics that are bad for safety like pressure for espionage and employee-sniping,
lack of military-grade security on AI developers. Many different independent corporate systems each presenting their own set of vulnerabilities. Dangerous innovation could occur in relatively small and insecure companies, and then get stolen or leaked.
any employee can decide to quit and go off to start their own project, spreading tech secrets and changing control structures. The companies have no power to stop this (as opposed to a nationalized project).
If large training runs are banned, then this reduces incentive to work for the big companies, smaller companies will seem more competitive and tempting.
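To make the first of those worries concrete, here is a toy NumPy sketch (purely illustrative, not any lab’s actual method) of how several small, sub-limit training runs can be combined by parameter averaging into the equivalent of one larger run, in the style of federated averaging:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, data_x, data_y, lr=0.1, steps=50):
    """One participant's small, sub-limit training run (linear model, MSE loss)."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * data_x.T @ (data_x @ w - data_y) / len(data_y)
        w -= lr * grad
    return w

# A "big run" split across 4 participants, each staying under a compute limit.
true_w = np.array([2.0, -1.0])
global_w = np.zeros(2)
for round_ in range(20):
    local_models = []
    for _ in range(4):
        x = rng.normal(size=(32, 2))
        y = x @ true_w
        local_models.append(local_update(global_w, x, y))
    global_w = np.mean(local_models, axis=0)  # federated averaging step

print(global_w)  # converges toward true_w despite no single large run
```

Each participant stays under the per-run limit, yet the averaged global model still converges toward the target. Real federated-learning schemes are far more sophisticated, but this is the basic shape of the loophole.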
The whole reason I think we should pause is that, sooner or later, we will hit a threshold where, if we do not pause, we literally die, basically immediately (or, like, a couple months later in a way that is hard to find and intervene on) and it doesn’t matter whether pausing has downsides.
(Where “pause” means “pause further capability developments that are more likely to produce the kinds of agentic thinking that could result in recursive self-improvement.”)
So for me the question is “do we literally pause at the last possible second, or, sometime beforehand.” (I doubt “last possible second” is a reasonable choice, if it’s necessary to pause “at all”, although I’m much more agnostic about “we should literally pause right now” vs “right at the moment we have minimum-viable-AGI but before recursive self-improvement can start happening” vs “2-6 months before MVP AGI”)
I’m guessing you probably disagree that there will be a moment where, if we do not pause, we literally die?
Short answer, yes I disagree. I also don’t think we are safe if we do pause. That additional fact makes pausing seem less optimal.
Long answer:
I think I need to break this concept down a bit more. There’s a variety of places one might consider pausing, and a variety of consequences which I think could happen at each of those places.
Pre-AGI, dangerous tool AI: this is where we are at now. A combination of tool AI and the limited LLMs we have can together provide pretty substantial uplift to a terrorist org attempting to wipe out humanity. Not an x-risk, but could certainly kill 99% of humanity. A sensible civilization would have paused before we got this far, and put appropriate defenses in place before proceeding. We still have a chance to build defenses, and should do that ASAP. Civilization is not safe. Our existing institutions are apparently failing to protect us from the substantial risks we are facing.
Assistant AGI
Weak AGI is being trained. With sufficient scaffolding, this will probably speed up AI research substantially, but it isn’t enough to fully automate the whole research process. It can probably be deployed safely behind an API without dooming us all. If it follows the pattern of LLMs so far, this is a system with tons of memorized facts but subpar reasoning skills and little integration of the logical implications of combinations of those facts. The facts are scattered and disconnected.
If we were wise, we’d stop here and accept this substantial speed-up to our research capabilities and try to get civilization into a safe state. If others are racing though, the group at this level is going to be strongly tempted to proceed. This is not yet enough AI power to assure a decisive technological-economic victory for the leader.
If this AGI escaped, we could probably catch it before it self-improved much at all, and it would likely pose no serious (additional) danger to humanity.
Full Researcher AGI
Enough reasoning capability added to the existing LLMs and scaffolding systems (perhaps via some non-LLM architecture) that the system can now reason and integrate facts at least as well as a typical STEM scientist.
I don’t expect that this kills us or escapes its lab, if it is treated cautiously. I think Buck and Ryan’s control scheme works well to harness capabilities without allowing harms, with only a moderate safety tax.
The speed-up to AI research could now truly be called RSI. The time to the next level up, the more powerful version, may be only a few months if research proceeds at full speed. This is where we would depend on organizational adequacy to protect us, to restrain the temptation of some researchers to accelerate at full speed. This is where we have the chance to instead turn the focus fully onto alignment, defensive technology acceleration, and human augmentation. This is probably enough AI power to grant the leader decisive economic/technological/military power and allow them to act to globally halt further racing towards AI. In other words, this is the first point at which I believe we could safely ‘pause’, although given that the ‘pause’ would involve pretty drastic actions and substantial use of human-level AI, I don’t think the ‘pause’ framing quite fits. Civilization can quickly be made safe.
If released into the world, it could probably evolve into something that could wipe out humanity. It could be quite hard to catch and delete in time if it couldn’t be proven to the relevant authorities that the risk was real and required drastic action. Not a definite game-over, though; just high risk.
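The “only a few months to the next level up” claim can be illustrated with a toy compounding model. The base interval and per-generation speed-up below are assumptions chosen only to show the shape of the dynamic:

```python
# Toy model: each AI generation multiplies research speed by a constant factor,
# so the wall-clock time to the next generation shrinks geometrically.
base_months = 6.0      # assumed human-speed gap between generations
speedup_per_gen = 1.8  # assumed research-speed multiplier per generation

t, speed = 0.0, 1.0
for gen in range(1, 7):
    t += base_months / speed  # time to reach this generation
    speed *= speedup_per_gen  # it then accelerates further research
    print(f"gen {gen}: month {t:5.1f}, research speed x{speed:.1f}")
```

With these numbers the arrival times form a geometric series converging near month 13.5, which is why organizational restraint, rather than calendar time, becomes the binding constraint at this stage.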
Mildly Superhuman AGI (weak ASI)
Not only knows more about the world than any human ever has, but also integrates this information and reasons about it more effectively than any human could. Pretty dangerous even in the lab, but still could be safely controlled via careful schemes. For instance, keeping it impaired with slow-downs, deliberately censored and misleading datasets/simulations, and noise injection into its activations. Deploying it safely would likely require so much impairment that it wouldn’t end up any more useful than the merely-high-human-level AGI.
There would be huge temptation to relax the deliberate impairments and harness the full power. Organizational insufficiency could strike here, failing to sufficiently restrain the operators.
If it escaped, it would definitely be able to destroy humanity within a fairly short time frame.
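One of the impairment knobs mentioned above, noise injection into activations, can be sketched in a few lines. This toy two-layer network is purely illustrative, not a real control scheme:

```python
import numpy as np

rng = np.random.default_rng(42)

def forward(x, weights, noise_scale=0.0):
    """Toy 2-layer network; noise_scale > 0 injects Gaussian noise into the
    hidden activations, deliberately degrading the model's capability."""
    hidden = np.tanh(x @ weights["w1"])
    hidden = hidden + rng.normal(scale=noise_scale, size=hidden.shape)
    return hidden @ weights["w2"]

weights = {"w1": rng.normal(size=(4, 16)), "w2": rng.normal(size=(16, 1))}
x = rng.normal(size=(8, 4))

clean = forward(x, weights)            # full capability
impaired = forward(x, weights, 0.5)    # degraded, noisier outputs
print(np.abs(clean - impaired).mean()) # nonzero gap shows the impairment
```

Turning `noise_scale` up degrades the outputs in a tunable way; the policy question above is whether operators would actually keep such dials turned up when the full power is within reach.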
Strongly Superhuman
By the time you’ve relaxed your impairment and control measures enough to even measure how superhuman it is, you are already in serious danger from it. This is the level where you would need to seriously worry about the lab employees getting mind-hacked in some way, or the lab compute equipment getting hacked. If we train this model, and even try to test it at full power, we are in great danger.
If it escapes, it is game over.
Given these levels, I think that you are technically correct that there is a point where, if we don’t pause, we are pretty much doomed. But I think that pause point is somewhere above human-level AGI.
Furthermore, I think that attempting to pause before we get to ‘enough AI power to make humanity safe’ leads to us actually putting humanity at greater risk.
How?
By continuing in our current state of great vulnerability to self-replicating weapons like bioweapons.
By putting research pressure on the pursuit of more efficient algorithms, resulting in a much more distributed, harder-to-control progress front proceeding nearly as fast toward superhuman AGI. I think this rerouting would put us in far more danger of a containment breach, or of overshooting the target level of useful-but-not-suicidally-dangerous.
I don’t have a deep, confident argument that we should pause before the ‘slightly superhuman’ level, but I do think we will need to pause then. I think getting humanity ready to pause takes something like 3 years, which is also my earliest possible timeline, so I think we need to start laying the groundwork now so that, even if you think we shouldn’t pause until later, we are civilizationally ready to pause quite abruptly.
Well, it seems we are closer to agreement than we thought about when a pause could be good. I am unsure about the correct way to prep for the pause. I do think we are about 2-4 years from human-level AI, and could get to above-human-level within a year after that if we so chose.
What makes you say 3 years?
It’s more of an intuitive guess than anything particularly rigorous, but, like, it takes time for companies and nation-states and international communities to agree to things, we don’t seem anywhere close, there will be political forces opposing the pause, and 3 years seems like a generously short time, even if we got moderately lucky, to get all the necessary actors to pause in a stable way.
Honestly, my view is that, assuming a baseline level of competence where AI legislation is inadequate until a crisis appears (and that crisis has to be fairly severe), it depends fairly clearly on when that crisis happens.
A pause 2-3 years before AI can take over everything is probably net positive, but attempting to pause, say, 1 year or 6 months before AI can take over everything is plausibly net negative, because I suspect most pauses would essentially be pauses on giant training runs. That unfortunately introduces lots of risk of overhang from algorithmic advances, and I expect that by the time very strong laws on AI are passed, AI will probably be either 1 OOM of compute away from takeover, or already capable of takeover given new algorithms, which becomes a massive problem.
In essence, overhangs are the reason I expect the MNM effect to backfire/have a negative effect for AI regulation in a way it doesn’t for other regulation:
https://www.lesswrong.com/posts/EgdHK523ZM4zPiX5q/coronavirus-as-a-test-run-for-x-risks#Implications_for_X_risks