Both of these points precisely reflect our current circumstances.
No, there is plenty of room between the current circumstances and the bottom. If we curtail the current high end too much, we might be back to Eliezer’s old threat model of “an unknown random team creates a fooming AI in a garage”.
Just like there is plenty of room between a legal pharmacy and the black market for pain relievers (even when the name of the drug is the same).
It’s very easy to make things worse.
It may not even be possible to accidentally make these two things worse with regulation.
It’s probably possible. But regulation is often good, and we do need more regulation for AI.
In this post we are not talking about regulation, we are talking about prohibition, which is a different story.
Replace “unknown random team” with “the US military” and “a garage” with “a military base”, and you would be correct. There is no incentive for militaries to stop building autonomous drones/AGI.
Militaries are certainly doing that, I agree.
However, I am not sure they are creative enough, and relaxed enough about control, to try to build seriously self-modifying systems. They also don’t mind spending tons of money and allocating large teams, so they might not be aiming for artificial AI researchers all that much. And they are afraid of losing control (they know how to control people, but artificial self-modifying systems are something else).
A team in a garage, by contrast, is creative, short on resources, and quite interested in creating a team of artificial co-workers to help them (success at that automatically leads to a serious recursive self-improvement situation). Such a team might also not hesitate to try other recursive self-improvement schemas (we are seeing more and more descriptions of novel recursive self-improvement schemas in recent publications), so they might end up with a foom even before they build more conventional artificial AI researchers; a sufficiently powerful self-referential metalearning schema might be enough for that. The typical experience so far is that these recursive self-improvement schemas saturate disappointingly early, so teams will keep pushing harder to prevent premature saturation, and someone might succeed too well.
Basically, having “true AGI” means being able to create competent artificial AI researchers, and those are sufficient for very serious recursive self-improvement capabilities. But one might also obtain drastic recursive self-improvement capabilities well before achieving anything like “true AGI”. “True AGI” is sufficient to start a far-reaching recursive self-improvement, but there is no reason to think that “true AGI” is necessary for it (being more persistent at hacking the currently crippled self-improvement schemas, and at studying ways to improve them, might be enough).
I expect AGI within 5 years. I give it a 95% chance that if an AGI is built, it will self-improve and wipe out humanity. In my view, the remaining 5% depends very little on who builds it. Someone who builds AGI while actively trying to end the world has almost exactly as much chance of doing so as someone who builds AGI for any other reason.
There is no “good guy with an AGI” or “marginally safer frontier lab.” There is only “oops, all entity smarter than us that we never figured out how to align or control.”
If just the State of California suddenly made training runs above 10^26 FLOP illegal, that would be a massive improvement over our current situation on multiple fronts: it would significantly inconvenience most frontier labs for at least a few months, and it would send a strong message around the world that it is long past time to actually start taking this issue seriously.
Being extremely careful about our initial policy proposals doesn’t buy us nearly as much utility as being extremely loud about not wanting to die.
I expect AGI within 5 years. I give it a 95% chance that if an AGI is built, it will self-improve and wipe out humanity. In my view, the remaining 5% depends very little on who builds it. Someone who builds AGI while actively trying to end the world has almost exactly as much chance of doing so as someone who builds AGI for any other reason.
There is no “good guy with an AGI” or “marginally safer frontier lab.” There is only “oops, all entity smarter than us that we never figured out how to align or control.”
So what do you allocate the remaining 5% to? No matter who builds the AGI, there’s a 5% chance that it doesn’t wipe out humanity because… what? (Or is it just model uncertainty?)
Yes, that’s my model uncertainty.
I do expect a foom (via AGI or via other route), and my timelines are much shorter than 5 years.
But algorithms for AI are improving faster than hardware (people seem to quote a doubling in compute efficiency approximately every 8 months), so if one simply bans training runs above fixed compute thresholds, one trades a bit of extra time before a particular capability arrives for an increase in the number of active players who reach it a bit later. Basically, this delays the most well-equipped companies a bit and democratizes the race, which is not necessarily better.
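A back-of-the-envelope sketch of that trade-off, assuming only the 8-month efficiency-doubling figure quoted above; the cap and frontier compute numbers are purely illustrative placeholders, not estimates from this discussion:

```python
import math

# Sketch of how long a fixed compute cap delays a given capability,
# assuming algorithmic efficiency doubles roughly every 8 months
# (the figure quoted above). The FLOP numbers are purely illustrative.

DOUBLING_MONTHS = 8        # assumed efficiency-doubling period
cap_flop = 1e26            # hypothetical legal cap on training compute
frontier_flop = 1e27       # hypothetical compute a frontier lab would otherwise use

# Each doubling halves the compute needed for the same capability,
# so the delay is the time it takes efficiency gains to close the gap.
gap = frontier_flop / cap_flop                  # 10x in this illustration
delay_months = DOUBLING_MONTHS * math.log2(gap)

print(f"compute gap: {gap:.0f}x")
print(f"delay bought by the cap: ~{delay_months:.0f} months")
# ~27 months here; after that, anyone under the cap (not just the
# best-equipped labs) can reach the same capability.
```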
We can make the bans progressively tighter, so we can buy some time, but as the algorithms progress further, it is quite possible that we will at some point face a choice between banning computers altogether and facing a foom. So eventually we are likely going to face huge risks anyway.
I do think it’s time to focus not on “aligning” or “controlling” self-modifying ecosystems of self-modifying super-intelligences, but on figuring out how to increase the chances that a possible foom goes well for us instead of killing us. I believe that thinking only in terms of “aligning” or “controlling” limits the space of possible approaches to AI existential safety, and that approaches not based on notions of “alignment” and “control” might be more fruitful.
And, yes, I think our chances are better if the most thoughtful of practitioners achieve that first. For example, Ilya Sutskever’s thinking on the subject has been very promising (which is why I tend to favor OpenAI if he continues to lead the AI safety effort there, but I would be much more skeptical of them otherwise).
It doesn’t matter how promising anyone’s thinking has been on the subject. This isn’t a game. If we are in a position such that continuing to accelerate toward the cliff and hoping it works out is truly our best bet, then I strongly expect that we are dead people walking. Nearly 100% of the utility is in not doing the outrageously stupid dangerous thing. I don’t want a singularity, and I absolutely do not buy the fatalistic ideologies that declare it inevitable while their proponents actively shovel coal into Moloch’s furnace.
I physically get out into the world to hand out flyers and tell everyone I can that the experts say the world might end soon because of the obscene recklessness of a handful of companies. I am absolutely not the best person to do so, but no one else in my entire city will, and I really, seriously, actually don’t want everyone to die soon. If we are not crying out and demanding that the frontier labs be forced to stop what they are doing, then we are passively committing suicide. Anyone who has a P(doom) above 1% and debates the minutiae of policy but hasn’t so much as emailed a politician is not serious about wanting the world to continue to exist.
I am confident that this comment represents what the billions of normal, average people of the world would actually think and want if they heard, understood, and absorbed the basic facts of our current situation with regard to AI and doom. I’m with the reasonable majority who say when polled that they don’t want AGI. How dare we risk murdering every last one of them by throwing dice at the craps table to fulfill some childish sci-fi fantasy.
Like I said, if we try to apply forceful measures, we might delay it for some time. The price is that people keep aging and dying of old age and illness, to the tune of tens of millions per year for as long as the delay continues. We might think the risks are so high that this price is worth paying, and even that the price of everyone alive today eventually dying of old age is worth paying. Some of us might disagree and say that taking the risk of foom is better than the guaranteed eventual death from old age or other illness; there is room for disagreement about which of these risks it is preferable to take.
But if we are talking about avoiding foom indefinitely, we should start by asking ourselves how easy or difficult it is to achieve. How long before a small group of people equipped with home computers can create a foom?
And the results of this analysis are not pretty. Computers are the ultimate self-modifying devices: they can produce the code that programs them. Methods for producing computer code much better than we do today are not all that difficult; they are just in a collective cognitive blind spot, like backpropagation was for a long time, like ReLU activations were for decades, like residual connections in neural networks were for an unreasonably long time. But this state of affairs, with those enhanced methods of producing new computer code remaining relatively unknown, will not last forever.
And just like backpropagation, ReLU, or residual connections, these methods are not all that difficult. It’s not as if a single “genius” discovering them and refraining from sharing them would keep them unknown: people keep rediscovering these methods over and over, because they are not that tricky (backpropagation was independently discovered more than 10 times between 1970 and 1986 before people stopped ignoring it and started to use it).
It’s just that the memetic fitness of those methods is currently low, just as the memetic fitness of backpropagation, ReLU, and residual connections used to be low in the strangest possible ways. But this will not last: the understanding of how to achieve competent self-improvement in small-scale software on ordinary laptops will gradually form and proliferate. At the end of the day, we’ll have to either phase out universal computers (at least those as powerful as our home computers today) or find ways to control them very tightly, so that people are no longer free to program their own computers as they see fit.
Perhaps humans will choose to do that, I don’t know. Nor do I know whether they would succeed in a Butlerian jihad of this kind, or whether some of the side effects of trying to get rid of computers would become X-risks themselves. In any case, in order to avoid foom indefinitely, people will have to take drastic, radical measures which would make everyone miserable, would kill a lot of people, and might create other existential risks.
I think it’s safer if the most competent leading group tries to go ahead; our eventual chances of survival are higher along this path than along the likely alternatives.
I do think that the risks on the alternative paths are very high too: the great powers keep inching towards a major nuclear confrontation, and we are enabling more and more people to create diverse super-covid-like artificial pandemics with 30% mortality or more. Things are pretty bad in terms of the major risks this civilization is facing. Instead of asking “what’s your P(doom)?”, we should be asking people, “what’s your P(doom) conditional on foom, and what’s your P(doom) conditional on no foom happening?”. My P(doom) is not small, but is it higher conditional on foom than conditional on no foom? I don’t know...
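For concreteness, the conditional framing suggested above is just the law of total probability; here is a minimal sketch, with purely hypothetical numbers standing in for anyone’s actual estimates:

```python
# Law of total probability: the overall P(doom) decomposes over whether a foom happens.
# All numbers below are hypothetical placeholders, not estimates from this discussion.

def p_doom(p_foom: float, p_doom_given_foom: float, p_doom_given_no_foom: float) -> float:
    """P(doom) = P(foom) * P(doom | foom) + (1 - P(foom)) * P(doom | no foom)."""
    return p_foom * p_doom_given_foom + (1.0 - p_foom) * p_doom_given_no_foom

# If the two conditional estimates are close, the unconditional number by itself
# says little about whether foom raises or lowers the overall risk.
print(p_doom(p_foom=0.8, p_doom_given_foom=0.5, p_doom_given_no_foom=0.4))  # 0.48
```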