A lot of my thinking over the last few months has shifted from “how do we get some sort of AI pause in place?” to “how do we win the peace?”. That is, you could have a picture of AGI as the most important problem that precedes all other problems; anti-aging research is important, but it might actually be faster to build an aligned artificial scientist who solves it for you than to solve it yourself (on this general argument, see Artificial Intelligence as a Positive and Negative Factor in Global Risk). But if alignment requires a thirty-year pause on the creation of artificial scientists to work, that belief flips—now actually it makes sense to go ahead with humans researching the biology of aging, and to do projects like Loyal.
This isn’t true of just aging; there are probably something more like twelve major areas of concern. Some of them are simply predictable catastrophes we would like to avert; others are possibly necessary to be able to safely exit the pause at all (or to keep the pause going when it would be unsafe to exit).
I think ‘solutionism’ is basically the right path, here. What I’m interested in: what’s the foundation for solutionism, or what support does it need? Why is solutionism not already the dominant view? I think one of the things I found most exciting about SENS was the sense that “someone had done the work”, had actually identified the list of seven problems, and had a plan of how to address all of the problems. Even if those specific plans didn’t pan out, the superstructure was there and the ability to pivot was there. It looked like a serious approach by serious people. What is the superstructure for solutionism such that one can be reasonably confident that marginal efforts are actually contributing to success, instead of bailing water on the Titanic?
Vaniver, is it your belief that a worldwide AI pause—not one limited to a specific geographic area—is a plausible outcome? Could you elaborate on why you think it would be possible? The recent news, to me, doesn’t sound like it is in the process of happening. Almost all the news I have read has consisted of announcements consistent with an accelerating arms race, with two governance actions (the EU AI Act and the Biden executive order) that aren’t pauses. I don’t know of any historical pause on a useful technology that happened without an adjacent alternative that continued to be used.
For example, CFCs were successfully banned, but refrigerants and fire suppressants without as much ozone-layer hazard were readily available.
Bioweapons and nerve agents were semi-successfully banned, but nukes are strictly better.
Nukes were reduced in number but the superpowers keep arsenals capable of “more than 1.0 complete destruction of any enemy economic or military capacity”. Or “greater than 1.0 doomsdays”.
I could see a ban on totally untested AI systems, a ban on releasing the weights of large systems, physical-security requirements for large systems, and a ban on AI systems that can self-modify their own framework. But this wouldn’t be a ban on AGI or ASI; there are system topologies that would be just as effective at doing things for humans without the hazardous features above.
I think as the race heats up and AI becomes more and more promising, we might see national total efforts to develop AI faster. Instead of private labs and whatever VC capital they can raise it would be government funded, and a “total” effort means an effort like a total war—all available resources would be invested.
Would you please share your world model with me? What am I missing?
Vaniver, is it your belief that a worldwide AI pause—not one limited to a specific geographic area—is a plausible outcome? Could you elaborate on why you think it would be possible?
Yes, I think it’s plausible. I don’t think it’s especially likely—my modal scenario still involves everyone dying—but I think especially if you condition on success it seems pretty likely, and it makes sense to play to your outs.
The basic argument for plausibility is that 1) people are mostly awake to risks from advanced AI, 2) current power structures are mostly not enamored with AI / view it as more likely to be destabilizing than stabilizing, 3) the people pushing for unregulated AI development are not particularly charismatic or sympathetic, and 4) current great powers are pretty willing to meddle in other countries when it comes to serious national security issues.
I expect pauses to look more like “significant regulatory apparatus” than “ban”; the sort of thing where building new nuclear plants was legal with approval and yet it takes decades to get NRC approval. Probably this involves a significant change in how chips are constructed and sold. [I note that computer hardware seems like an area where people are pouring gas onto the race instead of trying to slow down.]
I think as the race heats up and AI becomes more and more promising, we might see national total efforts to develop AI faster.
I think this might happen, and is >98% likely to be game over for humanity.
Ok, so your belief is that however low the odds are, it’s the only hope. And the odds are pretty low. I thought of a “dumbest possible frequentist algorithm” to estimate the odds.
The dumbest algorithm is to simply count how many times outcome A vs. outcome B has happened. For example, if the question is “how likely is a Green Party candidate to be elected president”, and it’s never happened, and there have been 10 presidential elections since the founding of the Green Party, then we know the odds are under 10 percent. Obviously the odds are much lower than that (a two-party, winner-take-all system makes the actual odds about zero), but say you are a Green Party supporter: even you have to admit, based on this evidence, it isn’t likely.
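The “dumbest possible frequentist algorithm” can be sketched in a few lines. The Laplace rule-of-succession variant is an added standard refinement, not something from the comment:

```python
# A minimal sketch of the "dumbest possible frequentist algorithm" described
# above: bound the probability of an outcome by how often it has occurred in
# past trials. The rule-of-succession version is a standard smoothed
# alternative included for comparison, not part of the original argument.

def naive_frequency(successes: int, trials: int) -> float:
    """Raw observed frequency: 0 wins in 10 elections -> 0.0."""
    return successes / trials

def rule_of_succession(successes: int, trials: int) -> float:
    """Laplace-smoothed estimate: (k + 1) / (n + 2)."""
    return (successes + 1) / (trials + 2)

# Green Party example from the text: 0 wins in 10 elections.
print(naive_frequency(0, 10))      # 0.0 -- the raw count says "never"
print(rule_of_succession(0, 10))   # ~0.083, under the 10-percent ceiling
```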
And how many times have “humans failed to build a useful weapons technology out of concern about its long-term effects”? Well, as far as I know, bioweapons research was mostly done on non-replicating bioweapons; anthrax can’t spread from person to person. The replicating ones would affect everyone, they aren’t specific enough. It would be like developing a suicide nuke as a weapon.
So it’s happened one time? And how many major weapons have humans developed in history? Even if we go by category and only count major ones, there’s bronze age, iron age, spear-throwers, Roman phalanxes, horse archers, cannon, muskets, castles, battleships, submarines, aircraft, aircraft carriers, machine guns, artillery, nukes, ballistic missiles, cruise missiles, SAMs, stealth aircraft, tanks… that’s 20 categories and I am bored.
To steelman your argument, would you say the odds are under 5 percent? Because AI isn’t just a weapon: it lets you make better medicine, mass-produce housing and consumer goods, find criminals, and so on. Frankly there is almost no category of human endeavor an AI won’t help with, versus, say, a tank, which you can’t use for anything but war.
So would you say, in your model, it works out to:
5 percent chance of multilateral AI slowdown. What are the odds of surviving in these futures? If it’s 50 percent, then that’s 2.5 percent survival here.
95 percent chance of arms race, where you think humans survive in only 2 percent of these futures. Then that’s 1.9 percent survival here.
Is this how you see it?
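The branch arithmetic in this comment can be checked in a few lines; the probabilities are the figures floated in the discussion, not established estimates:

```python
# Branch arithmetic from the comment above. All numbers are the proposed
# figures from the discussion: 5% chance of slowdown with 50% survival,
# 95% chance of arms race with 2% survival.
p_slowdown, p_survive_given_slowdown = 0.05, 0.50
p_race, p_survive_given_race = 0.95, 0.02

p_survive = (p_slowdown * p_survive_given_slowdown
             + p_race * p_survive_given_race)
print(round(p_survive, 3))  # 0.044, i.e. 2.5% + 1.9% overall survival
```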
people are mostly awake to risks from advanced AI,
Nukes have x-risk but humans couldn’t help but build them
current power structures are mostly not enamored with AI / view it as more likely to be destabilizing than stabilizing,
Each weapons advance I mentioned changed the world map and power balance. Entire powers fell as a consequence. They were destabilizing, but agreements couldn’t be reached not to build and use them. For a simple example, the British Empire benefited hugely from cannon on warships and really good sail-driven warships. What if all the other powers at that time went to the Pope and asked for a bull that firing grapeshot wasn’t Christian. Would this change anything?
In today’s world some powers are in a weaker position, and AI offers them an opportunity to move to dominance.
the people pushing for unregulated AI development are not particularly charismatic or sympathetic, and
That’s not a particularly strong argument. There are thousands of other people who aren’t pushing e/acc, but the rules of the game mean they are aligned with AI racing. The clearest incentives go to chip vendors, who stand to gain enormous revenue from AI silicon sales. Jensen Huang, Lisa Su, C. C. Wei, Patrick P. Gelsinger: they are all plenty charismatic, and they implicitly stand to gain. A lot. https://www.reddit.com/r/dataisbeautiful/comments/16u9w6f/oc_nvidias_revenue_breaks_records/
current great powers are pretty willing to meddle in other countries when it comes to serious national security issues.
This is true. Note, however, that the largest powers do what they want, and most meddling is information theft, not sabotage. The Soviets never bothered to try to sabotage US defense complexes directly because the scale made it pointless; there was too much resiliency.
AI development has a tremendous amount of inherent resilience, much more than physical-world tech. Each cluster of AI accelerators is interchangeable. Model checkpoints can be stored at many geographic locations. If you read the Gemini model card, they mention developing full determinism. This means someone could put a bomb on a TPUv5 cluster and the Google sysadmins could resume training, possibly autonomously.
The bottlenecks are in the chip fabrication tooling.
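The resilience point above rests on deterministic resume from checkpoints. A minimal toy sketch (not Google’s actual implementation; `toy_train_step` and all values are invented for illustration): if every source of randomness is captured in the checkpoint, a run killed partway through can be resumed and will reproduce exactly the states the uninterrupted run would have produced.

```python
import random

def toy_train_step(weights: float, rng: random.Random) -> float:
    """One fake 'training step': nudge the weight by a seeded random amount."""
    return weights + rng.uniform(-0.1, 0.1)

def run(steps, checkpoint=None):
    """Run training to `steps`, either from scratch or from a checkpoint."""
    if checkpoint is None:
        rng = random.Random(42)          # fixed seed: determinism starts here
        weights, start = 0.0, 0
    else:
        rng = random.Random()
        rng.setstate(checkpoint["rng"])  # restore the RNG stream exactly
        weights, start = checkpoint["weights"], checkpoint["step"]
    for _ in range(start, steps):
        weights = toy_train_step(weights, rng)
    return {"weights": weights, "step": steps, "rng": rng.getstate()}

# Uninterrupted run vs. a run "bombed" at step 50 and resumed elsewhere:
full = run(100)
resumed = run(100, checkpoint=run(50))
assert full["weights"] == resumed["weights"]  # bit-identical result
```

With full determinism the destroyed cluster contributes nothing unrecoverable; any interchangeable cluster holding the checkpoint continues from the same state.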
Roughly.

Nukes have x-risk but humans couldn’t help but build them
I think no one seriously considered the prospect of nuclear winter until well after stockpiles were large, and even now it’s not obviously an existential concern instead of merely catastrophic. If you’re talking about the ‘ignite the atmosphere’ concern, I think that’s actually evidence for voluntary relinquishment—they came up with a number where if they thought the risk was that high, they would give up on the project and take the risk of Nazi victory.
I expect the consensus estimate will be that AGI projects have risks in excess of that decision criterion, and that will motivate a halt until the risks are credibly lowered.
What if all the other powers at that time went to the Pope and asked for a bull that firing grapeshot wasn’t Christian. Would this change anything?
I assume you’re familiar with Innocent II’s prohibition on crossbows, and that it wasn’t effectively enforced. I am more interested in, say, the American/Israeli prohibition on Iranian nuclear weapons, which does seem to be effectively enforced on Earth.
The bottlenecks are in the chip fabrication tooling.
Yeah, I think it is more likely that we get compute restrictions / compute surveillance than restrictions on just AI developers. But even then, I think there aren’t that many people involved in AI development and it is within the capacities of intelligence agencies to surveil them (tho I am not confident that a “just watch them all the time” plan works out; you need to be able to anticipate the outcomes of the research work they’re doing, which requires technical competence that I don’t expect those agencies to have).
So ok, here’s some convergence.

I believe there is minimal chance that all the superpowers will simultaneously agree to a meaningful AI pause. It seems like you agree. A superpower cannot be stopped by the measures you mentioned: they will train new AI experts, build their own chip-fabrication equipment, build lots of spare capacity, and so on. Iran is not a superpower.
I think there is some dispute over what we even mean by “AGI/ASI”. I am thinking of any system that scores above a numerical threshold on a large benchmark of tasks, where the majority of the score comes from complex multimodal tasks that are withheld. AGI means the machine did at least as well as humans on a broad selection of these tasks; ASI means it beat the best human experts on a broad selection of the tasks.
Any machine able to do the above counts.
Note you can pass such tasks without situational or context awareness, ongoing continuity of existence, self-modification, or online learning. (All forms of state buildup.)
So this is a major split in our models, I think. I am thinking of an arms race that builds tool ASI without the state above, and I think you are assuming that past a certain point the AI systems will have context awareness and the ability to coordinate with each other?
Is that what drives your doom assumptions, and your assumption that people would stop? Do you think decision-makers would avoid investing in tool AI that they have high confidence they can control? (The confidence would come from controlling context: an isolated model without context can’t even know it’s not still in training.)
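The benchmark-threshold definition proposed above can be made concrete with a toy sketch. The task names, scores, and thresholds are all invented for illustration; the point is only that “AGI”/“ASI” are operationalized as score comparisons on withheld tasks, not as claims about inner properties of the system:

```python
# Toy operationalization of the AGI/ASI definition from the comment above.
# All names and numbers are hypothetical stand-ins for a real benchmark.

def classify(machine_scores, human_scores, expert_scores, majority=0.5):
    """Label a system by the fraction of withheld tasks where it matches
    median humans (AGI) or beats the best human experts (ASI)."""
    tasks = machine_scores.keys()
    matches_humans = sum(machine_scores[t] >= human_scores[t] for t in tasks)
    beats_experts = sum(machine_scores[t] > expert_scores[t] for t in tasks)
    if beats_experts / len(tasks) > majority:
        return "ASI"
    if matches_humans / len(tasks) > majority:
        return "AGI"
    return "narrow"

scores = {"vision": 0.9, "coding": 0.8, "planning": 0.6}
humans = {"vision": 0.8, "coding": 0.5, "planning": 0.7}
experts = {"vision": 0.95, "coding": 0.9, "planning": 0.9}
print(classify(scores, humans, experts))  # "AGI": matches humans on 2/3 tasks
```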