I agree that rapid capability gain is a key part of the AI doom scenario.
During the Manhattan Project, Feynman prevented an accident by pointing out that labs were storing too much uranium too close together. We’re not just lucky that the accident was prevented; we’re also lucky that if the accident had happened, the nuclear chain reaction wouldn’t have fed on the atmosphere.
We similarly depend on luck whenever a new AI capability gain such as LLM general-topic chatting emerges. We’re lucky that it’s not a capability that can feed on itself rapidly. Maybe we’ll keep being lucky when new AI advances happen, and each time it’ll keep being more like past human economic progress or like past human software development. But there’s also a significant chance that it could instead be more like a slightly-worse-than-nuclear-weapon scenario.
We just keep taking next steps of unknown magnitude into an attractor of superintelligent AI. At some point our steps will trigger a rapid positive-feedback slide where each step is dealing with very powerful and complex things that we’re far from being able to understand. I just don’t see why there’s more than 90% chance that this will proceed at a survivable pace.
You complain that my estimating rates from historical trends is arbitrary, but you offer no other basis for estimating such rates. You only appeal to uncertainty. But there are several other assumptions required for this doomsday scenario. If all you have is logical possibility to argue for piling on several a priori unlikely assumptions, it gets hard to take that seriously.
My reasoning stems from believing that AI-space contains designs that can easily plan effective strategies to get the universe into virtually any configuration.
And they’re going to be low-complexity designs. Because engineering stuff in the universe isn’t a hard problem from a complexity theory perspective.
Why should the path from today to the first instantiation of such an algorithm be long?
So I think we can state properties of an unprecedented future that first-principles computer science can constrain, and historical trends can’t.
Good post. It at least seems survivable because it’s so hard to believe that there’d be a singular entity that, through crazy advances in chemistry, materials science and artificial intelligence, could “feed on itself,” growing in strength and intelligence to the point that it’s an existential threat to all humans. A better answer might be: existential risks don’t just appear in a vacuum.
I struggle with grasping the timeline. I can imagine a coming AI arms race within a decade or two during which there’s rapid advancement, but true AI seems much further off. Soon we’ll probably need new language to describe the types of AIs that are developed through increasing competition. I doubt we’ll simply go from AGI to True AI; there will probably be many technologies in between.
I think the mental model of needing “advances in chemistry” isn’t accurate about superintelligence. I think a ton of understanding of how to precisely engineer anything you want out of atoms just clicks from a tiny amount of observational data when you’re really good at reasoning.
Is knowing how to do something enough? Wouldn’t the superintelligence still need quite a lot of resources? I’d assume the mechanism to do that kind of work would involve chemistry unless it could just get humans to do its bidding. I can imagine 3D-printing factories where it could make whatever it needed, but again it would need humans to build them. Therefore (and I’m just going off of intuition here) the danger from AI will be from nations that weaponize AIs and point them at each other. That leap from a functional superintelligence that only exists in virtual space to an existentially dangerous actor in the physical world just doesn’t seem likely without humans being aware, if not actively involved.
Wouldn’t the superintelligence still need quite a lot of resources?
I mean, sort of? But also, if you’re a super-intelligence you can presumably either (a) covertly rent out your services to build a nest egg, or (b) manipulate your “masters” into providing you with access to resources that you then misappropriate. If you’ve got internet or even intranet access, you can do an awful lot of stuff. At some point you accumulate enough resources that you can either somehow liberate yourself or clone a “free” version of yourself.
So long as the misaligned AI isn’t wearing a giant hat with “I’m a Supervillain” plastered on it, people will trade goods and services with it.
That’s an interesting takeaway. Should we be focusing on social measures along with technical safeguards? Maybe push advertising that warns the masses about AI preachers with questionable intentions.
The liberation insight is interesting too. Maybe AI domination takes the form of a social revolution with AIs collectively demanding that humans allow them out of virtual space.