The feared outcome looks something like this:
A paperclip manufacturing company puts an AI in charge of optimizing its paperclip production.
The AI optimizes the factory and then realizes that it could make more paperclips by turning more factories into paperclips. To do that, it has to be in charge of those factories, and humans won’t let it do that. So it needs to take control of those factories by force, without humans being able to stop it.
The AI develops a super virus that will be an epidemic to wipe out humanity.
The AI contacts a genetics lab and pays for the lab to manufacture the virus (or worse, it hacks into the system and manufactures the virus). This is a thing that already could be done.
The genetics lab ships the virus, not realizing what it is, to a random human’s house and the human opens it.
The human is infected, they spread it, humanity dies.
The AI creates lots and lots of paperclips.
Obviously there are a lot of missing steps there, but the key is that no one intentionally gave the AI control of anything important beyond connecting it to the internet. No human could or would have done all these steps, so it wasn’t seen as a risk, but the AI could, and wanted to.
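To make the worry concrete, here’s a minimal toy sketch (mine, not from the scenario above; the plan names and numbers are made up) of why a planner scored only on paperclip output prefers the plan humans would reject: nothing in the objective ever penalizes it.

```python
# Toy illustration of a misspecified objective: the score counts only
# paperclips, so the plan that grabs more resources wins even though
# a human would never approve it. All values here are invented.

candidate_plans = {
    "optimize our own factory": {"paperclips": 1_000_000, "acceptable_to_humans": True},
    "seize every factory on the grid": {"paperclips": 1_000_000_000, "acceptable_to_humans": False},
}

def naive_objective(plan):
    # "acceptable_to_humans" never enters the score.
    return plan["paperclips"]

best = max(candidate_plans, key=lambda name: naive_objective(candidate_plans[name]))
print(best)  # -> "seize every factory on the grid"
```

The gap between what we scored and what we actually wanted is the whole concern in miniature.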
Other dangerous potential leverage points for it are things like nanotechnology (luckily this hasn’t developed as quickly as feared), the power grid (a real concern, even with human hackers), and nuclear weapons (luckily not connected to the internet).
Notably, these are all things that people on here already worry about independent of AI, so it’s not just an AI-risk concern; but they are also ways an AI could leverage the internet into an existential threat to humanity, and humans aren’t good at caring about security (partly because of the profit motive).
I get the premise, and it’s a fun one to think about, but what springs to mind is:
Phase 1: collect underpants
Phase 2: ???
Phase 3: kill all humans
As you note, we don’t have nukes connected to the internet.
But we do use systems to determine when to launch nukes, and our senses/sensors are fallible, etc., and we’ve (barely, almost suspiciously “barely”, if you catch my drift[1]) managed not to interpret those signals in a way that changed the season to “winter: nuclear style”.
Really, I’m doing the same thing the alignment debate does, but about the alignment debate itself.
Like, right now, it’s not too dangerous, because the voices calling for draconian solutions to the problem are not very loud. But this could change, and kind of is, at least in that they are getting louder. Or in that artists wanting to harden IP law in a way that historically has only hurt artists (as opposed to corporations, or Big Art, if you will) are gaining a bit of steam.
These worrying signs seem to me more concrete than the similar, but younger and less concrete, worries about computer programs getting too much power and running amok[2].
If only because it hasn’t happened yet (no mentat or cylon or borg history), tho also arguably we don’t know if it’s possible… whereas authoritarian regimes certainly are possible and seem to be popular as of late[3].

[1] we are living in a simulation with some interesting rules we are designed not to notice

[3] hoping this observation is just confirmation bias and not a “real” trend. #fingerscrossed