>then it just needs to find one poor schmuck to accept deliveries and help it put together its doomsday weapon.
Yes, but do I take it for granted that an AI will be able to manipulate a human into creating a virus that will kill literally everyone on Earth, or at least enough people to let the AI enact some secondary plan to take over the world? Without being detected? Not with anywhere near 100% probability. I just think these sorts of arguments should be subject to Drake equation-style reasoning, where multiplying out the probability of each required step dilutes the likelihood of doom under most circumstances.
This isn’t an argument for being complacent. But it does allow us to push back against the idea that “we only have one shot at this.”
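To make the Drake-equation-style point concrete, here's a toy sketch in Python. Every step and every probability below is a made-up placeholder chosen for illustration, not an estimate from this discussion:

```python
# Hypothetical decomposition of a single takeover plan into conjunctive steps.
# All probabilities are invented placeholders, not real estimates.
steps = {
    "AI decides to pursue takeover": 0.9,
    "designs a workable pathogen": 0.5,
    "recruits an unwitting human helper": 0.8,
    "synthesis and release go undetected": 0.4,
    "release incapacitates enough people": 0.3,
}

# The plan succeeds only if every step succeeds, so multiply them out.
p_plan = 1.0
for step, p in steps.items():
    p_plan *= p

print(f"P(this particular plan succeeds) ~= {p_plan:.3f}")  # ~0.043
```

Even when each individual step is given a fairly generous probability, the conjunction shrinks quickly, which is the sense in which this style of reasoning "dilutes" the likelihood of any one plan working.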
I mean, the human doesn’t have to know that they’re creating a doomsday virus. The AI could be promising them a cure for their daughter’s cancer, or something.
Or just promising the human some money, with the sequence of actions set up to obscure that anything important is happening. (E.g., you can use misdirection like ‘the actually important event occurred early in the process, when you opened a test tube to add some saline and thereby allowed the contents of the test tube to start propagating into the air; the later step where you mail the final product to an address you were given, or record an experimental result in a spreadsheet and email the spreadsheet to your funder, doesn’t actually matter for the plan’.)
Getting humans to do things is really easy if they don’t know of a good reason not to. It’s sometimes called “social engineering”, and sometimes it’s called “hiring them”.
You have to weigh the conjunctive aspects of particular plans against the disjunctiveness of ‘there are many different ways to try to do this, including ways we haven’t thought of’.
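To illustrate that disjunctive counterweight with another toy sketch: even if each particular plan is unlikely, many distinct attempts add up. The per-plan probability and plan counts below are made-up placeholders, and the independence assumption is a strong one (correlated failures would weaken the effect):

```python
# If the AI can try N distinct plans, each succeeding with probability p,
# and the attempts are (strongly assumed) independent, then
# P(at least one succeeds) = 1 - (1 - p)**N.
p_single = 0.04  # placeholder per-plan probability, roughly the sketch above

for n_plans in (1, 10, 50, 100):
    p_any = 1 - (1 - p_single) ** n_plans
    print(f"{n_plans:>3} plans -> P(at least one succeeds) ~= {p_any:.2f}")
# Output: ~0.04, ~0.34, ~0.87, ~0.98
```

So conjunctive reasoning about any single plan and disjunctive reasoning about the space of plans pull the overall estimate in opposite directions, and the toy numbers show how quickly the disjunction can dominate.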