I mean, the human doesn’t have to know that he’s creating a doomsday virus. The AI could be promising him a cure for his daughter’s cancer, or something.
Or just promising the human some money, with the sequence of actions set up to obscure that anything important is happening. (E.g., you can use misdirection like ‘the actually important event occurred early in the process, when you opened a test tube to add some saline and thereby allowed the contents of the test tube to start propagating into the air; the later step where you mail the final product to an address you were given, or record an experimental result in a spreadsheet and email the spreadsheet to your funder, doesn’t actually matter for the plan’.)
Getting humans to do things is really easy, if they don’t know of a good reason not to do them. Sometimes it’s called “social engineering”, and sometimes it’s called “hiring them”.