“Just answer my questions accurately! How do I reduce the number of human deaths in the future as much as possible?”
“Insert the following gene into your DNA: GACTGAGTACTTGCTGCTGGTACGGATGCTA...”
So, do you do it? Do you trust everyone else not to do it? Can you guess what will happen if you’re wrong?
You imagine an Oracle AI as safe because it won’t act on the world, but anyone building an Oracle AI will do so with the express purpose of affecting the world! Just sticking a far less intelligent component (the human who reads the answers and acts on them) into that action loop is unlikely to make it any safer.
Even if nobody inadvertently asks the Oracle any trick questions, there’s a world of pitfalls buried in the superficially simple word “accurately”.
Any method that prevents any more children being created and quickly kills off all humans will satisfy that request.
You are deliberately casting him in a bad light!
If I want to reduce the number of human deaths from this point forward, I just need to stop people from creating new people, period. Destroying the living population doesn’t improve anything either: the people alive now will die sooner or later no matter what (heat death/big crunch/accumulated bad luck), and their deaths all count as “future deaths” from the moment of the answer. At most, applying exponential discounting makes us want to put those deaths off.
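To spell out that arithmetic, here is a toy sketch (my own illustration; the population size, lifespan, and discount rate are all made up): everyone alive or ever born dies exactly once, so without discounting the death count depends only on how many people are ever born, and discounting merely rewards pushing the deaths later.

```python
from math import exp

def discounted_deaths(death_times_in_years, rate=0.03):
    """Sum of deaths, each weighted by exp(-rate * years_from_now)."""
    return sum(exp(-rate * t) for t in death_times_in_years)

population = 1000  # made-up numbers, purely for illustration

# Scenario A: wipe everyone out immediately (all deaths at t = 0).
kill_now = discounted_deaths([0.0] * population)

# Scenario B: sterilise everyone; the same people die of old age spread
# over the next 80 years, and nobody new is ever born afterwards.
sterilise = discounted_deaths([80.0 * i / population for i in range(population)])

print(kill_now, sterilise)  # 1000.0 vs roughly 380
```

With the discount rate set to zero both scenarios score exactly 1000, which is the “destruction does not improve anything” point; with any positive discount rate, deferring the deaths scores better, so the bloodbath answers aren’t even the literal optimum.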
Fair enough, the AI could modify every human’s mind so that none of them wish to reproduce, but it would be easier to terminate the lot of them and eliminate the risk entirely.
Easier, maybe. The better way is to non-destructively change living humans so that they become reproductively incompatible with Homo sapiens. No deaths this time, and we can claim that this intelligent species has no humans among it. The stupid creature at the terminal may even implement this one, unlike all the bloodbath solutions.
I declare that your new species name is ‘Ugly Bags of Mostly-Water’. There you go, no more human deaths. I’m sure humanity would like that better than genocide, but the UBMWs will then ask the equivalent question.
Hm, sterilising the humans and then declaring them (because of the reproductive incompatibility) a new species. The UBMWs will simply get the answer that nothing can change the number of deaths.
Yep, “accurately” (or more precisely, “informatively and mostly accurately”) is a challenge. We look into it a bit in our paper: http://www.aleph.se/papers/oracleAI.pdf
I don’t just do it; I ask follow-up questions, like what the effects are in more detail. If I am unfortunate, I ask something like “how could I do that?” and get an answer like “e-mail the sequence to a university lab, along with this strangely compelling argument”, and then I read the strangely compelling argument, which is included as part of the answer.
So if a goal-directed AI can hack your mind, it is pretty easy to accidentally ask the Oracle AI a question whose answer will do the same thing. Even if you can avoid that, you still need to ask lots of questions before implementing its solution, so that you have a good idea of what you are actually doing.