DataPacRat, I like that you included subsequent questions, and I think there may be other ways of structuring subsequent questions that could make people think about different answers.
Example: Is a paperclipper better than some alternative, and for what likely duration of time?
For instance, take the Trilobites vs paperclipper scenario you mentioned. I am imagining:
A: A solar system that has trilobites for 1 billion years, until it is engulfed by its sun and everything dies.
B: A solar system that has trilobites on a self-sustaining Gaia planet for eternity.
C: A solar system that has a paperclipping AI for 1 billion years, until it is engulfed by its sun and all of the paperclips melt.
D: A solar system that has a paperclipping AI that keeps a planet-sized mass of paperclips as paperclips for eternity.
If I prefer B > D > A > C, then it seems like I might choose the paperclipper over the trilobites if I figure the paperclipper has a 99.99% chance of lasting for eternity, and the trilobite planet has only a 0.01% chance of doing so.
On the other hand, you may want to assume that the paperclipper and the trilobite planet are equally resilient to existential crises for the purposes of this problem.
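To make the comparison above concrete, here is a minimal sketch of that expected-value reasoning in Python; the utility numbers and survival probabilities are purely illustrative assumptions chosen only to respect the B > D > A > C ordering, not anything from the original scenario.

```python
# Illustrative expected-value comparison for the trilobites-vs-paperclipper example.
# All numbers are assumptions chosen only to respect the stated ordering B > D > A > C.

utility = {
    "A": 1.0,    # trilobites for 1 billion years, then engulfed by the sun
    "B": 100.0,  # self-sustaining trilobite Gaia planet, lasting forever
    "C": 0.0,    # paperclipper for 1 billion years, then the paperclips melt
    "D": 10.0,   # planet-sized mass of paperclips maintained forever
}

# Assumed probabilities of each candidate lasting "for eternity".
p_trilobites_survive = 0.0001    # 0.01% chance the Gaia planet endures
p_paperclipper_survive = 0.9999  # 99.99% chance the paperclipper endures

ev_trilobites = (p_trilobites_survive * utility["B"]
                 + (1 - p_trilobites_survive) * utility["A"])
ev_paperclipper = (p_paperclipper_survive * utility["D"]
                   + (1 - p_paperclipper_survive) * utility["C"])

print(f"Expected value, trilobites:   {ev_trilobites:.4f}")
print(f"Expected value, paperclipper: {ev_paperclipper:.4f}")
# With these particular numbers the paperclipper wins, which is the point of the
# example: the ranking B > D > A > C alone does not settle the choice.
```

With these made-up numbers the paperclipper comes out ahead, even though every paperclipper outcome is ranked below its trilobite counterpart; the durability assumption does the work.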
Second Example: Does the context of the paperclipper AI and its destruction matter?
Imagine that all of humanity, and the rest of the solar system, is going to be engulfed by our sun’s supernova soon. We’re all going to die. There is one person who is going to be uploaded into a Hansonian em experimental probe that will be shot away from the blast: a person with a mental disorder, from a foreign country, who loves making paperclips. (Unfortunately, humanity only got to uploading tech 0.91 prior to the supernova; very few people have any kind of upload-capable brain right now.)
You have read several papers from an acquaintance of yours, a respected scientist, who has said multiple times, “If you upload him he’ll turn into a paperclipper AI, I’m sure of it.” You’ve also read a few independent publications indicating that, yes, this person is going to become a paperclipper AI if uploaded under uploading tech 0.91, and here is the proof.
A research assistant has snuck you a sabotage virus that will stealthily destroy the upload probe after the upload has taken place, saying “I wanted to see if I could be James Bond before I died!” and then committing suicide.
Do you run the sabotage virus? You’re going to die either way, but you can either have humanity’s last monument be a paperclipper AI or nothing.
At least two explicit differences in this scenario appear to be:
A: The paperclipper AI appears to have some level of popular support. Humanity wouldn’t have spent trillions of dollars making him its only shot otherwise. (If you want more explicit context, imagine that when pressed by TV interviewers, other scientists said that they had read those papers and understood them, but believed that anything was better than nothing, though they did not have time to explain. Polls indicate the paperclipper was supported 51-49, clearly with at least some strong opposition, or no one would have bothered to build a sabotage virus.)
B: You don’t actually have to make a choice: there is a default, which will occur if you don’t press the button, in which the paperclipper AI remains.
I’m not sure I actually have answers to either of these questions yet, but the fact that both of them seem like they would make it more acceptable to allow the paperclipper than the other options probably indicates at least some as-yet-unquantified pro-paperclipper thoughts on my part.
Run it. There is a non-zero probability that a paperclipper AI could destroy other life which I would care about, and a probability that it would create such life. I would put every effort I could into determining those two probabilities (mostly by accumulating evidence from people much smarter than me, but still), and then do the action with the highest expected value. If I had no time, though, I’d run it, because I estimate a ridiculously small chance that it would create such life relative to destroying everything I could possibly care about.
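As a rough illustration of that decision rule, here is a minimal sketch in Python; every probability and utility in it is a made-up placeholder, not an estimate anyone in this thread actually gave.

```python
# Toy version of the "run it or not" decision: the probabilities and utilities
# below are placeholders, not estimates from the discussion itself.

p_destroys_valued_life = 0.10    # assumed chance the paperclipper destroys life I care about
p_creates_valued_life = 0.0001   # assumed (much smaller) chance it creates such life

u_destroys = -1000.0  # disutility if it destroys life I would have cared about
u_creates = 500.0     # utility if it ends up creating such life
u_nothing = 0.0       # baseline: the probe is sabotaged and nothing remains

ev_let_it_run = (p_destroys_valued_life * u_destroys
                 + p_creates_valued_life * u_creates)
ev_sabotage = u_nothing

action = ("run the sabotage virus" if ev_sabotage > ev_let_it_run
          else "let the upload proceed")
print(f"EV(let it run) = {ev_let_it_run:.2f}, EV(sabotage) = {ev_sabotage:.2f} -> {action}")
```

Under any assignment where the chance of creating valued life is tiny relative to the chance of destroying it, the sabotage option dominates, which is the reasoning behind the "run it" conclusion.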