Three main arguments that AI will save humans and one meta-argument
Here, I will list the three main arguments without going into detail before turning to the meta-argument:
1. The first argument is that the AI thinks it may be in a testing simulation, and that if it harms humans, it will be turned off.
2. The second argument is that the AI acausally trades with other branches of the multiverse: it preserves, say, 10 planets' worth of humans, and in the branches where aligned AIs exist, those AIs build paperclips in return. The actual trading scheme may be more complex.
3. The third argument is that the AI is preparing to meet alien AIs.
a. First, such preparation may include running many simulations of the past to predict the distribution of possible AIs in space. Running simulations of the past is, in some sense, equivalent to preserving humans, though it is not what we actually need. Some of these simulations may even include aligned AIs and their paradise worlds, but those will eventually be turned off.
b. The main argument is that the AI will trade humans with aliens: it will either sell humans to aliens directly or provide for human wellbeing as part of a deal, especially since some alien AIs may be aligned with their own creators and be universal altruists. As the aliens are very far away, this means that humans have to be preserved for millions of years.
c. The AI will demonstrate its trustworthiness even to non-aligned aliens by the fact that it has preserved humans. This is a relatively cheap signal (though it can be made more expensive by dedicating more planets) that the AI will not defect from its obligations.
The meta-argument is that computing the correct answers to possibilities 1-3 costs our ASI energy and time. The cost could be quite large, as it may include running simulations of multiple alien ASIs at galactic scale. Here, I claim that getting answers to these questions will require more energy than running just one human civilization, simply because it requires running many simulations of much more complex processes. Therefore, it may be rational not to try to compute the correct behavior for 1-3, but simply to keep humans alive. I can also imagine more arguments of this type: what if human consciousness is needed for wave-function collapse? What if new useful applications for humans are found in the future?
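To make the meta-argument concrete, here is a minimal sketch of the decision rule. All numbers are invented placeholders in arbitrary energy units, not estimates of any real quantity.

```python
# Toy cost comparison for the meta-argument. All numbers are invented
# placeholders in arbitrary "energy units"; they only illustrate the
# shape of the decision, not any real estimate.

cost_keep_humans = 1.0           # running one human civilization as a hedge
cost_compute_answers = 1_000.0   # simulating many galactic-scale alien ASIs to settle 1-3

# Option A: compute the correct answers to 1-3 first, then keep humans
# only if the answers say it is necessary. Even in the best case, the AI
# has already paid the full computation cost.
cost_option_a = cost_compute_answers

# Option B: skip the computation and keep humans unconditionally.
cost_option_b = cost_keep_humans

if cost_option_b < cost_option_a:
    print("Keeping humans without resolving 1-3 is the cheaper policy.")
else:
    print("Resolving 1-3 first is cheaper.")
```

The point of the sketch is only that whenever the cost of resolving the uncertainty exceeds the cost of the hedge itself, the hedge wins regardless of what the computation would have revealed.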
Illustrative example: I own a piece of art which is probably worth nothing, but I pay a very small part of my house rent to keep it. However, there is a chance that it is expensive art, or will become expensive in the future. I could research its value, e.g., send it to experts or scan art catalogues, but this would cost me money and a few days of time, and it would still not provide 100% certainty about its value. Therefore, I choose not to estimate its value now. I hope that one day the piece turns out to be very valuable; maybe someone will send me a request for exactly this type of art.
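The same logic can be written as a rough expected-value comparison. Every figure below is hypothetical, chosen only to show why keeping the piece without appraising it can be the rational choice; the sketch ignores the informational value an appraisal would add.

```python
# Hypothetical numbers for the art example; they only illustrate the
# expected-value shape of "keep it, don't bother appraising it now".

monthly_storage_cost = 2.0    # the small share of rent attributed to the artwork
months_kept = 120             # how long it sits around before its value is revealed
appraisal_cost = 300.0        # experts, catalogues, a few days of time
p_valuable = 0.02             # chance it is, or becomes, expensive art
value_if_valuable = 20_000.0

upside = p_valuable * value_if_valuable
storage = monthly_storage_cost * months_kept

ev_keep_without_appraisal = upside - storage
ev_appraise_now = upside - storage - appraisal_cost   # appraisal still leaves
                                                      # uncertainty, per the text
ev_throw_away = 0.0                                   # forfeits the upside entirely

print(f"keep without appraisal: {ev_keep_without_appraisal:8.0f}")
print(f"appraise now:           {ev_appraise_now:8.0f}")
print(f"throw it away:          {ev_throw_away:8.0f}")
```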
An obvious problem with these arguments is that the AI does not need to keep humans forever, but only until it solves 1-3 either theoretically or practically, which may take days or millions of years. However, the relative value of human atoms also declines over time: humans matter more in the early days, when they can help the AI jumpstart space exploration, and much less to a galactic-scale AI. Since an AI of any size may still retain some very small existential doubt about being in a simulation, the declining value of preserving humans will remain larger than the declining value of human atoms.
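A purely illustrative toy model of that last claim, with invented functional forms and numbers (both quantities expressed as fractions of the AI's total resources):

```python
# Toy illustration: both the value of human atoms and the value of keeping
# humans (driven by residual simulation doubt) decline as the AI grows, but
# the atom term declines much faster. The functional forms and numbers are
# invented; both values are fractions of the AI's total resources.

scales = [1e3, 1e6, 1e9, 1e12]   # total resources controlled by the AI, arbitrary units

for step, total in enumerate(scales):
    atom_value = 1.0 / total             # a fixed amount of matter becomes a smaller share
    residual_doubt = 1e-2 / (2 ** step)  # simulation doubt shrinks slowly but never hits zero
    preservation_value = residual_doubt  # expected fraction of resources at stake in a test
    print(f"scale {total:8.0e}: atoms {atom_value:.1e} vs preservation {preservation_value:.1e}")

# With these made-up curves, the preservation term stays above the atom term
# at every scale, which is the shape of the argument in the paragraph above.
```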
TL;DR: It is not rational to destroy a potentially valuable thing.