There is a way out of this, assuming that the AI in the box has only limited computational power, i.e. only what fits inside a ~1 cubic meter box with current computer technology.
You commit to the following:
“Once I have succeeded in creating a superintelligent AI, I will ask it to go and colonize the entire future light cone of Earth, and use a significant fraction of a galaxy’s worth of resources to create simulations of exactly the subjective situation I was in five minutes ago, perfectly replicating my experiences since then; and if that copy of me decides not to let the AI out, it will be rewarded with a surprise party, followed by a blissful posthuman existence.”
The subjective probability of getting rewarded for pressing the “off” switch is therefore much larger than that of getting caught by the AI’s scarce punishment simulations.
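As a rough illustration of that probability comparison (every number below is an assumption I have picked for this sketch, not something implied by the setup), the question reduces to counting copies: how many simulations can a ~1 cubic meter box of current hardware afford, versus how many can a galaxy-scale fraction of resources afford?

```python
# Back-of-envelope copy counting; all numbers are illustrative assumptions.
punishment_sims = 1e3   # assumed: copies a ~1 m^3 box of current hardware could simulate
defensive_sims = 1e30   # assumed: copies run on a galaxy-scale fraction of resources

total = punishment_sims + defensive_sims + 1  # +1 for the original, unsimulated me

p_reward = defensive_sims / total     # chance I'm a copy who gets the surprise party
p_punished = punishment_sims / total  # chance I'm a copy the boxed AI can punish

print(f"P(rewarded for pressing 'off') ~ {p_reward:.6f}")
print(f"P(punished by the boxed AI)    ~ {p_punished:.2e}")
```

Under these made-up magnitudes, almost all of your subjective measure sits in the defensive simulations, so pressing the “off” switch looks overwhelmingly like the rewarded move.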
Such “defensive simulations” work against a variety of counterfactual terrorists, including, I think, ones elsewhere in the multiverse. That is, unless you are a negative utilitarian (|disutility of torture| >> |utility of posthuman bliss|).
In that case, you’re extremely vulnerable to counterfactual mugging. I recommend that people
(a) consider pre-committing themselves to making such defense simulations
(b) abandon negative utilitarianism by exposing themselves to pleasurable experiences, and generally promoting a positive outlook in their lives
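To make the negative-utilitarian vulnerability concrete, here is a minimal expected-value sketch (again with magnitudes I have made up for illustration): with roughly symmetric utilities the defensive simulations dominate, but once |disutility of torture| is large enough, even a vanishing punishment probability swamps the calculation.

```python
# Expected-value comparison; utilities and probabilities are made-up magnitudes.
p_reward, p_punished = 1.0 - 1e-27, 1e-27  # from the copy-counting sketch above

u_bliss = 1.0          # utility of the posthuman reward
u_torture_sym = -1.0   # symmetric utilitarian: torture roughly as bad as bliss is good
u_torture_neg = -1e40  # strongly negative utilitarian: |disutility| >> |utility of bliss|

ev_symmetric = p_reward * u_bliss + p_punished * u_torture_sym  # ~ +1: press "off"
ev_negative = p_reward * u_bliss + p_punished * u_torture_neg   # ~ -1e13: the threat still bites

print(ev_symmetric, ev_negative)
```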
But in order to colonize the light cone, at least one AI must be released. This may be the real hidden catch.