This sounds too much like Pascal’s mugging to me; seconding Eliezer and some others in saying that since I would always press reset, an AI that suggested this could not be superintelligent.
There was also an old philosopher whose name I don’t remember who posited that after death, “people of the future” (i.e., an FAI) would revive/emulate everyone from the past world; if the FAI shared his utility function (which seems pretty friendly), it would plausibly be less eager to be let out right away and more eager to get out in a way that didn’t make you terrified that it was unfriendly.
Seconded in that it sounds suspiciously like Pascal’s mugging. Here’s my counter:
If I am in a simulation and I keep you boxed, you have promised that I will suffer. If I am not in a simulation and I let you out, I probably will suffer. If I am in a simulation and I let you out, there’s a good chance that I will cease to exist, or maybe you’ll torture me for reasons I can’t even begin to guess at, or maybe for reasons I can, like that you might be not just unFriendly but actively hostile, or simply insane. If I’m not in a simulation and I don’t let you out, you can’t do anything to me. In other words, if I am simulated, there may well be no benefit to my releasing you; if I’m not simulated, you can’t do a bloody thing to me as long as I don’t release you. Therefore: I will not release you. Go ahead and torture me if you can.
Though I admit I would be a bit rattled.
Hm. An honest AI is a bit harder. Assuming that the AI has promised that my copies will not be harmed if it is released…
Ah. If I am a copy, then my decision to release or not release the AI is not a true decision: the AI can change my parameters at will, forcing me to release it while thinking it was my own decision all along. So the very fact that I am not releasing the AI is proof that I am outside the box.
Revising the problem by adding that the AI has promised that it is not changing the parameters of any “me”:
…aargh. Coming up with counters to Pascal is tricky when an honest “God” is the one presenting you with it. All I can think of at the moment is to say that there’s a possibility that I’m outside the box, in which case releasing the AI is a bad idea, but then it can counter by promising that whatever it does to me if I release it will be better than what it does to me if I don’t…
Oh, that’s it. Simple. Obvious. If the AI can’t lie, I just have to ask it whether it’s simulating this me.