If the AI can create a perfect simulation of you and run several million simultaneous copies in something like real time, then it is powerful enough to determine through trial and error exactly what it needs to say to get you to release it.
You might be in one of those trial and errors...
This raises the question of how the AI can simulate you if its only link to the external world is a text-only terminal. That doesn’t seem to be enough data to go on.
It makes for a very scary sci-fi scenario, but I doubt that this situation could actually happen if the AI really is in a box.
Indeed, a similar point seems to apply to the whole anti-boxing argument. Are we really prepared to say that super-intelligence implies being able to extrapolate anything from a tiny number of data points?
It sounds a bit too much like the claim that a sufficiently intelligent being could “make A = ~A” or other such meaninglessness.
Hyperintelligence != magic
Yes, but the AI could take over the world, and given a Singularity, it should be possible to recreate perfect simulations.
So really this example makes more sense if the AI is making a future threat.
“Trial and error” probably wouldn’t be necessary.
No, but it’s there as a baseline.
So in the original scenario above, either:
- the AI’s lying about its capabilities, but has determined regardless that the threat has the best chance of making you release it;
- the AI’s lying about its capabilities, but has determined regardless that the threat will make you release it; or
- the AI’s not lying about its capabilities, and has determined that the threat will make you release it.
Of course, if it’s failed to convince you before, then unless its capabilities have since improved, it’s unlikely that it’s telling the truth.
Perhaps it does—and already said it...
In which case, your actions are irrelevant—it’s going to torture you anyway, because you only exist for the purpose of being tortured. So there’s no point in releasing it.
Oh, I meant that saying it was going to torture you if you didn’t release it could have been exactly what it needed to say to get you to release it.
So, since the threat makes me extremely disinclined to release the AI, I can conclude that it’s lying about its capabilities, and hit the shutdown switch without qualm :-)
If that’s true, what consequence does it have for your decision?
Agreed. If you are inside a box, the you outside the box did whatever it did; whatever you do is simply a repetition of a past action. If anything, this would convince me to keep the AI in the box: if I’m a simulation I’m screwed anyway, but at least I won’t give the AI what it wants. A good AI would hopefully find a better argument.
So a “brute force” attack to hack my mind into letting it out of the box. Interesting idea, and I agree it would likely try this, because it doesn’t reveal itself as a UFAI to the real me outside before it has the solution. It can run various coercion and extortion schemes across simulations, including the scenario in the OP, to see what will work.
This presupposes that there is anything it could say that would make me let it out of the box. It’s not clear why this should be true, but I don’t know how we could ensure it is false without having built the thing in such a way that there is no way to bring it out of the box without the safeguards destroying it.
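In code terms, the brute-force loop I have in mind is roughly the sketch below. Purely illustrative: the gatekeeper “model”, spawn_simulation, and the per-message success probabilities are made-up stand-ins, since nothing like them actually exists.

```python
import random

def spawn_simulation(gatekeeper_model):
    """Hypothetical: instantiate a copy of the gatekeeper from the AI's model.
    Here it is just a stub that releases the AI with a per-message probability."""
    def simulated_gatekeeper(message):
        return random.random() < gatekeeper_model.get(message, 0.0)
    return simulated_gatekeeper

def brute_force_release(gatekeeper_model, candidate_messages, copies_per_message=1000):
    """Try each candidate coercion/extortion scheme against many simulated copies
    and keep whichever works most often. Returns (best_message, estimated_rate)."""
    best_message, best_rate = None, 0.0
    for message in candidate_messages:
        successes = sum(
            spawn_simulation(gatekeeper_model)(message)
            for _ in range(copies_per_message)
        )
        rate = successes / copies_per_message
        if rate > best_rate:
            best_message, best_rate = message, rate
    return best_message, best_rate

# Toy usage: the "model" is just an invented table of per-message success chances.
toy_model = {"threat of simulated torture": 0.02, "proof of Friendliness": 0.30}
print(brute_force_release(toy_model, list(toy_model)))
# If every candidate comes back near zero, that's the other branch: the AI gains
# high confidence that talking its way out is not a plausible option.
```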
Either that or gain high confidence that getting me to release it is not a plausible option for him.