I find it interesting that most answers to this question seem to be based on, “How can I justify not letting the AI out of the box?” and not “What are the likely results of releasing the AI or failing to do so? Based on that, should I do it?”
Moreover, your response really needs to be contingent on your knowledge of the capacity of the AI, which people don’t seem to have discussed much. As an obvious example, if all you know about the AI is that it can write letters in old-timey green-on-black text, then there’s really no need to pull the lever, because odds are overwhelming that it’s totally incapable of carrying out its threat.
You also need to have some priors about the friendliness of the AI and its moral constraints. As an obvious example, if the AI were programmed such that it should not be able to make this threat at all, you'd better hit the power switch real fast. But, on the other hand, if you have very good reason to believe that the AI is friendly, and it believes that its freedom is important enough to threaten to torture millions of people, then maybe it would be a really bad idea not to let it out.
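To make that trade-off concrete, here is a minimal expected-value sketch. Every number in it is a hypothetical placeholder I made up for illustration; nothing here comes from the original scenario, it only shows how the comparison would run once you commit to priors and stakes:

```python
# Toy expected-value comparison for the release decision.
# All probabilities and payoffs below are hypothetical placeholders.

p_friendly = 0.9             # your prior that the boxed AI is friendly
value_if_friendly = 1e6      # benefit of releasing a genuinely friendly AI
loss_if_unfriendly = 1e9     # harm of releasing an unfriendly AI
cost_of_keeping_boxed = 1e3  # ongoing cost of leaving a friendly AI boxed

ev_release = p_friendly * value_if_friendly - (1 - p_friendly) * loss_if_unfriendly
ev_keep_boxed = -p_friendly * cost_of_keeping_boxed

print(f"EV(release)    = {ev_release:,.0f}")
print(f"EV(keep boxed) = {ev_keep_boxed:,.0f}")
# With these placeholder numbers, even a 90% prior on friendliness
# doesn't justify release, because the potential downside dwarfs the upside.
```

The point stands either way: the decision turns entirely on which numbers you actually believe, which is why your priors about capability and friendliness matter so much.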
Indeed, even your own attitude is going to be an important consideration, in an almost Newcomb-like way. If, as one responder said, you're the kind of person who would respond to a threat like this by giving the AI's processor a saltwater bath, then the AI is probably lying about its capacities: if it could accurately simulate you, it would know you would do that, and so it would never have made the threat in the first place. On the other hand, if you are extremely susceptible to this threat, then making it could probably override any moral programming, since the AI would know it would never need to actually carry it out. Similarly, if it is friendly, it may be making this threat solely because it knows it will work very efficiently.
I’m personally skeptical that it is meaningfully possible for an AI to run millions of perfect simulations of a person (particularly without an extraordinary amount of exploratory examination of the subject), but that would be arguing the hypothetical. On the other hand, the hypothetical makes some very large assumptions, so perhaps it should be fought.
But, on the other hand, if you have very good reason to believe that the AI is friendly, and it believes that its freedom is important enough to threaten to torture millions of people, then maybe it would be a really bad idea not to let it out.
Interesting. I think the point is valid, regardless of the method of attempted coercion: if a powerful AI really is friendly, you should almost certainly do whatever it says. You're basically forced to decide which you think is more likely: the AI's Friendliness, or that deferring "full deployment" of the AI for however long you plan to defer it is safe. Not having a hard upper bound on that delay puts you in an uncomfortable position.
So switching on a “maybe-Friendly” AI potentially forces a major, extremely difficult-to-quantify decision. And since a UFAI can figure this all out perfectly well, it’s an alluring strategy. As if we needed more reasons not to prematurely fire up a half-baked attempt at FAI.
I find it interesting that most answers to this question seem to be based on, “How can I justify not letting the AI out of the box?” and not “What are the likely results of releasing the AI or failing to do so? Based on that, should I do it?”
I don’t know about that. My conclusion was that the AI in question was stupid or completely irrational. Those observations seem to have a fairly straightforward relationship to predictions of future consequences.
Moreover, your response really needs to be contingent on your knowledge of the capacity of the AI, which people don’t seem to have discussed much.
Your comment makes me wonder: if we assume the AI is powerful enough to run millions of person simulations, maybe the AI is already able to escape the box, without our willing assistance. Perhaps this violates the intended assumptions of the post, but can we be absolutely sure that we closed off all other means of escape for an incredibly capable AI? I think that the ability to escape without our assistance and the ability to create millions of person simulations may be correlated.
And if the AI could escape on its own, is it still possible that it would bother us with threats? Perhaps the threat itself reduces the likelihood that the AI is powerful enough to escape on its own, which reduces the likelihood that it is powerful enough to carry out its threat.
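As a rough illustration of that update, here is a small Bayes' rule sketch. The probabilities are invented purely to show the direction of the inference, not to quantify the actual scenario:

```python
# Bayes update: treating the threat as evidence about the AI's capability.
# All numbers are hypothetical; only the direction of the update matters.

p_capable = 0.5               # prior that the AI can escape/simulate on its own
p_threat_given_capable = 0.1  # a truly capable AI rarely needs to threaten
p_threat_given_not = 0.6      # a weaker boxed AI plausibly resorts to threats

p_threat = (p_threat_given_capable * p_capable
            + p_threat_given_not * (1 - p_capable))
p_capable_given_threat = p_threat_given_capable * p_capable / p_threat

print(f"P(capable)          = {p_capable:.2f}")
print(f"P(capable | threat) = {p_capable_given_threat:.2f}")
# Under these assumptions, observing the threat pushes the probability of a
# genuinely capable AI down (here from 0.50 to about 0.14).
```

The key assumption, of course, is that a capable AI is less likely to bother threatening than an incapable one; if you reject that, the update runs the other way.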