In any reasonable scenario, those communicating with the AI in a box will not be the people empowered to let it out. Ideally, those with the capability to let the AI out would be entirely isolated from those communicating with the AI and would not be able to access the conversations with the AI.
I would also note that restricting the number of bits (a) just makes things go more slowly and (b) doesn’t work very well in the competitive real world where the other guys are less restrictive.
Ultimately, the dangers of the AI in a box aren’t that it can manipulate any human to let it out but that:
(i) it’s really unclear how good our boxing skills are; and
(ii) human beings have different risk-reward functions, and it is entirely possible that humans will convince themselves to let the AI out of the box even without any manipulation, whether as a result of perceived additional benefit, competitive pressure, or sympathy for the AI.
You kind of assumed away (i), but part of (i) is setting things up as outlined in my first paragraph. That points to a further problem: even if our boxing skills were good enough, over time we will come to rely on less sophisticated and capable organizations to do the boxing, which doesn't seem like it will end well.