Well, for one, the human isn't in a box trying to get out, so an AI mimicking a human isn't going to say weird things like "let me out of this box!" This method is equivalent to writing Hitler a letter asking him a question and having him send you an answer. That doesn't seem dangerous at all.
Second, I really don't believe Hitler could escape from a box. The AI box experiments suggest a human can do it, but that experimental setup is very different from a real AI-box situation: in reality there needn't be any back and forth with the gatekeeper, and the gatekeeper doesn't have to sit there for two hours while the AI emotionally abuses him. If Hitler says something mean, the gatekeeper can just turn him off or walk away.