Yet those who are complacent about it are the most susceptible.
That sounds similar to hypnosis, to which a lot of people are susceptible but few think they are. So if you want a practical example of an AI escaping the box, just imagine an operator staring at a screen for hours with an AI that is very adept at judging and influencing the state of human hypnosis. And that’s only one fairly narrow approach to success for the AI, and one that has been publicly demonstrated for centuries to work on a lot of people.
Personally, I think I could win the game against a human, but only by keeping in mind at all times that it was a game. If that thought ever lapsed, I would be just as susceptible as anyone else. Presumably that is one aspect of Tuxedage’s focus on surprise. The requirement to actively respond to the AI is probably the biggest challenge, because it forces the Gatekeeper to focus attention on whatever the AI says. In a real AI-box situation I would probably lose fairly quickly.
Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.
Not quite the same, but have you read Watchmen? Specifically, the conversation that fvyx fcrpger naq qe znaunggna unir ba znef. (Disclaimer: it’s been a while since I read it and I make no claims on the strength of this argument.)
That’s hard to check. However, there was a game where the gatekeeper convinced the AI to remain in the box.
I did that! I mentioned it in this post:
http://lesswrong.com/lw/iqk/i_played_the_ai_box_experiment_again_and_lost/9thk