Now what I really want to see is an AI-box experiment where the Gatekeeper wins early by convincing the AI to become Friendly.
I did that! I mentioned that in this post:
http://lesswrong.com/lw/iqk/i_played_the_ai_box_experiment_again_and_lost/9thk
I did that! I mentioned that in this post:
http://lesswrong.com/lw/iqk/i_played_the_ai_box_experiment_again_and_lost/9thk