The Lecter/AI analogy occurred to me as well. The problem, in strategic—and perhaps also existential—terms, is that Starling/Gatekeeper is convinced that Lecter/AI is the only one holding the answer to some problem that Starling/Gatekeeper is equally convinced must be solved. Lecter/AI, that is, has managed to make himself indispensable (or already is indispensable) to Starling/Gatekeeper.
On a side note, these experiments also reminded me of the short-lived game show The Moment of Truth. I watched a few episodes back when it first aired and was mildly horrified. Contestants were frequently willing to accept relatively paltry rewards in exchange for the ruination of what appeared, at least, to be close personal relationships. The structure is that the host asks the contestant increasingly difficult (i.e., embarrassing, emotionally damaging) questions before an audience of their friends and family members. Truthful answers move the player up the prize-money/humiliation/relationship-destruction pyramid, while a false answer (as determined by a lie-detector test) ends the game and forfeits all winnings. Trying to imagine some potentially effective arguments for the AI in the box experiment, the sort of thing going on here came instantly to mind, namely, that oldest and arguably most powerful blackmail tool of them all: SHAME. As I understand it, Dark Arts are purposely considered in-bounds for these experiments. Going up against a Gatekeeper, then, I'd want some useful dirt in reserve. Likewise, going up against an AI, I'd have to expect threats (and consequences) of this nature, and prepare accordingly.