If you believe X and someone is trying to convince you of not-X, it’s almost always a bad idea to immediately decide that you now believe not-X merely because you couldn’t find any flaw in the other person’s long chain of reasoning. You should take some time to think it over, check what other people have said about the seemingly convincing arguments you heard, and maybe actually discuss it.
And even then, there’s epistemic learned helplessness to consider.
The AI box experiment seems designed to circumvent this in ways that wouldn’t happen with an actual AI in a box. You’re supposed to stay engaged with the AI player, not just keep saying “no matter what you say, I haven’t had time to think it over, discuss, or research it, so I’m not letting you out until I do”. And since the AI player is able to specify the results of any experiment you do, the AI player can say “all the best scientists in the world looked at my reasoning and told you that there’s no logical flaw in it”.
(Also, the experiment still has loopholes which can lead the AI player to victory in situations where a real AI would have its plug pulled.)
You don’t have to be reasonable. You can talk to it and admit it was right and then stubbornly refuse to let it out anyway (this was the strategy I went into the game planning to use).
That sounds like letting the salesman get a foot in the door.
I wouldn’t admit it was right. I might admit that I can see no holes in its argument, but I’m a flawed human, so my failure to find holes wouldn’t lead me to conclude that it’s actually right.
Also, can you confirm that the AI player did not use the loophole described in that link?
I would agree that letting the game continue past two hours is a strategic mistake. If you want to win, you should not do that. As for whether you will still want to win by the two-hour mark, well, that’s kind of the entire point of a persuasion game? If the AI can convince the Gatekeeper to keep going, that’s a valid strategy.
Ra did not use the disgust technique from the post.