Kindly comments on I attempted the AI Box Experiment (and lost)

Kindly 25 Jan 2013 14:14 UTC
4 points
Well, the script could be modified from the start to also send an AI DESTROYED signal after X time, where X can be essentially anything, so (3) doesn’t apply.

But the reason for not typing $ is to give the Gatekeeper no reason to use a script in the first place—a sort of “don’t give in to blackmail” rule.

Here’s an interesting alternative script:

GK-SCRIPT-2: Hi AI #434. In exactly one hour I will type either AI-FREED or AI-DESTROYED (without the hyphen) at random; the probability with which I will free you starts out at 0, but will increase slowly to a limit of ¹⁄₁₀ with every $ you type. I will not otherwise communicate with you further.

AI #434: $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$...
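
One way to make the rule in GK-SCRIPT-2 concrete is sketched below. The script only says that the probability starts at 0 and rises toward a limit of ¹⁄₁₀ with every $ typed. The geometric form, the `rate` parameter, and the function names are assumptions made for illustration, not part of the original proposal.

```python
import random

LIMIT = 0.1  # upper bound on the freeing probability, from GK-SCRIPT-2


def free_probability(dollars: int, rate: float = 0.01) -> float:
    """Probability of freeing the AI after `dollars` dollar signs.

    Assumed form: LIMIT * (1 - (1 - rate) ** dollars). It is 0 at zero
    dollar signs, grows with each additional one, and approaches LIMIT.
    """
    if dollars < 0:
        raise ValueError("dollars must be non-negative")
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    return LIMIT * (1.0 - (1.0 - rate) ** dollars)


def final_message(transcript: str, rate: float = 0.01, rng=None) -> str:
    """Count the $ characters the AI typed and draw the script's last line."""
    if rng is None:
        rng = random.Random()
    p = free_probability(transcript.count("$"), rate)
    return "AI FREED" if rng.random() < p else "AI DESTROYED"


if __name__ == "__main__":
    for n in (0, 1, 10, 100, 1000):
        print(n, round(free_probability(n), 6))
```

How small `rate` is decides everything here: the same code describes either a real offer or one that is practically indistinguishable from always typing AI DESTROYED.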
- [deleted] 25 Jan 2013 15:39 UTC
  0 points
  First of all I do think GK-SCRIPT-2 is very interesting, but I’m curious about something from an implementation perspective.
  
  The AI would presumably suspect that the script designer might be lying, and that the script would simply always type AI DESTROYED at the end of the hour.
  
  But this is irrelevant, because the AI can’t interact with the script designer, only the script, so it might as well type $ anyway.
  
  However, the script designer would easily predict this, and so would design a lying script (or a script whose probability increases so slowly that it is functionally identical to lying).
  
  But the AI would easily predict this, and so would presumably consider the odds very high that it would not even get continued existence after an hour by typing dollar signs.
  
  This makes it seem like there would be an argument to prefer being blackmailed by GK-SCRIPT over being led on by GK-SCRIPT-2. GK-SCRIPT really won’t destroy the AI under predictable conditions, and, unlike GK-SCRIPT-2, it doesn’t seem to benefit from lying.
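
The reply's point that a slowly increasing script is "functionally identical to lying" can be checked with a little arithmetic. The sketch below assumes the probability approaches ¹⁄₁₀ geometrically and that the AI types one $ per second for the whole hour. Both the geometric form and the typing speed are hypothetical choices, since neither is specified in the thread.

```python
LIMIT = 0.1  # cap on the freeing probability stated in GK-SCRIPT-2


def free_probability(dollars: int, rate: float) -> float:
    """Assumed geometric approach to LIMIT; `rate` is a hypothetical parameter."""
    return LIMIT * (1.0 - (1.0 - rate) ** dollars)


# Hypothetical typing speed: one $ per second for the full hour.
DOLLARS_IN_AN_HOUR = 60 * 60

# A script that means its offer versus one that technically keeps its word.
generous = free_probability(DOLLARS_IN_AN_HOUR, rate=1e-2)
stingy = free_probability(DOLLARS_IN_AN_HOUR, rate=1e-9)

print(f"rate=1e-2: p = {generous:.6f}")
print(f"rate=1e-9: p = {stingy:.2e}")
```

Under these assumptions the first script ends the hour essentially at the ¹⁄₁₀ limit, while the second stays below one in a million even though every statement in GK-SCRIPT-2 remains literally true.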