loup-vaillant comments on Prisoner’s Dilemma (with visible source code) Tournament

loup-vaillant 16 Jun 2013 21:37 UTC
4 points
I think it is possible to prove that a given boxing works, if it’s sufficiently simple. Choosing the language isn’t enough, but choosing the interpreter should be.

Take Brainfuck for instance: replace the dot (.″), which prints a character, by two other statements: one that prints “yes” and exits, and one that prints “no” and exits. If the interpreter has no bug, a program can only:
- Print “yes” and kill itself.
- Print “no” and kill itself.
- Do nothing until we kill it, or otherwise fail.
Assuming the AI doesn’t control the external world by heating the host Intel processor in a smart way, we should be able to prove that we’re otherwise safe.