Actually, an AI that believes it only communicates with the environment via input/output channels cannot represent the hypothesis that it will stop receiving input bits.
But I am an intelligence that can only communicate with the environment via input/output channels!
Incorrect—your implementation itself also affects the environment via more than your chosen output channels.
Okay, fair enough. But until you pointed that out, I was an intelligence that believed it only communicated with the environment via input/output channels (that was your original phrasing, which I should have copied in the first place), and yet I did (and do) believe that it is possible for me to die.
Thus, what the AIXI will do is this: it will move right, then it will do nothing for the rest of time.
Incorrect. I’ll assume for the sake of argument that you’re right about what AIXI will do at first. But AIXI learns by Solomonoff induction, which is infallible at “noticing that it is confused”—all Turing machines that fail to predict what actually happens get dropped from the hypothesis space. AIXI does nothing just until that fails to cause the right-room robot to move, whereupon any program that predicted that merely outputting “Pass” forever would do the trick gets zeroed out.
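The elimination step described here can be illustrated with a toy sketch (this is not AIXI itself, just deterministic hypothesis filtering over a finite set; all names and the "robot moves"/"robot stuck" observations are illustrative):

```python
# Toy illustration of the "zeroed out" behavior: a finite set of
# deterministic "programs", each predicting the next observation from
# the history so far. Any hypothesis that mispredicts gets weight zero.

def eliminate(hypotheses, history, observation):
    """Drop every hypothesis whose prediction on `history` differs
    from the actual `observation`; renormalize the survivors."""
    survivors = {h: w for h, w in hypotheses.items()
                 if h(history) == observation}
    total = sum(survivors.values())
    return {h: w / total for h, w in survivors.items()}

# Two rival hypotheses about the world:
always_pass_works = lambda history: "robot moves"   # "Pass" forever suffices
pass_stops_working = lambda history: (
    "robot moves" if len(history) < 3 else "robot stuck")

beliefs = {always_pass_works: 0.5, pass_stops_working: 0.5}

# The first three observations are consistent with both hypotheses...
history = []
for obs in ["robot moves", "robot moves", "robot moves"]:
    beliefs = eliminate(beliefs, history, obs)
    history.append(obs)

# ...but the fourth falsifies always_pass_works: its weight drops to zero,
# and only the hypothesis that correctly predicted the failure survives.
beliefs = eliminate(beliefs, history, "robot stuck")
```

The point of the sketch is just that surprise is fatal to a hypothesis in this scheme: one wrong prediction and its posterior weight is exactly zero, which is the sense in which Solomonoff induction cannot fail to "notice that it is confused."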
The AIXI’s problem is that it assumes that if it acts like the best Turing machine it has found, then it will do as well as that Turing machine.
If there are programs in the hypothesis space that do not make this assumption (and as far as I know, you and I agree that naturalized induction would be such a program), then these are the only programs that will survive the failure of AIXI’s first plan.
Has Paul Christiano looked at this stuff?
ETA: I don’t usually mind downvotes, but I find these ones (currently −2) are niggling at me. I don’t think I’m being conspicuously stupid, and I do think that discussing AIXI in a relatively concrete scenario could be valuable, so I’m a bit at a loss for an explanation. …Perhaps it’s because I appealed to Paul Christiano’s authority?