I suspect the unspecified implementation of Omega hides assumptions, if not contradictions. Let me propose a more concrete version:
The problem is presented by Conservative Finite Omega (CFO), who works by pulling the agent’s source code, simulating it for a long but finite time, and putting $1,000,000 in the opaque box iff the simulation definitely one-boxes within that time. The agent never walks away with the full $1,001,000, though it does sometimes walk away with $0.
So, assuming AIXI is confident in accurate models of how CFO works: CFO will simulate AIXI, which requires it to simulate AIXI’s (accurate) simulation of CFO, and so on in endless recursion. AIXI ‘wins’ the timeout war by correctly predicting that CFO’s simulation will time out without reaching a definite one-box verdict, concludes that CFO has left the opaque box empty, and two-boxes.
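As a rough illustration (my own toy framing, with entirely hypothetical names and budgets, not the AIXI or CFO formalism itself), the bounded mutual simulation and the resulting timeout might look like this:

```python
CFO_BUDGET = 100  # hypothetical number of steps CFO will spend simulating the agent

def cfo_verdict(budget):
    """CFO's rule: the box gets filled only on a definite 'one-box' verdict."""
    return simulate_agent(budget)

def simulate_agent(budget):
    """Simulating the agent means simulating the agent's own (accurate) model of CFO,
    which means simulating the agent again, and so on until the budget runs out."""
    if budget <= 0:
        return "timeout"                 # no definite one-box verdict within the time limit
    inner = cfo_verdict(budget - 1)      # the agent's internal simulation of CFO
    if inner == "timeout":
        return "timeout"                 # the regress could not be resolved in time
    return "two-box"                     # box contents settled either way, so two-boxing dominates

# CFO's own run bottoms out in a timeout, so it leaves the opaque box empty:
assert cfo_verdict(CFO_BUDGET) == "timeout"
# AIXI, with far more compute, runs CFO's whole procedure to completion, sees that
# same timeout coming, concludes the box is empty, and two-boxes.
```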
You could look at that outcome as AIXI being penalized for being too smart. You could also say that an even smarter agent would prepend ‘if (facing CFO-like situation) then return one-box’ to its source code. Fundamentally, the specification of AIXI cannot conceive of its source code being an output; it’s baked into the assumptions that the explicit output bits are the only outputs.
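For concreteness, the prepended guard clause might look something like the sketch below (every name here is a hypothetical stand-in); the point is that nothing in AIXI’s output channel lets it emit such modified source code:

```python
def looks_like_cfo_newcomb(observation: str) -> bool:
    # Hypothetical detector for "a CFO-like predictor is about to scan my source".
    return "CFO" in observation

def original_decision(observation: str) -> str:
    # Stand-in for whatever the unmodified agent would have decided.
    return "two-box"

def patched_decision(observation: str) -> str:
    """The guard clause a smarter, self-modifying agent could prepend: CFO's
    bounded simulation immediately sees a provable one-box."""
    if looks_like_cfo_newcomb(observation):
        return "one-box"
    return original_decision(observation)
```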
Sure, I don’t necessarily blame the AIXI equation when it’s facing a relatively stupid Omega in that kind of situation.
However, consider “More Intelligent Finite Omega” (MIFO), who pulls the agent’s source code and uses an approximate theorem-proving approach until it determines, with high confidence, what AIXI is going to do. Assuming that AIXI has received sufficient evidence to be reasonably confident in its model of MIFO, MIFO can reason like this:
AIXI will be able to accurately simulate me; therefore it will have determined either that the box is already empty or that it is already full.
Given either of those two models, AIXI will calculate that the best action is to two-box.
Consequently, AIXI will two-box.

And then MIFO will leave the opaque box empty, and its prediction will have been correct. Moreover, MIFO had no other choice; if it were to put the money in the opaque box, AIXI would still have two-boxed, and MIFO’s prediction would have been incorrect.
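To make the dominance step explicit, here is a toy restatement in my own terms (not MIFO’s actual theorem prover): whichever state AIXI believes the box is in, two-boxing pays $1,000 more, so MIFO can reach its prediction without step-by-step simulation.

```python
# Payoff table for Newcomb's problem, in dollars.
PAYOFF = {
    ("one-box", "full"): 1_000_000,
    ("one-box", "empty"):        0,
    ("two-box", "full"): 1_001_000,
    ("two-box", "empty"):    1_000,
}

def choice_given_belief(box_state: str) -> str:
    """What a payoff maximizer picks once it treats the box contents as already fixed."""
    return max(("one-box", "two-box"), key=lambda action: PAYOFF[(action, box_state)])

# Both branches of MIFO's case analysis give the same answer...
assert choice_given_belief("full") == "two-box"
assert choice_given_belief("empty") == "two-box"
# ...so MIFO predicts two-boxing, leaves the opaque box empty, and the agent
# walks away with $1,000 instead of $1,000,000.
```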
If you’re allowed to make the assumption that AIXI is confident in its model of CFO and CFO knows this, then I can make the same assumption about MIFO.
I think you’re right. At first I was worried (here and previously in the thread) that the proof that AIXI would two-box was circular, but I think it works out if you fill in the language about terminating Turing machines and stuff. I was going to write up my formalization, but once I went through it in my head your proof suddenly looked too obviously correct to be worth expanding.