Wait a minute, what exactly do you mean by “you”? TDT? Or “any agent whatsoever”? If it’s TDT alone, why? If I read you correctly, you already agree that it’s not because Omega said “running TDT” instead of “running WTF-DT”. If it’s “any agent whatsoever”, then are you really sure the simulated and real problem aren’t actually the same? (I’m sure they aren’t, but, just checking.)
Well, no, this would be my disagreement: it’s precisely because Omega told you that the simulated agent is running TDT that only TDT could or could not be the simulation; the simulated and real problem are, for all intents and purposes, identical (Omega doesn’t actually need to put a reward in the simulated boxes, because he doesn’t need to reward the simulated agent, but both problems appear exactly the same to the simulated and real TDT agents).
This comment from lackofcheese finally made it click. Your comment also makes sense.
I now understand that this “problematic” problem just isn’t fair. TDT 1-boxes because it’s the only way to get the million.
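To make the asymmetry concrete, here is a minimal sketch of the payoffs as I understand them. Everything named here is an assumption for illustration: the $1,000 in the transparent box is the usual Newcomb amount rather than something stated in this thread, and `payoff`, `tdt_choice` and `other_agent_choice` are hypothetical helpers, not anyone's actual formalization of TDT.

```python
MILLION = 1_000_000
THOUSAND = 1_000  # assumed: standard Newcomb amount, not stated in the thread

CHOICES = ("one-box", "two-box")


def payoff(real_choice, simulated_tdt_choice):
    """Money the real agent walks away with.

    Box B is filled only if the *simulated TDT agent* one-boxed;
    box A (the thousand) is collected only by two-boxing.
    """
    box_b = MILLION if simulated_tdt_choice == "one-box" else 0
    box_a = THOUSAND if real_choice == "two-box" else 0
    return box_a + box_b


def tdt_choice():
    # A real TDT agent cannot tell itself apart from the simulation,
    # so whatever it picks, the simulation picks too.
    return max(CHOICES, key=lambda c: payoff(c, c))


def other_agent_choice(simulated_tdt_choice):
    # Any non-TDT agent knows it is not the simulation, so the
    # contents of box B are already fixed from its point of view.
    return max(CHOICES, key=lambda c: payoff(c, simulated_tdt_choice))


if __name__ == "__main__":
    tdt = tdt_choice()
    other = other_agent_choice(tdt)
    print("TDT:", tdt, payoff(tdt, tdt))
    print("Other agent:", other, payoff(other, tdt))
```

Under these assumptions TDT one-boxes and gets the million, while any other agent can two-box and take the extra thousand on top, which is the sense in which the problem singles out TDT.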