The simulated TDT agent is not aware that it won’t receive a reward, and therefore it does not work. … I don’t think that the ability to simulate without rewarding the simulation is what pushes it over the threshold of “unfair”.
I do agree. I think my previous post was still exploring the “can TDT break with a simulation of itself?” question, which is interesting but orthogonal.