Yeah. It looks like there’s a discontinuity between using an RNG and having perfect memory. Perfect memory lets us get away with “action-optimal” reasoning, but if memory is even a little imperfect, we need to go “planning-optimal”.