5. So at a high level we know almost exactly how many flops it takes to simulate Atari: it's about 10^6 op/s, vs perhaps 10^12 flop/s for typical games today (with the full potential of modern GPUs at 10^14 flop/s, similar to reality). So you (and by you I mean DM) can actually directly compare, using knowledge of Atari, circuits, or the sim code, the computational cost of the learned Atari predictive model inside the agent vs the simulation cost of the (now defunct) actual Atari circuit. There isn't much uncertainty in that calculation; both are known quantities (unlike comparing to reality).
The parameter count isn't really important; this isn't a GPT-3 style language model designed to absorb the web. Its parameter count is about as relevant as the parameter count of a super high end Atari simulator that can simulate billions of Atari frames per second: not very, because Atari is small. And that is exactly what this thing is.
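For concreteness, here's a minimal back-of-envelope sketch of that comparison in Python. The native clock rate and frame rate are real Atari 2600 figures, but the learned model's parameter count and flops-per-parameter are hypothetical placeholders for illustration, not DM's actual numbers:

```python
# Back-of-envelope: per-frame cost of a learned Atari model vs the real circuit.
# Native figures are roughly accurate; the learned-model figures are assumptions.

ATARI_NATIVE_OPS_PER_SEC = 1e6   # ~1.19 MHz 6507 CPU, order of magnitude
ATARI_FPS = 60                   # NTSC frame rate
ops_per_frame_native = ATARI_NATIVE_OPS_PER_SEC / ATARI_FPS

# Hypothetical learned dynamics model: assume ~1e6 parameters and
# ~2 flops per parameter per forward step (one predicted frame).
LEARNED_MODEL_PARAMS = 1e6
flops_per_frame_learned = 2 * LEARNED_MODEL_PARAMS

overhead = flops_per_frame_learned / ops_per_frame_native
print(f"native ops/frame:    {ops_per_frame_native:.0e}")
print(f"learned flops/frame: {flops_per_frame_learned:.0e}")
print(f"learned model is ~{overhead:.0f}x more expensive per frame")
```

Under those assumed numbers the learned model is a couple of orders of magnitude more expensive per frame than the original hardware, which is the kind of apples-to-apples calculation the point above says DM can do with low uncertainty.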