The famous problem here is the “noisy TV problem”. If your AI is driven toward regions of uncertainty, it will be completely captivated by a TV on the wall showing random static. No copy of Doom needed: any random gibberish the AI can’t predict will do.
OpenAI claims to have solved the noisy TV problem via Random Network Distillation, though I remain skeptical. I think it’s a clever hack that only addresses a specific, relatively superficial subclass of the problem.
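For context, the core of RND is simple: a fixed, randomly initialized target network embeds each observation, and a second predictor network is trained to imitate it; the predictor’s error serves as the novelty bonus. Because the target is a deterministic function of the observation, the error is (in principle) learnable away, unlike a forward-dynamics error on stochastic transitions. Here is a minimal linear sketch of that idea, assuming toy dimensions and a plain SGD update (none of this comes from OpenAI’s actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM = 8, 4

# Fixed, randomly initialized target network -- never trained.
W_target = rng.standard_normal((OBS_DIM, FEAT_DIM))

# Predictor network, trained online to imitate the target's features.
W_pred = np.zeros((OBS_DIM, FEAT_DIM))

def intrinsic_reward(obs, lr=0.05):
    """Return the RND novelty bonus for `obs` and update the predictor."""
    global W_pred
    target = obs @ W_target            # fixed random embedding
    pred = obs @ W_pred                # learned approximation
    err = pred - target
    bonus = float((err ** 2).mean())   # prediction error = novelty
    # One SGD step on the MSE toward the target features.
    W_pred -= lr * (2.0 / FEAT_DIM) * np.outer(obs, err)
    return bonus

# A repeatedly visited state stops being "novel" as the predictor catches up.
obs = rng.standard_normal(OBS_DIM)
bonuses = [intrinsic_reward(obs) for _ in range(200)]
```

In this toy, `bonuses` decays toward zero for the familiar state, which is the mechanism RND relies on; the open question the text raises is whether this generalizes to genuinely adversarial noise sources.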
In principle, one could build an AI that handles the noisy TV by learning that the TV is unpredictable and discounting it. The point of the thought experiment is to present a region of state space saturated with novelty reward that yields no performance payoff.