This seems in line with your position, but I want to reply so people won’t conclude “And coinrun experiments don’t tell you important things.” I think the more interesting question for that experiment is “how will the agent generalize? Can we predict it in advance? In what ways do we systematically mispredict, and why?”
(And, roughly speaking, there is a range of possible algorithms by which the agent can generalize, so it’s way more than one bit. I got way more out of that paper by asking the authors for hundreds of videos for different training settings. Probably will mention the results in a post, soon.)