Thanks Felix!
This is indeed a cool and surprising result. I think it strengthens the introspection interpretation, but since it doesn’t require a judgement about the reliability of some internal signal (right?), it doesn’t directly address the question of whether there is a discriminator in there.