Yeah, I meant without a human looking at the output. I also agree with pretty much everything you just said. We’re pretty deep in this comment chain now and I’m not exactly sure why we got here. I agree that Richard’s original definition was based on the standard RL definition of myopia; the point I was making was that Richard’s attempt to make imitative amplification non-myopic turned it into approval-based amplification. Richard’s version has a human, rather than a distance metric, evaluate the output, which I see as the defining difference between imitative and approval-based amplification.