I think it's a fair point that an open reward function is subject to “SEO”-style efforts to game it. But how about a “training” reward function that is open and a “test” reward function that is hidden?
I would love to know what other OSS efforts on reward functions exist (I do follow Carper’s development on RF), and I would love to contribute.
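
To make the open-training / hidden-test idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the two toy reward functions, the keyword they look for, and the `gaming_gap` helper are hypothetical and not taken from any existing project.

```python
from typing import Callable, List

RewardFn = Callable[[str], float]


def open_training_reward(text: str) -> float:
    """Public reward (hypothetical): fraction of words equal to 'helpful'."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w == "helpful" for w in words) / len(words)


def hidden_test_reward(text: str) -> float:
    """Held-out reward (hypothetical): needs the keyword, penalises repetition."""
    words = text.lower().split()
    if not words:
        return 0.0
    has_keyword = 1.0 if "helpful" in words else 0.0
    diversity = len(set(words)) / len(words)  # 1.0 when no word repeats
    return has_keyword * diversity


def gaming_gap(outputs: List[str], train_fn: RewardFn, test_fn: RewardFn) -> float:
    """Mean of (training reward - test reward) over the outputs.

    A large positive gap hints that outputs are tuned to the open reward
    rather than to whatever both rewards are trying to measure.
    """
    if not outputs:
        return 0.0
    return sum(train_fn(o) - test_fn(o) for o in outputs) / len(outputs)


honest = ["a helpful and clear answer"]
gamed = ["helpful helpful helpful helpful"]

honest_gap = gaming_gap(honest, open_training_reward, hidden_test_reward)
gamed_gap = gaming_gap(gamed, open_training_reward, hidden_test_reward)
print(f"honest gap: {honest_gap:+.2f}, gamed gap: {gamed_gap:+.2f}")
```

The keyword-stuffed output scores well on the open reward and poorly on the hidden one, so its gap is large and positive. The hidden function only stays useful as long as it is not leaked or tuned against, much like a held-out test set.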