I wonder if there’s a way to give the black box recommended a different objective function. CTR is bad for the obvious clickbait reasons, but signals for user interaction are still valuable if you can find the right signal to use.
I would propose that returning to the site some time in the future is a better signal of quality than CTR, assuming the future is far enough away. You could try a week, a month, and a quarter.
This is maybe a good time to use reinforcement learning, since the signal is far away from the decision you need to make. When someone interacts with an article, reward the things they interacted with n weeks ago. Combined with karma, I bet that would be a better signal than CTR.
I wonder if there’s a way to give the black box recommended a different objective function. CTR is bad for the obvious clickbait reasons, but signals for user interaction are still valuable if you can find the right signal to use.
I would propose that returning to the site some time in the future is a better signal of quality than CTR, assuming the future is far enough away. You could try a week, a month, and a quarter.
This is maybe a good time to use reinforcement learning, since the signal is far away from the decision you need to make. When someone interacts with an article, reward the things they interacted with n weeks ago. Combined with karma, I bet that would be a better signal than CTR.