(I run the team that created that game. I made the guess-most-likely-next-token game and Fabien Roger made the other one.)
The optimal strategy for picking probabilities in that game is to say what your probability for those two next tokens would have been if you hadn’t updated on being asked about them. What’s your problem with this?
It’s a bit sad that this scoring system is so complicated. But I don’t know how to construct a simpler game from which we can get an unbiased estimate of human perplexity based on what the humans do.
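To make the "report your un-updated probabilities" point concrete, here is a toy sketch. It assumes a plain logarithmic scoring rule, which is a stand-in and not necessarily the game's actual scoring system; the function names and the prior values 0.03 and 0.01 are made up for illustration. Under that assumption, the report that maximizes your expected score is your honest probability that token A, rather than token B, is the real next token.

```python
import math

def expected_log_score(p_true: float, q_report: float) -> float:
    """Expected log score when token A is correct with probability p_true
    and the player reports probability q_report for token A.
    Assumption: a plain log scoring rule, not the game's real rule."""
    return p_true * math.log(q_report) + (1 - p_true) * math.log(1 - q_report)

def best_report(p_true: float, grid_size: int = 999) -> float:
    """Search a grid of reports in (0, 1) for the one with the highest expected score."""
    candidates = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return max(candidates, key=lambda q: expected_log_score(p_true, q))

# Hypothetical prior probabilities for the two candidate tokens,
# formed before you knew you would be asked about them.
prior_a, prior_b = 0.03, 0.01

# Probability that A is the real token, given that it is one of the two.
p_a_given_pair = prior_a / (prior_a + prior_b)

print(best_report(p_a_given_pair))
```

The grid search lands on the honest probability because the log score is a proper scoring rule: shading the report in either direction lowers the expected score.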