I’ve tried the method from that paper (typical sampling), and I wasn’t hugely impressed with it. In fact, it was worse than my usual sampler by enough that users noticed the difference, and I switched back after a few days. See this post and these tweets.
(My usual sampler is one I came up with myself, called Breakruns. It works the best in practice of any I’ve tried.)
I’m also not sure I really buy the argument behind typical sampling. It seems to conflate “there are a lot of different ways the text could go from here” with “the text is about to get weird.” In practice, I noticed it would tend to make the text weird at exactly the points where there were a lot of ways it could go, like the start of a sample or of a new paragraph or section.
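A minimal sketch of the filtering step may make this concrete. It assumes the usual formulation of typical sampling: rank tokens by how far their surprisal is from the entropy of the next-token distribution, then keep the closest ones until their total probability reaches a threshold. The names `typical_filter` and `tau` and the toy distributions are illustrative assumptions, not taken from the paper's code or any library.

```python
import math

def typical_filter(probs, tau=0.9):
    """Return the token indices kept by a typical-sampling style filter.

    Hypothetical helper for illustration only.
    probs: next-token probabilities (should sum to about 1).
    tau:   probability mass to keep.
    """
    # Entropy of the next-token distribution, in nats.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Score each token by |surprisal - entropy|; smaller means "more typical".
    scored = sorted(
        (abs(-math.log(p) - entropy), i) for i, p in enumerate(probs) if p > 0
    )
    kept, mass = [], 0.0
    for _, i in scored:
        kept.append(i)
        mass += probs[i]
        if mass >= tau:
            break
    return kept

# Peaked distribution: one obvious continuation.
peaked = [0.9, 0.05, 0.05]
# Flat-tailed distribution: one likely token plus 60 equally plausible others,
# roughly what the start of a new paragraph might look like.
flat = [0.4] + [0.6 / 60] * 60

print(0 in typical_filter(peaked, tau=0.455))
print(0 in typical_filter(flat, tau=0.455))
```

With these toy numbers, the single most likely token survives the filter in the peaked case but is dropped in the flat-tailed case, because its surprisal sits further from the entropy than the tail tokens' surprisal does. That is the behavior at high-entropy positions described above.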
Deciding how you sample is really important for avoiding the repetition trap, but I haven’t seen sampling tweaks yield meaningful gains outside of that area.
Very comprehensive, thank you!