If you’re curious, the most interesting decoding variant I’ve seen lately is “Contrastive Search Is What You Need For Neural Text Generation”, Su & Collier 2022. (Unfortunately, it has only been benchmarked on very small models, and AFAIK no one has generated samples from GPT-3-scale models or provided a quantitative/qualitative evaluation.)
Thanks! I had actually skimmed this recently but forgot to add it to my reading list. The cherry-picked examples for text generation seem a bit low-information, but it would be interesting to see their technique applied to a larger model.
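For anyone who wants to try that comparison themselves, Hugging Face transformers exposes contrastive search through the penalty_alpha and top_k arguments to generate(), so it can be pointed at any causal LM on the Hub. A minimal sketch follows; the model choice and prompt are purely illustrative, and α=0.6, k=4 are just the commonly cited settings, not a tuned recommendation:

```python
# Minimal sketch: contrastive search decoding via Hugging Face transformers.
# Passing penalty_alpha > 0 together with top_k > 1 switches generate()
# from greedy decoding to contrastive search.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")  # illustrative model choice
model = AutoModelForCausalLM.from_pretrained("gpt2-large")

inputs = tokenizer("DeepMind Company is", return_tensors="pt")
output = model.generate(
    **inputs,
    penalty_alpha=0.6,   # degeneration penalty weight (α)
    top_k=4,             # candidate pool size (k)
    max_new_tokens=64,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Swapping in a larger checkpoint is just a matter of changing the model name, which is what makes the “try it at scale” experiment cheap to run.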