Perplexity depends on the vocabulary and is sensitive to preprocessing, which could skew the results presented here. This is a common problem. See the following reference:
Jihyeon Roh, Sang-Hoon Oh, Soo-Young Lee (2020). Unigram-Normalized Perplexity as a Language Model Performance Measure with Different Vocabulary Sizes.
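To illustrate the vocabulary dependence: the same text scored under two different tokenizations yields very different perplexities, even for the same kind of model. A minimal sketch (the toy text and the helper `unigram_perplexity` are illustrative only, not taken from the reference):

```python
import math
from collections import Counter

def unigram_perplexity(tokens):
    """Perplexity of a maximum-likelihood unigram model, evaluated on its own training tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    log_prob = sum(c * math.log(c / total) for c in counts.values())
    return math.exp(-log_prob / total)

text = "the cat sat on the mat because the cat was tired"

# Two preprocessing choices over the same text give two different vocabularies.
word_tokens = text.split()                 # word-level vocabulary
char_tokens = list(text.replace(" ", ""))  # character-level vocabulary

print(unigram_perplexity(word_tokens))  # larger vocabulary, higher perplexity
print(unigram_perplexity(char_tokens))  # smaller vocabulary, much lower perplexity
```

The two numbers are not comparable, which is exactly why the reference proposes normalizing by a unigram baseline.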
Thanks! That’s really interesting. I’ll check it out.