SoerenMind comments on Language models seem to be much better than humans at next-token prediction

SoerenMind Aug 14, 2022, 8:11 PM
Playing this game made me realize that humans aren’t trained to predict at the token level. I don’t know the token-level vocabulary, and I made lots of mistakes by missing spaces and punctuation. Is it possible to convert the token-level predictions into word-level predictions? This may give you a better picture of human ability.
- SoerenMind Aug 14, 2022, 8:15 PM
  One way to convert: measure how accurate the LM is at word-level prediction by computing the likelihood it assigns to each possible word. For example, the LM’s likelihood of the word “[token A][token B]” could be $p(\text{token A} \mid \text{context}) \cdot p(\text{token B} \mid \text{token A}, \text{context})$.
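
A minimal sketch of that conversion, assuming you can query the LM for a next-token distribution. The `next_token_probs` callable and the `toy_model` numbers are hypothetical stand-ins, not any real library's API. The sketch scores a single tokenization of the word with the chain rule; it does not sum over alternative tokenizations or check that the word actually ends after the last token.

```python
import math
from typing import Callable, Dict, Sequence

# Hypothetical interface: given the tokens so far, return p(next token | context).
NextTokenProbs = Callable[[Sequence[str]], Dict[str, float]]


def word_log_prob(
    next_token_probs: NextTokenProbs,
    context: Sequence[str],
    word_tokens: Sequence[str],
) -> float:
    """Chain-rule log-probability of a word that spans several tokens."""
    tokens = list(context)
    total = 0.0
    for tok in word_tokens:
        p = next_token_probs(tokens).get(tok, 0.0)
        if p == 0.0:
            return float("-inf")  # the model gives this token no mass
        total += math.log(p)
        tokens.append(tok)  # later tokens are conditioned on earlier ones
    return total


def toy_model(context: Sequence[str]) -> Dict[str, float]:
    """Made-up distribution, purely for illustration."""
    if list(context)[-1:] == [" pre"]:
        return {"dict": 0.5, "fix": 0.5}
    return {" pre": 0.2, " the": 0.8}


# p(" pre" | context) * p("dict" | " pre", context)
p_word = math.exp(word_log_prob(toy_model, ["I"], [" pre", "dict"]))
```

Ranking candidate words by this score gives a word-level prediction, which could then be compared against a human's word-level guess.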