One way to convert: measure the LM's accuracy at word-level prediction by computing the likelihood it assigns to each possible word. When a word spans several tokens, that likelihood is the product of the per-token conditional probabilities. For example, the LM’s likelihood of the word “[token A][token B]” could be p(token A|context)∗p(token B|token A, context).
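As a rough illustration of that product, here is a minimal sketch. It assumes a hypothetical toy model, `TOY_LM`, that maps a token context to a next-token distribution. The table, the probabilities, and the helper names are made up for the example and do not come from any real LM or library. Log-probabilities are summed rather than multiplying raw probabilities, which is a common way to avoid underflow on longer words.

```python
import math

# Hypothetical toy "LM": maps a tuple of preceding tokens to a
# next-token probability distribution. All values are invented.
TOY_LM = {
    ("the",): {"un": 0.2, "cat": 0.5, "dog": 0.3},
    ("the", "un"): {"happy": 0.6, "do": 0.4},
}


def next_token_probs(context):
    """Return the toy model's next-token distribution for a context."""
    return TOY_LM[tuple(context)]


def word_likelihood(context, word_tokens):
    """Likelihood of a word given as a sequence of tokens.

    Computes p(t1 | context) * p(t2 | context, t1) * ... by summing
    log-probabilities and exponentiating at the end.
    """
    log_p = 0.0
    ctx = list(context)
    for tok in word_tokens:
        log_p += math.log(next_token_probs(ctx)[tok])
        ctx.append(tok)  # each token conditions on the tokens before it
    return math.exp(log_p)


# A two-token word ("un" + "happy") and a single-token word ("cat").
p_unhappy = word_likelihood(["the"], ["un", "happy"])
p_cat = word_likelihood(["the"], ["cat"])
```

With a real model, `next_token_probs` would be replaced by a call that returns the model's softmax output for the given context; the rest of the computation stays the same.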