La Wik says 8 bits per word, FWIW.
La Wiki is apparently not using the entropy estimates extracted from the predictions of humans (who are the best modelers of natural language). Crude stuff like trigram models is going to overestimate matters considerably.
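To make the overestimation point concrete, here is a toy sketch rather than anything about real English. It assumes a made-up two-symbol Markov source (the names `TRANSITIONS`, `STATIONARY`, and both functions are hypothetical) and compares the source's true entropy rate with the bits per symbol charged by a model that ignores context. The cruder model can only match or exceed the true figure.

```python
import math

# Hypothetical two-symbol Markov source: each symbol repeats with probability 0.9.
TRANSITIONS = {
    "a": {"a": 0.9, "b": 0.1},
    "b": {"a": 0.1, "b": 0.9},
}
# Stationary distribution of this symmetric chain (assumed, not estimated from data).
STATIONARY = {"a": 0.5, "b": 0.5}


def entropy_rate(transitions, stationary):
    """True entropy rate in bits per symbol: sum over states of pi_s * H(row_s)."""
    return -sum(
        stationary[state] * p * math.log2(p)
        for state, row in transitions.items()
        for p in row.values()
        if p > 0
    )


def context_free_cross_entropy(stationary):
    """Bits per symbol charged by a model that predicts from marginals only."""
    return -sum(p * math.log2(p) for p in stationary.values() if p > 0)


true_rate = entropy_rate(TRANSITIONS, STATIONARY)       # about 0.47 bits
crude_rate = context_free_cross_entropy(STATIONARY)     # 1 bit
print(f"true: {true_rate:.3f} bits/symbol, crude model: {crude_rate:.3f} bits/symbol")
```

The same logic is why a weaker predictor gives a higher per-word figure than a stronger one: any model's cross-entropy is an upper bound on the source entropy, and the bound loosens as the model discards more context.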