A1987dM comments on Open Thread February 25 - March 3

A1987dM 28 Feb 2014 11:16 UTC
0 points
La Wik says 8 bits per word, FWIW.
- gwern 28 Feb 2014 16:21 UTC
  6 points
  La Wiki is apparently not using the entropy estimates extracted from human predictions (humans being the best modelers of natural language). Crude stuff like trigram models is going to considerably overestimate the entropy.
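
A rough sketch of why the choice of model matters for the estimate. It computes a plug-in estimate (in bits per character, not per word) of the entropy of the next character given the previous `order` characters. The function name `conditional_entropy` and the `sample` text are made up for illustration and are not from either comment. On a text this short the numbers are biased downward and mean nothing in absolute terms; the only thing to look at is the direction: the less context the model uses, the higher its estimate.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(text, order):
    """Plug-in estimate, in bits per character, of the entropy of the
    next character given the previous `order` characters of `text`."""
    # Map each context (the preceding `order` characters) to counts
    # of the character that follows it.
    context_counts = defaultdict(Counter)
    for i in range(order, len(text)):
        context_counts[text[i - order:i]][text[i]] += 1

    total = sum(sum(c.values()) for c in context_counts.values())
    bits = 0.0
    for counts in context_counts.values():
        n = sum(counts.values())
        for k in counts.values():
            # Joint frequency times log of the conditional frequency.
            bits -= (k / total) * math.log2(k / n)
    return bits


# Hypothetical toy text, repeated so that contexts recur.
sample = "the quick brown fox jumps over the lazy dog and the cat sat on the mat " * 3

for order in (0, 1, 2):
    print(order, round(conditional_entropy(sample, order), 3))
```

A human guesser effectively conditions on far more than two or three preceding symbols (grammar, topic, common sense), which is the sense in which a trigram-style model leaves predictable structure on the table and reports too many bits.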