Four levels of information theory
There are four levels of information theory.
Level 1: Number (entropy)
Information is measured by Shannon entropy:

$H(X) = -\sum_i p(X = x_i) \log p(X = x_i)$
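
As a quick sanity check, here is a minimal sketch of this formula in Python. The biased-coin distribution p is just an assumed toy example, not something from the text above.

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum_i p_i * log2(p_i), in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Toy distribution (assumed for illustration): a biased coin.
p = [0.9, 0.1]
print(entropy(p))  # about 0.469 bits
```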
Level 2: Random variable
Look at the underlying random variable, the 'surprisal' $-\log p(X = x_i)$, of which entropy is the expectation.
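
To make the "expectation of surprisal" view concrete, here is a small sketch (again with an assumed biased-coin distribution) that estimates entropy by averaging the surprisal over samples.

```python
import math
import random

def surprisal(p_x):
    """Surprisal -log2 p(X = x) of a single outcome, in bits."""
    return -math.log2(p_x)

# Toy distribution (assumed): the same biased coin as above.
p = {"heads": 0.9, "tails": 0.1}

# Entropy is the expected surprisal; estimate it by sampling.
outcomes = random.choices(list(p), weights=list(p.values()), k=100_000)
estimate = sum(surprisal(p[x]) for x in outcomes) / len(outcomes)
print(estimate)  # close to H(X), about 0.469 bits
```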
Level 3: Coding functions
Shannon’s source coding theorem says that the entropy of a source X is the minimum expected number of bits per sample needed to encode samples of X.
Related quantities such as mutual information, relative entropy, and cross entropy can also be given coding interpretations.
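
One way to see the coding interpretation concretely is to build a Huffman code and compare its expected code length to the entropy. The sketch below is illustrative only; it uses an assumed dyadic toy source, for which the two quantities coincide exactly.

```python
import heapq
import math

def huffman_lengths(p):
    """Code lengths of a Huffman code for distribution p (dict: symbol -> probability)."""
    # Each heap entry: (total probability, tie-breaker, {symbol: code length so far})
    heap = [(prob, i, {sym: 0}) for i, (sym, prob) in enumerate(p.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol beneath them.
        merged = {s: l + 1 for s, l in {**c1, **c2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Toy source (assumed for illustration).
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_lengths(p)
expected_len = sum(p[s] * lengths[s] for s in p)
entropy = -sum(q * math.log2(q) for q in p.values())
print(expected_len, entropy)  # both 1.75 bits for this dyadic source
```

For non-dyadic sources the Huffman code's expected length is strictly between H(X) and H(X) + 1, which is the usual statement of the theorem for symbol-by-symbol codes.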
Level 4: Epsilon machine (transducer)
At level 3 we saw that entropy/information reflects various forms of (constrained) optimal coding. That level talks about the codes themselves, but it does not talk about how these codes are implemented.
This is the level of epsilon machines, or more precisely epsilon transducers. It says not just what the coding function is but how it is (optimally) implemented mechanically.
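
The text above does not give a concrete epsilon machine, so here is a minimal illustrative sketch: the "golden mean process" (binary sequences with no two 1s in a row), a standard textbook example, written as causal states with labeled probabilistic transitions. A full epsilon transducer would additionally map inputs to outputs; this sketch only shows the generative (machine) side.

```python
import random

# Causal states and their labeled transitions:
# transitions[state] = list of (probability, emitted symbol, next state).
# This is the golden mean process: from A, emit 0 or 1 with equal probability;
# after a 1 (state B), the machine must emit 0, so "11" never occurs.
transitions = {
    "A": [(0.5, "0", "A"), (0.5, "1", "B")],
    "B": [(1.0, "0", "A")],
}

def generate(machine, start, n, rng=random):
    """Run the machine for n steps and return the emitted symbol sequence."""
    state, out = start, []
    for _ in range(n):
        probs, symbols, nexts = zip(*machine[state])
        i = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        out.append(symbols[i])
        state = nexts[i]
    return "".join(out)

seq = generate(transitions, "A", 30)
print(seq)                # e.g. "010010100..."
assert "11" not in seq    # the structural constraint the states enforce
```

The point of the example is that the states are not arbitrary: they are the minimal statistics of the past needed to predict the future, which is what makes this representation the (optimal) mechanical implementation rather than just some state machine that happens to generate the process.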