Cover and Thomas ("Elements of Information Theory") is a good, readable book on info theory.
My favorite neat info theory puzzle asks how you can force a source of biased coin flips to give you unbiased coin flips instead, even if you don’t know the bias of the coin. (The glib answer is to zip the bits you get. A more precise answer is that this is what universal source codes do: if the bit sequence is long enough, they will compress to entropy, regardless of what distribution generated the bits. It’s actually kind of surprising and weird that universal source codes are a thing.)
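A rough sketch of the "just zip it" answer, in case it helps to see numbers. Everything here is my own illustration: the helper names (`biased_bits`, `pack_bits`, `binary_entropy`, `compressed_bits_per_flip`), the bias of 0.1, and the choice of zlib as a stand-in for a universal code are all assumptions. DEFLATE is only LZ77 plus Huffman coding, and zlib adds a header and checksum, so expect the rate to land somewhat above the entropy rather than exactly on it, and don't treat the output as perfectly unbiased bits.

```python
import math
import random
import zlib


def biased_bits(n, p, seed=0):
    """Return n coin flips (0/1) where P(1) = p, from a seeded PRNG."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, 8 flips per byte, MSB first."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def binary_entropy(p):
    """Entropy in bits of a coin with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def compressed_bits_per_flip(n, p, seed=0):
    """Compressed output size in bits, divided by the number of flips."""
    raw = pack_bits(biased_bits(n, p, seed))
    return 8 * len(zlib.compress(raw, 9)) / n


if __name__ == "__main__":
    n, p = 80_000, 0.1
    print("entropy per flip:   ", binary_entropy(p))
    print("zlib bits per flip: ", compressed_bits_per_flip(n, p))
```

Note that the compressor is never told `p`. It still gets well under one output bit per flip, which is the "universal" part of the story, even if a general-purpose zip tool is a fairly crude instance of it.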
Info theorists reinvented causality in the 90s (the academic coordination problem is hard...):
http://arxiv.org/pdf/1110.0718v1.pdf
(and lots of papers on channels with feedback).
Ain’t that the truth, brother.