EDIT: Welcome to LessWrong. You may want introduce yourself in the welcome thread
Especially since the value of the entropy depends on the units (i.e. what logarithm base you use). Is the use of base two related to the fact that the answers can be either ‘yes’ or ‘no’ (binary)?
Precisely.
{3/8,2/8,2/8,1/8} … 1.91 bits … 2 questions.
Am I not looking hard enough?
No, there really aren’t. A simpler example is just taking two options, with unequal probabilities (take {1/8, 7⁄8} for concreteness). Again, you have to ask one question even though there is less than one bit (0.54 bits if I did the math right). However, if you have many copies of this system, you can ask questions about the entire subset that can (on average over all possible states of the entire system) describe the entire system with less than one question per system, and in the limit of an infinite number of subsystems, this approaches the entropy.
E.g, for two systems, labeling the first state as 0, and the second as 1, you have the following tree:
Are they both 1? a. Yes: stop, state is (1,1) and 1 question asked with probability 49⁄64 = 49⁄64. b. No: continue
Is the first one 1? a: Yes: stop (1, 0), 2 questions asked with probability 7⁄64 b: no: continue
Is the second one in 1? a: Yes: stop (0,1), 3 questions asked with probability 7⁄64 b: No: stop (0,0), 3 questions asked with probability 1⁄64
Total average questions = 49 + 2*7 + 3*8 = 87⁄64 < 2, but greater than 2*0.54.
EDIT: Welcome to LessWrong. You may want introduce yourself in the welcome thread
Precisely.
No, there really aren’t. A simpler example is just taking two options, with unequal probabilities (take {1/8, 7⁄8} for concreteness). Again, you have to ask one question even though there is less than one bit (0.54 bits if I did the math right). However, if you have many copies of this system, you can ask questions about the entire subset that can (on average over all possible states of the entire system) describe the entire system with less than one question per system, and in the limit of an infinite number of subsystems, this approaches the entropy.
E.g, for two systems, labeling the first state as 0, and the second as 1, you have the following tree:
Are they both 1?
a. Yes: stop, state is (1,1) and 1 question asked with probability 49⁄64 = 49⁄64.
b. No: continue
Is the first one 1?
a: Yes: stop (1, 0), 2 questions asked with probability 7⁄64
b: no: continue
Is the second one in 1?
a: Yes: stop (0,1), 3 questions asked with probability 7⁄64
b: No: stop (0,0), 3 questions asked with probability 1⁄64
Total average questions = 49 + 2*7 + 3*8 = 87⁄64 < 2, but greater than 2*0.54.