And it brings to mind decision trees, which are essentially an automated way of playing Twenty Questions with the universe. To avoid over-fitting your training data, once you've constructed a complete decision tree, you go back and prune it, removing questions that fall below a certain threshold of usefulness.
The usual way to do this is to look at the expected reduction in entropy (the information gain) from asking a particular question. If it doesn't reduce the entropy much, don't bother asking. If you know that an animal is a bird, you don't gain much by asking "Is it an Emperor penguin?". You would reduce the entropy in your pool of possible birds more by asking whether it's a songbird, or whether its average adult wingspan is more than 10 cm.
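Here's a minimal sketch of that criterion in Python. This isn't anyone's actual implementation, and the bird pool, feature names, and numbers below are all invented for illustration; the point is just that information gain is the parent pool's entropy minus the weighted entropy of the child pools a question creates:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a pool of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(pool, question):
    """Expected entropy reduction from splitting `pool` on a yes/no question.

    `pool` is a list of (features, label) pairs; `question` is a predicate
    over the features.
    """
    yes = [label for feats, label in pool if question(feats)]
    no = [label for feats, label in pool if not question(feats)]
    if not yes or not no:
        return 0.0  # the question doesn't actually split the pool
    parent = entropy([label for _, label in pool])
    children = (len(yes) / len(pool)) * entropy(yes) + (len(no) / len(pool)) * entropy(no)
    return parent - children

# Invented toy pool of birds: (features, species) pairs.
birds = [
    ({"songbird": True,  "wingspan_cm": 25},  "sparrow"),
    ({"songbird": True,  "wingspan_cm": 30},  "robin"),
    ({"songbird": False, "wingspan_cm": 200}, "emperor penguin"),
    ({"songbird": False, "wingspan_cm": 180}, "albatross"),
]

# A question that isolates a single bird gains less than one that splits
# the pool evenly, and the gap widens as the pool grows.
print(information_gain(birds, lambda f: f["wingspan_cm"] > 190))  # ~0.81 bits
print(information_gain(birds, lambda f: f["songbird"]))           # 1.0 bits
```

Pruning then amounts to walking back over the finished tree and cutting any subtree whose question scores below your chosen gain threshold.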
SarahC’s quote is not only clever, but also supported by solid math and practical application.
Very quotable