PhilGoetz comments on Open Thread: March 2010, part 2

PhilGoetz 21 Mar 2010 20:29 UTC
0 points

A bin is most informative if the statistics of the bin have the least entropy.

That’s a good idea.

A natural measure of the entropy is just -p log p - (1 - p) log (1 - p), where p is the revealed frequency, but it’s not the right one.

I’m glad you said that, since that was what I immediately thought of doing. I’ll read up on the beta distribution, thanks!
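
For concreteness, here is a rough sketch of why the plain entropy of the revealed frequency might fall short, under one reading of the objection: it ignores how many samples are in the bin. The function names, the uniform Beta(1, 1) prior, and the example counts are my own assumptions, not anything specified in the thread.

```python
import math

def binary_entropy(p):
    """Entropy in nats of a yes/no outcome with frequency p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # limit of -p log p as p -> 0 is 0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def beta_posterior_variance(successes, trials, prior_a=1.0, prior_b=1.0):
    """Variance of the Beta posterior over p after observing the bin.

    Assumes a Beta(prior_a, prior_b) prior; Beta(1, 1) is uniform.
    """
    a = prior_a + successes
    b = prior_b + trials - successes
    return a * b / ((a + b) ** 2 * (a + b + 1))

# Two hypothetical bins with the same revealed frequency of 0.5.
small_bin = (1, 2)      # 1 success out of 2
large_bin = (50, 100)   # 50 successes out of 100

for successes, trials in (small_bin, large_bin):
    p = successes / trials
    print(trials, binary_entropy(p), beta_posterior_variance(successes, trials))
```

Both bins get the same entropy from the revealed frequency, but the posterior over p is much narrower for the larger bin, which is the kind of difference the beta distribution can capture.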
- wnoise 22 Mar 2010 15:41 UTC
  2 points
  I still think it’s not a great choice, though clearly my other choices haven’t worked well. But please do try it.
  
  Given that the probability has a continuous distribution, the Fisher information might instead be a reasonable thing to look at. For a single distribution, maximizing it corresponds to minimizing the variance, so my suggestion for that wasn’t as ad hoc as I thought. I’m not sure the equivalence holds for multiple distributions.
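
A small sketch of the single-distribution case, assuming the distribution in question is a Bernoulli outcome with parameter p (the thread does not pin this down, and the function names are made up for illustration). Under that assumption the Fisher information about p is the expected squared score, which works out to 1 / (p (1 - p)), the reciprocal of the per-observation variance, so maximizing one is the same as minimizing the other.

```python
def bernoulli_variance(p):
    """Variance of a single Bernoulli(p) observation."""
    return p * (1 - p)

def bernoulli_fisher_information(p):
    """Fisher information about p from one Bernoulli(p) observation.

    Computed as E[(d/dp log f(X; p))^2], valid for 0 < p < 1.
    """
    score_if_one = 1.0 / p           # d/dp log p
    score_if_zero = -1.0 / (1 - p)   # d/dp log (1 - p)
    return p * score_if_one ** 2 + (1 - p) * score_if_zero ** 2

for p in (0.1, 0.3, 0.5, 0.9):
    info = bernoulli_fisher_information(p)
    var = bernoulli_variance(p)
    # The product stays at 1 (up to rounding) for every p in (0, 1).
    print(p, info, var, info * var)
```

This only illustrates the one-distribution equivalence; it says nothing about whether it carries over to several bins at once.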