Barkley, priors aren’t meant to be detailed objective models of the world—that’s why they’re called “priors”. :)
A good prior learns from evidence, and the more probability mass it concentrates into sequences of the sort that are actually likely to occur, the faster it will learn. In a certain sense, the “optimal prior” is the one that learns so fast that it doesn’t need any evidence at all—but that’s not really what a “prior” is for. Even with an excellent prior, nearly all of the information will come from the environment.
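To make the "concentrates mass, learns faster" point concrete, here is a minimal sketch under assumptions that are mine, not part of the argument above: the environment is a coin, the priors are Beta distributions over its bias, and the observed sequence has 18 heads and 2 tails. The function name `surprise_bits` and the specific Beta parameters are purely illustrative. It compares the total surprise (in bits) of a maxent predictor that assigns one bit to every flip, a uniform prior over the bias, and a prior already concentrated on heads-heavy coins.

```python
from math import lgamma, log

def log_beta(a, b):
    # Natural log of the Beta function B(a, b).
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def surprise_bits(a, b, heads, tails):
    """Total surprise, in bits, that a Beta(a, b) prior over a coin's bias
    incurs on any particular sequence with the given head/tail counts."""
    # Probability of the whole sequence: B(a + heads, b + tails) / B(a, b).
    log_prob = log_beta(a + heads, b + tails) - log_beta(a, b)
    return -log_prob / log(2)

# Illustrative data (an assumption): a heads-heavy sequence of 20 flips.
heads, tails = 18, 2

# Maxent over sequences: every flip costs exactly one bit, so it never learns.
maxent_bits = float(heads + tails)
# Uniform prior over the bias, Beta(1, 1).
uniform_bits = surprise_bits(1.0, 1.0, heads, tails)
# Prior concentrated on heads-heavy coins, Beta(9, 1) (hypothetical choice).
focused_bits = surprise_bits(9.0, 1.0, heads, tails)

print(f"maxent:  {maxent_bits:.2f} bits")
print(f"uniform: {uniform_bits:.2f} bits")
print(f"focused: {focused_bits:.2f} bits")
```

On this input the concentrated prior pays the fewest bits, yet it still pays a substantial number of them: the details of the sequence come from the flips, not from the prior.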
Sense data is light; the prior is a camera. Most of the information is in the light, but you need a camera to develop it; a rock won’t do. A good camera needs less light to develop an accurate picture, but the detailed picture is still carried by the light’s message, not factory-preprinted inside the camera.
As for the Diaconis and Freedman paper, I haven’t read it, but kindly remember that I am an infinite set atheist. In any case, it is easy for poor priors to fail to learn, or even to anti-learn. Every prior that assigns more mass than maxent to “plausible” sequences does so by draining mass from “implausible” sequences. If reality falls into one of the “implausible” sequences, we will do worse than maximum entropy, anti-learn from experience, and not pass on our genes to a whole lot of offspring.
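The mass-draining point can be checked by brute force. This is a sketch under my own illustrative assumptions: coin flips of length 10, and a Beta(9, 1) prior standing in for a prior that treats heads-heavy sequences as "plausible". The helper `sequence_probability` is a hypothetical name. The code enumerates every sequence, confirms the probabilities still sum to one, and collects the sequences that receive less mass than the maxent value of 2^-10.

```python
from itertools import product

def sequence_probability(seq, a, b):
    """Probability that a Beta(a, b) prior over a coin's bias assigns to a
    specific 0/1 sequence (1 = heads), built up one prediction at a time."""
    prob, heads, tails = 1.0, 0, 0
    for flip in seq:
        # Posterior predictive probability of heads given the flips so far.
        p_heads = (a + heads) / (a + b + heads + tails)
        prob *= p_heads if flip == 1 else 1.0 - p_heads
        heads += flip
        tails += 1 - flip
    return prob

N = 10                      # sequence length (illustrative assumption)
maxent = 0.5 ** N           # maxent gives every sequence the same mass
sequences = list(product((0, 1), repeat=N))

# Beta(9, 1): a hypothetical prior that expects heads-heavy sequences.
probs = {seq: sequence_probability(seq, 9.0, 1.0) for seq in sequences}

total = sum(probs.values())                      # mass is conserved
losers = [s for s, p in probs.items() if p < maxent]
all_heads = probs[(1,) * N]
all_tails = probs[(0,) * N]

print(f"total mass: {total:.6f}")
print(f"sequences below maxent: {len(losers)} of {len(sequences)}")
print(f"all heads vs maxent: {all_heads:.3g} vs {maxent:.3g}")
print(f"all tails vs maxent: {all_tails:.3g} vs {maxent:.3g}")
```

Whatever the prior gains on the heads-heavy sequences it has to take from the others, so if reality delivers one of the sequences in `losers`, this prior scores worse than maxent would have.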