Finney, if you consider probability distributions over sequences, then, for example, a mixture of one-third the first distribution, one-third the second, and one-third the third is itself a coherent probability distribution over sequences. This mixture would serve as an inductive prior that could learn any of the three sequences, given only slightly more evidence to determine which one was actually generating the observations.
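To make that concrete, here is a minimal sketch in Python (my illustration, not part of the original exchange; the particular sequences, the 1/3 weights, and the `posterior` helper are all invented for the example). Three hypothesized binary sequences each get prior weight 1/3, and two observed bits already suffice to concentrate the posterior on the one consistent hypothesis:

```python
# Three hypothesized binary sequences, each given prior weight 1/3.
# Conditioning on an observed prefix zeroes out inconsistent hypotheses.
sequences = [
    [0, 0, 0, 0, 0, 0],  # hypothesis 1: all zeros
    [1, 1, 1, 1, 1, 1],  # hypothesis 2: all ones
    [0, 1, 0, 1, 0, 1],  # hypothesis 3: alternating
]
prior = [1 / 3, 1 / 3, 1 / 3]

def posterior(observed):
    """Bayesian update: a hypothesis keeps its weight iff it matches the data."""
    likelihoods = [
        1.0 if seq[: len(observed)] == observed else 0.0
        for seq in sequences
    ]
    unnorm = [p * lik for p, lik in zip(prior, likelihoods)]
    z = sum(unnorm)
    return [w / z for w in unnorm]

# Two observed bits pick out one sequence with certainty.
print(posterior([0, 1]))  # -> [0.0, 0.0, 1.0]: hypothesis 3 wins
```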
Annan, I’m making a more general point. (Obviously not so general as to encompass the ‘maximum-entropy methods’ of machine learning, which find the distribution that maximizes entropy subject to constraints; those are not literally maximum-entropy distributions.) Think of physical matter in a state of very high thermodynamic entropy, such as a black hole or a radiation bath. A heat bath doesn’t learn from observation, right? There isn’t enough order present to carry out the operations of observing or learning. Only highly ordered matter, like a brain, can extract information from its environment. A probability distribution at maximum entropy likewise lacks structure and does not update in any systematic direction: the marginal posteriors equal the marginal priors, so no amount of observation changes what it expects next. It can’t learn from experience; it doesn’t do induction.
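By contrast, here is the same kind of sketch for a literally maximum-entropy prior (again my own illustration, with an invented helper `predictive_next`): uniform over all 2^n binary sequences, the predictive probability of the next bit stays at 0.5 no matter what prefix has been observed, which is exactly the marginal posterior equaling the marginal prior:

```python
from itertools import product

n = 6
# The maximum-entropy prior over length-6 binary sequences: all 2**6
# sequences, each with equal weight.
hypotheses = list(product([0, 1], repeat=n))

def predictive_next(observed):
    """P(next bit = 1 | observed prefix) under the uniform (max-entropy) prior."""
    consistent = [h for h in hypotheses if list(h[: len(observed)]) == observed]
    ones = sum(1 for h in consistent if h[len(observed)] == 1)
    return ones / len(consistent)

print(predictive_next([]))            # 0.5 before seeing anything
print(predictive_next([1, 1, 1, 1]))  # still 0.5 after four ones in a row
```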