How does deciding one model is true give you more information? Did you mean “If a model allows you to make more predictions about future observations, then it is a priori less likely?”
How does deciding one model is true give you more information?
Let’s assume a strong version of Bayesianism, one which entails the maximum entropy principle. So our belief is the one with maximum entropy among those consistent with our prior information. If we now add the information that some model is true, this generally invalidates our previous belief, making the new maximum-entropy belief one of lower entropy. The reduction in entropy is the amount of information you gain by learning the model. In a way, this is a cost we pay for “narrowing” our belief.
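To make the entropy bookkeeping concrete, here is a minimal sketch (a toy discrete example of my own, with a made-up “model” saying only even outcomes occur; the particular numbers are arbitrary):

```python
# Toy illustration: the maximum-entropy belief over six outcomes, given no
# constraints, is uniform. Adding the (hypothetical) model "only even outcomes
# occur" forces a new, lower-entropy maximum-entropy belief. The entropy drop
# is the information, in bits, gained by accepting the model.
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in dist if p > 0)

outcomes = [1, 2, 3, 4, 5, 6]

# Maximum-entropy belief with no information: uniform over all outcomes.
prior = [1 / len(outcomes)] * len(outcomes)

# Maximum-entropy belief consistent with the model: uniform over even outcomes.
allowed = [x for x in outcomes if x % 2 == 0]
posterior = [1 / len(allowed) if x in allowed else 0.0 for x in outcomes]

info_gained = entropy(prior) - entropy(posterior)
print(f"prior entropy:     {entropy(prior):.3f} bits")      # ~2.585
print(f"posterior entropy: {entropy(posterior):.3f} bits")  # ~1.585
print(f"information gained by accepting the model: {info_gained:.3f} bits")  # 1.000
```

The one bit by which the entropy falls is exactly the amount of information the model hands you; that is the “narrowing” cost described above.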
The upside is that it tells us something useful about the future. Of course, not all information about the world is relevant to future observations. The part that doesn’t help control our anticipations is failing to pay rent, and should be evicted. The part that does inform us about the future may be useful enough to be worth the cost we pay in taking in the new information.
I’ll expand on all of this in my sequence on reinforcement learning.
At what point does the decision “This is true” diverge from the observation “There is very strong evidence for this”, other than in cases where the model is accepted as true despite a lack of strong evidence?
I’m not discussing the case where a model goes from unknown to known. How does deciding to believe a model give you more information than knowing what the model is and the reasons for it? To better model an actual agent, one could replace all of the knowledge about why the model is true with a value representing the strength of that supporting evidence.
How does deciding that things always fall down give you more information than observing things fall down?
I believe the idea was to ask “hypothetically, if I found out that this hypothesis was true, how much new information would that give me?”
You’ll have two or more hypotheses, and the one that would (hypothetically) give you the least amount of new information should be considered the “simplest” hypothesis (assuming a certain definition of “simplest”, and a certain definition of “information”).
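To illustrate with made-up numbers (my own toy example, taking “information” to be the entropy drop from the sketch earlier in the thread, and a uniform belief over eight outcomes):

```python
# Toy comparison: of two hypothetical hypotheses, the one that rules out fewer
# outcomes produces the smaller entropy drop when accepted, i.e. the least new
# information, and so would count as "simpler" on this reading.
from math import log2

def info_if_true(n_outcomes, n_allowed):
    """Bits gained when a hypothesis narrows a uniform belief over n_outcomes
    down to a uniform belief over n_allowed of them."""
    return log2(n_outcomes) - log2(n_allowed)

# Hypothesis A is compatible with 4 of 8 possible outcomes;
# hypothesis B pins down exactly 1 of the 8.
print(f"learning A gives {info_if_true(8, 4):.1f} bits")  # 1.0
print(f"learning B gives {info_if_true(8, 1):.1f} bits")  # 3.0
# Under the proposal above, A is the "simpler" hypothesis.
```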