If both hypotheses explain some set of data, I've usually been able to make a direct comparison, even in what look like tough cases, by following the information in the data: what sort of process generates it, and so on. Keeping things in the "language" of the data is also justified by the idea that pulling information from nowhere is bad.
This sort of reliance on our observations is certainly an empiricist assumption, but I don’t think a reductionist one.
Consider the following problem. You know that there is some property that some integers have and others don't, and you are trying to figure out what the property is. After testing every integer under 10^4, you find that 1229 integers under 10^4 have it. You have two hypotheses that describe these. One is that they are exactly the prime numbers. The other is given by a degree-1228 polynomial where P(n) gives the nth number in your set. One of these is clearly simpler. This isn't just a language issue: if I tried to write these out in any reasonable equivalent of a Turing machine or a programming language, one of them would be a much shorter program. The distinction here, however, is not just that one of them makes up information. One is genuinely shorter.
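To make the program-length point concrete, here is a minimal sketch in Python (my own illustration, not part of the original problem): a toy version using the first dozen primes instead of all 1229 values, since exact interpolation over 1229 points is expensive. The primality test stays a fixed few lines no matter how much data arrives, while the interpolating-polynomial hypothesis has to carry one exact coefficient per data point.

```python
# Sketch: compare the two hypotheses as programs, standard library only.
from fractions import Fraction


def is_prime(k):
    """Hypothesis 1: a fixed membership test, however much data we see."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def interpolating_coefficients(ys):
    """Hypothesis 2: exact coefficients of the unique degree-(m-1) polynomial
    with P(i) = ys[i-1]. Its description grows with every data point."""
    m = len(ys)
    coeffs = [Fraction(0)] * m
    for i in range(1, m + 1):
        # Lagrange basis polynomial L_i(n) = prod_{j != i} (n - j) / (i - j)
        basis = [Fraction(1)]
        denom = 1
        for j in range(1, m + 1):
            if j == i:
                continue
            # multiply the basis polynomial by (n - j)
            new = [Fraction(0)] * (len(basis) + 1)
            for k, c in enumerate(basis):
                new[k] += c * (-j)
                new[k + 1] += c
            basis = new
            denom *= (i - j)
        for k, c in enumerate(basis):
            coeffs[k] += Fraction(ys[i - 1], denom) * c
    return coeffs


data = [p for p in range(2, 40) if is_prime(p)]  # first 12 primes, as a stand-in
coeffs = interpolating_coefficients(data)

# Both hypotheses reproduce the observed values exactly...
P = lambda n: sum(c * n**k for k, c in enumerate(coeffs))
assert all(P(i) == y for i, y in enumerate(data, start=1))

# ...but hypothesis 2 needs one (large, exact) coefficient per data point,
# while hypothesis 1 is the same handful of lines regardless of the data.
print(len(data), "data points ->", len(coeffs), "polynomial coefficients")
```

Scaling the same construction up to all 1229 primes under 10^4 only makes the gap starker: the primality test is unchanged, while the polynomial needs 1229 coefficients.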
If one wants, we can give similar historical examples. In 1620 you could make a Copernican model of the solar system that would rival Kepler's model in accuracy, but you would need a massive number of epicycles. The problem here doesn't seem to be pulling information from nowhere. The problem seems to be that one of the hypotheses is simpler in a different way.
Both of these examples do have something in common: in each case the complicated hypothesis has a lot of parameters whose values have to be fixed by observation, whereas the simpler one has many fewer of them. But that seems to be a distinct issue (although it is possibly a good, very rough way of measuring the complexity of hypotheses).
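If one wanted to use that as a rough proxy, the bookkeeping is trivial; here is a tiny sketch (the counts are just the ones from the integer example above, and this is not offered as a serious model-selection criterion):

```python
# Rough complexity proxy: how many parameters must the observations pin down?
observationally_fixed_params = {
    "every prime below 10^4": 0,                    # nothing fitted to the data
    "degree-1228 interpolating polynomial": 1229,   # one coefficient per data point
}

for hypothesis, k in sorted(observationally_fixed_params.items(), key=lambda kv: kv[1]):
    print(f"{hypothesis}: {k} observationally fixed parameters")
```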