Consider the following problem. You know that there is some property that some integers have and others don’t, and you are trying to figure out what the property is. After testing every integer under 10^4, you find that 1229 of them have it. You have two hypotheses that describe these. One is that they are exactly the prime numbers. The other is given by a degree-1228 polynomial P, where P(n) gives the nth number in your set. One of these is clearly simpler. This isn’t just a language issue: if I tried to write these out in any reasonable equivalent of a Turing machine or programming language, one of them would be a much shorter program. The distinction here, however, is not that one of them is making up information. One is genuinely shorter.
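To make the length comparison concrete, here is a rough sketch in Python. The function names (`primes_below`, `lagrange_eval`) are made up for illustration, and counting fitted numbers is only a crude stand-in for program length. The prime hypothesis is the handful of lines in the sieve. The polynomial hypothesis needs one fitted number per observed value, so it is shown here only on the first eight values, where the degree is 7.

```python
from fractions import Fraction

LIMIT = 10**4


def primes_below(limit):
    """Sieve of Eratosthenes: the whole 'prime' hypothesis fits in these few lines."""
    is_prime = [True] * limit
    is_prime[0:2] = [False, False]  # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit, i):
                is_prime[j] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def lagrange_eval(values, x):
    """Evaluate at x the unique polynomial P of degree len(values) - 1
    satisfying P(k) = values[k - 1] for k = 1..len(values).
    Exact rational arithmetic avoids floating-point error."""
    n = len(values)
    total = Fraction(0)
    for i in range(1, n + 1):
        term = Fraction(values[i - 1])
        for j in range(1, n + 1):
            if j != i:
                term *= Fraction(x - j, i - j)
        total += term
    return total


observed = primes_below(LIMIT)

# Hypothesis 1: no numbers fitted to the data, just the short sieve above.
# Hypothesis 2: a polynomial through all the data needs one coefficient per point.
polynomial_parameter_count = len(observed)
polynomial_degree = polynomial_parameter_count - 1

# Small-scale illustration: a degree-7 polynomial through the first 8 values
# reproduces them exactly, but only because all 8 values were fed into it.
sample = observed[:8]
reproduced = [lagrange_eval(sample, k) for k in range(1, len(sample) + 1)]

print(len(observed), polynomial_degree)
print(reproduced == sample)
```

The sieve stays the same length if the bound is raised, while the polynomial gains a coefficient for every new value it has to match.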
If one wants, we can give similar historical examples. In 1620 you could make a Copernican model of the solar system that would rival Kepler’s model in accuracy, but you would need a massive number of epicycles. The problem here doesn’t seem to be pulling information from nowhere. The problem seems to be that one of the hypotheses is simpler in a different way.
Both of these examples do have something in common: in each case the complicated hypothesis has a lot of parameters that are observationally dependent, that is, set by fitting the observations, whereas the other hypothesis has many fewer. But that seems to be a distinct issue (although it is possibly a good, very rough way of measuring the complexity of hypotheses).