Frequentists reject the very concept of “the probability of the theory given the data.” They take probabilities to be objective, so they think it a category error to speak of the probability of a theory.
Then they should also reject the very concept of “the probability of the data given the theory”, since that quantity has “the probability of the theory” explicitly in the denominator.
You are reading “the probability of the data D given the theory T” to mean p(D | T), which in turn is short for a ratio p(D & T)/p(T) of probabilities with respect to some universal prior p. But, for the frequentist, there is no universal prior p being invoked.
Rather, each theory comes with its own probability distribution p_T over data, and “the probability of the data D given the theory T” just means p_T(D). The different distributions provided by different theories don’t have any relationship with one another. In particular, the different distributions are not the result of conditioning on a common prior. They are incommensurable, so to speak.
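To make the distinction concrete, here is a minimal sketch, assuming two hypothetical coin theories ("fair" with bias 0.5 and "biased" with bias 0.7) and made-up data of 7 heads in 10 flips. None of these names or numbers come from the discussion above; they are only illustrative. The point is that each p_T(D) is computed from the theory's own distribution, and no prior p(T) over theories is ever defined or divided by.

```python
from math import comb


def likelihood(heads: int, flips: int, bias: float) -> float:
    """p_T(D): probability of seeing `heads` heads in `flips` flips
    under a theory T that says the coin lands heads with probability `bias`."""
    return comb(flips, heads) * bias**heads * (1 - bias) ** (flips - heads)


# Hypothetical theories: each one is nothing more than a distribution over data.
theories = {"fair": 0.5, "biased": 0.7}

# Hypothetical data D.
heads, flips = 7, 10

# One number per theory. Note that no p(T) appears anywhere in this computation.
likelihoods = {
    name: likelihood(heads, flips, bias) for name, bias in theories.items()
}

for name, value in likelihoods.items():
    print(f"p_{name}(D) = {value:.4f}")
```

Each theory's numbers sum to 1 over all possible data sets, but the two likelihoods need not sum to anything in particular across theories, which is one way to see that they are not slices of a single joint distribution.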
The different theories are just more or less correct. There is a “true” probability of the data, which describes the objective propensity of reality to yield those data. The different distributions from the different theories are comparable only in the sense that they each get that true distribution more or less right.
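As a rough illustration of "more or less right", the sketch below compares each theory's distribution to a stipulated true one. Everything here is an assumption for the sake of the example: the true bias of 0.65, the two theories, and especially the choice of total variation distance as the yardstick, since the comment above does not say how closeness to the true distribution should be measured.

```python
from math import comb


def binomial_pmf(flips: int, bias: float) -> list[float]:
    """Distribution over the number of heads in `flips` flips of a coin with the given bias."""
    return [
        comb(flips, k) * bias**k * (1 - bias) ** (flips - k)
        for k in range(flips + 1)
    ]


def total_variation(p: list[float], q: list[float]) -> float:
    """Total variation distance between two distributions on the same outcomes."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


FLIPS = 10
TRUE_BIAS = 0.65  # hypothetical "objective propensity"; unknown in practice
true_dist = binomial_pmf(FLIPS, TRUE_BIAS)

# Hypothetical theories, each supplying its own distribution over data.
theories = {"fair": 0.5, "biased": 0.7}

# The theories are compared only through their distance to the true distribution,
# never through a shared prior over theories.
errors = {
    name: total_variation(binomial_pmf(FLIPS, bias), true_dist)
    for name, bias in theories.items()
}

for name, err in errors.items():
    print(f"{name}: distance from true distribution = {err:.4f}")
```

Under these assumptions the "biased" theory comes out closer to the truth than the "fair" one, without any probability ever being assigned to either theory.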