Frequentists deny that one always has conditional probabilities of the form P(data | parameter), because in many cases they deny that the parameter can be treated as the value of a random variable.
If this is really what you mean, can you clarify it? Are you talking about going from P(data ; parameter) to P(data | parameter) by abuse of notation and then taking the conditioning seriously?
I’m not sure what you mean by “abuse of notation”. I don’t think P(data ; parameter) and P(data | parameter) are the same thing. The former is a member of a family of distributions indexed by parameter value, the latter is a conditional distribution. I do think that, from a Bayesian point of view, the former determines the latter.
As a Bayesian, you treat the parameter value m as the value of an unobserved random variable M. The observed data y is the value of a random variable Y. Your model, P(Y = y; m), can be used to straightforwardly derive the conditional distribution P(Y = y | M = m).
In conjunction with your prior distribution for M, this gives you the posterior probability of the parameter value being m.
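To make that step concrete, here is a minimal sketch in Python. The binomial model, the three candidate parameter values, and the uniform prior are all assumptions chosen for illustration; none of them come from the discussion itself. The point is only that the family P(Y = y; m) is reused as the conditional P(Y = y | M = m) and then combined with a prior over M via Bayes' rule.

```python
from math import comb


def likelihood(y, n, m):
    """P(Y = y; m): binomial probability of y successes in n trials at rate m.

    The binomial form is an illustrative assumption, not part of the original argument.
    """
    return comb(n, y) * m**y * (1 - m) ** (n - y)


def posterior(y, n, prior):
    """Return P(M = m | Y = y) for every candidate m in `prior` (dict: m -> P(M = m))."""
    # Bayesian step: read the model P(Y = y; m) as the conditional P(Y = y | M = m).
    joint = {m: likelihood(y, n, m) * p for m, p in prior.items()}
    evidence = sum(joint.values())  # P(Y = y), the normalising constant
    return {m: j / evidence for m, j in joint.items()}


# Hypothetical prior: three candidate parameter values, equally likely.
prior = {0.25: 1 / 3, 0.5: 1 / 3, 0.75: 1 / 3}
post = posterior(y=7, n=10, prior=prior)

for m, p in post.items():
    print(f"P(M = {m} | Y = 7) = {p:.4f}")
```

A frequentist can accept `likelihood` as it stands, since it is just a family of distributions indexed by m. What they may reject is `prior`, and with it the reading of m as a value of a random variable M.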
I’m not a statistician, so I might be using notation in an unorthodox manner here, but I don’t think there’s anything wrong with the content of what I said. Is there?
Frequentists deny that one always has conditional probabilities of the form P(data | parameter), because in many cases they deny that the parameter can be treated as the value of a random variable.
What cases? How does what you said follow from the view that probability is the limit of relative frequency over infinitely many trials?
This post doesn’t clarify that. I’m still not sure exactly what you mean, or on what basis you determined what ‘frequentists’ do: a survey of the literature? Some actual problem with interpreting probability as a limit over many trials?
Suppose I’m performing an experiment whose purpose is to estimate the value of some physical constant, say the fine structure constant. Can you make sense of assigning a probability distribution to this parameter from a frequentist perspective? The probability of the constant being in some range would presumably be the limit of the relative frequency of that range as the number of trials goes to infinity, but what could a “trial” possibly be in this case?
Let’s see how the Bayesians here propose to assign a probability distribution to something like that: Solomonoff induction and the ‘universal prior’. The trials are runs of random tapes on Turing machines (which you can carry out by considering all possible tapes). The logic that follows should be identical: as you ‘update your beliefs’ you select the states compatible with the evidence, as per the top post in that thread; mathematically, that is Bayes’ rule.
Not convinced that this issue is something specific to frequentism.
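A toy sketch of the "consider all tapes, keep the ones compatible with the evidence" picture, in Python. This is not Solomonoff induction proper: the `run_toy_machine` function is a hypothetical stand-in for a universal Turing machine (it just repeats the tape cyclically), the enumeration is truncated at `max_len`, and the 2^-length weighting is only an analogy to the universal prior.

```python
from itertools import product


def run_toy_machine(program, n):
    """Hypothetical stand-in for a Turing machine: emit the program's bits cyclically."""
    return tuple(program[i % len(program)] for i in range(n))


def predict_next(observed, max_len=8):
    """Probability that the next bit is 1, weighting each program by 2^-length."""
    observed = tuple(observed)
    n = len(observed)
    weight = {0: 0.0, 1: 0.0}
    for length in range(1, max_len + 1):
        for program in product((0, 1), repeat=length):
            output = run_toy_machine(program, n + 1)
            # "Updating" here means discarding every program that contradicts the evidence.
            if output[:n] == observed:
                weight[output[n]] += 2.0 ** (-length)
    total = weight[0] + weight[1]
    if total == 0:
        raise ValueError("no program up to max_len reproduces the observations")
    return weight[1] / total


print(predict_next([1, 0, 1, 0, 1, 0]))  # short periodic programs dominate
```

Each "trial" is one tape, and the relative weight of the surviving tapes plays the role of the posterior, which is why the bookkeeping looks the same under either interpretation.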