I think I’m mostly confused about how both Daniel and Adria are using the terms bayesian and frequentist. Like, I thought the difference between frequentist and bayesian interpretations of probability theory is that bayesian interpretations say the probability is in your head, while frequentist interpretations say the probability is in the world.
In that sense, showing that the kinds of methods motivated by frequentist considerations can give you insight into an algorithm’s usefulness is maybe a little bit of evidence that probabilities actually exist in some objective sense. But it doesn’t seem to trump the “but that just sounds really absurd to me though” consideration.
In particular, logical induction and boundedly rational inductive agents were given as examples of frequentist methods by Daniel. The first, at least, seems pretty subjectivist to me: wouldn’t a frequentist think that logical statements, being maximally deterministic, should have only probabilities of 1 or 0? Every time I type 1+1 into my calculator I always get 2! The second seems relatively unrelated to the question, though I know less about it.
First, “probability is in the world” is an oversimplification. Quoting from Wikipedia, “probabilities are discussed only when dealing with well-defined random experiments”. Since most things in the world are not well-defined random experiments, probability becomes a theoretical tool for analysis, one that works when real processes are similar enough to well-defined random experiments.
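To make the frequentist reading concrete, here’s a minimal sketch (my illustration, not from the discussion): under this interpretation, “P(heads) = 0.5” is a claim about the long-run frequency of heads in a well-defined random experiment, not about anyone’s beliefs.

```python
import random

random.seed(0)

# A well-defined random experiment: repeatedly flipping a fair coin.
# The frequentist "probability" is the limiting relative frequency.
flips = [random.random() < 0.5 for _ in range(100_000)]
freq = sum(flips) / len(flips)
print(f"observed frequency of heads: {freq:.3f}")  # close to 0.5
```

The point of the oversimplification complaint above is that very few real-world processes come packaged as cleanly as this repeatable experiment.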
it doesn’t seem to trump the “but that just sounds really absurd to me though” consideration
Is there anything that could trump that consideration? One of my main objections to Bayesianism is that it prescribes that an ideal agent’s beliefs must be probability distributions, which sounds even more absurd to me.
first at least seems pretty subjectivist to me,
Estimators in frequentism have ‘subjective beliefs’, in the sense that their output/recommendations depend on the evidence they’ve seen (i.e., the particular sample that’s fed into them). The objectivity of frequentist methods is aspirational: the ‘goodness’ of an estimator is judged by how well it performs in all possible worlds. (Often the estimator which is best in the least convenient world is preferred, but sometimes that estimator isn’t known or doesn’t exist. Different estimators will be better in some worlds than others, and tough choices must be made, for which the theory mostly just gives up. See e.g. “Evaluating estimators”, Section 7.3 of Statistical Inference by Casella and Berger.)
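As a sketch of what “good in all possible worlds” means (my example, borrowed from the textbook setting rather than the discussion above): for estimating a coin’s bias p from n flips, the sample mean has risk that varies with the true world p, while the classic constant-risk (minimax) estimator (X + √n/2)/(n + √n) has the same risk in every world. Neither dominates the other: the sample mean wins when p is near 0 or 1, the minimax estimator wins when p is near 1/2.

```python
import math

n = 25  # sample size (number of coin flips)

def mse_sample_mean(p: float) -> float:
    # Risk (mean squared error) of the unbiased estimator X/n:
    # its variance p(1-p)/n, which depends on the true world p.
    return p * (1 - p) / n

def mse_minimax(p: float) -> float:
    # Risk of (X + sqrt(n)/2) / (n + sqrt(n)): constant in p,
    # which is what makes it minimax for this problem.
    return n / (4 * (n + math.sqrt(n)) ** 2)

for p in (0.1, 0.5, 0.9):
    print(f"p={p}: sample-mean risk {mse_sample_mean(p):.4f}, "
          f"minimax risk {mse_minimax(p):.4f}")
```

This is the “tough choices” situation mentioned above: the theory tells you each estimator’s risk in each world, but picking between them requires an extra criterion (worst case, average case, etc.).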
wouldn’t a frequentist think the probability of logical statements, being the most deterministic system, should have only 1 or 0 probabilities?
Indeed, in reality logical statements are either true or false, and thus their probabilities are either 1 or 0. But the estimator-algorithm is free to assign whatever belief it wants to them.
I agree that logical induction is very much Bayesianism-inspired, precisely because it wants to assign weights from zero to 1 that are as self-consistent as possible (i.e. basically probabilities) to statements. But it is frequentist in the sense that it’s examining “unconditional” properties of the algorithm, as opposed to properties assuming the prior distribution is true. (It can’t do the latter because, as you point out, the prior probability of logical statements is just 0 or 1).
But also, assigning probabilities between 0 and 1 to things is not exclusively a Bayesian move. You could think of a predictor that outputs numbers between 0 and 1 as an estimator of whether a statement will be true or false. To evaluate this estimator you could choose, say, mean squared error: the best estimator is the one with the least MSE. And indeed, that’s how probabilistic forecasts are typically evaluated.
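Concretely, MSE against 0/1 outcomes is the Brier score. A minimal sketch (the forecasts and outcomes below are made-up illustrative numbers):

```python
# Evaluate a probabilistic forecaster as an estimator of 0/1 outcomes,
# using mean squared error (the Brier score). Lower is better.
forecasts = [0.9, 0.8, 0.3, 0.6, 0.1]  # hypothetical forecast probabilities
outcomes  = [1,   1,   0,   1,   0]    # what actually happened (1 = true)

brier = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)
print(f"Brier score (MSE): {brier:.3f}")  # → 0.062
```

Note that nothing here requires the forecaster to be internally Bayesian; the score only looks at outputs versus realized outcomes.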
Daniel states he considers these frequentist because:
I call logical induction and boundedly rational inductive agents ‘frequentist’ because they fall into the family of “have a ton of ‘experts’ and play them off against each other” (and crucially, don’t constrain those experts to be ‘rational’ according to some a priori theory of good reasoning).
and I think indeed not prescribing that things must think in probabilities is more of a frequentist thing. I’m not sure I’d call them decidedly frequentist (logical induction is very much a different beast than classical statistics) but they’re not in the other camp either.
One of my main objections to Bayesianism is that it prescribes that ideal agent’s beliefs must be probability distributions, which sounds even more absurd to me.
From one viewpoint, I think this objection is satisfactorily answered by Cox’s theorem—do you find it unsatisfactory (and if so, why)?
Let me focus on another angle though, namely the “absurdity” and gut level feelings of probabilities.
So, my gut feels quite good about probabilities. Like, I am uncertain about various things (read: basically everything), but this uncertainty comes in degrees: I can compare and possibly even quantify my uncertainties. I feel like some people get stuck on the numeric-probabilities part (one example I recently ran into was this quote from Section III of this essay by Scott: “Does anyone actually consistently use numerical probabilities in everyday situations of uncertainty?”). Not sure if this is relevant here, but at the risk of going off on a tangent, here’s a way of thinking about probabilities I’ve found clarifying and which I haven’t seen elsewhere:
The correspondence
beliefs <-> probabilities
is of the same type as
temperature <-> Celsius-degrees.
Like, people have feelings of warmth and temperature. These come in degrees: sometimes it’s hotter than some other times, now it is a lot warmer than yesterday and so on. And sure, people don’t have a built-in thermometer mapping these feelings to Celsius-degrees, they don’t naturally think of temperature in numeric degrees, they frequently make errors in translating between intuitive feelings and quantitative formulations (though less so with more experience). Heck, the Celsius scale is only a few hundred years old! Still, Celsius degrees feel like the correct way of thinking about temperature.
And the same with beliefs and uncertainty. These come in degrees: sometimes you are more confident than some other times, now you are way more confident than yesterday and so on. And sure, people don’t have a built-in probabilitymeter mapping these feelings to percentages, they don’t naturally think of confidence in numeric degrees, they frequently make errors in translating between intuitive feelings and quantitative formulations (though less so with more experience). Heck, the probability scale is only a few hundred years old! Still, probabilities feel like the correct way of thinking about uncertainty.
From this perspective probabilities feel completely natural to me—or at least as natural as Celsius-degrees feel. Especially questions like “does anyone actually consistently use numerical probabilities in everyday situations of uncertainty?” seem to miss the point, in the same way that “does anyone actually consistently use numerical degrees in everyday situations of temperature?” would miss the point of the Celsius scale. And I have no gut-level objections to the claim that an ideal agent’s beliefs correspond to probabilities.