“No single prior seems to accurately represent our actual state of knowledge/ignorance” is a really ridiculously strong claim, and one which should be provable/disprovable by starting from some qualitative observations about the state of knowledge/ignorance in question. But I’ve never seen someone advocate for imprecise probabilities by actually making that case.
Let me illustrate a bit how I imagine this would go, and how strong a case would need to be made.
Let’s take the simple example of a biased coin with unknown bias. A strawman imprecise-probabilist might argue something like: “If the coin has probability p of landing heads, then after n flips (for some large-ish n) I expect to see roughly pn (plus or minus O(√n)) heads. But for any particular number p, that’s not actually what I expect a priori, because I don’t know which p is right—e.g. I don’t actually confidently expect to see roughly 0.3n ± O(√n) heads a priori. Therefore no distribution can represent my state of knowledge.”
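As a quick sanity check on the strawman’s premise (my own sketch, not from the thread): for a fixed bias p, the heads count in n IID flips does concentrate around pn with spread on the order of √n.

```python
# Illustration (assumed setup, not from the thread): the heads count for
# n IID flips of a coin with fixed bias p concentrates around p*n,
# with standard deviation sqrt(n*p*(1-p)) = O(sqrt(n)).
import numpy as np

rng = np.random.default_rng(0)
p, n = 0.3, 10_000
heads = rng.binomial(n, p, size=100_000)  # many simulated runs of n flips

print(round(heads.mean()))  # ≈ p*n = 3000
print(round(heads.std()))   # ≈ sqrt(n*p*(1-p)) ≈ 46
```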
… and then the obvious Bayesian response would be: “Sure, if you’re artificially restricting your space of distributions/probabilistic models to IID distributions of coin flips. But our actual prior is not in that space; our actual prior involves a latent variable (the bias), and the coin flips are not independent if we don’t know the bias (since seeing one outcome tells us something about the bias, which in turn tells us something about the other coin flips). We can represent our prior state of knowledge in this problem just fine with a distribution over the bias.”
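The non-independence point can be checked directly. In the sketch below (my own illustration; a uniform prior on the bias is just one convenient choice), seeing a head on the first flip raises the probability of a head on the second from 1/2 to 2/3, precisely because the first flip carries information about the latent bias.

```python
# Sketch: under a uniform prior on the bias, the two flips are
# exchangeable but NOT independent: P(H2) = 1/2, yet P(H2 | H1) = 2/3.
import numpy as np

rng = np.random.default_rng(0)
trials = 400_000
p = rng.uniform(0, 1, trials)    # latent variable: unknown bias
flip1 = rng.random(trials) < p   # first flip of each simulated coin
flip2 = rng.random(trials) < p   # second flip of the same coin

print(flip2.mean())              # ≈ 0.5   (marginal probability of heads)
print(flip2[flip1].mean())       # ≈ 0.667 (conditional on first flip = heads)
```

Analytically: P(H2) = E[p] = 1/2, while P(H2 | H1) = E[p²]/E[p] = (1/3)/(1/2) = 2/3.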
Now, the imprecise probabilist could perhaps argue against that by pointing out some other properties of our state of knowledge, and then arguing that no distribution can represent our prior state of knowledge over all the coin flips, no matter how many latent variables we introduce. But that’s a much stronger claim, a much harder case to make, and I have no idea what properties of our state of knowledge one would even start from in order to argue for it. On the other hand, I do know of various sets of properties of our state of knowledge which are sufficient to conclude that it can be accurately represented by a single prior distribution—e.g. the preconditions of Cox’s Theorem, or the preconditions for the Dutch Book theorems (if our hypothetical agent is willing to make bets on its priors).
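To close the loop on the coin example (again my own sketch, not from the thread): under a uniform prior on the bias, the implied marginal distribution of the total heads count is uniform on {0, …, n} (Laplace’s classic result), so the single hierarchical prior really does encode “no confident expectation about the heads count,” which is exactly what the strawman claimed no distribution could do.

```python
# Sketch: marginalizing the bias out of the hierarchical prior gives a
# heads-count distribution that is uniform on {0, ..., n}: the single
# prior assigns no confident expectation to any particular count.
import numpy as np

rng = np.random.default_rng(0)
n, trials = 10, 500_000
p = rng.uniform(0, 1, trials)    # latent bias, one draw per simulated run
heads = rng.binomial(n, p)       # total heads in n flips of that coin
freqs = np.bincount(heads, minlength=n + 1) / trials

print(freqs.round(3))            # each entry ≈ 1/(n+1) ≈ 0.091
```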
What’s your prior that in 1000 years, an Earth-originating superintelligence will be aligned to object-level values close to those of humans alive today [for whatever operationalization of “object-level” or “close” you like]? And why do you think that prior uniquely accurately represents your state of knowledge? Seems to me like the view that a single prior does accurately represent your state of knowledge is the strong claim. I don’t see how the rest of your comment answers this.
(Maybe you have in mind a very different conception of “represent” or “state of knowledge” than I do.)
Right, so there’s room here for a burden-of-proof disagreement—i.e. you find it unlikely on priors that a single distribution can accurately capture realistic states-of-knowledge, I don’t find it unlikely on priors.
If we’ve arrived at a burden-of-proof disagreement, then I’d say that’s sufficient to back up my answer at top-of-thread:
both imprecise probabilities and maximality seem like ad-hoc, unmotivated methods which add complexity to Bayesian reasoning for no particularly compelling reason.
I said I don’t know of any compelling reason—i.e. positive argument, beyond just “this seems unlikely to Anthony and some other people on priors”—to add this extra piece to Bayesian reasoning. And indeed, I still don’t. Which does not mean that I necessarily expect you to be convinced that we don’t need that extra piece; I haven’t spelled out a positive argument here either.
It’s not that I “find it unlikely on priors” — I’m literally asking what your prior on the proposition I mentioned is, and why you endorse that prior. If you answered that, I could answer why I’m skeptical that that prior really is the unique representation of your state of knowledge. (It might well be the unique representation of the most-salient-to-you intuitions about the proposition, but that’s not your state of knowledge.) I don’t know what further positive argument you’re looking for.
Someone could fail to report a unique precise prior (one that’s also consistent with their other beliefs and priors across contexts) for any of the following reasons, which seem worth distinguishing:
1. There is no unique precise prior that can represent their state of knowledge.
2. There is a unique precise prior that represents their state of knowledge, but they don’t have or use it, even approximately.
3. There is a unique precise prior that represents their state of knowledge, but, in practice, they can only report (precise or imprecise) approximations of it. The approximations could differ not just in how many decimal places of a real number they compute, but also in which considerations go into the prior at all. Hypothetically, in the limit of resources spent on computing its values, the approximations would converge to this unique precise prior.
I’d be inclined to treat all three cases like imprecise probabilities, e.g. I wouldn’t permanently commit to a prior I wrote down to the exclusion of all other priors over the same events/possibilities.