It’s slow loading for me due to a slow internet connection, but if the questions at the end are included, I was the one who asked about insurance companies.
I don’t think his response was very satisfactory, though I have a better version of my question.
Suppose I give you some odds p:q and force you to bet on some proposition X (say, Democrats win in 2012) being true, but I let you pick which side of the bet you take; a payoff of p if X is true, or a payoff of q if X is false. For some (unique) value of p/q, you’ll switch which side you want to take.
It seems this can force you to assign probabilities to arbitrary hypotheses.
So, how precise should these probabilities be? And why can’t I apply this argument to force the probabilities to have arbitrarily high precision?
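(A minimal sketch, not from the thread, of why the switch point pins down a probability, assuming you simply maximize expected payoff: you prefer the “X is true” side exactly when P(X) · p ≥ (1 − P(X)) · q, so the odds at which you flip satisfy P(X) = q / (p + q). The numbers below are made up.)

```python
# Sketch: an expected-payoff maximizer's choice of side, given P(X) and the
# offered payoffs p (paid if X is true) and q (paid if X is false).
def preferred_side(prob_x, p, q):
    ev_true_side = prob_x * p            # take "X is true": win p with probability P(X)
    ev_false_side = (1 - prob_x) * q     # take "X is false": win q with probability 1 - P(X)
    return "bet on X" if ev_true_side >= ev_false_side else "bet against X"

# Indifference at prob_x * p == (1 - prob_x) * q, i.e. prob_x == q / (p + q).
# With P(X) = 0.7 the switch happens at odds p:q = 3:7, since 7 / (3 + 7) = 0.7.
print(preferred_side(0.7, 4.0, 7.0))   # "bet on X"       (0.7*4 = 2.8 > 0.3*7 = 2.1)
print(preferred_side(0.7, 2.0, 7.0))   # "bet against X"  (0.7*2 = 1.4 < 0.3*7 = 2.1)
```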
Not that I can think of, besides memory/speed constraints, and how much updating you can have done with the evidence you’ve received.
Why can’t it happen that you have so little and/or such weak evidence that the amount of precision you should have is none at all?
Imagine that you had to give a probability density to each probability estimate you could make of Obama winning in 2012, according to how likely each estimate is to be the correct one. You’d end up with something looking like a bell curve over probabilities, centered somewhere around “Obama has a 70% (or something) chance of winning.” Then to make a decision based on that distribution using normal decision theory, you would average over the possible results of an action, weighted by the probability. But this is equivalent to taking the mean of your bell curve: no matter how wide or narrow the bell curve, all that matters to your (standard decision theory) decision is the location of the mean.
Less evidence is like a wider bell curve, more evidence like a sharper one. But as long as the mean stays the same, the average result of each decision stays the same, so your decision will also be the same.
So there are two kinds of precision here: the precision of the mean probability given your current (incomplete) information, which can be arbitrarily high, and the precision with which you estimate the true answer, which is the width of the bell curve. So when you say “precision,” there is a possible confusion. Your first post asked “how precise can these probabilities be,” which was the first (and boring, since it’s so high) kind of precision, while this post seems to be talking about the second kind, the kind that is more useful because it reflects how much evidence you have.
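(A small numerical sketch of that last point; the Beta distributions and payoffs are invented for illustration. Two belief distributions over “the probability Obama wins” with the same mean but very different widths give the same expected payoff for any bet, and hence the same decision.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical belief distributions over "the probability that Obama wins",
# both with mean 0.7: Beta(7, 3) is wide (little evidence), Beta(70, 30) is narrow.
wide   = rng.beta(7, 3, size=200_000)
narrow = rng.beta(70, 30, size=200_000)

def expected_payoff(prob_samples, win=10.0, lose=-20.0):
    # Average the bet's value over the belief distribution. This collapses to
    # win * mean(p) + lose * (1 - mean(p)), so only the mean of the curve matters.
    return np.mean(prob_samples * win + (1 - prob_samples) * lose)

print(expected_payoff(wide), expected_payoff(narrow))   # both come out around 1.0
```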
I’m not sure what you mean by the “true answer”. After all, in some sense the true probability is either 0 or 1; it’s just that we don’t know which.
That’s a good point. So I guess the second kind of precision doesn’t make sense in this case (like it would if the bell curve were over, say, the number of beans in a jar), and “precision” should only refer to “precision with which we can extract an average probability from our information,” which is very high.
Bell curves prefer to live on unbounded intervals! It would be less jarring (and less convenient for you?) if he ended up with something looking like a uniform distribution over probabilities.
It’s equally convenient, since the mean doesn’t care about the shape. I don’t think it’s particularly jarring—just imagine it going to 0 at the edges.
The reason you’ll probably end up with something like a bell curve is a practical one—the central limit theorem. For complicated problems, you very often get what looks something like a bell curve. Hardly watertight, but I’d bet decent amounts of money that it is true in this case, so why not use it to add a little color to the description?
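(For what it’s worth, a toy illustration of that central-limit-theorem point; the model is invented. Summing many small, independent, non-Gaussian considerations produces something roughly bell-shaped.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Each simulated "estimate" is the sum of 50 independent uniform(-1, 1) factors;
# none of the pieces is Gaussian, but by the CLT the sums look like a bell curve.
sums = rng.uniform(-1, 1, size=(100_000, 50)).sum(axis=1)

counts, edges = np.histogram(sums, bins=15)
for count, left_edge in zip(counts, edges):
    print(f"{left_edge:6.1f} {'#' * (count // 500)}")   # crude text histogram of the bell
```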
Well, your prior gives you a unique value, and Bayes’ theorem is a function, so it gives you a unique value for every input.
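(A minimal sketch with hypothetical numbers: Bayes’ theorem is a deterministic function of the prior and the likelihoods, so a given piece of evidence maps your old belief to exactly one new belief.)

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior P(H | E) from the prior P(H) and the likelihoods P(E | H), P(E | not H)."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence

# Hypothetical numbers: prior 0.7 that X happens, and a piece of evidence that is
# twice as likely if X will happen as if it won't. The output is uniquely determined.
print(bayes_update(0.7, 0.8, 0.4))   # 0.8235...
```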
So the claim is that you have arbitrary precision priors. What are they, and where are they stored?
Sorry, I haven’t been very clear. A perfect Bayesian agent would have a unique real number to represent its level of belief in every hypothesis.
The betting-offer system I described above can force people (and force any hypothetical agent) to assign unique values.
Of course, an actual person won’t be capable of this level of precision or coherence.
Yes, but actually computing that function is computationally intractable in all but the simplest examples.
That does not require probabilities. You could also come up with an explanation of the value at which to switch.
In that case, we’re done. Standard probability theory/Cox’s theorem/de Finetti would give us a ready-made criticism of any conjecture that wasn’t isomorphic to probability theory, so we’d have isomorphism, which is all we need. Once we have functional equivalence, we can prove results in probability theory, apply Bayes’ theorem, etc., and then at the end translate back into Popperesque.
(Also, IIRC, Jaynes only claimed to have proven that rational reasoning must be isomorphic to probability theory)
I don’t quite get your point. You are saying that if you bring up betting (a real life scenario where probability is highly relevant), then given your explanations that help you come up with priors (background knowledge needed to be able to do any math about it), you shouldn’t act on those explanations in ways that violate math. OK, so what? Probability math is useful in some limited cases, given some explanatory knowledge to get set up. No one said otherwise.
Every decision is a bet.
I think you are beginning to get the point. :) The key missing fact here is that in fact the resulting math is highly constraining, to the point that if you actually follow it all the way you will be acting in a manner isomorphic to a Bayesian utility-maximizer.
But the background knowledge part is highly not-constraining (just given your math). When a math algorithm gives constrained output, but you have wide scope for choice of input, it’s not so good. You need to do stuff to constrain the inputs.
It seems to me you just dump all the hard parts of thinking into the priors and then say the rest follows. But the hard parts are still there. We still need to work out good explanations to use as input for the last step of not doing stuff that violates math/logic.