Two probabilities
Consider the following statements:
1. The result of this coin flip is heads.
2. There is life on Mars.
3. The millionth digit of pi is odd.
What is the probability of each statement?
A frequentist might say, “P1 = 0.5. P2 is either epsilon or 1-epsilon, we don’t know which. P3 is either 0 or 1, we don’t know which.”
A Bayesian might reply, “P1 = P2 = P3 = 0.5. By the way, there’s no such thing as a probability of exactly 0 or 1.”
Which is right? As with many such long-unresolved debates, the problem is that two different concepts are being labeled with the word ‘probability’. Let’s separate them and replace P with:
F = the fraction of possible worlds in which a statement is true. F can be exactly 0 or 1.
B = the Bayesian probability that a statement is true. B cannot be exactly 0 or 1.
Clearly there must be a relationship between the two concepts, or the confusion wouldn’t have arisen in the first place, and there is: apart from both obeying various laws of probability, in the case where we know F but don’t know which world we are in, B = F. That’s what’s going on in case 1. In the other cases, we know F != 0.5, but our ignorance of its actual value makes it reasonable to assign B = 0.5.
When does the difference matter?
Suppose I offer to bet my $200 that the millionth digit of pi is odd, versus your $100 that it’s even. With B3 = 0.5, that looks like a good bet from your viewpoint. But you also know F3 = either 0 or 1. You can also infer that I wouldn’t have offered that bet unless I knew F3 = 1, from which inference you are likely to update your B3 to more than 2⁄3, and decline.
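To make the 2⁄3 threshold concrete, here is a minimal sketch of the arithmetic behind the bet. The function names are made up for illustration; the stakes are the $200 and $100 from the example, and b_odd stands for your B3.

```python
from fractions import Fraction


def expected_gain_for_taker(b_odd, my_stake=200, your_stake=100):
    """Expected gain for the person betting 'even'.

    b_odd is that person's Bayesian probability (B3) that the digit is odd.
    They win my_stake if the digit is even and lose your_stake if it is odd.
    """
    b_even = 1 - b_odd
    return b_even * my_stake - b_odd * your_stake


def break_even_belief(my_stake=200, your_stake=100):
    """Value of B3 at which the bet has zero expected gain.

    Solves (1 - b) * my_stake - b * your_stake = 0 for b.
    """
    return Fraction(my_stake, my_stake + your_stake)


print(expected_gain_for_taker(Fraction(1, 2)))  # 50
print(break_even_belief())                      # 2/3
print(expected_gain_for_taker(Fraction(3, 4)))  # -25
```

At B3 = 0.5 the bet is worth an expected $50 to you. Once the offer itself pushes B3 past 2⁄3, the expected gain turns negative, which is why declining makes sense.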
On a larger scale, suppose we search Mars thoroughly enough to be confident there is no life there. Now we know F2 = epsilon. Our Bayesian estimate of the probability of life on Europa will also decline toward 0.
Once we understand F and B are different functions, there is no contradiction.
Why does the Bayesian say that the probability of there being life on Mars is 0.5?
We don’t. I’m not sure what’s up with that, unless it was a deliberately bad example.
I can’t imagine anyone assigning the event probability 0.5 just because it’s a Yes/No question. Does the probability drop to 1⁄3 if I added 1 more option to the question?
The person who assigns probability 1/k to all outcomes of any question with k options is NOT a Bayesian. That’s someone who has misunderstood Bayes’ rule and should re-read all of Eliezer’s posts.
“So roughly speaking, what are the chances the world is going to be destroyed? One in a million, one in a billion?”
“Well, the best we can say is about a 1 in 2 chance.”
http://www.thedailyshow.com/watch/thu-april-30-2009/large-hadron-collider (video, region blocked)
I seem to remember seeing the idea that “all possibilities equally likely” is sort of a “default prior”. In the case of life on Mars: Imagine getting all your information about life and Mars in little dribs and drabs, each one of which lets you update your probability of life on Mars. The place you start from (before you know stuff like what DNA is and whether Mars has an atmosphere) is 0.5.
We don’t know the answer, or even have data for an estimate, and ignorance translates to B = 0.5. Even if you feel you do have justification for saying e.g. B = 0.2 or 0.8 (I did say might in the original), B will have a much less extreme value than F.
edit: Of all the comments I’ve made on LW that I expected to get downvoted, this wasn’t one of them. Can the downvoters explain their reasons for disagreement?
edit 2: To clarify, I’m not claiming we have no data bearing on the question of whether there is life on Mars—of course we have. I’m claiming different chunks of data support different conclusions, nothing is anywhere near being conclusive, and after all of it is added up, the only reasonable and impartial conclusion at this time is “I don’t know”, for B somewhere in the neighborhood of 0.5.
(Downvoted.) No, it doesn’t: we know quite a bit about Mars and about life, surely enough to have some sort of prior probability before encountering specific data. More to the point, we’re never that ignorant. If you literally had no information about a topic, then you wouldn’t know enough to even phrase a question about it, or recognize an answer to such a question. By asking the question, you must have some notion in your mind of what you’re asking about, and it is from that notion that we must draw our prior probabilities, rather than arbitrarily picking 0.5. “Whereof we cannot speak, we must pass over in silence.”
So priors matter if you and I have already updated our beliefs on really divergent sets of data. But if you’re coming at a question from a place of total ignorance, assigning equal probabilities as your priors should work just fine. 0.5 sounds just fine to me. It doesn’t really matter, though, because as soon as we have any significant amount of evidence to update on, our beliefs will rapidly converge. My prior probability for life on Mars could be 99.99; as soon as my rover gets there and doesn’t find any life, that number drops dramatically. My prior could also be 0.01, but as soon as I learn about the geological indicators that suggest early Mars was much like early Earth, that is going to go up somewhat (and come down again when I show up and don’t see any life). Either way, the whole point is that it shouldn’t matter much at all what your priors are.
At least that is how it was explained to me, I could be totally off base.
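The convergence claim in the comment above can be checked numerically. What follows is only a sketch under stated assumptions: the likelihood ratios are invented for illustration (one mildly favorable geological finding followed by several fruitless rover searches), and the starting prior "99.99" is read as 0.9999, which is a guess about what was meant.

```python
def update(prior, likelihood_ratio):
    """One Bayes update in odds form: posterior odds = prior odds * LR."""
    odds = prior / (1 - prior) * likelihood_ratio
    return odds / (1 + odds)


# Hypothetical likelihood ratios P(evidence | life) / P(evidence | no life).
# These numbers are assumptions, not measurements.
evidence = [3.0] + [0.1] * 8

optimist, pessimist = 0.9999, 0.01  # two very different priors
for lr in evidence:
    optimist = update(optimist, lr)
    pessimist = update(pessimist, lr)

print(optimist, pessimist)
print(abs(optimist - pessimist))
```

With these made-up numbers both posteriors end up well below 0.001, so the two observers agree for practical purposes. One caveat the sketch also shows: the ratio between their odds never changes under shared evidence, so the convergence is in absolute probability, and it requires the evidence to be strong relative to the gap between the priors.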
What doesn’t make sense about rwallace’s position is just that he doesn’t seem to think we have any evidence to update on when it comes to life on Mars.
I’m not saying we have no data on Mars. I’m saying we have evidence one person reasonably believes is in favor of life on Mars, and evidence another person reasonably believes is against life on Mars; we even have knowledgeable scientists holding very strong opinions on either side of the issue. My conclusion is that when you add it all up, the net evidence doesn’t justify a position very far from 0.5, and to take a position like 0.01 or 0.99 is really an expression of personal bias.
Well, that’s an interesting conclusion, and maybe someone has written something somewhere demonstrating that the right posterior probability given our science is around 0.5. But you can hardly expect your reader to have any idea where that number is coming from. 0.5 sounds much too high to me, though what I know I basically know from general scientific knowledge and having done a science report on Mars in the 4th grade.
I agree that 99.99 or 00.01 seem much too extreme for estimates given our evidence, but they’d function perfectly fine as priors, was my point.
Sure we are. A priori, the probability that there’s something instead of nothing is .5.
What probabilities do you assign to the following propositions?
There is life on the northern hemisphere of Mars.
There is life on the southern hemisphere of Mars.
There is life on Mars.
Good example. I assign B = 0.5 in all three cases, but I expect the (unknown) value of F to be very similar (and close to 0 or 1) for all three, unlike in the case of three coin flips.
The above probability assessments are only coherent if you judge that proposition 1 has the same truth value as proposition 2; I don’t know how that could be justified.
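The coherence point can be spelled out with inclusion-exclusion. This is a sketch that assumes "life on Mars" means exactly "life on the northern hemisphere or life on the southern hemisphere"; the function name is hypothetical.

```python
from fractions import Fraction


def prob_life_on_exactly_one_hemisphere(p_north, p_south, p_either):
    """Probability that exactly one hemisphere has life.

    Uses P(N and S) = P(N) + P(S) - P(N or S), then adds the
    'north only' and 'south only' probabilities.
    """
    p_both = p_north + p_south - p_either
    return (p_north - p_both) + (p_south - p_both)


half = Fraction(1, 2)

# All three propositions at 0.5, as in the comment being replied to.
print(prob_life_on_exactly_one_hemisphere(half, half, half))  # 0

# Contrast: two independent fair coin flips, where P(N or S) = 3/4.
print(prob_life_on_exactly_one_hemisphere(half, half, Fraction(3, 4)))  # 1/2
```

Assigning 0.5 to all three propositions forces the probability of life on exactly one hemisphere to be 0, which is the same as saying the two hemisphere propositions always share a truth value.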
Now, without using probabilities of 0 or 1, can you coherently assign probabilities to
Sure. B(A) = B(D) = 0.5, B(B) = B(C) = epsilon. (The 0.5 is only good to one significant figure, and even that’s a stretch.)
Just how small is this epsilon? I might want to propose a bet.
If I had a number, I would’ve given the number instead of saying “epsilon” :) What’s your proposed bet?
I might bet on B or C against A or D at odds of epsilon to 1, to be settled when we have thoroughly explored Mars, assuming that if there is life, we will find it. This of course depends on the actual value of epsilon.
So basically you’re saying that the probability of there being life on only one of the hemispheres is arbitrarily small?
Mathematically nonzero, but small enough that we can treat it as zero for practical purposes, yes.
Can you link to some of the major chunks of data supporting life on Mars?
Even if ignorance were total, some kind of Occam’s-Razor prior would be applicable.
Put it this way: Of all the things that might have turned out to be controversial, I’m just surprised “we have no idea whether there is life on Mars” turned out to be the one; it still strikes me as fairly obviously reasonable. Oh well, surprises are part of the point of talking to people :-)
The problem may be that “we have no idea whether there is life on Mars” sounds a lot like “we have no evidence about the presence of life on Mars”, which is simply not the case. (Of course, “we have no idea whether there is life on Mars” was operationalized as a probability assessment of 0.5, which does not imply zero evidence, so I can’t be sure if my re-quotation is what you meant.)
Ah, indeed the paraphrased version would be incorrect, and wasn’t what I meant; perhaps I phrased my version badly. I’ve added a clarification to my earlier comment, does that help?
Yup.
Surely we have some data.
Right? I’ve seen lots of pictures of Mars and not identified any life in any of them!
If “possible world” means “any imaginable world history which is consistent with our knowledge”, F and B will probably collapse into one concept.
If it means “any world history which is consistent with logic, some specified set of physical laws and initial conditions given in one moment”, and if we expect the laws to be deterministic, then F1 would also be 0 or 1 and not 0.5.
I would like to see your definition of “possible world”.
The frequentist’s probability you refer to is called “physical probability”. See for example http://en.wikipedia.org/wiki/Probability_interpretations
Yes, that’s exactly what I’m talking about. Thanks for the link.
The millionth digit of pi is odd—see:
http://wiki.answers.com/Q/What_is_the_1_millionth_digit_for_pi
I’m not sure why the frequentist would put an epsilon in P2. Surely there is a fact of the matter about statement 2 just as there is for statement 3.
I was assuming per Tegmark that we live in (at least one variety of) big world, and “I” denotes a set of entities indistinguishable with current information, but who live in different parts of the multiverse. But more prosaically, you could note that there is a nonzero albeit small probability that atoms on a lifeless Mars will arrange themselves into a life form between one visit and the next.
I grant that a big world provides an ensemble such that epsilon could make sense. I think that the prosaic explanation fails, though—a fully specified version of statement two refers either to an instant in time or an interval, and either way, in a small world there will be a fact of the matter.
Hmm, frequentist probability is most usually described in terms of, er, frequency; what fraction of the time we will get a given result when we run the test. But if you take it as referring to an instant of time (and you assume small world and no fuzziness) in that case I agree the epsilon would disappear.
It’s a minor point, but wackily enough, the above quote is a subtle equivocation on the word “time”. I can flip N exchangeable coins simultaneously and count the number of times I see “heads”, and this is perfectly sensible in the frequentist interpretation. Physical clock time is something else again.
Sure, B and F don’t directly contradict each other, but which one should we use when reasoning under uncertainty?
edit: better statement of what I was getting at
I answered a different question than what this sits below, but I think that the answer is still both. B is probably the one that fits in the formulas, but you should also remember that the cases where B != F are the cases where such formulas are least likely to serve you well.
Yep. To expand: correctly used, they should not contradict each other. If they give different answers, then at least one of them is being used in a way in which it is not applicable.
You’re right, so far as it goes, but I don’t think it gets you very far. My point is that this dodges what the debate is about. Proponents of F say that it should be used for doing inductive inference while proponents of B say they’re wrong, B is what should be used. If you’re not answering that question, you’re not settling the debate.
Now if you’re not trying to settle the debate, then we have no argument.
Well, the only debates I’m claiming to definitively settle are the philosophical ones about “what does probability really mean?”, “are 0 and 1 really probabilities?” and suchlike, over which I’ve seen enough electrons spilled that I considered it well worth trying to put them to rest.
But in a typical inductive scenario, it seems to me that since we can’t work directly with F, whereas we can work directly with B, Bayesian reasoning is the appropriate tool to use. Do you have any counterexamples in mind where the two approaches give different answers and the difference can’t be resolved by noting that they aren’t answering the same question?
Well, no, but proponents of F may disagree.