That doesn’t sound right to me. Sure, with a sample size of 1, your estimate won’t be very accurate, but that one data point is still going to be giving you some information, even if it isn’t very much. You gotta update incrementally, right?
Updating incrementally is useful, but only if you keep in perspective how little you know and how unreliable your information is based on a single trial. If you forget that, then you end up like the guy who says “Well, I drove drunk once, and I didn’t crash my car, so therefore driving drunk isn’t dangerous”. Sometimes “I don’t know” is a better first approximation than anything else.
Of course, it would be accurate to say that we can get some information from this. I mentioned “anything from 10% to 90%”, but on the other hand, I would say that our experience so far makes the hypothesis “99% of intelligent species blow themselves up within 50 years of creating a nuclear bomb” pretty unlikely.
However, any hypothesis from “10% of the time, MAD works at preventing a nuclear war” to “99% of the time, MAD works at preventing a nuclear war”, or anything in between, still seems quite plausible. Based on a sample size of 1, I would say that any hypothesis that fits the observed data at least 10% of the time has to be considered a plausible hypothesis.
Um… yes. I guess we’re on the same page then. :)
If you flip a coin and it comes up heads, do you have any new information about whether it is a fair coin or not? If you thought the odds of MAD working were 50/50, do you have any new information on which to update?
Yes. If you started with a uniform distribution over the coin’s probability of heads, from 0 to 1, your new posterior distribution will tilt toward the 0.5-1 half, and the mass on the 0-0.5 half will shrink. Sivia’s Data Analysis on page 16/29 even includes an illustration of how the distribution evolves for some random choices of probability and possible coinflips; I’ve screenshotted it: http://i.imgur.com/KbpduAj.png Note how drastically the distribution changes between n=0 (top left) and n=1 (top middle) - one has learned information from this single coinflip! (Diminishing returns implies the first observation carries the most information...)
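For concreteness, here is a minimal sketch of that update (my example, not Sivia’s code): a uniform prior over the heads-probability is Beta(1,1), and a single head turns it into a Beta(2,1) posterior, which puts three quarters of its mass above 0.5.

```python
# Minimal sketch of the update described above (my example, not Sivia's).
# A uniform prior over the heads-probability p is Beta(1, 1); conditioning
# on one observed head gives a Beta(2, 1) posterior with density 2p and
# CDF p**2.

def prior_prob_above(x):
    """P(p > x) under the uniform Beta(1, 1) prior."""
    return 1 - x

def posterior_prob_above(x):
    """P(p > x) under the Beta(2, 1) posterior after one head."""
    return 1 - x ** 2

print(prior_prob_above(0.5))      # 0.5: no idea which half p is in
print(posterior_prob_above(0.5))  # 0.75: one head has shifted mass upward
```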
Given that the coin either has two of the same faces or is fair, and it is not more likely to be two-tailed than two-headed: take the hypothesis h, “the coin has either two heads or two tails”, and the evidence e, “the first flip came up heads”.
I don’t think that you can update your probability of h, regardless of your prior belief, since P(e|h)=P(e|~h).
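A quick sanity check of this (my own arithmetic, assuming that under h the two-headed and two-tailed cases are equally likely, and that ~h means the coin is fair): the likelihood ratio is 1, so Bayes’ rule hands back whatever prior you started with.

```python
# Sanity check of the no-update claim (my sketch; assumes two-headed and
# two-tailed are equally likely under h, and ~h means the coin is fair).
p_e_given_h = 0.5 * 1.0 + 0.5 * 0.0   # two-headed always gives heads, two-tailed never does
p_e_given_not_h = 0.5                 # a fair coin

for prior_h in (0.1, 0.5, 0.9):
    posterior_h = (p_e_given_h * prior_h) / (
        p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    )
    print(prior_h, posterior_h)  # posterior equals prior for every prior
```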
I think that you are asking what the evidence indicates the bias of this one coin is, rather than asking if this one coin is biased. Which, in fact, does apply to the case I was trying to illustrate.
In what sense is a “fair coin” not a coin with a ‘bias’ of exactly 0.5?
The first throw being heads is evidence for the proposition that the coin is biased towards heads, evidence against the proposition that the coin is biased towards tails, and neutral towards the union of the two propositions.
If showing heads on the first throw were evidence for or against the coin being fair, then not showing heads on the first throw would have to be evidence the other way.
That we survived is evidence for “MAD reduces risk” and evidence against “MAD increases risk”, but is neutral for “MAD changes risk”.
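That symmetry is just conservation of expected evidence. A small worked model (mine, with made-up numbers: the coin is fair, or else biased 90/10 one way or the other with equal probability) shows that one head leaves P(fair) untouched, and that the outcome-weighted average of the posteriors recovers the prior exactly.

```python
# Conservation of expected evidence (my sketch, with made-up numbers).
# Model: the coin is fair (p = 0.5) with prior 0.5; otherwise it is
# biased to 0.9 or 0.1 with equal probability.
prior_fair = 0.5
p_heads_given_biased = 0.5 * 0.9 + 0.5 * 0.1                          # = 0.5
p_heads = prior_fair * 0.5 + (1 - prior_fair) * p_heads_given_biased  # = 0.5

posterior_fair_given_heads = (0.5 * prior_fair) / p_heads
posterior_fair_given_tails = (0.5 * prior_fair) / (1 - p_heads)
print(posterior_fair_given_heads, posterior_fair_given_tails)  # both 0.5

# The outcome-weighted average of the posteriors recovers the prior:
print(posterior_fair_given_heads * p_heads
      + posterior_fair_given_tails * (1 - p_heads))            # 0.5
```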
Thought experiment: get 100 coins, with 50 designed to land on heads 90% of the time and 50 designed to land on tails 90% of the time. If you flip each coin once and put all the coins that happened to land on heads (50ish) in one pile, then on average 45 of them will be coins biased towards heads, and only 5 will be biased towards tails.
If you only got the chance to flip one randomly selected coin, and it came up heads, you should say it has a 90% probability of being a heads-biased coin, because that is what it will be 45 times out of 50.
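A quick simulation of this thought experiment (my sketch; the variable names are made up) bears out the arithmetic: among coins that land heads on their one flip, about 90% are the heads-biased ones.

```python
# Simulate the 100-coin thought experiment (my sketch). Each trial draws a
# coin at random (half are 90% heads-biased, half are 90% tails-biased) and
# flips it once; we then look at the coins that happened to land heads.
import random

random.seed(0)
trials = 100_000
landed_heads = 0
heads_biased_and_landed_heads = 0

for _ in range(trials):
    heads_biased = random.random() < 0.5   # 50/50 which kind of coin we drew
    p_heads = 0.9 if heads_biased else 0.1
    if random.random() < p_heads:          # this coin landed heads
        landed_heads += 1
        heads_biased_and_landed_heads += heads_biased

print(heads_biased_and_landed_heads / landed_heads)  # ~0.9
```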
That’s how I’m seeing this situation, anyway. I’m not really understanding what you’re trying to say here.
Take those 100 coins, and add 100 fair coins. Select one at random and flip it. It comes up heads. What are the odds that it is one of the biased coins?
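Running the same Bayes arithmetic on this version (again my own sketch, not from the thread): the biased coins, taken as a group, land heads half the time, exactly like the fair coins, so one head leaves the prior 50/50 split between biased and fair untouched.

```python
# The 200-coin version: 100 biased coins (50 heads-biased, 50 tails-biased)
# plus 100 fair coins. Averaged over the biased group, P(heads) is 0.5, the
# same as for a fair coin, so a single head carries no information about
# whether the drawn coin is biased.
p_biased = 0.5                                 # 100 of the 200 coins are biased
p_heads_given_biased = 0.5 * 0.9 + 0.5 * 0.1   # = 0.5
p_heads_given_fair = 0.5

posterior_biased = (p_heads_given_biased * p_biased) / (
    p_heads_given_biased * p_biased + p_heads_given_fair * (1 - p_biased)
)
print(posterior_biased)  # 0.5, unchanged from the prior
```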
Okay, I think I get it. I was initially thinking that the probabilities of the relationship between MAD and reducing risk being negative, nothing, weak, strong, whatever, would all be similar. If you assume that the probability that we all die without MAD is 50%, and each coin represents a possible probability of death with MAD, then I would have put in one 1% coin, one 2% coin, and so on up to a 100% coin. That would give us a distribution just like the graph gwern gave.
You’re saying that it is very likely that there is no relationship at all, and while surviving provides evidence of a positive relationship over a negative one (if we ignore anthropic stuff, and we probably shouldn’t), it doesn’t change the probability that there is no relationship. So you’d have significantly more 50% coins than 64% coins or 37% coins to draw from. The updates would look the same, but with only one data point, your best guess is that there is no relationship. Is that what you’re saying?
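That reading can be made precise with a “spike and slab” prior (my framing, not necessarily what was intended): put a lump of probability on “no relationship” (a survival probability of exactly 0.5, matching the no-MAD baseline) and spread the rest over the other values. Because a uniform slab also predicts survival with probability 0.5, one observed survival reshuffles mass within the slab but leaves the spike exactly where it was.

```python
# A "spike and slab" prior (my sketch): a point mass on "no relationship"
# (survival probability exactly 0.5) plus a uniform slab over all survival
# probabilities. One observed survival leaves the spike's posterior mass
# equal to its prior mass, because the slab as a whole also predicts
# survival with probability 1/2 (the integral of p over [0, 1]).
spike_mass = 0.5
slab_mass = 1 - spike_mass
p_survive_given_spike = 0.5
p_survive_given_slab = 0.5    # integral of p dp from 0 to 1

posterior_spike = (p_survive_given_spike * spike_mass) / (
    p_survive_given_spike * spike_mass + p_survive_given_slab * slab_mass
)
print(posterior_spike)  # 0.5: "no relationship" is exactly as likely as before
```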
So then the difference is all about prior probabilities, yes? If you have two variables that correlated one time, and that’s all the experimenting that you get to do, how likely is it that they have a positive relationship, and how likely is it that it was a coincidence? I… don’t know. I’d have to think about it more.
You’re right, but just a tiny note: you could also interpret Decius’s question as “do you have any new information about how far the coin is from being fair?” and then the answer seems to be “no”.
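For what it’s worth, that checks out (my arithmetic, not part of the original note): under a uniform prior, the distance |p − 0.5| has the same distribution before and after seeing one head, so a single head says nothing about how far the coin is from fair, only about which direction.

```python
# Distance-from-fair check (my arithmetic, not from the thread). Under the
# uniform prior, d = |p - 0.5| has density 2 on [0, 0.5]. The posterior
# density of p after one head is 2p, so the posterior density of d is
# 2*(0.5 + d) + 2*(0.5 - d) = 2: identical to the prior.
for d in (0.0, 0.1, 0.25, 0.4):
    posterior_density = 2 * (0.5 + d) + 2 * (0.5 - d)
    print(d, posterior_density)  # always 2.0, so no update on |p - 0.5|
```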