I’d recommend using the beta distribution, specifically the Jeffreys prior, beta(1/2, 1/2), though I don’t fully understand it.
I’ve only ever used it to figure out the probability of getting heads on the next flip, but you can multiply these step-by-step probabilities together to get the probability of a whole sequence. If that line is where the bug is, the sequence has probability 0.5/1 × 1.5/2 × 2.5/3 × 3.5/4 × 4.5/5 × 0.5/6 × 1.5/7 (it passed five times and crashed twice; order doesn’t really matter, and clearly it will pass the next ten). If you didn’t find it, the next ten passes are not guaranteed, so the probability is that same product × 5.5/8 × 6.5/9 × ... × 14.5/17. The first seven terms cancel in the ratio, so you just get that the observations are 5.5/8 × ... × 14.5/17 = (14.5!/4.5!)/(17!/7!) = 0.0906 times as likely if the bug is not in that line. (The half-integer factorials are shorthand for the gamma function, x! = Γ(x+1).)
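For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes five passing runs and two crashes so far, followed by ten further passes, and it uses the beta(1/2, 1/2) prior on the probability of a pass. The function names are made up for illustration.

```python
from math import exp, lgamma

def next_pass_probability(passes, fails, a=0.5, b=0.5):
    """P(next run passes) after the given counts, under a beta(a, b) prior."""
    return (passes + a) / (passes + fails + a + b)

def probability_of_more_passes(passes, fails, extra_passes):
    """Multiply the step-by-step predictive probabilities of further passes."""
    product = 1.0
    for k in range(extra_passes):
        product *= next_pass_probability(passes + k, fails)
    return product

# 5.5/8 x 6.5/9 x ... x 14.5/17
ratio = probability_of_more_passes(passes=5, fails=2, extra_passes=10)

# Same quantity via gamma functions: (14.5!/4.5!) / (17!/7!), with x! = Gamma(x + 1)
closed_form = exp(lgamma(15.5) - lgamma(5.5) - (lgamma(18) - lgamma(8)))

print(round(ratio, 4), round(closed_form, 4))
```

Both numbers should agree with the 0.0906 figure quoted above.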
I’m afraid I don’t quite see how to apply this to the problem. The beta distribution is presumably a probability, but what is it a probability of? Is there an interpretation to its two parameters that I’m not seeing?
It’s a probability distribution over probabilities. You don’t know how likely it is that the program crashes given that there’s no bug. You just know that it crashed two out of the seven times you ran it. If you start with a beta distribution over the possible crash probabilities, then after those runs you have another, easy-to-calculate beta distribution, and from that distribution it’s easy to calculate exactly how likely the next run is to crash.
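One common way to read the two parameters is as pseudo-counts: the first counts crashes seen so far and the second counts clean runs, each starting at the prior's value. A rough sketch of the update, assuming the distribution is over the crash probability and using the two-crashes-in-seven figure from this thread (the helper names are illustrative, not from any library):

```python
def update_beta(a, b, crashes, clean_runs):
    """Posterior beta parameters: add the observed counts to the prior's."""
    return a + crashes, b + clean_runs

def beta_mean(a, b):
    """Mean of beta(a, b), which is also the probability that the next run crashes."""
    return a / (a + b)

# Jeffreys prior beta(0.5, 0.5), then 2 crashes and 5 clean runs
a, b = update_beta(0.5, 0.5, crashes=2, clean_runs=5)
p_next_crash = beta_mean(a, b)

print(a, b, p_next_crash)  # posterior beta(2.5, 5.5), mean 2.5/8
```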
Let me see if I understand what you are saying. You seem to be analogising the problem to detecting whether a coin or die is biased? That is, a biased die has some frequency of coming up 6 which is not one-sixth; and you are proposing beta(0.5, 0.5) as my prior distribution for that frequency. I roll the die 100 times, getting some number of sixes; for each point in the [0, 1] space of possible frequencies, I do Bayes. This presumably concentrates my probability around some particular frequency, and if that frequency is different from 0.16667 I say the die is biased.
Now, the analogy is that commenting out the line is equivalent to removing a piece of gum from the die. So I perform the test twice, once with and once without the gum; and if the results differ then I say that the gum changed the bias, and if the un-gummed die is fair then the gum caused the bias. Or, going back to the program, the mean of my probability distribution for the crash frequency in the presence of the line may be high relative to the mean without the line, and that’s what we mean by “We think this line is causing the crash”. Right?
So, if I understand correctly, you are proposing the same math as gjm did below, but suggesting a specific prior:
P(p, q) = beta(p; 0.5, 0.5) beta(q; 0.5, 0.5).
Is that correct?
No. My way was assuming that it either crashes exclusively because of that line or exclusively because of something else. Furthermore, I only gave priors for if it’s given which it’s doing.
Let A = that line is the cause of the crash.
Let q!=x mean that the statement holds for every value of q other than x.
P(p,0|A) = beta(0.5,0.5)(p)
P(p,q!=0|A) = 0
P(p,p|!A) = beta(0.5,0.5)(p)
P(p,q!=p|!A) = 0
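To turn this into a number, here is a hedged sketch. It reads p as the crash probability with the line in place and q as the crash probability with it commented out, which is my interpretation of the notation. It also assumes the seven runs were made with the line and the ten clean runs without it, and it needs a prior P(A) that was never stated in this thread, so the 0.5 default is purely illustrative.

```python
from math import exp, lgamma

def log_beta_fn(a, b):
    """Log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def sequence_probability(crashes, clean_runs, a=0.5, b=0.5):
    """Probability of one particular run sequence under a beta(a, b) prior on crashing."""
    return exp(log_beta_fn(a + crashes, b + clean_runs) - log_beta_fn(a, b))

def posterior_A(crashes_with, clean_with, crashes_without, clean_without,
                prior_A=0.5):  # prior_A is an assumed value, not from the thread
    # Under A, q = 0: any crash without the line rules A out.
    if crashes_without > 0:
        like_A = 0.0
    else:
        like_A = sequence_probability(crashes_with, clean_with)
    # Under !A, q = p: the line makes no difference, so pool all runs.
    like_not_A = sequence_probability(crashes_with + crashes_without,
                                      clean_with + clean_without)
    numerator = prior_A * like_A
    return numerator / (numerator + (1 - prior_A) * like_not_A)

print(posterior_A(crashes_with=2, clean_with=5,
                  crashes_without=0, clean_without=10))
```

With an even prior the likelihood ratio of about 0.0906 gives a posterior for A of roughly 1/(1 + 0.0906); a different prior_A shifts that accordingly.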