Screwtape comments on shortest goddamn bayes guide ever

Screwtape 12 May 2024 5:25 UTC
3 points
1
I’m not sure I’m following your actual objection. Is your point that this algorithm is wrong and won’t update towards the right probabilities even if you keep feeding it new pieces of evidence, that the explanations and numbers for these pieces of evidence don’t make sense for the implied story, that you shouldn’t try to do explicit probability calculations this way, or some fourth thing?
If this algorithm isn’t actually equivalent to Bayes in some way, that would be really useful for someone to point out. At first glance it seems like a simpler (to me anyway) way to express how making updates works, not just on an intuitive “I guess the numbers move that direction?” way but in a way that might not get fooled by e.g. the mammogram example.
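One way to probe the equivalence question is numerically. The sketch below assumes the algorithm under discussion is the odds form of Bayes’ rule (prior odds multiplied by a likelihood ratio). The function names and the numbers are made up for illustration and come from neither the guide nor this thread.

```python
def posterior_via_bayes(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H|E) from the textbook form of Bayes' theorem."""
    evidence = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return prior * p_e_given_h / evidence


def posterior_via_odds(prior, p_e_given_h, p_e_given_not_h):
    """Same posterior, computed as prior odds times the likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * (p_e_given_h / p_e_given_not_h)
    return posterior_odds / (1 + posterior_odds)


# Illustrative (assumed) numbers: a rare hypothesis and fairly strong evidence.
prior, p_e_h, p_e_not_h = 0.01, 0.8, 0.1
a = posterior_via_bayes(prior, p_e_h, p_e_not_h)
b = posterior_via_odds(prior, p_e_h, p_e_not_h)
print(a, b)  # the two forms should agree up to floating-point rounding
```

For a single piece of evidence the two forms agree. Whether several likelihood ratios can be chained by plain multiplication is a separate question, and it depends on how the pieces of evidence relate to each other.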
If these explanations and numbers don’t make exact sense for the implied story, that seems fine? “A train is moving from east to west at a uniform speed of 12 m/s; ten kilometers to the west, a second train is moving west to east at a uniform speed of 15 m/s. How far will the first train have traveled when they meet?” is a fine word problem even if that’s oversimplified for how trains work.
If you don’t think it’s worth doing explicit probability calculations this way, even to practice and try and get better or as a way to train the habit of how the numbers should move, that seems like a different objection and one you would have with any guide to Bayes. That’s not to say you shouldn’t raise the objection, but that doesn’t seem like an objection that someone did the math wrong!
And of course maybe I’m completely missing your point.
- Zane 12 May 2024 6:46 UTC
  12 points
  2
  Multiple points, really. I believe that this calculation is flawed in specific ways, but I also think that most calculations that attempt to estimate the relative odds of two events that were both very unlikely a priori will end up being off by a large amount. These two points are not entirely unrelated.
  The specific problems that I noticed were:
  1. The probabilities are not independent of each other, so they cannot be multiplied together directly. A bear flipping over your tent would almost always immediately be preceded by the bear scratching your tent, so updating on both events would just be double-counting evidence.
  2. The probabilities do not appear to be conditional probabilities. P(A&B&C&D) doesn’t equal P(A)*P(B)*P(C)*P(D), it equals P(A)*P(B|A)*P(C|A&B)*P(D|A&B&C).
  3. The “notbear” hypothesis lumps together several different hypotheses. P(A|notbear) and P(B|notbear) cannot be multiplied together to get P(A&B|notbear), because (among other reasons) there may be some types of notbears that are very likely to do A but very unlikely to do B, some that are very likely to do both, and so on. Once you’ve observed A, it should update you on what kind of notbear it could be, and thus change the probability it does B.
  4. The “20% a bear would scratch my tent : 50% a notbear would” claim is incorrect for the reasons I mentioned above. If your tent would be scratched 50% of the time in the absence of a bear, and a bear would scratch it 20% of the time, then the chance it gets scratched if there is a bear is 1-(1-50%)(1-20%), or 60%. (Unless you’re postulating that bears always scare off anything else that might scratch the tent—which it seems Luke is indeed claiming.)
  5. I disagree with several of the specific claims about the probabilities, such as “95% chance a bear would look exactly like a fucking bear inside my tent” and “1% chance a notbear would.”
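  Points 3 and 4 can be checked with a few lines of arithmetic. This is only a sketch: the 50% and 20% figures are the ones quoted in point 4, while the two notbear sub-types and their numbers are invented for illustration, and the helper name `noisy_or` is hypothetical.

```python
def noisy_or(*probs):
    """P(at least one of several independent causes produces the event)."""
    p_none = 1.0
    for p in probs:
        p_none *= 1.0 - p
    return 1.0 - p_none


# Point 4: background scratchers (50%) plus an independent bear (20%).
p_scratch_given_bear = noisy_or(0.5, 0.2)

# Point 3: "notbear" as a mixture of two made-up sub-hypotheses.
# Each entry is (weight within notbear, P(A | type), P(B | type)).
notbear_types = [
    (0.5, 0.9, 0.1),  # a type that often does A but rarely B
    (0.5, 0.1, 0.9),  # a type that rarely does A but often B
]
p_a = sum(w * pa for w, pa, _ in notbear_types)
p_b = sum(w * pb for w, _, pb in notbear_types)
# A and B are assumed independent *within* each type, not across the mixture.
p_a_and_b = sum(w * pa * pb for w, pa, pb in notbear_types)
naive_product = p_a * p_b

print(p_scratch_given_bear, p_a_and_b, naive_product)
```

  With these invented sub-types, the naive product P(A|notbear)*P(B|notbear) comes out at 25% while the mixture gives 9% for P(A&B|notbear), so multiplying the lumped probabilities overstates the joint probability by almost a factor of three.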
  And then the meta-problem: when you’re multiplying together more than two or three probabilities that you estimated, particularly small ones, errors in your ability to estimate them start to add up. Which is why I don’t think it’s usually worthwhile to try and estimate probabilities like this.
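  A rough way to see how fast this compounds, assuming each estimate can be off by some multiplicative factor. Both function names are hypothetical, and the "typical" case rests on an extra assumption (independent, zero-mean errors in log-space) that real estimates may not satisfy.

```python
import math


def worst_case_error_factor(per_estimate_factor, n_estimates):
    """Error in the product if every estimate is off by the same factor in the same direction."""
    return per_estimate_factor ** n_estimates


def typical_error_factor(per_estimate_factor, n_estimates):
    """Error in the product if log-errors are independent with zero mean.

    Under that assumption the spread in log-space grows like sqrt(n).
    """
    return math.exp(math.log(per_estimate_factor) * math.sqrt(n_estimates))


# Five estimates, each possibly off by a factor of 2.
print(worst_case_error_factor(2, 5))  # 32
print(typical_error_factor(2, 5))     # about 4.7
```

  So a chain of five guesses that are each within a factor of two can leave the product off by anywhere from a few-fold to more than an order of magnitude.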
  But you have a fair point about it being a good idea to practice explicit calculations, even if they’re too complicated to reliably get right in real life. So here’s how I might calculate it:
  P(bear encounters you): 1%.
  P(tent scratched | bear): 60%, for the reasons I said above… unless we take into account it scaring away other tent-scratching animals, in which case maybe 40%.
  P(tent flipped over | bear & tent scratched): 20%, maybe? I think if the bear has already taken an interest in your tent, it’s more likely than usual to flip it over.
  P(you see a bear-shaped object | bear & tent scratched & tent flipped over): Bears always look like bears. This is so close to 100% I wouldn’t even normally include it in the calculation, but let’s call it 99.99%.
  P(you get eaten | bear & tent scratched & tent flipped over & you see a bear-shaped object): It’s already been pretty aggressive so far, so I’d say perhaps 5%.
  On the other side, there are almost no objects for which the probability of it looking exactly like a bear isn’t infinitesimal; let’s only consider Bigfoot and serial-killer-who’s-a-furry for simplicity, then add them up.
  P(Bigfoot exists): …hmm. I am not an expert on the matter, but let’s say 1%.
  P(Bigfoot encounters you | Bigfoot exists): There can’t be that many Bigfoots (Bigfeet?) out there, or else people would have caught one. 0.01%.
  P(tent scratched | Bigfoot): Bigfeet are probably more aggressive than bears, so 70%.
  P(tent flipped over | Bigfoot & tent scratched): Again, Bigfeet are supposed to be pretty aggressive, so 50%.
  P(you see a bear-shaped object | Bigfoot & tent scratched & tent flipped over): Bigfoot looks similar enough to a bear that you’ll almost certainly think he’s a bear. 99%.
  P(you get eaten | Bigfoot & tent scratched & tent flipped over & you see a bear-shaped object): Again, Bigfeet aggressive, 30%.
  Then for the furry cannibal one:
  P(furry cannibal stalking this forest): 0.000001% (that’s one in a hundred million, if I got my zeroes right). I welcome you to prove me wrong on the matter by manually increasing the number of furry cannibals in a given forest.
  P(furry cannibal encounters you | furry cannibal exists): How large of a forest is this? Well, he probably has his methods of locating prey, so let’s say 10%. Wait, why did I assume he’s a “he”? What gender is the typical furry cannibal? Probably a trans woman? Let’s name this furry cannibal Susan.
  P(tent scratched | Susan): Probably not that high; she doesn’t want to wake you up too soon. 30%.
  P(tent flipped over | Susan & tent scratched): She might just sneak in, but let’s say 90%.
  P(you see a bear-shaped object | Susan & tent scratched & tent flipped over): She’s wearing a bear costume, as hypothesized; 99.99%.
  P(you get eaten | Susan & tent scratched & tent flipped over & you see a bear-shaped object): Yes, of course this happens; this was her whole kink in the first place! 99%.
  So for “bear,” we have 1%*40%*20%*99.99%*5% = 0.004%. For “Bigfoot,” we have 1%*0.01%*70%*50%*99%*30% = 0.00001%. For “Susan,” we have 0.000001%*10%*30%*90%*99.99%*99% = 0.000000027%. Looks like Bigfoot was so much more likely than Susan that we can pretty much just forget the Susan possibility altogether. It’s 0.004 to 0.00001, so the odds are about 400 to 1 that you’re being eaten by a bear.
  (Although I actually think you should be even more confident than 400 to 1 that it’s a bear rather than Bigfoot, and that I just was off by an order of magnitude for one reason or another, as happens when you’re doing these sorts of calculations. And if you ever actually observe all of these things, the most likely hypothesis is that you’re dreaming.)
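  The arithmetic above can be reproduced in a few lines. This sketch takes the estimates exactly as given in the comment, converted from percentages to fractions. The variable names are made up, and normalizing over only these three hypotheses is an added assumption.

```python
from math import prod

# Chains of estimates from the comment, in order, as fractions.
chains = {
    "bear":    [0.01, 0.40, 0.20, 0.9999, 0.05],
    "Bigfoot": [0.01, 0.0001, 0.70, 0.50, 0.99, 0.30],
    "Susan":   [1e-8, 0.10, 0.30, 0.90, 0.9999, 0.99],
}

# Joint probability of each hypothesis together with all of the observations.
joint = {name: prod(factors) for name, factors in chains.items()}

# Posterior over just these three hypotheses (assumes nothing else is possible).
total = sum(joint.values())
posterior = {name: p / total for name, p in joint.items()}

bear_to_bigfoot = joint["bear"] / joint["Bigfoot"]

for name in chains:
    print(f"{name}: joint = {joint[name]:.3e}, posterior = {posterior[name]:.5f}")
print(f"bear : Bigfoot odds = {bear_to_bigfoot:.0f} : 1")
```

  Without rounding the intermediate results, the bear-to-Bigfoot ratio comes out near 385 to 1, which matches the roughly 400 to 1 figure once the products are rounded to one significant digit.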