Yes, the primary benefit of calibration training for me has been that I can now say, “Hmm, if I ‘feel’ this confident about something, it’s 90% likely to be true. If I ‘feel’ this confident about something after running it through my basic calibration exercises, it’s 80% likely,” and so on. Also, if I’m asked to give a numerical estimate, I’m very good at giving a 90% confidence interval that contains the true value 90% of the time.
If you haven’t used calibration training in this way, I highly recommend it.
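For anyone who wants to check the interval claim on their own answers, here is a minimal Python sketch. The function name `interval_hit_rate` and the practice data are made up for illustration; the only idea taken from the comment above is that a well-calibrated 90% interval should contain the true value about 90% of the time.

```python
def interval_hit_rate(intervals, truths):
    """Return the fraction of (low, high) intervals that contain the matching true value."""
    if len(intervals) != len(truths) or not truths:
        raise ValueError("need one true value per interval, and at least one interval")
    hits = sum(low <= truth <= high for (low, high), truth in zip(intervals, truths))
    return hits / len(truths)


# Hypothetical practice round: ten 90% intervals and the answers looked up afterwards.
intervals = [(10, 50), (100, 300), (1, 5), (0, 20), (1900, 1950),
             (30, 60), (5, 15), (200, 800), (2, 9), (40, 90)]
truths = [32, 250, 4, 25, 1912, 45, 7, 640, 3, 71]

rate = interval_hit_rate(intervals, truths)
print(f"hit rate: {rate:.0%}")
```

A hit rate well below 90% over many questions would suggest the intervals are too narrow, and a rate near 100% would suggest they are too wide. Ten questions is far too few to conclude much either way.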
In terms of what I’m trying to accomplish, you’re right that I want a way to make my Bayesian updates more accurate. Part of the problem in training this is that I AM a bit fuzzy on the math. I mostly get it for toy problems like drawing balls from a cup or figuring out how likely a disease test is to give a false positive, but it gets very confusing how all the numbers work when you had a prior of 30% that you would regret a vasectomy, and then you get new information suggesting the base rate is 10% of the population (and of course, that’s still a relatively simple example where you have hard numbers).
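For reference, the disease-test toy problem can be sketched in a few lines of Python. The prevalence, sensitivity, and false-positive rate below are invented numbers, and the function name `posterior` is just a label for this example. It only covers the clean textbook case where you know how likely the evidence is under each hypothesis.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for one binary hypothesis and one piece of evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)


# Hypothetical numbers: 1% of people have the disease, the test catches 90% of
# true cases, and it gives a false positive for 5% of healthy people.
p_disease_given_positive = posterior(prior=0.01,
                                     p_evidence_if_true=0.90,
                                     p_evidence_if_false=0.05)
print(f"P(disease | positive test) = {p_disease_given_positive:.3f}")
```

With these made-up inputs, a positive test raises the probability from 1% to roughly 15%, which is the usual surprise in this toy problem.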
My basic idea was to calibrate on toy problems, then use the same feelings-based approach: “OK, I know that this feeling corresponds to x decibels of evidence.” But I don’t really have a surefire plan beyond that.
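A small sketch of the decibel bookkeeping, assuming the common convention that evidence in decibels is 10 times the base-10 log of the odds. The helper names are made up for this example. In this form an update is just addition: prior decibels plus the decibels of the likelihood ratio.

```python
import math


def prob_to_db(p):
    """Convert a probability to decibels of evidence: 10 * log10(odds)."""
    return 10 * math.log10(p / (1 - p))


def db_to_prob(db):
    """Convert decibels of evidence back to a probability."""
    odds = 10 ** (db / 10)
    return odds / (1 + odds)


def update_db(prior_p, likelihood_ratio):
    """Add the evidence (in decibels) to the prior and return the new probability."""
    return db_to_prob(prob_to_db(prior_p) + 10 * math.log10(likelihood_ratio))


# Illustrative case: a 50% prior is 0 dB; evidence that is 10 times more likely
# if the claim is true adds 10 dB.
posterior_p = update_db(0.5, 10)
print(f"{prob_to_db(0.5):.1f} dB prior, posterior probability {posterior_p:.3f}")
```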
Here’s my idea to get better at doing updates:
1. Estimate the base rate.
2. Using your estimated base rate, estimate a conditional probability given the new information.
3. Compare the estimated base rate against the actual base rate.
4. Using the actual base rate, estimate a new conditional probability.
5. Compare both estimated conditional probabilities against the actual conditional probability.
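One way to keep score on a single round of the steps above, as a rough Python sketch. The names `DrillResult` and `score_drill` and all the numbers are hypothetical; the point is only to record the base-rate error and the two update errors separately so you can see which one is off.

```python
from dataclasses import dataclass


@dataclass
class DrillResult:
    base_rate_error: float           # estimated base rate minus actual base rate
    update_error_from_guess: float   # conditional estimated from your own base rate, minus actual
    update_error_from_actual: float  # conditional estimated from the actual base rate, minus actual


def score_drill(est_base, est_cond_from_guess, actual_base,
                est_cond_from_actual, actual_cond):
    """Compare each estimate in one round against the looked-up values."""
    return DrillResult(
        base_rate_error=est_base - actual_base,
        update_error_from_guess=est_cond_from_guess - actual_cond,
        update_error_from_actual=est_cond_from_actual - actual_cond,
    )


# Hypothetical round: every number here is invented for illustration.
result = score_drill(est_base=0.30, est_cond_from_guess=0.40,
                     actual_base=0.10, est_cond_from_actual=0.20,
                     actual_cond=0.15)
print(result)
```

In this invented round the base rate guess is off by about 20 points while the update from the actual base rate is off by about 5, which would point at the base rate as the weaker skill. You would want many rounds before trusting a pattern like that.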
So, I think there are multiple levels here. You want to make sure you get the base rate part right. You also want to make sure that you get the update right. You can see how well calibrated you are for each. You might find that you’re okay at estimating conditional probabilities, but bad at estimating the base rate, etc.
I tend not to use my old estimates as a prior. I’m not an expert at Bayesian probability (so maybe I get all of this wrong!). I interpret what I’m looking for as a conditional probability, maybe with an estimated prior/base rate (which you could call your “old estimate”, I guess). I prefer data whenever it is available.
The toy problems are okay, and I’m sure you can generate a lot of them.
The vasectomy example was much less straightforward than I would have expected. I spent at least 10 minutes rearranging different equations for the conditional probability before finding one that expressed what I wanted in terms of the data I could find. The problem is that the data you can find in the literature often does not fit so nicely into a simple statement of Bayes’ rule.
Another example I found to be useful was computing my risk for developing a certain cancer. The base rate of this cancer is very low, but I have a family member who developed the cancer (and recovered, thankfully), and the relative risk for me is considerably higher. I had felt this gave me a probability of developing the cancer on the order of 10% or so, but doing the math showed that while it was higher than the base rate, it’s still basically negligible. This sounds to me like the sort of exercise you want to do.
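The cancer calculation can be sketched like this in Python. The base rate and relative risk below are placeholders, not the real figures for any cancer, and multiplying the population base rate by the relative risk is an approximation that assumes the base rate is close to the risk for people without the family history.

```python
def risk_with_relative_risk(base_rate, relative_risk):
    """Approximate absolute risk as base rate times relative risk, capped at 1."""
    return min(1.0, base_rate * relative_risk)


# Placeholder inputs: a 0.2% lifetime base rate and a relative risk of 3
# for someone with an affected family member.
base_rate = 0.002
relative_risk = 3.0

risk = risk_with_relative_risk(base_rate, relative_risk)
print(f"base rate {base_rate:.2%}, adjusted risk {risk:.2%}")
```

With these placeholder inputs the adjusted risk is 0.6%, which is three times the base rate but nowhere near a gut feeling of 10%.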