Not all calibration is probability calibration, e.g., calibrating my scales or voltmeter, but as you suggest, calibration discussion on LessWrong is effectively calibration about credences/probabilities. Not worth keeping finer gradations distinct.
But I think the disambiguation is good because it explains the tag to someone new on LessWrong and doesn’t rely on your already knowing the content, so I’m in favor of it. The way we use calibration is our own jargon, so it’s good to explain a bit of what we mean right in the title.
I can see the case for that, but FYI it just made me go make this meta comment rather than intuitively classifying a calibration post. I think it might be fine to have the disambiguation live in the text rather than the title.
Epistemic status: I’m generally pro disambiguations in parentheses, I was the one who advocated we borrow the practice from Wikipedia. I’m really not sure in the case between Calibration and Calibration (Probability), so I’m just trying to think through this with others.
The tag description is this:
Do the events that you give a 70% probability in advance, actually end up happening 70% of the time?
It’s pretty brief, I’m guessing that didn’t clarify enough. I actually suspect here that if the description had been like the following, clarification wouldn’t have been necessary. Raemon?
Someone is probability or credence calibrated if the things they predict with 70% chance of happening in fact occur 70% of the time. Importantly, calibration is not the same as accuracy. Calibration is about accurately assessing how good your predictions are, not making good predictions. Person A, whose predictions are marginally better than chance, e.g. 60% of them come true, and who knows that, is well-calibrated. In contrast, Person B, whose predictions are 90% accurate, yet thinks they are 99% accurate, is more accurate than Person A while being less well calibrated.
Knowing how good your predictions are is a key rationalist skill. Among other things, being calibrated lets you make good bets [Link to Betting tag]/make good decisions [link Planning & Decision-making tag], communicate information helpfully to others if they know you to be well-calibrated [link to Group Rationality], and helps prioritize which information is worth acquiring [link to VoI tag].
Note that calibration applies to all expressions of quantified confidence in beliefs/predictions [reference: Anticipate Experiences]. For example, calibration applies to whether a person’s 95% confidence intervals capture things 95% of the time. Or if their 80% chance of completion estimates are met 80% of the time (not much more or less). Trivially, odds placed on things are convertible to probabilities.
See also: prediction & forecasting
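To make the Person A / Person B contrast in the proposed description concrete, here is a minimal sketch in Python. The function names and the sample predictions are made up for illustration; they are not part of any LessWrong tooling, and the "gap" measure is just one simple way to compare stated confidence with observed frequency.

```python
from collections import defaultdict


def hit_rate(predictions):
    """Accuracy: fraction of predictions that came true."""
    return sum(1 for _, happened in predictions if happened) / len(predictions)


def calibration_table(predictions):
    """Map each stated probability to the observed frequency of those events."""
    buckets = defaultdict(list)
    for stated, happened in predictions:
        buckets[stated].append(1 if happened else 0)
    return {stated: sum(hits) / len(hits) for stated, hits in buckets.items()}


def calibration_gap(predictions):
    """Largest absolute difference between stated and observed frequency."""
    table = calibration_table(predictions)
    return max(abs(stated - observed) for stated, observed in table.items())


def odds_to_probability(in_favor, against):
    """Convert odds of in_favor:against into a probability."""
    return in_favor / (in_favor + against)


# Hypothetical data. Person A says 60% and is right 6 times out of 10.
person_a = [(0.6, True)] * 6 + [(0.6, False)] * 4
# Person B says 99% and is right 9 times out of 10.
person_b = [(0.99, True)] * 9 + [(0.99, False)] * 1

acc_a, acc_b = hit_rate(person_a), hit_rate(person_b)
gap_a, gap_b = calibration_gap(person_a), calibration_gap(person_b)

print(f"A: accuracy={acc_a:.2f}, calibration gap={gap_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, calibration gap={gap_b:.2f}")
print(f"3:1 odds as a probability: {odds_to_probability(3, 1):.2f}")
```

On this toy data, B has the higher accuracy while A has the smaller gap between stated and observed frequency, which is the distinction the description is drawing. With real predictions you would usually bin nearby probabilities together rather than group on exact values.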
I think this description makes it much clearer that “tends to be correct” is always a quantitative/probabilistic statement beneath the hood.
Hmm, what this makes me think is really it’s about calibrating credences, which isn’t standard jargon anyway. So maybe just plain “Calibration” is better than “Probability”. Or maybe Calibration (belief strength)?
My main thought now is the issue wasn’t the name so much as lack of good explanation for the topic. [No complaint against the tag creator – it was a good tag to make.] I’d kind of like to have a big tag description writing push sometime after the plain “tag at all” push since so far few tags have good explanations. But we’d have to decide we’re definitely moving more in this wiki-ish direction.
Yeah, I’d propose Calibration or Calibration (belief-strength).
I dislike Probability Calibration because I dislike leading adjectives/modifiers and prefer the main thing to be the first word in the noun phrase (some languages like Hebrew do this). I expect people to be looking for the core thing, e.g. Relationships, “R”, and if you put modifiers in front, e.g. “Business”, “Personal/Interpersonal”, “Romantic”, “Conceptual”, you then require someone to guess which modifier you used, and also split up Relationship tags from being adjacent in an alphabetical list.
I think having as much in the title as possible is better than in the text, just because even triggering the hover-over is 10-20x more costly than just skimming all the titles, and also doesn’t work on mobile where there are no hover-overs. I think if someone’s looking over the tags list and sees “Calibration (belief-strength)” they have much more of an idea of what the tag is about than just Calibration, which is pretty opaque to an outsider.
There is that, but at the same time, you probably wouldn’t want tags like Experiences (Anticipated) or Induction (Solomonoff). I don’t have any principled argument for this, but to me “Probability Calibration” feels more like one of those examples. It being put alphabetically close to “Probability” may also be good.
(I also keep feeling confused by Relationships (Interpersonal) each time I see it, though that’s probably in part because there’s no other ‘Relationships’, so I just think ‘well what other relationships could you even mean’ and then don’t find another Relationships that it would be contrasted with.)
That’s a fair point to raise. I think it’s more work to give an explicit theory for why those are different. Something like: those are both technical terms/jargon and they don’t belong to a broader class of things that we might be discussing. In a world where there were three types of induction we discussed, we might then go for Induction (Solomonoff). Also, they’re the actual phrases people use. I don’t think I can remember anyone saying “probability calibration” ever.
With Relationships (Interpersonal), I think it makes sense because to me, the default way to read “relationships” is specifically romantic relationships. Like if your friend says “I’m reading a book about relationships”, what do you assume? To make it clear the tag also covers friendship, family, and work relationships I think actually does require some disambiguation even if the site doesn’t have any other relationship tags right now.
So perhaps we should also have Relationships (Romance)? :)
Indeed. <3
I went ahead and edited the tag to have your description
Many thanks!!
“Probability Calibration” rather than “Calibration (Probability)” feels like a more natural name for the tag, while keeping the disambiguation.
See my comment in the other thread about my argument for modifiers after the main word. It’s not LW team consensus, just something I’ve been pushing for.