Hi. I’m Baisius. I came here, like most, through HPMOR. I’ve read a lot of the sequences, and they’ve helped me reanalyze the things I believe and why I believe them. I’ve been lurking here for a while, but I’ve never really felt I had anything to add to the site, content-wise. That’s changed, however: I just launched a blog. The blog is generally LW-themed, so I thought it appropriate. I wouldn’t ordinarily advertise it, but I would particularly like some help with one of the problems I explored in my first post (see footnote 3).
One of the things that’s bothered me about PredictionBook, and one of the reasons I don’t use it much, is that its analysis seems a bit… lacking. In the post, I tried to come up with a rigorous way of comparing sets of predictions to see which are more accurate. I did this by looking at the distribution of residuals (outcome minus predicted probability) for a set of predictions. The odd thing was that when I looked at the variance, the inverse of the variance showed some very odd patterns. It’s all there in the post, but if anyone who knows a bit more math than I do could explain it, I’d really appreciate it.
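For concreteness, here is a minimal sketch of the residual analysis described above. The numbers are made up for illustration; they are not the data from the post.

```python
# Sketch of the residual analysis described above (hypothetical data).
# Each prediction is (predicted probability, outcome), where outcome is 1
# if the predicted event happened and 0 if it did not.
from statistics import mean, variance

predictions = [
    (0.90, 1),
    (0.70, 1),
    (0.60, 0),
    (0.20, 0),
    (0.10, 1),
]

# Residual = outcome minus predicted probability.
residuals = [outcome - p for p, outcome in predictions]

print("mean residual:", mean(residuals))          # average bias of the predictions
print("residual variance:", variance(residuals))  # spread of the residuals
print("1 / variance:", 1 / variance(residuals))   # the quantity the post found to behave oddly
```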
Welcome!
For assessing prediction accuracy, are you familiar with scoring rules?
I wasn’t, thanks. I’ll try to read that sometime when I get a chance. At first glance, though, I’m unsure why you would want it to be logarithmic. I thought about doing it that way too, but then you lose the meaning associated with average error, which I think is undesirable.
So, let’s say you want a scoring rule with two properties.
You want it to be local: that is to say, all that matters is the probability you assigned to the actual outcome. This is in contrast to rules like the quadratic scoring rule, where your score is different depending on how the outcomes that didn’t happen are grouped. Based on this assumption, I’m going to write the scoring rule as S(p), where S(p) is the score you get when you assign a probability p to the true outcome.
You also want it to play nicely with combining separate events. That is to say, if you assign a 10% probability to it being cloudy (and it is), and a 10% probability to it being warm outside (and it is), you want your total score to be the same as if you had assigned 1% to the single correct proposition that it is warm and cloudy outside. More succinctly: S(p)+S(q)=S(pq).
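Just to illustrate (my addition, not part of the comment above): the logarithmic rule S(p) = log(p) passes exactly this check for the example given, since log(0.1) + log(0.1) = log(0.01).

```python
# Quick check (illustration only): the log scoring rule S(p) = log(p)
# satisfies S(p) + S(q) = S(p * q) for the cloudy/warm example above.
import math

def log_score(p):
    """Score for assigning probability p to the outcome that actually occurred."""
    return math.log(p)

cloudy, warm = 0.10, 0.10      # two separate 10% predictions, both came true
combined = cloudy * warm       # 1% on the conjunction "warm and cloudy"

print(log_score(cloudy) + log_score(warm))  # -4.605...
print(log_score(combined))                  # -4.605... (same)
```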
If you add the additional caveat that not every score is 0, then you are forced by the above equation to a logarithmic scoring rule. Interestingly, you don’t need to include the requirement that it be a proper scoring rule, although the logarithmic scoring rule is proper.
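For the curious, here is the standard derivation sketch (my addition; it also needs a mild regularity assumption such as continuity, which the comment above leaves implicit). Substitute p = e^x and q = e^y and define f(x) = S(e^x). Then S(p) + S(q) = S(pq) becomes f(x) + f(y) = f(x + y), which is Cauchy's functional equation; its continuous solutions are f(x) = kx, so S(p) = k·log(p). The caveat that not every score is 0 simply rules out k = 0, and choosing k > 0 makes assigning a higher probability to the true outcome score better.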