Jan Christian Refsgaard comments on Use Normal Predictions

Jan Christian Refsgaard 13 Jan 2022 6:10 UTC
1 point
That’s also how I conceptualize it: you have to change your intervals because you are too stupid to make better predictions. If the predictions were always spot on then sigma should be 0, and then my scheme does not make sense.

If you suck like me and get a prediction very close then I would probably say: that sometimes happens :) Note that I assume the average squared error should be 1, which means most errors are less than 1, because $(0^{2} + 2^{2})/2 = 2 > 1$.
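
The claim that most errors are less than 1 can be checked numerically under one extra assumption that the comment does not state: that the normalized errors are standard normal. A minimal sketch under that assumption (variable names are illustrative):

```python
from scipy import stats

# Assumption (not stated in the comment): normalized errors z ~ N(0, 1).
z = stats.norm(loc=0, scale=1)

# Fraction of errors with |z| < 1.
frac_below_one = z.cdf(1) - z.cdf(-1)

# Median of |z|: P(|z| <= m) = 0.5 is the same as P(z <= m) = 0.75.
median_abs_error = z.ppf(0.75)

# Mean squared error: E[z^2] = Var(z) + E[z]^2.
mean_squared_error = z.var() + z.mean() ** 2

print(frac_below_one, median_abs_error, mean_squared_error)
```

Under this assumption about 68% of errors are below 1 in absolute value and the median absolute error is about 0.67, while the mean squared error is 1.
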
- SimonM 13 Jan 2022 14:32 UTC
  1 point
  If you suck like me and get a prediction very close then I would probably say: that sometimes happens :) Note that I assume the average squared error should be 1, which means most errors are less than 1, because $(0^{2} + 2^{2})/2 = 2 > 1$.
  I assume you’re making some unspoken assumptions here, because $0^{2} + 2^{2} > 1^{2}$ is not enough to say that. A naive application of Chebyshev’s inequality would just say that $E(X^{2}) = 1, E(X) = 0 \Rightarrow P(|X| \geq 1) \leq 1$, which is vacuous.
  To be more concrete, if you were very weird and always ended up forecasting either 0.5 s.d. or 1.1 s.d. away (still with mean 0 and average squared error 1), then you’d find “most” errors are more than 1.
  - Jan Christian Refsgaard 13 Jan 2022 15:36 UTC
    1 point
    I am making the simple observation that the median error is less than one because the mean squared error is one.
    - SimonM 13 Jan 2022 15:39 UTC
      2 points
      That isn’t a “simple” observation.
      Consider an error which is 0.5 for 22% of the time and 1.1 for 78% of the time. The squared errors are 0.25 and 1.21. The median error is 1.1 > 1. (The mean squared error is approximately 1.)
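
      These numbers can be checked directly. A small sketch, treating the error as a two-point distribution (variable names are illustrative):

      ```python
      import numpy as np

      # Two-point error distribution from the comment above.
      errors = np.array([0.5, 1.1])
      probs = np.array([0.22, 0.78])

      # Mean squared error: probability-weighted average of the squared errors.
      mse = float(np.sum(probs * errors**2))

      # Median: smallest error whose cumulative probability reaches 0.5.
      order = np.argsort(errors)
      cum = np.cumsum(probs[order])
      median = float(errors[order][np.searchsorted(cum, 0.5)])

      print(mse, median)
      ```

      The mean squared error comes out as 0.9988, i.e. 1 after rounding, and the median is 1.1.
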
      - Jan Christian Refsgaard 14 Jan 2022 10:56 UTC
        1 point
        Yes, you are right, but under the assumption that the errors are normally distributed, I am right:
        
        If:
        
        $p \sim \mathrm{Bern}(0.78)$

        $σ = p \times N(0, 1.1) + (1 - p) \times N(0, 0.5)$
        
        Then the median of $σ^{2}$ is $\approx 0.37$, which is much less than 1.
        
        proof:
        
            import numpy as np
            from scipy import stats
            # 22% of the errors come from N(0, 0.5), 78% from N(0, 1.1)
            x1 = stats.norm(0, 0.5).rvs(22 * 10000)
            x2 = stats.norm(0, 1.1).rvs(78 * 10000)
            x12 = np.concatenate([x1, x2])
            print(np.median(x12 ** 2))  # median of the squared errors

        SimonM 14 Jan 2022 14:54 UTC
        3 points
        Under what assumption?
        1/ In what you’ve written above, you aren’t “[assuming] the errors are normally distributed”, since a mixture of two normals isn’t normal.
        2/ If your assumption is $X \sim N(0, 1)$ then yes, I agree the median of $X^{2}$ is ~0.45 (although

            from scipy import stats
            stats.chi2.ppf(.5, df=1)  # 0.454936

        would have been an easier way to illustrate your point). I think this is actually the assumption you’re making. [Which is a horrible assumption, because if it were true, you would already be perfectly calibrated].
        3/ I guess your new claim is “[assuming] the errors are a mixture of normal distributions, centered at 0”, which, okay, fine, that’s probably true; I don’t care enough to check because it seems a bad assumption to make.
        More importantly, there’s a more fundamental problem with your post. You can’t just take some numbers from my post and then put them in a different model and think that’s in some sense equivalent. It’s quite frankly bizarre. The equivalent model would be something like:
        $p \sim \mathrm{Bern}(0.78)$
        $σ \sim p \cdot N(1.1, ε) + (1 - p) \cdot N(0.5, ε)$
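
        To make the difference between the two models in this exchange concrete, here is a hedged simulation sketch comparing the zero-centered mixture from the parent comment with the concentrated mixture written just above. The sample size, the seed, and the choice ε = 0.01 are assumptions for illustration only; neither comment fixes them.

        ```python
        import numpy as np

        rng = np.random.default_rng(0)
        n = 200_000  # assumed sample size
        eps = 0.01   # assumed small spread for the concentrated model

        # p ~ Bern(0.78): True picks the "1.1" component, False the "0.5" one.
        p = rng.random(n) < 0.78

        # Zero-centered mixture: N(0, 1.1) with prob. 0.78, N(0, 0.5) otherwise.
        zero_centered = np.where(p, rng.normal(0, 1.1, n), rng.normal(0, 0.5, n))

        # Concentrated mixture: errors sit near 1.1 or near 0.5.
        concentrated = np.where(p, rng.normal(1.1, eps, n), rng.normal(0.5, eps, n))

        for name, x in [("zero-centered", zero_centered), ("concentrated", concentrated)]:
            print(name, "mean sq:", np.mean(x**2), "median sq:", np.median(x**2))
        ```

        Both models have a mean squared error close to 1, but the median squared error is roughly 0.37 for the zero-centered mixture and above 1 for the concentrated one.
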
        Jan Christian Refsgaard 16 Jan 2022 21:47 UTC
        1 point
        Our ability to talk past each other is impressive :)
        
        would have been an easier way to illustrate your point). I think this is actually the assumption you’re making. [Which is a horrible assumption, because if it were true, you would already be perfectly calibrated].
        
        Yes, this is almost the assumption I am making. The general point of this post is to assume that all your predictions follow a Normal distribution, with $μ$ as “guessed” and with a $σ$ that is different from what you guessed, and then use $X^{2}$ to get a point estimate for the counterfactual $σ$ you should have used. And, as you point out, if the (counterfactual) $σ = 1$ then the point estimate suggests you are well calibrated.
        
        In the post, the counterfactual $σ$ is $\hat{σ}_{z}$.
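
A minimal sketch of that point estimate, assuming it is the root mean square of the normalized errors $z_i = (x_i - μ_i)/σ_i$, which is one reading of the description above. The function name `sigma_hat_z` and the example numbers are made up for illustration.

```python
import numpy as np

def sigma_hat_z(mu, sigma, outcome):
    """Root mean square of the normalized errors (hypothetical helper)."""
    z = (np.asarray(outcome) - np.asarray(mu)) / np.asarray(sigma)
    return float(np.sqrt(np.mean(z**2)))

# Made-up example: three predictions (mu, sigma) and their outcomes.
mu = [10.0, 20.0, 30.0]
sigma = [1.0, 2.0, 3.0]
outcome = [12.0, 20.0, 24.0]

estimate = sigma_hat_z(mu, sigma, outcome)
print(estimate)
```

Here the normalized errors are 2, 0 and -2, so the estimate is about 1.63. Under this reading, a value above 1 would suggest the stated $σ$ values were too small by roughly that factor, and a value near 1 would suggest good calibration.
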