SarahNibs comments on How to evaluate (50%) predictions

SarahNibs 10 Apr 2020 21:34 UTC
22 points
I think you’re conflating impressiveness of predictions with calibration of predictions. It’s possible to be perfectly calibrated and unimpressive (for every statement, guess 50% and then randomly choose whether to use the statement or its negation). It’s also possible to be uncalibrated and very impressive (be really good at finding all the evidence swaying things from baseline, but count all evidence 4x, for instance).
50% predictions don’t really tell us about calibration due to the “swap statement with negation at random” strategy, but they can tell us plenty about impressiveness.
- Rafael Harth 11 Apr 2020 9:07 UTC
  3 points
  Parent
  I might have been unclear, but I didn’t mean to conflate them. The post is meant to be just about impressiveness. I’ve stated in the end that impressiveness is boldness $\cdot$ accuracy (which I probably should have called calibration). It’s possible to have perfect accuracy and zero boldness by making predictions about random number generators.
  I disagree that 50% predictions can’t tell you anything about calibration. Suppose I give you 200 statements with baseline probabilities, and you have to turn them into predictions by assigning them your own probabilities while following the rule. Once everything can be evaluated, the results on your 50% group will tell me something about how well calibrated you are.
  (Edit: I’ve changed the post to say impressiveness = calibration $\cdot$ boldness)
  - Douglas_Knight 11 Apr 2020 16:57 UTC
    6 points
    Parent
    The title and first sentence are about calibration. You never hear very smart people saying that 50% predictions are meaningless in the context of accuracy.
    There’s nothing magical about 50%. The closer the predictions are to 50%, the harder it is to judge calibration.
    - Matt Goldenberg 3 May 2020 15:40 UTC
      2 points
      Parent
      You never hear very smart people saying that 50% predictions are meaningless in the context of accuracy.
      I have heard this from very smart people.
      - Douglas_Knight 3 May 2020 17:46 UTC
        2 points
        Parent
        Could you give an example?
        Could you give an example where the claim is that 50% predictions are less meaningful than 10% predictions?
        How do you know that it is about accuracy?
        Matt Goldenberg 4 May 2020 1:20 UTC
        2 points
        Parent
        I don’t really want to point to specific people. I can think of a couple of conversations with smart EAs or Rationalists where this claim was made.
        Douglas_Knight 4 May 2020 2:17 UTC
        2 points
        Parent
        So you probably won’t convince me that these people know what the claim is, but you haven’t even attempted to convince me that you know what the claim is. Do you see that I asked multiple questions?
        Matt Goldenberg 4 May 2020 2:23 UTC
        2 points
        Parent
        Do you see how giving very specific answers to this question would be the same as stating people’s names?
        Suffice it to say that I understand the difference between impressiveness and calibration, and it didn’t seem like they did before our conversation, even though they are smart.
        Douglas_Knight 4 May 2020 2:43 UTC
        2 points
        Parent
        Well, that’s something, but I don’t see how it’s relevant to this thread.
        Matt Goldenberg 4 May 2020 2:50 UTC
        2 points
        Parent
        I think you’re conflating impressiveness of predictions with calibration of predictions.
        Could you give an example where the claim is that 50% predictions are less meaningful than 10% predictions?
        I mean, these things? A very similar claim to “10% are less meaningful than 50%” which was due to conflating impressiveness and calibration.
        It may be that we’re just talking past each other?
        Douglas_Knight 4 May 2020 18:54 UTC
        2 points
        Parent
        Yes, exactly: this post conflates accuracy and calibration. Thus it is a poor antidote to people who make that mistake.
        Expand this thread
        Matt Goldenberg 4 May 2020 19:01 UTC
        2 points
        Parent
        I do think we’re talking past each other now as I don’t know how this relates to our previous discussion.
        At any rate I don’t think the discussion is that high value to the rest of the post so I think I’ll just leave it here.