TobyBartels comments on 2014 Less Wrong Census/Survey

TobyBartels 23 Oct 2014 7:07 UTC
6 points
I’d be much more comfortable answering the probability sections if I knew what epsilon is. I usually say 0% when the value is less than 0.5% and 100% when the value is greater than 99.5%, rounding to the nearest whole percentage, on the grounds that the whole point of using percentages is to avoid explicit fractions (common or decimal). But then you ruin this by explicitly mentioning 0.5% and 99.99% as possible answers. If you had put a hard limit on the number of digits allowed, then I could have used that. In the end, since I saw no consistent guidance, I fell back on my usual practice. The result is that I had a lot of 0s and 100s; hopefully that won’t mess up your algorithms.

ETA: It is probably relevant here that I am a naturally lazy person.
- Sarunas 26 Oct 2014 21:07 UTC
  4 points
  Parent
  I think it might have been better to ask people to estimate the odds that a given statement is true. If the probability of a statement is close to zero or close to one, odds give us better precision without having to worry about digits after the decimal point (however, if a probability is close to one half, it is probably better to ask for a probability). Although it is easy to convert odds to probabilities, how many people in this survey actually took the mental effort to calculate the odds first and only then express them as probabilities? I might be wrong, but I guess that only a minority did. An idea for next year’s survey: it might be interesting to compare the answers of two groups, one asked to estimate probabilities and the other asked to estimate odds.
  - Elund 27 Oct 2014 1:18 UTC
    2 points
    Parent
    Are you using “odds” to refer to percentages and “probabilities” to refer to fractions? I don’t think there is actually any difference in meaning between the two terms.
    - TobyBartels 27 Oct 2014 3:22 UTC
      2 points
      Parent
      Colloquial language doesn’t make this distinction, but by technical convention, they are different.
      
      Specifically, ‘odds’ refers to expressions like ‘5 to 3 against’; numerically, that’s the fraction ⁵⁄₃, or rather (because of the ‘against’) its reciprocal, ³⁄₅. Thus odds run from 0 (impossible) to infinity (certain), with odds of 1 being perfectly balanced between Yes and No. In contrast, probabilities run only from 0 to 1. An event with odds of 5 to 3 against, or equivalently odds of ³⁄₅, has a probability of 3/(3+5) = ³⁄₈. So the numbers are different. The conversion formulas are O = P/(1 − P) and P = O/(1 + O).
      
      Then there are log-odds; this is log₂ O bits. (You can also use other bases than 2 and correspondingly other units than bits.) Now 0 indicates perfect balance between Yes and No; a positive number means more likely Yes than No, and a negative number means less likely Yes than No. Log-odds run from negative infinity (impossible) to infinity (certain).
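The conversions described above can be sketched in a few lines of Python. This is only an illustration of the formulas O = P/(1 − P), P = O/(1 + O), and log₂ O; the function names are made up for the example and do not come from the survey or any library.

```python
import math
from fractions import Fraction


def odds_from_probability(p):
    """O = P / (1 - P). Not defined at P = 1, where the odds are infinite."""
    return p / (1 - p)


def probability_from_odds(o):
    """P = O / (1 + O)."""
    return o / (1 + o)


def log_odds_bits(o):
    """Log-odds in bits: log2(O). Zero means Yes and No are perfectly balanced."""
    return math.log2(float(o))


# '5 to 3 against' is 3 to 5 in favour, i.e. odds of 3/5.
odds = Fraction(3, 5)
prob = probability_from_odds(odds)   # Fraction(3, 8)
bits = log_odds_bits(odds)           # negative: less likely Yes than No
```

Using `Fraction` keeps the ³⁄₅ and ³⁄₈ from the example exact; ordinary floats work with the same functions.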
      - Elund 27 Oct 2014 6:09 UTC
        2 points
        Parent
        
        Specifically, ‘odds’ refers to expressions like ‘5 to 3 against’
        
        Oh right, I forgot about that definition. The main probability conversions that I was aware of involved converting between fractions and percentages, sometimes expressed instead as probabilities between 0 and 1. Theoretically, it makes sense that odds can also be converted to or from probabilities, now that I think about it. Thanks for your explanation.
      - VAuroch 2 Nov 2014 22:48 UTC
        −5 points
        Parent
        ‘5 to 3 against’ is ³⁄₈, not ³⁄₅. Odds of ‘N to M’ or ‘N to M against’ are always between 0 and 1.
        EHeller 7 Nov 2014 2:40 UTC
        3 points
        Parent
        5 to 3 against is ³⁄₅ (as odds), which is a probability of ³⁄₈. You are muddling probability and odds ratios in an unacceptable way.
        arundelo 2 Nov 2014 23:25 UTC
        −1 points
        Parent
        Wikipedia:
        
        “Odds can be expressed as a ratio of two numbers [or] as a number, by dividing the terms in the ratio [....] Odds range from 0 to infinity, while probabilities range from 0 to 1 [...]”
        
        VAuroch 3 Nov 2014 23:09 UTC
        −4 points
        Parent
        Yes, that’s exactly what I said. There is no way to express a fraction greater than 100% using odds notation; saying that odds are “1 million to 1” is 99.9999%, still under 1.
        arundelo 7 Nov 2014 8:30 UTC
        0 points
        Parent
        In the Wikipedia article, take a look at the table below the words “These are worked out for some simple odds”. The odds that TobyBartels is talking about, which one gets by dividing the numbers in an “n to m” expression, and which go from zero to infinity, are shown in the second and third columns of that table (o_f and o_a). Probabilities, which go from 0 to 1 or 0% to 100%, are shown in the fourth and fifth columns (p and q).
        TobyBartels 6 Nov 2014 19:56 UTC
        0 points
        Parent
        
        Yes, that’s exactly what I said.
        
        You said ‘Odds […] are always between 0 and 1’, while Wikipedia said ‘Odds range from 0 to infinity’, so you didn’t say the same thing.
        VAuroch 7 Nov 2014 0:41 UTC
        −4 points
        Parent
        Did you actually read the article you linked? It says the exact same thing as I did, phrased differently. Their “Odds range from 0 to infinity” means that any number from 0 to infinity can be used in the odds ratio, but it still always represents a probability between 0 and 1. Which is precisely what I said.
        TobyBartels 10 Nov 2014 10:26 UTC
        0 points
        Parent
        No, that’s not what you said. I am now done with this conversation.
        nshepperd 7 Nov 2014 1:53 UTC
        0 points
        Parent
        Um, representing a number between 0 and 1 is not the same as being a number between 0 and 1. The representation of p = ³⁄₈ as odds = ³⁄₅ (“5 to 3 against”) is useful in practice, for example because Bayes’ rule reduces to plain multiplication for odds ratios.
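As a rough illustration of that last point, here is a sketch comparing the odds form of Bayes’ rule (posterior odds = prior odds × likelihood ratio) with the usual probability form. The likelihoods P(E|H) = ¹⁄₂ and P(E|not H) = ¹⁄₁₀ are invented for the example, and the function names are hypothetical; only the prior of ³⁄₈ (odds ³⁄₅) comes from the thread.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' rule: a single multiplication."""
    return prior_odds * likelihood_ratio


def posterior_probability(prior, p_e_given_h, p_e_given_not_h):
    """Probability form of Bayes' rule, for comparison."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))


# Prior from the thread: probability 3/8, i.e. odds 3/5.
prior_p = Fraction(3, 8)
prior_o = prior_p / (1 - prior_p)

# Assumed likelihoods, chosen only for illustration.
p_e_h = Fraction(1, 2)
p_e_not_h = Fraction(1, 10)

post_o = posterior_odds(prior_o, p_e_h / p_e_not_h)
post_p = posterior_probability(prior_p, p_e_h, p_e_not_h)

# Converting the posterior odds back with P = O / (1 + O) gives the same answer.
same = post_o / (1 + post_o) == post_p
```

Both routes agree; the odds route just avoids the normalising denominator until the final conversion back to a probability.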
  - TobyBartels 27 Oct 2014 0:15 UTC
    1 point
    Parent
    Yes, odds are good (and log-odds are even better), but people are bad at both dealing with very large absolute values and dealing with very fine precisions. I think that the survey is correct to put in a cut-off (whether an ϵ for probabilities, an N for log-odds, or one of each for odds); it should just tell us where. (Edit: put in stuff about log-odds properly.)
- NancyLebovitz 25 Oct 2014 2:25 UTC
  3 points
  Parent
  Epsilon is a minuscule amount. It’s vanishingly small, but it’s still there.
  - TobyBartels 27 Oct 2014 0:13 UTC
    5 points
    Parent
    Yes, but which minuscule amount?
    
    To be more specific: If ϵ ≥ 5 × 10⁻ⁿ (which it must be for some n, if it is a positive real number), then I only need to figure out my probability to n + 1 digits. Upon doing so, if it’s all 0s, then my probability is no more than ϵ, so I can enter 0. Otherwise, I should enter something larger. (And a similar thing holds on the other end.) Specifying ϵ serves the practical purpose of telling us how much work to put into estimating our probabilities. Since I had no guideline for that, I chose to default to ϵ = ¹⁄₂ (in percentage points), rather than try to additionally work out how small ϵ was supposed to be.
    
    If, instead of bringing up ϵ, the survey had instructed us to use as many decimals as we need to avoid ever answering either 0 or 100, then I probably would have done more work. (There are reasons why this is bad, since the results will be increasingly unreliable, but still it could have said that.) But since I knew that at some point my work would be ignored, I didn’t do any.
    
    (Edits: minor grammar and precise phrasing of inequalities.)
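The role of ϵ described above can be sketched as a simple cut-off rule. This is a guess at the kind of rule being discussed, not the survey’s actual procedure: the function name and the sample values are assumptions, and ϵ is measured in percentage points as in the comment.

```python
def answer_with_cutoff(percent, epsilon):
    """Collapse estimates within epsilon of either end to 0 or 100.

    `percent` and `epsilon` are both in percentage points. Anything farther
    than epsilon from both ends is returned unchanged.
    """
    if percent < epsilon:
        return 0
    if percent > 100 - epsilon:
        return 100
    return percent


# With the default practice from the comment (epsilon = 1/2 percentage point):
low = answer_with_cutoff(0.3, 0.5)     # collapses to 0
high = answer_with_cutoff(99.7, 0.5)   # collapses to 100

# With a much smaller (hypothetical) epsilon the same estimate must be kept,
# which is why knowing epsilon tells you how much precision is worth working out.
kept = answer_with_cutoff(0.3, 0.01)
```

The smaller ϵ is, the more digits a respondent has to work out before knowing whether an answer collapses to 0 or 100.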
    - CBHacking 27 Oct 2014 11:08 UTC
      1 point
      Parent
      I took epsilon to be simply 0.5, on the basis of “the survey can take decimals but I’m going to use whole numbers as suggested, so 0 means I rounded down anything less than 0.5”. This is imprecise but gives me greater confidence in my answers, and (as you say), I have some tendency towards laziness.
      - TobyBartels 30 Oct 2014 4:45 UTC
        0 points
        Parent
        Yes, that’s what I did too (0.5%).
- Elund 27 Oct 2014 1:25 UTC
  0 points
  Parent
  I don’t think it will mess up the algorithms. My guess is that most people probably rounded most calibration answers to the tens place due to lack of enough confidence to be more precise, but since people are giving different values, the average across all respondents is unlikely to fall on an increment of ten, and should be a reasonably accurate measure of the respondents’ collective assigned probability for a question.
  - TobyBartels 27 Oct 2014 3:27 UTC
    0 points
    Parent
    It could mess them up, because in theory a single wrong answer with 100% confidence renders the entire series infinitely poorly calibrated. The survey says that this won’t be done, that 100% will be treated as something slightly less than that. But how much less could depend on assumptions that the survey-makers made about how often people would answer this way, and maybe I did it too much.
    
    I doubt it, since I’m pretty sure that they know enough about these pitfalls to avoid them. But I felt that I answered 0 and 100 quite a lot, so I thought that some warning was in order.
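To make the worry concrete, here is a sketch using a logarithmic scoring rule, which is one common way to score calibration. Whether the survey’s analysis uses this rule is an assumption, and so are the function name and the 0.005 clamp; the point is only that an unclamped wrong answer at 100% scores negative infinity, while treating 100% as slightly less keeps the score finite.

```python
import math


def log_score(stated_probability, outcome, epsilon=0.0):
    """Log2 of the probability assigned to what actually happened.

    `stated_probability` is the probability given to 'true' (0 to 1).
    Answers are first clamped to [epsilon, 1 - epsilon]; epsilon = 0 means
    no clamping at all.
    """
    p = min(max(stated_probability, epsilon), 1 - epsilon)
    p_actual = p if outcome else 1 - p
    return math.log2(p_actual) if p_actual > 0 else -math.inf


# One wrong answer at 100% confidence, unclamped: infinitely bad.
unclamped = log_score(1.0, False)

# The same answer treated as 99.5% (an assumed clamp): bad, but finite.
clamped = log_score(1.0, False, epsilon=0.005)
```

How harsh the clamped penalty is depends entirely on the chosen ϵ, which is why the number of 0 and 100 answers could matter.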
- Vulture 25 Oct 2014 21:40 UTC
  0 points
  Parent
  Even though percentages are typically used for cases where precision is less important, I’d say that in this context it would be better to err on the side of precision.
