Manfred comments on Log-odds (or logits)

Manfred Nov 28, 2011, 2:08 AM
13 points
Comments:

Log base ten may be more intuitive for conversion purposes. Then adding another 9 to the probability (0.9, 0.99, 0.999, and so on) corresponds to adding roughly 1 to the log-odds.

“Five times more likely” should overflow for probabilities greater than 0.2. This is because the terminology “times more likely” is usually used in the context of decision-making, so it manipulates the linear probabilities because that’s what goes into the expected utility.
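
A minimal sketch of both points, assuming log-odds are defined as log10(p / (1 - p)). The function names are illustrative and not taken from the article:

```python
import math

def log10_odds(p):
    """Base-10 log-odds of a probability p, with 0 < p < 1."""
    return math.log10(p / (1 - p))

# Each extra nine adds roughly 1 to the base-10 log-odds.
for p in (0.9, 0.99, 0.999, 0.9999):
    print(f"p = {p}: log10 odds = {log10_odds(p):.3f}")

def five_times_probability(p):
    """'Five times more likely' read as scaling the linear probability."""
    return 5 * p

def five_times_odds(p):
    """The same phrase read as scaling the odds, i.e. adding log10(5)."""
    odds = 5 * p / (1 - p)
    return odds / (1 + odds)

# Scaling the probability overflows past p = 0.2; scaling the odds never does.
for p in (0.1, 0.2, 0.3):
    print(f"p = {p}: x5 probability = {five_times_probability(p):.2f}, "
          f"x5 odds = {five_times_odds(p):.3f}")
```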
- brilee Nov 28, 2011, 2:47 AM
  8 points
  Yeah, I was definitely thinking about that. The mathematician in me won out in the end.
  
  It occurs to me that a lot of people have probably thought about this, and they have alternately used base 2, base e, and base 10. Unless we get the entire LW community to standardize on one base, we won’t be able to coherently communicate with one another using log-probabilities, and therefore log-probabilities will stay relegated to the dustbin.
  
  base 2 - advantages, we can talk about N bytes’ worth of evidences.
  
  base e - mathematician’s base
  
  base 10 - common layperson can understand it, advantages with the 9’s and 0’s.
  
  Actually, I think you’re right, log base 10 is probably better. If others agree, I’ll rewrite the article in base 10.
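  
  The three candidate bases differ only by a constant factor. A rough sketch of the conversion, using the standard unit names bit, nat and ban for base 2, base e and base 10 (the function name and the example probability are illustrative):

  ```python
  import math

  def log_odds(p, base):
      """Log-odds of probability p in the given logarithm base."""
      return math.log(p / (1 - p), base)

  p = 0.8  # odds of 4:1
  bits = log_odds(p, 2)        # base 2
  nats = log_odds(p, math.e)   # base e
  bans = log_odds(p, 10)       # base 10

  # Changing base only rescales: log_b(x) = ln(x) / ln(b).
  print(bits, nats / math.log(2))
  print(bans, nats / math.log(10))
  ```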
  - Zack_M_Davis Nov 28, 2011, 3:02 AM
    12 points
    
    base e - mathematician’s base
    
    What’s the specific benefit of base e for log-odds, though? Base e has lots of special properties that make it useful in many areas of mathematics (e^x is its own derivative, de Moivre’s formula, &c.), but is this one of them? (It could be; I don’t know.)
    - [deleted] Nov 28, 2011, 9:22 PM
      12 points
      To quote Jaynes, p.91 of PT:TLoS:
      
      In many applications it is convenient to take the logarithm of the odds because of the fact that we can then add up terms. Now we could take the logarithm to any base we please, and this cost the writer some trouble. Our analytic expressions always look neater in terms of natural (base e) logarithms. But back in the 1940s and 1950s when this theory was first developed, we used base 10 logarithms because they were easier to find numerically; the four-figure tables would fit on a single page. Finding a natural logarithm was a tedious process, requiring leafing through enormous old volumes of tables.
      
      Today, thanks to hand calculators, all such tables are obsolete and anyone can find a ten-digit natural logarithm just as easily as a base 10 logarithm. Therefore, we started happily to rewrite this section in terms of the aesthetically prettier natural logarithms. But the result taught us that there is another, even stronger, reason for using base 10 logarithms. Our minds are thoroughly conditioned to the base 10 number system, and base 10 logarithms have an immediate, clear intuitive meaning to all of us. However, we just don’t know what to make of a conclusion stated in terms of natural logarithms, until it is translated back into base 10 terms. Therefore, we re-wrote this discussion, reluctantly, back into the old, ugly base 10 convention.
      
      So to answer your question, the only advantage of base e is that “ln” looks tidier than “log10”.
      
      Apart from being more intuitively understandable to humans, using base 10 also allows us to multiply by 10 and measure evidence in the familiar unit of decibels.
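      
      A minimal sketch of the decibel convention, assuming evidence is defined as 10 * log10(odds). The function names and the 4:1 likelihood ratio are illustrative:

      ```python
      import math

      def evidence_db(p):
          """Evidence for a proposition with probability p, in decibels."""
          return 10 * math.log10(p / (1 - p))

      def probability_from_db(db):
          """Invert evidence_db: recover the probability from decibels."""
          odds = 10 ** (db / 10)
          return odds / (1 + odds)

      # Independent pieces of evidence add in decibels.
      prior_db = evidence_db(0.5)               # even odds
      likelihood_ratio_db = 10 * math.log10(4)  # data favouring the hypothesis 4:1
      posterior = probability_from_db(prior_db + likelihood_ratio_db)
      print(posterior)
      ```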
  - Steve_Rayhawk Nov 30, 2011, 6:43 PM
    8 points
    The natural unit of ratio, the neper (Np), is easier to interpret for small ratio contributions, where the derivative of exp(x) is ≈1:
    
    0.1Np = exp( 0.1) ∶ 1 ≈ 1.1 ∶ 1
    -0.1Np = exp(-0.1) ∶ 1 ≈ 0.9 ∶ 1
    
    This could make for an easy upgrade path to use of nepers or centinepers instead of percents in comparatives involving rates, which would reduce semantic confusion. “50% faster” can mean “gets 150% as far” (so .41Np faster, or 41 cNp, or perhaps 41Np%) or “takes 50% as much time” (so .69Np faster, or 69cNp, or 69Np%). That’s an argument for using nepers as a standard base outside communications of probability.
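    
    The figures above can be checked with a short sketch, taking a neper to be the natural logarithm of a ratio as in this comment (the function and variable names are illustrative):

    ```python
    import math

    def nepers(ratio):
        """Express a ratio in nepers (natural log of the ratio)."""
        return math.log(ratio)

    # Small contributions read almost like percentages.
    print(math.exp(0.1), math.exp(-0.1))

    # The two readings of "50% faster".
    gets_150_percent_as_far = nepers(1.5)   # speed ratio 1.5
    takes_half_the_time = nepers(1 / 0.5)   # speed ratio 2
    print(round(100 * gets_150_percent_as_far), "cNp")
    print(round(100 * takes_half_the_time), "cNp")
    ```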
    
    (trivia: Nepers and radians are each other turned sideways, being respectively the real and imaginary parts of eigenvalues of linear differential equation systems.)
  - wedrifid Nov 28, 2011, 12:36 PM
    8 points
    
    base 2 - advantages, we can talk about N bytes’ worth of evidences.
    
    Wouldn’t it be easier to talk about N bytes worth of evidence in base 256? Bits of evidence seems the more useful metric!
  - brilee Nov 30, 2011, 4:12 PM
    0 points
    Article is rewritten in base 10, and I rewrote some of the explanation for Bayesian updates. Enjoy!
  - shokwave Nov 28, 2011, 11:36 AM
    0 points
    I would like to see the article in base 10.