Log odds, measured in something like “bits of evidence” or “decibels of evidence”, is the natural thing to think of yourself as “counting”. A probability of 100% would be like having infinite positive evidence for a claim and a probability of 0% is like having infinite negative evidence for a claim. Arbital has some math and Eliezer has a good old essay on this.
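As a rough illustration of the “counting” picture, here is a minimal Python sketch that converts a probability into log odds, in bits (log base 2) and in decibels (10 times log base 10). The function names are my own, not anything from Arbital or the essay, and mapping 0% and 100% to minus and plus infinity is just the convention the paragraph above describes.

```python
import math

def odds(p: float) -> float:
    """Odds in favor, p / (1 - p). Returns infinity for p == 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    return math.inf if p == 1.0 else p / (1.0 - p)

def log_odds_bits(p: float) -> float:
    """Log odds in bits; -inf at p == 0 and +inf at p == 1."""
    o = odds(p)
    return -math.inf if o == 0.0 else math.log2(o)

def log_odds_decibels(p: float) -> float:
    """Log odds in decibels, i.e. 10 * log10(odds)."""
    o = odds(p)
    return -math.inf if o == 0.0 else 10.0 * math.log10(o)

for p in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(p, log_odds_bits(p), log_odds_decibels(p))
```

A probability of 0.5 is 0 bits, 0.8 (odds of 4:1) is about 2 bits, and the two endpoints come out as infinities rather than finite counts.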
A good general heuristic (or widely applicable hack) to “fix your numbers to even be valid numbers” when estimating probabilities from counts (like in a fast and dirty spreadsheet analysis) is “pseudo-counting”, where every category that is analytically possible is treated as having been “observed once in our imaginations”. This keeps naive division on small numbers from spitting out 0% or 100% (like seeing 3 out of 3 of something and concluding that the probability of that thing is 100%). So if you can fail or succeed, and you’ve seen 3 of one outcome and none of the other, pseudocounts let you guesstimate that whatever happened every time so far is (3+1)/(3+2) == 80% likely in the future, and whatever you’ve never seen is (0+1)/(3+2) == 20% likely.
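The pseudo-counting arithmetic can be sketched in a few lines of Python. This is a minimal sketch, assuming one imaginary observation per category (often called add-one or Laplace smoothing); the function name and the `pseudo` parameter are made up for the example.

```python
from fractions import Fraction

def pseudocount_estimates(counts, pseudo=1):
    """Estimate category probabilities after adding `pseudo` imaginary
    observations to every analytically possible category."""
    total = sum(counts.values()) + pseudo * len(counts)
    return {name: Fraction(n + pseudo, total) for name, n in counts.items()}

# Three successes observed, no failures yet.
estimates = pseudocount_estimates({"success": 3, "failure": 0})
for name, prob in estimates.items():
    print(name, prob, float(prob))
```

With 3 successes and 0 failures this gives 4/5 and 1/5, matching the 80% and 20% above, and no category ever comes out at exactly 0 or 1.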
Odds (and log odds) solve some problems but they unfortunately create others.
For addition and multiplication, at least, they seem to make things worse. We know that we can add probabilities if they are “mutually exclusive” to get the probability of their disjunction, and we know we can multiply them if they are “independent” to get the probability of their conjunction. But when can we add two odds, or multiply two odds (or log odds)? And what would be the interpretation of the result?
On the other hand, unlike for probabilities, multiplication by constants does seem unproblematic for odds (as does addition of constants for log odds). E.g. “doubling” some odds always makes sense because odds are unbounded from above, while doubling a probability is not always possible. And when it is possible, it is questionable whether the result has any sensible interpretation.
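To make the contrast concrete, here is a small Python sketch comparing doubled probabilities with doubled odds. The helper names and the three sample probabilities are arbitrary choices for illustration.

```python
def odds_from_prob(p):
    """Convert a probability (below 1) to odds in favor."""
    return p / (1 - p)

def prob_from_odds(o):
    """Convert finite odds in favor back to a probability."""
    return o / (1 + o)

def double_odds(p):
    """Probability obtained by doubling the odds that correspond to p."""
    return prob_from_odds(2 * odds_from_prob(p))

for p in (0.1, 0.6, 0.9):
    doubled_p = 2 * p
    status = "still a probability" if doubled_p <= 1 else "not a probability"
    print(p, "doubled probability:", doubled_p, status,
          "| doubled odds:", round(double_odds(p), 4))
```

Doubling the odds of 0.6 (1.5 to 3) lands on 0.75, while doubling the probability itself overshoots 1.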
But the way Arbital and Eliezer handle it doesn’t actually make use of this fact. They instead treat the likelihood ratio (or its logarithm) as evidence strength. But, as I said, the likelihood ratio is a ratio of probabilities, not of odds, in which case the interpretation as evidence strength is shaky. The likelihood ratio assumes that doubling a small probability of the evidence constitutes the same evidence strength as doubling a relatively large one, which doesn’t seem right.
As a formal example, assume the hypothesis H doubles the probability of evidence E1 compared to ¬H. That is, we have the likelihood ratio P(E1|H)/P(E1|¬H) = 2. Since log2(2) = 1, E1 is interpreted to constitute 1 bit of evidence in favor of H.
Then assume we also have some evidence E2 whose probability is likewise doubled by H compared to ¬H. So E2 is also interpreted as 1 bit of evidence in favor of H.
Does this mean both cases involve equal evidence strength? Arguably no. For example, the probability of E1 may be quite small while the probability of E2 may be quite large. This would mean H hardly decreases the probability of ¬E1 compared to ¬H, while H strongly decreases the probability of ¬E2 compared to ¬H. So P(¬E1|H)/P(¬E1|¬H) ≫ P(¬E2|H)/P(¬E2|¬H).
So according to the likelihood ratio theory, E1 would be moderate (1 bit) evidence for H, and E2 would be equally moderate evidence for H, but ¬E1 would be very weak evidence against H while ¬E2 would be very strong evidence against H.
That seems implausible. Arguably E2 is here much stronger evidence for H than E1.
Here is a more concrete example:
H = The patient actually suffers from Diseasitis.
E1 = The patient suffers from Diseasitis according to test 1.
E2 = The patient suffers from Diseasitis according to test 2.
P(E1|H)=0.02
P(E1|¬H)=0.01
P(E2|H)=0.98
P(E2|¬H)=0.49
Log likelihood ratio E1: log2(P(E1|H)/P(E1|¬H)) = 1 bit
Log likelihood ratio E2: log2(P(E2|H)/P(E2|¬H)) = 1 bit
So this says both tests represent equally strong evidence.
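The numbers above can be checked with a short Python sketch, which also computes the likelihood ratios of the negative results (¬E1 and ¬E2) from the earlier argument. The prior of 0.2 is a hypothetical value that I am assuming only to show a posterior; it is not part of the example. The function and variable names are likewise my own.

```python
import math

# P(positive | H) and P(positive | not H) for each test, from the example.
TESTS = {
    "test 1": (0.02, 0.01),
    "test 2": (0.98, 0.49),
}

PRIOR = 0.2  # hypothetical prior P(H); an assumption, not from the example

def log2_likelihood_ratio(p_given_h, p_given_not_h):
    """Log likelihood ratio in bits."""
    return math.log2(p_given_h / p_given_not_h)

def posterior(prior, p_given_h, p_given_not_h):
    """Odds form of Bayes' rule: posterior odds = prior odds * likelihood ratio."""
    post_odds = prior / (1 - prior) * (p_given_h / p_given_not_h)
    return post_odds / (1 + post_odds)

results = {}
for name, (p_h, p_not_h) in TESTS.items():
    results[name] = {
        "positive_bits": log2_likelihood_ratio(p_h, p_not_h),
        "negative_bits": log2_likelihood_ratio(1 - p_h, 1 - p_not_h),
        "posterior_if_positive": posterior(PRIOR, p_h, p_not_h),
        "posterior_if_negative": posterior(PRIOR, 1 - p_h, 1 - p_not_h),
    }
    print(name, {k: round(v, 4) for k, v in results[name].items()})
```

Both positive results come out at 1 bit and move the assumed prior to the same posterior. The negative results differ sharply: roughly -0.0146 bits for test 1 against roughly -4.6724 bits for test 2.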
What if we take the ratio of conditional odds instead of the ratio of conditional probabilities (as in the likelihood ratio)?
O = P/(1−P)
Log odds ratio E1: log2(O(E1|H)/O(E1|¬H)) = 1.0146 bits
Log odds ratio E2: log2(O(E2|H)/O(E2|¬H)) = 5.6724 bits
So the odds ratios are actually pretty different. Unlike the likelihood ratio, the odds ratio agrees with my argument that E2 is significantly stronger evidence than E1.
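A short Python sketch reproduces these two figures. It also checks an algebraic identity that I am adding for illustration (it is not stated in the argument above): the odds ratio equals the likelihood ratio of the positive result divided by the likelihood ratio of the negative result, so in bits it is the positive log likelihood ratio minus the negative one. The names are my own.

```python
import math

def odds(p):
    """Odds in favor, P / (1 - P)."""
    return p / (1 - p)

def log2_odds_ratio(p_given_h, p_given_not_h):
    """Log of O(E|H) / O(E|not H), in bits."""
    return math.log2(odds(p_given_h) / odds(p_given_not_h))

def log2_likelihood_ratio(p_given_h, p_given_not_h):
    """Log of P(E|H) / P(E|not H), in bits."""
    return math.log2(p_given_h / p_given_not_h)

EVIDENCE = {"E1": (0.02, 0.01), "E2": (0.98, 0.49)}

summary = {}
for name, (p_h, p_not_h) in EVIDENCE.items():
    odds_bits = log2_odds_ratio(p_h, p_not_h)
    positive_bits = log2_likelihood_ratio(p_h, p_not_h)
    negative_bits = log2_likelihood_ratio(1 - p_h, 1 - p_not_h)
    # Identity: log odds ratio = log LR(E) - log LR(not E).
    summary[name] = (odds_bits, positive_bits - negative_bits)
    print(name, round(odds_bits, 4), round(positive_bits - negative_bits, 4))
```

On this reading, the extra 4.67 bits for E2 come entirely from how strongly a negative result on test 2 would count against H.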