So a while ago, I was thinking about whether there was any way to combine probability distributions P and Q by “intersecting” them; I wanted a distribution R which had high probability only if both P and Q had high probability. One idea I came up with was $\operatorname{argmin}_R \, \mathrm{KL}(R\|P) + \mathrm{KL}(R\|Q)$. I didn’t prove anything about it, but looking at some test cases it looked reasonable.
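For discrete distributions, this argmin actually has a closed form: the minimizer is the normalized pointwise geometric mean of P and Q, which is exactly equal-weight logarithmic pooling. Here’s a minimal numerical sanity check (the toy distributions are my own, purely for illustration):

```python
import numpy as np

def kl(a, b):
    """KL divergence KL(a||b) for discrete distributions with full support."""
    return np.sum(a * np.log(a / b))

# Two toy distributions over 4 outcomes.
P = np.array([0.5, 0.3, 0.1, 0.1])
Q = np.array([0.1, 0.2, 0.3, 0.4])

# Closed form: the minimizer of KL(R||P) + KL(R||Q) is the
# normalized pointwise geometric mean of P and Q.
R = np.sqrt(P * Q)
R /= R.sum()

# Sanity check against random points on the simplex.
rng = np.random.default_rng(0)
best = kl(R, P) + kl(R, Q)
for _ in range(10_000):
    S = rng.dirichlet(np.ones(4))
    assert kl(S, P) + kl(S, Q) >= best - 1e-12

print(R)  # mass is high only where both P and Q put mass
```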
I was just about to suggest that your belief aggregation method should be closer to this, but then I realized it was already close in the relevant sense. It’s going to be exciting to see what you have to say about it.
The aggregation method you suggest is called logarithmic pooling. Another way to phrase it is: take the geometric mean of the odds given by the probability distributions (or, equivalently, the arithmetic mean of the log-odds). There’s a natural way to associate every proper scoring rule (for eliciting probability distributions) with an aggregation method, and logarithmic pooling is the aggregation method associated with the log scoring rule (which Scott wrote about in an earlier post). (Here’s a paper I wrote about this connection: https://arxiv.org/pdf/2102.07081.pdf)
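For concreteness, here’s a minimal sketch of logarithmic pooling for discrete distributions (the log_pool function and the toy inputs are my own illustration, not notation from the paper):

```python
import numpy as np

def log_pool(dists, weights=None):
    """Logarithmic pooling of discrete distributions:
    weighted geometric mean, renormalized."""
    dists = np.asarray(dists, dtype=float)
    if weights is None:
        weights = np.full(len(dists), 1.0 / len(dists))
    weights = np.asarray(weights, dtype=float)
    pooled = np.prod(dists ** weights[:, None], axis=0)
    return pooled / pooled.sum()

P = np.array([0.5, 0.3, 0.1, 0.1])
Q = np.array([0.1, 0.2, 0.3, 0.4])
print(log_pool([P, Q]))  # equals normalized sqrt(P * Q) from the sketch above
```

For a binary event this is the same as taking the geometric mean of the two odds ratios and converting back to a probability, which is where the “geometric mean of the odds” phrasing comes from.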
I’m also excited to see where this sequence goes!
Nice, thank you.
Wait no, I’m stupid. What you do corresponds to $\operatorname{argmin}_R \, \mathrm{KL}(P\|R) + \mathrm{KL}(Q\|R)$, which is more like the union of the hypotheses. You’d have to do something like $\operatorname{argmax}_Q \mathbb{G}_{h\sim P}\left[\mathbb{G}_{w\sim Q}[h(w)]\right]$ (with $\mathbb{G}$ denoting geometric expectation) to get the intersection, I think. I should probably think through the math again more carefully when I have more time.
Note that this is just the arithmetic mean of the probability distributions, which is indeed what you want if you believe that P is right with probability 50% and Q is right with probability 50%; and I agree that this is what Scott does.
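A quick numerical check bears this out (same toy distributions as before; my own illustration): the 50/50 mixture minimizes the sum of the reversed KL divergences.

```python
import numpy as np

def kl(a, b):
    """KL divergence KL(a||b) for discrete distributions with full support."""
    return np.sum(a * np.log(a / b))

P = np.array([0.5, 0.3, 0.1, 0.1])
Q = np.array([0.1, 0.2, 0.3, 0.4])

R = 0.5 * (P + Q)  # claimed minimizer of KL(P||R) + KL(Q||R): the 50/50 mixture

rng = np.random.default_rng(0)
best = kl(P, R) + kl(Q, R)
for _ in range(10_000):
    S = rng.dirichlet(np.ones(4))
    assert kl(P, S) + kl(Q, S) >= best - 1e-12
```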
At the same time, I wonder—is there some sort of frame on the problem that makes logarithmic pooling sensible? Perhaps (inspired by the earlier post on Nash bargaining) something like a “bargain” between the two hypotheses, where a hypothesis’s “utility” for an outcome is the probability that the hypothesis assigns to it.
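One very naive way to operationalize that frame (entirely my own reading, not something worked out here): let each hypothesis’s utility for an aggregate R be the expected probability it assigns to an outcome drawn from R, and maximize the Nash product over the simplex. A rough random-search sketch:

```python
import numpy as np

P = np.array([0.5, 0.3, 0.1, 0.1])
Q = np.array([0.1, 0.2, 0.3, 0.4])

# Search the simplex for the R maximizing the Nash product of the two
# hypotheses' expected "utilities" (disagreement point taken to be zero).
rng = np.random.default_rng(0)
best_R, best_val = None, -np.inf
for _ in range(100_000):
    R = rng.dirichlet(np.ones(4))
    val = (R @ P) * (R @ Q)  # Nash product of expected utilities
    if val > best_val:
        best_R, best_val = R, val

print(best_R, best_val)
```

On this example the maximizer puts essentially all its mass on just two outcomes, quite unlike logarithmic pooling, so this naive reading is probably not the right formalization of the bargain.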
The place where I came up with it was in thinking about combining models that focus on independent dynamics and might even have different ontologies. For instance, to set environmental policy, you might want to combine climate models with economics models. The intersection expression seemed like a plausible method for that, though I didn’t look into it in detail.