I’d force log odds, as they are the more natural representation and much less susceptible to irrational certainty and nonsense answers.
Someone has to actually try and comprehend what they are doing to troll logits; -INF seems a lot more out to lunch than p = 0.
I’d also like someone to go through the math to figure out how to correctly take the mean of probability estimates. I see no obvious reason why you can simply average probabilities on [0, 1]. The correct method would probably involve cooking up a hypothetical Bayesian judge that takes everyone’s estimates as evidence.
Edit: since logits can be a bit unintuitive, I’d give a few calibration examples like odds of rolling a 6 on a die, odds of winning some lottery, fair odds, odds of surviving a car crash, etc.
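As a rough illustration of what such calibration examples could look like, here is a minimal sketch that converts a few reference probabilities to log odds. The `logit` helper and the use of natural logs are assumptions made for this example, and the lottery figure is a made-up round number rather than any real lottery's odds.

```python
import math


def logit(p: float) -> float:
    """Natural-log odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Reference points a survey could show next to the input box.
# The lottery entry is a hypothetical round number, not real data.
examples = {
    "fair odds (p = 1/2)": 0.5,
    "rolling a 6 on a fair die (p = 1/6)": 1 / 6,
    "hypothetical 1-in-a-million lottery": 1e-6,
}

for label, p in examples.items():
    print(f"{label}: {logit(p):+.2f} nats")
```

Fair odds land at exactly 0, and the die roll at about -1.61, since odds of 1:5 give ln(0.2). Entering p = 0 would correspond to negative infinity, which is the point made above about how much more obviously wrong that looks.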
I’d force log odds, as they are the more natural representation and much less susceptible to irrational certainty and nonsense answers.
Personally, for probabilities roughly between 20% and 80% I find probabilities (or non-log odds) easier to understand than log-odds.
Someone has to actually try and comprehend what they are doing to troll logits; -INF seems a lot more out to lunch than p = 0.
Yeah. One of the reasons I proposed this is the median answer of 0 in several probability questions. (I’d also require a rationale in order to enter probabilities more extreme than 1%/99%.)
I’d also like someone to go through the math to figure out how to correctly take the mean of probability estimates. I see no obvious reason why you can simply average probabilities on [0, 1]. The correct method would probably involve cooking up a hypothetical Bayesian judge that takes everyone’s estimates as evidence.
I’d go with the average of log-odds, but this requires all of them to be finite...
Yeah, that’s what I was getting at with the maximum-entropy judge.
On further thought, I really should look into figuring this out. Maybe I’ll do some work on it and post a discussion post. This could be a great group rationality tool.
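To make the comparison in this exchange concrete, here is a minimal sketch of the two candidates: the plain mean of probabilities and the mean of log-odds mapped back to a probability. Neither is claimed to be the correct Bayesian answer; the function names are assumptions made for this example, and rejecting 0 and 1 outright is just one way of handling the finiteness problem.

```python
import math


def logit(p: float) -> float:
    """Natural-log odds of p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Map log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def mean_probability(estimates):
    """Plain arithmetic mean of the probabilities."""
    return sum(estimates) / len(estimates)


def mean_log_odds(estimates):
    """Average the log odds, then convert the result back to a probability.

    Any estimate of exactly 0 or 1 has infinite log odds, so it is rejected
    here instead of being allowed to dominate the average.
    """
    for p in estimates:
        if not 0.0 < p < 1.0:
            raise ValueError(f"estimate {p} has non-finite log odds")
    avg = sum(logit(p) for p in estimates) / len(estimates)
    return inv_logit(avg)


estimates = [0.5, 0.9, 0.99]
print("mean of probabilities:", mean_probability(estimates))
print("mean of log-odds:     ", mean_log_odds(estimates))
```

Averaging log odds is the same as taking the geometric mean of the odds, so a single very confident answer pulls the result further than it does in the plain mean. That is also why a single 0 or 1 breaks it entirely.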
Weighting, in part, by the calibration questions?
I dunno how you would weight it. I think you’d want to have a maxentropy ‘fair’ judge at least for comparison.
Calibration questions are probably the least controversial way of weighting. Compare to, say, trying to weight using karma.
This might be an interesting thing to develop. A voting system backed by solid Bayesian math could be useful for more than just LW surveys.
It might be interesting to see what results are produced by several weighting approaches.
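One way to try several weighting approaches side by side is a weighted average of log odds. The thread does not settle how calibration results would turn into weights, so the sketch below simply takes the weights as given. The function name, the sample estimates, and the "calibration scores" are all hypothetical values chosen for illustration.

```python
import math


def logit(p: float) -> float:
    """Natural-log odds of p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Map log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def weighted_log_odds_pool(estimates, weights):
    """Weighted mean of log odds, converted back to a probability.

    How the weights are derived (calibration questions, karma, or anything
    else) is left open; this function only combines them.
    """
    if len(estimates) != len(weights):
        raise ValueError("need exactly one weight per estimate")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    pooled = sum(w * logit(p) for p, w in zip(estimates, weights)) / total
    return inv_logit(pooled)


estimates = [0.3, 0.6, 0.9]          # made-up survey answers
equal_weights = [1.0, 1.0, 1.0]      # unweighted baseline for comparison
calibration_scores = [0.9, 0.5, 0.1]  # hypothetical calibration-based weights

print("equal weights:      ", weighted_log_odds_pool(estimates, equal_weights))
print("calibration weights:", weighted_log_odds_pool(estimates, calibration_scores))
```

With equal weights this reduces to the plain average of log odds, which gives a baseline to compare any other weighting scheme against.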