More specifics of how I might implement a probability system like this:
Before you can assign probabilities, you need a “level-1 calibration” badge, achieved via hitting a certain calibration level in a LW/​CFAR game/​app. (You can then get level-2, level-3, etc. calibration badges for even better performance, maybe unlocking other site features like karma-betting markets.)
Everyone’s probability assignments (which are non-anonymous by default) can always be viewed by clicking or hovering on a “collapsed” probability (i.e., one that looks like 𝗣 = foo instead of like an Arbital probability distribution image).
Collapsed probabilities will display as 𝗣 = ? until, e.g., some number of users with at least 2000 karma between them have assigned probabilities to that comment/​post. Some aggregated probability will then be displayed for “foo” in 𝗣 = foo, with better-calibrated users (according to badge count) receiving more weight in the aggregation.
The main purpose of hiding the probabilities behind 𝗣 = ? until enough high-karma users have weighed in is to discourage low-karma users from getting really excited and running around assigning probabilities to every comment on the site. If someone feels like doing that, that’s totally fine (maybe they’ll learn something from the process), but those probabilities shouldn’t be prominently displayed. A lot of comments on the site are things like “I agree!” or “Woah.” It doesn’t really matter if someone decides to waste their time adding silly probability assignments to those, but it does start carrying a cost if the silly assignments are visible and distracting to anyone visiting the page.
(Note that users’ karma totals do not affect the weight their assignments receive in the probability aggregation at all, even though they affect how prominently the probabilities are displayed. On the other hand, it might be fine to weight users more if they have badges for things other than calibration, e.g., a “general knowledge” badge reflecting that you’re unusually good at answering Jeopardy questions or what-have-you.)
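To make the display rule concrete, here is a rough sketch of how the karma gate and the badge-weighted aggregation could fit together. Everything specific in it is my assumption rather than part of the proposal: the names `Assignment` and `display_probability` are made up, the aggregate is a plain weighted average, and each user's weight is simply their calibration badge level. Only the 2000-karma figure and the "karma gates display, badges set weight" split come from the description above.

```python
from dataclasses import dataclass

# From the proposal: combined karma needed before a probability is shown.
KARMA_THRESHOLD = 2000


@dataclass
class Assignment:
    """One user's probability assignment (hypothetical record layout)."""
    user: str
    probability: float  # in [0, 1]
    karma: int          # used only for the display gate, never for weighting
    badge_level: int    # calibration badge level; 0 means no badge yet


def display_probability(assignments, karma_threshold=KARMA_THRESHOLD):
    """Return the collapsed display string for a comment or post."""
    # Only users with at least a level-1 calibration badge may assign.
    eligible = [a for a in assignments if a.badge_level >= 1]

    # Gate: stay hidden until eligible users have enough karma between them.
    if not eligible or sum(a.karma for a in eligible) < karma_threshold:
        return "P = ?"

    # Assumption: weight is linear in badge level; karma plays no role here.
    total_weight = sum(a.badge_level for a in eligible)
    aggregate = sum(a.badge_level * a.probability for a in eligible) / total_weight
    return f"P = {aggregate:.2f}"


votes = [
    Assignment("alice", 0.9, karma=1500, badge_level=3),
    Assignment("bob", 0.5, karma=600, badge_level=1),
]
print(display_probability(votes))      # combined karma 2100, so an aggregate is shown
print(display_probability(votes[1:]))  # only 600 karma, so still hidden
```

A real version would probably want a less naive aggregation (e.g. averaging in log-odds space) and some cap on how much weight higher badge levels can buy, but the separation between the display gate and the weighting would stay the same.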
Is there an existing CFAR/LW calibration app that we consider good? (I know there have been attempts, but I haven’t actually used them myself.)
New thing: https://www.openphilanthropy.org/blog/new-web-app-calibration-training
This one broke for me a few times :/​
I actually also think the UI design and feedback mechanisms are a lot worse, so I would recommend that people still use the old one.