The problem with this view is that there does not seem to be any way to calibrate the scale. What should be the karma of a good post? A bad post? A mediocre one? What does 20 mean? What does 5 mean? Don’t the answers to these questions depend on how many users are voting on the post, and what their voting behavior is? Suppose you and I both hold the view you describe, but I think a good post should have 100 karma and you think a good post should have 300 karma—how should our voting behavior be interpreted? What does it mean, when a post ends up with, say, 75 karma? Do people think it’s good? Bad? Do we know?
This gets very complicated. It seems like the signal is degraded, not improved, by this interpretation.
i.e., the concern that independent voting lets a post that is merely just above the “good enough to strong-upvote” threshold for a lot of users end up with the same karma as a post that is a top-5 all-time favorite for everyone who upvoted it
It seems to me like your perspective results in an improved signal only if everyone who votes has the same opinions on everything.
If people do not have the same opinions, then there will be a distribution across people’s “good enough to strong-upvote” thresholds; a post’s karma will then reflect its position along that distribution. A “top 5 all-time favorite for many people” will be “good enough to strong-upvote” for most people, and will have a high score. A post that is merely “just good enough to upvote” for many people will cross that threshold for fewer voters, i.e. will sit lower along that distribution, and will end up with a lower score. (In other words, a post’s expected karma is the strong-upvote weight × the probability that a given voter’s threshold is cleared, summed across all voters.)
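The expected-value claim in that parenthetical can be sketched with a small simulation (the threshold distribution and vote weights below are hypothetical, chosen only for illustration):

```python
import random

random.seed(0)

STRONG, WEAK = 2, 1  # assumed vote weights

def simulate_karma(quality, n_voters=1000):
    """Karma under heterogeneous 'good enough to strong-upvote' thresholds.

    Each voter draws a personal strong-upvote threshold; they strong-upvote
    if the post clears it, weak-upvote if it clears a lower bar, else abstain.
    """
    karma = 0
    for _ in range(n_voters):
        strong_threshold = random.gauss(0.7, 0.15)  # assumed distribution
        weak_threshold = strong_threshold - 0.3
        if quality >= strong_threshold:
            karma += STRONG
        elif quality >= weak_threshold:
            karma += WEAK
    return karma

# A "just good enough to upvote for many people" post vs. a
# "top-5 all-time favorite" post: karma tracks each post's position
# along the threshold distribution, so the latter scores higher.
print(simulate_karma(0.5), simulate_karma(0.95))
```

Under this toy model, karma is not a vote on what the total “should be”; it is an aggregate readout of where the post falls in the population’s threshold distribution, which is the sense in which diverse opinions preserve the signal.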
If everyone has the same opinion, then this will simply result in either everyone strong-upvoting it or no one strong-upvoting it—and in that case, my earlier concern about differently calibrated scales also does not apply.
So, your interpretation seems optimal if adopted by a user population with extremely homogeneous opinions. It is strongly sub-optimal, however, if adopted by a user population with a diverse range of opinions; in that scenario, the “votes independently indicate one’s own evaluation” interpretation is optimal.