The Karma system is better than nothing, and also better than even simpler systems such as Facebook’s “like” system, but the main problem is that it is too simple.
Presumably the Karma system is supposed to at least do two things:
1) Influence posters’ behaviour (e.g. if you get downvoted when writing in a certain way you’re likely to change)
2) Inform readers which posts and comments to read
However, it does not perform these tasks very efficiently, the reason being that it is so very unclear what we are voting on. People apply wildly different criteria. For instance, I would guess that some have a much lower threshold for throwing a downvote than others. Also, some primarily reward people who write posts containing objective information (as pointed out above), whereas others also reward other sorts of posts.
As someone pointed out somewhere, there is also a bandwagon effect when it comes to voting, so that posts/comments with upvotes/downvotes are more likely to continue to be upvoted/downvoted. This means that a certain post which a lot of people would actually find interesting can get downvoted because of bad luck: the first voter uses non-standard criteria and his vote then influences subsequent voters.
All this means that both posters and readers can’t know exactly why it is that a certain post has got a certain amount of Karma. As a result, the present Karma system does not fulfil either task 1) or task 2) adequately. If you don’t know why a certain post got a certain amount of Karma, how can you know how to change your writing, and how can you decide whether to read it or not?
Of course, the comments give both readers and posters a better picture of what people think of the post, but saying this is a bit beside the point. If it doesn’t matter that the Karma system is less than satisfactory because you can read the comments, then why have the Karma system at all?
The main advantage of the present Karma system is its simplicity. It could be argued that a more complex system would be too complicated for people to comprehend, etc. That is perhaps an argument that would be viable at Reddit and similar sites, but surely a site claiming to be “rationalist” should be able to assume that its members can handle more complex systems.
Exactly how such a system is to be devised is an important question which should be discussed (suggestions are welcome) but I’ll stop here for now.
If everyone had identical criteria for voting, we would see all postings having either large positive karma, karma near zero, or large negative karma. The more alike people are in their judgements, the less information the total score provides. It is because people vary in what they find voteworthy that the whole spectrum of scores is meaningful.
As someone pointed out somewhere, there is also a bandwagon effect when it comes to voting, so that posts/comments with upvotes/downvotes are more likely to continue to be upvoted/downvoted.
If many people with different criteria all like a post, chances are that the next person to read it will like it also. I don’t see a problem.
This means that a certain post which a lot of people would actually find interesting can get downvoted because of bad luck: the first voter uses non-standard criteria and his vote then influences subsequent voters.
I have often noticed the direction of karma on a post reversing after the first few votes. Sometimes I have voted on a post that I would not otherwise have done, just to oppose the trend of its karma when I thought it unmerited.
The main advantage of the present Karma system is its simplicity.
Yes! One click! A more complicated system would not be too complicated to use, but too complicated to be worth using. On Ebay, I’m happy to give feedback as positive/neutral/negative plus a few words of boilerplate, but I never use their 5-star scales for quality of packaging, promptness of delivery, etc. How do I rate a cardboard box out of 5?
In short, I think the karma system is excellent and sets a high bar for being improved on.
If everyone had identical criteria for voting, we would see all postings having either large positive karma, karma near zero, or large negative karma. The more alike people are in their judgements, the less information the total score provides.
If you can only give one upvote, one downvote, or no vote at all, that seems to follow. If you could instead give, say, 1-5 positive or negative Karma, we would see a greater variety of scores.
Also, note that many posts and especially comments have very few votes. This means that the votes actually cast will often not be typical of the whole population of possible voters in a system where people’s votes vary considerably. In a system where people’s votes are more alike, this obviously happens less frequently.
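The small-sample point can be made concrete with a little arithmetic. The sketch below assumes a simplified model that is not in the thread: each voter independently upvotes with probability `p_up` and otherwise downvotes. The function name is made up for illustration. It computes the chance that a post which most readers would upvote still ends up with a negative score after only a few votes.

```python
from math import comb

def prob_negative_score(p_up: float, n_votes: int) -> float:
    """Probability that downvotes outnumber upvotes among n_votes independent votes.

    Assumed model: every vote is an upvote with probability p_up, else a downvote.
    """
    total = 0.0
    for ups in range(n_votes + 1):
        downs = n_votes - ups
        if downs > ups:
            # Binomial probability of exactly `ups` upvotes out of n_votes.
            total += comb(n_votes, ups) * p_up**ups * (1 - p_up)**downs
    return total

# A post that 70% of readers would upvote, seen by few or many voters.
for n in (1, 3, 5, 25):
    print(n, round(prob_negative_score(0.7, n), 3))

# The same small sample when voters are much more alike (95% would upvote).
print(round(prob_negative_score(0.95, 3), 3))
```

Under this model the misleading negative score becomes rarer both as the number of votes grows and as voters become more alike, which is the contrast drawn above.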
Yes! One click! A more complicated system would not be too complicated to use, but too complicated to be worth using. On Ebay, I’m happy to give feedback as positive/neutral/negative plus a few words of boilerplate, but I never use their 5-star scales for quality of packaging, promptness of delivery, etc. How do I rate a cardboard box out of 5?
I agree that one shouldn’t have to rate, e.g., comments on, say, five different criteria. The system could be somewhat more complex to comprehend, but you’re right that it shouldn’t be significantly more complex to use.
I think one obvious improvement is, though, to separate the posts into different categories which are to be assessed on different criteria. You could have one “objective information/literature review” section, one “opinion piece/discussion” section, one “meetup” section, and possibly a few more. In each section, you’d be rated on different criteria. That way, original pieces wouldn’t be downvoted because they’re not literature reviews, which seems to be Gunnar’s (justifiable) complaint.
This system would be superior to the present, and no more complicated. I think further improvements are also possible, but those should be separately discussed.
It is a property. It means some aggregation. But that is inevitable given a single bit.
In short, I think the karma system is excellent and sets a high bar for being improved on.
It is excellent compared to no rating or only single-direction voting. But it is quite inferior to e.g. the Slashdot system. Even a single-click system that provides different buttons for different types of posts would be better.
Thanks, that’s very interesting. I was especially interested in this:
We can gauge each Superuser’s voting accuracy based on their performance on honeypots (proposed updates with known answers which are deliberately inserted into the updates queue). Measuring performance and using these probabilities correctly is the key to how we assign points to a Superuser’s vote.
So they measure voting accuracy based on some questions on which they know the true answer.
There is a difference between their votes and the kind of votes cast here, though; namely that on Less Wrong there is not in a strict sense a “true answer” to how good a post or comment is. So that tactic cannot be used.
On questions on which there is a true answer it is easier to track people’s reliability and provide them with incentives to answer reliably. On questions which are more an issue of preference (e.g. “how good is this post?”) that is harder.
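For questions that do have a true answer, here is a minimal sketch of weighting votes by accuracy measured on honeypots. It is an illustration under assumptions of my own, not the method from the linked article: each voter is assumed to be right with a fixed probability, independently of the others, and the function names and the add-one smoothing are hypothetical choices.

```python
def estimate_accuracy(honeypot_results: list[bool]) -> float:
    """Share of honeypots answered correctly, with add-one smoothing.

    The smoothing keeps the estimate strictly between 0 and 1.
    """
    correct = sum(honeypot_results)
    return (correct + 1) / (len(honeypot_results) + 2)

def posterior_true(prior: float, votes: list[tuple[bool, float]]) -> float:
    """Posterior probability that a proposed update is correct (Bayes' rule).

    votes: (approved, accuracy) pairs, one per voter.
    prior: probability in (0, 1) that the update is correct before any votes.
    """
    odds = prior / (1 - prior)
    for approved, accuracy in votes:
        # A voter with accuracy a approves a correct update with probability a
        # and an incorrect one with probability 1 - a.
        likelihood_ratio = accuracy / (1 - accuracy)
        odds = odds * likelihood_ratio if approved else odds / likelihood_ratio
    return odds / (1 + odds)

reliable = estimate_accuracy([True] * 9 + [False])   # 9 of 10 honeypots right
careless = estimate_accuracy([True, False] * 5)      # 5 of 10 honeypots right
print(posterior_true(0.5, [(True, reliable), (False, careless), (False, careless)]))
```

In this example the single reliable approval outweighs two rejections from voters who do no better than chance. Nothing comparable is available when the "answer" is a matter of preference, since there is no honeypot to score against.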
That people apply different criteria when voting is a feature.
See also The Mathematics of Gamification—Application of Bayes Rule to Voting.
In a comment some time ago I proposed a voting/rating system (which I now can’t find because “vote” occurs in every hit) which was intended to be intuitive and provide the necessary information. The basic idea is to asynchronously transport human emotion. Translating the emotion to/from a few well-known words is trivial, and if the set of words is sufficiently rich and the aggregation of these ratings (for sorting/filtering) follows some sensible rules, then I think this system should be near optimal.
I’d add independent votes for the dichotomies love/hate, happy/sad, awed/pity, surprised/bored, funny/sick (for comparison you can have a look at the Lojban attitudinals). Using such a system a great insightful post might get voted love+awe, and a rant hate and/or sick. Some unhelpful commonplace gets ‘bored’.
Adding a satisfied/dissatisfied attitudinal is problematic because it is prone to depend on the relationship to the poster. One could add an agreement/disagreement vote which votes the relation between both members and which isn’t taken into account when ranking globally but in a personal view.
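One way to picture the proposal is as a small data model. The sketch below is only an illustration: the word pairs are the ones listed above, but the function names and the aggregation rule (positive poles minus negative poles) are assumptions, since the thread only asks that aggregation follow "some sensible rules". Agree/disagree is recorded but left out of the global score, as suggested.

```python
from collections import Counter

# Word pairs from the proposal; a vote may name zero or more poles.
DICHOTOMIES = [("love", "hate"), ("happy", "sad"), ("awed", "pity"),
               ("surprised", "bored"), ("funny", "sick")]
POSITIVE = {pos for pos, _ in DICHOTOMIES}
NEGATIVE = {neg for _, neg in DICHOTOMIES}
PERSONAL = {"agree", "disagree"}  # kept for a personal view only

def tally(votes):
    """Count how often each tag was chosen across all votes on one post."""
    counts = Counter()
    for vote in votes:
        tags = set(vote)
        unknown = tags - POSITIVE - NEGATIVE - PERSONAL
        if unknown:
            raise ValueError(f"unknown tags: {sorted(unknown)}")
        for pos, neg in DICHOTOMIES:
            if pos in tags and neg in tags:
                raise ValueError(f"vote names both {pos} and {neg}")
        counts.update(tags)
    return counts

def global_score(counts):
    """Assumed ranking rule: positive poles minus negative poles.

    agree/disagree is deliberately ignored here.
    """
    return sum(counts[t] for t in POSITIVE) - sum(counts[t] for t in NEGATIVE)

votes = [{"love", "awed"}, {"love", "disagree"}, {"bored"}]
counts = tally(votes)
print(dict(counts), global_score(counts))
```

A vote may be empty or name a single word, so nobody is forced to rate every dimension. The per-tag counts stay available for filtering even though the global score collapses them into one number.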
In a way the usual ‘like’ is an abstracted sum of the positive emotions. Whereas karma here is a sum of all emotions (because it allows downvotes).
Slashdot tries a different approach that uses some objective categories which I can’t translate to simple emotions (‘informative’=curiosity? ‘insightful’=surprise+awe? ‘funny’=surprise+happiness?). But I get little out of these tags and they are more difficult to translate.
ADDED: See Measuring Emotions
Since you mention Slashdot, here’s a little side effect of one of their moderation systems. At one point, they decided that “funny” shouldn’t give posters karma. However, given the per-post karma cap of 5, this can prevent karma-giving moderation while encouraging karma-deleting moderation by people who think the comment overrated, potentially costing the poster tons of karma. As such, moderators unwilling to penalize posters for making jokes largely abandoned the “funny” tag in favor of alternatives.
I suspect that if an agree/disagree moderation option were added, it would likely suffer from a similar problem. E.g., if we treated that tag reasonably and used it to try to separate karma gains/losses from personal agreement/disagreement, people would be tempted to rate a post they especially like as disagree/love/awe.
A more interesting idea, I think, would be to run correlations between your votes and various other bits, such as keywords, author, and other voters, to increase the visibility of posts you like and decrease the visibility of posts you don’t like. This would encourage honest and frequent voting, and diversity. Conversely, it would cause people to overestimate the community’s agreement with them (more than they would by default).
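As a rough sketch of the correlation idea, restricted to the "other voters" signal (keywords and authors are left out), one could weight each other user's vote by how often that user has agreed with you in the past. The function names and the scoring rule are hypothetical, and votes are assumed to be +1 or -1.

```python
def similarity(a: dict[str, int], b: dict[str, int]) -> float:
    """Mean product of two users' votes (+1/-1) on posts both voted on.

    Ranges from -1 (always opposite) to +1 (always the same); 0 if no overlap.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[post] * b[post] for post in shared) / len(shared)

def predicted_interest(me: dict[str, int],
                       others: dict[str, dict[str, int]],
                       post: str) -> float:
    """Sum of other users' votes on `post`, weighted by similarity to `me`."""
    score = 0.0
    for votes in others.values():
        if post in votes:
            score += similarity(me, votes) * votes[post]
    return score

me = {"a": 1, "b": -1}
others = {
    "like_minded": {"a": 1, "b": -1, "c": 1},   # always voted as I did
    "opposite":    {"a": -1, "b": 1, "c": -1},  # always voted against me
}
print(predicted_interest(me, others, "c"))
```

Here a downvote from someone who usually disagrees with you counts in a post's favour, which is also how such a scheme could feed the overestimate of community agreement mentioned above.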
Interesting, and an interesting Slashdot link. I especially like the idea of “moderating the moderators”. You do need to check whether people vote seriously in some way, it seems to me.
The only problem I see is Richard’s concern below that multi-criterial systems, where you actually vote on all criteria, may turn out to be too cumbersome to use.
Depends. It’s too cumbersome if it is as elaborate as this one: Measuring Emotions
I didn’t propose to force a vote on all of them. Only the strongest emotional responses. Maybe none.