Amusing, but I am embarrassed that this is so highly voted (which I am attributing to this being written by luke).
Why shouldn’t it be highly voted? When you’re talking to a random outsider, and want to demonstrate the usefulness of bayesian techniques, using the example of clippy is a funny, and interesting, way to make your point.
As such, this is a valuable contribution for anyone who might, at some point, want to convert someone to bayesian techniques.
Given that it takes very little time to read, this means that its value:time ratio is very good. As it is a discussion post, rather than a main post, this is sufficient justification to upvote it.*
*(with a main post I’d also expect a significant amount of content)
Personally, I think it’s a little disturbing that the post’s karma fell from ~35 to ~25 since you posted this. I would have thought LWers generally put more thought into their votes than that.
(A consistent effect. Comments about karma are powerful. People seem rather malleable.)
This is a really good point, so I upvoted it!
This is a really bad point, so I downvoted it!
(We shall see whose karma-fu is stronger, Molybdenumblue! Bwahaha!)
Edit: Welp, guess that answers that.
Aw, hugs.
Thank you, I appreciate that. Although it’s not like I didn’t anticipate this exact response, I just thought it was funny enough to go through with anyway.
(Honestly I half-expected ArisKatsaris to show up, all don’t tell me how to vote! etc wanky etc.)
I withdrew my vote after jsalvatier made his comment. It made me think more about it, and about the fact that I have a general problem with voting too often for things that are funny instead of things that genuinely help the signal-to-noise ratio. I also saw the extremely high total vote as worrisome. If the vote had at the time been +10 or +15 or so I might not have felt as much of a need to withdraw my vote.
I suspect that similar thought processes occurred with other people.
Yes, this is what I assumed had happened and was commenting on. Maybe I just pay too much attention to karma because I’m green as grass, but I don’t think I’ve ever cast a vote without thinking through what I find valuable about the post and how that compares to its current total. The fact that apparently a lot of people cast what I would call impulse votes is making me reevaluate exactly what it is that ‘karma’ is measuring.
Edit: Oh, I just realized—the anti-kibitzer hides karma scores as well as usernames. Probably there’s a large subset of voters who don’t and can’t take relative totals into account until someone comments on it.
Of course, if someone considers that a good reason to reverse their vote I don’t know why they would be using the anti-kibitzer in the first place.
I still don’t think anyone here should feel good about paying attention to the current total while deciding whether to upvote or downvote. Share evidence, not conclusions. The net karma a comment ends up at should be the result of aggregating our valuations, not a result of, say, whether those who thought it should be at +100 voted before or after those who thought it should be at +2.
Edit: it’s clear to me now that I don’t have a good solution to my perceived problem.
It seems to me that your suggested policy would result in comment-placement effects being even stronger than they are now. What score should a comment end up with if 50 people consider voting on it and they all think it should have a score of +2?
I communicated poorly. I don’t think “should have a score of +2” should enter into the decision to upvote, downvote, or not vote. Instead, I’d rather have voting algorithms which, when implemented individually, have results which can be meaningfully summed. For example, suppose everyone upvotes exactly when they think a comment is in the top 5% of comments in “everyone should read this” ordering and downvotes for the bottom 5%. Then the sum reflects the number of people who read the comment × (the fraction of readers who thought it was in the top 5%, minus the fraction who thought it was in the bottom 5%). That’s something I can understand.
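For anyone who wants to check the arithmetic, here is a minimal sketch of that claim. The function names (`tally`, `predicted_score`) and the sample vote counts are made up for illustration; nothing here reflects how the site actually computes karma.

```python
import math
from collections import Counter

def tally(votes):
    """Return (readers, fraction who upvoted, fraction who downvoted).

    `votes` holds one entry per reader: +1 (upvote), -1 (downvote), 0 (no vote).
    """
    counts = Counter(votes)
    readers = len(votes)
    return readers, counts[1] / readers, counts[-1] / readers

def predicted_score(readers, frac_up, frac_down):
    """Readers times (fraction voting 'top 5%' minus fraction voting 'bottom 5%')."""
    return readers * (frac_up - frac_down)

# Hypothetical comment: 100 readers, 30 upvote, 10 downvote, 60 abstain.
votes = [1] * 30 + [-1] * 10 + [0] * 60
readers, frac_up, frac_down = tally(votes)

# The plain sum of votes matches the readers-times-net-fraction formula.
assert math.isclose(sum(votes), predicted_score(readers, frac_up, frac_down))
```

The point of the sketch is only that, under this rule, the total decomposes into audience size times net approval, so a high score can come from either factor.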
If I think a comment should end up with a score of +2, too bad, I have no direct way of controlling that. The resulting score is a reflection of the community’s votes, not something I try to game by altering my voting decision based on whether the score gets closer to +2.
I mean, do people downvote comments that they would have otherwise not voted on if they think the comment has too many upvotes? If not, why do they decline to upvote when they otherwise would have upvoted? The two look the same from everyone else’s perspective, right?
I’m not saying that your proposed algorithm is wrong—not exactly, anyway. I am pointing out something that I think is a flaw.
Putting the same point a different way:
Consider two comments. One is posted early, and is seen by 50 people. It’s slightly good: good enough that each of those people would, by your algorithm, upvote it, but no better than that. The other is posted late, and is only seen by 10 people, but it’s very, very good. According to your algorithm, the first one would get a score of +50 and the second one would get a score of +10. By the methods currently in use, the first one will get a low score (probably +1 or +2) and the second one will still get +10.
The first comment got many more points than the second, by your algorithm, because its author was able to quickly put together something good enough to be upvoteable, and because they were at the right place at the right time to post it early in the conversation, which implies either luck or lots of time spent lurking on LW. I don’t think these are things we want to incentivise—at least not more than we want to incentivise putting time into crafting well-thought-out comments.
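The two-comment example can be sketched roughly as follows. This is a toy model, not a description of anyone's actual voting: it assumes "methods currently in use" means each reader upvotes only while the score is below their personal target and otherwise abstains, and the targets (+2 for the slightly good comment, +20 for the very good one) are invented for illustration.

```python
def independent_rule(n_readers):
    """Every reader who thinks the comment clears the bar upvotes, ignoring the total."""
    return n_readers

def target_aware_rule(targets):
    """Each reader, in order, upvotes only if the current score is below their target."""
    score = 0
    for target in targets:
        if score < target:
            score += 1
    return score

# Early, slightly good comment: 50 readers, each thinks it deserves about +2.
early_independent = independent_rule(50)
early_target_aware = target_aware_rule([2] * 50)

# Late, very good comment: 10 readers, each thinks it deserves about +20 (assumed).
late_independent = independent_rule(10)
late_target_aware = target_aware_rule([20] * 10)

print(early_independent, late_independent)      # scores under the independent rule
print(early_target_aware, late_target_aware)    # scores under the target-aware rule
```

Under these assumptions the independent rule ranks the early comment far above the late one, while the target-aware rule caps the early comment at its target and leaves the late one limited only by how few people saw it.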
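Also:
“… do people downvote comments that they would have otherwise not voted on if they think the comment has too many upvotes?”
I do this. Not very often, but it happens.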
You’re right. Reviewing my feelings on this I discovered that my main “ugh, that’s terrible” feeling comes from the observation that a correlated set of people form a control system that wipes out the contributions of others not in a similar or larger implicit alliance. That doesn’t imply the solution is to vote independently of the total, though, as there are negative side effects like the one you describe.
I often (although not always) will upvote a comment simply if it deserves it. I only very rarely downvote or decline to vote on a comment if I think it is too high but should be positive. Declining to upvote a too-high comment is something I do much more frequently than downvoting a too-high comment. This is a passive rather than active decision. In general, declining to upvote creates fewer negative feelings in me than actively downvoting something which is too high.
I do sometimes upvote comments that have been downvoted if I think they’ve simply been downvoted way too much. That seems for me at least to be the most common form of corrective voting.
I have no idea how representative my behavior is of the general LWian.
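“If I think a comment should end up with a score of +2, too bad, I have no direct way of controlling that. The resulting score is a reflection of the community’s votes, not something I try to game by altering my voting decision based on whether the score gets closer to +2.”
Ok, but that’s your self-handicapping and I want no part of it myself.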
My decision to vote shall be determined by whatever vote I predict has the best consequences.
Surely by whatever vote is recommended by the decision procedure you predict has the best consequences. ;)
No, I meant what I said.
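“I don’t think ‘should have a score of +2’ should enter into the decision to upvote, downvote, or not vote.”
Why not? No, really: what’s wrong with that?
“Instead, I’d rather voting algorithms which, when implemented individually, have results which can be meaningfully summed.”
The current voting algorithms can be meaningfully summed, they’re just complicated, opaque and nonstandardized. I don’t understand why you think “everyone should use my voting algorithm” is a useful thing to say.
“If I think a comment should end up with a score of +2, too bad, I have no direct way of controlling that.”
In what situation would you not, given that it is possible to alter your voting decision based on whether the score gets closer to +2? Do you intend to prevent that somehow?
“do people downvote comments that they would have otherwise not voted on if they think the comment has too many upvotes?”
At least two people do. Why do you ask? (Seriously, I can’t figure out why this is phrased as a rhetorical question.)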
Edit: Okay, here’s the thing: I think it would be more useful if karma was the average of our valuations; i.e. if you could, say, input ‘+10’ or ‘-3’ as shorthand for ‘upvote if below this number, downvote if above’ rather than simply ‘upvote’ and ‘downvote’. What do you imagine the problem with this system would be?
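A rough sketch of what that shorthand would do, assuming voters arrive one at a time and each casts a single ordinary vote according to their stated number. The function names and the example targets are hypothetical, and this is only one way to read the proposal.

```python
def cast(score, target):
    """One voter's action: upvote if the score is below their number, downvote if above."""
    if score < target:
        return 1
    if score > target:
        return -1
    return 0  # score already matches their valuation, so no vote

def run_votes(targets, start=0):
    """Apply each voter's target in arrival order and return the final score."""
    score = start
    for target in targets:
        score += cast(score, target)
    return score

# Three voters enter '+10' and one enters '-3' (made-up numbers).
mixed = run_votes([10, 10, -3, 10])

# Fifty voters who all enter '+2' settle the score at +2 and then stop voting.
agreed = run_votes([2] * 50)

# A voter who enters an extreme number such as '+1000' simply always upvotes.
extreme = run_votes([1000] * 5)
```

Note that each voter still moves the score by at most one point, so the result is not literally the average of the entered numbers; it only gets pulled toward wherever the targets cluster.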
Not exactly a problem, but a lot of my votes would either be +1000 or −1000.
I think that karma is useful feedback but only at a very approximate level. If a post is heavily upvoted it is likely to be of higher quality, and if it is heavily downvoted it is likely to be of lower quality. But this is extremely approximate. The posts I’ve had most upvoted are rarely what I would consider my highest quality remarks. For example, this comment was relevant but I don’t see any reason why it is at +24 other than some sort of bandwagon effect.
Pff, that’s nothing. Two of my highest-karma comments (try not to laugh at the totals; I’m green as grass, remember) are utterly derivative, by virtue of being simple restatements of another person’s point in a slightly funnier way. Namely this and this.
It’s embarrassing, frankly.
Ok. But the real thing is the discrepancy between them. While that comment I made is at +24, this comment is at +2 even though it uses a nearly identical level of sources and analysis about a somewhat similar set of demographic issues.
It isn’t just that some funny comments get voted up a lot. It is that there’s very little general pattern to how far one comment gets up compared to another even when they are very similar comments.
Comments get more upvotes, independent of quality, if they:
Are in a high-traffic thread
Are made while the thread is still new
Get an early complimentary reply
Make a point many people agree with and care about (especially if the first to make that point)
Become the highest-karma comment early on (bandwagon + people may only read/vote on the first few comments, so being the top comment is valuable)
Are closer to top-level (people don’t read deep into threads unless particularly interested)
I think these effects, in aggregate, are probably much stronger determinants of comment karma than actual quality. Top-level posts, to main or discussion, suffer from fewer of these effects, so their karma is a little more reliable. But I hope no one is taking their comment karma too much to heart.
If that’s true, then… what’s the point of karma scores?
How about this: keep track of total votes behind the scenes, but only report whether the karma is [- -] for k<-5, [-] for −4+10.
I don’t think the attribution is right. I am always surprised by what does and doesn’t get upvoted, which means I’m poorly calibrated. Something useful I post to discussion after spending 20 hours on it gets 5 upvotes, and then something useless like this discussion post that took me 60 seconds to post gets 25+ upvotes. :)
My first assumption is that almost everything you post is seen as (at least somewhat) valuable (for almost every post #upvotes > #downvotes), so the net karma you get is mostly based on throughput. More readers, more votes. More votes, more karma.
Second, useful posts do not only take time to write, they take time to read as well. And my guess is that most of us don’t like to vote on thoughtful articles before we have read them. So for funny posts we can quickly make the judgement on how to vote, but for longer posts it takes time.
Decision fatigue may also play a role (after studying something complex the extra decision of whether to vote on it feels like work so we skip it). People may also print more valuable texts, or save them for later, making it easy to forget to vote.
The effect is much more evident on other karma based sites. Snarky one-liners and obvious puns are karma magnets. LessWrong uses the same system and is visited by the same species and therefore suffers from the same problems, just to a lesser extent.
“Decision fatigue may also play a role (after studying something complex the extra decision of whether to vote on it feels like work so we skip it).”
This. Also after reading a more complex thing, it seems common that I’ll forget to think about voting at all, since I’m distracted by thinking about the implications or who I might want to share it with or what other people have to say about it. Sometimes I remember to go back and vote, but I think most of the time I just don’t, whereas with funny things the impulse to focus on the author and give them a reward in response seems to be automatic.
Also, sometimes an apparently well-researched article turns out to be based on only a superficial understanding of the topic (e.g. only having skimmed the abstracts) and mis-represents the cited material, and this is sometimes revealed on “cross-examination” in the comments.
I guess that’s a little better. (also that sounds like poor accuracy rather than poor calibration, but that’s probably just semantics).
On the other hand I suspect you are well calibrated with what gives you respect and reputation. You could say that your poor calibration with respect to karma is karma’s problem! :)
20 hours on a discussion post? That would be a mistake right there!
The 20 hours isn’t for LW karma, obviously. It’s stuff like announcing IntelligenceExplosion.com.