Curious as to why you think that LW2.0 will have a problem with gaming karma when LW1.0 hasn’t had such a problem (unless you count Eugine, and even if you do, we’ve been promised the tools for dealing with Eugines now).
I think this roughly summarizes my perspective on this. Karma seems to work well for a very large range of online forums and applications. We didn’t really have any problems with collusion on LW outside of Eugine, and that was a result of a lack of moderator tools, not a problem with the karma system itself.
I agree that you should never fully delegate your decision-making process to a simple algorithm (that’s what the value-loading problem is all about), but that’s what we have moderators and admins for. If we see suspicious behavior in the voting patterns, we investigate, and if we find someone gaming the system, we punish them. This is how practically all social rules and systems get enforced.
LW1.0’s problem with karma is that karma isn’t measuring anything useful (certainly not quality). How can a distributed voting system decide on quality? Quality is not decided by majority vote.
The biggest problem with karma systems is in people’s heads—people think karma does something other than what it does in reality.
That’s the exact opposite of my experience. Higher-voted comments are consistently more insightful and interesting than low-voted ones.
Obviously not decided by it, but aggregating lots of individual estimates of quality sure can help discover the quality.
This was also my experience (on LW) several years ago, but not recently. On Reddit, I don’t see much difference between highly- and moderately-upvoted comments, only poorly-upvoted comments (in a popular thread) are consistently bad.
I guess we fundamentally disagree. Lots of people with no clue about something aren’t going to magically transform into a method for discerning clue regardless of aggregation method: garbage in, garbage out. For example, aggregating learners in machine learning can work, but it requires strong conditions.
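As a concrete illustration of why those conditions matter, here is a toy simulation (my own sketch, not anything from the thread): with independent voters, majority vote amplifies whatever per-voter accuracy exists, in either direction, which is essentially the Condorcet jury theorem.

```python
# Toy illustration (not from the thread): majority vote over independent voters
# amplifies per-voter accuracy in whichever direction it points. Aggregation
# helps only when individuals beat chance; below chance it makes things worse.
import random

def majority_accuracy(per_voter_accuracy, n_voters=101, n_trials=10_000):
    """Estimate how often a simple majority of independent voters is correct."""
    correct = 0
    for _ in range(n_trials):
        right_votes = sum(random.random() < per_voter_accuracy for _ in range(n_voters))
        if right_votes > n_voters // 2:
            correct += 1
    return correct / n_trials

for p in (0.4, 0.5, 0.6):
    print(f"individual accuracy {p:.1f} -> majority accuracy {majority_accuracy(p):.2f}")
# Roughly: 0.4 -> ~0.02, 0.5 -> ~0.50, 0.6 -> ~0.98
```

The same aggregation rule turns mildly informed voters into a sharp signal and mildly misinformed voters into a sharply wrong one, which is one way to read both sides of this exchange.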
Do you disagree with Kaj that higher-voted comments are consistently more insightful and interesting than low-voted ones?
It sounds like you are making a different point: that no voting system is a substitute for having a smart, well-informed userbase. While that is true, that is also not really the problem that a voting system is trying to solve.
Sure do. On stuff I know a little about, what gets upvoted is “LW folk wisdom” or perhaps “EY’s weird opinions” rather than anything particularly good. That isn’t surprising. Karma, being a numerical aggregate of the crowd, is just spitting back a view of the crowd on a topic. That is what karma does—nothing to do with quality.
What if the view of the crowd is correlated with quality?
Every crowd thinks so.
I think LessWrong might be (or at the very least was once) such a place where this is actually true.
Every crowd thinks it is such a place where this is actually true. Outside view: they are wrong.
Some of the extreme sceptics do not believe they are much closer to the truth than anyone else.
There does not exist a group such that consensus of the group is highly correlated with truth? That’s quite an extraordinary claim you’re making; do you have the appropriate evidence?
I think Ilya is not claiming that no such group exists but that it is well nigh impossible to know that your group is one such. At least where the claim is being made very broadly, as it seems to be upthread. I don’t think it’s unreasonable for experimental physicists to think that their consensus on questions of experimental physics is strongly correlated with truth, for instance, and I bet Ilya doesn’t either.
More specifically, I think the following claim is quite plausible: When a group of people coalesces around some set of controversial ideas (be they political, religious, technological, or whatever), the correlation between group consensus and truth in the area of those controversial ideas may be positive or negative or zero, and members of the group are typically ill-equipped to tell which of these cases they’re in.
LW has the best epistemic hygiene of all the communities I’ve encountered and/or participated in.
In so far as epistemic hygiene is positively correlated with truth, I expect LW consensus to be more positively correlated with truth than most (not all) other internet communities.
Doesn’t LW loudly claim to be special in this respect?
And if it actually is not, doesn’t this represent a massive failure of the entire project?
Talking about LW, specifically. Presumably, groups exist that truth-track, for example experts on their area of expertise. LW isn’t an expert group.
The prior on LW is the same as on any other place on the internet, it’s just a place for folks to gab. If LW were extraordinary, truth-wise, they would be sitting on an enormous pile of utility.
I disagree. Epistemic hygiene is genuinely better on LW, and insofar as epistemic hygiene is positively correlated with truth, I expect LW consensus to be more positively correlated with truth than most (not all) other internet communities.
A group of experts will not necessarily truth-track—there are a lot of counterexamples from gender studies to nutrition.
I would probably say that a group which implements its ideas in practice and is exposed to the consequences is likely to truth-track. That’s not LW, but that’s not most of academia either.
I don’t think LW is perfect; I think LW has the best epistemic hygiene of all communities I’ve encountered and/or participated in.
I think epistemic hygiene is positively correlated with truth.
I think that’s the core of the disagreement: I assume that if the forum is worth reading in the first place, then the average forum user’s opinion of a comment’s quality tends to correlate with my own. In which case something having lots of upvotes is evidence in favor of me also thinking that it will be a good comment.
This assumption does break down if you assume that the other people have “no clue”, but if that’s your opinion of a forum’s users, then why are you reading that forum in the first place?
“Clue” is not a total ordering of people from best to worst, it varies from topic to topic.
The other issue to consider is what you take the purpose of a forum to be.
Consider a subreddit like TheDonald. Presumably they also use karma to reach consensus on what a good comment is. But TheDonald is an echo chamber. If your opinions are highly correlated with the opinions of others in a forum, then naturally you get a number that tells you what everyone agrees is good.
That can be useful, sometimes. But this isn’t quality, it’s just community consensus, and that can be arbitrarily far off. “Less wrong,” as written on the tin, is supposedly about something more objective than just arriving at a community consensus. You need a true signal for that, and karma, being a mirror a community holds up to itself, cannot give it to you. (The toy simulation after this comment sketches the point.)
edit: the form of your question is: “if you don’t like TheDonald, why are you reading TheDonald?” Is that what you want to be saying?
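Here is the toy simulation referred to above (my own sketch, with made-up numbers): if every voter scores a post from a shared community view plus a little independent noise, adding voters averages away the noise but not the shared view, so karma converges on consensus rather than quality.

```python
# Toy model (illustrative only; statistics.correlation needs Python 3.10+):
# every voter judges a post as (shared community view) + (independent noise).
# More voters cancel the noise, so karma converges on the shared view; it tracks
# true quality only as well as that shared view happens to.
import random
import statistics

def simulate_karma(community_view, n_voters=1000, noise=1.0):
    """Net karma per post when each voter upvotes iff their noisy judgment is positive."""
    karma = []
    for view in community_view:
        votes = [view + random.gauss(0, noise) for _ in range(n_voters)]
        karma.append(sum(1 if v > 0 else -1 for v in votes))
    return karma

random.seed(0)
n_posts = 200
true_quality = [random.gauss(0, 1) for _ in range(n_posts)]
# Assume the community's shared view is only weakly tied to true quality.
community_view = [0.3 * q + random.gauss(0, 1) for q in true_quality]

karma = simulate_karma(community_view)
print("corr(karma, community view):", round(statistics.correlation(karma, community_view), 2))
print("corr(karma, true quality):  ", round(statistics.correlation(karma, true_quality), 2))
```

Under these made-up numbers, the first correlation comes out near 1 and the second near 0.3, and adding more voters does not close the gap; that gap is the “mirror a community holds up to itself” point.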
Hopefully this question is not too much of a digression, but has anyone considered using something like Arxiv-Sanity, only for content (blog posts, articles, etc.) produced by the wider rationality community rather than for papers? At least then you would be measuring similarity to things you have already read and liked, things other people have read and liked, or things people are linking to and commenting on, and you could search reasonably well by content and authorship. Ranking things by what people have stored in their library and are planning to take the time to study might contain more information than karma.
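For what it’s worth, here is a minimal sketch of how the content-similarity part of that idea might work (the function names and example texts are mine, not an existing tool): represent each saved and candidate post as a TF-IDF vector and rank candidates by how close they come to anything already in the reader’s library.

```python
# Minimal sketch (hypothetical names, not an existing API): rank candidate posts
# by their TF-IDF cosine similarity to items a reader has already saved,
# rather than by karma. Requires scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_by_library_similarity(library_texts, candidate_texts):
    """Score each candidate by its closest match among the saved library items."""
    vectorizer = TfidfVectorizer(stop_words="english")
    library_vecs = vectorizer.fit_transform(library_texts)   # fit vocabulary on the library
    candidate_vecs = vectorizer.transform(candidate_texts)
    sims = cosine_similarity(candidate_vecs, library_vecs)   # shape: (candidates, library)
    scores = sims.max(axis=1)                                 # nearest saved item per candidate
    order = scores.argsort()[::-1]
    return [(candidate_texts[i], float(scores[i])) for i in order]

library = ["post on calibration and forecasting", "essay on value loading in AI alignment"]
candidates = ["new post about forecasting tournaments", "recipe blog about sourdough starters"]
for text, score in rank_by_library_similarity(library, candidates):
    print(f"{score:.2f}  {text}")
```

Ranking by similarity to a reader’s own saved items is a per-reader signal, which is exactly what a single sitewide karma number cannot be.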
Karma serves as an indicator of the reception that certain content got. High karma means several people liked it. Negative karma means it was very disliked, etc.