If someone’s voting history has 900 downvotes and 100 upvotes...
The important thing would be who received those 900 downvotes. I am not sure about the exact formula, but the first approximation is whether the set of 900 comments downvoted by user X would correlate more with “what other people downvoted” or with “who wrote those comments”. That is, how much the user has high standards vs how much is a personal grudge.
To some degree “what other people downvoted” and “who wrote those comments” correlate with each other, because some people are more likely to write good comments, and some people are more likely to write bad comments. The question would be whether the downvoting patterns of user X correlate with “who wrote that” significantly more strongly than the downvoting patterns of an average user.
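A rough sketch of that first approximation, not an exact formula: for one user, measure what share of their downvotes landed on comments that everyone else also scored negatively, and what share landed on their single most-downvoted author, then compare the second number with the average over other voters. The data layout (`votes` as `(voter, comment_id, value)` tuples, `authors` as a dict) and both function names are assumptions made up for illustration, and "share going to the top author" is only a crude stand-in for a real correlation measure.

```python
from collections import Counter, defaultdict

def downvote_profile(votes, authors, user):
    """Return (consensus_share, top_author_share) for one user's downvotes.

    votes:   iterable of (voter, comment_id, value), value is +1 or -1
    authors: dict mapping comment_id -> author name
    """
    # Net score each comment received from everyone except `user`.
    others_score = defaultdict(int)
    for voter, cid, value in votes:
        if voter != user:
            others_score[cid] += value

    downvoted = [cid for voter, cid, value in votes
                 if voter == user and value < 0]
    if not downvoted:
        return None

    # Share of the user's downvotes that agree with everyone else's verdict.
    consensus_share = sum(others_score[cid] < 0 for cid in downvoted) / len(downvoted)
    # Share of the user's downvotes aimed at their most-downvoted author.
    per_author = Counter(authors[cid] for cid in downvoted)
    top_author_share = per_author.most_common(1)[0][1] / len(downvoted)
    return consensus_share, top_author_share

def average_author_focus(votes, authors, exclude):
    """Mean top_author_share over all voters other than `exclude`."""
    voters = {voter for voter, _, _ in votes if voter != exclude}
    profiles = [downvote_profile(votes, authors, v) for v in voters]
    shares = [p[1] for p in profiles if p is not None]
    return sum(shares) / len(shares) if shares else None

# Toy data: "x" downvotes only bob, against everyone else's upvotes.
authors = {"c1": "bob", "c2": "bob", "c3": "bob",
           "c4": "carol", "c5": "dave", "c6": "erin"}
votes = [
    ("x", "c1", -1), ("x", "c2", -1), ("x", "c3", -1),
    ("a", "c1", 1), ("a", "c2", 1), ("a", "c3", 1),
    ("a", "c4", -1), ("a", "c5", -1), ("a", "c6", -1),
    ("b", "c1", 1), ("b", "c2", 1), ("b", "c4", -1), ("b", "c5", -1),
]
print(downvote_profile(votes, authors, "x"))
print(average_author_focus(votes, authors, exclude="x"))
```

With only a handful of downvotes per user the top-author share is very noisy, so any real use would need a minimum sample size and a proper significance test.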
(Of course, any algorithm, once made public, can be gamed. For example, the detection described above could be avoided by a bot that would (a) upvote every comment that already has karma 3 or more, unless the comment author is on the “target” list; (b) downvote every comment that already has karma −3 or less; and (c) downvote every comment whose author is on the “target” list. The first two parts would make the bot’s profile look similar to the average user’s, provided the detection algorithm ignores the order of votes on each comment.)
the first approximation is whether the set of 900 comments downvoted by user X would correlate more with “what other people downvoted” or with “who wrote those comments”. That is, how much the user has high standards vs how much is a personal grudge.
That doesn’t look like a good approach to me. Correlating with “what other people downvoted” doesn’t mean “high standards” to me; it means “follows the hivemind”.
Imagine a forum populated by representatives of two tribes, Blue and Green, where 90% of the participants are Green and only 10% are Blue. Take Alice, who’s Blue: her votes will not be positively correlated with other people’s votes, because most of those other votes come from Green. You’re thinking about a normative situation where people should vote based on the ill-defined “quality” of a post, but from a descriptive point of view people vote affectively, even on LW.
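To make the Blue/Green point concrete, here is a toy calculation under a deliberately extreme assumption: ten participants (nine Green plus Alice), and everyone votes purely by tribe, upvoting their own side's posts and downvoting the other side's. The post list, `tribal_vote`, and `pearson` are illustrative names, not anything from a real forum.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def tribal_vote(voter_tribe, post_tribe):
    """Assumed voting rule: +1 for your own tribe's posts, -1 otherwise."""
    return 1 if voter_tribe == post_tribe else -1

# Four posts, identified only by the tribe of their author.
post_tribes = ["green", "green", "blue", "blue"]

# Alice is the lone Blue; the other nine participants are Green.
alice_votes = [tribal_vote("blue", t) for t in post_tribes]
others_net = [9 * tribal_vote("green", t) for t in post_tribes]

r = pearson(alice_votes, others_net)
print(alice_votes, others_net, r)
```

Under this rule Alice disagrees with the majority on every post, so a detector that treats agreement with other voters as “high standards” would score her as badly as possible even though she targets no one in particular.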
I think what you want is fairly easy to define without correlations. You are looking for a voting pattern that:
Stems from a single account (or a small number of them)
Is targeted at a single account (or a small number of them)
Has a large number of negative votes in a short period of time
Targets old posts, often in a particular sequence that matches the way software displays comments
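The checklist above could be sketched roughly as follows. Everything here is an assumption for illustration: the `Vote` record, the `suspicious_pairs` function, and especially the thresholds (20 downvotes within an hour, at least half of a voter's downvotes on one author, comments at least 30 days old), which are guesses rather than tuned values. The "sequence matches display order" criterion is left out because it depends on how a particular site lists comments.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Vote:
    voter: str
    author: str       # author of the comment that was voted on
    value: int        # +1 or -1
    cast_at: float    # when the vote was cast, in seconds
    posted_at: float  # when the comment was posted, in seconds

def suspicious_pairs(votes, min_burst=20, window=3600.0,
                     min_target_share=0.5, min_age=30 * 86400.0):
    """Return the set of (voter, author) pairs that match the pattern."""
    # Single source: group downvotes by the account that cast them.
    by_voter = defaultdict(list)
    for v in votes:
        if v.value < 0:
            by_voter[v.voter].append(v)

    flagged = set()
    for voter, downs in by_voter.items():
        by_author = defaultdict(list)
        for v in downs:
            by_author[v.author].append(v)
        for author, hits in by_author.items():
            # Single target: most of this voter's downvotes hit one author.
            if len(hits) / len(downs) < min_target_share:
                continue
            # Old posts only, ordered by the time the vote was cast.
            old = sorted((v for v in hits if v.cast_at - v.posted_at >= min_age),
                         key=lambda v: v.cast_at)
            # Burst: sliding window holding at least min_burst such votes.
            start = 0
            for end in range(len(old)):
                while old[end].cast_at - old[start].cast_at > window:
                    start += 1
                if end - start + 1 >= min_burst:
                    flagged.add((voter, author))
                    break
    return flagged

DAY = 86400.0
# "x" downvotes 25 old comments by bob within a few minutes.
votes = [Vote("x", "bob", -1, 100 * DAY + 10 * i, i * DAY) for i in range(25)]
# "a" downvotes three different authors once each.
votes += [Vote("a", name, -1, 100 * DAY + i, 99 * DAY)
          for i, name in enumerate(["bob", "carol", "dave"])]
print(suspicious_pairs(votes))
```

Because this looks at who is targeted and how fast, rather than at agreement with other voters, a minority voter like Alice is not flagged just for disagreeing with the majority.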