Without looking at the data, I couldn’t say with certainty what the dominant cause is, but I can say with reasonable confidence that your clustering algorithm, with its built-in assumption of a roughly even divide on both sides of its vectors, is responsible for at least part of it.
The main issue is that you are algorithmically creating the data (the clusters) that you’re drawing inferences from. Your algorithm should be your most likely candidate for -any- anomalies. You definitely shouldn’t get attached to any conclusions, especially if they’re favorable to the group of people you more closely identify with. (It’s my impression that the “open-mindedness” conclusion -is- favorable to the people you identify with, given that you elevate it above the possibility that the opposing side is producing better arguments.)
Suppose people are divided by some arbitrary criterion (e.g., blondes vs. brunettes) and then it turns out that blondes upvote brunettes much more often than vice versa. You could still ask the same question.
Regarding elevation, I simply wanted a short, easy-to-understand title, and it did not occur to me that it would be perceived as prejudicial.
Except in this case you’re grouping on the same behavior you’re measuring. Given that you’re doing statistical analysis on what is essentially traffic-analysis grouped data, I can’t think of a trivial example to compare to. That’s bound to lead to some variable dependency issues.
And I think you did realize that, given your care in not naming names or sides. I’m not attacking you; I’m suggesting you should be cautious in drawing conclusions. You want to measure, so you’re not taking it as a given, which is good skepticism, but you skipped skepticism of your techniques.
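To make the circularity worry concrete, here is a minimal sketch, not the original poster's algorithm (which isn't shown in this thread). It assumes a hypothetical clusterer, `refine_by_swaps`, that keeps the split even and swaps pairs of users whenever that raises the number of in-group votes. The votes are synthetic and uniformly random, so there are no real factions to find; user counts, vote counts, and the seed are arbitrary choices for illustration.

```python
import random

def same_group_votes(votes, group):
    """Count votes whose voter and author carry the same label."""
    return sum(1 for voter, author in votes if group[voter] == group[author])

def refine_by_swaps(votes, group):
    """Hypothetical clusterer: keep the split even, and swap two users
    across groups whenever that raises the number of in-group votes."""
    group = dict(group)  # work on a copy
    users = sorted(group)
    improved = True
    while improved:
        improved = False
        for i, u in enumerate(users):
            for v in users[i + 1:]:
                if group[u] == group[v]:
                    continue
                before = same_group_votes(votes, group)
                group[u], group[v] = group[v], group[u]
                if same_group_votes(votes, group) > before:
                    improved = True
                else:
                    group[u], group[v] = group[v], group[u]  # undo the swap
    return group

rng = random.Random(0)  # arbitrary seed, for repeatability only
n_users, n_votes = 30, 600

# Synthetic votes, uniformly random: no real factions exist in this data.
votes = []
while len(votes) < n_votes:
    voter, author = rng.randrange(n_users), rng.randrange(n_users)
    if voter != author:
        votes.append((voter, author))

# Arbitrary even split (the "blondes vs. brunettes" case).
arbitrary = {u: u % 2 for u in range(n_users)}
# Split derived from the very votes we then measure.
derived = refine_by_swaps(votes, arbitrary)

baseline = same_group_votes(votes, arbitrary) / n_votes
clustered = same_group_votes(votes, derived) / n_votes
print(f"in-group share, arbitrary split:    {baseline:.2f}")
print(f"in-group share, vote-derived split: {clustered:.2f}")
```

By construction the vote-derived split can never show a lower in-group share than the arbitrary split it started from, and on random data it will typically show a higher one. Any in-group preference measured on such clusters is therefore partly a product of the clustering step itself.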
Suppose, for the sake of the argument, that my own data is totally wrong and consider the same question for a purely hypothetical case:
Group A upvotes only its own comments. Group B preferentially upvotes its own comments. Is there a way to tell whether the difference lies in the comment quality or in the character of the group members?
I’d say your hypothetical case is undecidable on multiple levels, starting with how to determine comment quality in the first place, the very definition of which may vary between Group A and Group B.
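One way to see why vote counts alone may not settle this is a toy model, sketched here under a loud assumption: the chance that a member of one group upvotes a comment from the other group is the product of the voter group's `openness` and the author group's `quality`. Both parameter names and all the numbers are hypothetical, chosen only for illustration.

```python
from fractions import Fraction as F

def cross_group_rates(openness, quality):
    """Expected upvote rate from each group toward the other group's comments,
    under the assumed model: rate = voter openness * author quality."""
    return {
        ("A", "B"): openness["A"] * quality["B"],
        ("B", "A"): openness["B"] * quality["A"],
    }

# Story 1: both groups are equally open, but Group B writes worthless comments.
quality_story = cross_group_rates(
    openness={"A": F(1, 2), "B": F(1, 2)},
    quality={"A": F(4, 5), "B": F(0)},
)

# Story 2: both groups write equally good comments, but Group A is closed-minded.
character_story = cross_group_rates(
    openness={"A": F(0), "B": F(1, 2)},
    quality={"A": F(4, 5), "B": F(4, 5)},
)

# Both stories predict identical cross-group voting.
print(quality_story == character_story)
```

In this toy model the two explanations yield exactly the same cross-group upvote rates, so something beyond the vote counts (for instance, an independent measure of comment quality) would be needed to tell them apart.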
If you measure personality with a Big Five personality test, you can see whether the ratings correlate.
I’m sure there will be some correlations, but I would not know what to do with them. Traits like conscientiousness have no obvious connection to my question. Openness to experience is sometimes used as a proxy for open-mindedness, but to me this seems a little far-fetched. Is there a strong reason to believe that an adventurous eater will be more open-minded on political questions?