I would be interested to see the results of running a clustering algorithm on the comment data. It may be that long comments separate into a high-karma group and a low-karma group, and we could then analyze what distinguishes them. If it is possible to extract the features of high-quality posts, then those features can be the goal, instead of just length.
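As a rough illustration of the comparison step, here is a minimal sketch. It is not a clustering algorithm: it simply splits long comments at the median karma and compares average feature values between the two halves. The toy `comments` list, the `features` and `compare_long_comments` helpers, the `min_words` threshold, and the choice of features (word count, links, question marks) are all assumptions made for illustration, not taken from any real data set.

```python
from statistics import mean, median

# Hypothetical toy data: (comment text, karma). Not real site data.
comments = [
    ("I disagree because the cited study at http://example.com shows otherwise", 10),
    ("This is long but it just repeats the post without adding anything", 1),
    ("See http://example.com and http://example.org for two independent replications of this", 8),
    ("Why would anyone think that this is true at all ?", 0),
    ("Could you clarify?", 2),
]


def features(text):
    """Return a few crude, illustrative features of a comment."""
    return {
        "words": len(text.split()),
        "links": text.count("http"),
        "questions": text.count("?"),
    }


def compare_long_comments(comments, min_words=5):
    """Split long comments at the median karma and compare feature means.

    Returns {feature: (mean in high-karma half, mean in low-karma half)},
    or an empty dict if either half would be empty.
    """
    long_ones = [
        (features(text), karma)
        for text, karma in comments
        if len(text.split()) >= min_words
    ]
    if not long_ones:
        return {}

    cutoff = median([karma for _, karma in long_ones])
    high = [f for f, karma in long_ones if karma > cutoff]
    low = [f for f, karma in long_ones if karma <= cutoff]
    if not high or not low:
        return {}

    return {
        name: (mean([f[name] for f in high]), mean([f[name] for f in low]))
        for name in high[0]
    }


if __name__ == "__main__":
    for name, (hi, lo) in compare_long_comments(comments).items():
        print(f"{name}: high-karma mean={hi}, low-karma mean={lo}")
```

A real analysis would need far more data and better features, and could replace the median split with an actual clustering method over the feature vectors.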
I also think it’s dangerous to focus too strongly on karma, because a karma score is only a rough approximation of actual quality. For example, I believe many short comments that only ask for clarification are more important than their karma reflects.