Regarding the ratios of comment types, have you compared that at all to subthreads about other topics, possibly less controversial ones? Without some idea of the usual level for an equivalent LW conversation about a less controversial topic, it is very hard to evaluate this data.
I’m not sure incidentally that I agree with your breakdown of comments. For example, you count the comment that started off the conversation as fitting none of the categories. Even just asking a worthwhile question should be worth something. And since this comment was at +17, even just by removing it we already substantially alter the average score of the 50 nones. The score goes from 2.7 to 2.4. This also illustrates another issue which is that if even a single comment can cause that sort of change then it doesn’t seem like this sort of data is statistically significant. Frankly, after realizing that, I’m not that inclined to check the rest of your data, since that already puts both groups at 2.4 on average.
The fact that it seems like this comment itself would be put into the none category when I’ve made criticisms of the interpretation of evidence suggests that your breakdown isn’t great. (Please forgive the mild amount of self-reference.)
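The 2.7 to 2.4 figure in the comment above can be checked from the numbers it quotes (50 nones, a mean of 2.7, one comment at +17). This is only a rough sketch: the function name is made up for illustration, and the 2.7 is itself a rounded value, so the result is approximate.

```python
def mean_after_removal(mean, n, removed_score):
    """Mean of the remaining n - 1 scores after removing one score.

    Hypothetical helper for illustration; it reconstructs the total
    from the (rounded) mean, so the answer is only approximate.
    """
    total = mean * n
    return (total - removed_score) / (n - 1)


# Figures quoted in the comment: 50 'none' comments, mean 2.7, one at +17.
new_mean = mean_after_removal(mean=2.7, n=50, removed_score=17)
print(round(new_mean, 1))  # 2.4
```

A total of about 135 points drops to about 118 across 49 comments, which is where the 2.4 comes from.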
Regarding the ratios of comment types, have you compared that at all to subthreads about other topics, possibly less controversial ones? Without some idea of the usual level for an equivalent LW conversation about a less controversial topic, it is very hard to evaluate this data.
It would be interesting to see what the patterns would be like in other subthreads. I sampled only the one subthread because I was curious about variation among comments within the single subthread and not variation between subthreads, so I figured one subthread would be enough.
I’m not sure incidentally that I agree with your breakdown of comments.
It’s certainly not perfect! I would have liked to have used a finer and more sensitive breakdown, but it would have become difficult to apply. I tried to invent the simplest breakdown I could think of that wouldn’t need much subjective judgment, and could approximate the types of discussion HughRistik had in mind.
For example, you count the comment that started off the conversation as fitting none of the categories. Even just asking a worthwhile question should be worth something.
That’s true—my list of categories is conservative, so some well-regarded comments that didn’t discuss data, predictions, or heuristics nonetheless didn’t end up in a category. That said, although my category list wasn’t exhaustive, I did still expect about as many comments to fit a category as there were comments that fitted none—I was genuinely surprised to get a 2⁄3 to 1⁄3 split.
This also illustrates another issue which is that if even a single comment can cause that sort of change then it doesn’t seem like this sort of data is statistically significant.
Fair point. The distribution of comment scores in that subthread is very skewed, with a few outliers.
If I drop the four high scorers on the far tail I can recalculate the averages for the ‘nones’ versus the non-‘none’ comments without the influence of those outliers. The 47 remaining nones’ scores have mean 2.0 and the 23 remaining non-nones have a mean score of 1.8; the gap shrinks, but it’s still there.
If I did a statistical test of the difference, it likely would be statistically insignificant (and it’d likely have been insignificant even before dropping the outliers) - but that’s OK, because I don’t mean to generalize from that one subthread’s comments to the population of all comments.
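For anyone who wants to run such a test on the actual scores, here is a minimal sketch of one option: a two-sided permutation test on the difference in means. The reply does not say which test it has in mind; a permutation test is chosen here only because it makes no normality assumption, which suits skewed scores. The function name, parameters, and defaults are all assumptions for illustration, and no real comment scores are included.

```python
import random


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Approximate two-sided permutation p-value for a difference in means.

    Hypothetical helper: a and b are lists of comment scores for the two
    groups (e.g. 'nones' and non-'nones').
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Reassign group labels at random, keeping the group sizes fixed.
        rng.shuffle(pooled)
        perm_a = pooled[:len(a)]
        perm_b = pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)
```

Calling `permutation_test(none_scores, other_scores)` with the two lists of scores returns a p-value; a large value would match the expectation above that the gap is not statistically significant.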
The fact that it seems like this comment itself would be put into the none category when I’ve made criticisms of the interpretation of evidence suggests that your breakdown isn’t great.
Yes—if I planned to apply the breakdown to other subthreads, I’d add a category for comments that criticize or discuss evidence mentioned by someone else. Fortunately, it shouldn’t make much difference for the particular subthread I picked—I don’t remember any of the comments making detailed criticisms of other people’s evidence.