All discussion post titles, points, and dates as an Excel sheet
You can find it here.
Earlier today I wanted to quantify whether LessWrong has stopped being a well-kept garden, so I wrote a scraper to produce the above dataset. Anyone who wants to do the analysis can.
All data is as of a few minutes ago.
For programmers: You can see the source here. It’s made to run on ScraperWiki, but it will time out after about 3000 articles. At that point, set the initial value of the uri variable to the last URI printed and run it again. Repeating this once more will get you to the end. Have fun.
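For anyone who would rather not read the source, the restart procedure described above can be sketched roughly as follows. This is not the actual scraper: `crawl`, `fetch_page`, and `max_pages` are hypothetical names, and `fetch_page` stands in for whatever downloads and parses one listing page and returns its rows plus the URI of the next page.

```python
def crawl(start_uri, fetch_page, max_pages=None):
    """Walk listing pages from start_uri, yielding (uri, rows) for each page.

    fetch_page is a hypothetical callable: uri -> (rows, next_uri or None).
    max_pages imitates a run being cut short (e.g. by a timeout).
    """
    uri = start_uri
    pages = 0
    while uri is not None:
        if max_pages is not None and pages >= max_pages:
            break
        # Print before fetching, so the last URI printed is always a safe
        # place to restart from, even if the run dies mid-request.
        print(uri)
        rows, next_uri = fetch_page(uri)
        yield uri, rows
        uri = next_uri
        pages += 1
```

To resume, call `crawl` again with `start_uri` set to the last URI printed. The resumed run fetches that page again, so drop duplicate rows when merging the partial results.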
Then you should probably start by quantifying what “being a well kept garden” means.
True. I guess I was being a bit cheeky. LW is no longer being kept at all AFAICT (or is just on maintenance); I just wanted to see if it’s on an upward or downward trajectory. I obviously think there is a problem, and I have a solution to suggest, but I wanted to double check my intuition with the numbers.
Authors might be an interesting field to add; one of the more plausible measures mentioned in the other thread was a drop in posts from specific prolific authors.
Post updated with code, go crazy! Number of comments is another field I’d add if I ran it again.
Before you look at the numbers, what metrics are you going to use to quantify this?
Posts per month and upvotes per month. (I understand score is positive minus negative, but it cancels out.) Potentially comments per month too, but I didn’t fetch that data. Substitute your preferred granularity for month, of course.
For +10 points, post the scraper (but put a throttle in by default).
done
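For reference, a throttle of the kind requested above could look something like this. It is a sketch only, not what the posted scraper does: `throttled`, `min_interval`, and the injectable `clock` and `sleep` parameters are all hypothetical, and the one-second default is an arbitrary choice.

```python
import time


def throttled(fetch, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Wrap fetch so consecutive calls start at least min_interval seconds apart.

    clock and sleep are injectable only to make the wrapper easy to test.
    """
    last_call = [None]  # start time of the previous call, None before the first

    def wrapper(uri):
        if last_call[0] is not None:
            wait = min_interval - (clock() - last_call[0])
            if wait > 0:
                sleep(wait)
        last_call[0] = clock()
        return fetch(uri)

    return wrapper
```

Wrapping the page-download function once, e.g. `fetch = throttled(fetch)`, keeps the delay on by default while leaving the rest of the scraper unchanged.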
I’m very curious about your results.
Well, it’s not being ‘kept’ anymore for one, but I didn’t need analysis for that. I guess the question is if it is flourishing or dying out.