On the subject of “maybe we should tolerate a little bit of Goodharting in the name of encouraging people to post”, the EA Forum allows authors to view readership statistics for their posts. I think this is a cool feature and it would be nice if LessWrong also adopted it.
Writing on LessWrong, I find myself missing the feature for a couple of reasons:
While the Good Heart Project continues, the number of posts being published is clearly higher than average. But is there also a higher-than-average number of readers? Knowing whether I’m getting more or fewer readers than average during the Good Heart Project would definitely influence my behavior (probably more so than the money: the post I wrote in a flurry yesterday was mostly inspired by the fact that I’d secretly wanted to talk about valuing karma for a while, but it felt too taboo/off-topic for an ordinary EA Forum post, and the Good Heart Project created the perfect opportunity). Seeing post analytics might help me assess whether there are more or fewer readers per post than usual.
The feature has helped improve my writing—seeing how many people open a page but only stay for a short while, I was encouraged to write more concise posts and always put summaries at the top, to be more useful for readers.
It’s also been interesting to see which of my posts have “legs” and keep getting revisited later on—this post about EA charity gift cards versus this one analyzing a Peter Thiel essay both got around the same number of upvotes, but the Thiel post still draws a couple of hits every day while attention to the gift-card essay dropped off sharply when it left the front page. I feel like being able to see this helps steer me towards topics that might have more long
Oddly, even though I read LessWrong as often as I read the EA Forum, I have a poorer sense for whether an idea of mine “belongs” on LessWrong—What’s too political for the front page? Is my short story ‘The Toba Supervolcanic Eruption’ too EA to be worth cross-posting here? What topics are too casual (rambling off some thoughts about evolution and psychology) vs too technical (talking about some aerospace engineering stuff)? Seeing post analytics in addition to upvotes might help me get a better sense of this.
As a counterpoint, knowing that the EA forums expose this significantly disincentivizes me, at the very least, from ever looking at or recommending the EA forums.
Any way of tracking these statistics is inaccurate in adversarial scenarios, leaks far too much user information, or both. And there tends to be a certain cat-and-mouse game:
1. Initially there’s something absolutely basic like a hit counter.
2. Someone writes a script that hammers a page from a single IP, to boost the seeming engagement.
3. A set cardinality estimator is added to e.g. filter by only a single hit per IP.
4. Someone writes a script that hammers a page from many IPs, to boost the seeming engagement.
5. The hit counter is modified to e.g. only work if JavaScript is enabled.
6. The script is ported to use a JS interpreter, or to directly poke the backend.
7. The hit counter is modified to e.g. also fingerprint what browser is being used.
8. The script is ported to use headless Chrome or somesuch.
9. The hit counter is modified to e.g. only capture views from logged-in visitors.
10. The script is modified to automatically create accounts and use them.
11. Account creation is modified to include a CAPTCHA or similar.
12. The script is modified to include a tool to bypass CAPTCHAs.[1]
etc.
Note that every one of these back-and-forths a) also drops or distorts data, or otherwise makes life harder, for legitimate users, and b) leaks more and more information about visitors.
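To make the first two rounds of that game concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the class names, the `simulate` helper, and the IP addresses (taken from reserved documentation ranges) are hypothetical, and I use an exact set of IPs where a real system might use a cardinality estimator such as HyperLogLog. It is not how the EA Forum or any actual site counts views.

```python
from collections import Counter


class NaiveHitCounter:
    """Round one: every request counts as a view."""

    def __init__(self):
        self.views = Counter()

    def record(self, page, ip):
        # The IP is ignored entirely; nothing about the visitor is stored.
        self.views[page] += 1

    def count(self, page):
        return self.views[page]


class PerIpHitCounter:
    """Round two: at most one view per (page, IP) pair.

    Uses an exact set for clarity; this is a stand-in for a
    cardinality estimator, not an implementation of one.
    """

    def __init__(self):
        self.seen = {}

    def record(self, page, ip):
        # Deduplication requires remembering which IPs visited which page.
        self.seen.setdefault(page, set()).add(ip)

    def count(self, page):
        return len(self.seen.get(page, ()))


def simulate(counter, requests, page="post"):
    """Feed (page, ip) requests to a counter and return the reported views."""
    for req_page, ip in requests:
        counter.record(req_page, ip)
    return counter.count(page)


# Hypothetical traffic: five honest readers and two kinds of inflation script.
honest = [("post", f"192.0.2.{i}") for i in range(5)]
single_ip_bot = [("post", "203.0.113.7")] * 1000
many_ip_bot = [("post", f"198.51.100.{i}") for i in range(200)]

naive_vs_single = simulate(NaiveHitCounter(), honest + single_ip_bot)
per_ip_vs_single = simulate(PerIpHitCounter(), honest + single_ip_bot)
per_ip_vs_many = simulate(PerIpHitCounter(), honest + many_ip_bot)

print(naive_vs_single)   # naive counter against the single-IP script
print(per_ip_vs_single)  # per-IP counter against the single-IP script
print(per_ip_vs_many)    # per-IP counter against the many-IP script
```

In this toy setup the naive counter is inflated by the single-IP script, the per-IP counter shrugs that script off but is inflated again by the many-IP one, and the per-IP counter now holds a record of which addresses read which post, which the naive counter never did.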
I would not have too much of a problem with readership statistics if the resulting entropy were explicitly calculated, and if the forum precommitted to not making future changes that continue the ratchet; without these, I have serious concerns.
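As a rough illustration of what "explicitly calculating the entropy" could look like, here is a small Python sketch. It treats each collected attribute as revealing `-log2(share)` bits, where `share` is the fraction of visitors with the same value, and adds the bits up under the assumption that attributes are statistically independent (correlated attributes would change the total). The function names and all the numbers are made up for the example; they are not measurements from any real forum.

```python
import math


def surprisal_bits(share):
    """Bits revealed by an attribute value held by a fraction `share` of visitors."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share)


def total_bits(shares):
    """Total bits revealed, ASSUMING the attributes are independent."""
    return sum(surprisal_bits(s) for s in shares)


def bits_to_single_out(population):
    """Bits needed to pick one visitor out of `population` equally likely ones."""
    return math.log2(population)


# Hypothetical shares: the fraction of visitors who match this visitor on each
# attribute the tracker collects (e.g. IP block, browser version, screen size).
hypothetical_shares = [1 / 4, 1 / 8, 1 / 16]

collected = total_bits(hypothetical_shares)
needed = bits_to_single_out(512)  # hypothetical readership of 512 people

print(collected, needed)
print("uniquely identifying" if collected >= needed else "still ambiguous")
```

With these invented numbers the three attributes together reveal as many bits as it takes to single out one reader among 512, which is the kind of figure I would want published before each turn of the ratchet.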
[1] Be it ‘feeding audio CAPTCHAs to a speech-to-text program’ or ‘just use Mechanical Turk’.
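“2. Someone writes a script that hammers a page from a single IP, to boost the seeming engagement.”
In the case of the EA Forum, the readership statistics have no consequences whatsoever. They’re not even publicly viewable. Why would anyone try to artificially inflate them?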
Hmm, well, I guess we could imagine a scenario where someone works at an EA nonprofit, and wants to impress their boss, so they write a blog post, and artificially inflate the readership statistics, and then show their boss a printout of how many people have read their blog post. And then the boss goes to EA Forum mods and says “I want these statistics to be harder to fake”. But then I imagine the EA Forum mods would respond “Why should we spend our time doing that? This is your problem, not ours. You should come up with a less stupid way to judge your underlings.”