At any given time, is there anything especially wrong with using citation count (with each citation weighted by the citing paper’s own weighted citation count) as a rough proxy for “what are the most important papers, and/or best authors, weighted?”
My sense is the thing that’s bad about this is that it creates an easy goodhart metric. I can imagine worlds where it’s already so thoroughly goodharted that it doesn’t signal anything anymore. If that’s the case, can you get around that by grounding it out in some number of trusted authors, and purging obviously fraudulent authors from the system?
I’m asking from the lens of “I’d like to have some kind of barometer for which scientific papers (or, also, LW posts) are the best. And this just… actually seems pretty good, at least if you were only using it as a one-time-check.”
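To make the “weighted citations, grounded out in trusted authors” idea concrete, here is a minimal sketch in Python. It assumes a personalized-PageRank-style scheme, which is one way to read the proposal, not something the question specifies. The function name `trusted_citation_scores`, the `damping` value, and the toy graph are all hypothetical.

```python
def trusted_citation_scores(citations, trusted, damping=0.85, iterations=50):
    """Score papers by citation weight that flows outward from a trusted set.

    citations: dict mapping each paper to the list of papers it cites.
    trusted:   non-empty set of papers we take as ground truth (an assumption).
    """
    if not trusted:
        raise ValueError("need at least one trusted paper")
    # Collect every paper that appears as a citer, a cited paper, or a seed.
    papers = set(citations) | set(trusted)
    for cited in citations.values():
        papers.update(cited)
    # All of the starting weight sits on the trusted papers.
    seed = {p: (1.0 / len(trusted) if p in trusted else 0.0) for p in papers}
    scores = dict(seed)
    for _ in range(iterations):
        # A fixed fraction of weight is re-injected at the trusted seeds.
        new = {p: (1.0 - damping) * seed[p] for p in papers}
        for p in papers:
            cited = citations.get(p, [])
            if cited:
                # A paper passes its weight, split evenly, to what it cites.
                share = damping * scores[p] / len(cited)
                for q in cited:
                    new[q] += share
            else:
                # Papers that cite nothing hand their weight back to the seeds.
                for t in trusted:
                    new[t] += damping * scores[p] / len(trusted)
        scores = new
    return scores


# Toy graph: T is trusted and cites A, A cites B; S1 and S2 only cite each other.
toy = {"T": ["A"], "A": ["B"], "S1": ["S2"], "S2": ["S1"]}
scores = trusted_citation_scores(toy, trusted={"T"})
for paper, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(paper, round(score, 3))
```

In this toy graph the S1/S2 citation ring ends up with a score of zero, because no weight ever reaches it from the trusted set. That covers the “purge obvious fraud” case, but the sketch does nothing about papers that trusted authors cite for bad reasons.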
It depends what you mean by “rough proxy”, and whether you’re applying it to scientific papers (where Goodhart has been out in force for decades, so a one-time check is off the table) or to LessWrong posts (where citation count has never been something people cared about). Most things have zero citations, and this is indeed a negative quality signal. But once you get to stuff that’s cited at all, citation count is mainly determined by the type and SEO of a paper, rather than its quality. Eg this paper. Citations also don’t distinguish building upon something from criticizing it. That’s much worse in the Goodhart arena than in the one-time arena, but it’s still pretty bad in the one-time case.
Nod. “positive vs disagreement citation” is an important angle I wasn’t thinking about.
Important for what? Best for what?
In a given (sub)field, the highest-cited papers tend to be those which introduced or substantially improved on a key idea/result/concept; so they’re important in that sense. If you’re looking for the best introduction though that will often be a textbook, and there might be important caveats or limitations in a later and less-cited paper.
I’ve also had a problem where a few highly cited papers propose $approach, many papers apply or purport to extend it, and then eventually someone does a well-powered study checking whether $approach actually works. Either way that’s an important paper, but such papers tend to be under-cited, either because the results are “obvious” (and usually a small effect) or because the field of $approach studies shrinks considerably.
It’s an extremely goodhartable metric but perhaps the best we have for papers; for authors I tend to ask “does this person have good taste in problems (important+tractable), and are their methods appropriate to the task?”.