The 2018 Long Review (Notes and Current Plans)
I’ve spent much of the past couple years pushing features that help with the early stages of the intellectual-pipeline – things like shortform, and giving authors moderation tools that let them have the sort of conversation they want (which often is higher-context, and assuming a particular paradigm that the author is operating in)
Early stage ideas benefit from a brainstorming, playful, low-filter environment. I think an appropriate metaphor for those parts of LessWrong are “a couple people in a research department chatting about their ideas.”
But longterm incentives and filters matter a lot as well. I’ve focused on the early stages because that’s where the bottleneck seemed to be, but LessWrong is now at a place where I think we should start prioritizing the later stages of the pipeline – something more analogous to publishing papers, and eventually distilling them into textbooks.
So, here’s the current draft of a plan that I’ve been discussing with other LW Team members:
— The Long Review Format —
Many LessWrong posts are more conceptual than empirical, and it’s hard to tell immediately how useful they are. I think they benefit a lot from hindsight. So, once each year, we could reflect as a group about the best posts of the previous year*, and which of them seem to have withstood the test of time as something useful, true, and (possibly) something that should enter the LessWrong longterm canon that people are expected to be familiar with.
Here’s my current best guess for the format:
[note: I currently expect the entire process to be fully public, because it’s not really possible for it to be completely private, and “half public” seems like the worst situation to me]
(1 week) Nomination
Users with 1000+ karma can nominate posts from 2018-or-earlier, describing how they found the post useful over the longterm.
(4 weeks) Review Phase
Authors of nominated posts can opt-out of the rest of the review process if they want.
Posts with 3* nominations are announced as contenders. For a month, people are encouraged to look at them thoughtfully, writing comments (or posts) that discuss:
How has this post been useful?
How does it connect to the broader intellectual landscape?
Is this post epistemically sound?
How could it be improved?
What further work would you like to see people do with the content of this post?
Authors are encouraged to engage with critique: ideally, updating the post in response to feedback, and/or discussing what sort of further work they’d be interested in seeing from others.
(1 Week) Voting
Users with 1000+ karma rank each post on...
1-10 scale for “how important is the content”
1-10 scale for “how epistemically virtuous is this post”
Yes/No/Veto on “should this post be added to LessWrong canon?”
(On the 1-10 scales, 6+ means “I’d be happy to see this included in the ‘Best of 2018’ roundup” and 10 means “this is the best I can imagine”)
“Yes, add this to canon” means that it hits some minimum threshold of epistemic virtue, as well as “this is something I think all LW readers should be at least passingly familiar with, or if they’re not, the burden is on them to read up on it if it comes up in conversation.”
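To make the Yes/No/Veto mechanic concrete, here’s a minimal sketch of how a canon decision might be tallied. The threshold parameters (`min_yes`, `max_vetoes`) are purely hypothetical; the post deliberately leaves the aggregation rules and the number of allowed vetoes to moderator discretion.

```python
def canon_decision(votes, min_yes=10, max_vetoes=0):
    """Tally a list of "yes" / "no" / "veto" votes for one post.

    Thresholds are illustrative placeholders, not a committed policy:
    here, any veto beyond max_vetoes blocks canonization outright, and
    otherwise a post needs at least min_yes "yes" votes.
    """
    yes_count = votes.count("yes")
    veto_count = votes.count("veto")
    if veto_count > max_vetoes:
        return "rejected (vetoed)"
    return "canon" if yes_count >= min_yes else "not canon"
```

For example, twelve “yes” votes with no vetoes would pass under these placeholder thresholds, while a single veto would block the same post.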
Rewards
The votes will all be publicly available. We’ll provide a few different aggregate statistics, including the raw average and probably some attempt at a “karma-weighted average.”
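As a sketch of what those two statistics might look like: the raw average treats every eligible voter equally, while a karma-weighted average counts each vote in proportion to the voter’s karma. The `Vote` structure and the linear weighting below are assumptions for illustration; the post only commits to “some attempt at” a karma weighting.

```python
from dataclasses import dataclass

@dataclass
class Vote:
    voter_karma: int  # karma of the voting user (1000+ required to vote)
    importance: int   # 1-10 score for "how important is the content"

def raw_average(votes):
    """Plain mean of the 1-10 importance scores."""
    return sum(v.importance for v in votes) / len(votes)

def karma_weighted_average(votes):
    """Each vote weighted by the voter's karma, so established users
    count for more. Linear weighting is just one option; something
    like log-karma weighting would dampen the influence of outliers."""
    total_weight = sum(v.voter_karma for v in votes)
    weighted_sum = sum(v.voter_karma * v.importance for v in votes)
    return weighted_sum / total_weight
```

With votes of (1000 karma, score 4) and (3000 karma, score 8), the raw average is 6.0 but the karma-weighted average is 7.0, showing how the weighting shifts results toward high-karma voters.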
The LW moderation team will put together a physical book, and online sequence, of the best posts, as well as the most valuable reviews of each post.
The LW team awards up to* $1000 in prizes to the best reviewers, and $3000 in prizes to the top post authors.
* This depends on whether we get reviews that seem to genuinely improve the epistemic landscape. Prizes for reviewers will be mostly at moderator discretion (plus some inputs like how much karma and engagement the review got).
And importantly:
Next Year
Even if we stuck to the above plan, I’d see it as more of an experiment than the definitive, longterm review mechanism. I expect we’d iterate a lot the following year.
But one thing I’m particularly interested in is how this builds over the longterm. Next year (November 2020), while people would mostly be nominating posts from 2019, there should also be a process for submitting posts for “re-review”: if there’s been something like a replication crisis, or a research direction that seemed promising now seems less so, that’s something we can revisit.
Some major uncertainties
1. How much work will the community be motivated to do here?
The best version of this involves quite a bit of effort from top authors and commenters, who are often busy. I think it gracefully scales down if no one has time for anything other than quick nominations or voting.
...
2. What actually are good standards for LessWrong?
A lot of topics LessWrong focuses on are somewhat pre-paradigmatic. Many posts suggest empirical experiments you might run (and I’m hoping for reviews that explore that question), but in many cases it’s unclear what those experiments would even be, let alone whether running them would be worth the expense.
Many posts are about how to carve up reality, and how to think. How do you judge how well you carve up reality or think? Well, ideally by seeing whether thinking that way turns out to be useful over the longterm. But, that’s a very messy, confounded process that’s hard to get good data on.
I think this will become clearer over longer timescales. One thing I hope comes out of this project is a bunch of people putting serious thought into the question, and hopefully a bit more consensus on it than we currently have.
I’m kind of interested in an outcome here where there’s a bar you
...
3. How to actually decide what goes in the book
I have a lot of uncertainty about how many nominations, reviews and votes we’d get.
I also have a lot of uncertainty about how much disagreement there’ll be about which posts should be included.
So, I’m pretty hesitant to commit in advance to a particular method of aggregation, or to how many vetoes are necessary to prevent a post from making it into the book. I currently lean towards “the whole thing just involves a lot of moderator discretion, but the information is all public, and if there’s a disconnect between the ‘people’s choice awards’ and the ‘moderators’ choice awards’, we can have a conversation about that.”
I feel a lot of unease about the binary “is this good enough to be included in canon?” measure.
I have an intuition that a binary cutoff point tied to prestige leads to one of two equilibria:
1. You choose a very objective metric (p < .05) and then you end up with Goodharting.
2. You choose a much more subjective process. This leads either to the measure becoming more about prestige than actual quality, making the process highly political, as much about who is and isn’t being honored as about the thing it’s trying to measure (Oscars, Nobel Prizes), or to a gradual lowering of standards as edge cases imperceptibly lower the bar over time (grade inflation, 5-star rating systems).
Furthermore, I think a binary system is quite antithetical to how intellectual progress and innovation actually happen, which are much more about a gradual lowering of uncertainty and raising of usefulness than a binary realization, after a year, that a thing is useful.
Fair concerns. A few more thoughts:
First, small/simple update: I think the actual period of time for “canonization” to be on the table should be more like 5 years.
My intent was for canonization to be pretty rare; it’s mostly there to set a new, higher standard that everyone can aspire to, which most LW posts don’t currently meet. (You could make this part of a different process than the yearly review, but it’s fairly costly to get everyone’s attention at once for a project like this, and it makes more sense for each yearly review to include both “what were the best things from the previous year” and even longer-term considerations.)
Why have Canonization?
I do think this is how a lot of progress works. But sooner or later, you have to update the textbooks that you generally expect students to read.
I think the standards for the core LW Library probably aren’t quite at the level of standards for textbooks (among other things, because most posts currently aren’t written with exercises in mind, and otherwise aren’t quite optimized as a comprehensive pedagogical experience).
Journal before Canon?
Originally, I included the possibility of “canonization” in this year’s review round because, longterm, I’d expect it to make the most sense for the review to include both, and because, as mentioned above, part of the point here is to highlight a standard that we mostly haven’t reached yet.
But two things occur to me as I write this out:
1. This particular year, most of the value is in experimentation. This whole process will be pretty new, and I’m not sure it’ll work that well. That makes it perhaps not a good time to include the potential for “updating the textbooks” as part of it.
2. It might be good to require two years for a post to have a shot at getting added to the top shelf in the LW Library, and for posts to first need to have previously been included
I agree that these are both problems, and quite hard. My current sense is that it’s still on net better to have a system like this than not. But I’ll try to spend some time thinking about this more concretely.