Some major uncertainties
1. How much work will the community be motivated to do here?
The best version of this involves quite a bit of effort from top authors and commenters, who are often busy. I think it gracefully scales down if no one has time for anything other than quick nominations or voting.
...
2. What actually are good standards for LessWrong?
A lot of topics LessWrong focuses on are sort of pre-paradigmatic. Many posts suggest empirical experiments you might run (and I’m hoping for reviews that explore that question), but in many cases it’s unclear what those experiments would even be, let alone how expensive they would be to run.
Many posts are about how to carve up reality, and how to think. How do you judge how well you carve up reality or think? Well, ideally by seeing whether thinking that way turns out to be useful over the long term. But that’s a very messy, confounded process that’s hard to get good data on.
I think this will become more clear over longer timescales. One thing I hope to come out of this project is a bunch of people putting serious thought into the question, and hopefully getting a bit more consensus on it than we currently have.
I’m kind of interested in an outcome here where there’s a bar you
...
3. How to actually decide what goes in the book
I have a lot of uncertainty about how many nominations, reviews and votes we’d get.
I also have a lot of uncertainty about how much disagreement there’ll be about which posts.
So, I’m pretty hesitant about committing in advance to a particular method of aggregation, or to how many vetoes are necessary to prevent a post from making it into the book. I’d currently lean towards “the whole thing just involves a lot of moderation discretion, but the information is all public,” and if there’s a disconnect between “the people’s choice awards” and the “moderators’ choice awards”, we can have a conversation about that.
I feel a lot of unease about the sort of binary “Is this good enough to be included in canon” measure.
I have an intuition that making a binary cutoff point tied to prestige leads to one of two equilibria:
1. You choose a very objective metric (p < .05) and then you end up with Goodharting.
2. You choose a much more subjective process, and this leads either to the measure being more about prestige than actual goodness, making the process highly political, as much about who is and who isn’t being honored as about the actual thing it’s trying to measure (Oscars, Nobel Prizes), or to a gradual lowering of standards as edge cases keep lowering the bar imperceptibly over time (grade inflation, 5-star rating systems).
Furthermore, I think a binary system is quite antithetical to how intellectual progress and innovation actually happen, which are much more about a gradual lowering of uncertainty and raising of usefulness, than a binary realization after a year that this thing is useful.
Fair concerns. A few more thoughts:
First, small/simple update: I think the actual period of time for “canonization” to be on the table should be more like 5 years.
My intent was for canonization to be pretty rare; in fact, it is mostly there to set a new, higher standard that everyone can aspire to, which most LW posts don’t currently meet. (You could make this part of a different process than a yearly review, but I think it’s fairly costly to get everyone’s attention at once for a project like this, and it makes more sense to have each yearly review include both “what were the best things from the previous year” and even longer-term considerations.)
Why have Canonization?
I do think a gradual lowering of uncertainty and raising of usefulness is how a lot of progress works. But it’s important that sooner or later, you have to update the textbooks that you generally expect students to read.
I think the standards for the core LW Library probably aren’t quite at the level of standards for textbooks (among other things, because most posts currently aren’t written with exercises in mind, and are otherwise not quite optimized as a comprehensive pedagogical experience).
Journal before Canon?
Originally, I included the possibility of “canonization” in this year’s review round for two reasons: long term, I’d expect it to make most sense for the review to include both, and, as mentioned above, I wanted part of the point here to be highlighting a standard that we mostly haven’t reached yet.
But two things occur to me as I write this out:
1. This particular year, most of the value is in experimentation. This whole process will be pretty new, and I’m not sure it’ll work that well. That perhaps makes it a bad time to also try out “updating the textbooks” as part of it.
2. It might be good to require two years for a post to have a shot at getting added to the top shelf in the LW Library, and for posts to first need to have previously been included
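On the two equilibria (Goodharting on an objective metric, or politics and slipping standards under a subjective process): I agree that these are both problems, and quite hard. My current sense is that it’s still better on net to have a system like this than not. But I’ll try to spend some time thinking about this more concretely.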