While I like the older recommendations, they highlight the problems with the discussion threads on the older (17yo) content. Often we have learned things in the last 17 years and comments that provide relevant updates would ideally be upvoted and be the first thing I read after the article. But the comments are sorted oldest-first by default. I rarely find the oldest comments the most useful. I can change the sort but I think the default sort discourages comments of the type I wish to read.
They are oldest-sorted because posts back then didn’t have threading. So many back-and-forth conversations would be made completely unintelligible by removing the sort order.
I think maybe the right solution here is to have a separate comment section for old comments and for new comments, but it’s kind of ugly and complicated.
That’s a very good point. The idea of changing the sort order for old posts has come up before; I’m glad to be reminded of it, and I think we’ll try that.
You don’t want to reverse-sort, though. While that would prioritize the ‘update’ comments like “this failed to replicate” or “this was the foundation of a whole new ML research area”, it would make a hash of the older comments, which will be nonsensical as you are reading the reply to a reply to a reply … to a reply to the post. (Even if the old posts had tree comments, rather than forcibly being linearized for lack of threading at that time, reverse-sorting the top-level comments would be weird and confusing.)
You could try using an LLM to classify comments (or post + comment) by ‘updateness’, and simply put ‘update’ comments in a separate block of newest-first comments at the start (while sorting the rest in the usual oldest-first way), and that might work. (The separator could be an explicit section header/bold label, or just a horizontal ruler with the difference left implicit.) Include a few dozen examples to few-shot it, drawn from new comments on old posts; I bet a nice cheap LLM like GPT-4o or Claude-3.5-sonnet can handle it without a problem. This is a standard ‘I know it when I see it’ semantic property, the kind that few-shot LLMs work well for.
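To make the layout concrete, here is a minimal sketch of the two-block arrangement, under some assumptions: the `Comment` record, `split_updates`, and `keyword_stub` are all made-up names for illustration, and the classifier is passed in as a plain function so that the few-shot LLM call (not shown here) could be swapped in later. The keyword stub is only a stand-in and is not meant as a real ‘updateness’ classifier.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Tuple


@dataclass
class Comment:
    # Hypothetical minimal comment record; a real forum schema will differ.
    author: str
    posted: datetime
    text: str


def split_updates(
    comments: List[Comment],
    is_update: Callable[[Comment], bool],
) -> Tuple[List[Comment], List[Comment]]:
    """Return (update_block, rest_block).

    update_block: comments flagged as updates, newest first.
    rest_block: all other comments, in the usual oldest-first order.
    """
    updates = [c for c in comments if is_update(c)]
    rest = [c for c in comments if not is_update(c)]
    updates.sort(key=lambda c: c.posted, reverse=True)
    rest.sort(key=lambda c: c.posted)
    return updates, rest


def keyword_stub(comment: Comment) -> bool:
    # Placeholder standing in for the few-shot LLM classifier (assumption).
    return "replicate" in comment.text.lower()
```

The page would then render the first list, a separator, and the second list; the old back-and-forth in the second block keeps its original order.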
A simpler thing might be to say that what Randall identifies is a natural kind: there are ‘old’ comments and there are ‘new’ comments, and they are fundamentally different. A heuristic might be: comments on a post within a year of posting are ‘old’, and comments after that are ‘new’; as before, ‘new’ comments get put in a separate section sorted by newest at the beginning, then ‘old’ comments get sorted by oldest. If that heuristic doesn’t look good, you could look for a cutpoint: the largest temporal gap between two successive top-level comments. Everything after that gap is ‘new’, and everything before it is ‘old’. (Usually, if there is a burst of comments on posting, and then someone revisits it long afterwards to update it, the largest gap will be somewhere in the ‘new’ subsequence, so this will be conservative in creating out-of-order comments and show only the newest.)
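Both time-based heuristics are easy to sketch. The following is only an illustration working on bare timestamps of top-level comments; the function names and the 365-day default are assumptions for the example, not anything the site actually implements. Each function returns the ‘new’ block (newest first) and the ‘old’ block (oldest first).

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_by_age(
    post_time: datetime,
    comment_times: List[datetime],
    cutoff: timedelta = timedelta(days=365),  # "within a year" (assumed value)
) -> Tuple[List[datetime], List[datetime]]:
    """Comments within `cutoff` of the post are 'old'; later ones are 'new'."""
    old = sorted(t for t in comment_times if t - post_time <= cutoff)
    new = sorted((t for t in comment_times if t - post_time > cutoff), reverse=True)
    return new, old


def split_by_largest_gap(
    comment_times: List[datetime],
) -> Tuple[List[datetime], List[datetime]]:
    """Cut at the largest gap between two successive top-level comments."""
    times = sorted(comment_times)
    if len(times) < 2:
        return [], times  # nothing to split
    gaps = [times[i + 1] - times[i] for i in range(len(times) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)  # first largest gap on ties
    old = times[: cut + 1]
    new = times[cut + 1 :][::-1]  # newest first
    return new, old


post = datetime(2008, 1, 1)
times = [
    datetime(2008, 1, 2),
    datetime(2008, 1, 5),
    datetime(2015, 6, 1),
    datetime(2024, 3, 1),
]
new_by_age, old_by_age = split_by_age(post, times)
new_by_gap, old_by_gap = split_by_largest_gap(times)
```

With these example dates, the one-year rule puts both the 2015 and 2024 comments in the ‘new’ block, while the largest-gap rule promotes only the 2024 comment, because the 2015 to 2024 gap is longer than the 2008 to 2015 gap. That is the conservative behavior described above.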
Cheers, yeah, something like this might be the way to go.
I definitely don’t want newest first. The magic (new and upvoted) sort seems to work well. The default (top scoring) sort too. A concrete example is this post on “an especially elegant evpsych experiment”. It’s not the newest, but it is top-scoring.
I think an AI could reasonably convert ancient discussions to light use of threads to preserve most of the conversation flow, where it has value.