Not Rohin (who might disagree with me on what constitutes a “good” case), but I’ve also tried a similar experiment.
Besides the “why does RLHF not work” question, which is pretty tricky, another classic theme is people misciting the ML literature, or confidently citing papers that are outliers in the literature as if they were settled science. If you’re going to back up your claims with citations, it’s very important to get them right!
I’d encourage you to write up a blog post on common mistakes if you can find the time.