If part of the failure is that the post is well-received, why wouldn’t you want people to downvote it now that you pointed it out?
It feels like downvotes-as-I-see-them-in-practice are some combination of “you should feel bad about having written this” and “make worse content less visible”, and I didn’t want the first effect. Idk if that’s the right call though, and idk if that’s how others (especially the author) interpret it.
I also neglected that people can just remove their upvotes without downvoting, which feels less bad (though from the author’s perspective it’s the same, so I think I’m just being inconsistent here).
I also think the average LW user shouldn’t be expected to understand enough RL to see this
Agreed, which is why I focused on the AF karma rather than the LW karma. (I agree with the rest of that paragraph.)
Separately, I think you can explain part of the failure by laziness rather than a lack of understanding of RL. You could read/skim this post and not quite understand what the setting actually is (even though it’s mentioned at the end of the second chapter). Just like I don’t think the average LW user should be expected to understand enough ML to realize that the main point is misleading, I also don’t think they should be expected to read the post carefully enough before upvoting it, especially not if it’s curated or high karma (because that should be an assurance of quality, and at that point it seems fine to upvote purely to signal-boost the post).
Agreed this is likely but still seems pretty bad—this isn’t the first time people would have updated incorrectly had I not made a correction, though this is the most upvoted case. (I perhaps find it more annoying than it really “should” be because of how much shit LW gives academia and peer review.)
I realize your critique was of the AF, not of LW, so I’m not sure how much I’m really disagreeing with you here.
Yeah, I think this would still be a critique of LW, but much less strongly.
But since Evan Hubinger understood the point and upvoted the post anyway, it’s unclear how much you can conclude.
I give it 98% chance that the majority of people who upvoted did not understand the point.
I think it’s worth pointing out that I originally saw this just posted to LW, so it must have been manually promoted to AF by a mod. I partly want to point this out because possibly one of the main errors is people updating too much on promotion as a signal of quality.
It’s trivially correct to update downward on how much promotion de facto signals quality (by however much), but that seems like a bad outcome.
Naively, I would like people to make sure they understand the point at
the curation step
the promotion-to-AF step
maybe at the upvote step if you’re a professional AI safety researcher
And if the conclusion is that the post is meaningful despite possibly being misinterpreted, I would naively want the person in charge to PM the author and ask to put in a clarification before the post is curated/promoted.
I say ‘naively’ because I don’t know anything about how hard it would be to achieve this and I could be genuinely wrong about this being a reasonable thing to want.