Nathan Helm-Burger answers How do you feel about LessWrong these days? [Open feedback thread]

Nathan Helm-Burger 6 Dec 2023 5:22 UTC
21 points
13
I feel a mix of pleased and frustrated. The main draw for me is AI safety discussion. I dislike the feeling of group-think around stuff, and I value the people who speak up against the group-think with contrary views (e.g. TurnTrout), who post high quality technical content, or well-researched and thought-out posts (e.g. Steven Byrnes).
I feel frustrated at things like feeling that people don’t always do a good job of voting comments up based on how valuable/coherent/high-effort the information content is, and then separately voting agree/disagree. I really like this feature, and I wish people gave it more respect. I am pleased that it does as well as it does though.
I like the new emojis and the new dialogues. I’m excited for the site designers to keep trying new (optional) stuff.
The things I’d like more from the site would be if it could split into two: one which was even more in the direction of technical discussion of AI safety, and the other for rationality and philosophy stuff. And then I’d like the technical side to have features like jupyter notebook-based posts for dynamic code demonstrations. And people presenting recent important papers not their own (e.g. from arxiv), for the sake of highlighting/summarizing/sparking-discussion. The weakness of the technical discussion here is, in my opinion, related to the lack of engagement with the wider academic community and empirical evidence.
Ultimately, I don’t think it matters much what we do with the site in the longer term because I think things are about to go hockey stick singularity crazy. That’s the bet I’m making anyway.
- the gears to ascension 6 Dec 2023 5:25 UTC
  18 points
  12
  Parent
  Yeah. The threshold for “okay, you can submit to alignmentforum” is way, way, way too high, and as a result, lesswrong.com is the actual alignmentforum. Attempts to insist otherwise without appropriately intense structural change will be met with lesswrong.com going right on being the alignmentforum.
  - Nathan Helm-Burger 6 Dec 2023 5:44 UTC
    27 points
    10
    Parent
    Ok, slightly off topic, but I just had a wacky notion for how to break-up groupthink as a social phenomenon. You know the cool thing from Audrey Tang’s ideas, Polis? What if we did that, but we found ‘thought groups’ of LessWrong users based on the agreement voting. And then posts/comments which were popular across thought-groups instead of just intensely within a thought group got more weight?
    Niclas Kupper tried a LessWrong Polis to gather our opinions a while back. https://www.lesswrong.com/posts/fXxa35TgNpqruikwg/lesswrong-poll-on-agi
    - Yoav Ravid 6 Dec 2023 6:25 UTC
      12 points
      0
      Parent
      So, something like the community notes algorithm?
      - Nathan Helm-Burger 6 Dec 2023 6:35 UTC
        5 points
        0
        Parent
        https://vitalik.eth.limo/general/2023/08/16/communitynotes.html
        Ah, as a non-Twitter user I hadn’t known about this. Neat.
        Quote
        For any given note, most users have not rated that note, so most entries in the matrix will be zero, but that’s fine. The goal of the algorithm is to create a four-column model of users and notes, assigning each user two stats that we can call “friendliness” and “polarity”, and each note two stats that we can call “helpfulness” and “polarity”. The model is trying to predict the matrix as a function of these values, using the following formula:
        Note that here I am introducing both the terminology used in the Birdwatch paper, and my own terms to provide a less mathematical intuition for what the variables mean
        μ is a “general public mood” parameter that accounts for how high the ratings are that users give in general
        is a user’s “friendliness”: how likely that particular user is to give high ratings
        is a note’s “helpfulness”: how likely that particular note is to get rated highly. Ultimately, this is the variable we care about.
        or is user or note’s “polarity”: its position among the dominant axis of political polarization. In practice, negative polarity roughly means “left-leaning” and positive polarity means “right-leaning”, but note that the axis of polarization is discovered emergently from analyzing users and notes; the concepts of leftism and rightism are in no way hard-coded.
        The algorithm uses a pretty basic machine learning model (standard gradient descent) to find values for these variables that do the best possible job of predicting the matrix values. The helpfulness that a particular note is assigned is the note’s final score. If a note’s helpfulness is at least +0.4, the note gets shown.
        The core clever idea here is that the “polarity” terms absorb the properties of a note that cause it to be liked by some users and not others, and the “helpfulness” term only measures the properties that a note has that caused it to be liked by all. Thus, selecting for helpfulness identifies notes that get cross-tribal approval, and selects against notes that get cheering from one tribe at the expense of disgust from the other tribe.
        Seth Herd 7 Dec 2023 19:38 UTC
        4 points
        0
        Parent
        This is the formalization of the concept “left hand whuffy” from Charlie Stross’s “down and out in the magic kingdom”, 2003. When people who usually disagree with people like you actually agree with you or like what you’ve said, that’s special and deserves attention. I’ve always wanted to see it implemented. I don’t usually tweet but I’ll have to look at this.
        Said Achmiz 7 Dec 2023 23:58 UTC
        3 points
        0
        Parent
        Down and Out in the Magic Kingdom was by Cory Doctorow, not Stross.
        Seth Herd 8 Dec 2023 0:31 UTC
        2 points
        0
        Parent
        Good catch. I’d genuinely misremembered. I lump the two together, but generally far prefer Stross as a storyteller, even though Doctorow’s futurism is also first-rate, in a different dimension. I found the story in Down and Out to be Stross-quality.
        
        That sort of good idea for a social network improvement is definitely signature Doctorow, though.
    - Ebenezer Dukakis 9 Dec 2023 6:58 UTC
      7 points
      2
      Parent
      Another idea is to upweight posts if they’re made by a person in thought group A, but upvoted by people in thought group B.
    - jacobjacob 22 Dec 2023 5:34 UTC
      4 points
      2
      Parent
      Yeah, I’m interested in features in this space!
      Another idea is to implement a similar algorithm to Twitter’s community votes: identify comments that have gotten upvotes by people who usually disagree with each other, and highlight those.
    - Roman Leventov 6 Dec 2023 20:10 UTC
      4 points
      0
      Parent
      This idea is definitely simmering in many people’s heads at the moment :)
    - Nathan Young 20 Dec 2023 18:51 UTC
      3 points
      0
      Parent
      How private are the LessWrong votes?
      
      Would you want to do it overall or blog by blog. Seems pretty doable.
      - Nathan Helm-Burger 22 Dec 2023 2:02 UTC
        3 points
        0
        Parent
        Currently, the information about who voted which way on what things is private to the individual who made the vote in question and the LW admins.
        So if doing this on LW votes, it’d need to be done in cooperation with the LW team.
    - Nathan Helm-Burger 22 Dec 2023 2:08 UTC
      2 points
      0
      Parent
      I’m pasting this here because it’s the sort of thing I’d like to see. I’d like to see where I fall in it, and at least the anonymized position of others. Also, it’d be cool to track how I move over time. Movement over time should be expected unless we fall into the ‘wrong sort of updateless decision theory’ as jokingly described by TurnTrout (and term coined by Wei Dai). https://www.lesswrong.com/posts/j2W3zs7KTZXt2Wzah/how-do-you-feel-about-lesswrong-these-days-open-feedback?commentId=X7iBYqQzvEgsppcTb
      Richard’ comment about thought groups:
      Richard Ngo 4mo15
      Five clusters of alignment researchers
      Very broadly speaking, alignment researchers seem to fall into five different clusters when it comes to thinking about AI risk:
      MIRI cluster. Think that P(doom) is very high, based on intuitions about instrumental convergence, deceptive alignment, etc. Does work that’s very different from mainstream ML. Central members: Eliezer Yudkowsky, Nate Soares.
      Structural risk cluster. Think that doom is more likely than not, but not for the same reasons as the MIRI cluster. Instead, this cluster focuses on systemic risks, multi-agent alignment, selective forces outside gradient descent, etc. Often work that’s fairly continuous with mainstream ML, but willing to be unusually speculative by the standards of the field. Central members: Dan Hendrycks, David Krueger, Andrew Critch.
      Constellation cluster. More optimistic than either of the previous two clusters. Focuses more on risk from power-seeking AI than the structural risk cluster, but does work that is more speculative or conceptually-oriented than mainstream ML. Central members: Paul Christiano, Buck Shlegeris, Holden Karnofsky. (Named after Constellation coworking space.)
      Prosaic cluster. Focuses on empirical ML work and the scaling hypothesis, is typically skeptical of theoretical or conceptual arguments. Short timelines in general. Central members: Dario Amodei, Jan Leike, Ilya Sutskever.
      Mainstream cluster. Alignment researchers who are closest to mainstream ML. Focuses much less on backchaining from specific threat models and more on promoting robustly valuable research. Typically more concerned about misuse than misalignment, although worried about both. Central members: Scott Aaronson, David Bau.
      Remember that any such division will be inherently very lossy, and please try not to overemphasize the differences between the groups, compared with the many things they agree on.
      Depending on how you count alignment researchers, the relative size of each of these clusters might fluctuate, but on a gut level I think I treat all of them as roughly the same size.