Ruby comments on Speaking of Stag Hunts

Ruby 6 Nov 2021 18:35 UTC
42 points
Strong upvote. Thank you for writing this, it articulates the problems better than I had them in my head and enhances my focus. This deserves a longer reply, but I’m not sure if I’ll get to write it today, so I’ll respond with my initial thoughts.
What I really want from LessWrong is to make my own thinking better, moment to moment. To be embedded in a context that evokes clearer thinking, the way being in a library evokes whispers. To be embedded in a context that anti-evokes all those things my brain keeps trying to do, the way being in a church anti-evokes coarse language.
I want this too.
In the big, important conversations, the ones with big stakes, the ones where emotions run high—
I don’t think LessWrong, as a community, does very well in those conversations at all.
Regarding the three threads you list, I, others involved in managing of LessWrong, and leading community figures who’ve spoken to me are all dissatisfied with how those conversations went and believe it calls for changes in LessWrong.
Solutions I am planning or considering:
- Technological solutions (i.e. UI changes). Currently, I think it’s difficult to provide norm-enforcing feedback on comments (you are required to write another comment, which is actually quite costly). One is also torn between signalling agreement/disagreement with statement and approval/disapproval of the reasoning. These issues could be addressed by factoring karma into two axes (approve/disapprove, agree/disagree) and also possible something like “epistemic reacts” where you can easily tag a comment as exemplifying a virtue or vice. I think that would give standard-upholding users (including moderators) a tool to uphold the standards.
  - There’s a major challenge in all of this in that I see any norms you introduce as being additional tools that can be abused to win–just selectively call out your opponents for alleged violations to discredit them. This can maybe worked around, say giving react abilities only to trusted users or something, but it’s non-trivial.
  - Another thing is that new-users are currently on too-even footing with established users. You can make an account and your comments will look the same as a user who’s proven themselves. This could be addressed by marking new users as such (Hacker News does this) or we can create spaces where new users cannot easily participate (more on this in a moment).
  - Not a solution, but a problem to be solved: when it comes to users, high karma can in part indicate highly valuable contributions, but also is just a measure of engagement. Someone with hundreds of low scoring comments can have a much higher score than someone of higher standards with only a few standout posts. This means that karma alone is inadequate to segment out better users from lower standard and quality adhering ones.
- I am interested in creating “Gardens within the Garden”. As you say, counting up, LessWrong does well compared to the GenPop but far from sufficient. I think it would be good to have a place where people can level-up, and a further higher-quality space towards which people can strive to be admitted to. Admission could be granted by moderators (likely) or by passing an adequate test (if we are able to create such a one), I imagine you (Duncan) would actually be quite helpful in designing it.
- I think our new user system is woefully inadequate. The system needs to change so that admission to LessWrong as a commenter and poster is not something that is taken for granted, and that new users are made aware that many (most?) new users will be turned away (once their initial contributions seem low enough quality).
  - Standards need to be made clear to new users, and for that matter, they need to be clarified to everyone. This is hard because to to me at least, picking the right standards is not easy. Picking the wrong standards to enforce could kill LessWrong (which I think would be worse than living with the current students).
  - I think that starting by getting the standards for new users clear (“stopping the bleeding”) we can then begin to extend that onto the existing user base. As a general approach we (the moderators) have a much higher bar for banning long-term users over new users [1].
This is just the cached list that I’m able to retrieve on the spot. There are surely more good things that I’m forgetting or haven’t thought of.
I think it isn’t. I think that a certain kind of person is becoming less prevalent on LessWrong, and a certain other kind of person is becoming more prevalent, and while I have nothing against the other kind, I really thought LessWrong was for the first group.
It is the definitely the case that people who I want on LessWrong are not there because the discussion doesn’t meet their standards. They have told me. I want to address this, although it’s somewhat hard because the people I want tend to be opinionated about standards in ways that conflict, or at least whose intersection would be a norm-enforcement burden that neither moderators or users could tolerate. That said, I think there are improvements in quality that would be universally regarded as good and would shift the culture and userbase in good directions.
In no small part, the duty of the moderation team is to ensure that no LessWronger who’s trying to adhere to the site’s principles is ever alone, when standing their ground against another user (or a mob of users) who isn’t
I would really like this to be true.
Hire a team of well-paid moderators for a three-month high-effort experiment of responding to every bad comment with a fixed version of what a good comment making the same point would have looked like. Flood the site with training data.
If you can find me people capable of being these moderators, I will hire them. I think the number of people who have mastered the standards you propose and are also available is...smaller than I have been able to locate so far.
Timelines for things happening from LW team
Progress is a little slow at the moment. Since the restructuring into Lightcone Infrastructure, I’m the only full-time member of the LessWrong team. I still get help with various tasks from other Lightcone members, and jimrandomh independently does dev work as an open source contributor; however, I’m the only one able to drive large initiatives (like rescuing the site’s norms) forward. Right now the bulk of my focus on hiring [2]. Additionally, I’ve begun doing some work on the new user process, and I hope to begin are the experiments with karma factorization. Those are smaller steps than what’s required, unfortunately.

If you or someone you know is a highly capable software engineer with Rationalist virtue, please contact me. While the community does have many software developers, the number who are skilled enough and willing to live in Berkeley and work on LessWrong is not so high that it’s trivial to hire.

--
[1] In the terminology of Raemon, I believe we have some Integrity Debt in disclosing how many new users we ban (and their content that we remove).
[2] It’s plausible I should drop hiring and just focused on everything in the OP/I mention above, but I consider LessWrong “exposed” right now since I’m neither technically strong enough or productive enough to maintain the site alone, which makes me reliant people outside the team, which is a kind of brittle way for things to be.
What links here?
- Speaking of Stag Hunts by Duncan Sabien (Deactivated) (6 Nov 2021 8:20 UTC; 191 points)
- supposedlyfun 7 Nov 2021 22:28 UTC
  20 points
  Parent
  Regarding the three threads you list, I, others involved in managing of LessWrong, and leading community figures who’ve spoken to me are all dissatisfied with how those conversations went and believe it calls for changes in LessWrong.
  I’m deeply surprised by this. If there is a consensus among the LW managers and community figures, could one of them write a post about it laying out what was dissatisfactory and what changes they feel need to be made, or at least the result they want from the changes? I know you’re a highly conscientious person with too much on zir hands already, so please don’t take this upon yourself.
  - habryka 8 Nov 2021 1:20 UTC
    21 points
    Parent
    I am also surprised by this! I think this sentence is kind of true, and am dissatisfied with the threads, but I don’t feel like my take is particularly well-summarized with the above language, at least in the context of this post (like, I feel like this sentence implies a particular type of agreement with the OP that I don’t think summarizes my current position very well, though I am also not totally confident I disagree with the OP).
    I am in favor of experimenting more with some karma stuff, and have been encouraging people to work on that within the Lightcone team. I think there is lots of stuff we could do better, and definitely comparing us to some ideal that I have in my head, I think things definitely aren’t going remotely as well as I would like them to, but I do feel like the word “dissatisfied” seems kind of wrong. I think there are individual comments that seem bad, but overall I think the conversations have been quite good, and I am mildly positively surprised by how well they have been going.
    - Duncan Sabien (Deactivated) 8 Nov 2021 1:46 UTC
      5 points
      Parent
      (As the author of the OP, I think my position is also consistent with “quite good, and mildly positively surprised.” I think the difference is counting up vs. counting down? I’m curious whether you think quite good when counting down from your personal vision of the ideal LessWrong.)
      - habryka 8 Nov 2021 20:38 UTC
        9 points
        Parent
        When counting down we are all savages dancing to the sun gods in a feeble attempt to change the course of history.
        More seriously though, yeah, definitely when I count down, I see a ton of stuff that could be a lot better. A lot of important comments missing, not enough courage, not enough honesty, not enough vulnerability, not enough taking responsibility for the big picture.
        Ruby 8 Nov 2021 23:52 UTC
        3 points
        Parent
        I did indeed mean “dissatisfied” in a “counting down” sense.
    - Vladimir_Nesov 8 Nov 2021 5:09 UTC
      3 points
      Parent
      The most obvious/annoying issue with karma is false disagreement zero equilibrium controversy tug of war that can’t currently be split into more specific senses of voting to reveal that actually there is a consensus.
      
      This can’t be solved by pre-splitting, it has to act as needed, maybe co-opting the tagging system, with the default tag being “Boostworthy” (but not “Relevant” or anything specific like that), ability to see the tags if you click something, and ability to tag your vote with anything (one tag per voter, so to give a specific tag you have to untag “Boostworthy”, and all tags sum up into the usual karma score that is the only thing that shows by default until you click something). This has to be sufficiently inconvenient to only get used when necessary, but then somehow become convenient enough for everyone to use (for that specific comment).
      
      On the other hand there is Steam that only has approve/disapprove votes and gives vastly more useful quality ratings than most rating aggregators that are even a little bit more nuanced. So any good idea is likely to make things worse. (Though Steam doesn’t have a zero equilibrium problem because the rating is the percentage of approve votes.)
      - Viliam 8 Nov 2021 12:08 UTC
        7 points
        Parent
        Is it more important to see absolute or relative numbers of votes? To me it seems that if there are many votes, the relative numbers are more important: a comment with 45 upvotes and 55 downvotes is not too different from a comment with 55 upvotes and 45 downvotes; but one of them would be displayed as “-10 karma” and the other as “+10 karma” which seems different a lot.
        On the other hand, with few votes, I would prefer to see “+1 karma” rather than “100% consensus” if in fact only 1 person has voted. It would be misleading to make a comment with 1 upvote and 0 downvotes seem more representative of the community consensus than a comment with 99 upvotes and 1 downvote.
        How I perceive the current voting system, is that comments are somewhere on the “good—bad” scale, and the total karma is a result of “how many people think this is good vs bad” multiplied by “how many people saw this comment and bothered to vote”. So, “+50 karma” is not necessarily better than “+10 karma”, maybe just more visible; like a top-level comment made immediately after writing the article, versus an insightful comment made three days later as a reply to a reply to a reply to something.
        But some people seem to have a strong opinion about the magnitude of the result, like “this comment is good, but not +20 good, only +5 good” or “this comment is stupid and deserves to have negative karma, but −15 is too low so I am going to upvote it to balance all those upvotes”—which drives me crazy, because it means that some people’s votes depend on whether they were among the early or late voters (the early voters expressing their honest opinion, the late voters mostly voting the opposite of their honest opinion just because they decided that too much agreement is a bad thing).
        Here is my idea of a very simple visual representation that would reflect both the absolute and relative votes. Calculate three numbers: positive (the number of upvotes), neutral (the magical constant 7), and negative (the number of downvotes), then display a rectangle of fixed width and length, divided proportionally into green (positive), gray (neutral) and red (negative) parts.
        So a comment with 1 upvote would have a mostly gray line with some green on the left, the comment with 2 upvotes would have almost 2× as much green… but the comments with 10 upvotes and 12 upvotes would seem quite similar to each other. A comment with 45 upvotes and 55 downvotes, or vice versa, would have a mostly half-green half-red line, so obviously controversial.
        Yoav Ravid 8 Nov 2021 12:39 UTC
        8 points
        Parent
        But some people seem to have a strong opinion about the magnitude of the result, like “this comment is good, but not +20 good, only +5 good” or “this comment is stupid and deserves to have negative karma, but −15 is too low so I am going to upvote it to balance all those upvotes”—which drives me crazy, because it means that some people’s votes depend on whether they were among the early or late voters (the early voters expressing their honest opinion, the late voters mostly voting the opposite of their honest opinion just because they decided that too much agreement is a bad thing).
        I think this comes from a place of also seeing karma as reward/punishment, and thinking the reward/punishment is enough/too high, or from a place of seeing the score as representing where it should be relative to other comments, or just from trying to correct for underratedness/overratedness.
        I sometimes do this, and think it’s alright with the current voting system, but I think it’s a flaw of the voting system that it creates this dynamic.
        Duncan Sabien (Deactivated) 8 Nov 2021 17:55 UTC
        7 points
        Parent
        the early voters expressing their honest opinion, the late voters mostly voting the opposite of their honest opinion just because they decided that too much agreement is a bad thing
        This is a misconstrual. The late voters are also expressing their honest opinion, it’s just that their honest opinion lies on a policy level rather than a raw stimulus-response level.
        It’s at least as valid (and, I suspect, somewhat more valid) to have preferences of the form “this should be seen as somewhat better than that” than to have preferences of the form “I like this and dislike that.”
        Yoav Ravid 8 Nov 2021 12:44 UTC
        6 points
        Parent
        Here is my idea of a very simple visual representation that would reflect both the absolute and relative votes. Calculate three numbers: positive (the number of upvotes), neutral (the magical constant 7), and negative (the number of downvotes), then display a rectangle of fixed width and length, divided proportionally into green (positive), gray (neutral) and red (negative) parts.
        So a comment with 1 upvote would have a mostly gray line with some green on the left, the comment with 2 upvotes would have almost 2× as much green… but the comments with 10 upvotes and 12 upvotes would seem quite similar to each other. A comment with 45 upvotes and 55 downvotes, or vice versa, would have a mostly half-green half-red line, so obviously controversial.
        This is interesting and I would like to see a demo of it. The upside to this suggestion is that since it’s only a visual change and doesn’t actually change the way karma and voting works, it could get tested and reverted very easily, just needs to be built.
        Viliam 8 Nov 2021 16:17 UTC
        5 points
        Parent
        It could even be displayed to the right from the current karma display, so we could temporarily have both, like this:
        Aspiring Rationalist 5h < 10 > ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░▓
      - Yoav Ravid 8 Nov 2021 5:32 UTC
        5 points
        Parent
        Interesting, this gave me an idea for something a bit different.
        We’ll have a list of good attributes a comment can have (Rigor, Effort, Correctness/Accuracy/Precision, Funny, etc.). By default you would have one attribute (perhaps ‘Relevant’), and users will be able to add whichever attributes they want (perhaps even custom ones). These attributes will be voteable by users (no limit on how many you can vote on), and will show at the top of the comment together with their score (sorted by absolute value). I’m not sure how it would be used to sort comments or give points to users, though.
  - Duncan Sabien (Deactivated) 8 Nov 2021 1:01 UTC
    3 points
    Parent
    (I expect that having written this post + being friendly with much of the team will result in me being a part of some conversations on this in the near future; if there are summaries I can share here and otherwise it would be a long time before word got out, I’ll try to do so.)
- Shamash 7 Nov 2021 18:18 UTC
  13 points
  Parent
  While I am not technically a “New User” in the context of the age of my account, I comment very infrequently, and I’ve never made a forum-level post.
  I would rate my own rationality skills and knowledge at slightly above the average person but below the average active LessWrong member. While I am aware that I possess many habits and biases that reduce the quality of my written content, I have the sincere goal of becoming a better rationalist.
  There are times when I am unsure whether an argument or claim that seems incorrect is flawed or if it is my reasoning that is flawed. In such cases, it seems intuitive to write a critical comment which explicitly states what I perceive to be faulty about that claim or argument and what thought processes have led to this perception. In the case that these criticisms are valid, then the discussion of the subject is improved and those who read the comment will benefit. If the criticisms are not valid, then I may be corrected by a response that points out where my reasoning went wrong, helping me avoid making such errors in the future.
  Amateur rationalists like myself are probably going to make mistakes when it comes to criticism of other people’s written content, even when we strive to follow community guidelines. My concern with your suggestions is that these changes may discourage users like me from creating flawed posts and comments that help us grow as rationalists.
  - Duncan Sabien (Deactivated) 7 Nov 2021 20:11 UTC
    11 points
    Parent
    I think there’s a real danger of that, in practice.
    But I’ve had lots of experience with “my style of moderation/my standards” being actively good for people taking their first steps toward this brand of rationalism; lots of people have explicitly reached out to me to say that e.g. my FB wall allowed them to do just those sorts of first, flawed steps.
    A big part of this is “if the standards are more generally held, then there’s more room for each individual bend-of-the-rules.” I personally can spend more spoons responding positively and cooperatively to [a well-intentioned newcomer who’s still figuring out the norms of the garden] if I’m not also feeling like it’s pretty important for me to go put out fires elsewhere.
    - Duncan Sabien (Deactivated) 7 Nov 2021 20:12 UTC
      5 points
      Parent
      Or in other words, that’s part of what I was clumsily gesturing at with “Cooperate past the first ‘defect’ from your interlocutor.” I should’ve written “first apparent defect.”
- Chris_Leong 7 Nov 2021 6:50 UTC
  6 points
  Parent
  “If you can find me people capable of being these moderators, I will hire them. I think the number of people who have mastered the standards you propose and are also available is...smaller than I have been able to locate so far.”
  
  I think the best way to do this would be to ask people to identify a few such comments and how they would have rewritten the comment.
- MondSemmel 6 Nov 2021 22:39 UTC
  6 points
  Parent
  If I may add something, I wish users occasionally had to explain or defend their karma votes a bit. To give one example that really confuses me, currently the top three comments on this thread are:
  1. a clarification by OP (Duncan) - makes sense
  2. a critical comment which was edited after I criticized it; now my criticism is at ~0 karma, without any comments indicating why. This would all be fine, except the comment generated no other responses, so now I don’t even understand why I was the only one who found the original objectionable, or why others didn’t like my response to it; and I don’t remotely understand the combination of <highly upvoted OP> and <highly upvoted criticism which generates no follow-up discussion>. (Also, after a comment is edited, is there even a way to see the original? Or was my response just doomed to stop making sense once the original was edited?)
  3. another critical comment, which did generate the follow-up discussion I expected
  (EDIT: Have fixed broken links.)
  - supposedlyfun 7 Nov 2021 22:33 UTC
    9 points
    Parent
    (You’ve done good work in this post’s comment section, IMO.)
    I wish users occasionally had to explain or defend their karma votes a bit
    Maybe if a comment were required in order to strongly upvote or strongly downvote? As someone who does those things fairly often, I wouldn’t hate this change. Sitting here imagining a comment I initially wanted to strongly upvote but didn’t because of such a rule, I feel okay about the fact that I was deterred, given this site’s standards.
    Or maybe a 1 in 3 chance that a strong upvote will require a comment.
- hg00 9 Nov 2021 2:59 UTC
  4 points
  Parent
  
  There’s a major challenge in all of this in that I see any norms you introduce as being additional tools that can be abused to win–just selectively call out your opponents for alleged violations to discredit them.
  
  I think this is usually done subconsciously—people are more motivated to find issues with arguments they disagree with.
- Yoav Ravid 6 Nov 2021 20:09 UTC
  3 points
  Parent
  Hire a team of well-paid moderators for a three-month high-effort experiment of responding to every bad comment with a fixed version of what a good comment making the same point would have looked like. Flood the site with training data.
  If you can find me people capable of being these moderators, I will hire them. I think the number of people who have mastered the standards you propose and are also available is...smaller than I have been able to locate so far.
  
  How you would test if someone fits the criteria? Can those people be trained?
  - Duncan Sabien (Deactivated) 6 Nov 2021 20:13 UTC
    4 points
    Parent
    I do in fact claim that I could do some combination of identify and train such people, and I claim that I am justified and credible in believing this about myself based on e.g. my experience helping CFAR train mentors and workshop instructors. I mention this somewhat-arrogant belief because Ruby highlighted me in his comment as someone who might be able to help on related matters.
    - Yoav Ravid 6 Nov 2021 20:23 UTC
      5 points
      Parent
      Sorry, I think my comment came across as casting doubt on that, but my intention was to ask out of genuine curiosity and interest. I wonder how the process/test might look like.
      - Duncan Sabien (Deactivated) 7 Nov 2021 0:45 UTC
        4 points
        Parent
        Double sorry! I didn’t read you as casting doubt, and didn’t have defensive feelings while writing the above.
        I do believe people can be trained, though I have a hard time boiling down either “how” or why I believe it.
        As for how to test, I don’t have a ready answer, but if I develop one I’ll come back and note it here. (Especially if e.g. Ruby asks me for help and I actually deliver.)
        Yoav Ravid 7 Nov 2021 5:13 UTC
        2 points
        Parent
        Alright, thanks :)
        (If anyone else has Ideas I’d be glad to hear them)