Ruby comments on Ruby’s Quick Takes

Ruby Jul 7, 2023, 6:06 PM
9 points
The LessWrong admins are often evaluating whether users (particularly new users) are going to be productive members of the site vs are just really bad and need strong action taken.

A question we’re currently disagreeing on is which pieces of evidence it’s okay to look at in forming judgments. Obviously anything posted publicly. But what about:

- Drafts (admins often have good reason to look at drafts, so they’re there)
- Content the user deleted
- The referring site that sent someone to LessWrong

I’m curious how people feel about moderators looking at those.

Alternatively, we’re not in complete agreement about:
- Should deleted stuff even be that private? It was already public, could already have been copied, archived, etc. So there isn’t that much expectation of privacy so admins should look at it.
- Is it the case that we basically shouldn’t extend the same rights, e.g. privacy, to new users because they haven’t earned them as much, and we need to look at more activity/behavior to assess the new user?
  - There’s some quantitative here where we might sometimes doing this depending on our degree of suspicion. Generally respecting privacy but looking at more things, e.g. drafts, that if we’re on the edge about banning someone.
- We are generally very hesitant to look at votes, but start to do this if we suspect bad voting behavior (e.g. someone possibly indiscriminately downvoting another person). Rate limiting being tied to downvotes perhaps makes this more likely and more of an issue. Just how ready to investigate (including deanonymization) should be if we suspect abuse?
- Raemon Jul 7, 2023, 7:50 PM
  10 points
  Parent
  I want to clarify the draft thing:
  In general LW admins do not look at drafts, except when a user has specifically asked for help debugging something. I indeed care a lot about people feeling like they can write drafts without an admin sneaking a peak.
  The exceptions under discussion are things like “a new user’s first post or comment looks very confused/crackpot-ish, to the point where we might consider banning the user from the site. The user has some other drafts. (I think a central case here is a new user shows up with a crackpot-y looking Theory of Everything. The first post that they’ve posted publicly looks sort of borderline crackpot-y and we’re not sure what call to make. A thing we’ve done sometimes is do a quick skim of their other drafts to see if they’re going in a direction that looks more reassuring or “yeah this person is kinda crazy and we don’t want them around.”)
  I think the new auto-rate-limits somewhat relax the need for this (I feel a bit more confident that crackpots will get downvoted, and then automatically rate limited, instead of something the admins have to monitor and manage). I think I’d have defended the need to have this tool in the past, but it might be sufficiently unnecessary now that we should remove it from our common mod toolset.
  ...
  I also want to emphasize since @Dagon brought it up: we never look at DMs. We do have a flag for “a new user has sent a lot of DMs without posting any content”, but the thing we do there is send the user a message saying approximately “hey, we have observed this metadata, we haven’t read your DMs, but just want to encourage you to be careful about spamming people in DMs”. In cases where we suspect someone is doing flagrant DM spam we might disable their ability to send future DMs until they’ve persuaded us they’re a reasonable real person, but still not actually read the DM.
  - Dagon Jul 8, 2023, 3:58 AM
    2 points
    Parent
    I apologize if I implied that the mods were routinely looking at private data without reason—I do, in fact, trust your intentions very deeply, and I’m sad when my skepticism about the ability to predict future value bleeds over into making your jobs harder.
    I wonder if the missing feature might be a status for “post approval required”—if someone triggers your “probably a crackpot” intuition, rather than the only options being “ban” or “normal access” have a “watchlist” option, where posts and comments have a 60-minute delay before becoming visible (in addition to rate limiting). The only trustworthy evidence about future posts is the posts themselves—drafts or deleted things only show that they have NOT decided to post that.
    Note that I don’t know how big a problem this is. I think that’s a great credit to the mods—you’re removing the truly bad before I notice it, and leaving some not-great-but-not-crackpot, which I think is about right. This makes it very hard for me to be confident in any opinions about whether you’re putting too much work into prior-censorship or not.
- Rafael Harth Jul 7, 2023, 7:45 PM
  7 points
  Parent
  I’m emotionally very opposed to looking at drafts of anyone, though this is not a rationally thought out position. I don’t have the same reaction toward votes because I don’t feel like you have an expectation of privacy there. There are forums where upvotes are just non-anonymous by default.
- Nathan Young Jul 8, 2023, 11:58 AM
  3 points
  Parent
  Ruby, why doesn’t your shortform have agree/disagreevote?
  - Raemon Jul 8, 2023, 8:48 PM
    2 points
    Parent
    It was made in the past and we hadn’t gotten around to updating all shortforms to use the new voting system.
- Max H Jul 8, 2023, 9:28 PM
  2 points
  Parent
  Personal opinion: it’s fine and good for the mods to look at all available evidence when making these calls, including votes and vote patterns. If someone is borderline, I’d rather they be judged based on all available info about them, and I think the more data the mods look at more closely, the more accurate and precise their judgments will be.
  I’m not particularly worried about a moderator being incorrectly “biased” from observing a low-quality draft or a suspect referral; I trust the mods to be capable of making roughly accurate Bayesian updates based on those observations.
  I also don’t think there’s a particularly strong expectation or implicit promise about privacy (w.r.t mods; of course I don’t expect anyone’s votes or drafts to be leaked to the public...) especially for new / borderline users.
  Separately, I feel like the precise policies and issues here are not worth sweating too much, for the mods / LW team. I think y’all are doing a great job overall, and it’s OK if the moderation policy towards new users is a bit adhoc / case-by-case. In particular, I don’t expect anything in the neighborhood of current moderation policies / rate-limiting / privacy violations currently implemented or being discussed to have any noticeable negative effects, on me personally or on most users. (In particular, I disagree pretty strongly with the hypothesis in e.g. this comment; I don’t expect rate limits or any other moderation rules / actions to have any impact whatsoever on my own posting / commenting behavior, and I don’t give them any thought when posting or commenting myself. I suspect the same is true for most other users, who are either unaware of them or don’t care / don’t notice.)
- Dagon Jul 8, 2023, 7:14 PM
  2 points
  Parent
  How frequent are moderation actions? Is this discussion about saving moderator effort (by banning someone before you have to remove the rate-limited quantity of their bad posts), or something else? I really worry about “quality improvement by prior restraint”—both because low-value posts aren’t that harmful, they get downvoted and ignored pretty easily, and because it can take YEARS of trial-and-error for someone to become a good participant in LW-style discussions, and I don’t want to make it impossible for the true newbies (young people discovering this style for the first time) to try, fail, learn, try, fail, get frustrated, go away, come back, and be slightly-above-neutral for a bit before really hitting their stride.
  Relatedly: I’m struck that it seems like half or more of posts get promoted to frontpage (if the /allPosts list is categorizing correctly, at least). I can’t see how many posts are deleted, of course, but I wonder if, rather than moderation, a bit more option in promotion/depromotion would help. If we had another category (frontpage, personal, and random), and mods moved things both up and down pretty easily, it would make for lower-stakes decisionmaking, and you wouldn’t have to ban anyone unless they’re making lots of work for mods even after being warned (or are just pure spam, which doesn’t seem to be the question triggering this discussion).
  - DragonGod Jul 8, 2023, 8:39 PM
    2 points
    Parent
    
    How frequent are moderation actions? Is this discussion about saving moderator effort (by banning someone before you have to remove the rate-limited quantity of their bad posts), or something else? I really worry about “quality improvement by prior restraint”—both because low-value posts aren’t that harmful, they get downvoted and ignored pretty easily, and because it can take YEARS of trial-and-error for someone to become a good participant in LW-style discussions, and I don’t want to make it impossible for the true newbies (young people discovering this style for the first time) to try, fail, learn, try, fail, get frustrated, go away, come back, and be slightly-above-neutral for a bit before really hitting their stride.
    
    I agree with Dagon here.
    
    Six years ago after discovering HPMOR and reading part (most?) of the Sequences, I was a bad participant in old LW and rationalist subreddits.
    
    I would probably have been quickly banned on current LW.
    
    It really just takes a while for people new to LW like norms to adjust.
- Dagon Jul 7, 2023, 7:36 PM
  2 points
  Parent
  Can you formalize the threat model a bit more? What is the harm you’re trying to prevent with this predictive model of whether a user (new or not) will be “productive” or “really bad”? I’m mostly interested in your cost estimates for false positive/negative and your error bars for the information you have available. Also, how big is the gap between “productive” and “really bad”. MOST users are neither—they’re mildly good to mildly bad, with more noise than signal to figure out the sign.
  The bayesean in me says “use all data you have”, but the libertarian side says “only use data that the target would expect to be used”, and even more “I don’t believe you’ll USE the less-direct data to reach correct conclusions”. For example, is it evidence of responsibility that someone deleted a bad comment, or evidence of risk that they wrote it in the first place?
  I DO strongly object to differential treatment of new users. Long-term users have more history to judge them on, but aren’t inherently different, and certainly shouldn’t have more expectation of privacy. I do NOT strongly object to a clear warning that drafts, deleted comments, and DMs are not actually private, and will often be looked at by site admins. I DO object to looking at them without the clear notice that LW is different than a naive expectation in this regard.
  I should say explicitly: I have VERY different intuitions of what’s OK to look at routinely for new users (or old) in a wide-net or general policy vs what’s OK to look at if you have some reason (a complaint or public indication of suspicious behavior) to investigate an individual. I’d be very conservative on the former, and pretty darn detailed on the latter.
  I think you’re fully insane (or more formally, have an incoherent privacy, threat, and prediction model) if you look at deleted/private/draft messages, and ignore voting patterns.