Wei Dai

Karma: 40,699

I think I need more practice talking with people in real time (about intellectual topics). (I’ve gotten much more used to text chat/comments, which I like because it puts less time pressure on me to think and respond quickly, but I feel like I now incur a large cost due to excessively shying away from talking to people, hence the desire for practice.) If anyone wants to have a voice chat with me about a topic that I’m interested in (see my recent post/comment history to get a sense), please contact me via PM.

www.weidai.com

Wei Dai Jul 22, 2025, 5:59 AM
4 points
3
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines

I think I have elaborated non-trivially on my reasons in this thread, so I don’t really think it’s an issue of people not finding it.

It’s largely an issue of lack of organization and conciseness (50k+ words is a minus, not a plus in my view), but also clearly an issue of “not finding it”, given that you couldn’t find an important comment of your own, one that (judging from your description of it) contains a core argument needed to understand your current insistence on authors-as-unilateral-moderators.

Wei Dai Jul 22, 2025, 2:15 AM
5 points
2
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines
I meant pivot in the sense of “this doesn’t seem to be working well, we should seriously consider other possibilities” not “we’re definitely switching to a new moderation model”, but I now get that you disagree with Ray even about this.

Your comment under Ray’s post said:

We did end up implementing the AI Alignment Forum, which I do actually think is working pretty well and is a pretty good example of how I imagine Archipelago-like stuff to play out. We now also have both the EA Forum and LessWrong creating some more archipelago-like diversity in the online-forum space.

This made me think you were also no longer very focused on the authors-as-unilateral-moderators model and were thinking more about the subreddit-like models that Ray mentioned in his post.

BTW I’ve been thinking for a while that LW needs a better search, as I’ve also often been in the position of being unable to find some comment I’ve written in the past.

Instead of one-on-one chats (or in addition to them), I think you should collect/organize your thoughts in a post or sequence, for a number of reasons including that you seem visibly frustrated that after having written 50k+ words on the topic, people like me still don’t know your reasons for preferring your solution.

Wei Dai Jul 22, 2025, 12:58 AM
5 points
3
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines
It seems to me there are plenty of options aside from centralized control and giving authors unilateral powers, and last I remember (i.e., at the end of this post) the mod team seems to be pivoting to other possibilities, some of which I would find much more reasonable/acceptable. I’m confused why you’re now so focused again on the model of authors-as-unilateral-moderators. Where have you explained this?

Wei Dai Jul 21, 2025, 7:26 PM
7 points
1
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines

I mean, mostly we’ve decided to give the people who complain about moderation a shot

What do you mean by this? Until I read this sentence, I saw you as giving the people who demand unilateral moderation powers a shot, and denying the requests of people like me to reduce such powers.

My not very confident guess at this point is that if it weren’t for people like me, you would have pushed harder for people to moderate their own spaces more, perhaps by trying to publicly encourage this? And why did you decide to go against your own judgment on it, given that “people who complain about moderation” have no particular powers, except the power of persuasion (we’re not even threatening to leave the site!), and it seems like you were never persuaded?

My guess is LW would be a lot better if more people felt comfortable moderating things, and in the present world, there are a lot of costs borne by the site admins that wouldn’t be necessary otherwise.

This seems implausible to me given my understanding of human nature (most people really hate to see/hear criticism) and history (few people can resist the temptation to shut down their critics when given the power and social license or cover to do so). If you want a taste of this, try asking DeepSeek some questions about the CCP.

But presumably you also know this (at least abstractly, but perhaps not as viscerally as I do, coming from a Chinese background, where even before the CCP, criticism in many situations was culturally/socially impossible), so I’m confused and curious why you believe what you do.

My guess is that you see a constant stream of bad comments, and wish you could outsource the burden of filtering them to post authors (or combine efforts to do more filtering). But as an occasional post author, my experience is that I’m not a reliable judge of what counts as a “bad comment”, e.g., I’m liable to view a critique as a low quality comment, only to change my mind later after seeing it upvoted and trying harder to understand/appreciate its point. Given this, I’m much more inclined to leave the moderation to the karma system, which seems to work well enough in leaving bad comments at low karma/visibility by not upvoting them, and even when it’s occasionally wrong, still provides a useful signal to me that many people share the same misunderstanding and it’s worth my time to try to correct (or maybe by engaging with it I find out that I still misjudged it).

But if you don’t think it works well enough… hmm I recall writing a post about moderation tech proposals in 2016 and maybe there have been newer ideas since then?

Wei Dai Jul 21, 2025, 6:27 AM
6 points
3
in reply to: Ben Pace’s comment on: [Meta] New moderation tools and moderation guidelines

If someone finds interacting with you very unpleasant and you don’t understand quite why, it’s often bad form to loudly complain about it every time they don’t want to interact with you any more, even if you have an uncharitable hypothesis as to why.

If I was in this circumstance, I would be pretty worried about my own biases, and ask neutral or potentially less biased parties whether there might be more charitable and reasonable hypotheses why that person doesn’t want to interact with me. If there isn’t though, why shouldn’t I complain and e.g. make it common knowledge that my valuable criticism is being suppressed? (Obviously I would also take into consideration social/political realities, not make enemies I can’t afford to make, etc.)

I’ve seen many spaces degrade due to unwillingness to moderate

But most people aren’t using this feature, so to the extent that LW hasn’t degraded (and that’s due to moderation), isn’t it mainly because of the site moderators and karma voters? The benefits of having a few people occasionally moderate their own spaces hardly seem worth the cost (to potential critics and people like me who really value criticism) of not knowing when their critiques might be unilaterally deleted or banned by post authors. I mean aside from the “benefit” of attracting/retaining the authors who demand such unilateral powers.

And, man, this is a lot of moderation discussion.

Aside from the above “benefit”, it seems like you’re currently getting the worst of both worlds: lack of significant usage and therefore potential positive effects, and lots of controversy when it is occasionally used. If you really thought this was an important feature for the long term health of the community, wouldn’t you do something to make it more popular? (Or have done it in the past 7 years since the feature came out?) But instead you (the mod team) seem content that few people use it, only coming out to defend the feature when people explicitly object to it. This only seems to make sense if the main motivation is again to attract/retain certain authors.

I am somewhat wary you will keep asking me a lot of short questions that, due to your inexperience moderating spaces, you will assume have simple answers, and I will have to do lots of work generating all the contexts to show how things play out

It seems like if you actually wanted or expected many people to use this feature, you would have written some guidelines on what people can and can’t do, or under what circumstances their moderation actions might be reversed by the site moderators. I don’t think I was expecting the answers to my questions to necessarily be simple, but rather that the answers already exist somewhere, at least in the form of general guidelines that might need to be interpreted to answer my specific questions.

Wei Dai Jul 20, 2025, 2:44 PM
9 points
2
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines

the problem is that many bad comments try to make some things low status that I am trying to cultivate on the site

What are these things? Do you have a post about them?

Wei Dai Jul 20, 2025, 2:34 PM
7 points
5
in reply to: Ben Pace’s comment on: [Meta] New moderation tools and moderation guidelines

We’re arguing that authors on LessWrong should be able to moderate their posts with different norms/standards from one another, and that there should not reliably be retribution or counter-punishment by other commenters for them moderating in that way.

What is currently the acceptable range of moderation norms/standards (according to the LW mod team)? For example if someone blatantly deletes/bans their most effective critics, is that acceptable? What if they instead subtly discourage critics (while being overtly neutral/welcoming) by selectively enforcing rules more stringently against their critics? What if they simply ban all “offensive” content, which as a side effect discourages critics (since as I mentioned earlier, criticism almost inescapably implies offense)?

And what does “retribution or counter-punishment” mean? If I see an author doing one of the above, and question or criticize that in the comments or elsewhere, is that considered “retribution or counter-punishment” given that my comment/post is also inescapably offensive (status-lowering) toward the author?

Wei Dai Jul 18, 2025, 3:55 PM
7 points
0
on: Comment on “Four Layers of Intellectual Conversation”
As a note on intellectual history, I think I was influenced by “Four Layers of Intellectual Conversation” (Yudkowsky 2016) and “AI Safety via Debate” (Irving, Christiano, Amodei 2018) when I wrote “Some Thoughts on Metaphilosophy” (2019), and I wonder if “AI Safety via Debate” itself was influenced by Eliezer’s post, even though there’s no direct citation, since both emphasize the problem-solving power of large/unlimited number of layers/rounds of debate.

Wei Dai Jul 5, 2025, 5:09 PM
15 points
12
in reply to: Ben Pace’s comment on: [Meta] New moderation tools and moderation guidelines
Does everyone here remember and/or agree with my point in The Nature of Offense, that offense is about status, which in the current context implies that it’s essentially impossible to avoid giving offense while delivering strong criticism (as it almost necessarily implies that the target of criticism deserves lower status for writing something seriously flawed, having false/harmful beliefs, etc.)? @habryka @Zack_M_Davis @Said Achmiz
This discussion has become very long and I’ve been travelling so I may have missed something, but has anyone managed to write a version of Said’s comment that delivers the same strength of criticism while avoiding offending its target? (Given the above, I think this would be impossible.)
What links here?
- “Some Basic Level of Mutual Respect About Whether Other People Deserve to Live”?! by Zack_M_Davis (Jul 18, 2025, 6:41 AM; 16 points)

Wei Dai Jun 18, 2025, 12:07 AM
6 points
2
in reply to: Ben Pace’s comment on: [Meta] New moderation tools and moderation guidelines

I think incompatibilities often drive people away (e.g. at LessOnline I have let ppl know they can ask certain ppl not to come to their sessions, as it would make them not want to run the sessions, and this is definitely not due to criticism but to conflict between the two people). That’s one reason why I think this should be available.

This is something I currently want to accommodate but not encourage people to use moderation tools for, but maybe I’m wrong. How can I get a better sense of what’s going on with this kind of incompatibility? Why do you think “definitely not due to criticism but to conflict”?

I think bad commenters also drive people away. There are bad commenters who seem fine when inspecting any single comment but when inspecting longer threads and longer patterns they’re draining energy and provide no good ideas or arguments. Always low quality criticisms, stated maximally aggressively, not actually good at communication/learning. I can think of many examples.

It seems like this requires a very different kind of solution than either local bans or mutes, which most people don’t or probably won’t use, so can’t help in most places. Like maybe allow people to vote on commenters instead of just comments, and then their comments get a default karma based on their commenter karma (or rather the direct commenter-level karma would contribute to the default karma, in addition to their total karma which currently determines the default karma).
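To make the commenter-level karma idea concrete, here is a minimal sketch, assuming a simple additive rule. The function name, the 1000-karma threshold, and the divisor of 10 are all hypothetical placeholders for illustration, not LW's actual rules.

```python
def default_comment_karma(total_karma: int, commenter_score: int,
                          total_threshold: int = 1000,
                          score_divisor: int = 10) -> int:
    """Starting karma for a new comment (hypothetical rule)."""
    # Stand-in for the existing behavior: the default depends only on
    # the commenter's total karma. The threshold is an assumption.
    base = 2 if total_karma >= total_threshold else 1
    # Proposed addition: votes cast directly on the commenter shift the
    # default up or down. The divisor is an arbitrary damping factor.
    adjustment = int(commenter_score / score_divisor)
    return base + adjustment
```

Under these assumptions, a commenter with high total karma but a strongly negative commenter-level score would start each comment below zero, so it would sit near the bottom unless readers upvote it.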

Most substantial criticisms on this site have come in post and quick takes form, such as Wentworth’s critiques of other alignment strategies, or the sharp left turn discourse, or Natalia’s critiques of Guzey’s sleep hypotheses / SMTM’s lithium hypothesis, or Eliezer’s critique of the bioanchors report.

I’m worried about less “substantial” criticisms that are unlikely to get their own posts, like just pointing out a relatively obvious mistake in the OP, or lack of clarity, or failure to address some important counterargument.

Wei Dai Jun 17, 2025, 11:09 PM
3 points
3
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines
It would be easy to give authors a button to let them look at comments that they’ve muted. (This seems so obvious that I didn’t think to mention it, and I’m confused by your inference that authors would have no ability to look at the muted comments at all. At the very least they can simply log out.)

Wei Dai Jun 17, 2025, 10:54 PM
4 points
4
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines
In the discussion under the original post, some people will have read the reply post, and some won’t (perhaps including the original post’s author, if they banned the commenter in part to avoid having to look at their content), so I have to model this.

Sure, let’s give people moderation tools, but why trust authors with unilateral powers that can’t be overridden by the community, such as banning and moving comments/commenters to a much less visible section?

Wei Dai Jun 17, 2025, 10:19 PM
8 points
2
in reply to: Ben Pace’s comment on: [Meta] New moderation tools and moderation guidelines
My proposal was meant to address the requirement that some authors apparently have to avoid interacting with certain commenters. All proposals dealing with this imply multiple conversations and people having to model different states of knowledge in others, unless those commenters are just silenced altogether, so I’m confused why it’s more confusing to have multiple conversations happening in the same place when those conversations are marked clearly.

It seems to me like the main difference is that Habryka just trusts authors to “garden their spaces” more than I do, and wants to actively encourage this, whereas I’m reluctantly trying to accommodate such authors. I’m not sure what’s driving this difference though. People on Habryka’s side (so far only he has spoken up, but there are clearly more, given voting patterns) seem very reluctant to directly address the concern that people like me have that even great authors are human and likely biased quite strongly when it comes to evaluating strong criticism, unless they’ve done so somewhere I haven’t seen.

Maybe it just comes down to differing intuitions and there’s not much to say? There’s some evidence available though, like Said’s highly upvoted comment nevertheless triggering a desire to ban Said. Has Habryka seen more positive evidence that I haven’t?

Wei Dai Jun 17, 2025, 11:31 AM
LW: 2 AF: 2
0
AF
in reply to: Geoffrey Irving’s comment on: An alignment safety case sketch based on debate

I think we’re in a similar place with the philosophical worries: we have both a bunch of specific games that fail with older theories, and a bunch of proposals (say, variants of FDT) without a clear winner.

I think the situation in decision theory is way more confusing than this. See https://www.lesswrong.com/posts/wXbSAKu2AcohaK2Gt/udt-shows-that-decision-theory-is-more-puzzling-than-ever and I would be happy to have a chat about this if that would help convey my view of the current situation.

Wei Dai Jun 17, 2025, 11:21 AM
1 point
−2
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines
To reduce clutter you can reuse the green color bars that currently indicate new comments, and make it red for muted comments.

Authors might rarely ban commenters because the threat of banning drives them away already. And if the bans are rare then what’s the big deal with requiring moderator approval first?

giving the author social legitimacy to control their own space, combined with checks and balance

I would support letting authors control their space via the mute and flag proposal, adding my weight to its social legitimacy, and I’m guessing others who currently are very much against the ban system (thus helping to deprive it of social legitimacy) would also support or at least not attack it much in the future. I and I think others would be against any system that lets authors unilaterally exert very strong control of visibility of comments such as by moving them to a bottom section.

But I guess you’re actually talking about something else, like how comfortable the UX makes the author, thus encouraging them to use it more. It seems like you’re saying you don’t want to make the muting too in-your-face, because that makes authors uncomfortable and reluctant to use it? Or you simultaneously want authors to have a lot of control over comment visibility, but don’t want that fact to be easily visible (and the current ban system accomplishes this)? I don’t know, this just seems very wrong to me, like you want authors to feel social legitimacy that doesn’t actually exist, i.e., if most people support giving authors more control then why would it be necessary to hide it?

Wei Dai Jun 16, 2025, 8:15 PM
4 points
0
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines
Yeah I think it would help me understand your general perspective better if you were to explain more why you don’t like my proposal. What about just writing out the top 3 reasons for now, if you don’t want to risk investing a lot of time on something that might not turn out to be productive?

Wei Dai Jun 16, 2025, 7:31 PM
2 points
0
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines

Comments almost never get downvoted.

Assuming your comment was serious (which on reflection I think it probably was), what about a modification to my proposed scheme, that any muted commenter gets an automatic downvote from the author when they comment? Then it would stay at the bottom unless enough people actively upvoted it? (I personally don’t think this is necessary because low quality comments would stay near the bottom even without downvotes just from lack of upvotes, but I want to address this if it’s a real blocker for moving away from the ban system.)

Wei Dai Jun 16, 2025, 7:03 PM
2 points
0
in reply to: Wei Dai’s comment on: [Meta] New moderation tools and moderation guidelines
BTW my old, now defunct user script LW Power Reader had a feature to adjust the font size of comments based on their karma, so that karma could literally affect visibility despite “the thread structure making strict karma sorting impossible”. So you could implement that if you want, but it’s not really relevant to the current debate since karma obviously affects visibility virtually even without sorting, in the sense that people can read the number and decide to skip the comment or not.
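For illustration, a karma-to-font-size mapping could look roughly like the sketch below. This is not the original script's code; the logarithmic scaling, the function name, and the pixel bounds are assumptions chosen for the example.

```python
import math


def font_size_for_karma(karma: int, base_px: float = 13.0,
                        min_px: float = 10.0, max_px: float = 20.0) -> float:
    """Map a comment's karma to a font size in pixels (hypothetical mapping)."""
    # Scale logarithmically so very high or very low karma does not
    # produce absurdly large or small text.
    magnitude = math.log2(1 + abs(karma))
    size = base_px + magnitude if karma >= 0 else base_px - magnitude
    # Clamp to a readable range.
    return max(min_px, min(max_px, size))
```

With a mapping like this, high-karma comments stand out visually even in the middle of a thread, where sorting alone cannot move them to the top.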

Wei Dai Jun 16, 2025, 6:45 PM
13 points
−2
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines
Said’s comment that triggered this debate is ³⁹⁄₃₄, at the top of the comments section of the post and #6 in Popular Comments for the whole site, but you want to allow the author to ban Said from future commenting, with the rationale “you should model karma as currently approximately irrelevant for managing visibility of comments”. I think this is also wrong generally as I’ve often found karma to be very helpful in exposing high quality comments to me, and keeping lower quality comments less visible toward the bottom, or allowing me to skip them if they occur in the middle of threads.

I almost think the nonsensical nature of this justification is deliberate, but I’m not quite sure. In any case, sigh...

Wei Dai Jun 16, 2025, 5:59 AM
4 points
2
in reply to: habryka’s comment on: [Meta] New moderation tools and moderation guidelines
The point of my proposal is to give authors an out if there are some commenters who they just can’t stand to interact with. This is a claimed reason for demanding a unilateral ban, at least for some.

If the author doesn’t trust the community to vote bad takes down into less visibility, when they have no direct COI, why should I trust the author to do it unilaterally, when they do? Writing great content doesn’t equate to rationality when it comes to handling criticism.

LW has leverage in the form of its audience, which most blogs can’t match, but obviously that’s not sufficient leverage for some, so I’m willing to accept the status quo, but that doesn’t mean I’m going to be happy about it.