Thoughts on LessWrong norms, the Art of Discourse, and moderator mandate
A couple of weeks ago I asked Should LW have an official list of norms? and I appreciate the responses there. Here I want to say what I’m currently thinking following that post, and continue having a public conversation about it.
I think saying more on this topic actually gets into a bunch of interesting questions around LessWrong’s purpose, userbase, de facto norms and culture, moderation mandate, etc. Without locking in things as “Officially How It Is Forever”, I’ll opine on my current thinking on these topics and how I relate to them in practice. It’s possible that further public discussion will shift some things here, and after more back-and-forth, it’d make sense to “ratify” some of it more.
With all that said...
LessWrong and The Art of Discourse
LessWrong was founded to be a place for perfecting the Art of Human Rationality, i.e., generally thinking in ways which more reliably result in true beliefs, etc. Similarly, I think there’s a closely related “Art of Discourse”: communicating in ways that more reliably result in those conversing (and reading along) having more true beliefs. Perhaps it’s a sub-art of the Art of Human Rationality.
The real rules of which communication most efficiently gets you toward truth live in reality. You can choose your norms, but whether those norms are conducive to truth isn’t up to you.
The LessWrong community, over its 10-15 year existence, has assembled a number of beliefs about the Art of Discourse. Things like communicating degrees of belief quantitatively, preference for asymmetric weapons, an interest in local validity, etc. We of course don’t have the complete art and may be mistaken about pieces of it, but we feel strongly about some of the pieces we believe we possess of this art.
Different people in our community have somewhat different senses of the Art of Discourse, and these even form clusters. But there’s a pretty solid common core set of norms on the site, such that if someone is not conforming to them, most people would want them to change their behavior or go elsewhere.
The core point I want to make here is: The Art of [Truth-seeking] Discourse lives in the territory, and we community members attempt to discover it and practice it.
Moderators moderate according to their own understanding of The Art
A thing you could imagine doing is the community comes together, writes down its sense of how you ought to behave, and enshrines that as The Law. The moderators (judges/police) then interpret and enforce the law. I think this sometimes gets called “Rule of Law”.
I think that gets you some advantages, but requires infrastructure and investment LessWrong can’t realistically have, both for enshrining the initial law and then updating it over time in cases of incompleteness and ambiguity.
(edit: “Rule of Man” as an existing phrase means something crucially different from what I wanted to describe. See my comment here for clarification.)
Instead, LessWrong operates by a ~~“Rule of Man”[1]~~ “hybrid Rule of Law/Man” system where the moderators apply our own understanding of the Art of Discourse to making moderation decisions about which behaviors are okay or not, and what to do with users who behave badly according to us. This has quite a few benefits: it allows us to be flexible and adaptable to new cases, it means we ask a direct question of “does this seem good or not?” rather than “did it violate the enshrined law?”, and it allows us to smoothly improve the enforced policy as our understanding of the Art of Discourse improves over time.
This approach does run the risk that moderators make bad calls (or could be corrupt or biased), which is why I favor moderation being transparent where doing so isn’t too costly, so people can call out things they think are mistakes.
Components of Decision-Making: Inside-View/Outside-View/Stakeholder-Game-Theory
It’d actually be imprecise to say that moderators just moderate according to our inside views of the Art of Discourse. I could carve it a few ways, but here’s one attempted breakdown of how we make our site decisions:
We make moderation decisions based on our inside-view beliefs about what would be good for the Discourse and LessWrong’s goal. We do so via both indirect reasoning (principles we’ve settled on) and direct consequentialist reasoning[2].
We might sometimes weigh the views of people we think are wrong but whose thinking we generally respect. This is something like applying our “outside view” to situations.
One form of this is trying to hold ourselves accountable to the people we think we should hold ourselves accountable to. I can’t currently provide a list or clear criteria, but it’s like “these people seem to really capture the spirit of my values; by doing well in their lights, I will do well according to my own values and judgment”. Another framing might be “this includes the people such that, if they thought we were fucking up or were unhappy with us, I’d really really care”. Eliezer and Scott would be on that list, for example.
We do some “game theory” to figure out how to account for the views and preferences of people that we feel have meaningful “stake” in LessWrong. For example, if a number of very core contributors differed from the LessWrong mod team in their beliefs about the Art of Discourse (it could be something like different beliefs about which politeness-norms or psychologizing-norms are good), we would likely weigh those beliefs in the actual policy we uphold.
Legibilizing one’s understanding of the Art
There’s a bunch of encoded functions and algorithms in my brain which, for any given post or comment on LessWrong, will provide an evaluation of it. This illegible function is what I actually use to make moderation calls, and it would be very difficult, or really impossible, for me to make it fully legible (even to myself). The other members of the LessWrong team have their own functions, and for that matter, so does every user on LessWrong.
However, I can attempt to capture aspects of my encoded function in something explicit: lists of principles that, while not the actual thing, point you in the right direction, or that I can invoke to help explain my reasoning in various cases. These legible lists of principles or rules aren’t law in the sense in which the US Constitution is law, but they’ll provide a better sense of the real rules than you’d have without them.
You end up with a fair bit of indirection:
written discussion principles attempt to capture the LW team’s understanding of the Art of Discourse, which in turn attempts to capture The Actual Art of Discourse.
The written principles/norms are then hopefully useful by:
Being a useful start to learning the actual Art of Discourse for new users
Helping new users understand which behaviors get upvoted/downvoted, approved/rejected, moderated/not-moderated
Helping moderators and other users explain their reactions
Focusing community discussions around okay/not-okay behavior
At the same time, the written principles are not the end-all be-all. A moderator might say “while none of our existing written things capture what you’re doing, we’re pretty sure it’s bad and we’re taking moderator action to prevent more of this”.
A list of norms for LessWrong which is of the shape “here’s our understanding of the Art of Discourse (work in progress)” seems like it could be pretty good.
Towards a settled picture
I think the above picture is pretty good, and it’s approximately the models/philosophy behind current moderation. But it seems good to write it up and discuss it in advance of us taking bolder actions on the basis of it (e.g. writing a list of site norms). I’m very interested in feedback here that could result in amending the picture.
More than something framed as “site norms”, I like the idea of writing up “here’s our understanding of the Art of Discourse so far” — something that can be shared with new users and cited in moderation decisions. Ideally it also gets updated over time as we figure out more and more of the Art of Discourse, and make LW more successful at its mission.
[1] This term gets used elsewhere in not quite the sense I mean it. Elsewhere, Rule of Man means something like the laws come from the man, whereas I actually mean something like “the laws live in the territory but are interpreted and applied by man”. Perhaps there’s a better term for it.
[2] I’ve long been a fan of R. M. Hare’s two-level utilitarianism, and think it in fact matches how we moderate – attempting to figure out general principles, but figuring out those principles and applying them via more direct consequentialist reasoning.
I appreciate you being clear that this is trying to be Rule of Man. But:
I think this misses a big part of what’s important about Rule of Law. A feature of law is that it’s still very flawed, human, vague, incomplete, requiring interpretation, etc.; and yet, it “pretends” to be eternal, symmetric, universal, logical, convergent. More exactly: rule of law is, in the first place, putting the ultimate judge as something other than man, and in the second place putting the ultimate judge as something that is normative, i.e. follows from logic, and in the third place as something that negotiates positive-sum coordination between values.
Slightly more concretely, “rule of man” abdicates responsibility to be open to error correction, because there’s no recourse to a criterion for recognizing error; it’s just rule of man, so ultimately it’s about man’s judgement. It abdicates the attempt to steer towards some abstract ideal.
This can cash out in bad, corrupt, or biased calls. It can also cash out as other people not being able to verify the process; an abstract ideal implies universally recognizable criteria and logical implications which people can independently check, but rule of man doesn’t. Likewise, the mods themselves don’t have the abstract ideal. Since people can’t see an ideal being upheld, they can’t necessarily behave so as to integrate into the process in a way that serves the common good. They can’t put their weight down on the system to go towards truth, openness, etc.; and they can’t rely on having recourse to correcting errors by making error-producing dynamics explicit.
I’m realizing that describing my philosophy for LessWrong as “Rule of Man” was possibly a mistake, since I think what I’m describing is not Rule of Man as used elsewhere. My model is a hybrid system where there’s an external “standard”, i.e. the platonic ideal Art of Discourse, and then the Man (or people) who make judgments about the Art of Discourse. This is importantly different from a pure Rule of Man in that one can go to the rulers and say “hey, you’re mistaken about the Art of Discourse”, because there’s an external standard to appeal to. I think this gets you the properties that you’re saying would be good [about Rule of Law], which is not true of regular Rule of Man.
If you’re down to take the time, I’m curious to hear your comments stated in terms of the system specifically described for LW, rather than the abstract kinds of governance (and in doing so, taboo “Rule of Man/Law”).
This alleged “hybrid system” doesn’t get you the benefits of rule of law, because the distinguishing feature of the rule of law is that the law is not an optimizer. As Yudkowsky explains in “Free to Optimize”, the function of the legal system is “to provide a predictable environment in which people can optimize their own futures.” In a free country (as contrasted to an authoritarian dictatorship), a good citizen is someone who pays their taxes and doesn’t commit crimes. That way, citizens who have different ideas about what the good life looks like can get along with each other. Sometimes people might make bad decisions, but it turns out that it’s actually more fun to live in a country where people have the right to make their own bad decisions (and suffer the natural consequences, like losing money or getting downvoted or failing to persuade people), than a country where a central authority tries to micromanage everyone’s decisions.
A country where judges try to get citizens to “actively (credibly) agree to stop optimizing in a fairly deep way”, and impose ad hoc punishments to prevent those who don’t agree from doing things that “feel damaging” to the judge, is not reaping the benefits of the rule of law, because the benefits of the rule of law flow precisely from the fact that there are rules, and that citizens are free to optimize in a deep way as long as they obey the rules—that the centralized authority isn’t trying to grab all the optimization power in the system for itself.
Crucially, assurances that the power structure is trying to optimize for something good are not rules. I’m sure the judges of the Inquisition or Soviet show trials would have told you that they weren’t exercising power arbitrarily, because they were making judgements about an external standard—the platonic ideal of God’s will, or the common good. I’m sure they were being perfectly sincere about that. The rule of law is about imposing more stringent limits on the use of power than an authority figure’s subjectively sincere allegiance to an abstract ideal that isn’t written down.
I wrote a post explaining why there’s very obviously not going to be any such thing. Probability theory isn’t going to tell you how polite to be. It just isn’t. Why would it? How could it?
What’s your counterargument? If you don’t have a counterargument, then how can you possibly claim with a straight face that “go[ing] to the rulers and say[ing] ‘hey, you’re mistaken about Art of Discourse’” is a redress mechanism?
I feel like I must be missing something? The post seems to be about something else, and I don’t really know how it relates to this. Or maybe you are misunderstanding the metaphor here?
To quote you directly from your post:
This feels to me like it’s actually making a very similar point.
The OP isn’t arguing that there should be a single “rationalist discourse”. It’s pretty explicitly saying that “different discourse algorithms (the collective analogue of ‘cognitive algorithm’) leverage the laws of rationality to convert information into optimization in somewhat different ways, depending on the application and the population of interlocutors at hand”. The Art of Discourse should indeed be general and we should be careful to distinguish between what is the locally correct application of those rules (i.e. what we do around here given our local constraints) and what the general rules are and how they would apply to different environments. This is what I understood the part about “the moderators try to learn the art of discourse, and then separately the moderators will set rules and guidelines and write explanations based on their best understanding of the art, and how it applies to this specific forum” to be about.
This question of “optimization” is pretty interesting. I’m not sure I consider “not an optimizer” to be the distinguishing feature, but nonetheless, I agree that in this sense, the LW “law/moderation” is an optimizer and will interfere with optimization it disagrees with much more than the law would. It might be a matter of degree, though.
I like this comment and have more to say, but I spilled soy sauce on my keyboard recently and my spacebar is sticky, making it hard to type. I’ll say a little more tomorrow when I have an external keyboard again.
I agree with this particular statement, but there are two nearby statements that also seem true and important:
Probability theory absolutely informs which sorts of communication styles are going to identify useful truths most efficiently. For example, you should be more likely to make utterances like “this updates my probability that X will happen” (rather than “X will happen” or “X will not happen” in a more boolean true/false paradigm).
Human psychology and cognitive science (as well as the general study of minds-in-general) absolutely inform the specific question of “what sort of politeness norms are useful for conversations optimized for truth-tracking”. There might be multiple types of conversations that optimize for different truth-tracking strategies. Debate vs collaborative brainstorming vs doublecrux might accomplish slightly different things and benefit from different norms. Crocker’s rules might create locally more truth-tracking in some situations, but also make an environment less likely to include people subconsciously maneuvering such that they won’t have to deal with painful stimuli. There is some fact-of-the-matter about what sort of human cultures find out the most interesting and important things most quickly.
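The first point can be made concrete with a minimal sketch (the payoff here is mine, not from the thread): a Bayesian update moves a probability by an amount that depends on the evidence’s likelihood ratio, information which a bare boolean claim like “X will happen” discards.

```python
# Sketch: why a quantitative belief report ("this updates my probability
# that X will happen") carries more information than a boolean claim.

def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Posterior odds = prior odds * likelihood ratio (odds form of Bayes)."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

prior = 0.30
# Hypothetical evidence that is 3x as likely if X is true than if it is false
posterior = bayes_update(prior, likelihood_ratio=3.0)

# Quantitative report preserves the size of the update:
print(f"P(X): {prior:.2f} -> {posterior:.2f}")  # P(X): 0.30 -> 0.56

# A boolean paradigm collapses the same evidence to a single bit:
print("X will happen" if posterior > 0.5 else "X will not happen")
```

Two listeners hearing “P(X) moved from 0.30 to 0.56” and “P(X) moved from 0.30 to 0.99” learn very different things, while the boolean paradigm reports both identically.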
I argued a bunch with your post at the time and generally don’t think it was engaging with the question I’m considering here. I complained at the time about you substituting a word definition without acknowledging it, which I think you’re doing again here.
Bloom specifically used the phrase “Platonic ideal Art of Discourse”! When someone talks about Platonic ideals of discourse, I think it’s a pretty reasonable reading on my part to infer that they’re talking about simple principles of ideal reasoning with wide interpersonal appeal, like the laws of probability theory, or “Clarifying questions aren’t attacks”, or “A debate in which one side gets unlimited time, but the other side is only allowed to speak three times, isn’t fair”, or “When you identify a concern as your ‘actual crux’, and someone very clearly addresses it, you should either change your mind or admit that it wasn’t a crux”—not there merely being a fact of the matter as to what option is best with respect to some in-principle-specifiable utility function when faced with a complicated, messy empirical policy trade-off (as when deciding when and how to use multiple types of conversations that optimize for different truth-tracking strategies), which is trivial.
If Bloom meant something else by “Platonic ideal Art of Discourse”, he’s welcome to clarify. I charitably assumed the intended meaning was something more substantive than, “The mods are trying to do what we personally think is good, and whatever we personally think is good is a Platonic ideal”, which is vacuous.
This is germane because when I look at recent moderator actions, the claim that the mod team is trying to be accountable to simple principles of ideal reasoning with wide interpersonal appeal is farcical. You specifically listed limiting Said Achmiz’s speech as a prerequisite “next step” for courting potential users to the site. When asked whether the handful of user complaints against Achmiz were valid, you replied that you had “a different ontology here”.
That is, from your own statements, it sure looks like your rationale for restricting Achmiz’s speech is not about him violating any principles of ideal discourse that you can clearly describe (and could therefore make neutrally-enforced rules about, and have an ontology such that some complaints are invalid) but rather that some people happen to dislike Achmiz’s writing style, and you’re worried about those people not using your website. (I’m not confident you’ll agree with that characterization, but it seems accurate to me; if you think I’m misreading the situation, you’re welcome to explain why.)
As amusing as it would be to see you try, I should hope you’re not going to seriously defend “But then fewer people would use our website” as a Platonic ideal of good discourse?!
(I would have hoped that I wouldn’t need to explain this, but to be clear, the problem with “But then fewer people would use our website” as moderation policy is that it systematically sides with popularity over correctness—deciding arguments based on the relative social power of their proponents and detractors, rather than the intellectual merits.)
I’ve heard that some people on this website don’t like Holocaust allusions, but frankly, you’re acting like the property owner of a gated community trying to court wealthy but anti-Semitic potential tenants by imposing restrictions on existing Jewish tenants. You’re sensitive to the fact that this plan has costs, and you’re willing to consider mitigating those costs by probably building something that lets people Opt Into More Jews, but you’re not willing to consider that the complaints of the rich people you’re trying to attract are invalid on account of the Jewish tenants not doing anything legibly bad (that you could make a neutrally-enforced rule against), because you have a different ontology here.
If you object to this analogy, I think you should be able to explain what, specifically, you think the relevant differences are between people who don’t want to share a gated compound with Jews (despite the fact that they’re free to not invite Jews to dinner parties at their own condos), and people who don’t want to share a website with Said Achmiz (despite the fact that they’re free to ban Achmiz from commenting on their own posts). I think it’s a great analogy—right down to the detail of Jews being famous for asking annoying questions.
You mean all that stuff that famously fails to replicate on a regular basis and huge swaths of which have turned out to be basically nonsense…?
I don’t think I know what this is. Are you talking about animal psychology, or formal logic (and similarly mathematical fields like probability theory), or what…?
No doubt there is, but I would like to see something more than just a casual assumption that we have any useful amount of “scientific” or otherwise rigorous knowledge (as opposed to, e.g., “narrative” knowledge, or knowledge that consists of heuristics derived from experience) about this.
Some examples I have in mind here are game theory, information theory, and algorithm design. I think the thing on my mind when I wrote the sentence was How An Algorithm Feels From Inside, which touches on different ways you might structure a network that would have different implications on the algorithm’s efficiency and what errors it might make as a side effect.
To be clear, I don’t currently think I have beliefs about moderation that are strongly downstream of those fields. It’s more that, on a forum that is in large part about the intersection of these (and similar) fields, it’s nice to be able to step between my practical best guesses about which tools to apply, and the underlying laws that might govern things, even if the laws I know of don’t directly apply to the situation.
Game theory is the bit that I feel like I’ve looked into the most myself and grokked, with Most Prisoner’s Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems being an example I found particularly crisp and illuminating, and Elinor Ostrom’s Governance of the Commons being useful for digging into the details of messy human examples, and giving me a sense of what it’d mean to actually translate them into a formalization.
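To illustrate the stag-hunt structure being referenced (payoff numbers are my own illustration, not from the linked post): hunting stag together is the best outcome, but hunting hare is “safe” regardless of what the other player does, which is exactly what makes coordination, rather than cooperation per se, the hard part.

```python
# A stag hunt: two pure-strategy equilibria, one payoff-dominant
# (stag, stag) and one risk-dominant (hare, hare).

# payoffs[(my_move, their_move)] = my payoff (symmetric game)
payoffs = {
    ("stag", "stag"): 4,  # successful coordination: best for both
    ("stag", "hare"): 0,  # I hunted stag alone and got nothing
    ("hare", "stag"): 2,  # safe payoff, partner stranded
    ("hare", "hare"): 2,  # safe payoff for both
}

def best_response(their_move: str) -> str:
    """My payoff-maximizing move given the other player's move."""
    return max(["stag", "hare"], key=lambda mine: payoffs[(mine, their_move)])

# Each player's best response matches the other's move, so both
# (stag, stag) and (hare, hare) are stable — the problem is which
# one players expect, i.e. a Schelling problem.
print(best_response("stag"))  # -> stag
print(best_response("hare"))  # -> hare
```

Unlike a prisoner’s dilemma, no one here wants to defect against a cooperating partner; the failure mode is mutual distrust settling everyone into the (hare, hare) equilibrium.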
FYI, I find a lot of Zack’s philosophy-of-language-translated-into-abstracted-Python useful here, and I’d also include your Selective, Corrective, Structural: Three Ways of Making Social Systems Work as an example of something I’d expect to still hold up in some alien civilizations.
FWIW, my own take here is indeed that we should try to get some of the benefits of the rule of law (and indeed that one of the central components here is putting limits on the power of the moderators), but that an online forum should aspire to a much lower standard of justice than a country, and should be closer to the standard that we hold companies to (where at least in the U.S. you have things like at-will employment and a broad understanding that it’s often the right choice to fire someone even if they didn’t do anything legibly bad). I don’t feel super confident on this though.
Company/corporate cultures are hardly a good model to emulate if we want to optimize for truth-seeking, as such cultures famously select for distortions of truth, often lack any incentives for truth-seeking and truth-telling, and generally reward sociopathy (instrumentally) and bullshit (epistemically) to an appalling degree. Is that really the standard to aim for, here?
I mean, really depends on which company. The variance between different companies here is huge. My current model is that both the world’s best teams and cultures are located in for-profit companies, as well as the world’s worst epistemic environments. So I find it very hard to speak in generalities here (and think it’s somewhat obviously wrong to claim that for-profit companies in-general select for distortions of truth).
When you say that “world’s best teams and cultures are located in for-profit companies”, what companies do you have in mind? SpaceX? Google? Jane Street…?
Bell Labs is the classical example here, as clearly one of history’s most intellectually generative places.
Sadly, it also appears that, at least for the purpose of people really understanding engineering and computer science, Deepmind is quite a good place for thinking, as have been some other parts of Google.
Bridgewater also seems quite good as far as I can tell. At least when I’ve talked to people who worked there, and when I read Ray Dalio’s work, I get a pretty good impression, though it’s of course hard to tell from the outside (I remember you shared an article like 4 years ago with some concerns, though I don’t currently buy the contents of that).
(Note: this comment delayed by rate limit. Next comment on this topic, if any, won’t be for a week, for the same reason.)
Very ironic! I had all three of those in mind as counterexamples to your claim. (Well, not Deepmind specifically, but Google in general; but the other two for sure.)
Bell Labs was indeed “one of history’s most intellectually generative places”. But the striking thing about Bell Labs (and similarly Xerox PARC, and IBM Research) is the extent to which the people working there were isolated from ordinary corporate politics, corporate pressures, and day-to-day business concerns. In other words, these corporate research labs are notable precisely for being enclaves within which corporate/company culture essentially does not operate.
As far as Google and/or Deepmind goes, well… I don’t know enough about Deepmind in particular to comment on it. But Google, in general, is famous for being a place where fixing/improving things is low-prestige, and the way to get ahead is to be seen as developing shiny new features/products/etc. This has predictable consequences for, e.g., usability (Google’s products are infamous for having absolutely horrific interaction and UX design—Google Plus being one egregious example). Everything I’ve heard about Google indicates that the stereotypical “moral maze” dynamics of corporate culture are in full swing there.
Re: Bridgewater, you remember correctly, although “some concerns” is rather an understatement; it’s more like “the place is a real-life Orwellian panopticon, with all the crushing stress and social/psychological dysfunction that implies”. Even more damning is that they never even bother to verify that all of this helps their investing performance in any way. This seems to me to be very obviously the opposite of a healthy epistemic environment—something to avoid as assiduously as we possibly can.
Xerox PARC also had some impressive achievements.
I think my comments still apply to LW as you described it? I guess you could cash out what I’m saying as: you should say what you mean by the Art of Discourse (or whatever the core things are that you’re trying to make space for on LW), such that you’re then also willing to say: this is the thing we’re trying to do here, this is the ideal which users can
validate our mod actions against,
have recourse to object to mod or user actions, or call mod principles into question,
rely on to be the long-term trend of the community.
(And then say that.) Of course the thing will be in many ways vague and unspecified, and require a lot of interpretation. But in the absence of actual law, which would be better but would be costly as you say, it seems to me much better than nothing. You gestured at that in your post: “communicating in ways that more reliably result in those conversing (and reading along) having more true beliefs”. But I think more has to be said.
Yes, “Rule of Man” is a nearly maximally bad choice of words.
This one feels wrong. Discourse has the map-nature.
First, I will admit the triviality that maps are also things that live in territories. Brains run on physics; software runs on hardware. The Venn diagram is {things in territories {things in maps}}. But though we use the same word, the meme of the mythical unicorn in our books and art and brains is distinct from one actually made directly from atoms, though books and brains are made of atoms too.
(Truth-seeking) Discourse is about improving maps by using other maps. If we were using the territory to improve the maps, we might call that an “experiment”, or suchlike.
Innate and universal human psychology might have the territory-nature, because we can improve our understanding of them via experiment, but culture lives in the maps, and unlike innate human nature, is quite mutable. Norms are also mutable. Social rules and laws are mutable. Should we be mapping these, or engineering them instead?
Principles and abstractions and theories have the map-nature. They’re lossy compression models (i.e., maps) that throw away the irrelevant details. This includes socially relevant mathematical models. E.g., Game Theory is made out of models, i.e., maps. Mathematics may be “discovered”, but theorems are towers of meta-maps. The results depend very much upon the axioms.
I think the point being made in the post is that there’s a ground-truth-of-the-matter as to what comprises Art-Following Discourse.
To move into a different frame which I feel may capture the distinction more clearly, the True Laws of Discourse are not socially constructed, but our norms (though they attempt to approximate the True Laws) are definitely socially constructed.
Normally posts like this are “Personal blog”, but since it discusses core parts of how LessWrong gets moderated, etc., seems appropriate to give it extra visibility.