He is using this comment to show the ‘epistemic concerns’ side specifically, and claiming that the personal stuff was separate.
This is the specific claim.
We think that ialdabaoth poses a substantial risk to our epistemic environment due to manipulative epistemic tactics, based on our knowledge and experience of him. This is sufficient reason for the ban, and holds without investigating or making any sort of ruling on other allegations.
Maybe I’m confused about what you mean by “the personal stuff”. My impression is that what I would consider “the personal stuff” is central to why ialdabaoth is considered to pose an epistemic threat: he has (allegedly) a history of manipulation which makes it more likely that any given thing he writes is intended to deceive or manipulate. Which is why jimrandomh said:
The problem is, I think this post may contain a subtle trap, and that understanding its author, and what he was trying to do with this post, might actually be key to understanding what the trap is.
and, by way of explanation of why “this post may contain a subtle trap”, a paragraph including this:
So he created narratives to explain why those conversations were so confusing, why he wouldn’t follow the advice, and why the people trying to help him were actually wronging him, and therefore indebted. This post is one such narrative.
Unless I’m confused, (1) this is not “a somewhat standard LW critique of a LW post” because most such critiques don’t allege that the thing critiqued is likely to contain subtle malignly-motivated traps, and (2) the reason for taking it seriously is “the personal stuff”.
Who’s saying, in what sense, that “the personal stuff was separate”?
Vaniver is saying that the personal stuff wasn’t taken into account when banning him and that the epistemic concerns were enough. From the OP:
We think that ialdabaoth poses a substantial risk to our epistemic environment due to manipulative epistemic tactics, based on our knowledge and experience of him. This is sufficient reason for the ban, and holds without investigating or making any sort of ruling on other allegations.
but then the epistemic concerns seem to be purely based on stuff from the “other allegations” part.
And honestly, the quality of that post is (subjectively) higher than the quality of > 99% of current LW posts, yet the claim is that the content is what he is banned for, which is a bit ridiculous. What I am asking is, why pretend it is the content and that the “other allegations” have no part?
What I am asking is, why pretend it is the content and that the “other allegations” have no part?
As mentioned in a sibling comment, I am trying to establish the principle that ‘promoting reasoning styles in a way we think is predatory’ can be a bannable offense, independent of whether or not predation has obviously happened, in part because I think that’s part of having a well-kept garden and in part so that the next person in ialdabaoth’s reference class can be prevented from doing significant harm. Simply waiting until someone has been exiled doesn’t do that.
I don’t think I have the slightest idea what this means, even having read the OP and everything else about ialdabaoth’s actions. That’s a problem.
I am not worried about you in this regard; if anyone is interested in whether or not they should be worried for themselves, please reach out to me.
Many sorts of misbehavior can be adequately counteracted by clear rules; a speed limit is nicely quantitative and clearly communicable. Misbehavior on a higher level of abstraction like “following the letter of the law but not the spirit” cannot be adequately counteracted in the same sort of way. If one could letter out the spirit of the law, they would have done that the first time. Similarly, if I publish my sense of “this is how I detect adversarial reasoners,” then an adversarial reasoner has an easier time getting past my defenses.
I will put some thought into whether I can come up with a good discussion of what I mean by ‘predatory’, assuming that’s where the confusion is; if instead it’s in something like “promoting reasoning styles” I’d be happy to attempt to elaborate that.
If one could letter out the spirit of the law, they would have done that the first time.
Some sort of typo / word substitution here, I think? Well, I think I get the sense of what you meant here from context. Anyhow…
Similarly, if I publish my sense of “this is how I detect adversarial reasoners,” then an adversarial reasoner has an easier time getting past my defenses.
I appreciate that, but this was not the point of my comment. Rather, I was saying that it’s entirely unclear to me what you are even trying to detect. (“Adversarial reasoners”, yes, but that is obviously far too broad a category… isn’t it? Maybe I don’t know what you mean by that, either…)
I will put some thought into whether I can come up with a good discussion of what I mean by ‘predatory’, assuming that’s where the confusion is; if instead it’s in something like “promoting reasoning styles” I’d be happy to attempt to elaborate that.
Both, in fact. I would appreciate some elaboration on both counts!
Some sort of typo / word substitution here, I think? Well, I think I get the sense of what you meant here from context. Anyhow…
Yeah, I tried to shorten “If one could capture the spirit of the law in letters, they would have done that the first time.” and I don’t think it worked very well.
Both, in fact. I would appreciate some elaboration on both counts!
So thinking about this response to Zack_M_Davis made me realize that there’s a bit of my model that I might be able to explain easily.
I often think of things like LessWrong as having a ‘gravity well’, where people come close and then get drawn in by some features that they want more of. Gravity is obviously disanalogous in several ways, as it ignores the role played by preferences (on seeing something, some people become more interested whereas others become more bored), but I think that doesn’t affect the point too much; the main features that I want are something like “people near the center feel a stronger pull to stay in than people further away, and the natural dynamics tend to pull people closer over time.” I think often a person is in many such wells at once, and there’s some dynamic equilibrium between how all their different interests want to pull them.
Sometimes, people want to have a gravity well around themselves to pull other people closer to them. This class contains lots of ordinary things; someone starting a company and looking to build a team does this, someone trying to find apprentices or otherwise mentor people does this, someone attempting to find a romantic partner does a narrower version of this.
Whether this is ‘predatory’ seems to be a spectrum that for me mostly depends on a few factors, primarily how relative benefit is being prioritized and how information asymmetries are being handled. A clearly non-predatory case is Gallant attempting to start a company that will succeed, where all founders / early employees / investors will share in the profits and whose customers will be satisfied; a clearly predatory case is Goofus attempting to start a company where the main hope is to exit before things collapse, and who plans to pay very little by promising lots of equity while maintaining sufficient control that all of the early employees can be diluted out of the profits when the time comes.
One way to have a gravity well around you is to create something new and excellent that people flock to. Another way to have a gravity well pulling other people towards you is moving towards the center of a pre-existing well. Imagine someone getting good at a martial art so that they can teach that martial art to others and feel knowledgeable and helpful.
A third option is becoming a ‘fringe master’, where you create something new at the boundary of something pre-existing. You don’t have to pay the costs of being part of the ‘normal system’ and dealing with its oversight, but do gain many of the benefits of its advertising / the draw of the excellence at the center of the pre-existing thing. This is basically the category that I am most worried about / want to be able to act against, where someone will take advantage of new people drawn in by LessWrong / the rationality community who don’t know about the missing stairs to watch out for or who is held in low regard. A general feature of this case is that the most unsavory bits will happen in private, or only be known through rumor, or assessments of technical skill or correctness that seem obvious to high-skill individuals will not be widely held.
Put another way, it seems to me like the rationality community is trying to draw people in, so that they can get the benefits of rationality / they can improve rationality / they can contribute to shared projects, and it would be good to devote attention to the edges of the community and to making sure that we’re not losing the marginal people we attract to traps set up there.
Furthermore, given the nature of our community, I expect those traps to look a lot like reasoning styles or models that contain traps that the people absorbing the model are unable to see. One of the main things that rationalists do is propose ways of thinking things through that could lead to better truth-tracking or superior performance in some way.
The primary thing I’m worried about there is when the superior performance is on a metric like “loyalty to the cause” or “not causing problems for ialdabaoth.” In a top-level comment, Isnasene writes:
Using Affordance Widths to manipulate people into doing things for you is basically a fancy pseudo-rationalist way of manipulating people into doing things for you by making them feel guilty and responsible.
Which is basically my sense; the general pattern was that ialdabaoth would insist on his frame / interpretation that he was being oppressed by society / the other people in the conversation, and would use a style of negotiation that put the pressure on the other person to figure out how to accommodate him / not be guilty of oppressing him, instead of on him to not violate their boundaries. There’s also a general sense that he was aware of the risk to him from open communication or clear thinking about him, and thus would work against both when possible.
becoming a ‘fringe master’, where you create something new at the boundary of something pre-existing. You don’t have to pay the costs of being part of the ‘normal system’ and dealing with its oversight, but do gain many of the benefits of its advertising / the draw of the excellence at the center of the pre-existing thing. This is basically the category that I am most worried about / want to be able to act against, where someone will take advantage of new people drawn in by LessWrong / the rationality community who don’t know about the missing stairs to watch out for or who is held in low regard.
Thanks for explaining this part; this is really helpful. This model seems to assume that the “oversight” of the “normal system” at the center of the gravity well is trustworthy. I’m currently most worried[1] about the scenario where the “normal system” is corrupt: that new people are getting drawn in by the awesomeness of the Sequences and Harry Potter and the Methods,[2] only to get socialized into a community whose leadership and dominant social trend pays lip service to “rationality”, but is not actually interested in using reasoning (at least, not in public) when using reasoning would be socially inconvenient (whether due to the local area’s political environment, the asymmetric incentives faced by all sufficiently large organizations, the temptation to wirehead on our own “rationality” and “effectiveness” marketing promises, or many other possible reasons) and therefore require a small amount of bravery.
As Michael Vassar put it in 2013:
The worst thing for an epistemic standard is not the person who ignores or denies it, but the person who tries to mostly follow it when doing so feels right or is convenient while not acknowledging that they aren’t following it when it feels weird or inconvenient, as that leads to a community of people with such standards engaging in double-think WRT whether their standards call for weird or inconvenient behavior.
Have you thought at all about how to prevent the center of the gravity well from becoming predatory? Obviously, I’m all in favor of having systems to catch missing-stair rapists. But if you’re going to build an “immune system” to delegitimize anyone “held in low regard” without having to do the work of engaging with their arguments—without explaining why an amorphous mob that holds something in “low regard” can be trusted to reach that judgement for reliably good reasons—then you’re just running a cult. And if enough people who remember the spirit of the Sequences notice their beloved rationality community getting transformed into a cult, then you might have a rationalist civil war on your hands.
(Um, sorry if that’s too ominous or threatening of a phrasing. I think we mostly want the same thing, but have been following different strategies and exposed to different information, and I notice myself facing an incentive to turn up the rhetoric and point menacingly at my BATNA in case that helps with actually being listened to, because recent experiences have trained my brain to anticipate that even high-ranking “rationalists” are more interested in avoiding social threat than listening to arguments. As I’m sure you can also see, this is already a very bad sign of the mess we’re in.)
“Worried” is an understatement. It’s more like panicking continuously all year with many hours of lost sleep, crying fits, pacing aimlessly instead of doing my dayjob, and eventually doing enough trauma processing to finish writing my forthcoming 20,000-word memoir explaining in detail (as gently and objectively as possible while still telling the truth about my own sentiments and the world I see) why you motherfuckers are being incredibly intellectually dishonest (with respect to a sense of “intellectual dishonesty” that’s about behavior relative to knowledge, not conscious verbal “intent”).
Notably, written at a time Yudkowsky and “the community” had a lower public profile and therefore faced less external social pressure. This is not a coincidence because nothing is ever a coincidence.
This model seems to assume that the “oversight” of the “normal system” at the center of the gravity well is trustworthy.
On the core point, I think you improve / fix problems with the normal system in the boring, hard ways, and I do deeply appreciate you championing particular virtues even when I disagree on where the balance of virtues lies.
I find something off-putting here about the word “trustworthy,” because I feel like it’s a 2-place word; I think of oversight as something like “good enough to achieve standard X”, whereas “trustworthy” alone seems to imply there’s a binary standard that is met or not (and has been met). It seems like we could easily have very different standards for trustworthiness that cause us to not disagree on the facts while disagreeing on the implications.
(Somehow, it reminds me of this post and Caledonian’s reaction to it.)
Have you thought at all about how to prevent the center of the gravity well from becoming predatory?
Yes. Mostly this has focused on recruitment work for MIRI, where we really don’t want to guilt people into working on x-risk reduction (as not only is it predatory, it also is a recipe for them burning out instead of being productive, and so morality and efficiency obviously align), and yet most of the naive ways to ask people to consider working on x-risk reduction risk guilting them, and you need a more sophisticated way to remove that failure mode than just saying “please don’t interpret this as me guilting you into it!”. This is a thing that I’ve already written that’s part of my longer thoughts here.
And, obviously, when I think about moderating LessWrong, I think about how to not become corrupt myself, and what sorts of habits and systems lower the chances of that, or make it more obvious if it does happen.
it’s a 2-place word [...] It seems like we could easily have very different standards for trustworthiness that cause us to not disagree on the facts while disagreeing on the implications.
Right, I agree that we don’t want to get into a pointless pseudo-argument where everyone agrees that x = 60, and yet we have a huge shouting match over whether this should be described using the English word “large” or “small.”
Maybe a question that would lead to a more meaningful disagreement would be, “Should our culture become more or less centralized?”—where centralized is the word I’m choosing to refer to a concept I’m going to try to describe extensionally/ostensively in the following two paragraphs.[1]
A low-centralization culture has slogans like, “Nullius in verba” or “Constant vigilance!”. If a fringe master sets up shop on the outskirts of town, the default presumption is that (time permitting) you should “consider it open-mindedly and then steal only the good parts [...] [as] an obvious guideline for how to do generic optimization”, not because most fringe masters are particularly good (they aren’t), but because thinking for yourself actually works and it’s not like our leaders in the town center know everything already.
In a high-centralization culture, there’s a stronger presumption that our leaders in the town center come closer to knowing everything already, and that the reasoning styles or models being hawked by fringe masters are likely to “contain traps that the people absorbing the model are unable to see”: that is, thinking for yourself doesn’t work. As a result, our leaders might talk up “the value of having a community-wide immune system” so that they can “act against people who are highly manipulative and deceitful before they have clear victims.” If a particular fringe master starts becoming popular, our leaders might want to announce that they are “actively hostile to [the fringe master], and make it clear that [we] do not welcome support from those quarters.”
You seem to be arguing that we should become more centralized. I think that would be moving our culture in the absolute wrong direction. As long as we’re talking about patterns of adversarial optimization, I have to say that, to me, this kind of move looks optimized for “making it easier to ostracize and silence people who could cause trouble for MIRI and CfAR (e.g., Vassar or Ziz), either by being persistent critics or by embarrassing us in front of powerful third parties who are using guilt-by-association heuristics”, rather than improving our collective epistemics.
This seems like a substantial disagreement, rather than a trivial Sorites problem about how to use the word “trustworthy”.
do deeply appreciate you championing particular virtues even when I disagree on where the balance of virtues lies
Thanks. I like you, too.
I just made this up, so I’m not at all confident this is the right concept, much like how I didn’t think contextualizing-vs.-decoupling was the right concept.
not because most fringe masters are particularly good (they aren’t), but because thinking for yourself actually works and it’s not like our leaders in the town center know everything already.
I think the leaders in the town center do not know everything already. I think different areas have different risks when it comes to “thinking for yourself.” It’s one thing to think you can fly and jump off a roof yourself, and another thing to think it’s fine to cook for people when you’re Typhoid Mary, and I worry that you aren’t drawing a distinction here between those cases.
I have thought about this a fair amount, but am not sure I’ve discovered the right conceptual lines here, and would be interested in how you would distinguish between the two cases, or if you think they are fundamentally equivalent, or that one of them isn’t real.
You seem to be arguing that we should become more centralized. I think that would be moving our culture in the absolute wrong direction.
In short, I think there are some centralizing moves that are worth it, and others that aren’t, and that we can choose policies individually instead of just throwing the lever on “centralization: Y/N”. Well-Kept Gardens Die by Pacifism is ever relevant; here, the thing that seems relevant to me is that there are some basic functions that need to happen (like, say, the removal of spam), and fulfilling those functions requires tools that could also be used for nefarious functions (as we could just mark criticisms of MIRI as ‘spam’ and they would vanish). But the conceptual categories that people normally have for this are predicated on the interesting cases; sure, both Nazi Germany and WWII America imprisoned rapists, but the interesting imprisonments are of political dissidents, and we might prefer WWII America because it had many fewer such political prisoners, and further prefer a hypothetical America that had no political prisoners. But this spills over into the question of whether we should have prisons or justice systems at all, and I think people’s intuitions on political dissidents are not very useful for what should happen with the more common sort of criminal.
Like, it feels almost silly to have to say this, but I like it when people put forth public positions that are critical of an idea I favor, because then we can argue about it and it’s an opportunity for me to learn something, and I generally expect the audience to be able to follow it and get things right. Like, I disagreed pretty vociferously with The AI Timelines Scam, and yet I thought the discussion it prompted was basically good. It did not ping my Out To Get You sensors in the way that ialdabaoth does. To me, this feels like a central example of the sort of thing you see in a less centralized culture where people are trying to think things through for themselves and end up with different answers, and is not at risk from this sort of moderation.
I don’t think this conversation is going to make any progress at this level of abstraction and in public. I might send you an email.
I look forward to receiving it.
“Worried” is an understatement. It’s more like panicking continuously all year with many hours of lost sleep, crying fits, pacing aimlessly instead of doing my dayjob, and eventually doing enough trauma processing to finish writing my forthcoming 20,000-word memoir explaining in detail (as gently and objectively as possible while still telling the truth about my own sentiments and the world I see) why you motherfuckers are being incredibly intellectually dishonest (with respect to a sense of “intellectual dishonesty” that’s about behavior relative to knowledge, not conscious verbal “intent”).
I think I’m someone who might be sympathetic to your case, but just don’t understand what it is, so I’m really curious about this “memoir”. (Let me know if you want me to read your current draft.) Is this guess, i.e., you’re worried about the community falling prey to runaway virtue signaling, remotely close?
Close! I’ll PM you.
This comment of yours does a better job of explaining that than the post.
The separation I’m hoping to make is between banning him because “we know he committed sex crimes” and banning him because “he’s promoting reasoning styles in a way we think is predatory.” We do not know the first to the standard of legal evidence; ialdabaoth has not been convicted in a court of law, and while I think the community investigations were adequate for the questions of whether or not he should be allowed at particular events or clubs, my sense is that his exile was decided by amorphous community-wide processing in a way that I’m reluctant to extend further.
I’m making the additional claim that “The Bay Area community has a rough consensus to kick this guy out” by itself does not meet my bar for banning someone from LessWrong, given the different dynamics of in-person and online interactions. (As a trivial example, suppose someone’s body has a persistent and horrible smell; it could easily be the utilitarian move to not allow them at any physical meetups while giving them full freedom to participate online.) I think this is the bit Tenoke is finding hardest to swallow; it’s one thing to say “yep, this guy is exiled and we’re following the herd” and another thing to say “we’ve exercised independent judgment here, despite obvious pressures to conform.” The latter is a more surprising claim, and correspondingly would require more evidence.
and (2) the reason for taking it seriously is “the personal stuff”.
I think this is indirectly true. That is, there’s a separation between expected harm and actual harm, and I’m trying to implement procedures that reduce expected harm. Consider the difference between punishing people for driving drunk and just punishing people for crashing. It’s one thing to just wait until someone accumulates a cloud of ‘unfortunate events’ around them that leads to them finally losing their last defenders, and another to take active steps to assess risks and reduce them. Note that this requires a good model of how ‘drunkenness’ leads to ‘crashes’, and I do not see us as having presented a convincing model of that in this case.
Of course, this post isn’t an example of that; as mentioned, this post is years late, and the real test of whether we can do the equivalent of punishing people for driving drunk is whether we can do anything about people currently causing problems [in expectation]. But my hope is that this community slowly moves from a world where ‘concerns about X’ are published years after they’ve become mutual knowledge among people in the know to one where corrosive forces are actively cleaned up before they make things substantially worse.
Why would it make sense to “exclude the personal stuff”? Isn’t the personal stuff the point here?