becoming a ‘fringe master’, where you create something new at the boundary of something pre-existing. You don’t have to pay the costs of being part of the ‘normal system’ and dealing with its oversight, but you do gain many of the benefits of its advertising / the draw of the excellence at the center of the pre-existing thing. This is basically the category that I am most worried about / want to be able to act against, where someone will take advantage of new people drawn in by LessWrong / the rationality community who don’t know which missing stairs to watch out for or who is held in low regard.
Thanks for explaining this part; this is really helpful. This model seems to assume that the “oversight” of the “normal system” at the center of the gravity well is trustworthy. I’m currently most worried[1] about the scenario where the “normal system” is corrupt: that new people are getting drawn in by the awesomeness of the Sequences and Harry Potter and the Methods,[2] only to get socialized into a community whose leadership and dominant social trend pays lip service to “rationality”, but is not actually interested in using reasoning (at least, not in public) when using reasoning would be socially inconvenient (whether due to the local area’s political environment, the asymmetric incentives faced by all sufficiently large organizations, the temptation to wirehead on our own “rationality” and “effectiveness” marketing promises, or many other possible reasons) and therefore require a small amount of bravery.
As Michael Vassar put it in 2013:

The worst thing for an epistemic standard is not the person who ignores or denies it, but the person who tries to mostly follow it when doing so feels right or is convenient while not acknowledging that they aren’t following it when it feels weird or inconvenient, as that leads to a community of people with such standards engaging in double-think WRT whether their standards call for weird or inconvenient behavior.
Have you thought at all about how to prevent the center of the gravity well from becoming predatory? Obviously, I’m all in favor of having systems to catch missing-stair rapists. But if you’re going to build an “immune system” to delegitimize anyone “held in low regard” without having to do the work of engaging with their arguments—without explaining why an amorphous mob that holds something in “low regard” can be trusted to reach that judgement for reliably good reasons—then you’re just running a cult. And if enough people who remember the spirit of the Sequences notice their beloved rationality community getting transformed into a cult, then you might have a rationalist civil war on your hands.
(Um, sorry if that’s too ominous or threatening of a phrasing. I think we mostly want the same thing, but have been following different strategies and exposed to different information, and I notice myself facing an incentive to turn up the rhetoric and point menacingly at my BATNA in case that helps with actually being listened to, because recent experiences have trained my brain to anticipate that even high-ranking “rationalists” are more interested in avoiding social threat than listening to arguments. As I’m sure you can also see, this is already a very bad sign of the mess we’re in.)
[1] “Worried” is an understatement. It’s more like panicking continuously all year with many hours of lost sleep, crying fits, pacing aimlessly instead of doing my dayjob, and eventually doing enough trauma processing to finish writing my forthcoming 20,000-word memoir explaining in detail (as gently and objectively as possible while still telling the truth about my own sentiments and the world I see) why you motherfuckers are being incredibly intellectually dishonest (with respect to a sense of “intellectual dishonesty” that’s about behavior relative to knowledge, not conscious verbal “intent”).
[2] Notably, written at a time when Yudkowsky and “the community” had a lower public profile and therefore faced less external social pressure. This is not a coincidence because nothing is ever a coincidence.
This model seems to assume that the “oversight” of the “normal system” at the center of the gravity well is trustworthy.
On the core point, I think you improve / fix problems with the normal system in the boring, hard ways, and do deeply appreciate you championing particular virtues even when I disagree on where the balance of virtues lies.
I find something off-putting here about the word “trustworthy,” because I feel like it’s a 2-place word; I think of oversight as something like “good enough to achieve standard X”, whereas “trustworthy” alone seems to imply there’s a binary standard that is met or not (and has been met). It seems like we could easily have very different standards for trustworthiness that cause us to not disagree on the facts while disagreeing on the implications.
(Somehow, it reminds me of this post and Caledonian’s reaction to it.)
Have you thought at all about how to prevent the center of the gravity well from becoming predatory?
Yes. Mostly this has focused on recruitment work for MIRI, where we really don’t want to guilt people into working on x-risk reduction (not only is it predatory, it’s also a recipe for them burning out instead of being productive, so morality and efficiency obviously align), and yet most of the naive ways to ask people to consider working on x-risk reduction risk guilting them, and you need a more sophisticated way to remove that failure mode than just saying “please don’t interpret this as me guilting you into it!”. This is a thing I’ve already written about; parts of my longer thoughts are here.
And, obviously, when I think about moderating LessWrong, I think about how to not become corrupt myself, and what sorts of habits and systems lower the chances of that, or make it more obvious if it does happen.
it’s a 2-place word [...] It seems like we could easily have very different standards for trustworthiness that cause us to not disagree on the facts while disagreeing on the implications.
Right, I agree that we don’t want to get into a pointless pseudo-argument where everyone agrees that x = 60, and yet we have a huge shouting match over whether this should be described using the English word “large” or “small.”
Maybe a question that would lead to a more meaningful disagreement would be, “Should our culture become more or less centralized?”—where centralized is the word I’m choosing to refer to a concept I’m going to try to describe extensionally/ostensively in the following two paragraphs.[1]
A low-centralization culture has slogans like, “Nullius in verba” or “Constant vigilance!”. If a fringe master sets up shop on the outskirts of town, the default presumption is that (time permitting) you should “consider it open-mindedly and then steal only the good parts [...] [as] an obvious guideline for how to do generic optimization”, not because most fringe masters are particularly good (they aren’t), but because thinking for yourself actually works and it’s not like our leaders in the town center know everything already.

In a high-centralization culture, there’s a stronger presumption that our leaders in the town center come closer to knowing everything already, and that the reasoning styles or models being hawked by fringe masters are likely to “contain traps that the people absorbing the model are unable to see”: that is, thinking for yourself doesn’t work. As a result, our leaders might talk up “the value of having a community-wide immune system” so that they can “act against people who are highly manipulative and deceitful before they have clear victims.” If a particular fringe master starts becoming popular, our leaders might want to announce that they are “actively hostile to [the fringe master], and make it clear that [we] do not welcome support from those quarters.”
You seem to be arguing that we should become more centralized. I think that would be moving our culture in the absolute wrong direction. As long as we’re talking about patterns of adversarial optimization, I have to say that, to me, this kind of move looks optimized for “making it easier to ostracize and silence people who could cause trouble for MIRI and CfAR (e.g., Vassar or Ziz), either by being persistent critics or by embarrassing us in front of powerful third parties who are using guilt-by-association heuristics”, rather than improving our collective epistemics.
This seems like a substantial disagreement, rather than a trivial Sorites problem about how to use the word “trustworthy”.
do deeply appreciate you championing particular virtues even when I disagree on where the balance of virtues lies

Thanks. I like you, too.

[1] I just made this up, so I’m not at all confident this is the right concept, much like how I didn’t think contextualizing-vs.-decoupling was the right concept.
not because most fringe masters are particularly good (they aren’t), but because thinking for yourself actually works and it’s not like our leaders in the town center know everything already.
I think the leaders in the town center do not know everything already. I think different areas have different risks when it comes to “thinking for yourself.” It’s one thing to think you can fly and jump off a roof yourself, and another thing to think it’s fine to cook for people when you’re Typhoid Mary, and I worry that you aren’t drawing a distinction here between those cases.
I have thought about this a fair amount, but am not sure I’ve discovered the right conceptual lines here, and would be interested in how you would distinguish between the two cases, or if you think they are fundamentally equivalent, or that one of them isn’t real.
You seem to be arguing that we should become more centralized. I think that would be moving our culture in the absolute wrong direction.
In short, I think there are some centralizing moves that are worth it, and others that aren’t, and that we can choose policies individually instead of just throwing the lever on “centralization: Y/N”. Well-Kept Gardens Die by Pacifism is ever relevant; here, the thing that seems relevant to me is that there are some basic functions that need to happen (like, say, the removal of spam), and fulfilling those functions requires tools that could also be used for nefarious functions (as we could just mark criticisms of MIRI as ‘spam’ and they would vanish). But the conceptual categories that people normally have for this are predicated on the interesting cases; sure, both Nazi Germany and WWII America imprisoned rapists, but the interesting imprisonments are of political dissidents, and we might prefer WWII America because it had many fewer such political prisoners, and further prefer a hypothetical America that had no political prisoners. But this spills over into the question of whether we should have prisons or justice systems at all, and I think people’s intuitions on political dissidents are not very useful for what should happen with the more common sort of criminal.
Like, it feels almost silly to have to say this, but I like it when people put forth public positions that are critical of an idea I favor, because then we can argue about it and it’s an opportunity for me to learn something, and I generally expect the audience to be able to follow it and get things right. Like, I disagreed pretty vociferously with The AI Timelines Scam, and yet I thought the discussion it prompted was basically good. It did not ping my Out To Get You sensors in the way that ialdabaoth does. To me, this feels like a central example of the sort of thing you see in a less centralized culture where people are trying to think things through for themselves and end up with different answers, and is not at risk from this sort of moderation.

I don’t think this conversation is going to make any progress at this level of abstraction and in public. I might send you an email.

I look forward to receiving it.
“Worried” is an understatement. It’s more like panicking continuously all year with many hours of lost sleep, crying fits, pacing aimlessly instead of doing my dayjob, and eventually doing enough trauma processing to finish writing my forthcoming 20,000-word memoir explaining in detail (as gently and objectively as possible while still telling the truth about my own sentiments and the world I see) why you motherfuckers are being incredibly intellectually dishonest (with respect to a sense of “intellectual dishonesty” that’s about behavior relative to knowledge, not conscious verbal “intent”).
I think I’m someone who might be sympathetic to your case, but just don’t understand what it is, so I’m really curious about this “memoir”. (Let me know if you want me to read your current draft.) Is this guess, i.e., that you’re worried about the community falling prey to runaway virtue signaling, remotely close?

Close! I’ll PM you.