Another common werewolf move is to take advantage of strong norms like epistemic honesty, using them to drive wedges in a community or to push an agenda, secure in the knowledge that calling this out would be akin to attacking the community’s norms.
I’ve seen the meme elsewhere in the rationality community that strong and rigid epistemic norms are a good sociopath repellent, and it’s ALMOST right. The truth is that competent sociopaths (in the Venkat Rao sense) are great at using rigid norms for their own ends, and just as great at using the truth for their own ends. The reason the meme might work well in the rationality community (besides the obvious fact that sociopaths are even better at using lies for their own ends than the truth) is that strong epistemics are very close to what we’re actually fighting for, and remembering and always orienting towards the mission is ACTUALLY an effective first-line defense against sociopaths (necessary but not sufficient IMO).
99 times out of 100, the correct way to remember what we’re fighting for is to push for stronger epistemics above other considerations. I knew that when I made the original post, and I made it expecting pushback for attacking a core value of the community.
However, 1 time out of 100 the correct way to remember what you’re fighting for is to realize that you have to sacrifice a sacred value for the greater good. And when you see someone explicitly pushing the gray area, trying to get you to accept harmful situations by appealing to that sacred value, it’s important to make clear (mostly to other people in the community) that sacrificing that value is an option.
What specifically do you mean by “werewolf” here & how do you think it relates to the way Jessica was using it? I’m worried that we’re getting close to just redefining it as a generic term for “enemies of the community.”
By werewolf I meant something like “someone who is pretending to be working for the community as a member, but is actually working for their own selfish ends”. I thought Jessica was using it in the same way.
That’s not what I meant. I meant specifically someone who is trying to prevent common knowledge from being created (and more generally, to gum up the works of “social decisionmaking based on correct information”), as in the Werewolf party game.
Worth noting: “werewolf” as a jargon term strikes me as something that is inevitably going to get collapsed into “generic bad actor” over time, if it gets used a lot. I’m assuming that you’re thinking of it as sort of in the “preformal” stage, where it doesn’t make sense to over-optimize the terminology. But if you’re going to keep using it, I think it’d make sense to come up with a term that’s somewhat more robust against getting interpreted that way.
(random default suggestion: “obfuscator”. Other options I came up with required multiple words to get the point across and ended up too convoluted. There might be a fun shorthand for a type of animal or mythological figure that a) is a predator or parasite, and b) relies on making things cloudy. So far I could only come up with “squid”, due to ink jets, but it didn’t really have the right connotations.)
That is a bit more specific than what I meant. In this case, though, the second, broader meaning of “someone who’s trying to gum up the works of social decisionmaking” still works in the context of the comment.
And when you see someone explicitly pushing the gray area, trying to get you to accept harmful situations by appealing to that sacred value
Um, in context, this sounds to me like you’re arguing that by writing “Where to Draw the Boundaries?” and my secret (“secret”) blog, I’m trying to get people to accept harmful situations? Am I interpreting you correctly? If so, can you explain in detail what specific harm you think is being done?
Sorry, I was trying to be really careful as I was writing not to accuse you specifically of bad intentions, but obviously that’s hard in a conversation like this where we’re jumping between the meta level and the object level.
It’s important to distinguish a couple things.
1. Jessica and I were talking about people with negative intentions in the last two posts. I’m not claiming that you’re one of those people, deliberately using this type of argument to cause harm.
2. I’m not claiming that the writing of those two posts was harmful in the way we were talking about. I was claiming that the long post you wrote at the top of the thread, where you made several analogies about your response, was exactly the sort of gray-area situation where, depending on context, the community might decide to sacrifice its sacred value. At the same time, you were banking on the fact that it was a sacred value to say “even in this case, we would uphold the sacred value.” This has the same structure as the werewolf move mentioned above, and it was important for me to speak up, even if you’re not a werewolf.
Thanks for clarifying!
people with negative intentions [...] deliberately
So, it’s actually not clear to me that deliberate negative intentions are particularly important, here or elsewhere? Almost no one thinks of themselves as deliberately causing avoidable harm, and yet avoidable harm gets done, probably by people following incentive gradients that predictably lead towards harm, against truth, &c. all while maintaining a perfectly sincere subjective conscious narrative about how they’re doing God’s work, on the right side of history, toiling for the greater good, doing what needs to be done, maximizing global utility, acting in accordance with the moral law, practicing a virtue which is nameless, &c.
it was important for me to speak up, even if you’re not a werewolf.
Agreed. If I’m causing harm, and you acquire evidence that I’m causing harm, then you should present that evidence in an appropriate venue in order to either persuade me to stop causing harm, or persuade other people to coördinate to stop me from causing harm.
I was claiming that the long post you wrote at the top of the thread, where you made several analogies about your response, was exactly the sort of gray-area situation where, depending on context, the community might decide to sacrifice its sacred value.
So, my current guess (which is only a guess and which I would have strongly disagreed with ten years ago) is that this is a suicidally terrible idea that will literally destroy the world. Sound like an unreflective appeal to sacred values? Well, maybe!—you shouldn’t take my word for this (or anything else) except to the exact extent that you think my word is Bayesian evidence. Unfortunately I’m going to need to defer supporting argumentation to future Less Wrong posts, because mental and financial health requirements force me to focus on my dayjob for at least the next few weeks. (Oh, and group theory.)
(End of thread for me.)
So, it’s actually not clear to me that deliberate negative intentions are particularly important, here or elsewhere?
(Responding, but I don’t expect a response back, since you’re busy.)
I used to think this, but I’ve since realized that intentions STRONGLY matter. It seems like systems are fractal: the goals of the subparts/subagents get reflected in the goals of the broader system. People with aligned intentions will tend to shift the incentive gradients, as will people with unaligned intentions (of course, this isn’t a one-way relationship; the incentive gradients will also shift the intentions).
I deny that your approach ever has an advantage over recognizing that definitions are tools which have no truth values, and then digging into goals or desires.