It’s important to distinguish the question of whether, in your own personal decisionmaking, you should ever do things that aren’t maximally epistemically good (obviously, yes); from the question of whether the discourse norms of this website should tolerate appeals to consequences (obviously, no).
I agree it’s important to realize that these things are fundamentally different.
It might be morally right, in some circumstances, to pass off a false mathematical proof as a true one (e.g. in a situation where it is useful to obscure some mathematical facts related to engineering weapons of mass destruction). It’s still a violation of the norms of mathematics, with good reason. And it would be very wrong to argue that the norms of mathematics should change to accommodate people making this (by assumption, morally right) choice.
A better norm of mathematics might be to NOT publish proofs that have obvious negative consequences like enabling weapons of mass destruction, and have a norm that actively disincentivizes people who publish that sort of research.
In other words, a norm might be to basically be epistemically pure, UNLESS the local instrumental considerations outweigh the cost to the epistemic climate. This can be rounded down to “have norms about epistemics and break them sometimes,” but only if, when someone points at edge cases where the norms are actively harmful, they’re challenged with the point that breaking those norms is sometimes perfectly OK.
I.e., if someone is using the norms of the community as a weapon, it’s important to point out that the norms are a means to an end, and that the community won’t blindly allow itself to be taken advantage of.
I think my actual concern with this line of argumentation is: if you have a norm of “If ‘X’ and ‘X implies Y’ then ‘Y’, EXCEPT when it’s net bad to have concluded ‘Y’”, then the werewolves win.
The question of whether it’s net bad to have concluded ‘Y’ is much, much more complicated than the question of whether, logically, ‘Y’ is true under these assumptions (of course, it is). There are many, many more opportunities for werewolves to gum up the works of this process, making the calculation come out wrong.
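To make the contrast concrete, here is a minimal sketch in Lean 4 of the inference being discussed. The theorem name and hypothesis names are illustrative choices, not anything from the thread. The logical step fits in one line, whereas the question of whether concluding ‘Y’ is net bad has no comparably short formal statement.

```lean
-- Modus ponens: from a proof of X and a proof of X → Y, obtain a proof of Y.
-- `modus_ponens`, `hX`, and `hXY` are hypothetical names chosen for illustration.
theorem modus_ponens (X Y : Prop) (hX : X) (hXY : X → Y) : Y :=
  hXY hX
```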
If we’re having a discussion about X and Y, someone moves to propose ‘Y’ (because, as it has already been agreed, ‘X’ and ‘X implies Y’), and then someone else says “no, we can’t do that, that has negative consequences!”, that second person is probably playing a werewolf strategy, gumming up the works of the epistemic substrate.
If we are going to have the exception to the norm at all, then there has to be a pretty high standard of evidence to prove that adding ‘Y’ to the discourse, in fact, has bad consequences. And, to get the right answer, that discussion itself is going to have to be up to high epistemic standards. To be trustworthy, it’s going to have to make logical inferences much more complex than “if ‘X’ and ‘X implies Y’, then ‘Y’”. What if someone objects to those logical inference steps, on the basis that they would have negative consequences? Where does that discussion happen?
In practice, these questions aren’t actually answered. In practice, what happens is that social epistemology doesn’t happen, and instead everything becomes about coalitional politics. Saying ‘Y’ doesn’t mean ‘Y is literally true’, it means you’re part of the coalition of people who want consequences related to (but not even necessarily directly implied by!) the statement ‘Y’ to be put into effect, and that makes you blameworthy if those consequences hurt someone sympathetic, or that coalition is bad. Under such conditions, it is a major challenge to re-establish epistemic discourse, because everything is about violence, including attempts to talk about the “we don’t have epistemology and everything is about violence” problem.
We have something approaching epistemic discourse here on LessWrong, but we have to defend it, or it, too, becomes all about coalitional politics.
If we are going to have the exception to the norm at all, then there has to be a pretty high standard of evidence to prove that adding ‘Y’ to the discourse, in fact, has bad consequences.
I want to note that LW definitely has exceptions to this norm, if only because of the boring, normal exceptions. (If we would get in trouble with law enforcement for hosting something you might put on LW, don’t put it on LW.) We’ve had in the works (for quite some time) a post explaining our position on less boring cases more clearly, but it runs into difficulty with the sort of issues that you discuss here; generally these questions are answered in private in a way that connects to the judgment calls being made and the particulars of the case, as opposed to through transparent principles that can be clearly understood and predicted in advance (in part because, to extend the analogy, this empowers the werewolves as well).
Another common werewolf move is to take advantage of strong norms like epistemic honesty, and use them to drive wedges in a community or push their agenda, while knowing they can’t be called out because doing so would be akin to attacking the community’s norms.
I’ve seen the meme elsewhere in the rationality community that strong and rigid epistemic norms are a good sociopath repellent, and it’s ALMOST right. The truth is that competent sociopaths (in the Venkat Rao sense) are actually great at using rigid norms for their own ends, and are great at using the truth for their own ends as well. The reason it might work well in the rationality community (besides the obvious fact that sociopaths are even better at using lies to their own ends than the truth) is that strong epistemics are very close to what we’re actually fighting for, and remembering and always orienting towards the mission is ACTUALLY an effective first line of defense against sociopaths (necessary but not sufficient IMO).
99 times out of 100, the correct way to remember what we’re fighting for is to push for stronger epistemics above other considerations. I knew that when I made the original post, and I made it knowing I would get pushback for attacking a core value of the community.
However, 1 time out of 100 the correct way to remember what you’re fighting for is to realize that you have to sacrifice a sacred value for the greater good. And when you see someone explicitly pushing the gray area by trying to get you to accept harmful situations by appealing to that sacred value, it’s important to make clear (mostly to other people in the community) that sacrificing that value is an option.
What specifically do you mean by “werewolf” here & how do you think it relates to the way Jessica was using it? I’m worried that we’re getting close to just redefining it as a generic term for “enemies of the community.”
By werewolf I meant something like “someone who is pretending to be working for the community as a member, but is actually working for their own selfish ends”. I thought Jessica was using it in the same way.
That’s not what I meant. I meant specifically someone who is trying to prevent common knowledge from being created (and more generally, to gum up the works of “social decisionmaking based on correct information”), as in the Werewolf party game.
Worth noting: “werewolf” as a jargon term strikes me as something that is inevitably going to get collapsed into “generic bad actor” over time, if it gets used a lot. I’m assuming that you’re thinking of it sort of as in the “preformal” stage, where it doesn’t make sense to over-optimize the terminology. But if you’re going to keep using it I think it’d make sense to come up with a term that’s somewhat more robust against getting interpreted that way.
(random default suggestion: “obfuscator”. Other options I came up with required multiple words to get the point across and ended up too convoluted. There might be a fun shorthand for a type of animal or mythological figure that is a) a predator or parasite, b) relies on making things cloudy. So far I could just come up with “squid” due to ink jets, but it didn’t really have the right connotations)
That is a bit more specific than what I meant. In this case though, the second, broader meaning of “someone who’s trying to gum up the works of social decisionmaking” still works in the context of the comment.
And when you see someone explicitly pushing the gray area by trying to get you to accept harmful situations by appealing to that sacred value
Um, in context, this sounds to me like you’re arguing that by writing “Where to Draw the Boundaries?” and my secret (“secret”) blog, I’m trying to get people to accept harmful situations? Am I interpreting you correctly? If so, can you explain in detail what specific harm you think is being done?
Sorry, I was trying to be really careful as I was writing not to accuse you specifically of bad intentions, but obviously it’s hard in a conversation like this where you’re jumping between the meta and the object-level.
It’s important to distinguish a couple things.
1. Jessica and I were talking about people with negative intentions in the last two posts. I’m not claiming that you’re one of those people who are deliberately using this type of argument to cause harm.
2. I’m not claiming that the writing of those two posts was harmful in the way we were talking about. I was claiming that the long post you wrote at the top of the thread, where you made several analogies about your response, presented exactly the sort of gray area situations where, depending on context, the community might decide to sacrifice its sacred value. At the same time, you were banking on the fact that it was a sacred value to say “even in this case, we would uphold the sacred value.” This has the same structure as the werewolf move mentioned above, and it was important for me to speak up, even if you’re not a werewolf.
people with negative intentions [...] deliberately
So, it’s actually not clear to me that deliberate negative intentions are particularly important, here or elsewhere? Almost no one thinks of themselves as deliberately causing avoidable harm, and yet avoidable harm gets done, probably by people following incentive gradients that predictably lead towards harm, against truth, &c. all while maintaining a perfectly sincere subjective conscious narrative about how they’re doing God’s work, on the right side of history, toiling for the greater good, doing what needs to be done, maximizing global utility, acting in accordance with the moral law, practicing a virtue which is nameless, &c.
it was important for me to speak up, even if you’re not a werewolf.
Agreed. If I’m causing harm, and you acquire evidence that I’m causing harm, then you should present that evidence in an appropriate venue in order to either persuade me to stop causing harm, or persuade other people to coördinate to stop me from causing harm.
I was claiming that the long post you wrote at the top of the thread, where you made several analogies about your response, presented exactly the sort of gray area situations where, depending on context, the community might decide to sacrifice its sacred value.
So, my current guess (which is only a guess and which I would have strongly disagreed with ten years ago) is that this is a suicidally terrible idea that will literally destroy the world. Sound like an unreflective appeal to sacred values? Well, maybe!—you shouldn’t take my word for this (or anything else) except to the exact extent that you think my word is Bayesian evidence. Unfortunately I’m going to need to defer supporting argumentation to future Less Wrong posts, because mental and financial health requirements force me to focus on my dayjob for at least the next few weeks. (Oh, and group theory.)
So, it’s actually not clear to me that deliberate negative intentions are particularly important, here or elsewhere?
(Responding, and I don’t expect another response back because you’re busy.)
I used to think this, but I’ve since realized that intentions STRONGLY matter. It seems like a system is fractal: the goals of the subparts/subagents get reflected in the goal of the broader system. People with aligned intentions will tend to shift the incentive gradients, as will people with unaligned intentions (of course, this isn’t a one-way relationship; the incentive gradients will also shift the intentions).
I deny that your approach ever has an advantage over recognizing that definitions are tools which have no truth values, and then digging into goals or desires.
Thanks for clarifying!
(End of thread for me.)