Bad intent is a disposition, not a feeling
It’s common to think that someone else is arguing in bad faith. In a recent blog post, Nate Soares claims that this intuition is both wrong and harmful:
I believe that the ability to expect that conversation partners are well-intentioned by default is a public good. An extremely valuable public good. When criticism turns to attacking the intentions of others, I perceive that to be burning the commons. Communities often have to deal with actors that in fact have ill intentions, and in that case it’s often worth the damage to prevent an even greater exploitation by malicious actors. But damage is damage in either case, and I suspect that young communities are prone to destroying this particular commons based on false premises.
To be clear, I am not claiming that well-intentioned actions tend to have good consequences. The road to hell is paved with good intentions. Whether or not someone’s actions have good consequences is an entirely separate issue. I am only claiming that, in the particular case of small high-trust communities, I believe almost everyone is almost always attempting to do good by their own lights. I believe that propagating doubt about that fact is nearly always a bad idea.
If bad intent were so rare in the relevant sense, it would be surprising that people are so quick to jump to the conclusion that it is present. Why would that be adaptive?
What reason do we have to believe that we're systematically overestimating the prevalence of bad intent? And if we are overestimating it, why should we believe that it's adaptive to suppress the intuition?
There are plenty of reasons why we might make systematic errors on things that are too infrequent or too inconsequential to yield a lot of relevant-feeling training data or matter much for reproductive fitness, but social intuitions are a central case of the sort of things I would expect humans to get right by default. I think the burden of evidence is on the side disagreeing with the intuitions behind this extremely common defensive response, to explain what bad actors are, why we are on such a hair-trigger against them, and why we should relax this.
Nate continues:
My models of human psychology allow for people to possess good intentions while executing adaptations that increase their status, influence, or popularity. My models also don’t deem people poor allies merely on account of their having instinctual motivations to achieve status, power, or prestige, any more than I deem people poor allies if they care about things like money, art, or good food. […]
One more clarification: some of my friends have insinuated (but not said outright as far as I know) that the execution of actions with bad consequences is just as bad as having ill intentions, and we should treat the two similarly. I think this is very wrong: eroding trust in the judgement or discernment of an individual is very different from eroding trust in whether or not they are pursuing the common good.
Nate’s argument is almost entirely about mens rea—about subjective intent to make something bad happen. But mens rea is not really a thing. He contrasts this with actions that have bad consequences, which are common. But there’s something in the middle: following an incentive gradient that rewards distortions. For instance, if you rigorously A/B test your marketing until it generates the presentation that attracts the most customers, and don’t bother to inspect why they respond positively to the result, then you’re simply saying whatever words get you the most customers, regardless of whether they’re true. In such cases, whether or not you ever formed a conscious intent to mislead, your strategy is to tell whichever lie is most convenient; there was nothing in your optimization target that forced your words to be true ones, and most possible claims are false, so you ended up making false claims.
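To make the incentive-gradient point concrete, here is a minimal sketch in Python; the variant texts, the measurement function, and the numbers are all hypothetical. The point is only that the selection loop's objective is conversions, with no term for truth, so the accuracy of the winning copy is incidental.

```python
import random

# Hypothetical ad-copy variants, tagged with whether each claim is accurate.
# The selection loop below never consults the accuracy flag.
variants = [
    ("Results guaranteed in 7 days!", False),
    ("Most users see results within a month.", True),
    ("Clinically proven miracle formula!", False),
]

def observed_conversion_rate(text):
    """Stand-in for a real A/B test: returns the measured sign-up rate for a variant."""
    return random.random()  # placeholder for noisy market feedback

# Pick whichever variant converts best. Truth never enters the objective,
# so any relationship between the winner and accuracy is coincidental.
winning_text, is_accurate = max(variants, key=lambda v: observed_conversion_rate(v[0]))
print("Shipping:", winning_text)
```

If the false claims happen to convert better, this procedure ships them without anyone ever having decided to lie.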
More generally, if you try to control others’ actions, and don’t limit yourself to doing that by honestly informing them, then you’ll end up with a strategy that distorts the truth, whether or not you meant to. The default state for any given constraint is that it has not been applied to someone’s behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It’s clear to me that we should expect this to sometimes be the case—sometimes people perceive a convergent incentive to inform one another, rather than a divergent incentive to grab control. But, if you do not defend yourself and your community against divergent strategies unless there is unambiguous evidence, then you make yourself vulnerable to those strategies, and should expect to get more of them.
I’ve been criticizing EA organizations a lot for deceptive or otherwise distortionary practices (see here and here), and one response I often get is, in effect, “How can you say that? After all, I’ve personally assured you that my organization never had a secret meeting in which we overtly resolved to lie to people!”
Aside from the obvious problems with assuring someone that you’re telling the truth, this is generally something of a non sequitur. Your public communication strategy can be publicly observed. If it tends to create distortions, then I can reasonably infer that you’re following some sort of incentive gradient that rewards some kinds of distortions. I don’t need to know about your subjective experiences to draw this conclusion. I don’t need to know your inner narrative. I can just look, as a member of the public, and report what I see.
Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing. And besides, it wouldn’t be so common if it required an exceptionally bad character. But it has to be OK to point out when people are not just mistaken, but following patterns of behavior that are systematically distorting the discourse—and to point this out publicly so that we can learn to do better, together.
(Cross-posted at my personal blog.)
[EDITED 1 May 2017 - changed wording of title from “behavior” to “disposition”]
I agree that most relevant bad behavior isn’t going to feel from the inside like an attempt to mislead, and I think that rationalists sometimes either ignore this or else have an unfounded optimism about nominal alignment.
In the evolutionary context, our utterances and conscious beliefs are optimized for their effects on others, and not merely for accuracy. Believing and claiming bad things about competitors is a typical strategy. Prima facie, accusations of bad faith are particularly attractive since they can be levied on sparse evidence yet are rationally compelling. Empirically, accusations of bad faith are particularly common.
This makes an interesting contrast with the content of the post. The feeling that some people are bad is a strong and central social intuition. Do you think you’ve risen to the standard of evidence you are requesting here? It seems to me that you are largely playing the same game people normally play, and then trying to avoid norms that regulate the game by disclaiming “I’m not playing the game.”
For the most part these procedural issues seem secondary to disputes about facts on the ground. But all else equal they’re a reason to prefer object-level questions to questions about intent, logical argument and empirical data to intuition, and private discussion to public discussion.
Nope! Good point.
Here’s a specific outcome I would like to avoid: ganging up on the individuals saying misleading things, replacing them with new individuals who have better track records, but doing nothing to alter the underlying incentives. That would be really bad. I think we actually have exceptionally high-integrity individuals in critical leadership positions now, in ways that make the problem of perceived incentive to distort much easier to solve than it might otherwise be.
I don’t actually know how not to play the same old game yet, but I am trying to construct a way.
I see you aiming to construct a way and making credible progress, but I worry that you’re trying to do too many things at once and are going to cause lasting damage by the time you figure it out.
Specifically, the “confidence game” framing of the previous post moved it from “making an earnest good-faith effort to talk about things” to “the majority of the post’s content is making a status move”[1] (particularly in the context of your other recent posts, an effect exacerbated by this one), and if I were using the framing of this current post I’d say both the previous post and this one have bad intent.
I don’t think that’s a good framing—I think it’s important that you (and folk at OpenPhil and at CEA) do not just have an internally positive narrative but are actually trying to do things that cash out to “help each other” (in a broad sense of “help each other”). But I’m worried that this will not remain the case much longer if you continue on your current trajectory.
A year ago, I was extremely impressed with the work you were doing and points you were making, and frustrated that those points were not having much impact.
My perception was that “EA Has A Lying Problem” was an inflection point: yes, people started actually paying attention to the class of criticism you’re doing, but the mechanism by which they started paying attention was critics invoking rhetoric and courting controversy, which was approximately as bad as the problem it was trying to solve (or at least, within an order of magnitude as bad).
[1] I realize there was a whole lot of other content of the Confidence Game post that was quite good. But, like, the confidence game part is the part I remember easily. Which is the problem.
Could you say more about which things you think I should be doing separately instead of together, and why?
Things I notice you doing:
1. Meta discussion of how to have conversations / high quality discourse / why this is important
2. Evaluating OpenPhil and CEA as institutions, in a manner that’s aiming to be evenhanded and fair
3. Making claims about and discussing OpenPhil and CEA in ways that seem pretty indistinguishable from “punishing them and building public animosity towards them.”
Because of #3, I think it’s a lot harder to have credibility when doing #1 or #2. I think there is now enough history with #3 (perceived or actual, whatever your intent) that if you want to be able to do #1 or #2 you need to signal pretty hard that you’re not doing #3 anymore, and specifically take actions aimed at rebuilding trust. (And if you were doing #3 by accident, this includes figuring out why your process was outputting something that looked like #3.)
I have thoughts about “what to do to cause OpenPhil and CEA to change their behavior” which’ll be a response to tristanm’s comment.
Um, this notion that publicly criticizing organizations such as OpenPhil and CEA amounts to unhelpfully “punishing them and building public animosity towards them”, and thus is per se something to be avoided, is exactly one of the glaring issues “EA has a Lying Problem” (specifically, the subsection Criticizing EA orgs is harmful to the movement) was about. Have we learned nothing since then?
I think I’m mostly going to have to retreat to “this is a very important conversation that I would very much like to have over skype but I think online text is not a good medium for it.”
But we’ve had this conversation online, when EA Has A Lying Problem was first posted. Some worthwhile points were raised that are quite close to your position here, such as the point that unrealistic standards of idealism/virtue, honesty and prompt response to criticism (that is, unrealistic for broadly any real-world institution) could undermine the very real progress that EA orgs are hopefully making, compared to most charitable organizations. This is very much true, but the supposed implication that any and all internal critiques are per se harmful simply doesn’t follow!
I wasn’t saying any and all critiques are harmful—the specific thing I was saying was “these are three things I see you doing right now, and I don’t think you can do all of those within a short timespan.”
Independently, I also think some-but-not-all of the specific critiques you are making are harmful, but that wasn’t the point I was making at the time.
The reason I’d much prefer to have the conversation in person is because by now the entire conversation is emotionally charged (at least for me, and it looks like for you), in a way that is counterproductive. Speaking only for myself, I know that in an in person conversation where I can read facial expressions, I can a) more easily maintain empathy throughout the process, b) as soon as I hit a point where either we disagree, or where the conversation is getting heated, it’s a lot easier to see that, step back and say “okay let’s stop drop and doublecrux.” (And, hopefully, often realize that something was a simple misunderstanding rather than a disagreement)
Online, there are two options at any given point: write out a short point, or write out a long point. If I write out a short point, it won’t actually address all the things I’m trying to point at. If I write a long point, at least one thing will probably be disagreed with or misunderstood, which will derail the whole post.
A) I think this is probably a good thing to do when an online conversation is accumulating drama and controversy.
B) Even if it’s not, I very much want to test it out and find out if it works.
Sometimes a just and accurate evaluation shows that someone’s not as great as they said they were. I’m not trying to be evenhanded in the sense of never drawing any conclusions, and I don’t see the value in that.
Overall, a lot of this feels to me like asking me to do more work, with no compensation, and no offers of concrete independent help, and putting the burden of making the interaction go well on the critic.
It would have been very, very helpful at that time to have public evidence that anyone at all agreed or at least thought that particular points I was making needed an answer. I’m getting that now, I wasn’t getting that then, so I find it hard to see the appeal in going back to a style that wasn’t working.
That was a blog post by Sarah Constantin. I am not Sarah Constantin. I wrote my own post in response and about the same things, which no one is bringing up here because no one remembers it. It got a bit of engagement at the time, but I think most of that was spillover from Sarah’s post.
If you want higher-quality discourse, you can engage more publicly with what you see as the higher-quality discourse. My older posts are still available to engage with on the public internet, and were written to raise points that would still be relevant in the future.
I agree that the “confidence game” framing, and particularly the comparison to a Ponzi scheme, seemed to me like surprisingly charged language, and not the kind of thing you would do if you wanted a productive dialogue with someone.
I’m not sure whether Benquo means for it to come across that way or not. (Pro: maybe he has in fact given up on direct communication with OpenPhil, and thinks his only method of influence is riling up their base. Con: maybe he just thought it was an apt metaphor and didn’t model it as a slap-in-the-face, like I did. Or maybe something else I’m missing.)
Just to add another datapoint, I read it as strongly hostile, more like aiming at delegitimizing the target in the eyes of others than at starting a constructive discussion with them.
If his goal is to actually convince EA organizations to change their behavior, then it could be argued that his rhetorical tactics are in fact likely to be the most effective way of actually achieving that. We should not underestimate the effectiveness of strategies that work by negative PR or by using rhetorical as opposed to strictly argumentative language. I would argue they actually have a pretty good track record of getting organizations to change, without completely destroying the organization (or an associated movement). Uber and United have probably just gone through some of the worst negative coverage it is possible to undergo, and yet the probability that either of them will be completely destroyed by that is almost negligible. On the other hand, the probability that they will improve due to the negative controversy is quite high by my estimation.
Noting the history of organizations that have been completely wiped out by scandal or controversy, it is usually the case that they failed to accomplish their primary goal (such as maximizing shareholder value), and typically in a catastrophic or permanent way that indicated almost beyond doubt that they would never be able to accomplish that goal. It is generally not enough that their leaders acted immorally or unethically (since they can usually be replaced), or that they fail at a subgoal (because subgoals tend to be easier to modify). And since EA is not a single organization, but is better understood as a movement, it is unlikely that the entire movement will be crippled by even a major controversy in one of its organizations. It’s really hard to destroy philosophies.
OpenPhil leadership stated that responding to criticisms and being transparent about their decision-making is a highly costly action to take. And I think it has been well-argued at this point (and not in a purely rhetorical way) that EA organizations are so strongly motivated against taking these actions (as judged by observation of their actions), that they may even occasionally act in the opposite direction. Therefore, if there exist convincing arguments that they are engaging in undesirable behavior, and given that we fairly well know that they are acting on strong incentives, then it follows that in order to change their behavior, they need to be strongly motivated in the other direction. It is not, in general, possible to modify an agent’s utility function by reasoning alone. All rational agents are instrumentally motivated to preserve their preferences and resist attempts at modification.
My argument is not that we need to resort to sensationalist tactics, but only that purely argumentative strategies that offer no negative cost to the organization in question are unlikely to be effective either. And additionally that actions that add this cost are unlikely to be so costly that they result in permanent or unrecoverable damage.
I agree that this is a big and complicated deal and “never resort to sensationalist tactics” isn’t a sufficient answer for reasons close to what you describe. I’m not sure what the answer is, but I’ve been thinking about ideas.
Basically, I think we automatically fail if we have no way to punish defectors, and we also automatically fail if controversy/sensationalism-as-normally-practiced is our main tool for doing so.
I think the threat of sensationalist tactics needs to be real. But it needs to be more like Nuclear Deterrence than it is like tit-for-tat warfare.
We’ve seen where sensationalism/controversy leads—American journalism. It is a terrible race to the bottom of inducing as much outrage as you can. It is anti-epistemic, anti-instrumental, anti-everything. Once you start down the dark path, forever will it dominate your destiny.
I am very sympathetic to the fact that Ben tried NOT doing that, and it didn’t work.
Comments like this make me want to actually go nuclear, if I’m already not getting credit for avoiding doing so.
I haven’t really called anyone in the community names. I’ve worked hard to avoid singling people out, and instead tried to make the discussion about norms and actions, not persons. I haven’t tried to organize any material opposition to the interests of the organizations I’m criticizing. I haven’t talked to journalists about this. I haven’t made any efforts to widely publicize my criticisms outside of the community. I’ve been careful to bring up the good points as well as the bad of the people and institutions I’ve been criticizing.
I’d really, really like it if there were a way to get sincere constructive engagement with the tactics I’ve been using. They’re a much better fit for my personality than the other stuff. I’d like to save our community, not blow it up. But we are on a path towards enforcing norms to suppress information rather than disclose it, and if that keeps going, it’s simply going to destroy the relevant value.
(On a related note, I’m aware of exactly one individual who’s been accused of arguing in bad faith in the discourse around Nate’s post, and that individual is me.)
I’m not certain that there is, in fact, a nuclear option. Besides that, I believe there is still room for more of what you have been doing. In particular, I think there are a couple of important topics that have yet to be touched upon in depth, by anyone really.
The first is that the rationalist community has yet to fully engage with the conversation. I’ve been observing the level of activity here and on other blogs and see that the majority of conversation is conducted by a small number of people. In other locations, such as SSC, there is a high level of activity, but there is also substantial overlap with groups not directly part of the rationality community, and the conversations there aren’t typically about EA. Aside from Nate, some of the more prominent people have not entered the conversation at all. It would be nice if someone like Eliezer gave his two cents.
The second is that no one has yet discussed in depth the fact that the rationality community and the EA community are, in fact, separate communities with differing goals and values, which are more accurately said to have formed an alliance than to have actually merged into one group. They have different origin stories, different primary motivations, and have been focused on different sets of problems throughout their histories. The re-focusing of EA towards AI safety occurred rather recently, and I think that as their attention turned there, it became more obvious to the rationality community that there were significant differences in thought capable of causing conflict.
What I see as one of the main differences between the two cultures is that the rationality community is mostly concerned with accuracy of belief and methods of finding truth whereas the EA community is mostly concerned with being a real force for good in the world, achieved through utilitarian means. I think there is in fact a case to be made that we either need to find a way to reconcile these differences, or go our separate ways, but we certainly can’t pretend these differences don’t exist. One of my main problems with Nate’s post is that he appears to imply that there aren’t any genuine conflicts between the two communities, which I think is simply not the case. And I think this has caused some disappointing choices for MIRI in responding to criticisms. For example, it’s disappointing that MIRI has yet to publish a formal response to the critiques made by Open Phil. I think it’s basic PR 101 that if you’re going to link to or reference criticisms to your organization, you should be fully prepared to engage with each and every point made.
I think my overall point is that there is still room for you, and anyone else who wants to enter this conversation, to continue with the strategy you are currently using, because it does not seem to have fully permeated the rationality community. Some sort of critical mass of support has to be reached before progress can be made, I think.
I don’t have a better description than “more like nuclear deterrence” for now; mulling it over.
I think Nate is absolutely correct to note that if we just retreat to the object level and give up on implied trust, we lose something very valuable. We can’t each evaluate everything from scratch. If we’re going to make intellectual progress together, we need to be able to justifiably trust that people aren’t just trying to get us to do things that make sense to them, aren’t even just telling us things that happen to be literally true, but are making an honest good-faith attempt to give us the most decision-relevant information.
Discussions about what’s in good faith also seem hard to avoid when discussing things like standards of evidence, and how to evaluate summaries from outsiders.
I agree with the first two items, but consider that the content of these private discussions, and whatever conclusions are drawn from them, are probably only visible to the wider community in the form of the decisions being made at the highest levels. Therefore, how do you ensure that, when these decisions are made and the wider community is expected to support them, there will not be disagreement or confusion? Especially since the reasoning behind them is probably highly complex.
This raises the question: what balance of private and public discussion is most preferable? Certainly if all discussion were kept private, then the wider community (especially the EA community) would have no choice but to support decisions on faith alone. And at the other extreme, there is the high cost of writing and publishing documentation of reasoning, the risk of wide misunderstanding and confusion, and the difficulty of trying to retract or adjust statements that are no longer supported.
And if “private” discussion weren’t constrained to just a small circle, but rather simply meant that you would have to respond to each individual inquiry separately, then that may come at an even greater cost than simply publishing your thoughts openly, because it would require you to devote your attention and effort to multiple, possibly numerous, individual discussions, each requiring you to model that person’s level of knowledge and understanding.
I essentially don’t think the answer is going to be as simple as “private” vs “public”, but I tend to err on the side of transparency, though this may reflect more of a value than a belief based on strong empirical data.
Why would these procedural issues be a reason to prefer private discourse?
a. Private discussion is nearly as efficient as public discussion for information-transmission, but has way fewer political consequences. On top of that, the political context is more collaborative between participants and so it is less epistemically destructive.
b. I really don’t want to try to use collectively-enforced norms to guide epistemology and don’t think there are many examples of this working out well (whereas there seem to be examples of avoiding epistemically destructive norms by moving into private).
Can you define more precisely what you mean by “private discussion?” If by that you mean that all discourse is constrained to one-on-one conversations where the contents are not available to anyone else, I don’t intuitively see how this would be less destructive and more collaborative. It seems to require that a lot of interactions must occur before every person is up to date on the collective group knowledge, and also that for each conversation there is a lossy compression going on—it’s difficult for each conversation to carry the contents of each person’s history of previous conversations.
On the other hand, if you’re advocating for information to be filtered when transmitted beyond the trusted group, but flows freely within the trusted group, I believe that is less complicated and more efficient and I would have fewer objections to that.
By “private discussion” I mean discussions amongst small groups, in contrast with discussions amongst large groups. Both of them occur constantly. I’ve claimed that in general political considerations cut in favor of having private discussions more often than you otherwise would, I didn’t mean to be making a bold claim.
I recently wanted to raise an issue with some possible controversy/politics in the main EA facebook group. Instead of approving the post, I was told, “this post isn’t a good fit for the group, how about posting it instead in this secret facebook group.”
That secret facebook group isn’t for one-on-one conversations but it’s still more private.
If this is a categorical claim, then what are academic journals for? Should we ban the printing press?
If your claim is just that some public forums are too corrupted to be worth fixing, not a categorical claim, then the obvious thing to do is to figure out what went wrong, coordinate to move to an uncorrupted forum, and add the new thing to the set of things we filter out of our new walled garden.
I don’t believe that academic journals are an efficient form of information transmission. Academics support academic journals (when they support academic journals) because journals serve other useful purposes.
Often non-epistemic consequences of words are useful, and often they aren’t a big deal. I wouldn’t use the word “corrupted” to describe “having political consequences,” it’s the default state of human discussions.
Public discussion is sometimes much more efficient than private discussion. A central example is when the writer’s time is much more valuable than the reader’s time, or when it would be high-friction for the reader to buy off the writer’s time. (Though in this case, what’s occurring isn’t really discourse.) There are of course other examples.
Doing things like “writing down your thoughts carefully, and then reusing what you’ve written down” is important whether discussion occurs in public or private.
My intuition around whether some people are intrinsically bad (as opposed to bad at some things), is that it’s an artifact of systems of dominance like schools designed to create insecure attachment, and not a thing nonabused humans will think of on their own.
I think this is very unlikely.
I think this would be valuable to work out eventually, but this probably isn’t the right time and place, and in the meantime I recognize that my position isn’t obviously true.
As far as I remember, among the people (prison guards, psychiatrists) who work with… problematic humans, the general consensus is that some small percentage of those they see (around 5% IIRC) are best described as irredeemably evil. Nothing works on them, they don’t become better with time or therapy or anything. There is no obvious cause either.
This is from memory, sorry, no links.
This seems completely false. Most people think that Hitler and Stalin were intrinsically bad, and they would be likely to think this with or without systems of dominance.
Kant and Thomas Aquinas explain it quite well: we call someone a “bad person” when we think they have bad will. And what does bad will mean? It means being willing to do bad things to bring about good things, rather than wanting to do good things period.
Do you think Nate’s claim was that we oughtn’t so often jump to the conclusion that people are willing to do bad things in order to bring about good things? That this is the accusation that’s burning the commons? I’m pretty sure many utilitarians would say that this is a fair description of their attitude at least in principle.
I would be a bit surprised if that was explicitly what Nate meant, but it is what we should be concerned about, in terms of being concerned about whether someone is a bad person.
To make my general claim clearer: “doing evil to bring about good, is still doing evil,” is necessarily true, for exactly the same reason that “blue objects touching white objects, are still blue objects,” is true.
I agree that many utilitarians understand their moral philosophy to recommend doing evil for the sake of good. To the extent that it does, their moral philosophy is mistaken. That does not necessarily mean that utilitarians are bad people, because you can be mistaken without being bad. But this is precisely the reason that when you present scenarios where you say, “would you be willing to do such and such a bad thing for the sake of good,” many utilitarians will reply, “No! That’s not the utilitarian thing to do!” And maybe it is the utilitarian thing, and maybe it isn’t. But the real reason they feel the impulse to say no, is that they are not bad people, and therefore they do not want to do bad things, even for the sake of good.
This also implies, however, that if someone understands utilitarianism in this way and takes it too seriously, they will indeed start down the road towards becoming a bad person. And that happened even in the context of the present discussion (understood more broadly to include its antecedents) when certain people insisted, saying in effect, “What’s so bad about lying and other deceitful tactics, as long as they advance my goals?”
I agree that this exists, and claim that it ought to be legitimate discourse to claim that someone else is doing it.
Human punishment of free riders helps ensure there are few free riders. Our fear and surprise responses are ridiculously oversensitive, because of the relative consequences of Type I vs. Type II errors. Etc...
Evolution, too, is into massive A/B testing with no optimisation target that includes truth.
That seems plausible, and suggests that the low rate of free-riders is causally related to our readiness to call out suspected ones.
This suggests that the right thing to do is to try to reduce the cost, rather than the rate, of false-positives. And surely not to demolish this Chesterton’s Fence without a good replacement fix for the underlying problem.
This suggests it’s more useful to compare human groups and see how they manage the problem, rather than trying to parse the ins and outs of evolutionary psychology.
Agreed.
It goes up at least one important meta level: the fraction of the community willing to take on the (potentially high, in ambiguous cases) cost of punishing free riders has threshold effects, IIRC, that determine which attractor you sort into. Part of my S1 sense that EA will not be able to accomplish much good on an absolute scale (even if much good is done at the margin) is that it does not cross this threshold.
Note also that most groups treat their intuitions about whether or not someone is acting in bad faith as evidence worth taking seriously, and that we’re remarkable in how rarely we tend to allow our bad-faith-detecting intuitions to lead us to reach the positive conclusion that someone is acting in bad faith. Note also that we have a serious problem with not being able to effectively deal with Gleb-like people, sexual predators, etc, and that these sorts of people reliably provoke person-acting-in-bad-faith-intuitions in people with (both) strong and accurate bad-faith-sensing intuitions. (Note that having strong bad-faith-detecting intuitions correlates somewhat with having accurate ones, since having strong intuitions here makes it easier to pay attention to your training data, and thus build better intuitions with time). Anyways, as a community, taking intuitions about when someone’s acting in bad faith more seriously on the margin could help with this.
Now, one problem with this strategy is that many of us are out of practice at using these intuitions! It also doesn’t help that people without accurate bad-faith-detecting intuitions often typical-mind fallacy their way into believing that there aren’t people who have exceptionally accurate bad-faith-detecting intuitions. Sometimes this gets baked into social norms, such that criticism becomes more heavily taxed, partly because people with weak bad-faith-detecting intuitions don’t trust others to direct their criticism at people who are actually acting in bad faith.
Of course, we currently don’t accept person-acting-in-bad-faith-intuitions as useful evidence in the EA/LW community, so people who provoke more of these intuitions are relatively more welcome here than in other groups. Also, for people with both strong and accurate bad-faith-detecting intuitions, being around people who set off their bad-faith-sensing intuitions isn’t fun, so such people feel less welcome here, especially since a form of evidence they’re good at acquiring isn’t socially acknowledged or rewarded, while it is acknowledged and rewarded elsewhere. And when you look around, you see that we in fact don’t have many people with strong and accurate bad-faith-detecting intuitions; having more of these people around would have been a good way to detect Gleb-like folks much earlier than we tend to.
How acceptable bad-faith-detecting intuitions are in decision-making is also highly relevant to the gender balance of our community, but that’s a topic for another post. The tl;dr of it is that, when bad-faith-detecting intuitions are viewed as providing valid evidence, it’s easier to make people who are acting creepy change how they’re acting or leave, since “creepiness” is a non-objective thing that nevertheless has a real, strong impact on who shows up at your events.
Anyhow, I’m incredibly self-interested in pointing all of this out, because I have very strong (and, as of course I will claim, very accurate) bad-faith-detecting intuitions. If people with stronger bad-faith-detecting intuitions are undervalued because our skill at detecting bad actors isn’t recognized, then, well, this implies people should listen to us more. :P
Here on LW Gleb got laughed at almost immediately as he started posting. Did he actually manage to make any inroads into EA/Bay Area communities? I know EA ended up writing a basically “You are not one of us, please go away” post/letter, but it took a while.
Good observation.
Amusingly, one possible explanation is that the people who gave Gleb pushback on here were operating on bad-faith-detecting intuitions—this is supported by the quick reaction time. I’d say that those intuitions were good ones, if they led to those folks giving Gleb pushback on a quick timescale, and I’d also say that those intuitions shaped healthy norms to the extent that they nudged us towards establishing a quick, reality-grounded social feedback loop.
But the people who did give Gleb pushback framed things in terms other than bad-faith-detecting intuitions more often than you’d have guessed if their intuitions were what actually convinced them the pushback was worth their time—they pointed to specific behaviors, and so on, when calling him out. But how many of these people actually decided to give Gleb feedback because they System-2-noticed that he was implementing a specific bad behavior, and how many of us decided to give Gleb feedback because our bad-faith-detecting intuitions noticed something was up, which led us to fish around for a specific bad behavior that Gleb was doing?
If more of us did the latter, this suggests that we have social incentives in place that reward fishing around for and finding specific bad behaviors. To me, fishing around for bad behaviors (i.e. fishing through data) like this doesn’t seem too different from p-hacking, except that fishing through social data is much harder to call people out on. And if our real reasons for reaching the correct conclusion that Gleb needed pushback were based in bad-faith-detecting intuitions, and not in System 2 noticing bad behaviors, then maybe it would be a good idea to give social allowance for the mechanism that actually led some of us to detect Gleb a bit earlier to do its work on its own in the future, rather than requiring its use to be backed up by evidence of bad behaviors (junk data) that can be either p-hacked by those who want to criticize regardless of what’s true, or hidden by those with more skill than Gleb.
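To make the p-hacking analogy concrete, here is a minimal sketch of the multiple-comparisons problem; the counts and the cutoff are made up purely for illustration. Scrutinize enough unrelated, innocent behaviors and a few will look suspicious by chance alone.

```python
import random

random.seed(0)
n_behaviors = 100    # unrelated behaviors you could scrutinize
n_obs = 20           # noisy observations of each behavior

hits = 0
for _ in range(n_behaviors):
    # Each behavior is pure noise: there is no real signal of bad conduct.
    mean = sum(random.gauss(0, 1) for _ in range(n_obs)) / n_obs
    if abs(mean) > 0.5:  # crude "this looks suspicious" cutoff (~2 sd of the mean)
        hits += 1

print(f"{hits} of {n_behaviors} innocent behaviors looked suspicious by chance alone")
```

The analogous social move is scanning someone’s history for anything that can be framed as damning once your intuition has already decided they’re a bad actor.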
At a minimum, being honest with ourselves about what our real reasons are ought to help us understand our minds a bit better.
I don’t know if you can separate it this cleanly. Sometimes you get a smells-funny feeling and then your System 2 goes to investigate. But sometimes—and I think this was the case with Gleb—both System 1 and System 2 look at each other and chorus “Really, dude?” :-)
nod. This does seem like it should be a continuous thing, rather than System 1 solely figuring things out in some cases and System 2 figuring it out alone in others.
I sent a few private notes to him early on about the way I reacted to his posts. This wasn’t a “bad faith” detector (I don’t actually buy the premise—such a thing is VERY uncommon compared to honest incorrect values and beliefs); this was a pattern match to an overzealous, overconfident newbie, possibly with under-developed social skills. You know, just like all of us a few years (or in my case decades) ago.
This all sounds right, but the reasoning behind using the wording of “bad faith” is explained in the second bullet point of this comment.
Tl;dr the module your brain has for detecting things that feel like “bad faith” is good at detecting when someone is acting in ways that cause bad consequences in expectation but don’t feel like “bad faith” to the other person on the inside. If people could learn to correct a subset of these actions by learning, say, common social skills, treating those actions like they’re taken in “bad faith” incentivizes them to learn those skills, which results in you having to live with negative consequences from dealing with that person less. I’d say that this is part of why our minds often read well-intentioned-but-harmful-in-expectation behaviors as “bad faith”; it’s a way of correcting them.
I’d guess the same fraction of people reacted disrespectfully to Gleb in each community (i.e. most but not all). The difference was more that in an EA context, people worried that he would shift money away from EA-aligned charities, but on LW he only wasted people’s time.
http://lesswrong.com/lw/ou5/against_responsibility/dqfj
Agree in theory, but, lacking an effective bad faith detector myself, how do I know whose intuitions to trust? :(
I’m very glad that you asked this! I think we can come up with some decent heuristics:
If you start out with some sort of inbuilt bad faith detector, try to see when, in retrospect, it’s given you accurate readings, false positives, and false negatives. I catch myself doing this without having planned to on a System 1 level from time to time. It may be possible, if harder, to do this sort of intuition reshaping in response to evidence with System 2. Note that it sometimes takes a long time, and that sometimes you never figure out, whether or not your bad-faith-detecting intuitions were correct.
There’s debate about whether a bad-faith-detecting intuition that fires when someone “has good intentions” but ends up predictably acting in ways that hurt you (especially to their own benefit) is “correct”. My view is that the intuition is correct; defining it as incorrect and then acting in social accordance with it being incorrect incentivizes others to manipulate you by being/becoming good at making themselves believe they have good intentions when they don’t, which is a way of destroying information in itself. Hence why allowing people to get away with too many plausibly deniable things destroys information: if plausible deniability is a socially acceptable defense when it’s obvious someone has hurt you in a way that benefits them, they’ll want to blind themselves to information about how their own brains work. (This is a reason to disagree with many suggestions made in Nate’s post. If treating people like they generally have positive intentions reduces your ability to do collaborative truth-seeking with others on how their minds can fail in ways that let you down—planning fallacy is one example—then maybe it would be helpful to socially disincentivize people from misleading themselves this way by giving them critical feedback, or at least not tearing people down for being ostracizers when they do the same).
Try to evaluate others’ bad-faith detectors by the same mechanism as in the first point; if they give lots of correct readings and not many false ones (especially if they share their intuitions with you before it becomes obvious to you whether or not they’re correct), this is some sort of evidence that they have strong and accurate bad-faith-detecting intuitions.
The above requires that you know someone well enough for them to trust you with this data, so a quicker way to evaluate others’ bad-faith-detecting intuitions is to look at who they give feedback to, criticize, praise, etc. If they end up attacking or socially qualifying popular people who are later revealed to have been acting in bad faith, or praising or supporting ones who are socially suspected of being up to something but are later revealed to have been acting in good faith, these are strong signals that they have accurate bad-faith-detecting intuitions.
Done right, bad-faith-detecting intuitions should let you make testable predictions about who will impose costs or provide benefits to you and your friends/cause; these intuitions become more valuable as you become more accurate at evaluating them. Bad-faith-detecting intuitions might not “taste” like Officially Approved Scientific Evidence, and we might not respect them much around here, but they should tie back into reality, and be usable to help you make better decisions than you’d been able to make without using them.
Of course our intuitions of someone acting in bad faith are evidence that they are; that much is obvious. The relevant question is how strong the correlation is between the intuition and actual bad faith. Since you admit quite openly that people vary widely in how sensitive their intuitive ‘bad-faith detectors’ are (this, after all, is what it means to have a ‘strong sense’ of such!), shouldn’t this be of concern for those who would claim that this correlation is very high—quite high enough to be useful on its own?
It’s also important to realize that both Type I (false hit) and Type II (miss) errors are harmful here, hence, as usual in any binary detection setting, specificity is as relevant as sensitivity—and there’s no reason why additional evidence should be discounted; particularly if such evidence is of a factual sort—and as such is likely to be otherwise broadly independent from the output of our intuitive detectors!
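As a minimal illustration of why specificity matters alongside sensitivity (all three numbers below are assumptions, not estimates of anything): when actual bad faith is rare, even a fairly good detector fires mostly on good-faith actors.

```python
# Assumed numbers for illustration only.
base_rate = 0.05      # fraction of interactions involving actual bad faith
sensitivity = 0.8     # P(intuition fires | bad faith)
specificity = 0.9     # P(intuition stays silent | good faith)

true_positives = base_rate * sensitivity
false_positives = (1 - base_rate) * (1 - specificity)

# Probability that a firing intuition has actually caught bad faith.
ppv = true_positives / (true_positives + false_positives)
print(f"P(bad faith | intuition fires) = {ppv:.2f}")   # about 0.30 with these numbers
```

This is one reason to combine the intuition with independent factual evidence rather than relying on it alone.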
The binary classification leads to problems. We can distinguish cooperative intent, defecting intent, and hostile intent. The person who optimizes his marketing for conversions without regard for the truth is defecting, and is neither cooperative nor hostile.
There’s such a thing as hostile intent. Some people are intent on causing harm to other people, but those aren’t the people with whom we have problems in this community.
I found this helpful. Distinguishing between treating people as friendly agents, tools, and enemy agents seems quite a bit better than the binary good/bad faith distinction. I think a lot of bad faith accusations feel like people are saying “this person is treating me like an enemy agent,” but are properly evidence for “this person is treating me like a tool,” which is itself sufficient reason to distrust and build common knowledge about them.
In some ways, “enemy agent” and “friendly agent” are more similar attitudes than either is to “tool”.
I would probably define “in bad faith” as “trying to deliberately mislead” (which itself is basically lying, just widened a bit to include cases like “but technically speaking this is a true statement” and “but I didn’t say anything, just wiggled my eyebrows suggestively”). Do you think it’s more complicated than that?
I’m skeptical of the work “deliberately” is doing there. If the whole agent determining someone’s actions is following a decision procedure that tries to push my beliefs away from the truth when convenient, then there’s a sense in which the whole agent is acting in bad faith, even if they’ve never consciously deliberated on the matter. At least, it’s materially different from unmotivated error, in a way that makes it similar to consciously lying.
Harry Frankfurt’s “On Bullshit” introduced the distinction between lies and bullshit. The liar wants to deceive you about the world (to get you to believe false statements), whereas the bullshitter wants to deceive you about his intentions (to get you to take his statements as good-faith efforts, when they are merely meant to impress).
We may need to introduce a third member of this set. Along with lies told by liars, and bullshit spread by bullshitters, there is also spam emitted by spambots.
Like the bullshitter (but unlike the liar), the spambot doesn’t necessarily have any model of the truth of its sentences. However, unlike the bullshitter, the spambot doesn’t particularly care what (or whether) you think of it. But it optimizes its sentences to cause you to do a particular action.
To me it seems the troll is also an important category. Most journalists don’t care whether you believe what they write, but they do care that you engage with their writing. Whether you love it or hate it is secondary when you share the post on Facebook and Twitter.
This is a bit of a definitions dispute, but I want to distinguish between someone whose values/interests/goals do not coincide with yours but who’s quite open about it on the one hand, and someone who wants to manipulate you without you realizing what’s going on on the other hand. I wouldn’t apply the expression “in bad faith” to the former case (that could be a “hostile agent”, but that’s a different thing), but I would to the latter case.
So if someone thinks the truth is different from what you think it is and tries to “push your beliefs away”, is he acting in bad faith? Consider e.g. your standard sincere Christian missionaries.
If someone’s mistaken, and tries to push my beliefs towards what they mistakenly believe to be the truth by offering the evidence they believe to be the most material, that seems like a good-faith error. On the other hand, telling simplified Bible stories that elide the problem of evil, and only addressing it once people are attached to the idea of God, would not seem like it’s in good faith.
Let’s flip the political arrow.
How do you feel about sincere people telling simplified climate change stories that elide the uncertainties and only addressing them once people are attached to the idea of fighting global warming?
Pretty bad. My trust in the establishment’s ability or willingness to honestly try to inform me about global warming is fairly low as a result. I think that global warming is happening, caused in part by human action, and is going to be somewhat costly, so I’m mildly in favor of measures like a global carbon tax, but I wouldn’t be shocked if some important part of the official narrative turned out to be deeply wrong.
Right, but we are not talking about global warming, we’re are talking about bad faith. Would you say that the sincere we’re-all-gonna-die-unless… environmentalists (and there are a lot of them) are acting in bad faith?
I’d have to know more details about the case you’re thinking of. There’s lots of heterogeneity!
There is no particular need to dig into the details, the point is whether you think of sincere missionaries as different (in the sense of being more or less prone to acting in bad faith) from sincere environmentalists. There’s a lot of heterogeneity all around :-)
I think a lot of environmentalist advocacy is well-described as a bad-faith process executed by people trying to do good, much like a lot of religious education. I’m unsure what the relative extent of each is.
No one is evil in his own story.
Just loudly repeating what you said using my own words… when we talk about optimizing for truth (or any other X), there are essentially 3 options (and of course any mix of them)...
optimizing explicitly for X;
optimizing neither for X nor against X (but perhaps for something else, or nothing at all); or
optimizing explicitly against X.
And while it is bad form to accuse someone of optimizing against truth, it makes sense to suspect that people are simply not optimizing for truth… which—especially when they optimize for something else—usually ends with something misleading, even if there was no conscious intention to mislead.
This said, how to communicate this conclusion of “you need to explicitly optimize for truth, otherwise you will probably end up misleading people even if your intentions are pure”?
This probably needs to be communicated differently among rationalists and outside of our small community. Either way, it helps to emphasise that we are talking about “misleading unintentionally” or perhaps just “misunderstanding”, i.e. to put high priority on communicating that we are not accusing the other side of having bad intentions, merely that… what they said is not what the perfect version of them would say in a perfect world, and that we would like them to get closer to that.
You may not be wrong, but I don’t think it would necessarily be surprising. We adapted under social conditions radically different from those that exist today. It may no longer be adaptive.
Hypothesis: In small tribes and family groups assumptions of bad faith may have served to help negotiate away from unreasonable positions while strong familial ties and respected third parties mostly mitigated the harms. Conflicts between tribes without familial connections may have tended to escalate however (although there are ways to mitigate against this too).
Hypothesis: Perhaps assumptions of good and bad faith were reasonably accurate in small tribal and familial groups but in intertribal disagreements there was a tendency to assume bad faith because the cost of assuming good faith and being wrong was so much higher than assuming bad faith and being wrong.
My guess is that our exposure to bad faith communication is more frequent than in the past, rather than less, because of mass media; many more messages we receive are from people who do not expect to have to get along with us in twenty years.
That may well be true, but I should clarify that neither of my hypotheticals require or suggest that bad faith communication was more common in the past. They do suggest that assumptions of bad faith may have been significantly more common than actual bad faith, and that this hypersensitivity may have been adaptive in the ancestral environment but be maladaptive now.
So, a couple of thoughts:
1) ascribing intent to behavior is one of the best ways to control someone’s behavior, and it’s deeply baked into our reactions. You are much more likely to get someone to conform to your desires if you say “you’re intentionally behaving badly, stop it” than if you say “I don’t like that outcome, but you didn’t mean it”. Your mind is biased toward seeing (and believing, so you can more forcefully make the accusation) much stronger intent than actually exists.
2) intent is a much better predictor of future behaviors than simple observation. It’s far easier to punish or cut off ties to someone with bad intents than with one who’s just a little incompetent but means well. Therefore, your mind is biased toward seeing intent so you can take more forceful actions to protect your interests.
3) Nate doesn’t make the point directly, but “good” and “bad” are massively oversimplified to the point of being misleading. There are many dimensions to evaluate about a person or organization’s likelihood of helping or harming your goals in the future, and in figuring out how best to influence them to be more aligned with your values and beliefs.
4) I’m torn about the object-level objection about statements of value that differ from what behaviors imply. Most humans are not beings of pure thought, and the fact that there are any actions that affect others which are not purely information-sharing doesn’t seem that surprising to me.
This struck me as being such an important oversight that it almost turned Nate’s whole post into an academic exercise.
Any given interpersonal disagreement that culminates in an argument is going to have some kind of difficult-to-reconcile opposition of values and/or mutual knowledge at its core. Both parties are generally going to try to use persuasion in some form to manipulate their opponent’s sense of the relevant values, or their perception of the details of the situation, or their knowledge and interpretation of the facts. From the other side, this will very often look like a bad-faith attempt to undermine your values and beliefs, and you can’t necessarily even say that it isn’t.
In the ideal case, a disagreement can be solved purely by sharing all of the relevant facts. This may be the only case where you can actually expect people to come to an agreement without any tinge of feeling that their opponent is acting in bad faith or being manipulative.
In the less ideal case, all the facts may be shared, but a difference in perspective or weighting of various details necessitates further argument to try to come to an agreement. Since you are trying to address your opponent’s thinking and perceptions, you are by definition attempting to manipulate their mind. This is true regardless of the “goodness” of your intentions.
In the something-like-worst-case, fundamentally felt values are in opposition, and no amount of sharing of facts and interpretations is going to lead to agreement. At this point it is difficult to even say that you are acting in good faith even if you think that you are, because you’re (perhaps knowingly) trying to persuade someone of something that they believe is wrong and would still believe to be wrong upon indefinite reflection.
The endpoints of “pure good faith” and “pure bad faith” are probably very rare, but the middle ground of muddled manipulativeness and self-justification better describe most arguments.
For more explanation on how incentive gradients interact with and allow the creation of mental modules that can systematically mislead people without intent to mislead, see False Faces.