Bogus Pipeline, Bona Fide Pipeline
Related to: Never Leave Your Room
Perhaps you are a psychologist, and you wish to do a study on racism. Maybe you want to know whether racists drink more coffee than non-racists. Sounds easy. Find a group of people and ask them how racist they are, then ask them how much coffee they drink.
Problem: everyone in your study says they’re completely non-racist and some of their best friends are black and all races are equally part of this vast multicolored tapestry we call humanity. Maybe some of them are stretching the truth here a bit. Until you figure out which ones, you’re never going to find out anything interesting about coffee.
So you build a foreboding-looking machine out of gleaming steel, covered with wires and blinking lights. You sit your subjects down in front of the machine, connect them to its electrodes, and say as convincingly as possible that it is a lie detector and they must speak the truth. Your subjects look doubtful. Didn’t they hear on TV that lie detectors don’t really work? They’ll stick to their vehement assertions of tolerance until you get a more impressive-looking machine, thank you.
You get smarter. Before your experiment, you make the subjects fill in a survey, which you secretly copy while they’re not looking. Then you bring them in front of the gleaming metal lie detector, and dare them to try to thwart it. Every time they give an answer different from the one on the survey, you frown and tell them that the machine has detected their fabrication. When the subject is suitably impressed, you start asking them about racism.
The subjects start grudgingly admitting they have some racist attitudes. You have invented the Bogus Pipeline.
The Bogus Pipeline is quite powerful. Since its invention in the 70s, several studies have demonstrated that its victims give significantly less self-enhancing answers to a wide variety of questions than subjects not connected to the machinery do. In cases where facts can be checked, Pipeline subjects’ answers tend to be more factually correct than normal subjects’.
In one of the more interesting Bogus Pipeline experiments, Millham and Kellogg wanted to know how much of a person’s average self-enhancement is due to self-deception biases, and how much is due to simple lying. They asked people some questions about themselves under normal and Pipeline conditions, using the Marlowe-Crowne scale. This scale really deserves a post of its own, but the short version is that it asks you some loaded questions, and if you take them as an opportunity to say nice things about yourself, you get marked down as a self-enhancer. There was a correlation of .68 between Marlowe-Crowne scores in normal and Pipeline conditions. If we accept that no one deliberately lies under the Pipeline, that means we now know how much self-enhancement is, on average, self-deception rather than deliberate falsehood (tendency towards deliberate falsehoods correlated .37 with Marlowe-Crowne).[1]
Interesting stuff. But you still don’t know whether racists drink more coffee! Your Bogus Pipeline only eliminates part of the self-enhancement in your subjects’ answers. If you want to solve the coffee question once and for all, you can’t count on a fake mind-reading device. You need a real mind-reading device. And in the mid 90s, psychology finally developed one.
The Bona Fide Pipeline is far less impressive-looking than the Bogus Pipeline. Though the Bogus Pipeline tries as hard as it can to scream “mind-reading device”, the Bona Fide Pipeline has a vested interest in preventing its victims from realizing their minds are being read. It is a simple computer terminal.
The Pipeline uses a complicated process to disguise itself as an ordinary study on distraction or face recognition or somesuch, but the active ingredient is this: the subjects play a game where they must hit one key (perhaps “A”) if the screen displays a good word (for example “wonderful”), and a different key (perhaps “L”) if the screen displays a bad word (for example “ugly”).
But before it gives you the word, it shows you a picture of a white person or a black person. Remember priming? That picture of a black person is going to prime your brain’s concept of “black person” and any concepts you associate with “black person”. If you have racist attitudes, “bad” is one concept you associate with “black person”. You’re going to have a very easy time recognizing “ugly” as a bad word, because your “bad” concept is already activated. But you’re going to have a harder time recognizing “wonderful” as a good concept, because your brain is already skewed in the opposite direction. It’s not impossible, it’s just going to take a few hundred more milliseconds. Each of which the Bona Fide Pipeline is recording and processing. At the end, it spits out a score telling you that you took an average of three hundred milliseconds longer to recognize good words when primed with black people’s pictures than white people’s pictures.
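To make the final score concrete, here is a minimal sketch of the arithmetic described above. It is not Fazio et al.'s actual scoring procedure: the function name, the trial format, and the reaction times are all invented for illustration, and it simply compares average times for good words after each kind of picture.

```python
from statistics import mean

def good_word_slowdown(trials):
    """Extra milliseconds, on average, needed to classify good words
    after a black prime compared with after a white prime.

    Each trial is a hypothetical (prime, word_valence, reaction_time_ms) tuple.
    """
    def mean_rt(prime):
        # Keep only correct-category trials: good words shown after this prime.
        rts = [rt for p, valence, rt in trials if p == prime and valence == "good"]
        return mean(rts)

    return mean_rt("black") - mean_rt("white")

# Invented reaction times, chosen only to illustrate the calculation.
trials = [
    ("white", "good", 520), ("white", "good", 540),
    ("black", "good", 810), ("black", "good", 850),
    ("white", "bad", 600), ("black", "bad", 560),
]
slowdown = good_word_slowdown(trials)
print(slowdown)
```

A positive value means good words took longer to recognize after black primes; a value near zero means the prime made no difference in this toy data.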
Does this actually work? The original study (Fazio et al., 1995) tested both whites and blacks, and found the whites were more likely to be prejudiced against blacks than the blacks were, which makes sense. In the same study, a black experimenter conversed with the subjects for a while, and rated the quality of the interaction by a typically rigorous rubric. This fuzzy unscientific measure of racist behavior correlated well with the Pipeline’s data for the individuals involved. A study by Jackson (1997) found that people who score high on prejudice by Pipeline measures on average give lower scores to an essay written by a student known to be black.
The Bona Fide Pipeline has lately been superseded by its younger, sexier, Harvard-educated cousin, the IAT. More on that, the associated controversy, and the relevance to rationality tomorrow.
Footnotes:
1: I doubt that deceptions can be separated cleanly into self-deception and deliberate falsehood like this. More likely there are many different shades of grey, and the Bogus Pipeline captures some but not all of them.
I’m very doubtful about the validity of the IAT test. Their procedure requires you to declare your ethnicity prior to the testing. Depending on the answer they change the test ordering. When I declared an incorrect ethnicity, the test results claimed that I hate my own ethnic group.
While this may not be enough to invalidate the whole idea, it makes me wonder why they did not design the test in a way that would be totally insensitive to the subject’s race. Too often I see sociologists design the tests in a way that will ensure the final results.
You assume (or did you read it somewhere?) that the test ordering and test results were based on ethnicity, rather than other factors, like dice rolls and having done it before.
I have been found to be racist when I have taken the IAT. This makes me a little uncomfortable but it also makes me distrust the appeal of attempts to debunk the test.
I took a series of online tests based on that principle: it’s too noisy. Most of the variance is almost certainly random. (I remember getting different results from taking the same test twice, but I’m unsure, and it could be explained as an attempt to deny being racist.) Answers near the end of the test are likely to be faster and more accurate as subjects get used to it. The choice of inputs is loaded: I was unable to recognize some images (photographs of people whose race I couldn’t tell, and a terrible drawing of a menorah that took several iterations to identify); connotations vary across cultures (Blacks are mostly Christian in North America and Muslim in Europe) and people (the test on homosexuality included words like “natural”, which I think of as neutral rather than good, and “perverse”, which I think of as good rather than bad).
I took many tests like this via Project Implicit online. I’m not sure the results are very reliable for individuals, though that is not to say they are unreliable on average.
For instance, I took a racism test in demonstration mode. I was anxious about the exam as I took it, and in particular I was distracted because I had figured out how it worked and I was self-conscious about making “mistakes” (responding more quickly in some scenarios by fluke) or sort-of secretly biasing the result because I didn’t want myself to be racist. In the end it told me I had a moderate automatic preference for whites over blacks. Although I do not consider myself racist, I was willing to accept this because I grew up in the Midwest with many racist peers and over the course of my life have had very little exposure to black people.
But then I encountered the same test as part of the experimental section of the site. This time I was much more relaxed because I had come to terms with the previous results. Yet, after this test, it told me I had no automatic preference for whites or blacks.
I don’t know whether my performance during the first test was skewed because I was anxious and distracted, whether my second test was skewed because I had taken the test before, or whether there is some large uncertainty and my “true” score is simply between these. I also don’t know if my seeing how the experiment works—and then being distracted by it—is common or rare.
Mike: Studies find a correlation of .6 between the same individual taking the test more than once, so 40% of the test score is always going to be the vagaries of each trial. They also find that “practice” on the IAT explains 15% of variance or so. More on this in a moment.
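One way to read the 60/40 split, assuming a simple model the studies may or may not use: each score is a stable personal component plus independent per-session noise, with the stable part carrying 60% of the variance. Under that assumption the two sessions should correlate at about .6. The function names and the simulation setup below are illustrative only.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_retest(stable_share, n_people=20000, seed=0):
    """Simulate two test sessions per person and return their correlation.

    stable_share is the assumed fraction of score variance that belongs
    to the person; the remainder is fresh noise in each session.
    """
    rng = random.Random(seed)
    stable_sd = stable_share ** 0.5
    noise_sd = (1 - stable_share) ** 0.5
    first, second = [], []
    for _ in range(n_people):
        stable = rng.gauss(0, stable_sd)  # same in both sessions
        first.append(stable + rng.gauss(0, noise_sd))
        second.append(stable + rng.gauss(0, noise_sd))
    return pearson(first, second)

r = simulate_retest(0.6)
print(round(r, 2))
```

In this model the retest correlation equals the stable share of the variance, which is why a correlation of .6 is read as leaving roughly 40% to trial-to-trial noise.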
It’s potentially misleading to talk about ‘implicit racism’. At least, we should take care to distinguish Implicit Bias vs. Implicit Malice.
This is basically what Lie Detectors already are. Their extremely limited usefulness at detecting lies doesn’t hold a candle to their incredible utility at intimidating ignorant people.
I can recall a particularly iconic case in which a suspect was successfully fooled with some available wiring and a colander.
If anyone wants to do some background reading on the test before commenting, this paper addresses some of the common criticisms:
http://www.psy.utexas.edu/psy/FACULTY/Markman/jpsp01.pdf
“How do indirect measures of evaluation work? Evaluating the inference of prejudice in the IAT”
Nice history. These are successful because they offer ways to test dependent variables of subjects’ opinions rather than their actual opinions. There are other ways to do it, including having subjects do word unscrambling exercises while varying the type of word they have to unscramble and then measuring their actual behavior.
There are still questions about tests of this nature, such as the Implicit Association Test. It’s not clear that they measure what they purport to measure.
But I guess you will talk about that tomorrow.
These laboratory experiments are quite artificial. I’m not saying we learn nothing from them, but they are often misinterpreted.
Yes, it’s a great temptation to draw broader conclusions than the actual test results would warrant. This type of test only measures a subset of the factors that inform behavior.
Those who want to read ahead can look at rationality legend Philip Tetlock’s critique at the link (2004-current publications).
Carl, why...
So many excuses to appear to be politically correct. There are so many who are racist, but cannot admit it, and when these types of tests find them out they cry poor research tools.