[Link] How to Dispel Your Illusions
The topic and the problems associated with it are probably familiar to many of you already, but I think some of you may still find this review by Freeman Dyson of Daniel Kahneman's book Thinking, Fast and Slow interesting.
In 1955, when Daniel Kahneman was twenty-one years old, he was a lieutenant in the Israeli Defense Forces. He was given the job of setting up a new interview system for the entire army. The purpose was to evaluate each freshly drafted recruit and put him or her into the appropriate slot in the war machine. The interviewers were supposed to predict who would do well in the infantry or the artillery or the tank corps or the various other branches of the army. The old interview system, before Kahneman arrived, was informal. The interviewers chatted with the recruit for fifteen minutes and then came to a decision based on the conversation. The system had failed miserably. When the actual performance of the recruit a few months later was compared with the performance predicted by the interviewers, the correlation between actual and predicted performance was zero.
Kahneman had a bachelor’s degree in psychology and had read a book, Clinical vs. Statistical Prediction: A Theoretical Analysis and a Review of the Evidence by Paul Meehl, published only a year earlier. Meehl was an American psychologist who studied the successes and failures of predictions in many different settings. He found overwhelming evidence for a disturbing conclusion. Predictions based on simple statistical scoring were generally more accurate than predictions based on expert judgment.
A famous example confirming Meehl’s conclusion is the “Apgar score,” invented by the anesthesiologist Virginia Apgar in 1953 to guide the treatment of newborn babies. The Apgar score is a simple formula based on five vital signs that can be measured quickly: heart rate, breathing, reflexes, muscle tone, and color. It does better than the average doctor in deciding whether the baby needs immediate help. It is now used everywhere and saves the lives of thousands of babies. Another famous example of statistical prediction is the Dawes formula for the durability of marriage. The formula is “frequency of love-making minus frequency of quarrels.” Robyn Dawes was a psychologist who worked with Kahneman later. His formula does better than the average marriage counselor in predicting whether a marriage will last.
Having read the Meehl book, Kahneman knew how to improve the Israeli army interviewing system. His new system did not allow the interviewers the luxury of free-ranging conversations with the recruits. Instead, they were required to ask a standard list of factual questions about the life and work of each recruit. The answers were then converted into numerical scores, and the scores were inserted into formulas measuring the aptitude of the recruit for the various army jobs. When the predictions of the new system were compared to performances several months later, the results showed the new system to be much better than the old. Statistics and simple arithmetic tell us more about ourselves than expert intuition.
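(An aside from me, not part of Dyson's review: the "simple arithmetic" really is this simple. Below is a minimal Python sketch of the two scoring rules mentioned above; the argument names, units, and the "positive score predicts a durable marriage" reading are my own illustrative assumptions, not the clinical or research definitions.)

```python
# Illustrative sketch only -- argument names, units, and interpretations
# are assumptions for the sake of the example.

def apgar_score(heart_rate, breathing, reflexes, muscle_tone, color):
    """Sum of five vital signs, each rated 0, 1, or 2 (total 0-10)."""
    signs = [heart_rate, breathing, reflexes, muscle_tone, color]
    assert all(s in (0, 1, 2) for s in signs)
    return sum(signs)

def dawes_marriage_score(lovemaking_per_week, quarrels_per_week):
    """Frequency of love-making minus frequency of quarrels.
    A positive score predicts a durable marriage."""
    return lovemaking_per_week - quarrels_per_week
```

Kahneman's interview system worked the same way: a handful of factual answers turned into numbers and combined by fixed formulas, with no room for the interviewer's free-ranging intuition.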
Reflecting fifty years later on his experience in the Israeli army, Kahneman remarks in Thinking, Fast and Slow that it was not unusual in those days for young people to be given big responsibilities. The country itself was only seven years old. “All its institutions were under construction,” he says, “and someone had to build them.” He was lucky to be given this chance to share in the building of a country, and at the same time to achieve an intellectual insight into human nature. He understood that the failure of the old interview system was a special case of a general phenomenon that he called “the illusion of validity.” At this point, he says, “I had discovered my first cognitive illusion.”
Cognitive illusions are the main theme of his book. A cognitive illusion is a false belief that we intuitively accept as true. The illusion of validity is a false belief in the reliability of our own judgment. The interviewers sincerely believed that they could predict the performance of recruits after talking with them for fifteen minutes. Even after the interviewers had seen the statistical evidence that their belief was an illusion, they still could not help believing it. Kahneman confesses that he himself still experiences the illusion of validity, after fifty years of warning other people against it. He cannot escape the illusion that his own intuitive judgments are trustworthy.
An episode from my own past is curiously similar to Kahneman’s experience in the Israeli army. I was a statistician before I became a scientist. At the age of twenty I was doing statistical analysis of the operations of the British Bomber Command in World War II. The command was then seven years old, like the State of Israel in 1955. All its institutions were under construction. It consisted of six bomber groups that were evolving toward operational autonomy. Air Vice Marshal Sir Ralph Cochrane was the commander of 5 Group, the most independent and the most effective of the groups. Our bombers were then taking heavy losses, the main cause of loss being the German night fighters.
Cochrane said the bombers were too slow, and the reason they were too slow was that they carried heavy gun turrets that increased their aerodynamic drag and lowered their operational ceiling. Because the bombers flew at night, they were normally painted black. Being a flamboyant character, Cochrane announced that he would like to take a Lancaster bomber, rip out the gun turrets and all the associated dead weight, ground the two gunners, and paint the whole thing white. Then he would fly it over Germany, and fly so high and so fast that nobody could shoot him down. Our commander in chief did not approve of this suggestion, and the white Lancaster never flew.
The reason why our commander in chief was unwilling to rip out gun turrets, even on an experimental basis, was that he was blinded by the illusion of validity. This was ten years before Kahneman discovered it and gave it its name, but the illusion of validity was already doing its deadly work. All of us at Bomber Command shared the illusion. We saw every bomber crew as a tightly knit team of seven, with the gunners playing an essential role defending their comrades against fighter attack, while the pilot flew an irregular corkscrew to defend them against flak. An essential part of the illusion was the belief that the team learned by experience. As they became more skillful and more closely bonded, their chances of survival would improve.
When I was collecting the data in the spring of 1944, the chance of a crew reaching the end of a thirty-operation tour was about 25 percent. The illusion that experience would help them to survive was essential to their morale. After all, they could see in every squadron a few revered and experienced old-timer crews who had completed one tour and had volunteered to return for a second tour. It was obvious to everyone that the old-timers survived because they were more skillful. Nobody wanted to believe that the old-timers survived only because they were lucky.
At the time Cochrane made his suggestion of flying the white Lancaster, I had the job of examining the statistics of bomber losses. I did a careful analysis of the correlation between the experience of the crews and their loss rates, subdividing the data into many small packages so as to eliminate effects of weather and geography. My results were as conclusive as those of Kahneman. There was no effect of experience on loss rate. So far as I could tell, whether a crew lived or died was purely a matter of chance. Their belief in the life-saving effect of experience was an illusion.
The demonstration that experience had no effect on losses should have given powerful support to Cochrane’s idea of ripping out the gun turrets. But nothing of the kind happened. As Kahneman found out later, the illusion of validity does not disappear just because facts prove it to be false. Everyone at Bomber Command, from the commander in chief to the flying crews, continued to believe in the illusion. The crews continued to die, experienced and inexperienced alike, until Germany was overrun and the war finally ended.
Another theme of Kahneman’s book, proclaimed in the title, is the existence in our brains of two independent systems for organizing knowledge. Kahneman calls them System One and System Two. System One is amazingly fast, allowing us to recognize faces and understand speech in a fraction of a second. It must have evolved from the ancient little brains that allowed our agile mammalian ancestors to survive in a world of big reptilian predators. Survival in the jungle requires a brain that makes quick decisions based on limited information. Intuition is the name we give to judgments based on the quick action of System One. It makes judgments and takes action without waiting for our conscious awareness to catch up with it. The most remarkable fact about System One is that it has immediate access to a vast store of memories that it uses as a basis for judgment. The memories that are most accessible are those associated with strong emotions, with fear and pain and hatred. The resulting judgments are often wrong, but in the world of the jungle it is safer to be wrong and quick than to be right and slow.
System Two is the slow process of forming judgments based on conscious thinking and critical examination of evidence. It appraises the actions of System One. It gives us a chance to correct mistakes and revise opinions. It probably evolved more recently than System One, after our primate ancestors became arboreal and had the leisure to think things over. An ape in a tree is not so much concerned with predators as with the acquisition and defense of territory. System Two enables a family group to make plans and coordinate activities. After we became human, System Two enabled us to create art and culture.
If you’ve made it this far, read the rest of the review here. There is still some cool stuff after this.
The review was good, but some alarm bells went off when I read these paragraphs:
The idea that Freud can “penetrate deeper” (ha!) than Kahneman is (a) false and (b) epistemically harmful, because it promotes the idea that intuition and flowery prose are in some way better than empiricism at discovering truths about our psychology. And of course, the idea that modern psychology is unable to study strong emotions, or is worse at studying them than Freud was, is obviously false. Other than that, though, the review was a decent read.
My alarm bells went off much earlier:
Ok maybe I was wrong here:
Some cool stuff after this.
I don’t disagree; I rather think that you’re probably right here. But I would feel more comfortable and certain with a couple of solid examples behind this statement, and I can’t think of any. Can you?
Dan Ariely did some studies of decision making while aroused or not.
Modern clinical psychology has a lot to say about emotions, particularly strong negative ones like anxiety, grief, anger, addiction, and obsession (example: this guy studies anger). People like Daniel Gilbert have a lot to say about happiness.
I haven’t read Ariely’s research articles themselves, but I’ve seen this research summarized in Ariely’s (or maybe Wiseman’s) recent book. How is this a study of strong emotions? There’s much more to emotional states than shifting preferences. For all we know, there may be systematic changes in certain competencies, susceptibility to certain kinds of stimuli, even lasting personality changes.
Didn’t know about Gilbert and Denson, thanks for pointing out their work.
Yes that part made me cringe too.
Basically, the review would have been just as good if those paragraphs hadn’t been there; they seemed almost a non sequitur to my eyes. Was he trying to say something silly and clever to signal intelligence, to reach a certain word count, or had he just been reading about Freud recently and felt an overwhelming desire to talk about him?
If I were Straussian, I would almost argue he was trying to trick us by exploiting our System One with that nonsense (since our intuitions say our intuitions are reliable), hoping the clever ones remember the lesson! Heh. :)
I’m guessing the reviewer just had a particular axe to grind—he likes Freud and wanted to argue that Freudian ideas shouldn’t be seen as low-status.
I don’t think the article is nearly clever enough to exploit the reader like that, although it would certainly explain the, er, Freudian slip in the second paragraph.
This is the meat of his conclusion. I think Tetronian is right that the author enjoys the literary thinkers more than the scientific ones, even though those literary thinkers are likely to be relying heavily on System One.
Before that, the parallels he draws between James and Kahneman are vexing in their superficiality.
Hmmm, that gave me an idea.
Intuitions say intuitions are reliable; reason disagrees. Reason says that reason is often unreliable; intuition agrees. So we see that, in general, intuition and reason agree on what the best course of action is.
I’m doing Stanford’s online Machine Learning class right now, which makes me wonder whether these kinds of simple statistical methods could be made even more accurate using ML techniques. Of course, you then trade off accuracy against ease of understanding and/or manual calculation.
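For instance (a quick sketch of my own on made-up data, not anything from the course or from Kahneman's book), you can compare a unit-weighted, Dawes-style rule against a logistic regression with learned weights and see how much the extra flexibility actually buys:

```python
# Sketch: unit-weighted scoring rule vs. logistic regression on the same
# synthetic features.  All numbers here are made up for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))                   # five standardized "interview" scores
y = X.sum(axis=1) + rng.normal(size=n) > 0    # outcome loosely driven by their sum

# Dawes-style "improper" linear model: equal weights, threshold at zero.
unit_weighted_acc = ((X.sum(axis=1) > 0) == y).mean()

# Logistic regression with learned weights, scored by 5-fold cross-validation.
lr_acc = cross_val_score(LogisticRegression(), X, y, cv=5).mean()

print(f"unit-weighted rule accuracy: {unit_weighted_acc:.3f}")
print(f"logistic regression accuracy: {lr_acc:.3f}")
```

Dawes's later work on "improper linear models" suggests the equal-weight version often comes surprisingly close to the fitted one, which is part of why these simple rules are so hard to beat.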
Can anyone link me to a website that tests the accuracy of my confidence? I’m looking for the sort of test where you answer random questions, giving your estimated probability of being right, and preferably at the end it gives you a graph comparing your confidence estimates with the actual results; if the lines match perfectly, then you are perfectly calibrated (neither over- nor under-confident in your skills).
It’s not exactly what you requested, but the Test Your Calibration article might be helpful for quick feedback. Also, PredictionBook is a website that does something similar, with a calibration curve report built in, and it’s reported to be educational if used over time. :-)
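If you want to roll your own quick check, the core computation is just grouping your stated confidences into bins and comparing each bin's stated confidence with its hit rate. A minimal sketch (the (confidence, correct) pairs below are invented for illustration):

```python
# Sketch of a calibration check: group answers by stated confidence and
# compare each group's stated confidence with the fraction actually correct.
# The (confidence, correct) pairs are invented for illustration.
from collections import defaultdict

answers = [(0.6, True), (0.9, True), (0.7, False), (0.5, True),
           (0.8, True), (0.9, False), (0.6, True), (0.7, False)]

bins = defaultdict(list)
for confidence, correct in answers:
    bins[round(confidence, 1)].append(correct)   # group into 10% bins

for stated in sorted(bins):
    outcomes = bins[stated]
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated ~{stated:.0%}: actually right {hit_rate:.0%} ({len(outcomes)} answers)")
```

If you're well calibrated, the "stated" and "actually right" numbers track each other; a consistent gap in one direction means over- or under-confidence.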
Thanks!
It seems like it would be enormously valuable to compile scoring rules like this in one place so that people who might be able to benefit from them could easily find them.
Hm. I imagine it would be relatively low-cost to assemble a list, but I don’t know how valuable it would be. You probably don’t need access to a rule that tells you when to release a psychiatric patient, for example (one of the other examples of statistical prediction rules, SPRs, I’ve seen).
The useful ones would be things like Dawes’s rule, or another one (whose name I don’t remember) which says you need 4-5 times as many positive interactions as negative ones in a relationship. I wonder how resistant those are to gaming, though: if an incompatible couple deliberately delays fights or artificially increases love-making or compliment-giving, they’re probably still more likely to break up than a couple who naturally has that level of compliments, love-making, and fighting.
It doesn’t seem to me that most incompatible couples would be able to artificially delay fights (by willpower alone) over the long run; I won’t speculate on whether the other variable could be artificially manipulated.