(more detailed results, continued)
Question 4: Imagine that you are a doctor, and one of your patients suffers from migraine headaches that last about 3 hours and involve intense pain, nausea, dizziness, and hyper-sensitivity to bright lights and loud noises. The patient usually needs to lie quietly in a dark room until the headache passes. This patient has a migraine headache about 100 times each year. You are considering three medications that you could prescribe for this patient. The medications have similar side effects, but differ in effectiveness and cost. The patient has a low income and must pay the cost because her insurance plan does not cover any of these medications. Which medication would you be most likely to recommend?
Drug A: reduces the number of headaches per year from 100 to 30. It costs $350 per year.
Drug B: reduces the number of headaches per year from 100 to 50. It costs $100 per year.
Drug C: reduces the number of headaches per year from 100 to 60. It costs $100 per year.
This question is based on research on the decoy effect (aka “asymmetric dominance” or the “attraction effect”). Drug C is obviously worse than Drug B (it is strictly dominated by it) but it is not obviously worse than Drug A, which tends to make B look more attractive by comparison. This is normally tested by comparing responses to the three-option question with a control group that gets a two-option question (removing option C), but I cut a corner and only included the three-option question. The assumption is that more-biased people would make similar choices to unbiased people in the two-option question, and would be more likely to choose Drug B on the three-option question. The model behind that assumption is that there are various reasons for choosing Drug A and Drug B; the three-option question gives biased people one more reason to choose Drug B but other than that the reasons are the same (on average) for more-biased people and unbiased people (and for the three-option question and the two-option question).
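The dominance structure can be made explicit. An option strictly dominates another when it is at least as good on every attribute and strictly better on at least one; a small hypothetical check (encoding each drug as headaches remaining per year and annual cost, both taken from the question):

```python
# Each option: (headaches remaining per year, annual cost in dollars).
drugs = {"A": (30, 350), "B": (50, 100), "C": (60, 100)}

def dominates(x, y):
    """True if option x is at least as good as y on both attributes
    (fewer headaches, lower cost) and strictly better on at least one."""
    return x[0] <= y[0] and x[1] <= y[1] and x != y

print(dominates(drugs["B"], drugs["C"]))  # True: B strictly dominates the decoy C
print(dominates(drugs["A"], drugs["B"]))  # False: A trades higher cost for effectiveness
print(dominates(drugs["B"], drugs["A"]))  # False: neither dominates the other
```

This is exactly the asymmetry the decoy effect exploits: C is dominated by B but not by A, so adding C makes B look better without giving any comparable boost to A.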
Based on the discussion on the original survey thread, this assumption might not be correct. Cost-benefit reasoning seems to favor Drug A (and those with more LW exposure or higher intelligence might be more likely to run the numbers). Part of the problem is that I didn’t update the costs for inflation—the original problem appears to be from 1995 which means that the real price difference was over 1.5 times as big then.
I don’t know the results from the original study; I found this particular example online (and edited it heavily for length) with a reference to Chapman & Malik (1995), but after looking for that paper I see that it’s listed on Chapman’s CV as only a “published abstract”.
49% of LWers chose Drug A (the option favored by unbiased cost-benefit reasoning), vs. 50% for Drug B (which benefits from the decoy effect) and 1% for Drug C (the decoy). There was a strong effect of LW exposure: 57% of those in the top third chose Drug A vs. only 44% of those in the bottom third. Again, this gap remained nearly the same when controlling for Intelligence (shrinking from 14 points to 13), and differences in Intelligence were associated with a similarly sized effect: 59% for the top third vs. 44% for the bottom third.
original study: ??
weakly-tied LWers: 44%
strongly-tied LWers: 57%
Question 5: Get a random three digit number (000-999) from http://goo.gl/x45un and enter the number here.
Treat the three digit number that you just wrote down as a length, in feet. Is the height of the tallest redwood tree in the world more or less than the number that you wrote down?
What is your best guess about the height of the tallest redwood tree in the world (in feet)?
This is an anchoring question; if there are anchoring effects then people’s responses will be positively correlated with the random number they were given (and a regression analysis can estimate the size of the effect to compare with published results, which used two groups instead of a random number).
Asking a question with the answer in feet was a mistake which generated a great deal of controversy and discussion. Dealing with unfamiliar units could interfere with answers in various ways so the safest approach is to look at only the US respondents; I’ll also see if there are interaction effects based on country.
The question is from a paper by Jacowitz & Kahneman (1995), who provided anchors of 180 ft. and 1200 ft. to two groups and found mean estimates of 282 ft. and 844 ft., respectively. One natural way of expressing the strength of an anchoring effect is as a slope (change in estimates divided by change in anchor values), which in this case is 562/1020 = 0.55. However, that study did not explicitly lead participants through the randomization process like the LW survey did. The classic Tversky & Kahneman (1974) anchoring question did use an explicit randomization procedure (spinning a wheel of fortune; though it was actually rigged to create two groups) and found a slope of 0.36. Similarly, several studies by Ariely & colleagues (2003) which used the participant’s Social Security number to explicitly randomize the anchor value found slopes averaging about 0.28.
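The slope arithmetic above can be checked in a couple of lines (all numbers are the ones quoted from Jacowitz & Kahneman; nothing here is new data):

```python
def anchoring_slope(low_anchor, high_anchor, low_estimate, high_estimate):
    """Anchoring slope: change in mean estimate divided by change in anchor."""
    return (high_estimate - low_estimate) / (high_anchor - low_anchor)

# Jacowitz & Kahneman (1995): anchors of 180 ft and 1200 ft,
# mean estimates of 282 ft and 844 ft.
slope = anchoring_slope(180, 1200, 282, 844)
print(round(slope, 2))  # 0.55
```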
There was a significant anchoring effect among US LWers (n=578), but it was much weaker, with a slope of only 0.14 (p=.0025). That means that getting a random number that is 100 higher led to estimates that were 14 ft. higher, on average. LW exposure did not moderate this effect (p=.88); looking at the pattern of results, if anything the anchoring effect was slightly higher among the top third (slope of 0.17) than among the bottom third (slope of 0.09). Intelligence did not moderate the results either (slope of 0.12 for both the top third and bottom third). It’s not relevant to this analysis, but in case you’re curious, the median estimate was 350 ft. and the actual answer is 379.3 ft. (115.6 meters).
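With a continuous random anchor instead of two groups, the slope reported above is just the ordinary-least-squares coefficient of estimates regressed on anchor values. A minimal sketch on made-up data (not the survey responses), constructed with a built-in slope of 0.14:

```python
from statistics import mean

def ols_slope(anchors, estimates):
    """Least-squares slope of estimates regressed on anchor values."""
    ma, me = mean(anchors), mean(estimates)
    num = sum((a - ma) * (e - me) for a, e in zip(anchors, estimates))
    den = sum((a - ma) ** 2 for a in anchors)
    return num / den

# Toy data: estimates drift up 0.14 ft per extra foot of anchor.
anchors = [100, 300, 500, 700, 900]
estimates = [330, 358, 386, 414, 442]
print(round(ols_slope(anchors, estimates), 2))  # 0.14
```

On real data the points scatter around the line, but the slope is interpreted the same way: a value of 0.14 means a random number 100 higher shifts estimates up by 14 ft on average.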
Among non-US LWers (n=397), the anchoring effect was slightly smaller in magnitude compared with US LWers (slope of 0.08), and not significantly different from the US LWers or from zero.
original study: slope of 0.55 (0.36 and 0.28 in similar studies)
weakly-tied LWers: slope of 0.09
strongly-tied LWers: slope of 0.17
If we break the LW exposure variable down into its 5 components, every one of the five is strongly predictive of lower susceptibility to bias. We can combine the first four CFAR questions into a composite measure of unbiasedness by taking the percentage of questions on which a person gave the “correct” answer (the answer suggestive of lower bias). Each component of LW exposure is correlated with lower bias on that measure, with r ranging from 0.18 (meetup attendance) to 0.23 (LW use), all p < .0001 (time per day on LW is uncorrelated with unbiasedness, r=0.03, p=.39). For the composite LW exposure variable the correlation is 0.28; another way to express this relationship is that people one standard deviation above average on LW exposure got 75% of CFAR questions “correct”, while those one standard deviation below average got 61% “correct”. Alternatively, focusing on sequence-reading, the accuracy rates were:
75% Nearly all of the Sequences (n = 302)
70% About 75% of the Sequences (n = 186)
67% About 50% of the Sequences (n = 156)
64% About 25% of the Sequences (n = 137)
64% Some, but less than 25% (n = 210)
62% Know they existed, but never looked at them (n = 19)
57% Never even knew they existed until this moment (n = 89)
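The composite unbiasedness measure is just each respondent’s fraction of “correct” answers across the four questions, correlated with an exposure score. A toy sketch of that computation with made-up numbers (not the survey data):

```python
from statistics import mean

# Illustrative made-up data: answers[i][q] = 1 if respondent i gave the
# "correct" (low-bias) answer to CFAR question q.
answers = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
exposure = [3.0, 2.5, 1.0, 4.0, 0.5]  # hypothetical composite LW-exposure scores

# Composite unbiasedness = fraction of the four questions answered "correctly".
unbiasedness = [sum(row) / len(row) for row in answers]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

print(round(pearson_r(exposure, unbiasedness), 2))
```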
Another way to summarize: on 4 of the 5 questions (all but question 4 on the decoy effect) we can make comparisons to the results of previous research, and in all 4 cases LWers were much less susceptible to the bias or reasoning error. On 1 of the 5 questions (question 2 on temporal discounting) there was a ceiling effect which made it extremely difficult to find differences within LWers; on 3 of the other 4, LWers with a strong connection to the LW community were much less susceptible to the bias or reasoning error than those with weaker ties.
REFERENCES
Ariely, Loewenstein, & Prelec (2003), “Coherent Arbitrariness: Stable demand curves without stable preferences”
Chapman & Malik (1995), “The attraction effect in prescribing decisions and consumer choice”
Jacowitz & Kahneman (1995), “Measures of Anchoring in Estimation Tasks”
Kirby (2009), “One-year temporal stability of delay-discount rates”
Toplak & Stanovich (2002), “The Domain Specificity and Generality of Disjunctive Reasoning: Searching for a Generalizable Critical Thinking Skill”
Tversky & Kahneman (1974), “Judgment under Uncertainty: Heuristics and Biases”
I think this might just be due to the fact that the meme that “time is money” has been repeatedly expounded on LW, rather than because long-time LWers are less prone to the decoy effect. All the rot13ed discussions about that question immediately identified Drug C as a decoy and focused on whether a low-income person should be willing to pay $12.50 to be spared a three-hour headache, with a sizeable minority arguing that they shouldn’t. I’d look at the income and country of people who chose each drug; my guess is the main effect is what each respondent took “low income” to mean.
“time is money” seems to me a pretty common and natural way to think if you live in a society whose workers tend to be paid hourly, whether you’re new to LW or not.
Even people nominally paid hourly often cannot freely choose how many and which hours to work. (With unemployment rates as high as there are now in much of the western world, employers have more bargaining power than workers, etc.) It’s not like if I got a headache this evening, I could say “rather than having a three-hour headache, I’ll take this $12.50 drug which will stop it, work two hours and earn $20, and then have fun for one hour”.
Exactly. In South Africa that $350 could represent 16% or more of a possible yearly salary in some of our poorer areas.
Okay, now I’m confused. When I did this question, I remember I ignored C as being strictly dominated by B and pulled out a calculator. When I saw this question in the analysis, I did the same thing before scrolling down. Here’s what I got:
Drug A saves you from 70 headaches at $350/yr, for a cost of $5 per averted headache. Drug B saves you from 50 headaches at a cost of $100/yr, for a cost of $2 per averted headache.
This seems to contradict your statement “Cost-benefit reasoning seems to favor Drug A”. Drug A has a higher cost per prevented headache according to my calculations, which would make Drug B the better one. Am I failing at basic arithmetic, or misunderstanding the question, or what? Please help.
EDIT: I was solving the wrong problem, and a bunch of people showed me why. Thanks for the explanations! I’m glad I got to learn where I was wrong.
Since each drug only reduces the number of headaches to a certain number, cost per headache isn’t the right way to look at it. Compare a drug that reduces the headaches to 99/year and costs $0, to a drug that eliminates the headaches completely for $1.
Instead of comparing the cost per headache, it’s better to assign a value to time and calculate the net benefit or harm of each drug. If we assume one hour of time is valued at $7.25 (the US federal minimum wage), and use the stated information that each headache lasts three hours, the free drug nets you 1×3×$7.25 − $0 = $21.75, drug A nets 70×3×$7.25 − $350 = $1,172.50, and drug B nets 50×3×$7.25 − $100 = $987.50.
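That arithmetic can be written out directly (the $7.25/hour valuation is the commenter’s assumption, not part of the problem):

```python
HEADACHE_HOURS = 3
HOURLY_VALUE = 7.25  # assumed value of an hour: the US federal minimum wage

def net_benefit(headaches_averted, annual_cost, hourly_value=HOURLY_VALUE):
    """Dollar value of headache time saved minus the drug's annual cost."""
    return headaches_averted * HEADACHE_HOURS * hourly_value - annual_cost

print(net_benefit(70, 350))  # Drug A: 1172.5
print(net_benefit(50, 100))  # Drug B: 987.5
print(net_benefit(1, 0))     # the hypothetical free drug: 21.75
```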
That’s not a good way of looking at severe pain. People often will do long hours of mind-numbing tasks in order to prevent real or imaginary future short-term discomfort, like working out to get in shape for a one-time event.
You’re right; I was generalizing from my experiences with migraines, where the pain goes away if I’m lying in a quiet, dark room.
Assuming I did the math right, it seems that folks valuing their time at more than $4.17 an hour should prefer drug A, and those valuing it at less should prefer drug B. To really make this unambiguous, “low income” needs to be defined; assuming it’s at least minimum wage, drug A wins pretty clearly...
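The break-even figure comes from the marginal comparison between A and B: the extra $250/year buys 20 fewer headaches, i.e. $12.50 per headache, or about $4.17 per headache-hour:

```python
# Marginal comparison of Drug A over Drug B (numbers from the question).
extra_cost = 350 - 100             # dollars per year
extra_headaches_averted = 50 - 30  # A leaves 30/yr, B leaves 50/yr
hours_per_headache = 3

breakeven_per_headache = extra_cost / extra_headaches_averted
breakeven_per_hour = breakeven_per_headache / hours_per_headache
print(round(breakeven_per_hour, 2))  # 4.17: prefer A if an hour is worth more
```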
I think I did the wrong math ($ per headache saved) when taking the actual survey, sadly...
You’re right about the cost per averted headache, but we aren’t trying to minimize the cost per averted headache; otherwise we wouldn’t use any drug. We’re trying to maximize utility. Unless avoiding several hours of a migraine is worth less to you than $5 (which a basic calculation using minimum wage would indicate that it is not, even excluding the unpleasantness of migraines—and as someone who gets migraines occasionally, I’d gladly pay a great deal more than $5 to avoid them), you should get Drug A.