The meta-analysis probably has something like Simpson’s paradox in play, at the very least for the pain category, especially given the noted variability.
Some of the more recent research into placebo effects (Harvard has a very cool group studying them) has focused on the importance of ritual versus simple deception. In their work, even when participants knew they were taking a placebo, there was still an effect as long as it was delivered in a ritualized way.
So when someone takes a collection of hundreds of studies whose specific conditions vary, and just adds them all together looking for an effect (even while noting a broad spectrum of efficacy across the studies), that may not be the best basis to extrapolate from.
For example, given the following protocols, do you think they might have different efficacy for pain reduction, or that the results should be the same?
- Send patients home with sugar pills to take as needed for pain management
- Have a nurse come into the room with the pills in a little cup to be taken
- Have a nurse give an injection
Which of these protocols would be easiest and most cost-effective to include as the ‘placebo’?
If we grouped studies of placebo for pain by the intensiveness of the ritualized component, versus lumping them all into one aggregate and looking at the averages, might we see different results?
I’d be wary of reading too deeply into the meta-analysis you point to, and would recommend looking into the open-label placebo research from PiPS (Harvard’s Program in Placebo Studies), all of which IIRC postdates the meta-analysis.
Especially for pain, where we even know that giving someone an opioid blocker prevents placebo pain reduction (Levine et al., 1978), the idea that “it doesn’t exist” because of a single very broad analysis seems potentially gravely mistaken.
I’d be interested if you have a toy example showing how Simpson’s paradox could have an impact here.
I assume that ‘has a placebo / doesn’t have a placebo’ is a binary variable, and I also assume that the number of people in each arm of each experiment is the same. I can’t really see how you would end up with Simpson’s paradox with that setup.
It’s not exactly Simpson’s, but we don’t even need a toy model: their updated analysis highlights details in line with exactly what I described above (down to tying in earlier PiPS research), and describes precisely the issue with pooled results across different subgroupings of placebo interventions:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7156905/
It can be difficult to interpret whether a pooled standardised mean difference is large enough to be of clinical relevance. A consensus paper found that an analgesic effect of 10 mm on a 100 mm visual analogue scale represented a ‘minimal effect’ (Dworkin 2008). The pooled effect of placebo on pain based on the four German acupuncture trials corresponded to 16 mm on a 100 mm visual analogue scale, which amounts to approximately 75% of the effect of non‐steroidal anti‐inflammatory drugs on arthritis‐related pain (Gøtzsche 1990). However, the pooled effect of the three other pain trials with low risk of bias corresponded to 3 mm. Thus, the analgesic effect of placebo seems clinically relevant in some situations and not in others.
Putting subgroups with a physical intervention, where there’s a 16/100 result (against the 10/100 threshold for a clinically relevant effect), in with subgroups where there’s a 3/100 result, and then only looking at the pooled result, might lead someone to think “there’s no significant effect”, as happened with OP, even though there’s clearly a meaningful effect for one subgroup when they aren’t pooled.
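To make that dilution concrete, here’s a minimal numeric sketch. The 16 mm and 3 mm subgroup means and the 10 mm threshold come from the quoted passage; the patient counts are invented purely for illustration, and the simple n-weighted pooling stands in for a real meta-analytic weighting:

```python
# Hypothetical sketch: pooling a clinically relevant subgroup with a
# larger null-ish subgroup can drag the pooled mean under the threshold.
# Subgroup means (16 mm, 3 mm) and the 10 mm 'minimal effect' threshold
# are from the quoted Cochrane discussion; the n values are invented.

subgroups = [
    ("physical placebo (acupuncture trials)", 16.0, 400),
    ("other low-risk-of-bias pain trials", 3.0, 1200),
]
MINIMAL_EFFECT_MM = 10.0  # Dworkin 2008 threshold on a 100 mm VAS

total_n = sum(n for _, _, n in subgroups)
pooled = sum(mean * n for _, mean, n in subgroups) / total_n

for name, mean, _ in subgroups:
    status = "clinically relevant" if mean >= MINIMAL_EFFECT_MM else "below threshold"
    print(f"{name}: {mean} mm ({status})")

print(f"pooled: {pooled:.2f} mm")
# pooled = (16*400 + 3*1200) / 1600 = 6.25 mm: below the 10 mm threshold,
# even though the physical-placebo subgroup alone clearly exceeds it.
```

With these made-up weights the pooled estimate is 6.25 mm, so a reader looking only at the aggregate would conclude “no clinically relevant effect” despite one subgroup sitting well above the threshold.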
This is part of why in the discussion they explicitly state:
However, our findings do not imply that placebo interventions have no effect. We found an effect on patient‐reported outcomes, especially on pain. Several trials of low risk of bias reported large effects of placebo on pain, but other similar trials reported negligible effect of placebo, indicating the importance of background factors. We identified three clinical factors that were associated with higher effects of placebo: physical placebos...
Additionally, the criticism they raise in their implications section, that there were no open-label placebo data, is no longer true; that is the research I was pointing OP towards.
The problem here is that the aggregate analysis, taken at face value, presents a very different result from a detailed review of the subgroups, particularly along the physical vs pharmacological placebo split, and all of this has been explored further in research since this analysis.
Yeah, the Cochrane meta-analysis aggregates a bunch of heterogeneous studies, so the aggregated results are confusing to analyze. The unfortunate reality is that it is complicated to get a complete picture; one may have to look at the individual studies one by one to truly come to a complete understanding of the literature.
Yes, and I did look at something like four of the individual studies of depression, focusing on the ones testing pills so they would be comparable to the Prozac trial. As I said in the post, they all gave me the same impression: I didn’t see a difference between the placebo and no-pill groups. So it was surprising to see the summary value of −0.25 SMD. Maybe it’s some subtle effect in the studies I looked at which you can see once you aggregate. But maybe it’s heterogeneity, and the effect is coming from the studies I didn’t look at. As I mentioned in the post, not all of the placebo interventions were pills.
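For what it’s worth, the “subtle effect you can only see once you aggregate” scenario is statistically plausible: a true −0.25 SMD is typically invisible in any single small trial but detectable after pooling. A rough sketch, using hypothetical trial sizes and the standard large-sample approximation for the standard error of an SMD:

```python
import math

# Hypothetical illustration: a true SMD of d = -0.25 with n patients per
# arm. Approximate SE of an SMD with equal arms of size n:
#   SE(d) = sqrt(2/n + d**2 / (4*n))
d = -0.25
n_per_arm = 30   # invented single-trial size
k_studies = 20   # invented number of pooled trials

se_single = math.sqrt(2 / n_per_arm + d**2 / (4 * n_per_arm))
# Fixed-effect pooling of k identical trials shrinks the SE by sqrt(k).
se_pooled = se_single / math.sqrt(k_studies)

print(f"one trial:          95% CI half-width = {1.96 * se_single:.2f}")
print(f"pooled ({k_studies} trials): 95% CI half-width = {1.96 * se_pooled:.2f}")
# ~0.51 for a single trial (swallows a 0.25 effect, so each study looks
# null), versus ~0.11 pooled (small enough to resolve the effect).
```

So eyeballing four individual pill trials and seeing nothing is entirely consistent with a real pooled −0.25, though of course it’s also consistent with the effect coming from heterogeneity in the studies not examined.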