One of the distinctions Guzey made that I think was important was the difference between work-related fatigue and sleep deprivation. Many SD studies are on resident doctors at the end of 24-hour shifts, who are experiencing both work-related fatigue from a notoriously taxing job and sleep deprivation. As such, if we care about SD as opposed to fatigue, such studies are hopelessly confounded with respect to our specific research question.
I do think that that’s an important distinction. Note that most of the studies included in the sleep restriction meta-analyses I quoted (all but one, in fact) are not on resident doctors, and, as far as I know, they pretty much exclusively examine the effect of sleep restriction (sleeping fewer than 6 or so hours per night) rather than the effect of staying awake for an abnormally long time.
I didn’t look at the meta-analyses you cite here. The ones I did look at, though (Pilcher and Huffcutt on cognitive impairment, and Irwin, Olmstead, and Carroll on inflammation), surprised me with how unconvincing they were when I looked under the hood.
I did actually cite Irwin, but note that my conclusion from it was that sleep duration does not impact inflammation.
When Guzey posted his “Theses on Sleep,” I spent a lot of time going through meta-analyses and considering whether the underlying studies were sound. I talked about what I found in the comments there, and I wonder if you got a chance to read them?
Are you referring to your reply to my comment on Guzey’s post? If so, I did, and I believe that your concerns do not extend to the meta-analyses I cited here (because, as I said, they focus on the effects of sleep restriction rather than staying awake a very long time, and largely do not include physicians). Note that I did not cite Pilcher and Huffcutt, and did not bring it up as evidence, anywhere in this post.
(however, as I acknowledge in the post, these meta-analyses I did cite can have many possible problems & are not conclusive evidence)
I didn’t notice that it was you I was originally responding to—I apologize for the oversight! I also want to emphasize that I agree with you on some of your responses to Guzey. I think a lot of his arguments are weak, his Reddit- and self-supplied supporting evidence shouldn’t be stacked up against peer-reviewed controlled sleep studies, and some of the argumentation comes off as a conspiratorial strawman (e.g. “At this point, I’m pretty sure that the entire “not sleeping ‘enough’ makes you stupid” is a 100% psyop.”).
In the first of the meta-analyses you posted (Lowe, Safati, and Hall), I see some supporting evidence for your position, and some complicating factors. From the abstract:
This effect held for executive functioning (g = −0.324, p < 0.001), sustained attention (g = −0.409, p < 0.001), and long-term memory (g = −0.192, p = 0.002). There was insufficient evidence to detect an effect within the domains of attention, multitask, impulsive decision-making or intelligence.
So first, let’s acknowledge that they found significant, moderate effects in two areas that we may very well care quite a lot about! However, the long-term memory effect would be conventionally categorized as “small, not visible to the naked eye.” As a note, it’s hard to interpret effect sizes intuitively. They also point out that these effect sizes are highly variable across studies.
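(Since effect sizes are hard to read intuitively, one common translation is to ask where the average sleep-restricted participant would land in the control group's distribution, assuming roughly normal scores. The sketch below is just an illustration of how to read a Hedges' g, not an additional result from the meta-analysis.)

```python
from scipy.stats import norm

# Cohen's U3-style reading: the control-group percentile at which the average
# sleep-restricted participant would score, assuming normally distributed outcomes.
for domain, g in [("executive functioning", -0.324),
                  ("sustained attention", -0.409),
                  ("long-term memory", -0.192)]:
    print(f"{domain}: g = {g:+.3f} -> average restricted participant at control percentile {norm.cdf(g) * 100:.0f}")
# roughly percentiles 37, 34, and 42, respectively
```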
Let’s also talk publication bias. They run Egger’s regression test on a funnel plot and do find evidence of bias (p < 0.001 for the overall effect).
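(For readers unfamiliar with it, Egger’s test regresses each study’s standardized effect (the effect divided by its standard error) on its precision (1/SE); an intercept that differs significantly from zero signals funnel-plot asymmetry. Here is a minimal sketch on made-up data, not the actual study-level estimates from Lowe, Safati, and Hall.)

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Made-up study-level data: true effect g = -0.3, but small studies (large SE)
# only get "published" when their estimate looks impressive.
se = rng.uniform(0.05, 0.40, size=200)
g = rng.normal(-0.3, se)
keep = (se < 0.15) | (g < -0.5)
g, se = g[keep], se[keep]

# Egger's regression test: standardized effect vs. precision; examine the intercept.
fit = sm.OLS(g / se, sm.add_constant(1 / se)).fit()
print(f"intercept = {fit.params[0]:.2f}, p = {fit.pvalues[0]:.4f}")  # small p suggests asymmetry
```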
Limitations include the possibility of publication bias, which was observed across almost all cognitive domains...
However, the impact of such publication bias on the effect appears to be minimal as another 75 studies with an effect size <0.0 would have to be added to result in a small overall effect size (g < −0.200).
Read literally, I’m not sure what to make of this caveat. First, isn’t it the number of subjects, rather than the number of studies, that’s relevant here? Perhaps they mean “average-size” studies? Second, I don’t know what they mean by “with an effect size <0.0.” Effect sizes below zero (which is what this refers to) are exactly the ones that show a cognitive impairment. To give a precise statement on the number of (subjects or average-sized studies) needed to bring the total effect size up to g = −0.2, we’d need to know the specific effect size those added studies would have. As written, the statement just doesn’t make sense mathematically. I’ve read meta-analyses in which the researchers at least try to find unpublished work, and it’s disappointing that the authors not only don’t do that here, but write about the issue in a way that seems to show a lack of care.
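To illustrate why the assumed effect size of the missing studies matters, with made-up numbers: how far 75 added studies pull the unweighted pooled mean toward zero depends entirely on the effect size those studies are assumed to have. (The study count and pooled effect below are hypothetical, not values from Lowe, Safati, and Hall.)

```python
# Hypothetical illustration: k observed studies averaging g_obs, plus 75 added
# studies whose assumed average effect we vary.
k, g_obs = 45, -0.35   # made-up numbers, for illustration only

for g_missing in (0.0, -0.10, -0.19):
    pooled = (k * g_obs + 75 * g_missing) / (k + 75)
    print(f"assumed effect of added studies {g_missing:+.2f} -> pooled g = {pooled:+.3f}")
# 0.00 -> -0.131, -0.10 -> -0.194, -0.19 -> -0.250
```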
Most importantly, median and average sleep across studies were 4.3 and 4.6 hours, respectively—and this includes a few studies on young children, such as one in which “sleep deprivation” meant 8.1 hours of sleep. So the effects found by this meta-analysis, which seem relatively modest to me, required an amount of sleep deprivation that seems rather extreme.
So we have to ask two questions.
Is there some meta-analytic evidence of a statistically significant effect of an average of 3.5 hours of SD on at least some measures of cognitive function? The answer is yes.
Does that evidence paint a picture of an effect that warrants concern about a more modest 1-2 hour SD regimen over the long term, considering the effect sizes and the scope of impairment across cognitive domains? With regard to this specific research question, the meta-analysis is both weak (given the limited effect sizes at such extreme levels of SD) and flawed (given the evidence of publication bias).
There’s no perfect meta-analysis, and despite the flaws I see in this one, it’s still useful to investigate. But when I think through its shortcomings, it does not leave me worried about a consistent 6 hours of sleep per night, or even the occasional 4-5 hour night of sleep.
And note that these are the results without making any effort to optimize for a shortened sleep schedule. What if you had a coach to help establish a rejuvenating 5-hour long-term sleep regimen? What about modafinil? Does performance continue to degrade indefinitely over time, or do you eventually just get used to it? What if you’re motivated to do whatever activity is keeping you awake, rather than doing a bunch of psych tests in a sleep lab? Is sleep debt a thing, or can you balance out five 5-hour nights of sleep with two 9-hour nights, netting a total of 6 extra waking hours per week, or roughly 16 extra waking days per year?
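(For what it's worth, that last bit of arithmetic roughly checks out if we assume a 7-hour-per-night baseline and count a "waking day" as the 19 waking hours of a 5-hour-sleep night.)

```python
# Back-of-the-envelope check, assuming a 7 h/night baseline and a 19-hour "waking day".
baseline_sleep = 7 * 7                 # 49 h of sleep per week at 7 h/night
restricted_sleep = 5 * 5 + 2 * 9       # five 5-hour nights + two 9-hour nights = 43 h
extra_per_week = baseline_sleep - restricted_sleep   # 6 extra waking hours per week
extra_per_year = extra_per_week * 52                 # 312 extra waking hours per year
print(extra_per_year / 19)             # ~16.4 "waking days" per year
```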
I’m not complaining at all that these questions aren’t addressed in this meta-analysis. But the point is that the deck is somewhat stacked in favor of the “SD bad” hypothesis. That should be kept in the back of our minds when we take the additional step of synthesizing our interpretation of a particular meta-analysis into our overall view of, as Walker puts it, why we sleep.
Read literally, I’m not sure what to make of this caveat. First, isn’t it the number of subjects, rather than the number of studies, that’s relevant here? Perhaps they mean “average-size” studies? Second, I don’t know what they mean by “with an effect size <0.0.” Effect sizes below zero (which is what this refers to) are exactly the ones that show a cognitive impairment. To give a precise statement on the number of (subjects or average-sized studies) needed to bring the total effect size up to g = −0.2, we’d need to know the specific effect size those added studies would have. As written, the statement just doesn’t make sense mathematically. I’ve read meta-analyses in which the researchers at least try to find unpublished work, and it’s disappointing that the authors not only don’t do that here, but write about the issue in a way that seems to show a lack of care.
The article specifies that it used Orwin’s fail-safe N to calculate the number of missing studies required to reach a small effect size. It’s not as good as the standard trim-and-fill method I’ve seen a lot in meta-analyses, but it does make mathematical sense and it does provide evidence on the question.
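For context on how that calculation works: Orwin's fail-safe N fixes the ambiguity by assuming a specific mean effect for the missing studies (typically zero) and solving for how many such studies would shrink the pooled mean to a chosen criterion. A minimal sketch with hypothetical inputs, not the values reported by Lowe, Safati, and Hall:

```python
def orwin_failsafe_n(k, mean_g, criterion_g, missing_g=0.0):
    """Number of additional studies with mean effect `missing_g` needed for the
    unweighted pooled mean to shrink to `criterion_g`; solves
    (k*mean_g + n*missing_g) / (k + n) = criterion_g for n, using magnitudes."""
    return k * (abs(mean_g) - abs(criterion_g)) / (abs(criterion_g) - abs(missing_g))

# Hypothetical inputs, for illustration only:
print(orwin_failsafe_n(k=45, mean_g=-0.35, criterion_g=-0.20))   # ~34 null studies needed
```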
So first, let’s acknowledge that they found significant, moderate effects in two areas that we may very well care quite a lot about! However, the long-term memory effect would be conventionally categorized as “small, not visible to the naked eye.”
I think the small effect sizes are not as important for predicting overall cognition as the larger ones. If damage happens to a specific part of the brain and not others (as in e.g. most non-fatal strokes), that causes a lot more functional impairment than you would expect if you focused on all of the many parts of the brain that weren’t damaged.
They also point out that these effect sizes are highly variable across studies.
A lot of the variability in the effect sizes is caused by perfectly reasonable things like variations in sleep deficit and cumulative days of restricted sleep, as I explicitly pointed out in the post, and the type of cognitive test used is probably responsible for a substantial fraction of the variability too.
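One standard way to see how much of the variability those moderators explain is a meta-regression: regress the study-level effect sizes on sleep deficit, days of restriction, and so on, weighting by inverse variance. A rough sketch on simulated data (the moderators mirror the ones mentioned above, but every number here is invented):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 60   # invented number of studies

# Simulated study-level data: effects grow more negative with larger sleep
# deficits and more cumulative days of restriction, plus sampling noise.
sleep_deficit_h = rng.uniform(1, 5, n)
days_restricted = rng.integers(1, 15, n)
se = rng.uniform(0.05, 0.30, n)
g = -0.05 * sleep_deficit_h - 0.01 * days_restricted + rng.normal(0, se)

# Inverse-variance-weighted meta-regression (fixed-effect flavor, for simplicity).
X = sm.add_constant(np.column_stack([sleep_deficit_h, days_restricted]))
print(sm.WLS(g, X, weights=1 / se**2).fit().params)   # intercept and moderator slopes
```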
Most importantly, median and average sleep across studies were 4.3 and 4.6 hours, respectively—and this includes a few studies on young children, such as one in which “sleep deprivation” meant 8.1 hours of sleep. So the effects found by this meta-analysis, which seem relatively modest to me, required an amount of sleep deprivation that seems rather extreme.
The meta-analysis itself tells you the average sleep deficit across the studies compared with age-adjusted sleep recommendations, as I pointed out in my post, and that number is probably more relevant than those averages.
when I think through its shortcomings, it does not leave me worried about a consistent 6 hours of sleep per night, or even the occasional 4-5 hour night of sleep.
Note that one of my meta-analyses found that (the absolute value of) the effect size of sleep restriction on attention is higher when the comparison group undergoes sleep extension than when it undergoes no intervention. This is at least weak evidence that the dose-response relationship between sleep duration and cognition doesn’t flatten out once you get to the average amount of time people sleep.
(This is consistent with my own experience; I started taking less time on assignments and getting better grades in college after I started forcing myself to sleep longer than my previous higher-than-average baseline.)
I think the small effect sizes are not as important for predicting overall cognition as the larger ones. If damage happens to a specific part of the brain and not others (as in e.g. most non-fatal strokes), that causes a lot more functional impairment than you would expect if you focused on all of the many parts of the brain that weren’t damaged.
If I’m understanding you correctly, you’re saying that one or two statistically significant, moderate effect sizes are more worrisome than if this research had discovered statistically significant but small effect sizes in every category. You expect that functional impairment on real-world tasks would be serious and wide-ranging as a consequence of moderate impairment in a single cognitive domain.
First, I think this is a great crux, and highlights the importance of carefully framing a research question.
Second, I think we ought to consider a few possible functional performance scenarios:
1. Tasks that stretch cognitive capacity to the maximum extent, often by design. Examples: a chess championship or a math exam.
2. Tasks that typically use limited cognitive capacity, but in which mistakes can have severe consequences and rare emergencies demanding far more cognitive capacity than normal can occur at any time. Examples: jet pilots, soldiers on patrol.
3. Tasks that use limited cognitive capacity, allow for corrections and adjustments, and are primarily bottlenecked by time, skill, and material resources. Examples: most jobs, tasks of daily living, sailing competitions.
I think that scenarios (1) and (2) are the best candidates for the hypothesis that limited, modest cognitive deficits can produce outsized functional impairment, and scenario (3) is the best fit for the hypothesis that modest, limited cognitive deficits aren’t too concerning.
Some challenges might even be mixed. For example, we could imagine that in school coursework, learning is bottlenecked by hours of study in general. However, on exams, for well-prepared students, hours of sleep might become the limiting factor on their performance. Hence, an idealized study strategy might want to routinely sacrifice some sleep in exchange for more study time, but then increase sleep on the days leading up to an exam.
Under this mixed framework, a strategic sleep regimen would be flexible, based on the specific performance demands the person in question was facing.
Note that one of my meta-analyses found that (the absolute value of) the effect size of sleep restriction on attention is higher when the comparison group undergoes sleep extension than when it undergoes no intervention. This is at least weak evidence that the dose-response relationship between sleep duration and cognition doesn’t flatten out once you get to the average amount of time people sleep.
I’ll check into this when I have time—it’s interesting!
The meta-analysis itself tells you the average sleep deficit across the studies compared with age-adjusted sleep recommendations, as I pointed out in my post, and that number is probably more relevant than those averages.
The sleep times I referred to track closely with the average age-adjusted sleep deprivations, but you are right that it would have been more apt to choose the latter instead of the former. I simply find it more intuitive to gauge what X hours of sleep feels like than what Y hours of sleep deprivation feels like.
Edit:
As a followup, I’m not sure how to think about the relationship between single-factor cognitive effects and broad functional effects.
At first glance, it seems to me that if we hypothesize that a single-factor cognitive deficit corresponds to broad functional impairment, we’re essentially saying that you’re held back by whatever cognitive faculty is furthest below baseline. That in turn would indicate that any single-factor cognitive improvement obtained by sleep extension (e.g. alertness) would not correspond to a functional improvement, so long as the limiting cognitive factor is not among those improved by sleep extension.
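To make the bottleneck reading concrete, here is a toy sketch (entirely my own construction, with invented factor names and numbers) contrasting a "weakest-link" aggregation of cognitive factors with a simple additive one. Under the weakest-link model, raising a non-limiting factor like alertness leaves predicted functional performance unchanged; under the additive model it helps a little.

```python
# Toy contrast between the two aggregation hypotheses; the numbers are invented.
baseline = {"executive_function": 0.7, "sustained_attention": 0.6, "alertness": 0.9}
extended = {**baseline, "alertness": 1.0}   # sleep extension improves alertness only

def weakest_link(factors):   # held back by the faculty furthest below baseline
    return min(factors.values())

def additive(factors):       # every factor contributes independently
    return sum(factors.values()) / len(factors)

print(weakest_link(baseline), weakest_link(extended))   # 0.6 -> 0.6: no functional gain
print(additive(baseline), additive(extended))           # ~0.73 -> ~0.77: modest gain
```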
On the other hand, it could be that, in either direction, the cognitive factors that are harmed/helped by sleep adjustments happen to be the load-bearing factors for most activities. Maybe executive function and sustained attention are the limiting cognitive resource for most people at baseline, and alertness is the typical limiting cognitive resource above that. If that were true, we’d expect to see broad functional improvements or impairments across wider ranges of sleep adjustment.
I’m most inclined toward the idea that tasks have a heterogeneous relationship with particular cognitive capacities, and thus with the impact of sleep. This suggests that people can reap the most benefit by tailoring their sleep schedule to their particular requirements.
Busy parents with regular jobs that aren’t too cognitively demanding or hazardous might benefit most from a couple extra hours of wakefulness every day to get things done, and indeed, they often seem to opt for that strategy. By contrast, airlines apparently have strict policies for how long pilots must sleep before a shift, and I’ve also heard an anecdote from a biologist in a BSL-4 facility that they don’t do inoculations unless they’re in absolutely tip-top cognitive shape. If they slept badly or woke up with a crick in their neck, they reschedule.
My guess is that many people are modestly strategic about their sleep budget. But if this hypothesis is correct, there may be substantial gains to be reaped by individualized, flexible tailoring of the sleep schedule, rather than issuing a blanket recommendation of X hours of sleep across the board.