IMO it’s important to keep in mind that the sample size driving these conclusions is generally pretty small. Every statistician and machine learning engineer knows that a dataset with 2 data points is essentially worthless, yet people are surprisingly willing to draw a trend line through 2 data points.
When you’re dealing with small sample sizes, in my view it is better to take more of a “case study” approach than an “outside view” approach. Two data points aren’t really enough for statistical inference. However, if both data points illuminate some underlying dynamic which isn’t likely to change, then the argument becomes more compelling. Basically, when sample sizes are small, you need to do more inside-view theorizing to make up for it. And then you need to be careful about extrapolating to new situations: check that the inside-view properties you identified actually hold in those new situations.
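To make the two-point problem concrete, here is a rough Python sketch. It is my own illustration, not anything from the cases under discussion: the flat underlying process, the Gaussian noise level, and the 0.5 threshold are all arbitrary assumptions, and `fit_line` and `sample_slopes` are made-up helper names. The idea is that a line through two points always fits perfectly, so the fit itself never warns you that the "trend" may be pure noise.

```python
import random

def fit_line(p, q):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

def sample_slopes(n_trials=1000, noise=1.0, seed=0):
    """Fit a line through two noisy observations of a flat (zero-trend) process."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_trials):
        # Assumed setup: the true value is 0 at both x=0 and x=1,
        # and each observation just adds Gaussian noise.
        p = (0.0, rng.gauss(0.0, noise))
        q = (1.0, rng.gauss(0.0, noise))
        slope, _ = fit_line(p, q)
        slopes.append(slope)
    return slopes

slopes = sample_slopes()
# Every fit passes through both points exactly, so the residuals are always
# zero even though the true trend is zero by construction.
share_large = sum(abs(s) > 0.5 for s in slopes) / len(slopes)
print(f"share of fitted slopes with |slope| > 0.5: {share_large:.2f}")
```

Under these assumptions, a large share of the two-point "trends" come out looking substantial even though there is no trend at all, which is why the extra weight has to come from inside-view reasoning about the mechanism rather than from the fit.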