Also, sorry I didn’t actually answer your main question. It’s actually something I’ve thought about quite a bit, but usually in the context of “not enough data to map out this very-high-dimensional space” rather than “not enough data to detect a small change”. The problem is similar in both cases. I’ll probably write a post or two on it at some point, but here’s a very short summary.
Traditional probability theory relies heavily on large-number approximations; mainstream statistics uses convergence as its main criterion of validity. Small data problems, on the other hand, are much better suited to a Bayesian approach. In particular, if we have a few different models (call them Mi) and some data D, we can compute the posterior P[Mi|D] without having to talk about convergence or large numbers at all.
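To make that concrete, here’s a minimal sketch (a toy example of mine, not anything from the discussion above): two candidate models for a handful of coin flips, compared by computing each model’s evidence exactly and then the posterior over models, with no asymptotics anywhere.

```python
# Toy example (mine, for illustration): two models for a handful of coin
# flips, compared via exact Bayesian evidences.
from math import comb
import numpy as np
from scipy.special import betaln

# Tiny dataset: 7 flips, 5 heads -- far too few for asymptotic approximations.
n, k = 7, 5

# M1: the coin is fair (p = 0.5). The evidence is just the binomial likelihood.
log_evidence_m1 = np.log(comb(n, k)) + n * np.log(0.5)

# M2: p unknown, uniform Beta(1,1) prior. The evidence integral over p happens
# to be analytic here: P[D|M2] = C(n,k) * B(k+1, n-k+1).
log_evidence_m2 = np.log(comb(n, k)) + betaln(k + 1, n - k + 1)

# Posterior over models, assuming equal prior weight on M1 and M2.
log_post = np.array([log_evidence_m1, log_evidence_m2])
post = np.exp(log_post - np.logaddexp.reduce(log_post))
print(dict(zip(["P[M1|D]", "P[M2|D]"], post.round(3))))
```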
The trade-off is that the math tends to be spectacularly hairy; P[Mi|D] usually involves high-dimensional integrals. Traditional approaches approximate those integrals for large numbers of data points, but the whole point here is that we don’t have enough data for the approximations to be valid.
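For intuition on where the hairiness comes from: each evidence term P[D|Mi] is an integral of the likelihood over that model’s parameters. Here’s a hedged sketch (model, prior, and data all invented for illustration) that estimates that integral by naive Monte Carlo over samples from the prior, rather than by any large-data approximation.

```python
# Sketch of the general case: the evidence is
#   P[D|M] = integral of P[D|theta, M] * P[theta|M] over theta,
# estimated here by naive Monte Carlo over prior samples.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
data = np.array([0.9, 1.4, 0.3])   # only a few observations

# Model M: each data point ~ Normal(theta, 1); prior: theta ~ Normal(0, 2).
thetas = rng.normal(0.0, 2.0, size=100_000)                  # prior samples
log_liks = norm.logpdf(data[None, :], loc=thetas[:, None]).sum(axis=1)

# P[D|M] ~= (1/J) * sum_j P[D|theta_j, M], computed in log space for stability.
log_evidence = np.logaddexp.reduce(log_liks) - np.log(len(thetas))
print(f"estimated log P[D|M] ~= {log_evidence:.3f}")
```

With one or two parameters this brute-force estimator is fine; with the high-dimensional parameter spaces I have in mind, prior samples almost never land where the likelihood is large, so the estimate collapses and you need heavier machinery to do the integral well.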