I think this is called “inconsistency” rather than bias. Inconsistency means that no matter how much data you have, and however you average it, the result still has a directional error.
Just to follow up on alex_zag_al’s sibling comment, you can have consistent estimators that are biased for any finite sample size but asymptotically unbiased, i.e., the bias shrinks to zero as the sample size increases without bound.
(As alex_zag_al notes, EY’s explanation of bias is correct. It means that in some situations “do an analysis on all the data” is not equivalent to “do the analysis on disjoint subsets of the data and average the results”—the former may have a smaller bias than the latter.)
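To make the first point concrete, here is a minimal simulation sketch (Python with NumPy; the numbers and the choice of the divide-by-n variance estimator are just illustrative assumptions, not anything from the article). The maximum-likelihood variance estimator is biased downward by a factor of (n−1)/n for any finite n, yet consistent: the bias −σ²/n vanishes as n grows.

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0  # variance of the underlying normal distribution

def mle_variance(x):
    """Maximum-likelihood variance estimate: divides by n, not n - 1."""
    return np.mean((x - x.mean()) ** 2)

# Average the estimator over many repetitions to approximate its expectation.
for n in (5, 50, 500):
    estimates = [mle_variance(rng.normal(0.0, np.sqrt(true_var), size=n))
                 for _ in range(100_000)]
    print(f"n={n:4d}  E[estimate] ~ {np.mean(estimates):.3f}  "
          f"(true variance {true_var}, theoretical bias {-true_var / n:+.3f})")
```

The printed expectations sit below the true variance for small n and approach it as n increases, which is exactly the “biased for any finite sample size, but asymptotically unbiased” behavior described above.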
This is an easy mistake to make. You’re thinking about one large sample, from which you derive one good estimate.
When the article says
it means getting many samples and, from each of these samples, deriving an estimate in isolation.
You: 1 sample of n data points, n->infinity. 1 estimate.
Yudkowsky: n samples of d data points, n->infinity. n estimates.
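A hedged sketch of that contrast (Python with NumPy; the biased divide-by-n variance estimator and the specific sizes are my own illustrative choices): pooling all n·d points into one estimate versus averaging n separate estimates of d points each. The averaged estimates keep the small-sample bias of roughly −σ²/d no matter how large n gets, while the pooled estimate’s bias shrinks toward zero.

```python
import numpy as np

rng = np.random.default_rng(1)
true_var, d, n = 4.0, 5, 10_000  # d points per small sample, n small samples

def mle_variance(x):
    # Biased (divide-by-n) variance estimator.
    return np.mean((x - x.mean()) ** 2)

data = rng.normal(0.0, np.sqrt(true_var), size=(n, d))

# "You": one estimate from all n*d points pooled together.
pooled = mle_variance(data.ravel())

# "Yudkowsky": n estimates, one per small sample, then averaged.
averaged = np.mean([mle_variance(sample) for sample in data])

print(f"pooled estimate   : {pooled:.3f}  (bias ~ {-true_var / (n * d):+.5f})")
print(f"averaged estimates: {averaged:.3f}  (bias ~ {-true_var / d:+.3f})")
```

The pooled estimate lands near the true variance of 4, while the average of the per-sample estimates stays near 3.2: more samples do not wash out a bias that each small-sample analysis shares.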