More precisely, the first sample gives the most information about the mean. Learning one person’s income tells you a lot about incomes in general, even though incomes are heavy-tailed.
Imagine you had no prior knowledge of how wealthy people are on Earth, or even how to think about the concept of “wealth.” For you, the meaning of the term is as inscrutable as the term “flargibargh.” You might sample a very poor person, and think everybody’s living in poverty. You might sample a middle-class person, and miss the existence of the very rich and poor. You might (unlikely) sample a billionaire and think everybody’s incredibly wealthy.
Still, any one of those samples helps you avoid the mistake of thinking that wealth is commonly extremely negative, or of a gigantic magnitude (e.g. on the order of Avogadro’s number). It gets you vastly closer to the mean than you would land if you had absolutely zero knowledge of what the concept of “wealth” refers to, and didn’t even know that it’s a word for measuring something relevant to humans (a domain in which manageable numbers are common).
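To make that concrete, here is a rough simulation sketch. Everything in it is an assumption chosen for illustration: the lognormal "wealth" population, its `mu` and `sigma` parameters, and the idea that "zero knowledge" means guessing log-uniformly anywhere between 1e-5 and 1e25. It compares how many orders of magnitude off a blind guess is versus simply using one sampled value as your guess for the mean.

```python
import math
import random
import statistics

def log10_error(guess, truth):
    """Absolute error between guess and truth, in orders of magnitude."""
    return abs(math.log10(guess) - math.log10(truth))

def simulate(trials=10_000, seed=0):
    rng = random.Random(seed)
    # Hypothetical heavy-tailed "wealth" population: a lognormal whose
    # parameters are illustrative values, not real data.
    mu, sigma = 10.0, 1.5
    true_mean = math.exp(mu + sigma**2 / 2)  # mean of a lognormal

    blind, one_sample = [], []
    for _ in range(trials):
        # "Zero knowledge" (assumed): guess log-uniformly from 1e-5 to 1e25.
        blind_guess = 10 ** rng.uniform(-5, 25)
        # One observation: use the single sampled value as the guess.
        sample_guess = rng.lognormvariate(mu, sigma)
        blind.append(log10_error(blind_guess, true_mean))
        one_sample.append(log10_error(sample_guess, true_mean))
    return statistics.median(blind), statistics.median(one_sample)

blind_err, sample_err = simulate()
print(f"median error, blind guess: {blind_err:.1f} orders of magnitude")
print(f"median error, one sample:  {sample_err:.1f} orders of magnitude")
```

Under these assumptions, the single sample typically lands within about an order of magnitude of the true mean, while the blind guess is off by several. The exact numbers depend entirely on how wide you make the "zero knowledge" range, which is the point: the less you knew beforehand, the more that first sample buys you.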
However, the first sample gives you no information about the shape of the distribution it was drawn from. As the example above illustrates, sampling one person tells you nothing about whether wealth follows a bell curve, is heavy-tailed, is exactly even, is linearly distributed, or takes some other form.
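One small way to see this, using Python's standard `statistics` module: with a single observation you can compute a point estimate of the center, but the sample variance is not even defined. The income figure below is a made-up value for illustration.

```python
import statistics

one_observation = [52_000.0]  # hypothetical single income, illustrative only

# A point estimate of the center is available from one value...
center = statistics.mean(one_observation)

# ...but the sample variance needs at least two data points.
try:
    spread = statistics.variance(one_observation)
except statistics.StatisticsError:
    spread = None  # no estimate of spread from a single observation

print(center, spread)
```

This only shows that spread cannot be estimated from one point; it is a sketch of the idea, not a proof that a single sample carries literally zero information about shape.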
It’s very important to gain the skill of “get a sample or example” when dealing with new territory. At the same time, you need to understand what that sample does and does not tell you. Mistakenly thinking that a sample gives you information about X can lead you to make decisions based on that illusory “information,” when you might not have acted at all had you understood the extent of your ignorance.
And then, of course, it’s important to make sure that your sample is actually a sample of what you think it is...