Statistics is trying to “invert” what probability does.
Probability starts with a model, and then describes what will happen given the model’s assumptions.
Statistics goes in the opposite direction: it uses data to put limits on the set of plausible models. The logic is something like: “If the model had property X, then probability theory says I should have seen Y. But I did not see Y. Therefore, not X.” In other words, statistics invokes probability to get the job done.
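To make that inversion concrete, here is a minimal sketch in Python. The coin-flipping setup, the numbers, and the helper name `binomial_tail` are all made up for illustration. The "model" is a fair coin (property X), probability theory tells us how likely a given number of heads is under that model (Y), and a very small probability for what we actually saw counts as evidence against the model.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(at least k successes in n independent trials, each with success probability p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical observations: 60 heads vs. 90 heads out of 100 flips.
# Probability direction: assume the model (fair coin, p = 0.5), compute what to expect.
# Statistics direction: compare that with the data and ask if the model is still plausible.
for heads in (60, 90):
    tail = binomial_tail(100, heads, 0.5)
    print(f"P(>= {heads} heads in 100 flips | fair coin) = {tail:.3g}")
```

With 60 heads the tail probability is small but not negligible, so a fair coin is strained but not ruled out. With 90 heads it is vanishingly small, which is the "NOT Y, therefore NOT X" step. Note that the conclusion is only as good as the model behind the calculation (independent flips, constant p), which is exactly why the probability side matters.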
Applying statistical techniques without understanding the probability models involved is like having a toolbox without understanding why any of the tools work.
It all goes fine until the tools fail (which happens often, and often silently) and then you’re hosed. You may fail to notice the problems entirely, or may have to outsource judgments to others with more experience.
Thanks, this is incredibly useful.
I think I understand enough to put together a curriculum for delving into this topic, starting with the Harvard course you recommended.