Clearly one could split a data set using basically any possible variable, but most are obviously wrong. (That is to say, they lack explanatory power, and are actually irrelevant.) To attempt to simplify, then, if you understand a system, or have a good hypothesis, it is frequently easier to pick variables that should be important, and gather further data to confirm.
This is made explicit when you remove connections from the graph. The more “obviously wrong” connections you sever, the more powerful the graph becomes. This is potentially harmful, though: like assigning zero probability to some outcome, severing a connection throws away the machinery you would need to reason about it. If your “obvious” belief proves incorrect, you’ve backed yourself into a room with no escape. Therefore, test your assumptions.
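To make the zero-probability analogy concrete, here is a minimal sketch of repeated Bayesian updating. The hypothesis names ("linked", "unlinked") and the likelihood numbers are made up for illustration; they stand in for "this connection exists" versus "it does not".

```python
def bayes_update(prior, likelihoods):
    """Return the posterior over hypotheses after one observation."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


def update_many(prior, likelihoods, n):
    """Apply the same observation n times in a row."""
    posterior = dict(prior)
    for _ in range(n):
        posterior = bayes_update(posterior, likelihoods)
    return posterior


# Assumed numbers: each observation is 9 times more likely if the link exists.
likelihoods = {"linked": 0.9, "unlinked": 0.1}

# A skeptic who keeps the connection open with a small prior...
open_prior = {"linked": 0.01, "unlinked": 0.99}
# ...versus one who has "severed" it by assigning it zero weight.
severed_prior = {"linked": 0.0, "unlinked": 1.0}

open_posterior = update_many(open_prior, likelihoods, 10)
severed_posterior = update_many(severed_prior, likelihoods, 10)

print("open:   ", open_posterior["linked"])
print("severed:", severed_posterior["linked"])
```

The skeptic with a small but nonzero prior is won over by the evidence, while the severed prior stays at exactly zero no matter how many observations arrive, since zero times any likelihood is still zero.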
This is a huge component of Pearl’s methods: his view is that the way to add causal reasoning to probability is to include “counterfactual” statements that encode causation in these graphs. Without counterfactuals, you’re sunk. With them, you take on a whole new set of concerns but also gain a lot of power.
It’s also really important to dispute the claim that “one could split a data set using basically any possible variable”. While this is true in principle, Pearl made (or confirmed) some great discoveries with his causal networks, showing that certain sets of conditioning variables will, when selected together, actively mislead you. Moreover, without the counterfactual information encoded in a causal graph, you cannot discover which variables these are.
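One well-known way that conditioning can mislead is conditioning on a common effect (a "collider"). The simulation below is only an illustrative sketch, not Pearl's own example: the variable names (talent, luck, success), the selection threshold, and the sample size are all assumptions chosen for the demonstration.

```python
import random


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Two causes generated independently of each other.
talent = [rng.gauss(0, 1) for _ in range(n)]
luck = [rng.gauss(0, 1) for _ in range(n)]
# Their common effect: the causal graph is talent -> success <- luck.
success = [t + l for t, l in zip(talent, luck)]

# Across the whole data set the two causes are uncorrelated.
overall = correlation(talent, luck)

# Split on the common effect: keep only the "successful" cases.
selected = [(t, l) for t, l, s in zip(talent, luck, success) if s > 1.0]
within = correlation([t for t, _ in selected], [l for _, l in selected])

print("overall correlation:         ", round(overall, 3))
print("correlation among successful:", round(within, 3))
```

Over the full data set the correlation should come out near zero, but among the selected cases it should be clearly negative, even though neither variable influences the other. Nothing in the data alone flags success as a bad variable to split on; that information comes from the assumed graph.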
Finally, I’d suggest that picking a good hypothesis and coming to understand a system are undoubtedly the hardest parts of gaining knowledge, involving creativity, risk, and some of the most developed probabilistic arguments. Actually comparing competing hypotheses, so that you end up with a good model and know what “should be important”, is the tough part, fraught with the possibility of failure.