Why, when is the latter? Is there one reason or more and if so how can they be structured and by what? If one of the observables does not change, because there is a controlling observer (prediction+feedback), there is no way to establish correlation.
Typically you get causality without correlation when there is some controller that manipulates the causal variable in order to control the variable that it has an effect on.
I am displeased by bayesian probability combined with graphs (DAG), it so obviously lacks the nonlinear activation function.
DAGs only encode the structural relations, they make no inherent claims that things have to be linear. A common model is to allow each node to be an arbitrary function of its parents. The reason this isn’t used much in practice, even though this is what the math is based on, is that it is usually very hard to fit.
Typically you get causality without correlation when there is some controller that manipulates the causal variable in order to control the variable that it has an effect on.
DAGs only encode the structural relations, they make no inherent claims that things have to be linear. A common model is to allow each node to be an arbitrary function of its parents. The reason this isn’t used much in practice, even though this is what the math is based on, is that it is usually very hard to fit.