This question is inspired by the surprisingly complicated Wikipedia page on correlation and dependence. Can you explain distance correlation and Brownian covariance as well as the ‘Randomized Dependence Coefficient’ in layman’s terms and their application, particularly for rationalists? How about the ‘correlation ratio’, ‘polychoric correlation’ and ‘coefficient of determination’?
Clarity, you have a large number of comments with incorrect Wikipedia links. Your “introspective illusion” comment directly above this one does it correctly. You clearly are capable of generating functional links to Wikipedia pages.
Please take a few minutes to make your recent comments less frustrating to read. It is frankly astounding that so many people have given you this feedback and you are still posting these broken links.
All your links are belong to wrongness. Please delete the ‘www’ before en. in en.wikipedia.
This post would need to be in response to his post (not to a lower-level reply), or he would not get a notification about it.
A first broad attempt.
The stage is set up in this way: you observe two sets of data, which your model indicates come from two distinct sources. The question is: are the two sets related in any way? If so, how much? The measure of this relationship is usually called correlation.
From an objective Bayesian point of view, it doesn’t make much sense to talk about correlation between two random variables (it makes no sense to talk about random variables either, but that’s another story), because correlation is always model-dependent, and probabilities are epistemic. Two agents observing the same phenomenon, having different information about it, may very well come to totally opposite conclusions.
From a frequentist point of view, though, the correlation between variables expresses an objective quantity, and all the measures you mention are attempts at finding out how much correlation there is, making more or less explicit assumptions about your model.
If you think that the two sources are linearly related, then the Pearson coefficient will tell you how much the data supports the model.
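As a quick sketch of my own (not part of the original comment), the Pearson coefficient is just the covariance of the two samples divided by the product of their standard deviations; the toy data here are made up for illustration:

```python
import numpy as np

# Toy data: y is a noisy linear function of x, so Pearson should be high.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(scale=0.5, size=1000)

# Pearson r = cov(x, y) / (std(x) * std(y))
r = np.cov(x, y)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
print(r)  # close to 1, because a linear model fits these data well
```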
If you think the two variables come from a continuous normal distribution, but you can only observe their integer values, you use polychoric correlation. And so on...
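To illustrate the situation polychoric correlation is meant for (again just a sketch of my own, not an actual polychoric estimator): two latent normal variables are correlated, but you only observe coarse integer categories, and the naive Pearson coefficient on the categories understates the latent correlation. A polychoric estimator models the thresholds and tries to recover the latent value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Latent bivariate normal with true correlation 0.8
true_rho = 0.8
cov = [[1.0, true_rho], [true_rho, 1.0]]
latent = rng.multivariate_normal([0.0, 0.0], cov, size=5000)

# We only observe coarse integer categories (e.g. survey answers 0-3)
cuts = [-1.0, 0.0, 1.0]
observed = np.digitize(latent, cuts)  # threshold each latent value into 4 bins

# Naive Pearson on the categories is attenuated relative to the latent 0.8;
# this gap is what a polychoric estimate is designed to close.
naive_r = np.corrcoef(observed[:, 0], observed[:, 1])[0, 1]
print(naive_r)  # noticeably below 0.8
```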
Depending on the assumptions you make, there are different measures of how correlated the data are.
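And since the original question asks about distance correlation: here is a minimal sketch of the sample statistic (my own illustration, following the standard double-centering definition), which picks up nonlinear dependence that the Pearson coefficient misses:

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation for 1-D arrays, via double-centered distance matrices."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.abs(x[:, None] - x[None, :])          # pairwise distances within x
    b = np.abs(y[:, None] - y[None, :])          # pairwise distances within y
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()                       # squared sample distance covariance
    dvar_x = (A * A).mean()
    dvar_y = (B * B).mean()
    return np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y))

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=2000)
y = x ** 2 + rng.normal(scale=0.05, size=2000)   # clearly dependent, but not linearly

print(np.corrcoef(x, y)[0, 1])      # near 0: Pearson misses the dependence
print(distance_correlation(x, y))   # clearly positive: distance correlation detects it
```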