The first principal component of scores on various tests is, of course, a mathematical artifact. But it’s not the real thing, it’s just an estimate, a finger pointing at the real thing.
I agree that people can be both stupid and smart in very different ways, but at a certain (and useful!) level of aggregation, there are generally smart people and generally stupid people. There is a lot of variation around that axis, but I think the axis exists. I’m not arguing that everything should be projected into that one-dimensional space and reduced to a scalar.
Here is how this game works. We have a bunch of observed variables X, and a smaller set of hidden variables Z.
We assume a particular model for the joint distribution p(X,Z). We then think about various facts about this distribution (for example, the eigenvalues of the covariance matrix). We then try to conclude a causal factor from these facts. This is where the error is. You can’t conclude causality that way.
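To make that concrete, here is a minimal sketch in Python (the loadings and noise levels are invented purely for illustration):

```python
# The "game": hidden Z, observed X, and conclusions drawn from the
# covariance structure of X. All numbers here are made up.
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# One hidden variable Z and five observed test scores that all load on it.
z = rng.normal(size=n)
loadings = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
X = z[:, None] * loadings + rng.normal(size=(n, 5))

# The covariance matrix of X has one dominant eigenvalue...
eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
print(eigvals)

# ...but many different causal structures (e.g. the tests influencing each
# other directly, with no common cause at all) can produce the very same
# covariance matrix, so the dominant eigenvalue by itself does not license
# the conclusion "a single causal factor exists".
```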
Do you think IQ has to be a causal factor to be a good predictor/be meaningful?
No I do not. I think IQ can be a useful predictor for some things (as good as one number can be, really). But that isn’t the story with g, is it? It is claimed to be a causal factor.
If we want to do prediction, let’s just get a ton of features and use that, like they do in machine learning. Why fixate on one number?
Also, we know IQ is not a causal factor; IQ is the result of a test (so it’s a consequence, not a cause).
Because it makes sense for many different people to study the same number.
In the last month I talked about Gottman twice. The guy got couples into his lab and observed them for 15 minutes while measuring all sorts of variables. Afterwards he built a mathematical model and found that it has a 91% success rate in predicting whether newlywed couples will divorce within 10 years.
The problem? The model is likely overfitted. Instead of using the model he generated in his first study, he uses a new model for the next study that’s also overfitted. If he had instead worked on developing a Gottman metric, other researchers could study the same metric. Other researchers could see what factors correlate with the Gottman metric.
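A rough sketch of what that overfitting looks like (this is not Gottman’s data or model, just randomly generated numbers of a similar shape):

```python
# Few couples, many measured variables, labels that are pure noise: a flexible
# model still looks impressive in-sample and falls back to chance out of sample.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_couples, n_features = 60, 40                  # made-up sample sizes
X = rng.normal(size=(n_couples, n_features))    # "observed variables"
y = rng.integers(0, 2, size=n_couples)          # "divorced within 10 years", here random

model = LogisticRegression(max_iter=1000).fit(X, y)
print("in-sample accuracy:", model.score(X, y))                           # well above chance
print("cross-validated accuracy:", cross_val_score(model, X, y).mean())   # roughly chance
```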
In the case of IQ, IQ is seen as a robust metric. The EPA did studies to estimate how many IQ points are lost due to mercury pollution. They priced IQ points. They compared the dollar value of the IQ points lost due to mercury pollution with the cost of filters that reduce mercury pollution.
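The structure of that comparison is roughly the following (the numbers below are placeholders, not the EPA’s actual figures):

```python
# Illustrative cost-benefit arithmetic; every number here is invented.
dollars_per_iq_point = 10_000          # hypothetical monetized value of one IQ point
iq_points_lost_per_year = 200_000      # hypothetical IQ points lost to mercury exposure per year
filter_cost_per_year = 1_000_000_000   # hypothetical annualized cost of better filters

benefit = dollars_per_iq_point * iq_points_lost_per_year
print(f"avoided IQ loss: ${benefit:,} vs filter cost: ${filter_cost_per_year:,}")
print("regulation passes the cost-benefit test" if benefit > filter_cost_per_year
      else "regulation fails the cost-benefit test")
```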
That strong data-driven case allowed the EPA under Obama to take bold steps to reduce mercury pollution. The Koch brothers didn’t make a fuss about it but paid for the installation of better filters. From their perspective the statistics were robust enough that it didn’t make sense to fight the EPA in the public sphere over mercury regulation backed by data-driven arguments.
The EPA can only do that because IQ isn’t a metric that they invented themselves, where someone could claim that the EPA simply did p-hacking to make its case.
Life is complicated; why restrict to single parameter models? Nobody in statistics or machine learning does this, with good reason.
If your argument for single parameter models has the phrase “unwashed masses” in it, I wouldn’t find it very convincing.
If you are worried about p-hacking, just don’t do p-hacking, don’t lobotomize your model.
The main issue is to have consensus statistics. That reduces the possibilities for clever p-hacking and allows researchers to study how the same metric acts in a variety of different contexts.
If every researcher invents his own metrics, you get things like voodoo neuroscience.
Yeah, I don’t buy it. Lying with statistics and shitty models are completely orthogonal issues. You can lie with shitty models or with good models.
Also the argument “we should use IQ because people lie with statistics” is a very different argument from the one usually made by IQ proponents.
I wouldn’t call the problem “shitty models” but rather models that aren’t tried and tested in many different contexts. We know a lot more about how the IQ model works than about a new model of intelligence that a researcher creates for his PhD thesis.
Once you think that it’s good to have a single metric for intelligence, because it helps you make arguments about issues like the effect of mercury pollution on intelligence, there are additional arguments for why IQ is a good metric for that purpose.
Single parameter models for anything complicated are shitty models. Intelligence is complicated. A single parameter model of intelligence is a shitty model.
Do you think it’s shitty in the sense that what the EPA is doing with it is without basis?
I think I am done repeating myself.
I know how the game works, I’ve paged through the Pearl book. But here, in this case, I don’t care much about causality. I can observe the existence of stupid people and smart people (and somewhat-stupid, and middle-of-the-road, and a bit smart, etc.). I can roughly rank them on the smart-stupid axis. That axis won’t capture all the diversity and the variation, but it will capture some. Whether what it captures is sufficient depends, of course, on the purpose of the exercise: in some cases that’s all you need and in some cases it’s entirely inadequate. However, in my experience that axis is pretty relevant to a lot of things. It’s useful.
Note that here no prediction is involved. I’m not talking about whether estimates of g (IQ, basically) can/will predict your success in life or any similar stuff. That’s a different discussion.
???
To the extent that you view g as what it is, I have no problem. But people think g is (a) a real thing and (b) causal. It’s not at all clear it is either. “Real things” involved in human intelligence are super complicated and have to do with brain architecture (stuff we really don’t understand well). We are miles and miles and miles away from “real things” in this setting.
The game I was describing was how PCA works, not stuff in Pearl’s book. The point was PCA is just relying on a model of a joint distribution, and you have to be super careful with assumptions to extract causality from that.
I think of g as, basically, a projection from the high-dimensional space of, let’s say, mind capabilities into low dimensions, in this case just a single one. Of course it’s an “artifact”, and of course you lose information when you do that.
However what I mean by g pointing a finger at the real thing is that this high-dimensional cloud has some structure. Things are correlated (or, more generally, dependent on each other). One way—a rough, simple way—to get an estimate of one feature of this structure is to do IQ testing. Because it’s so simple and because it’s robust and because it can be shown to be correlated to a variety of real-life useful things, IQ scores became popular. They are not the Ultimate Explanation for Everything, but they are better than nothing.
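Here is a small sketch of what I mean, with invented numbers: simulate a cloud of correlated abilities, project it onto its first principal component, and look at what that one scalar does and does not capture.

```python
# Project a correlated, high-dimensional "cloud of abilities" onto one scalar.
# Loadings and noise levels are made up for illustration.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n = 5000
latent = rng.normal(size=n)                            # hypothetical common factor
loadings = np.array([0.8, 0.7, 0.7, 0.6, 0.5, 0.4])
abilities = latent[:, None] * loadings + rng.normal(size=(n, 6))

pca = PCA(n_components=1).fit(abilities)
score = pca.transform(abilities).ravel()               # the one-dimensional summary

# The sign of a principal component is arbitrary, so compare magnitudes.
print("correlation with the latent factor:", abs(np.corrcoef(score, latent)[0, 1]))
print("share of total variance captured:", pca.explained_variance_ratio_[0])
# The scalar tracks the common structure quite well, yet it captures only a
# modest share of the total variance; the rest of the cloud is thrown away.
```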
With respect to causality, I would say that the high-dimensional cloud of mind capabilities is the “cause”. But it’s hard to get a handle on it, for obvious reasons, and our one-scalar simplification of the whole thing might or might not be relevant to the causal relationship we are interested in.
P.S.
PCA actually has deeper problems because it’s entirely linear and while that makes it easily tractable, real life, especially its biological bits, is rarely that convenient.
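A toy illustration of that, with made-up data: points that lie on a one-dimensional curve, which no single linear component can summarize.

```python
# A parabola is intrinsically one-dimensional, but PCA (being linear) needs
# two components to account for it.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
t = rng.uniform(-3, 3, size=2000)
X = np.column_stack([t, t ** 2])    # a curved, one-dimensional structure in 2-D

print(PCA(n_components=2).fit(X).explained_variance_ratio_)
# Both components carry substantial variance even though a single (nonlinear)
# coordinate describes the data exactly; the structure is real, just not linear.
```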
I don’t think we disagree (?anymore?).
Also in practical informal talk, people overemphasize IQ because it is so fun for hierarchy-minded primates to arrange people from best to worst.
edit re: PCA: Yes, PCA is a super-parametric method, with the usual super-parametric method problems. However, the issue I have with PCA in this context is different, and also occurs in very flexible, fully non-parametric methods. Basically the issue is, no matter how you massage it, the joint distribution simply does not have the causal information you want in it, in general.
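A minimal illustration of that point (numbers invented): two different causal stories that produce exactly the same joint distribution, so nothing computed from that distribution can tell them apart.

```python
# Two causal stories, one joint distribution.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Story 1: a common cause Z drives both A and B.
z = rng.normal(size=n)
a1 = z + rng.normal(size=n)
b1 = z + rng.normal(size=n)

# Story 2: no common cause; A directly causes B.
a2 = rng.normal(scale=np.sqrt(2), size=n)
b2 = 0.5 * a2 + rng.normal(scale=np.sqrt(1.5), size=n)

# Both stories yield (up to sampling noise) the same bivariate normal:
# Var(A) = Var(B) = 2, Cov(A, B) = 1.
print(np.cov(a1, b1))
print(np.cov(a2, b2))
```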
Yep. You just have to pick the correct metric: the one where you come out on top ;-)
Yes, you can. You can conclude that some causal factor exists. You then define g to be that causal factor.
No you can’t conclude that. I am glad we had this chat.
So you’re saying that the fact that all these traits are correlated is a complete coincidence?
Look Eugene, I am not super interested in getting into an internet argument with you and your army of downvoting sockpuppets. What I am saying is you don’t understand how causality is concluded from data. Maybe you should read about it sometime.
Oh, I do. How about you read up on IQ research sometime.
No, you really really don’t. But this is sort of for the record, for “future you”. Start with (Neyman 1923) and (Rubin 1970).
I already had this conversation with gwern, who eventually noticed that none of the naive causal analyses he was doing actually panned out, and he had to go and start learning all this stuff.
But you are not gwern, you are a demagogue, I don’t expect you to actually make scientific moves, or update on evidence, or anything. You need your precious causal factor to order people from best to worst.
What if there are several such causal factors?
The correlation studies that led to defining IQ suggested that there is a single one, or, if there are multiple, that they themselves are strongly correlated with each other.