If “X” is something we don’t have a “gears model” of yet, aren’t “tests that highly correlate with X” the only way to measure X? Especially when it’s not physics.
In other words, why go the extra mile to emphasize that Y is merely the best available method to measure X, but not X itself? Is this a standard way of talking about scientific topics, or is it only used for politically sensitive topics?
Here the situation is different in that it’s not just that we don’t know how to measure X, but rather that the way in which we have derived X means that directly measuring it is impossible even in principle.
That’s distinct from something like (say) self-esteem, where we might eventually figure out what self-esteem really means, or at least come up with a satisfactory operational definition for it. There’s nothing in the normal definition of self-esteem that would make it impossible to measure on an individual level. Not so with g.
Of course, one could come up with a definition for something like “intelligence”, and then try to measure that directly—which is what people often do, when they say that “intelligence is what intelligence tests measure”. But that’s not the same as measuring g.
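To make the point concrete, here’s a minimal sketch of how g is typically derived (my own toy one-factor simulation, with made-up loadings, not anything from the literature): simulate a battery of tests, then extract the dominant factor of their correlation matrix. The point is that g exists only in the battery’s correlational structure; no single number here is a direct reading of it.

```python
# A toy one-factor simulation (all numbers made up): g is recovered as the
# dominant factor of the inter-test correlation matrix, not read off any
# individual directly.
import numpy as np

rng = np.random.default_rng(0)

# 1000 simulated people take 6 tests; each test reflects a shared latent
# trait plus test-specific noise.
n_people, n_tests = 1000, 6
latent = rng.normal(size=(n_people, 1))              # the unobservable trait
loadings = np.array([0.8, 0.75, 0.7, 0.7, 0.65, 0.6])
scores = latent * loadings + rng.normal(scale=0.5, size=(n_people, n_tests))

# "g" only exists at the level of the battery: it is the first eigenvector
# of the correlation matrix among the tests.
corr = np.corrcoef(scores, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)              # eigenvalues ascending
g_loadings = np.abs(eigvecs[:, -1])                  # dominant factor (sign is arbitrary)
print(np.round(g_loadings, 2))
```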
This matters because it’s part of what makes e.g. the Flynn effect so hard to interpret: yes, raw scores on IQ tests have gone up, but have people actually gotten smarter? We can’t directly measure g, so a rise in scores alone doesn’t tell us. On the other hand, if people’s scores on a test of self-esteem went up over time, it would be much more straightforward to conclude that their self-esteem has probably actually gone up.
In this case it’s important to emphasize that difference, because a commonly raised hypothesis is that while we can see clear training effects on IQ, none of these effects are on the underlying g-factor, i.e. the gains do not generalize to new tasks. For naive interventions, this has been pretty clearly demonstrated:
IQ scores provide the best general predictor of success in education, job training, and work. However, there are many ways in which IQ scores can be increased, for instance by means of retesting or participation in learning potential training programs. What is the nature of these score gains?
[...]
The meta-analysis of 64 test–retest studies using IQ batteries (total N = 26,990) yielded a correlation between g loadings and score gains of −1.00, meaning there is no g saturation in score gains.
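For what it’s worth, the statistic being reported comes from Jensen’s method of correlated vectors: take each subtest’s g loading and its retest gain, and correlate the two vectors across subtests. A minimal sketch with invented numbers (chosen only to show the mechanics, not taken from the meta-analysis, which pooled many studies and corrected for statistical artifacts):

```python
# Method of correlated vectors with invented numbers: subtests ordered so
# that higher g loadings go with smaller retest gains, the pattern the
# quoted meta-analysis found across 64 studies.
import numpy as np

g_loadings  = np.array([0.85, 0.80, 0.70, 0.60, 0.50, 0.40])  # hypothetical
score_gains = np.array([0.05, 0.08, 0.15, 0.22, 0.30, 0.38])  # hypothetical (SD units)

r = np.corrcoef(g_loadings, score_gains)[0, 1]
print(f"correlation between g loadings and gains: {r:.2f}")   # strongly negative
```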