If the test is normalized for population A and we then give it to population B, the results don’t have to be Gaussian. The normalization occurs only once, when the relationship between the raw scores and the IQ values is defined. Later the existing definition can be reused.
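To make that concrete, here is a rough sketch of what "normalize once, reuse later" could look like. Everything in it is an assumption for illustration: the raw scores are simulated (number correct out of 40), the names `build_norm_table` and `raw_to_iq` are made up, and the mapping is a simple rank-based one (percentile in the norming sample, then the matching point on a normal curve with mean 100 and SD 15). Real test publishers may do this differently.

```python
import bisect
import random
from statistics import NormalDist, mean

def build_norm_table(norming_scores):
    """Fix the raw-score -> IQ mapping once, using only the norming sample."""
    ordered = sorted(norming_scores)
    n = len(ordered)

    def raw_to_iq(raw):
        below = bisect.bisect_left(ordered, raw)
        at_or_below = bisect.bisect_right(ordered, raw)
        p = (below + at_or_below) / (2 * n)      # mid-rank percentile in the norming sample
        p = min(max(p, 0.5 / n), 1 - 0.5 / n)    # keep p strictly inside (0, 1)
        return 100 + 15 * NormalDist().inv_cdf(p)

    return raw_to_iq

rng = random.Random(0)
# Hypothetical raw scores: number of correct answers out of 40 questions.
pop_a = [sum(rng.random() < 0.50 for _ in range(40)) for _ in range(2000)]
pop_b = [sum(rng.random() < 0.65 for _ in range(40)) for _ in range(2000)]

raw_to_iq = build_norm_table(pop_a)   # calibrated on A only
iq_a = [raw_to_iq(x) for x in pop_a]
iq_b = [raw_to_iq(x) for x in pop_b]  # the same mapping reused for B

print("mean IQ of A:", round(mean(iq_a), 1))
print("mean IQ of B:", round(mean(iq_b), 1))
```

Population A comes out centered near 100 by construction. Population B just gets whatever the fixed mapping gives it, and nothing forces those values into a bell curve; scores above anything seen in A all land on the same maximum IQ.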
You would get a somewhat different shape depending on whether you a) calibrate the test for population A and then measure population B, or b) calibrate the test for A+B and then measure population B.
Probably the most correct way to compare two populations would be to skip the normalization step and just compare the histograms of raw scores for both populations. (I am not good enough at math to say how exactly.)
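One standard option for comparing two sets of raw scores directly (my choice here, not something the comment specifies) is the two-sample Kolmogorov-Smirnov distance: the largest gap between the two empirical cumulative distributions. A minimal sketch, with invented scores and hypothetical helper names `ecdf` and `ks_distance`:

```python
import bisect

def ecdf(sample):
    """Return a function giving the fraction of the sample that is <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect.bisect_right(ordered, x) / n

def ks_distance(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs (0 = same, 1 = disjoint)."""
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    # Both CDFs are step functions, so checking the observed values is enough.
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(cdf_a(x) - cdf_b(x)) for x in points)

# Made-up raw scores for two small groups.
raw_a = [10, 12, 12, 15, 18]
raw_b = [14, 15, 17, 18, 20]
ks = ks_distance(raw_a, raw_b)
print(ks)
```

This gives a single number, so it is only a summary; plotting the two cumulative curves on top of each other shows where the groups differ.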
Also, I am not sure how much such a comparison would depend on the specific test. Let’s imagine that we have one population with average IQ 100 and another population with average IQ 120. If we give them a test consisting of IQ-110-hard questions, the two populations will probably seem more different than if we give them a test consisting of a mix of IQ-80-hard and IQ-140-hard questions.
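This intuition can be checked with a toy model. The sketch below assumes (and this is purely an assumption) that the chance of answering a question correctly is a logistic function of the gap between a person's IQ and the question's difficulty, with an arbitrary scale of 10 points, and that both populations have an SD of 15. The names `p_correct` and `expected_fraction_correct` are made up for the example.

```python
import math
import random
from statistics import mean

def p_correct(ability, difficulty, scale=10.0):
    """Assumed model: logistic in (ability - difficulty); scale is arbitrary."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty) / scale))

def expected_fraction_correct(abilities, difficulties):
    """Average expected share of questions answered correctly in a population."""
    return mean(mean(p_correct(a, d) for d in difficulties) for a in abilities)

rng = random.Random(1)
pop_100 = [rng.gauss(100, 15) for _ in range(5000)]
pop_120 = [rng.gauss(120, 15) for _ in range(5000)]

focused = [110] * 20            # every question is "IQ-110-hard"
mixed = [80] * 10 + [140] * 10  # half easy, half very hard

gap_focused = (expected_fraction_correct(pop_120, focused)
               - expected_fraction_correct(pop_100, focused))
gap_mixed = (expected_fraction_correct(pop_120, mixed)
             - expected_fraction_correct(pop_100, mixed))

print("gap on the focused test:", round(gap_focused, 3))
print("gap on the mixed test:  ", round(gap_mixed, 3))
```

Under these assumptions the raw-score gap is larger on the focused test, because its questions sit right where the two populations differ most, while on the mixed test both groups mostly pass the easy questions and mostly fail the hard ones. A different response model could change the size of the effect.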
You can compare by looking at which percentile of population B the median of population A corresponds to.
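A minimal sketch of that comparison, using invented raw scores and a hypothetical helper `percentile_of` (it uses the mid-rank convention for ties, which is one of several reasonable choices):

```python
import bisect
from statistics import median

def percentile_of(value, sample):
    """Percentile rank of value within sample, counting ties as half."""
    ordered = sorted(sample)
    below = bisect.bisect_left(ordered, value)
    at_or_below = bisect.bisect_right(ordered, value)
    return 100.0 * (below + at_or_below) / (2 * len(ordered))

# Made-up raw scores for the two populations.
scores_a = [12, 15, 17, 18, 21]
scores_b = [14, 16, 18, 20, 22, 24, 25, 27]

result = percentile_of(median(scores_a), scores_b)
print(result)  # 25.0: the median of A sits at the 25th percentile of B
```

Since this only uses the ordering of the raw scores, it does not depend on how the test was normalized.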
Edit: also, once you’ve compared several populations this way, you can try to see whether there is a way to normalize the test such that the distributions for all the populations have similar shapes.
This backs my general notion that for a lot of measurements (especially of people?), we need graphs, not single numbers.