I thought they were partially not the same because they added the writing subtest.
If this is true, I would expect there to be a correlation of around .844 between one test score and a later test score under the same grading system.
The reliability of recent SAT tests seems to generally be ~0.9 according to one random PDF I found (and it has long been high). If I am understanding the formulas on this page correctly, then in this application, the expected Pearson’s r of the 2 scores simplifies to the reliability*, and that reliability of 0.9 is pretty similar to the LW old/new correlation r of 0.84.
So this may be simply what one would expect from people taking the SAT twice, without having to invoke the lowered correlation caused by the additional sections and any other tweaks they’ve made.
* Specifically, I’m looking at Artifactual Influences, #3: reliability, where I think we can reuse the example: for test-retest, assume the LWer doesn’t get dumber or smarter and the true correlation would be 1; the reliability of the old SAT should be 0.9, the reliability of the new one should be 0.9 too, so you get ‘1 * sqrt(0.9 * 0.9)’ or ‘sqrt(0.9 * 0.9)’ or ‘sqrt(0.9^2)’ or ‘0.9’. So, the expected correlation of 2 SAT tests simplifies to the original reliability of 0.9.
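For anyone who wants to check the arithmetic in the footnote, here is a minimal Python sketch of the attenuation formula as I read it (observed r = true r * sqrt(reliability of test 1 * reliability of test 2)). The function names are made up for illustration, and the inputs are just the rough figures quoted above (0.9 reliability for both tests, 0.84 observed), not anything measured more carefully.

```python
import math


def expected_observed_r(true_r, reliability_x, reliability_y):
    """Attenuation: correlation we expect to observe between two noisy tests."""
    return true_r * math.sqrt(reliability_x * reliability_y)


def implied_true_r(observed_r, reliability_x, reliability_y):
    """Same formula solved for the true correlation (correction for attenuation)."""
    return observed_r / math.sqrt(reliability_x * reliability_y)


# Assumed inputs: true correlation of 1 (nobody gets dumber or smarter),
# and a reliability of 0.9 for both the old and the new SAT.
expected = expected_observed_r(1.0, 0.9, 0.9)

# Running the formula backwards from the observed LW old/new r of 0.84.
implied = implied_true_r(0.84, 0.9, 0.9)

print(f"expected test-retest r: {expected:.3f}")
print(f"true r implied by an observed 0.84: {implied:.3f}")
```

Run backwards, the same formula puts the true old/new correlation at about 0.93 if both reliabilities really are 0.9, so there is still a little room for the writing subtest to matter, just not much.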