There seem to be some critical methodological errors here that have easy fixes. First, the intervention subject took the same or strictly more time on the second test than on the first, while the control took the same or less time. That is a serious problem for IQ tests of this sort, since you would expect more time to produce better scores regardless of any intervention. Second, the SAME tests were used before and after, and some of them literally tell you the answers once you finish the questions. In particular, the spatial section of the first test reveals the answers to a large number of its questions, so it is quite prone to practice-related gains, and that spatial subsection was the one used to judge the change in fluid intelligence. Since you seem to be assuming that scores on different tests measure the same thing, why not just take different tests before and after?
All of these issues are addressed by having controls and by looking at the variance within the control group.
Using different tests would be a mistake, given that results on different tests don’t correlate very well.
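
As a rough sketch of the comparison being described, one could subtract the average retest gain of the controls from the subject's gain and scale the remainder by the spread of the control gains. The function name, its parameters, and the scores below are all hypothetical and for illustration only; they are not data from the study.

```python
from statistics import mean, stdev


def gain_vs_controls(subject_pre, subject_post, control_pre, control_post):
    """Compare one subject's retest gain with the gains seen in controls.

    Returns (excess, z): the subject's gain beyond the mean control gain,
    and that excess in units of the control gains' standard deviation.
    """
    control_gains = [post - pre for pre, post in zip(control_pre, control_post)]
    subject_gain = subject_post - subject_pre
    baseline = mean(control_gains)   # average retest (practice) gain in controls
    spread = stdev(control_gains)    # sample standard deviation within control
    excess = subject_gain - baseline
    z = excess / spread if spread > 0 else float("nan")
    return excess, z


# Hypothetical scores, for illustration only (not data from the study).
excess, z = gain_vs_controls(
    subject_pre=100,
    subject_post=112,
    control_pre=[98, 105, 110, 102],
    control_post=[101, 109, 112, 107],
)
print(f"excess gain: {excess:.1f}, in control SDs: {z:.2f}")
```

This only illustrates the arithmetic. It corrects for practice effects only to the extent that the controls took the same tests under the same conditions, including time spent, which is the point in dispute above.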