Steve Sailer recently commented on the SAT math score gains, suggesting, in effect, that the gains could be an artifact of changes in test composition and the subject pool. A recent study of 1.7 million test takers indicates that this is probably not the case: math scores have risen in parallel on the ACT and on EXPLORE, and it now seems that the whole bell curve has shifted to the right, ethnic differences and all.
Our results provide one of the key missing pieces to the Flynn effect puzzle: that the Flynn effect operates in the right tail. Flynn (1996) was correct in suggesting that IQ gains extend to every level. This result, along with the finding that the rate of gain in the right tail on the math subtests is the same as in the middle and lower parts of the distribution, illustrates for the first time that it is likely the entire curve that is rising at a remarkably constant rate.
The more interesting question, of course, is: “Are the generally intelligent getting more generally intelligent?” The findings paint a rather ambiguous picture.
Are the gains uncovered here genuine intelligence gains, are they due to artifact, or some of both? Jensen (1998) has argued that the increase in IQ scores over time is likely on the measure’s specific knowledge content rather than the g factor (e.g., Nettelbeck & Wilson, 2004; Rushton, 1999). He also provided a distinction between shadow aspects and true aspects of the IQ gains, using the analogy of trying to indirectly approximate the height of an individual by using their shadow rather than measuring their height directly. Jensen (1996, p. 150) notes that “it is still quite unknown just how much of the secular increase in scores on g-loaded tests is due to the ‘shadow’ aspect of mental measurement and how much is due to real changes in the biological substrate of mental development.” He suggested that one way to determine whether the gains are real or true is to examine the degree of gains on various composites and subtests. For example, for tests that are most g loaded such as the SAT, ACT, and EXPLORE composites, the gains should be lower than on individual subtests such as the SAT-M, ACT-M, and EXPLORE-M. This is precisely the pattern we have found within each set of measures and this suggests that the gain is likely not due as much to genuine increases in g, but perhaps is more likely on the specific knowledge content of the measures. Additionally, following Wicherts et al. (2004), we used multigroup confirmatory factor analysis (MGCFA) to further investigate whether the gains on the ACT and EXPLORE (the two measures with enough subtests for this analysis) were due to g or to other factors. Using time period as the grouping variable, we uncovered that both tests were not factorially invariant with respect to cohort which aligns with the findings of Wicherts et al. (2004) among multiple tests from the general ability distribution.
Therefore, it is unclear whether the gains on these tests are due to g or to other factors, although increases could indeed be due to g, the true aspect, at least in part.
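The composite-versus-subtest logic in the passage above can be made concrete with a toy calculation. What follows is only a sketch under stated assumptions, not the study's method or data: it assumes a one-factor model in which each subtest is a g loading times g plus an uncorrelated specific component, every subtest has unit variance, and the composite is the unweighted sum of the subtests. The function name `standardized_gains`, the loadings of 0.7, and the shift of 0.3 SD are all hypothetical values chosen for illustration.

```python
import math

def standardized_gains(loadings, g_shift=0.0, specific_shift=0.0, target=0):
    """Return (subtest_gain, composite_gain) in SD units under a one-factor model.

    Assumed model (hypothetical, for illustration only):
    x_i = loading_i * g + s_i, with Var(g) = 1 and Var(s_i) = 1 - loading_i**2,
    so each subtest has unit variance. Specific components are uncorrelated.
    The composite is the unweighted sum of the subtests.
    """
    # A shift in g reaches each subtest through its loading;
    # a specific shift affects only the target subtest.
    shifts = [lam * g_shift for lam in loadings]
    shifts[target] += specific_shift

    # Var(sum of subtests) = (sum of loadings)^2 + sum of specific variances.
    composite_var = sum(loadings) ** 2 + sum(1 - lam ** 2 for lam in loadings)
    composite_gain = sum(shifts) / math.sqrt(composite_var)
    subtest_gain = shifts[target]  # subtest SD is 1 by construction
    return subtest_gain, composite_gain

loadings = [0.7, 0.7, 0.7, 0.7]  # hypothetical g loadings, not from the study

# Scenario A: the cohort gain is entirely on g.
sub_g, comp_g = standardized_gains(loadings, g_shift=0.3)
# Scenario B: the cohort gain is entirely on the math-specific component.
sub_s, comp_s = standardized_gains(loadings, specific_shift=0.3)

print(f"gain on g:        subtest {sub_g:.3f} SD, composite {comp_g:.3f} SD")
print(f"gain on specific: subtest {sub_s:.3f} SD, composite {comp_s:.3f} SD")
```

Under these assumptions, a gain on g shows up more strongly on the composite than on any single subtest, while a gain on one subtest's specific content is diluted in the composite. The reported pattern (smaller gains on composites than on the math subtests) resembles the second scenario, though a toy model like this cannot by itself rule out a mixture of both.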
Unlike the difference between Blacks and Whites, for which measurement invariance holds (e.g., Dolan, 2000; Dolan and Hamaker, 2001; Lubke et al., 2003), the difference between cohorts is not readily comparable to differences within cohorts. The assumption that the two are comparable was, of course, central to Dickens and Flynn’s (2001) anti-hereditarian case.