Resurrecting alternative tests of Spearman’s hypothesis — Gordon’s method applied to Secular differences

Understanding the psychometric characteristics of group score differences is essential when it comes to interpreting their cause. In a recent post, I noted that the psychometric characteristics of differences, such as immigrant-native differences, could be explored in a number of ways. As a demonstration, I applied a method introduced by Gordon (1985) to immigrant data discussed by Richwine (2009).

To summarize the method: one can compute the point-biserial correlations between group membership and subtest scores; these correlations can be said to represent the factor loadings on a group difference factor (Gordon, 1985), since they are the loadings one would obtain by conducting a principal component analysis on the group variable together with the scores and rotating the axes so that the group variable had a loading of 1 (Jensen, 1987). One can then compare the averaged within-group g-loadings with the loadings on the group difference factor using Tucker’s congruence coefficient. A congruence coefficient above 0.95 is conventionally taken to indicate factorial identity. When the congruence coefficient between the averaged group g-factor and the between-group difference factor is greater than 0.95, one can, following common interpretative rules, interpret the group difference as being a difference in g (Jensen, 1987; Gordon, 1985). This method is an alternative to Jensen’s method of correlated vectors (MCV), which is based on the correlation between the magnitudes of the standardized group differences and the averaged g-loadings. Gordon (1985) argued that this method has important advantages (1). The method is best treated as one of many alternative methods, each of which has different advantages and disadvantages.
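The two computations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Gordon’s own procedure or code: the function names are mine, the subtest values are made up, and the conversion from a standardized difference d to a point-biserial correlation uses the usual large-sample formula r = d / sqrt(d^2 + 1/(pq)), where p and q are the proportions of cases in the two groups.

```python
import math

def point_biserial_from_d(d, p=0.5):
    """Convert a standardized mean difference d into a point-biserial
    correlation, given the proportion p of cases in one group
    (large-sample approximation)."""
    q = 1.0 - p
    return d / math.sqrt(d * d + 1.0 / (p * q))

def tucker_congruence(x, y):
    """Tucker's congruence coefficient: sum(x*y) / sqrt(sum(x^2) * sum(y^2)).
    Unlike a Pearson correlation, the vectors are not mean-centered."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den

# Hypothetical values for four subtests, for illustration only
# (not taken from Gordon, Jensen, Richwine, or Must et al.).
g_loadings = [0.80, 0.70, 0.60, 0.50]   # averaged within-group g-loadings
d_values = [0.90, 0.80, 0.65, 0.55]     # standardized group differences

# Loadings of each subtest on the group difference factor.
group_loadings = [point_biserial_from_d(d) for d in d_values]

cc = tucker_congruence(g_loadings, group_loadings)
print(round(cc, 3))
```

With real data, g_loadings would be the averaged g-loadings from the two groups and d_values the observed standardized differences; the resulting coefficient would then be compared against the 0.95 rule of thumb.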

Gordon (1985) applied this method to the Black-White difference and found that the Black-White factor was identical to the g-factor. I applied Gordon’s method to the Immigrant Mexican – Native White factor and the Immigrant White – Native White factor and found that the former was identical to the g-factor while the latter was not, with congruence coefficients (rCC) of 0.97 and 0.88, respectively.

[Table from Gordon (1985); image not available]

To test the robustness of this method, I applied it to the secular differences reported in Must et al. (2009), differences which were shown to be measurement non-invariant and for which an anti-Jensen Effect was found. Given the confirmatory factor analysis results and the MCV results, one would expect Gordon’s method, were it robust, to reject the position that the time 1 – time 2 factor is identical to g.

The results are below.

[Table: Gordon’s method applied to the Must et al. (2009) data; image not available]

The congruence coefficient was 0.75; the standard interpretation of this would be that the two factors, time 1 – time 2 and g, exhibit low similarity. Thus Gordon’s method seems to be robust.


1. In his critique of MCV, Gordon (1985) notes:

Consequently, we have no developed standard, other than the usual ones for judging correlations, that tell us how to evaluate the outcomes of a test of the Spearman hypothesis. Short of obtaining perfect or nearly perfect correlation, there is no way to know how large a nonzero correlation it is reasonable to demand as evidence.

Thus, Sandoval (1982) cautiously regarded a (rank) correlation of .48, which was significant with one-tailed test, as not “strongly supportive” (p. 200) of the Spearman hypothesis. A number of Jensen’s correlations are lower than .48, yet Jensen, correctly in my opinion, regards all of his sets of data as consistent with this hypothesis. Many readers may grant that Jensen’s mean correlation of .61 is a nontrivial result yet still not know what attitude to adopt towards the residual black-white differences or what to make of the batteries that yielded lower correlations.

Clearly, there is a problem with using correlations alone to test the hypothesis. Correlations measure covariation with respect to variation around the local mean, no matter how trivial that variation may be. Indeed, it is virtually axiomatic that the better an intelligence battery has been constructed, the more difficult it will be to find evidence of the Spearman hypothesis…

….Consequently, although a correlation is suitable for assessing how much of the variation that the Spearman Hypothesis accounts for, the same correlation may be unsuitable for identifying the underlying nature of the black-white difference — unless, of course, the correlation approaches 1.0. Thus, the task of assessing variance needs to be distinguished from the task of identifying what construct the population difference represents, if any.

Mean black-white differences can be expressed as point-biserial correlations. Such correlations can be viewed as subtest loadings on a black-white population factor or component, and that factor can be compared with g via the same coefficient of factorial similarity (or congruence) that Jensen used to compare general factors of blacks and whites in Table 3 (see his note 2 for the formula).

The factorial similarity coefficient (Harman 1960, p. 257) measures covariation with respect to variation around zero, rather than around the local mean. That zero is a meaningful one on the absolute scale of values taken by correlations, hence comparisons based on the coefficient remain on the same absolute scale from one application to another and from one factor to another. They also remain sensitive to the scale on which the correlations of the original factored matrices were expressed and to the signs of those correlations as reflected in the signs of the loadings. In contrast, variation about the mean loadings need have no relation to the scale or signs of original correlations, and so it is easy to contrive extreme examples in which the correlation is -1 but in which the similarity coefficient is positive and virtually perfect.
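Gordon’s closing point can be checked with a contrived case of the kind he describes. The Python sketch below uses made-up loadings (they come from no dataset) and hypothetical helper names: the two vectors are uniformly high but ordered in opposite directions, so the Pearson correlation, which works on deviations from the mean, is -1, while the congruence coefficient, which works on deviations from zero, is close to 1.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariation around each vector's own mean."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def congruence(x, y):
    """Congruence (factorial similarity) coefficient: covariation around zero."""
    num = sum(a * b for a, b in zip(x, y))
    return num / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

# Contrived loadings: all large and positive, but in reversed order.
loadings_a = [0.70, 0.72, 0.74, 0.76]
loadings_b = [0.76, 0.74, 0.72, 0.70]

r = pearson(loadings_a, loadings_b)       # approximately -1
cc = congruence(loadings_a, loadings_b)   # approximately 0.998
print(round(r, 3), round(cc, 3))
```

The spread within each vector is trivial relative to the size of the loadings, which is why the two statistics diverge so sharply here.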


Gordon, R. A. (1985). The black–white factor is g. Behavioral and Brain Sciences, 8(2), 229-231.

Jensen, A. R. (1987). Further evidence for Spearman’s hypothesis concerning black–white differences on psychometric tests. Behavioral and Brain Sciences, 10(3), 512-519.

Must, et al. (2009). Comparability of IQ scores over time. Intelligence, 37(1), 25-33.
