(Note: As of June 16, this post has been substantially updated. Quite a bit of background material was cut to keep it readable. Refer to my various other posts and blogs which touch on this subject.)
Recap: In the US, there is a large, stubborn Black-White differential in intelligence (Section A). This differential, on the individual and population level, explains a large portion of the social outcome difference. Within populations, intelligence is highly heritable. As such, the behavioral genetic default is that this differential also has a high heritability (Section N). It could be otherwise, though. As such, facts on the ground were explored and environmental explanations were evaluated. The facts include: (1) the difference is a true difference in psychometric intelligence (Section B), (2) it largely represents a difference in the general factor (Section C), (3) it has shown great persistence, having decreased little in the last century (Section A1), (4) there is currently regression with age (Section A2), (5) there appears to be a robust biological component to the difference (Section E), (6) the difference shows a Spearman/Jensen Effect (Sections D and G), (7) biometric analysis indicates that the gap has a sizable genetic component (Section H), (8) the difference is not caused by environmental influences unique to one or the other population (Section I), (9) if environmental influences are causing the gap, they act fairly uniformly across the population (Section K), (10) the difference is no smaller at the upper SES levels than the lower (Sections C and K), (11) family influences cannot explain the difference, statistically explaining a decreasing amount with age (Sections L and M), (12) mixed-race individuals perform intermediate to monoracial individuals, and this phenomenon has been noted for centuries (Section O), (13) the difference correlates with physical indexes of Caucasian admixture in the Black population (Section P), and (14) environmental interventions appear to show little to no lasting effect (Sections S and T). Causal biological explanations were then explored (Section F) and found to be wanting, capable of explaining at most 1/15th of the gap.
Causal cultural explanations were discussed (Sections D, H, K, and M and elsewhere), and it was noted that these seem unable to explain the Jensen Effect, the g-loadedness of the gap, the uniformity of the gap across the IQ spectrum, and the inter-individual, between-race stability at adulthood.
It goes without saying that differences in intelligence within populations are highly heritable:
Bouchard (2009) Genetic influence on human intelligence (Spearman’s g): How much?
That the population called “African Americans” or “Blacks” is genetically different from the population called “Whites”:
Zakharia et al. (2009) Characterizing the admixed African ancestry of African Americans
That there are socially significant genetically mediated differences between the two populations:
Ethnic disparity in preterm delivery between African Americans and European Americans has existed for decades, and is likely the consequence of multiple factors, including socioeconomic status, environment, and genetics. This review summarizes existing information on genetic variation and its association with preterm birth in African Americans. Candidate gene-based association studies, in which investigators have evaluated particular genes selected primarily because of their potential roles in the process of normal and pathologic parturition, provide evidence that genetic contributions from both mother and fetus account for some of the disparity in preterm births. To date, most attention has been focused on genetic variation in pro- and anti-inflammatory cytokine genes and their respective receptors. These genes, particularly the pro-inflammatory cytokine genes and their receptors, are linked to matrix metabolism because these cytokines increase expression of matrix degrading metalloproteinases. However, the role that genetic variants that are different between populations play in preterm birth (e.g. the SERPINH1 – G56 SNP) cannot yet be quantified. Future studies based on genome wide association or admixture mapping may reveal other genes that contribute to disparity in prematurity.
Anum (2009) Genetic Contributions to Disparities in Preterm Birth
That Blacks and Whites differ in psychometric intelligence:
That this difference explains a large portion of the social outcome difference (even when population level effects are not considered):
Nyborg and Jensen (1999) Occupation and income related to psychometric g
That the genetic difference in standardized deviations is commensurate with the phenotypic difference in Intelligence.
A. Persistence of the gap in the US
Racial differences in intelligence between Europeans and Africans, both in the US and outside, have been noted for over two centuries. In Notes on the State of Virginia, Query 14, Thomas Jefferson pointed to these, stating:
Comparing them by their faculties of memory, reason, and imagination, it appears to me, that in memory they are equal to the whites; in reason much inferior…The improvement of the blacks in body and mind, in the first instance of their mixture with the whites, has been observed by every one, and proves that their inferiority is not the effect merely of their condition of life.
The first quantitative estimate of the racial gap was made by Francis Galton. In Hereditary Genius (1869), Galton wrote:
Secondly, the negro race is by no means wholly deficient in men capable of becoming good factors, thriving merchants, and otherwise considerably raised above the average of whites—that is to say, it can not unfrequently supply men corresponding to our class C, or even D. It will be recollected that C implies a selection of 1 in 16, or somewhat more than the natural abilities possessed by average foremen of common juries, and that D is as 1 in 64—a degree of ability that is sure to make a man successful in life. In short, classes E and F of the negro may roughly be considered as the equivalent of our C and D—a result which again points to the conclusion, that the average intellectual standard of the negro race is some two grades below our own.
Galton proposed that there was a two-grade phenotypic difference between English and Black Africans. Two grades in his system is equivalent to 1.39 standard deviation units (Jensen, 1973), or about 21 IQ points on a test with a standard deviation of 15. One century after Galton’s publication of Hereditary Genius, Arthur Jensen published his 1969 Harvard Educational Review article, which set the IQ wars in motion. Jensen noted that the average African-American/European-American IQ gap was 15 points, or one standard deviation, and he speculated that 50% to 75% of this difference had a genetic basis:
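The conversions in this paragraph are simple arithmetic: a gap expressed in standard deviation units is multiplied by the standard deviation of the IQ scale. A minimal sketch, assuming the conventional scale SD of 15 (the function name is only illustrative):

```python
def sd_to_iq_points(gap_in_sd: float, scale_sd: float = 15.0) -> float:
    """Convert a gap in standard deviation units to points on an IQ scale."""
    return gap_in_sd * scale_sd


# Figures quoted in the text
print(round(sd_to_iq_points(1.39), 2))  # Galton's two grades, per Jensen (1973)
print(round(sd_to_iq_points(1.0), 2))   # Jensen (1969): one standard deviation
```

On a 15-point scale, 1.39 SD works out to roughly 20.9 points and 1 SD to 15 points.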
It is a subject with a now vast literature which has been quite recently reviewed by Dreger and Miller (1960, 1968) and by Shuey (1966), whose 578 page review is the most comprehensive, covering 382 studies. The basic data are well known: on the average, Negroes test about 1 standard deviation (15 IQ points) below the average of the white population in IQ, and this finding is fairly uniform across the 81 different tests of intellectual ability used in the studies reviewed by Shuey.
Jensen later estimated the magnitude of the US gap in relation to age in his magnum opus, The g-factor: “By five to six the mean difference is .7 SD (eleven IQ points), then approaches about 1 SD during elementary school years, remaining fairly constant until puberty, when it increases slightly and stabilizes at about 1.2 SD.” Thirty-seven years after Jensen published his paper, environmentalists James Flynn and William Dickens estimated the adult gap in the US to be 1.1 standard deviations, but argued that the gap had narrowed:
Rushton and Jensen (2006, this issue) concede that the magnitude of the Black-White IQ gap is not immutable, but could have narrowed by as much as 3.44 IQ points, or 0.23 White standard deviations. They concur that the Black-White IQ gap rises with age. Using Shuey’s 1966 data, Jensen (1998) estimated a gap of 0.70 standard deviations in early childhood, 1.00 standard deviations in middle childhood, and 1.20 standard deviations in early adulthood. Our current estimates are 0.31 (age 4), 0.63 (age 12), and 0.87 (age 18 …
[…] As for the gap of 1.1 standard deviations, the median age in the meta-analysis of Roth et al. would not be under 24. Our Figure 3 projected to age 24.7 gives a current IQ for Blacks of 83.5, or exactly 1.1 standard deviations below Whites.
Based on the data points provided, we have the following estimates of the magnitude of the gap in relation to age for the years 1920 to 1960 and 1960 to 2000:
As can be seen, there has been substantial narrowing at young ages but minimal narrowing at older ages. Estimates of the magnitude of the gap at younger ages in previous years vary, however. The figure below shows the estimated gap by age from 1922 to 1944 (midpoint 1933), 1944-1966 (midpoint 1955), and 1966. Based on these figures, there seems to be little relation between age and the magnitude of the gap, with the gap remaining constant at about 1 standard deviation across ages.
Whatever the case, the gap appears to have narrowed greatly at young ages and little at older ages. This is confirmed by an analysis of scores on the AFQT, Stanford-Binet, WISC, WAIS, Raven, Woodcock, DAS, Wonderlic, and the NLSY Peabody. (Data points here).
The persistence of the gap at older ages and the narrowing at younger ages have implications for the racial nature-nurture debate.
A1. Regression with age
As noted, a Black-White adult gap of over 1 standard deviation has been found for nearly a century. Shuey (1966), in a review of studies conducted during the first half of the 20th century, concluded that the Black-White adult gap was about 1 SD; forty years later, Flynn and Dickens (2006) concluded the same. Shuey also found that the gap at young ages was about 1 SD. Similar findings led the 52 signatories of “Mainstream Science on Intelligence” to endorse the following statement:
Racial-ethnic differences in IQ bell curves are essentially the same when youngsters leave high school as when they enter first grade.
Again, as mentioned, Flynn and Dickens (2006) presented evidence that the gap has narrowed mainly at younger ages. A genetic explanation of this is that the between-population difference is of the same nature as the within-population difference. Within populations, the heritability of IQ increases nearly linearly with age (Haworth et al., 2009); cognitively, with maturity, offspring grow to resemble their parents. As such, one would expect an increase in the magnitude of the gap with age when environment is controlled for.
Take the following mathematical relation:
Within group heritability is mathematically related to between group heritability and is a function of the between group difference. By this equation, if the within group heritability is zero, to take an extreme example, the between group heritability will also be zero (assuming no unique genetic factors).
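The relation described here appears to be the one usually attributed to DeFries (1972) and used by Jensen (1998). Assuming that is the intended formula, it can be written as below; the symbols follow the usual convention and are defined in the comments.

```latex
% h_B^2 : between-group heritability
% h_W^2 : within-group heritability
% t     : phenotypic intraclass correlation
%         (share of total phenotypic variance lying between groups)
% r     : genetic intraclass correlation
%         (share of total genetic variance lying between groups)
h_B^2 \;\approx\; h_W^2 \,\frac{(1 - t)\, r}{(1 - r)\, t}
```

Setting h_W^2 = 0 gives h_B^2 = 0 for any r below 1, which is the extreme case mentioned in the text. The quantity t follows from the size of the mean difference relative to the within-group spread, but r cannot be estimated from within-group data alone, so the formula links the two heritabilities without fixing h_B^2 by itself.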
From a genetic perspective, a significant percentage of the phenotypic Black-White youth gap was and still is due to parentally provided rearing environments; that is, it was and is a result of environmental intergenerational IQ transference. While the between-population IQ variance can be narrowed by outsourcing parenting (e.g., preschool, Head Start, early intervention, etc.), as the children age, it is predicted that their IQs will regress towards their parental population’s mean, just as children within populations do. This is what, in fact, is seen to occur with both transracial adoptions and early intervention programs (see Section N, Failure of Intervention Programs to Produce an Enduring Effect).
While a genetic hypothesis can readily explain the environmentally controlled increase with age and the historic pattern, it’s not evident how an environmental hypothesis can. Why has the gap decreased only among the young? Environmentalists could argue that the change in the gap represents a decrease due to a sort of trickle-up effect that has only reached younger cohorts, but longitudinal studies show otherwise (e.g., the Early Childhood Longitudinal Study). The alternative is to argue that the current age increase is due to a sort of cumulative deficit, but, if so, why didn’t the gap previously increase likewise? Were such a theory correct, we might have expected that the first-grade IQ gap (of about 1 SD) found by Coleman et al. (1966) and Shuey (1966) should have magnified to about 1.7 SD or so by the time those kids were 24. Obviously, it didn’t; as such, it’s evident that there was little accumulation of deficit.
Environmentalists have to explain how a set of environmental factors can currently produce a steadily widening difference with age from early childhood to adulthood and then abruptly level off at approximately the same time that the increase in heritability of IQ does. Since in previous years there was no comparable widening, these factors must have arisen in the last several decades, and their influence must have increased in potency so as to explain the secular increase in the correlation between age and the magnitude of the gap.
It’s worth noting that based on the small magnitude of gaps at young ages, Fryer and Levitt (2006) have attempted to construct an environmentalist proof. Refer to note .
A2. Persistence in adulthood.
Based on the data Flynn and Dickens (2006) provide, the Black-White gap, by adulthood, narrowed 0.1 standard deviations between 1920 and 2000. Below is a more complete list of data points:
The adult gap appears to be rather stubborn. In Understanding Human History (PDF copy of the book here), Michael Hart lucidly explained the implications of this stubbornness:
“Eight decades ago, when δ was first measured (from the tests given by the US Army in World War I), the average difference between the scores of American blacks and whites was about 17 points.
Of course, at that time (during the “Jim Crow” era) there was a vast difference between the environments in which most American blacks were born and raised (and continued to live in as adults) and the environment of most whites. The majority of American blacks lived in great poverty. Their job opportunities were severely limited— sometimes by company policies, sometimes by union rules, sometimes by custom, and sometimes by law. Most American colleges had miniscule numbers of black students, and many would not admit any blacks. In addition, the public schools that most blacks attended were severely underfunded. Because of their poverty, most blacks had poor housing, poor diets, and inferior medical care.
In the intervening decades, the situation of American blacks has improved enormously, both absolutely and in comparison to that of white Americans. Their average income is still considerably lower than that of whites, but there has been a very marked convergence between the environments of the two groups. It is difficult to measure the extent of this convergence precisely, but a reasonable estimate might be that the difference between the environments of typical American whites and blacks is only about one-third (or at most one-half) as great now as it was then. (We might call this fraction the remnant factor, or R.)
It follows that if δ was caused largely by environmental factors then it should have diminished enormously in the course of the last eight decades. Indeed, if hypothesis [a+] is correct, then δ should now be only about one-third to one-half of the original 17 points (i.e., 5.7 to 8.5 points). However, tests taken in recent decades indicate that δ is about 15 points today. The discrepancy is so large as to clearly refute hypothesis [a+]. Indeed, since δ appears to have diminished by only 2 points during that long interval, hypotheses [a] and [b] are also implausible, and hypothesis [d] appears to be the one that best fits the facts.
If the adult gap in the 1920s (or 1940s or 1960s, if those early figures are not seen as credible) was due entirely to environmental deprivation, we should expect a smaller gap now in proportion to the decrease in this deprivation. Based on the best current estimate, the adult gap is 1.1 SD. In the last 80 years (or 60 or 40), the gap has narrowed by only 0.1 standard deviations, or about 9%. To argue that the gap is entirely due to environmental deprivation, then, is to argue that the situation of Blacks in the United States relative to Whites has improved by no more than this amount, which is preposterous. The United States has shifted from a society which actively discriminated against African Americans to one which actively discriminates for them in the form of affirmative action, “celebrating diversity,” reducing “disparate impact,” anti-discrimination laws, and so on (e.g., Sackett et al., 2001; McDaniel, 2009). This massive shift appears to have led to a shift in cognitive ability of less than 10%. This is hardly suggestive of a primarily environmental etiology for the gap.
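The arithmetic behind this argument can be laid out explicitly. The sketch below assumes, as Hart does, that a purely environmental gap would shrink in proportion to his “remnant factor” R. That proportionality and the values of R are his assumptions rather than measured quantities, and the function names are illustrative.

```python
def expected_gap(initial_gap: float, remnant_factor: float) -> float:
    """Gap predicted if it scales linearly with the remaining environmental difference."""
    return initial_gap * remnant_factor


def percent_narrowing(old_gap: float, new_gap: float) -> float:
    """Observed narrowing, expressed as a percentage of the old gap."""
    return 100.0 * (old_gap - new_gap) / old_gap


initial_points = 17.0  # Hart's figure for the WWI-era gap, in IQ points
for r in (1 / 3, 1 / 2):  # Hart's guessed range for R
    print(f"R = {r:.2f}: expected gap = {expected_gap(initial_points, r):.1f} points")

# Adult gap in SD units: a 0.1 SD narrowing to 1.1 SD implies a start of 1.2 SD
print(f"Observed narrowing: {percent_narrowing(1.2, 1.1):.1f}%")
```

Taking the earlier gap as the base gives a narrowing of a little over 8%, and taking the current gap as the base gives about 9%; either way the figure is below 10%.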
B. The Absence of psychometric bias
Of the plausible explanations for the Black-White score difference on measures of intelligence, at least in the United States, psychometric bias is not one of them. The most detailed work on this issue was Jensen’s 700+ page Bias in Mental Testing (1980), which prompted both the National Academy of Sciences and the National Research Council to set up special committees to determine the issue; after exhaustive review of the evidence, both substantially agreed with Jensen (e.g., Wigdor and Garner, eds., 1982). Since then, a plethora of studies have investigated this issue using diverse methods and confirmed the conclusion. The most compelling evidence for the absence of psychometric bias comes from studies of measurement invariance in standardization samples (e.g., Dolan, 2000; Dolan & Hamaker, 2001; Lubke, Dolan, Kelderman, & Mellenbergh, 2003; Edwards and Oakland, 2006). The finding of measurement invariance implies that the factorial differences between groups are of the same nature as differences within groups (Wu et al., 2007). Wu et al. explain: “Mellenbergh (1989), Meredith (1993), and Meredith and Millsap (1992) provided a statistical definition of [measurement invariance]. Namely, an observed score is said to be measurement invariant if a person’s probability of an observed score does not depend on his/her group membership, conditional on the true score. That is, respondents from different groups, but with the same true score, will have the same observed score.”
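In symbols, the definition quoted above can be sketched as follows. The letters X (observed score), T (true or latent score), and G (group membership) are notation chosen here for illustration, not necessarily that of Wu et al.

```latex
% X: observed score, T: true (latent) score, G: group membership
f(X = x \mid T = t,\; G = g) \;=\; f(X = x \mid T = t)
\qquad \text{for all } x,\ t,\ g
```

Strictly, the condition equates the conditional distributions of observed scores, so two people with the same true score have the same expected observed score rather than literally identical scores.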
The difference represents a true difference in mental abilities.
C. Spearman’s hypothesis, the g-nexus, and the biologicality of the gap
IQ tests measure, to one degree of reliability or another, general intelligence. The general factor is of interest because 1) it’s psychometrically structurally similar across populations, sexes, ages, and cultures (and several species), 2) it stands at the nexus of a behavior-psychometric manifold with numerous educational, psychological, and sociological correlates, 3) it correlates with the cognitive complexity of activities, 4) it’s highly heritable (within populations), and 5) it has numerous neurophysiological correlates such as brain neural conduction velocity, cerebral glucose metabolic rate, the latency and amplitude of evoked electrical brain potentials, the volume of white and grey matter, the mass of the prefrontal lobe, etc. (Gottfredson, 2011). Convergent lines of evidence indicate that the US Black-White difference largely represents a difference in general intelligence or general mental ability. To review the evidence:
The numerous measured differences behave as if they were largely manifestations of a general intelligence difference both in terms of predictiveness and intercorrelation.
1. Predictive validity. IQ tests have roughly the same predictive validity for Blacks as Whites. They predict real world differences, such as scholastic and work performance with nearly the same accuracy. And g is the backbone or active ingredient of a test’s ability to generalize and predict diverse results (Jensen, 1998; Gottfredson, 2002; Sackett et al., 2008).
2. Matching for factor analyzed g statistically explains income and occupational differences between Blacks and Whites (Nyborg and Jensen, 1999).
3. Positive manifold between groups. The differences measured by IQ tests behave as if they were g differences, standing at the nexus of a behavioral psychometric manifold. They predict the between-group difference on virtually every other test that reliably measures general intelligence (e.g., industrial, military, and academic cognitive tests) (Roth et al., 2001; Roth, Bobko, and Huffcutt, 2003; Roth et al., 2008; Roth, 2010; Sackett and Shen, 2008; Gottfredson, 2006). On the individual and population levels, they also predict differences in social outcomes such as academic performance, work performance, performance on simulation exercises, trainability, crime rates, rates of single motherhood, rates of HIV infection, poverty rates, etc., all of which are related by their correlation with general intelligence (Gordon, 1997; Jensen, 1998; Rushton and Templer, 2011).
4. No other factor predicts the magnitude of the Black-White difference on diverse tests. The size of the difference in the US is unrelated to the cultural loading of tests, whether the test is verbal or non-verbal/written or oral/ timed or untimed/ multiple choice or response based, whether the test is a test of accumulated knowledge, digit span, reaction time, etc. (Jensen, 1998).
5. Non-cognitive correlations with IQ (weight, height, chronological age, and head circumference) are the same for both Blacks and Whites, evidencing that the same latent factor is being measured by IQ tests (Jensen, 1998).
Statistical analysis shows that the difference is as if it were largely a general intelligence difference.
6. A lack of measurement bias. Numerous statistical tests have shown that IQ tests are not psychometrically biased (e.g., analyses of the order of item difficulty, analyses of the item score correlation, tests of congruence of item characteristic curves, etc.). This implies that IQ scores have the same meaning for Blacks and Whites (Jensen, 1998; Miele, 2002).
7. Same factor structure. Differences within Black and White populations have the same factorial structure, with general intelligence at the apex (Jensen, 1998).
8. Spearman’s effect. In 26/26 studies, it has been found that the size of the Black-White gap increases in proportion to a subtest’s general intelligence loading (Miele, 2002). [note 1]. This finding that the difference is systematically related to the general intelligence loading of an assessment extends beyond cognitive ability tests per se. Other forms of assessment show a similar phenomenon. Roth et al. (2012) note:
9. Pseudo race experiments. The difference between Blacks and Whites was found to show a similar Spearman’s effect to the difference between Blacks and Whites at different ages, a difference which represents a biological effect (Jensen, 2003). Conversely, the difference was found to be dissimilar to the difference between deaf, blind, and non-handicapped children, a difference which represents a cultural effect (Kane and Brand, 2008). It has also been found to be dissimilar to the effect of adoption, an environmental effect (Jensen, 2003).
10. MGFA. In 3 separate studies, multigroup confirmatory factor analysis showed that the nature of the Black-White difference was consistent with a general intelligence difference (Dolan, 2000; Dolan and Hamaker, 2001; Lubke et al., 2003). [Note 2]
11. Other Factor Analysis. Factor analysis shows that the Black-White difference loads with general intelligence differences and heritable differences as opposed to secular gains (Rushton and Jensen, 2010). Principal factor analysis using Pearson’s correlation and analyses of the congruence coefficient have also shown that the differences are consistent with a general intelligence difference (Jensen, 1998, p. 409-410).
Other findings show that, neurologically and developmentally, the differences behave as if they were largely manifestations of a general intelligence difference.
12. A Black-White gap in reaction time and inspection time, both of which are indexes of information processing and g correlates, has been found (Noble, 1969; Jensen, 1993; Jensen, 1998, p. 389-400; Pesta and Poznanski, 2008). The elementary cognitive task difference demonstrates Spearman’s Effect (Jensen, 1993); differences are larger in proportion to the g-loading of the task.
13. MRI studies have shown that IQ differences correspond to differences in cortical volume in representative samples and that these differences are mediated by general intelligence (Karama et al., 2009; 2011).
14. Blacks show a lower rate of mental development. The chronological increase in raw scores on IQ tests occurs at a lower rate for Blacks than Whites (Jensen, 1998).
15. The Black-White difference correlates with heritability. The only plausible way to explain this for both hereditarians and environmentalists is to posit statistical g differences. (See section E below.)
Now, while between-population differences in g don’t necessitate genetic differences, they make them more plausible, since g differences, assuming that they are true latent differences, represent robustly biological differences. The basic syllogism can be expressed as follows:
1) There are g-differences between individuals, and these differences explain a substantial portion of outcome variance
2) These between-individual g-differences are biologically rooted and genetically based
3) There are g-differences between group X and Y and these differences explain a substantial portion of outcome variance
4) These between group g-differences are biologically rooted and stubborn
Differences in general intelligence imply robustly biological as opposed to cultural causation (Steppan, 2010). And, in the case of the Black-White difference in the US, genes are the most plausible candidates. This is because physiologically affecting environmental factors (e.g., lead poisoning, malnutrition, disease burden, etc.) are not typically found at higher SES levels, at least in the post-industrialized world, and yet it has consistently been found that the Black-White difference is larger at higher SES levels than at lower levels (Jensen, 1973; Herrnstein and Murray, 1994; Jensen, 1998). This suggests that biological insults are not behind the difference and leaves a genetic explanation as the most plausible account. (See section E. Failure of causal biological explanations.)
Plausible causal models for group differences by their relation to general intelligence
Steppan, 2010. Protestantism and Intelligence: Max Weber and the Rindermann-Paradox
The Black-White IQ gap as a function of SES
To clarify, the evidence indicates that the IQ gap in the US largely represents a biologically robust general intelligence gap. Given the persistence of the gap, indeed its increase, with SES, biologically affecting environmental factors are unlikely to explain this gap. Genetic factors are implicated.
D. Spearman’s Effect
As noted above, although the B-W gap behaves like a g-gap, some have disputed Spearman’s hypothesis (e.g., Malda et al., 2010). A more restricted claim is that the B-W gap is g-loaded (it shows Spearman’s Effect), and this makes numerous environmental explanations problematic.
Cognitive tests and subtests vary in their level of cognitive complexity or in the amount of mental manipulation involved. Some tests are simpler, such as forward digit span, which involves repeating a list of digits, and some tests are more complex, such as backwards digit span, which involves repeating a list of digits backwards. They also vary in their level of cognitive specificity. Some tap into abilities specific to a test (or task) and some tap into abilities general across tasks. For example, Peabody Picture Vocabulary, which involves pointing at pictures that match terms (e.g., yacht), largely taps into verbal ability, while Raven’s Matrices largely taps into a general mental ability. The latter is more g-loaded and the former is more s-loaded. As a result, the latter is more predictive of performance across diverse tasks. As it happens, cognitive complexity and cognitive generality correlate, a phenomenon which is prima facie evidence of general intelligence. This correlation between cognitive complexity and generality, of course, is not logically necessary. Were there multiple intelligences (e.g., Gardner), for example, cognitive complexity would, instead, correlate with cognitive specificity, as complex cognitive tasks would draw upon the specific abilities particular to a task rather than an ability general across tasks.
Cognitive Complexity and IQ
Now, Spearman’s Effect is an established fact. Blacks do worse on more cognitively complex tests and do worse on tests that better tap into the general factor; and this difference is found across cognitive fields (e.g., verbal, performance, mathematical, etc.). This, of course, is indicative of a gap in general mental ability, and these patterns confirm Spearman’s hypothesis (see above). One implication of Spearman’s Effect is that motivational explanations, per se, cannot easily account for the gap. Another is that ability-specific factors, at least individually, cannot either. Murray (2005) has succinctly summarized this point:
The black-white difference in digits-backward is about twice as large as the difference in digits-forward. It is a clean example of an effect that resists cultural explanation. It cannot be explained by differential educational attainment, income, or any other socioeconomic factor. Parenting style is irrelevant. Reluctance to “act white” is irrelevant. Motivation is irrelevant. There is no way that any of these variables could systematically encourage black performance in digits-forward while depressing it in digits-backward in the same test at the same time with the same examiner in the same setting.
It has been suggested that the correlation between the gap and complexity could result from specific mental ability disadvantages. One can see this line of thinking at work in a number of explanations for various B-W gaps. For example, numbers of words read has been proposed as the cause of the reading comprehension gap. James Flynn has made a case similar to this, stating:
Reverting to group differences at a given time, does the fact that the performance gap is larger on more complex than easier tasks tell us anything about genes versus environment? Imagine that one group has better genes for height and reflex arc but suffers from a less rich basketball environment (less incentive, worse coaching, less play). The environmental disadvantage will expand the between-group performance gap as complexity rises, just as much as a genetic deficit would.
Such an explanation could explain the correlation with cognitive complexity in a specific cognitive field. Accordingly, the effect of a specific deficit, say doing basic algebra, would compound and so result in a larger gap in, say, calculus, but it fails to explain the generality of the Black-White deficit across fields. Why would a simple arithmetic gap result in a calculus gap and a reading comprehension gap? Why would a basic reading gap lead to a reading comprehension gap and an odd-man-out reaction time gap?
It could, perhaps, be argued that the environmental factors which, say, led to the simple arithmetic gap were correlated with those that led to the basic vocabulary gap (e.g., as a result of poor early schooling), creating a sort of general environmental factor. But such an explanation fails to account for the g-loading of the gap within fields, which is not the same as the generality of the gap across them. If the Black-White difference were due to numerous intercorrelated specific ability deficits (e.g., a less rich basic math, reading, and reaction time environment), then as tasks became more complex (reading comprehension, calculus, and odd-man-out performance), deficits unique to the tasks would be compounded and the gap would become less correlated with g-loadings within cognitive fields.
Three independent aspects of the Black-White gap need to be accounted for: the generality of the gap across diverse cognitive fields, the correlation of the gap with complexity, and the correlation of the gap with the general factor loading of a test within cognitive fields. Group differences, of course, do not always show Spearman’s Effect, so it cannot be argued that any pattern of deficit would lead to this effect. As noted above, while the Black-White gap correlates with g-loadings, the gaps between deaf and blind individuals and unimpaired individuals of the same race do not. Likewise, gaps clearly due to test practice and educational differences do not show Spearman’s Effect. The same can be said about secular differences and the effect of adoptions. There is, in short, no necessary connection between a group difference and a g-loaded difference. When one is found, an explanation is wanting.
To review, the correlation between the Black-White gap with g-loadings across diverse cognitive fields (i.e., Spearman’s Effect) is indicative of a deficit in a general ability as opposed to numerous deficits in specific abilities. This itself does not evidence a genetic origin for differences. But, rather, it provides evidence against numerous environmental accounts. Specifically, both non-cognitive motivational explanations and explanations which focus on specific disadvantages or collections thereof are implausible. To the extent that Spearman’s Effect provides evidence for Spearman’s hypothesis [see 1], it further adds to the evidence against environmental accounts, specifically non-biologically based/ non-developmental accounts.
As for the latter point, Spearman effects (a.k.a. Jensen effects) tend to be associated with biological rather than cultural causation.
Jensen and non-Jensen effect clusters
Comparison / Correlation with g / Source
Black/White IQ difference-g (US) / 0.69 / Rushton, 1998
Reaction time-g / 0.61 / Jensen, 1998
Within-population white and gray matter volume differences-g / various, 0.5 to 0.9 / Colom et al., 2006
Fetal alcohol/normal population IQ difference-g / 0.56 / Juretko, 2006
Age difference (older versus younger)-g / 0.45 / Jensen, 2003
Inbreeding/normal population IQ differences, WISC-III-g / 0.39 / Rushton, 1999
SES IQ differences (based on biological parent’s SES for biological offspring)-g / 0.74 / Jensen, 1973
SES IQ differences (based on biological parent’s SES for adopted-away kids), g in Capron and Duyme cross-fostering / 0.55 / Jensen, 1998
SES IQ differences (based on adoptive parent’s SES), g in Capron and Duyme / 0.1 / Jensen, 1998
Flynn effect (Scotland) / -0.06 / Rushton, 1998
Flynn effect (Netherlands and US) / -0.07 / Te Nijenhuis, 2012
Protestant effect (Switzerland) / -0.22 / Steppan, 2010
Educational attainment in Spain on the WAIS / not specified for whole group, but negative / Colom, 2002
Flynn effect (USA) / -0.3 / Rushton, 1998
Flynn effect (Austria) / -0.32 / Rushton, 1998
Flynn effect (Germany) / -0.33 / Rushton, 1998
Flynn effect (Estonia) / -0.4 / Must et al., 2003
Protestant effect (Netherlands) / -0.5 / Steppan, 2010
Test trained-Control / -0.86 / Steppan, 2008
Test-retest (learning potential) / -1 / Te Nijenhuis et al., 2007
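Each entry in the list above is a correlation between two vectors: the g-loadings of a set of subtests and the size of some difference on those same subtests (an approach often called the method of correlated vectors). The sketch below shows only the mechanics of that computation. The subtest values are hypothetical, invented for illustration, and are not taken from any of the studies cited; the function name `pearson_r` is likewise just a label chosen here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL values for five subtests (illustration only).
g_loadings = [0.45, 0.55, 0.62, 0.70, 0.78]
# HYPOTHETICAL standardized differences between two groups on the same subtests.
effect_sizes = [0.10, 0.25, 0.20, 0.40, 0.45]

r = pearson_r(g_loadings, effect_sizes)
print(f"vector correlation: {r:.2f}")
```

With only a handful of subtests the resulting coefficient rests on very few data points, which is one reason the values in the list vary so widely from study to study.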
E. Biologicality of the gap
Above we saw that the African-European American difference represents a difference in general intelligence, or at least that it most probably does. Given the circumstances of the difference, this suggests a partial genetic origin. This conclusion, though, could readily be disputed. It could be maintained, for example, that differences in general intelligence don’t imply a robustly biological origin and that they can arise as a result of cultural differences (e.g., Wicherts, 2007), or it could be argued that the differences are, in fact, not g differences. We might then ask if there is more substantial evidence for the biologicality of the differences.
With regards to African and European Americans, there is quite a bit of such evidence.
It can no longer be disputed that brain mass, cortical volume, and cortical circumference correlate with general intelligence and are genetically correlated with it within populations. Evidence of differences in these traits, therefore, indicates biologically mediated differences in g.
As for such differences, Jensen (1998) summarized the brain mass findings from a large Case-Western Reserve (1980) study (N = 811 W, 450 B). An age-matched and height-adjusted B-W difference of ~100 g (~0.78 SD) was found, which is commensurate with the findings of Bean (1906), Mall (1909), Pearl (1934), and Vint (1934) as reported by Rushton and Ankney (2009). Correspondingly, Holloway (unpublished; discussed in Holloway, 2008) found a B-W difference of 63 grams (N = 1,391 W; 615 B). [Note 3.] Similar results have been obtained in imaging studies. Isamah et al. (2010) found that African Americans have 1 SD less total cerebrum volume than European Americans (“Variability in Frontotemporal Brain Structure: The Importance of Recruitment of African Americans in Neuroscience Research”). These findings fit with MRI studies that have shown that IQ differences correspond to differences in cortical volume in representative samples and that these differences are mediated by g (Karama et al., 2009; 2011). In addition to brain mass differences, (small) differences in cranial circumference have been found in the US (Rushton and Ankney, 2000).
Mean brain weight by race by age as reported in Ho et al., 1980
An additional line of evidence for the biologicality of the gap, at least in the US, is the increased Black rate of mild mental retardation. Most informative is that it persists when controlling for SES. As Chapman et al. (2008) note, intergenerational developmental (i.e., biological) factors are implicated:
Addressing all of these potential pathways related to maternal education may still not be enough to eliminate the large racial disparities found in mental retardation and among mild mental retardation placements in particular. Compared to White children, the prevalence of mild and moderate/severe mental retardation among Black children was 4.5 and 2.1 times higher. These racial disparities have persisted, even after controlling for sociodemographic factors (Yeargin-Allsopp et al., 1995). To fully address this problem, we may need to consider intergenerational risk factors, which involve the mother’s own developmental history. Maternal intergenerational factors clearly play a role in low birthweight (Emanuel, 1986; Emanuel, Filakti, Alberman, & Evans, 1992), and it is likely that other aspects of development, including cognitive development, also have an intergenerational component (Chapman & Scott, 2001). Intergenerational factors may explain, in part, why race differences in mental retardation placements and risk factors associated with mental retardation, such as low birthweight, have persisted, even after controlling for maternal factors, such as age, education, SES, and prenatal care (G. Alexander, Kogan, Himes, Mor, & Goldenberg, 1999; Din-Dzietham & Hertz-Picciotto, 1998; Foster, Wu, Bracken, Semenya, & Thomas, 2000; Migone, Emanuel, Mueller, Daling, & Little, 1991; Starfield et al., 1991) (Chapman et al., 2008. Public Health Approach to the Study of Mental Retardation).
Concerted efforts to decrease the mental ability gap, largely by decreasing the MR threshold, have reduced this, but a gap nonetheless persists. [Note 4]
The increased rate of MMR in the African American population across SES, in conjunction with the craniometric findings discussed above, reinforces the case for the biologicality of the Black-White gap and, as Chapman et al. (2008) note, points to intergenerational developmental causes. This reinforces the biological argument from Spearman’s hypothesis.
F. Failure of Causal Biological explanations
To explain the apparent biologicality of the difference, environmental-biological influences could be appealed to. Generally, we can classify potential causes of a mean difference according to the neuropsychological pathways by which they are proposed to work. The most general classes are cultural and biological causes, where the former refers to influences which act through sensory-informational pathways and the latter refers to influences which act through non-sensory physiological pathways. To say that the Black-White difference has an environmental-biological origin is to say that the cause lies in environmental influences which affect intelligence through the latter pathways. A near-exhaustive list of such possible influences is given by Wiessen (2009); they include: Prenatal exposure to pollutants; Prenatal experiences leading to low birth weight; Fetal alcohol syndrome; Maternal iron deficiency; Hunger; Organic disorders; Iron deficiency; Lead poisoning; Severe dehydration; Exposure to drugs; Postnatal exposure to pollutants; Postnatal exposure to heat; Poor health; Hypertension; Mercury exposure; Inequality in health and dental care; Inequality in immunizations, parasite infections, and rates of breast feeding. The commonality, again, is that the effects of these influences on mental ability are not mediated through sensation and perception.
These influences can be divided into prenatal and postnatal ones. One theoretical consideration with prenatal influences that needs to be borne in mind is that the net effect on a population mean is not clear, even when there are clear negative effects on the individual level. Specifically, insofar as these factors lead to fetal death and there is selection for physiologically healthier offspring, given that general intelligence is related to general fitness (Gottfredson et al., 2009), such influences, by pruning the physically and mentally weak, can, in principle, raise a population’s IQ.
On theoretical grounds, both Prenatal and Postnatal influences are problematic explanations for the mean differences, as they tend to be unshared environmental influences within populations. That is, they decrease the correlation between siblings. This is problematic as:
(a) Differential sibling regression to the mean studies imply that the causes of the B-W difference have a relatively uniform effect within and across families.
(b) The Black and White sibling correlations are equivalent (Jensen, 1974).
(c) The variance in IQ explained by unshared environmental factors (factors which make sibs different) is approximately the same in the Black and White populations. (See, for example, Guo and Stearns, 2002.)
This problem exists so long as the influences in single or sum are not relatively uniformly distributed between family members of the populations affected. And frequently the types of environmental-biological influences cited are not (e.g., nutrition and infectious disease; see: Corruccini, 1983).
Many specific explanations are untenable as:
(d) Exposure to the influences decreases with SES, and yet the differences are larger, or at least not smaller, at higher SES levels (Jensen, 1998). For an important class of environmental-biological influences (e.g., lead paint exposure, malnutrition, etc.), the rate of exposure is conditioned on SES. These types of influences, then, are poor candidates for explaining the gap at the higher SES levels; and obviously, if, in aggregate, they explained a substantial portion of the gap, then, in proportion, the gap should be smaller at higher SES levels, which it is not.
Prenatal factors are problematic specifically as:
(e) IQ differences increase with age; according to Flynn and Dickens (2005), Fryer and Levitt (2004), and Rippeyoung (2006), they are quite small at young ages. Indeed, Flynn makes this point:
It has also emerged that they steadily lose ground on white people with age. At just 10 months old, the average score is only one point behind; by the age of 4, it is 4.6 points behind, and by the age of 24, the gap is 16.6 points. This could be due to genes, but the steady rate after the age of 4 (about 0.6 IQ points lost every year) suggests otherwise, since genetically driven differences such as height differences between males and females tend to kick in at a certain age (Flynn, 2008. A tough call
While this is a poor argument against genetic influences for reasons discussed elsewhere, it is a fair one against prenatal ones, as such influences should affect performance, between populations, no less at young ages than at old. This is, after all, what is found within populations for many of these influences. This point about the magnitude of the gap, age, and biological influences is accentuated when one controls for cultural influences (see below). Cultural influences likely causally explain a substantial portion of the young age gap. Were biological-environmental influences to substantially causally explain the older age gap, one would have to propose either that the gap at young ages was large and negative or that the influence of these factors increases with age. Both are implausible.
Both prenatal and postnatal factors are problematic as:
(f) Differences are g-selective. Blacks are depressed in neither weight nor height, nor in rote memory, nor in psychomotor ability (e.g., reactivity to sensory stimulation and coordination; see: Haiback et al., 2011). And yet, it is empirically established that many of these influences (e.g., lead poisoning, mercury poisoning, malnutrition, etc.) affect memory and psychomotor ability.
Some of these causally biological influences can be classified as maternal factors (e.g., rates of breast feeding and birth weight). To the extent that they are said to affect Black mothers more frequently than White mothers, they are problematic explanations as:
(g) The scores of mixed race, Black and White, individuals are, at least at older ages, intermediate to those of monoracial Black or White individuals. And yet the vast majority of mixed race individuals have White mothers.
We can turn to specific proposed influences to see how our points above apply. In “Health Disparities and Gaps in School Readiness,” Janet Currie discusses numerous possibilities and makes the case that environmental-biological factors can explain 2 IQ points. She tells us:
Three common chronic conditions—dental caries, allergies, and ear infections—are potentially implicated in cognitive and behavior problems in children, but research is not yet far enough along to make it possible to estimate how large those effects might be.
Dental caries (tooth decay) is the most common childhood chronic condition. Chronic pain from dental disease can affect both children’s cognitive attainment and their behavior.
Here, it is being argued either (1) that there is a direct biological effect on latent intelligence, (2) that there is an indirect effect on latent intelligence, such that these influences lead to a differential development of IQ, or (3) that these influences lead to different IQ test scores (e.g., via distractions). (3) can be dismissed, as the IQ gap is a true latent intelligence gap. (2) represents causally cultural influences, not biological ones, which will be discussed later. (1) is problematic, by point (d) above, in addition to others, because the exposure to these influences is conditioned on SES, as Currie notes. Currie continues:
The literature on asthma strongly suggests that its greater prevalence among impoverished children could be due in part to characteristics of their housing. The degree of segregation by race, ethnicity, and income in American cities suggests that some groups are more likely than others to be exposed to environmental hazards. Moreover, to the extent that known environmental hazards are capitalized into housing prices, pollution will lower rents, making hazardous areas more attractive to poor people than to rich ones. Conversely, low land prices in poor neighborhoods may draw in new hazards. One environmental hazard whose effect on children’s health has been studied extensively is lead…A calculation similar to those made for ADHD and asthma suggests that differing exposure to lead might be responsible for 0.2 point of the average eight-point racial gap in scores assumed above. If racial disparities in exposure to other environmental hazards have also grown, exposure to such hazards could be an increasingly important cause of disparities in school readiness.
Again, we run into problems with our point (d) above (in addition to (a) through (c)) and, as lead poisoning affects memory and psychomotor abilities, with our point (f). Currie continues:
Iron deficiency is much more common among poor and black children than among other children. Twice as many black children as white children are iron deficient (16 percent versus 8 percent for toddlers), while poor children are more than 50 percent more likely to be deficient than nonpoor children. If iron deficiency impairs cognitive functioning, it could well be responsible for part of the test score disparities between blacks and whites and between poor and nonpoor children. Sally Grantham-McGregor and Cornelius Ani reviewed observational studies that followed a group of children over time and found that conditional on measures of social background, gender, and birth weight, low hemoglobin levels in children aged two or younger are strongly linked to poor schooling achievement, cognitive development, and motor development in middle childhood. These studies, however, do not establish a causal relationship, given the strong association between iron deficiency and other factors that could affect development, such as poverty.
In addition to points (a) through (c), (d), and (f), this explanation is problematic due to point (e). Turning to maternal factors, Currie continues:
Typically they find IQ gains of two to five points for healthy infants and up to eight points for low birth weight babies. Once again, however, given the strong relationship between breast feeding and various measures of socioeconomic status, it is unclear whether the association between breast feeding and cognition is causal….
[..]If, however, breast feeding does affect IQ scores, then the racial differences in prevalence are large enough to explain a significant part of the gap in the generic test score that I have been considering. Suppose, for example, that breast feeding for six months raises IQ by five points, or about one-third of a standard deviation. Then the fact that 29 percent of white infants, but only 9 percent of black infants, are breast fed for six months would generate a one point difference in average scores.
Within populations, controlling for maternal IQ substantially attenuates the correlation between breast feeding and offspring IQ. Since within populations, the relationship between breast feeding and IQ is only partially causal, between populations the relationship could be partially causal, fully causal, or fully non-causal. The reason for supposing that, between populations, it is less causal than not, is our point (g) in addition to (a) through (e). Whatever the case, the amount of difference explainable is not very large.
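The arithmetic behind Currie's breast feeding illustration can be checked directly: an assumed per-child effect multiplied by the difference in exposure rates gives the implied difference in group means. The sketch below uses only the figures from the quoted passage (a five-point effect, 29 percent versus 9 percent); the function name is a hypothetical label, and the calculation takes Currie's supposition of a fully causal five-point effect at face value.

```python
def mean_gap_from_exposure(effect_points, rate_a, rate_b):
    """Difference in group means implied by a fixed per-person effect
    and two different exposure rates (simple expected-value arithmetic)."""
    return effect_points * (rate_a - rate_b)


# Figures quoted from Currie: a supposed 5-point effect, and 29% versus 9%
# of infants breast fed for six months.
gap = mean_gap_from_exposure(5.0, 0.29, 0.09)
print(f"{gap:.1f}")  # prints 1.0
```

If only part of the within-population association is causal, the same formula applies with a smaller `effect_points`, and the implied gap shrinks proportionally.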
This exhausts the environmental biological explanations that Currie has to offer. In the same special journal issue, Nancy Reichman discusses the impact of birth weight in “Low Birth Weight and School Readiness”:
Only two studies of which I am aware have presented data indicating the potential effect of low birth weight on racial test score gaps. Yolanda Padilla and her coauthors, in a study using National Longitudinal Survey of Youth (NLSY) child data and focusing on the effects of the Mexican American birth weight advantage on early childhood development, found that low birth weight explains less than 1 percent of the (unadjusted) black-white gap in scores on the Peabody Picture Vocabulary Test-Revised (PPVT-R) among three- and four-year-olds in the late 1980s and early 1990s. Jeanne Brooks-Gunn and her coauthors presented a similar estimate in a recent analysis of the contributions of family and test characteristics to the black-white test score gap. Also using NLSY child data, they found that low birth weight and gender together explain less than 2 percent of the unadjusted racial gap in PPVT-R scores at age five.
My own estimate of the potential impact of birth weight on the racial gap in one test of cognitive ability—full-scale IQ score—is similar, though somewhat higher. My subject is all black and white infant survivors born in 2000, including multiples. In contrast to Padilla and Brooks-Gunn I do not use the NLSY data, because although that data set has actual test scores, it may underrepresent the very lightest babies. Instead I use vital statistics data, which provide exact race-specific birth weight distributions for surviving infants in the United States, though test scores must be imputed. I assigned an IQ score to each survivor, based on the infant’s birth weight. I then computed the racial gap in imputed IQ scores and divided this figure by the total observed racial gap in IQ scores, to compute the maximum proportion of the overall gap that can be explained by birth weight. Using various distributions of IQ scores based on past research and a range of assumptions, I found that birth weight explains a maximum of 3 to 4 percent of the racial gap in IQ scores, or one-half a point in IQ.
One major problem here is that the birth weight differences between races are substantially conditioned by genes (e.g., Anum et al., 2009). To the extent that these differences condition IQ differences, the IQ differences between races can be said to be partially indirectly genetic. (To note, the fact that birth weight differences predict African ancestry calls into question the practice of statistically controlling for birth weight when attempting to environmentally explain the gap.) Another is that we run into the problem of determining the population-level effect. As discussed above, higher rates of prenatal casualties may increase a population’s mean IQ.
Whatever the case, based on both authors’ best estimates, added together the influences mentioned could statistically explain only 2.5 points of a 16.5 point adult gap. But these influences co-vary to some extent, so their effects cannot simply be added. Taking into account co-variation, the amount statistically explainable is surely less than 2.5 points. And this would be the effect statistically explainable. To some extent the association between these influences and IQ will be mediated through parental IQ (e.g., breast feeding) and genetics (e.g., reproductive casualties). Taking this into account, the upper bound of the causal effect of the influences listed is likely closer to 1 point. Counting environmental influences, of course, is also problematic since the influences focused on are typically the ones thought to depress Black IQ relative to White IQ. Yet there are bound to be influences that run the other way; for example, the effect of older age of reproduction on IQ, which is a problem more for Whites.
It could be argued, nonetheless, that there are numerous undiscovered environmental-biological influences which possibly contribute to the IQ gap, but, as we said, we have a number of theoretical reasons for concluding that, in sum, these are causing at most a small part of the IQ gap. If they were causing more, we would see this in terms of: (a) either reduced differential sibling regression at higher IQ levels or non-linear regression, (b) substantially different sibling correlations between Blacks and Whites, (c) a noticeable difference in unshared variance between Blacks and Whites, (d) a decreased magnitude of the gap at higher SES levels relative to lower SES levels, (e) larger unexplained gaps at young ages, and (f) depressed sensory-motor and memory capacities in Blacks relative to Whites.
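The proportions in this paragraph follow from simple division. The sketch below restates them using only the figures given in the text (2 points from Currie, 0.5 points from Reichman, a 16.5 point adult gap, and a suggested causal upper bound of about 1 point); the variable names are labels chosen here, and the additive total is, as the text notes, an overestimate whenever the influences co-vary.

```python
# Figures as stated in the text.
currie_points = 2.0      # Currie's estimate for environmental-biological factors
reichman_points = 0.5    # Reichman's maximum estimate for birth weight
adult_gap = 16.5         # adult gap assumed in the text

# Naive additive total; treats the two estimates as non-overlapping.
total = currie_points + reichman_points
share_additive = total / adult_gap

# Share corresponding to the suggested causal upper bound of about 1 point.
share_causal_bound = 1.0 / adult_gap

print(f"{total:.1f} points is {share_additive:.1%} of the gap")
print(f"1.0 point is {share_causal_bound:.1%} of the gap")
```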
Overall, causally biological explanations seem to make for poor explanations of the gap.
G. Jensen Effect
The evidence above indicates that the gap has a robustly biological dimension. As some have pointed out though (e.g. Brody, 2002), this, by itself, doesn’t imply a genetic basis. We might ask, then, if there is a more direct connection between the genes and the gap.
Aptitude tests differ in their heritabilities: some tests are more environmentally influenced and other tests are more genetically influenced. A genetic hypothesis would predict that tests found to be more heritable for Blacks would also be more heritable for Whites and that the Black-White difference will correlate positively with indexes of heritability (and negatively with indexes of environmentality). In contrast, most formulations of the environmental hypothesis predict that the Black-White difference will correlate negatively with indexes of heritability (and positively with indexes of environmentality). A number of studies, using sibling correlations (Jensen, 1973, pp. 107-117; Jensen, 1998) and inbreeding depression (Jensen and Rushton, 2010) as indexes of heritability, have borne out the genetic hypothesis’ prediction. Just as the highly heritable differences within populations correlate with genes (the correlation being the square root of the heritability estimate), the between-population difference also correlates with genes.
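The parenthetical relation between heritability and the genotype-phenotype correlation can be written out. This is a sketch assuming a simple additive model with no gene-environment covariance; the symbol r_GP and the two heritability values are illustrative choices made here, not figures from the studies cited.

```latex
r_{GP} = \sqrt{h^2}, \qquad
h^2 = 0.50 \;\Rightarrow\; r_{GP} \approx 0.71, \qquad
h^2 = 0.80 \;\Rightarrow\; r_{GP} \approx 0.89
```

Note that this relation is defined for variation within a population; it does not by itself fix the value of any between-population quantity.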
Correlation between the Black-White gap and inbreeding depression
As to this point, Jensen (2012) summarizes this argument:
I had demonstrated in my research of the 1970s that mean Black–White differences in IQ were more pronounced on the more heritable, less cultural subtests. For example, Jensen (1973) cited a study by Nichols (1972) which found a correlation of r = .67 (p < .05) between the heritabilities of 13 tests estimated from twins and the magnitude of the Black–White differences on the same tests. I further demonstrated an inverse relation of r = .70 (p < .01) between the environmentality (the converse of heritability, that is, the percentage of variance that can be attributed to nongenetic factors) for 16 tests estimated from differences between siblings and the mean White–Black differences (Jensen, 1973)…
[…]Strong inference is possible: (1) genetic theory predicts a positive association between heritability and group differences; (2) culture theory predicts a positive association between environmentality and group differences; (3) nature + nurture models predict both genetic and environmental contributions to group differences; while (4) culture-only theories predict a zero relationship between heritability and group differences. These results provide strong and reliable corroboration of the hypothesis that the cause of group differences is the same as the cause of individual differences, that is, about 50% genetic and 50% environmental (Rushton & Jensen, 2005, 2010) (Jensen, 2012. Rushton’s contributions to the study of mental ability).
This evidence has, of course, been disputed. Revelle et al. (2011), for example, argue, in effect, that as better measures of general intelligence are correlated with heritability within groups, measures of gaps in g between groups would likewise be so correlated. Naturally, such an argument only works if we suppose that the gap is, in fact, a g gap. And that leads to the question of how that gap arose.
In general, it stands to reason that the type of environmental influences that cause the Black-White difference must be unlike those that cause the differences in cases where no Spearman/Jensen effects are found. From this we can infer that the cultural causes of the Black-White gap must be unlike the causes of the Flynn Effect and the Protestant Effect (causes unknown, but which both probably have to do with an increased emphasis on learning). It must be unlike the cause of the Deaf/Dumb-unimpaired gap, which is probably due to partial sensory deprivation. It must be unlike the test trained-untrained gap, which is due to immediate sensory-informational exposure. And it must be unlike the between-SES gap, when the SES measured is that of adoptive, not biological, parents, which is due to prolonged sensory-informational exposure.
H. Structural equation modeling
It might be maintained, nonetheless, that the correlation found between the magnitude of the gap and indexes of heritability is spurious and so requires no consideration. But more sophisticated techniques, which compare the pattern of the Black-White difference to the patterns of differences between siblings and half-siblings in Black and White samples, have confirmed this relation and have provided estimates of the between-population heritability. Two studies have used structural equation modeling (SEM) to determine the best-fit model for the pattern of sibling correlations, and both found that the best-fit model includes between-group heritabilities ranging from .36 to .74. The SEM findings, as the authors of the studies themselves have noted (e.g., Rowe 2005), are open to alternative interpretations and so not dispositive. Notably, they assume that the differences between populations are caused by the influences, genetic and environmental, which cause the differences within populations and that the statistical resemblance between the difference between populations and the differences within populations results from common within and between causes. In principle, differences could be due to unique factors or the statistical resemblance could be the product of some confounding influences. Yet the burden is on environmentalists to make the case, as the procedure used has been accepted as causally informative in other instances (e.g., between-sex differences in blood pressure). In general, a unique factor explanation for these results is not plausible, as discussed in section I below. If they are to be explained environmentally, one must invoke confounding influences.
As for confounding influences, a case has been made for these, but it fails to hold up. Specifically, Dickens (2005) attempts to explain the SEM results in terms of COVGE. He states:
The more the pattern of black-white differences across different tests resembles the pattern of genetic influence on different tests, the more the statistical procedure will attribute the black-white differences to genetic differences. Using this method, David Rowe and Jensen have independently estimated that from one half to two-thirds of the black-white gap is genetic in origin….
[…] Those cognitive abilities for which multiplier processes are most important will be the ones that show the largest heritability, because of the environmental augmentation of the genetic differences. But they will also be the ones on which a persistent change in environment will have the biggest influence. Thus we might expect that persistent environmental differences between blacks and whites, as well as between generations, could cause a positive correlation between test score heritabilities and test differences.
(Genetic Differences and School Readiness)
His explanation, though, predicts that the score differences between generations will positively correlate with general intelligence and heritability estimates. But we know this is not the case. As it is, the Black-White difference has been found to be unlike the secular differences in two important respects: (1) measurement invariance holds for the former but not the latter and (2) the former but not the latter shows a Spearman/Jensen Effect. As other researchers have noted:
This clearly contrasts with our current findings on the Flynn effect. It appears therefore that the nature of the Flynn effect is qualitatively different from the nature of B–W differences in the United States. Each comparison of groups should be investigated separately. IQ gaps between cohorts do not teach us anything about IQ gaps between contemporary groups, except that each IQ gap should not be confused with real (i.e., latent) differences in intelligence. Only after a proper analysis of measurement invariance of these IQ gaps is conducted can anything be concluded concerning true differences between groups. (Wicherts, Dolan, Hessen, et al. 2004. Are intelligence tests measurement invariant over time? Investigating the nature of the Flynn effect)
Flynn effect gains are predominantly driven by environmental factors. Might these factors also be responsible for group differences in intelligence? Group differences in intelligence have been clearly shown to strongly correlate with g loadings. The empirical studies on whether the pattern of Flynn effect gains is the same as the pattern of group differences yield conflicting findings. We present new evidence on the topic using a number of datasets from the US and the Netherlands. Score gains and g loadings showed a small negative average correlation. The general picture is now that there is a small, negative correlation between g loadings and Flynn effect gains. It appears that the Flynn effect and group differences have different causes (te Nijenhuis, 2012. The Flynn effect, group differences, and g loadings).
It seems that Dickens’ explanation of the SEM results fails. There could of course be other environmentalist explanations, but they have not been made. Generally, environmentalists who account for the SEM results in terms of X-factors have to identify influences which affect Blacks mostly uniformly relative to Whites, despite the variance unique to Blacks (e.g., Colorism). Environmentalists who appeal to the Flynn effect (directly or indirectly) or any other set of influences to explain these results in terms of confounds must do so while abiding by the logic of the Jensen/Spearman Effect.
I. Similarity of developmental processes and the demise of unique population explanations
There are a number of potential classes of causes for a between population difference. The causes could be unique population factors or common population factors. In the former case, the causal factors are unique to one or the other population or they have a unique effect on one or the other population; the significance of these factors is that they are unconstrained by within population heritability estimates. Rowe et al. (1994) summarize this issue:
It is conceivable that the causal processes leading to average levels would be different from those creating within-group variation in behavior. This possibility is real in the mathematical sense in that averages and correlations are statistically independent. However, for this alternative to hold requires also that minority-unique Factor X contributes to average level but does not contribute to variation among individuals… However, this argument—influences on means separate from those on individual differences—is a strong one because it requires nearly equal exposure to and influence of the unique causal mechanism in all exposed persons
In the latter case, the causal factors are shared by both populations and are related to within population variance in IQ in the same way (i.e., the correlations are the same). The significance of these factors is that they are constrained by within population heritability estimates.
Unique population factors can, in turn, be unique variable factors or X factors; unique variable factors cause variance within a population but are unique to one or the other population, while X factors are unique to one or the other population but cause no variance within a population.
Unique uniform factors (or X-factors). As we said, there are two classes of unique influences: unique uniform and unique variable influences. With the first class, the influence does not vary among individuals within the group affected, and because these influences don’t vary among individuals within the affected group, they have a uniform effect on individuals between groups. As such, these factors are uniform both within and between groups. It’s rather difficult to think of examples of these in the case of Blacks and Whites in the US, but some examples are more obvious between generations (e.g., iodine fortified salt and fluoridated water). These are influences to which the vast majority of individuals in one group (e.g., people born in the 2000s) are exposed and to which the vast majority of individuals in the other group (e.g., people born in the 1800s) were not. With regard to Blacks and Whites in the US, these are theoretically implausible for the following reasons:
(1) African-Americans are not affected uniformly, relative to Whites, as the difference between Blacks and Whites is conditioned on the color (and Negro appearance) of the former but not the latter. As such, the cause of the mean difference between Blacks and Whites can not, at least fully, be said to be unrelated to the cause of the differences within the Black population, which is what a uniform factor X hypothesis proposes.
(2) While there are mean environmental differences between the groups, there are no apparent ones which are invariant in one or the other. This situation is most clearly seen when comparing children of Black, White, and mixed parentage. For example, in the nationally representative NLSY 97, biracial individuals score 0.46 SD below monoracial Whites, and 0.49 SD above monoracial Blacks on the highly general intelligence loaded AFQT (Gullickson, 2004); owing to convention, these “mixed” individuals are classified as Black and constitute 10% of the “Black” population. These mixed race children do not appear to uniformly inhabit an environment either distinct from Whites and in common with Blacks or distinct from both Whites and Blacks. Given the continuum of differences between Whites, biracial individuals, Caucasian appearing Blacks, intermediate appearing Blacks, and Negro appearing Blacks and given the apparent absence of uniform environmental differences, these types of factors are implausible.
Unique variable factors. The second class contains influences which vary within the affected population. In the case of the Black-White difference, this class contains hypothetical influences such as Black culture, the effect of racism, and a minority caste system. Such types of influences have frequently been invoked. For example, Scarr (1982) postulated “Black culture” as the unique influence, stating:
As I have noted in several papers, the Factor X to which Jensen refers is none other than cultural differences in child-rearing styles, values, and emphasis on skills thought to be desirable for children to learn.
This was said in response to a statement about the constraints of within population heritability. Now the specific cultural differences mentioned are not mostly uniform across the Black population (e.g., upper and lower class African-Americans), so they are best classified as unique variable influences. The problem with this class is that such influences should alter the IQ-environment/kinship correlation matrices between populations. A unique influence which causes one group to differ from another will not leave the patterns of correlations between IQ and other variables untouched (assuming that there is variability within the affected population), just as poking a balloon will not leave the material surrounding the indentation unstretched. Yet Rowe et al. (1994) analyzed the correlation matrices between Blacks and Whites in six studies and found them to be statistically identical. Two of the studies included good indices of cognitive ability, the National Longitudinal Survey of Youth and the Richmond Youth Project. Rowe et al. concluded:
Our main result was that developmental processes in different ethnic and racial groups were statistically indistinguishable. Developmental process refers to the association among variables in these groups and to the variables’ total variances. This conclusion held for the examination of six data sources, containing a total of 3,392 Blacks, 1,766 Hispanics, and 8,582 Whites, and in one data source, 906 Asians. The patterns of covariances and variances were essentially equal when one ethnic or racial group was compared with another; moreover, this structural similarity between ethnic or racial groups was no less than that within random halves of a single ethnic or racial group.
Given the above, unique variable influences, as with unique uniform influences, do not seem to be promising candidates for explaining the group differences in the US.
To the extent environmental influences are causing the gap, they are influences that are common to both Blacks and Whites and which have the same relation in both populations.
K. Sibling and offspring regression towards the mean and the demise of common variable explanations
We saw above that the Black-White American gap correlates with genes, is robustly biological, persists across the SES distribution, and most certainly represents a gap in general intelligence. Together, these lines of evidence are indicative of a partial genetic etiology. A further line of evidence, albeit indirect, comes from regression to the mean studies.
It’s a mathematical given that two groups drawn from a common population will show regression, but unless there is some differentiating factor with respect to the dimension measured, they will regress towards a common mean. If two groups regress towards separate population means, a causal explanation is needed. It has been found that Black siblings show differential regression (Jensen, 1973 p. 117-119; Jensen 1998 p. 358; Murray, 1998). For example, Black siblings of children with IQs of 115, have IQs of 100, while White siblings of children with IQs of 115 have IQs of 108; conversely, Black siblings of children with IQs of 70, have IQs of 78, while White siblings of children with IQs of 70 have IQs of 85. Related findings have been found in the case of parents and offspring, with parental IQ indexed by SES. The children of Blacks with high SES have IQs around that of the children of Whites with low SES (Coleman et al 1966; Shuey, 1966; Wilson, 1967; Scarr-Salapatek 1971). From a genetic perspective, this is because the children of higher IQ Blacks regress downwards to a population mean of 85 and the children of lower IQ Whites regress upwards to a population mean of 100.
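The sibling figures quoted above can be roughly reproduced with the standard linear regression-to-the-mean formula. The following is only a minimal sketch: the sibling correlation of 0.5 is an assumption chosen for illustration (the text does not state one), the function name is hypothetical, and the population means of 85 and 100 are the ones given in the paragraph above.

```python
def expected_sibling_iq(proband_iq, population_mean, sibling_correlation=0.5):
    """Expected sibling score under simple linear regression to the mean.

    sibling_correlation=0.5 is an illustrative assumption, not a value
    taken from the studies cited in the text.
    """
    return population_mean + sibling_correlation * (proband_iq - population_mean)


if __name__ == "__main__":
    # Population means as stated in the text (85 and 100).
    for proband_iq in (115, 70):
        for population_mean in (85, 100):
            sibling = expected_sibling_iq(proband_iq, population_mean)
            print(f"proband {proband_iq}, mean {population_mean}: sibling {sibling}")
```

Under these assumptions the sketch gives 100 and 107.5 for a matched IQ of 115, and 77.5 and 85 for a matched IQ of 70, which is close to the figures quoted above.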
Offspring regression and SES (first half of the 20th century):
Offspring regression and SES (2008 SAT):
The neatest demonstration of this regression effect is shown by matching Black and White children for IQs and comparing the IQs of their siblings:
A genetic hypothesis predicts this regression. An environmental hypothesis could account for this, but it is incumbent on environmentalists to explain why this pattern of regression is fairly constant across generations, social class, and the IQ spectrum. The most plausible environmental explanation for the sibling regression is that Blacks show regression like Whites but that the IQs of Black children are artificially depressed by a uniform amount, giving the illusion that Blacks regress towards a lower mean (e.g., Brody, 2002). According to this explanation, Blacks with an IQ of 115 really have genotypic IQs of 130, but are environmentally depressed by 15 points; their siblings, then, who have phenotypic IQs of 100 (instead of 107 as with Whites), really have genotypic IQs of 115, but again are environmentally depressed by 15 points. Together, this gives the illusion that Blacks are genetically regressing towards a lower mean. Such an explanation is only tenable, though, if the proposed depressing factor uniformly affects all Black families across social class and the IQ spectrum about equally. Were this not the case, the regression effect would not be linear (e.g., across the IQ spectrum). To put it another way, to explain the regression results environmentally one needs to posit common uniform factors. These factors cannot be unequally distributed in the Black population relative to IQ matched Whites.
To put this another way, we can maintain that the Black-White difference is due to either variable-between or uniform-between influences. Variable-between and uniform-between influences differ in terms of the distribution of the environmental effect. In the former case, the effect is not distributed equally (some members of a population are affected more and some less), and in the latter case it is. In both instances there is variance within groups. To illustrate: we could imagine a situation in which the number of books a person has influences intelligence, the correlation between the number of books and IQ is 0.2, the standard deviation of the number of books is 1, and, for both of the groups in question, Blacks and Whites, the number of books is normally distributed. Now, further, we could imagine an initial condition in which both groups started out with the same average number of books. If 5 books were later confiscated from every Black individual, in our situation, depressing each individual’s IQ by 1 SD relative to the individual’s IQ at the initial condition, we would have a uniform-between influence, since Black individuals would be uniformly depressed relative to Whites. In this situation, were we to match Blacks and Whites for the same IQ at the later time, we would find that all Black individuals had 5 fewer books than White individuals of the same IQ (5 x 0.2 = 1 SD). Accordingly, a Black individual with an IQ of 115 would be equivalent to a White individual who would have had an IQ of 130 were he not deprived of 5 books. Alternatively, if the Black population was divided into quarters and 0, 2, 8, and 10 books were later confiscated from each, depressing the quarters 0 SD, 0.4 SD, 1.6 SD, and 2 SD respectively, we would have a variable-between influence. In this situation, were we to match Blacks and Whites for the same IQ at the later time, we would find that some individuals would have 0 fewer books and some would have 10 fewer books.
Some Black individuals would be affected by the various influences depressing the Black population on average, and some would be unaffected. Now many people assume that the mean difference is due to variable-between influences; they typically don’t suppose that Blacks with an IQ of >115 are afflicted to the same degree that Blacks with an IQ <85 are.
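The arithmetic of the books illustration can be laid out in a short sketch. It uses only the hypothetical numbers from the illustration itself (0.2 SD of IQ per book, 5 books removed uniformly, or 0, 2, 8, and 10 books removed by quarter); the function and variable names are made up for this example. It shows that the two scenarios produce the same mean depression and differ only in how that depression is spread across the group.

```python
# Hypothetical figure from the illustration: correlation 0.2 with SD of books = 1,
# treated (as in the text's "5 x 0.2 = 1 SD") as 0.2 SD of IQ per book.
IQ_SD_PER_BOOK = 0.2


def depression_sd(books_removed):
    """IQ depression, in SD units, implied by removing a number of books."""
    return IQ_SD_PER_BOOK * books_removed


def mean(values):
    return sum(values) / len(values)


def spread(values):
    """Range of the depression across the four quarters of the group."""
    return max(values) - min(values)


# Uniform-between: every quarter of the group loses 5 books.
uniform = [depression_sd(5) for _ in range(4)]
# Variable-between: the quarters lose 0, 2, 8, and 10 books.
variable = [depression_sd(books) for books in (0, 2, 8, 10)]

if __name__ == "__main__":
    print("uniform  mean:", round(mean(uniform), 2), "spread:", round(spread(uniform), 2))
    print("variable mean:", round(mean(variable), 2), "spread:", round(spread(variable), 2))
```

Both scenarios average a 1 SD depression; the uniform case has no spread across quarters, while the variable case ranges from 0 to 2 SD.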
Were the influences depressing the Black population variable, then Black individuals who had higher IQs would, on average, be less affected, as less affected Black individuals would have higher IQs. Differential sibling regression is an index of depressive influence. If Blacks and Whites are matched for the same IQ, and the siblings of Blacks regress to a lower mean than that to which the siblings of Whites regress, we can reasonably conclude that both the matched Blacks and their siblings are depressed by the magnitude of the differential regression divided by 0.6. (The alternative is to argue that the unmatched Black siblings are depressed but the matched ones are not; the problem with the alternative is readily obvious. Typically, siblings of IQ matched Blacks are found to regress 0.6 SD below siblings of IQ matched Whites, when Blacks and Whites differ on average by 1 SD. Were we simply to posit a uniform depression of 1 SD, we would get our differential regression of 0.6. However, were we to posit that Black sibs were variably depressed 0.6 SD (by factors which vary within families, of course), we could only explain 0.3 SD of the total depression (0.6 x ½ of our sibs). The other 0.7 SD would have to act either variably or uniformly or through some combination of the two with respect to the Black siblings. Whichever way, we would get a differential regression of substantially more than 0.6 (e.g., uniform: 0.7 SD x 0.6 + variable: 0.6 SD).) Now, since siblings of IQ matched Blacks do show differential regression (i.e., they regress to a different mean) and since this regression is no less at the upper end of the IQ spectrum than the lower end, we can conclude that the influences depressing the Black population are not substantially of the variable-between sort.
Hence we need to search for common influences which vary within populations which fairly uniformly depress the Black IQ.
L. Gene-environment architecture
The heritability of IQ for both Blacks and Whites is high at older ages. For example, in an analysis of the nationally representative Add Health data, Guo and Stearns (2002) found respective Black and White adolescent heritabilities of .57 and .71 and respective between family environmentalities of .16 and .03; similarly high within population heritabilities have been found in other studies (Rushton and Jensen, 2005). (See note 13.) In the absence of measurement bias, unique factors (see section G), and substantial G-E correlations, high within population heritabilities constrain between population environmental influences. If, for example, two populations have common between family environmentalities of .16 and strict measurement invariance holds, for environmental influences to create a 1 SD difference, 2.5 SDs of environmental influence are needed. As noted, several studies have shown that, in the case of Blacks and Whites, there are no detectable unique factors (e.g. Rowe et al., 1994; Rowe et al., 1995) and several other studies have shown that differences between Blacks and Whites are not due to measurement bias (Dolan, 2000; Dolan and Hamaker, 2001; Lubke, et al. 2003). This implies that to the extent between family explanations are employed to explain the gap, they must be rather large. (This point is discussed in more detail, here.) This situation becomes dire at the upper end of the SES distribution, where we saw that the gap was somewhat larger (section B). It has been shown that heritability varies as a function of SES (Nisbett et al., 2012). At higher levels of SES, the between family environmental influence drops close to zero. And yet at these SES levels, the mean B-W IQ difference is 1.1 SD. Between family factors then seem to show no promise as explanations for the Black-White difference at higher levels of SES. [See note 11 for an elaboration of this.]
One possible alternative, proposed by Flynn and Dickens (2001), is that heritability estimates are confounded by G-E correlations. If so, the apparent high heritability of IQ would not restrict the environmental malleability of IQ. GE models of IQ’s heritability, though, are contradicted on a number of grounds. [Note 5, 6]
It has been argued, though, that this magnitude of influence is not implausible. But we can see how this plays out when it comes to strictly family influence explanations. If the gap of approximately 1 SD at older ages is to be explained by the effects of home influences, given a within population shared environmentality of no more than 0.2 at this age, at least 2.2 SD of early home influences needs to be posited. But, were there 2.2 SD of early home influences, then the gap in early childhood, when the shared environmentality is around 0.4, should be at least 1.4 SD. Alternatively, if the gap is, at most, 0.7 SD in early childhood when the shared environmentality is 0.4, then to explain the gap 1.1 SD of effect needs to be proposed. But if there is only 1.1 SD of effect, this could only explain 0.5 SD at older ages when the shared environmentality is 0.2. This theoretical prediction, of course, substantially overpredicts the magnitude of the gap at young ages (or underpredicts the magnitude of the gap at older ages). From this we can infer that either (a) shared environmental factors do not account for a large part of the IQ gap at older ages or (b) the effects of these amplify with age between populations, despite diminishing within.
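The figures in this section appear to follow from a simple conversion: if shared environment accounts for a proportion c² of IQ variance, then one SD of shared environmental difference corresponds to sqrt(c²) SD of IQ. The sketch below assumes that conversion (it is an inference from the numbers quoted, not a formula stated in the text), and the function names are hypothetical.

```python
import math


def influence_needed(gap_sd, shared_env):
    """SDs of shared environmental difference needed to produce a gap of gap_sd,
    assuming 1 SD of environment maps to sqrt(shared_env) SD of IQ."""
    return gap_sd / math.sqrt(shared_env)


def gap_explained(influence_sd, shared_env):
    """IQ gap (in SD) produced by a given environmental difference under the
    same assumption."""
    return influence_sd * math.sqrt(shared_env)


if __name__ == "__main__":
    # Shared environmentality .16, 1 SD gap (figure from the start of this section).
    print(round(influence_needed(1.0, 0.16), 1))
    # Older ages: c^2 = 0.2, 1 SD gap.
    older = influence_needed(1.0, 0.2)
    print(round(older, 1), round(gap_explained(older, 0.4), 1))
    # Early childhood: c^2 = 0.4, 0.7 SD gap.
    early = influence_needed(0.7, 0.4)
    print(round(early, 1), round(gap_explained(early, 0.2), 1))
```

Rounded to one decimal, this reproduces the 2.5, 2.2, 1.4, 1.1, and 0.5 SD values used in this section.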
M. The partial demise of family influence explanations and the problem of stability
To verify our theoretical prediction above (that family influences statistically explain a diminishing share of the gap with age), we can look at the empirical data. (It goes without saying that controlling for environmental factors controls for genes. See Cleveland et al. (2000) for a discussion of the empirical results; genes explain approximately 50% of the IQ variance explained by environmental factors. Also see Plomin, 1994, Nature and nurture: genetic contributions to measures of the family environment.)
Below are a number of studies that look at the magnitude of the gap statistically explainable by home environmental influences. As can be seen, home environmental influences, many of which are shared family influences, have a diminishing ability to explain the gap with age, as was predicted. Other data agree with this. For example, Cordero-Guzman (2001) found that home environmental factors (i.e., receives magazines, has a library card, net family income, highest grade of mother + father), school factors (quality of school), and individual factors (highest grade achieved) accounted for only 50% of the one standard deviation B-W difference in AFQT scores in the NLSY97 (ages 12 to 17). This can be contrasted with Yeung and Pfeiffer’s age 5 to 12 data and others’.
From an environmental perspective, which assumes no genetic differences and therefore that no genetic influences are being controlled for, only half at most of the adult gap can be explained by family influences. Extra-family influences need to be sought out as explanatory factors (e.g., peer influences). And yet, there is the issue of stability. In The Bell Curve, Murray and Herrnstein made the point that, by adulthood, IQ differences are stable. Within populations, this stability can be seen in the high test-retest correlation in adulthood as compared to the low correlation in early childhood (see Brody, 1992 p. 233; Intelligence, Chapter 8: Continuity and Change in Intelligence). The difference between individual Blacks and Whites is likewise stable by adulthood. This stability needs to be explained.
It so happens that environmental influences that are not shared by family members don’t induce such stability. Rather, stability is conditioned by genes and shared environment and yet shared environment, as we see, can’t explain the older age gap.
US Collaborative Perinatal Project in Fryer, 2010. Children born between 1959 and 1965. Measure: various. Controls: parental income, parental occupation, mother’s age, number of siblings, mother’s reaction to and interaction with the child, birth weight, prematurity.
Children of the National Longitudinal Survey of Youth 1979 in Duncan and Magnuson, 2005. Children born in the 1990s. Measure: PPVT. Controls: grandparents’ education; grandparents’ occupation; Southern roots; mother’s number of siblings; mother’s number of older siblings; no one in mother’s family subscribed to magazines, newspapers, or had a library card; percent of white students in mother’s high school; student-teacher ratio in mother’s high school; percent teacher turnover in mother’s high school; mother’s educational expectations; mother’s self esteem index; two indicators for mother’s sense of control or mastery; interviewer’s assessment of mother’s attitude toward interview; mother’s education; father’s education; child birth weight; child birth order; family structure; mother’s age at child’s birth; household size; set of dummy variables for average income; mother’s AFQT score; mother’s class rank in high school; and interviewer’s assessment of mother’s understanding of interview.
The Children of the National Longitudinal Survey of Youth 1979 in Fryer, 2010. Children born mostly in the 80s to 90s. Measure: PIAT (math and reading). Controls: free lunch status, special education status, whether the child attends a private school, family income, the HOME inventory, mother’s AFQT.
The Early Childhood Longitudinal Study, Birth (1-4) /Kindergarten (6+) Cohort in Fryer, 2010. Children born in the 90s. Measure: various. Controls (ECLS-B): socioeconomic status, mother’s age, number of siblings, family structure (child lives with “two biological parents,” “one biological parent,” and so on), Nursing Child Assessment Teaching Scale (NCATS), birthweight, the amount premature that the child was born. Controls (ECLS-K): parental education, parental occupational status, household income, child’s age at the time of enrollment in kindergarten, WIC participation, mother’s age at first birth, birth weight, and the number of children’s books in the home.
Panel Study of Income Dynamics (PSID) in Yeung and Pfeiffer, 2008. Children born between 1980 and 1992. Measure: Woodcock-Johnson subtests. Controls: grandparents’ education, mother’s characteristics at child’s birth (whether received AFDC while pregnant, whether a teenage mother), child’s characteristics (gender, low birthweight, birth order), parental SES, number of children in family, family structure, urbanicity index, whether the child ever attended a private school, parenting behavior, and mother’s test score.
National Education Longitudinal Survey in Fryer 2010. Children born in late 1970s. Measure: Math and reading achievement tests. Controls: family income and parents’ levels of education.
N. Sub-population differences, Heritability, and Probability
Contrary to what is often said, Heritability estimates allow one to make a probabilistic statement about the factors underlying a random individual’s or subpopulation’s deviation from the population’s mean. The probability that genetic factors contribute more to an individual’s or subpopulation’s deviation from the mean than environmental factors is given by the following formula:
(Tal. 2010. From heritability to Probability)
Example: For a random subpopulation with a mean IQ of 80, whose population has a heritability of IQ of 0.8, there is an 84% chance that the deviation from the mean is due more to genetic than environmental factors.
From the math, given the within population heritabilities, it follows that the Black-White difference, of roughly one standard deviation, is a priori more probably genetic than not. That is, the behavioral genetic null hypothesis, or default, is that differences are more genetic than not. The burden is on environmentalists to show that, with regards to cognition, there are systematic between race differences. To the extent their explanations for why the default is incorrect fail, we are left with the partial genetic hypothesis.
And, indeed, no environmental explanation has yet been able to account for the facts that need to be explained.
O. Hybrid studies
Discussions of biracial performance in the context of a genetic hypothesis go back to the 1700s. Thomas Jefferson commented on the phenomenon in Notes on the State of Virginia:
Comparing them by their faculties of memory, reason, and imagination, it appears to me, that in memory they are equal to the whites; in reason much inferior….The improvement of the blacks in body and mind, in the first instance of their mixture with the whites, has been observed by every one, and proves that their inferiority is not the effect merely of their condition of life. We know that among the Romans, about the Augustan age especially, the condition of their slaves was much more deplorable than that of the blacks on the continent of America…Yet notwithstanding these and other discouraging circumstances among the Romans, their slaves were often their rarest artists. They excelled too in science, insomuch as to be usually employed as tutors to their master’s children. Epictetus, Terence, and Phaedrus, were slaves. But they were of the race of whites. It is not their condition then, but nature, which has produced the distinction. (Jefferson, 1781)
As Reuters (1911) noted, the commonly held view was that Mulattoes were intellectually superior to full-blooded African-Americans.
Mulattoes always have enjoyed opportunities somewhat greater than those enjoyed by the rank and file of the black Negroes. In slavery days, they were most frequently the trained servants and had the advantages of daily contact with cultured men and women. Many of them were free and so enjoyed whatever advantages went with that superior status. They were considered by the white people to be superior in intelligence to the black Negroes and came to take great pride in the fact of their white blood…. The higher the standard of success, the lower the per cent [sic] of full-blooded Negroes. (The superiority of the Mulatto, pg. 378-379)
This prejudice was confirmed in 1914 soon after the development of cognitive ability tests. For example, Ferguson (1914) tested Mulattoes in Virginia and found the following:
In the IQ wars, Environmentalists (e.g., Nisbett and Flynn) have made much out of a few studies which show little or no significant difference between biracials and Whites. Others (e.g., Murray and Rushton) point to studies which show their hypothesis’ predicted gap and contend that the biracial data largely supports a genetic interpretation. The studies present a conflicting picture and it’s difficult to adjudicate between the environmentalist and hereditarian claims because the subject numbers are rather small. One way to approach such situations is by conducting a meta-analysis. To that end, I did a literature search and located 6 studies from the 1970s on that contain data on the IQs of American biracials. I then computed Cohen’s d for Biracial-Whites and Biracial-Blacks across studies and found respective d’s of 0.29 (N= 577) and 0.49 (N= 431).
Scarr et al. (1994)/n=55/d=0.66
Willerman et al. (1974)/n=129/d=0.3
Harrison et al. (2001)/n=128/d=0.32
Scarr et al. (1994)/n=55/d=0.55
Willerman et al. (1974)/n=129/d=0.55
Harrison et al. (2001)/n=128/d=0.51
I then estimated the between group heritability. This was done by determining the cross sample average deviation from the genetic prediction that Biracials will perform halfway between the parental populations and then subtracting this from one. So, for example, Scarr et al. deviated 3% from a genetic prediction while Eyferth deviated 93%. The assumption here is that genetic factors condition intermediate performance while environmental factors tend to make biracial individuals more similar to one or the other of their parental populations, situation depending. The more heterogeneous the scores across samples the more environmental influence. Based on this assumption, I estimated a heritability of 0.43. The correlation between age and deviation from a genetic prediction was -0.55, which indicated that older samples tend to conform closer to a genetic prediction than younger ones, which is consistent with a genetic hypothesis.
As can be seen, on the whole a genetic hypothesis is supported. Similar results, as above, can be found in the General Social Survey. Biracial Black and White individuals score in between their parental populations:
Further supporting evidence comes from recent PISA and NAEP data. In the 2003 to 2009 PISA data, self identifying biracial students score intermediate to self identifying monoracial Whites and Blacks. The data can be analyzed online using International Explorer. Below is an example of the typical results that one finds.
Similar results can be found with the NAEP explorer. (Refer to “It could be culture, part II” for a detailed discussion of methodology.)
For a list of references and extended discussions refer to: “Mixed achievement” (IQ); It Could be Culture, part II (The NAEP Black-Mixed-White gap) (NAEP); Biracial Black-White, Pisa 2009 (PISA).
In the US, at least, it’s clear that hybrids perform intermediate to monoracial Blacks and Whites. These results imply that the differences are not due to factors unique to one or the other race, but rather are due to factors for which there is a continuous gradient of difference.
P. Physical indexes of ancestry and genealogical indexes
The continuity of the difference is not just between multigenerational Blacks and Whites. It can also be found within the multigenerational Black population.
(A detailed discussion of the older studies discussed below can be found here: Admixture studies discussed in Shuey (1966).)
P1. Color and IQ
Since the African-American population is a hybrid population, a genetic hypothesis would predict an association between indexes of racial admixture and IQ. Consistent with a genetic hypothesis, a number of studies have found an association between physical indexes of admixture and IQ. The findings with regards to skin color are summarized below:
The N-weighted correlation between IQ and skin color is 0.15 (N= 3694).
Peterson and Lanier (1929)/r=0.18/n=83
Peterson and Lanier (1929)/r=0.3/n=75
Scarr et al. (1977)/r=0.155/n=288
ADD Health (unpublished)/r=0.17/n=1131
The average Cohen’s d between the upper and lower fourths of the spectrum is about 0.5 (total n > 6,000).
Ferguson (1919)/d= about 0.7/n=657
Ferguson (1919)/d= about 0.9/n=667
Koch and Simmons (1926)/d= about 0.15/n=1078
Klineberg (1928)/d= about 0.15/n=200
Young (1929)/d= about 0.8 and 0.33/n=277
Peterson and Lanier (1929)/d= about 0.66/n=83
Peterson and Lanier (1929)/d= about 0.2/n=83
Bruce (1940)/d= about 0.25/n=72
Codwell (1947)/d= about 0.33/n=480
Lynn (2002)/d= about 0.5/n=430
NLSY97 (unpublished)/d= about 0.4/n=1433
ADD Health (unpublished)/d= about 0.5/n=1131
Correlations between indexes of IQ and color in the nationally representative NLSY97 sample
The magnitude of these differences is consistent with a genetic hypothesis. [Note 7]
These intrarace differences certainly aren’t merely score differences. They are accompanied by the same g nexus between races. We are told, for example:
Dark-skinned blacks in the United States have lower socioeconomic status, more punitive relationships with the criminal justice system, diminished prestige, and less likelihood of holding elective office compared with their lighter counterparts. This phenomenon of “colorism” both occurs within the African American community and is expressed by outsiders, and most blacks are aware of it. Nevertheless, blacks’ perceptions of discrimination, belief that their fates are linked, or attachment to their race almost never vary by skin color. We identify this disparity between treatment and political attitudes as “the skin color paradox.” (Hochschild and Weaver, 2008. The Skin Color Paradox and the American Racial Order)
Based on the above, overt discrimination and Black identity would seem to make for impoverished explanations. It’s notable that in one of the samples above, the nationally representative Add Health sample, the heritability of IQ for the Black population was found to be substantial at 0.57. Guo and Stearns (2002) found that between family effects explained only 17% of Black IQ variability, in data in which there was roughly a 1/2 SD difference between lighter and darker Black kids (ages ranging approximately from 12 to 18). If not between family effects, then what? This leads us to another quandary: If genes explain the lion’s share of cognitive variance within the Black population, how is it that they explain no variance between subpopulations defined by color (or genealogy)?
P2. Negro appearance and IQ.
A few other studies have used physical indexes other than skin color. The three that report correlations are Herskovits (1926); Peterson and Lanier (1929); and Klineberg (1928). Based on these studies, the weighted average correlation between IQ and the indexes was 0.08, indicating that the more admixed individuals scored higher (interpupillary span, -0.01 (N=75); nose width, 0.05 (N=329); ear height, 0.2 (N=75); lip thickness, 0.1 (N=329)). As noted above, this magnitude of association is consistent with a genetic hypothesis.
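The weighted average quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes simple sample-size weighting of the raw correlations (not a Fisher z transformation); whether that is exactly how the figure was derived is not stated. The function name is hypothetical.

```python
def n_weighted_mean_r(pairs):
    """Sample-size-weighted mean of correlations.

    `pairs` is a list of (r, n) tuples. Simple N-weighting of raw r values
    is an assumption about how the reported average was computed.
    """
    total_n = sum(n for _, n in pairs)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    return sum(r * n for r, n in pairs) / total_n


# (correlation, N) for the four physical indexes listed in the text.
physical_indexes = [
    (-0.01, 75),   # interpupillary span
    (0.05, 329),   # nose width
    (0.20, 75),    # ear height
    (0.10, 329),   # lip thickness
]
weighted_r = n_weighted_mean_r(physical_indexes)
print(round(weighted_r, 2))
```

Under this weighting the four listed values give roughly 0.08, in line with the figure reported in the text.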
Other studies, such as Codwell (1947), have used a concordance of physical differences and found the same pattern. Codwell (1947) found, in a sample from Houston, a 5 point gap between blacks judged to be mostly African and blacks for which there was strong evidence of admixture as judged by a concordance of skin color, hair texture, hair color, and eye color.
It seems that environmentalists should change the name of their lead explanatory hypothesis from “colorism” to “negroism.”
P3. Genealogical indexes of ancestry
Three studies have been conducted on the association between genealogical indexes of admixture and IQ (Tanser, 1939 (Canada); Tanser, 1941 (Canada); and Witty and Jenkins, 1934). Tanser (1939) found that mixed blooded blacks scored 7.3 points above full blooded blacks in Kent County, Ontario (N=54). Admixture was judged by reports of genealogy. Tanser (1941) found that mixed blooded blacks, as judged by genealogical reports, scored an average of 6 points above full blooded blacks in Kent County, Ontario, though the 1/2 blooded Blacks scored below the 1/4 blooded blacks on two of the three assessments (N=204).
Tanser’s findings are more or less in accordance with a genetic hypothesis. Environmentalists say that the findings of Witty and Jenkins are not, but closer inspection shows that they either are in accordance or are inconclusive.
Based on the above, it’s clear that the cause of the difference varies substantially within the Black population. How to reconcile this, from an environmentalist perspective, with points G, H, and I is not clear.
Q. Admixture studies using blood groups
Admixture studies using Blood groups have been conducted by Scarr et al. (1977) and Loehlin et al. (1973). No statistically significant correlation between IQ and ancestry as indexed this way was found. But the findings are not inconsistent with a genetic hypothesis, given the study limitations. [Note 8].
More recent studies have found highly statistically significant correlations between African ancestry and indices of SES (e.g., education, income, occupation), all of which causally correlate with general intelligence, in the African American population.
(Cheng, et al., 2012. African Ancestry and Its Correlation to Type 2 Diabetes in African Americans: A Genetic Admixture Analysis in Three U.S. Population Cohorts)
The magnitude of the association found is consistent with a strong genetic hypothesis. But the results are not dispositive, as the association can always be explained, with some degree of plausibility or another, by “colorism.” Surely, though, these admixture findings are not inconsistent with a genetic hypothesis.
S. Adoption studies
There have been two published transracial adoption studies that tested the genetic hypothesis: the Minnesota Transracial Adoption Study (MTAS) and the study by Moore (1986). The former had the advantage of having larger numbers, being longitudinal, having multiple measures of achievement (GPA, class rank, and scholastic aptitude), and allowing for a 4 way comparison between the white adopting families’ biological children, the adopted white children, the adopted biracial children, and the adopted Black children. The MTAS was explicitly designed to test an environmental hypothesis:
The Transracial Adoption Study was carried out from 1974 through 1976 in Minnesota to test the hypothesis that black and interracial children reared by white families (in the culture of the tests and the schools) would perform on IQ tests and school achievement measures as well as other adopted children (Scarr & Weinberg, 1976) (Scarr and Weinberg, 1983. The Minnesota Adoption Studies: Genetic Differences and Malleability.)
Numerous arguments have been made against the results of the MTAS, which support a genetic hypothesis. But none succeed in completely undermining them. It’s frequently pointed out, for example, that the Black children in the study had later adoption times. Yet, in the study, adoptive experience variables explained only 13% of the late adolescent IQ variance, down from 32% in childhood. Moreover, this wouldn’t account for the difference between mixed race kids and Whites. Alternatively, it’s argued that there was attrition among the adopted Whites in the follow up and that the true scores would have been a good deal lower otherwise. This nonetheless wouldn’t have changed the White-Mixed-Black rank order; and instead of comparing adopted Whites to adopted Blacks one can just compare the biological White children to the adopted Blacks, adjusting for the genetic effect of parental IQ. (In the follow up, the mean IQ of the biological White children was 110, which is what one would predict given the mean parental IQ of 120 and regression to the mean.) The results are practically the same.
As counter evidence to the MTAS, Moore (1986) is often cited. The study involved both “traditionally” and “transracially” adopted black and biracial children. The mean age of the kids was 8. The scores of the traditionally adopted biracial and black children were consistent with a genetic hypothesis. The standardized difference between the biracial and black children was .27, which is equivalent to a black-white difference of .54 SD. (The standard deviations were only about 10; 2.7/10 = 0.27). What was inconsistent was the difference between the transracially adopted biracial and black children; here, the standardized difference was .15 in favor of the black children. The numbers, however, were quite small, with only 14 biracial kids and 9 black kids in the transracial component.
Results for Moore’s study
(Standard deviation = 10)
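The standardized difference reported for the traditionally adopted children follows from dividing the point gap by the standard deviation. The sketch below reproduces that arithmetic; doubling it to get an implied black-white difference assumes that biracial children sit exactly halfway between the two parental populations, which is the text's premise rather than something the code establishes. The function name is hypothetical.

```python
def standardized_difference(mean_diff_points, sd_points):
    """Express a raw score gap in standard deviation units."""
    if sd_points <= 0:
        raise ValueError("standard deviation must be positive")
    return mean_diff_points / sd_points


# Figures quoted in the text: a 2.7 point gap and an SD of about 10.
d_biracial_black = standardized_difference(2.7, 10.0)

# Assumption from the text: biracial children fall at the midpoint,
# so the full black-white gap is twice the biracial-black gap.
implied_black_white = 2 * d_biracial_black

print(round(d_biracial_black, 2), round(implied_black_white, 2))
```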
More adoption data comes from the 2009 High school longitudinal study. The data is publicly available. In the study a specially designed math test was given. The results for traditionally and transracially adopted and reared Black, White, and Asian students can be seen below:
The study also had a sample of students with both Black and White Parents. The figure below summarizes both the biracial data and the adoption data using alternative variables to identify the students:
Generally the findings are consistent with the MTAS. Overall, it seems as if the transracial adoption studies agree more than not with a non-trivial genetic hypothesis.
T. Failure of Intervention programs to produce an enduring Effect
Were the Black-White intelligence gap culturally caused (i.e., due to differences in sensory environment), one might predict that Blacks who were provided with superior sensory environments would enduringly outperform Blacks who were not. Since the gaps between subpopulations of Whites are substantially genetically caused, one might also expect the effect of intervention to be different for Blacks and Whites, with the enduring impact being greater for the former. It has been found, however, that while early intervention programs have a strong effect on children’s IQs (an average effect size of 0.5 SD), this effect does not endure. The same pattern is found for Blacks and Whites. Brody (1992) summarizes the longitudinal results from numerous Head Start and Project Follow Through studies:
Is it possible to increase intelligence by changing the early intellectual environment of the child? Research on this topic was influenced by the development of Project Head Start, which was an attempt to provide an enriched environment for pre-school aged children whose impoverished background was assumed to place them at risk of inadequate intellectual development. It was assumed that the provision of superior childhood education for these children would increase their intelligence and increase their ability to cope with the educational program of the public schools….Perhaps the best summary of the outcomes of these studies was published by a consortium formed for the amalgamation of this body of research [Consortium for Longitudinal Studies, 1983]. The Consortium included 14 investigators whose initial samples included 100 or more subjects who were engaged in a longitudinal investigation of the effects of early educational intervention. The samples included in these studies were predominantly black preschoolers which a median IQ of 92 at the times of entry in the program and mothers with a median number of years of education of 10.4. They were predominately of lower socioeconomic status. The results of these studies as summarized by Royce, Darlington, & Murray  are reasonably consistent. Seven studies used IQ as a dependent variable. They obtained a clear increase in IQ as a result of participation in HeadStart. The median IQ benefit at the conclusion of these projects was 7.42 points. Three or four years after the conclusion of these projects the median difference between the experimental and the control group declined to 3.04 points. The last reported assessment includes Wechsler test scores for program participants aged 10 to 17. These assessments occurred 7 to 10 years after the conclusion of the educational intervention. Pooled over a subset of these investigations there was no significant difference between experimental and control groups. 
Although program children started first grade with an average IQ that was 5.80 points higher than the control group children, these differences were not maintained. Other reviews of early intervention studies based on Head Start reached the same conclusion as the Consortium for Longitudinal studies. There are no enduring changes in intelligence tests scores associated with participation in this program [.] The failure of attempts to increase intelligence by early childhood intervention may be attributable to the relatively brief durations of these programs. It is possible that interventions that continued into the first few years of schooling would lead to more enduring changes in performance on intelligence tests. Project Follow Through was designed to extend Head Start interventions in order to obtain more enduring changes. The interventions were based on quite different intervention models. A comprehensive evaluation of the Follow Through Programs compared 22 different intervention models [.] Spitz (1986) evaluated the obtained changes on the only test of intelligence used in these studies, the Ravens. For 107 comparisons that were available where children assigned to Follow Through Programs were comparable to their untreated controls, he found 5 significant difference in favor of groups assigned to the intervention, 11 significant differences in favor of groups not provided with intervention, and 91 comparisons in which there were no significant differences….These data indicate that intervention extended into the first year or two of elementary school will not lead to enduring changes in intelligence test performance. (Brody, 1992. Intelligence.)
This null longitudinal result has been confirmed by a recent meta-analysis of 177 intervention programs. For the 15 studies which had follow ups 1 to 2 years after treatment, the effect size of the difference between the intervention group and the non-intervention group was 0 05; for the 11 studies which had follow ups 2 to 4 years after, the effect size was 0.01 SD; for the 9 studies which had follow ups more than 4 years after the study, the effect size was 0.05 SD. The authors of the study write:
Overall, the collection of studies in our meta-analytic data base generated a mean effect size of .27 standard deviations – in the range of the short-run impacts documented in the recent Head Start impact study, but considerably smaller than many of the impacts generated by model programs such as Perry Preschool, Abecedarian and the Infant Health and Development Program. By and large effect sizes tended to be modestly (but insignificantly) larger if the children were under the age of 3 when the programs began. Effect sizes varied little by program duration. In the case of the persistence of program effects, impacts generally persisted at close to full strength for 1-2 years beyond the end of the programs but then fell substantially. (Leak et al., 2010. Is Timing Everything? How Early Childhood Education Program Impacts Vary by Starting Age, Program Duration and Time Since the End of the Program)
Taken as a whole, and counter to the predictions of a simple cultural model, early childhood interventions have no enduring impact and are not more beneficial for Blacks than Whites of the same cognitive percentile. It can always be argued, of course, that the interventions were not long enough or not intensive enough, but such objections miss the point. The problem is not with the interventions. They result in substantial, if transient, increases in cognitive ability. The problem is that the differences are not lasting. The genetic-biological explanation for this is simple. The effects are not lasting for Blacks, like Whites of the same percentile, because the interventions do not treat the underlying problem, which is not a lack of sensory information but a difference in capacity to process and store information.
U. The demise of cultural explanations
The following steps are involved in establishing a congenital difference in intelligence:
(1) Establish the existence of a stubborn test score differential
(2) Demonstrate a lack of psychometric bias associated with (1) and therefore establish a difference in some agglomeration of stratum I to III mental abilities
(3) Show that (2) largely represents a general or set of broad ability differences (Stratum II or III), as these strata have appreciable heritabilities.
(4) Show that there is a robustly biological basis to (3), where robust means unlikely to be culturally caused
(5) Establish a genetic basis to (4)
Or, at any point, falsify all alternatives.
With regard to the difference under discussion, steps (1) to (3) have been completed (Sections A to D). Additionally, the alternative to (5), assuming (4), has been shown to be untenable (Section F); specifically, it has been shown that causally biological influences can not explain more than a small portion of the difference. Unestablished is (4). But this point is supported both directly (Section E) and indirectly (Section C). With regard to the latter, within population differences in general intelligence are largely robustly biological (see: Jensen, 1998); as such, it is reasonable to make a strong inference that they are so between populations, especially when the psychological and social correlates within mirror those between. Between group differences, though, theoretically could be cultural-g differences (a la the Dynamic Mutualism model of g (van der Maas et al., 2006)). Since this possibility can not be ruled out at present, causal cultural explanations need to be evaluated. The major problem associated with such explanations is that they fail to account well for the Spearman/Jensen Effect. However, if one posited cultural influences that interacted with g, which one must in this case, one could possibly account for the effects. But left unexplained then is why numerous sub-population differences fail to show similar effects. As an example, in Spain, differences between groups defined by educational attainment were shown to not exhibit a Spearman Effect (see: Colom et al. (2002) “Education, Wechsler’s Full Scale IQ, and g.”) In general, as numerous subpopulational differences do not exhibit such effects (reviewed in Section D), it stands to reason that the supposed cultural cause of the B-W differences must be unlike the cause of these. That said, as we don’t know the specific causes of many of the differences in question, the issue is murky. As such, we are forced to consider causal cultural explanations for the B-W differential.
Such explanations can roughly be subdivided into family cultural influences and extra-family cultural influences. In both cases, we are supposing that relative to Whites, Blacks are not developing their general mental ability due to a lack of proper sensory-informational exposure. In the former case, the relative sensory-information deprivation is supposed to come from inside the family and in the latter from outside. As it is, family influences (e.g., parental interactions, SES, types of school attended, etc.) can statistically explain no more than 50% of the difference by young adulthood, with the amount explainable decreasing with age (Section M). Controlling for family influences, though, produces somewhat skewed results, as the magnitude of the difference tends to positively correlate with indicators of these. One can take SES as an example. Controlling for SES statistically explains approximately 50% of the difference (at young ages), which suggests that SES potentially has a strong influence; yet when Black children from the highest SES brackets are compared to White children from the lowest, one finds an equivalence of performance. This phenomenon has been noted for close to a century and suggests that SES potentially has a weak influence. These different methods of analysis lend themselves to different results due to the correlation between the size of the difference and SES and the slope of the Black increase with SES. Putting this aside, there are firm reasons to think that family cultural influences can causally explain little of the differential at older ages despite being able to statistically explain a substantial amount. Most notably, it has been found that causal cultural influences have transient effects on IQ within populations (Dickens, 2005). If this is the case within populations, one would expect it to be the case between populations. Family influences, then, would have little to no enduring impact.
In support of this contention are data from the one published longitudinal transracial adoption study (Section S), which indicate that little of the difference is due to family environment, and meta-analytic results from early intervention studies, which indicate that causal cultural interventions have moderate effects, little of which lastingly remain (Section T).
The conclusion that one must draw from the above is that extra-family influences (e.g., peer influences) must be called on to explain a large portion of the difference. To analyze the situation further we can subdivide these into what can be termed “reactive” and “active” influences, where the former refers to influences forced on and the latter to influences cultivated by a population. Since we are talking about supposed differences in developed general mental ability, we mean, in the former case, societal influences which discourage Black individuals from developing their “cultural g” (e.g., market discrimination) and, in the latter case, Black subcultural influences which encourage Black individuals to not develop their “cultural g” (e.g., peer influences). It should be noticed that the underlying presupposition is that general intelligence can be developed (or undeveloped) and done so from adolescence on. A massive body of evidence contradicts this, specifically when it comes to the general factor. Yet we will proceed on the assumption that this is possible.
Our reactive “cultural g” influences can be ruled out by the simple fact that there is a premium on Black human capital. In non-economic terms, Blacks are discriminated for in the marketplace and employers will overpay to meet de facto quotas. Data from the NLSY97 show this:
In: Fryer, 2010. The Importance of Segregation, Discrimination, Peer Dynamics, and Identity in Explaining Trends in the Racial Achievement Gap
This leaves our active “cultural g” influences (e.g., peer dynamics and cultural identity). In general, these influences are problematic explainers, as they tend not to affect individuals uniformly across populations and yet the net effect depressing the Black mean is marked by uniformity. The relative uniformity can be seen in both adolescent to young adult sibling differential regression results and in parent-offspring differential regression results (Section K). Moreover, such influences would likely show longitudinal instability. The effect would be to decrease the year to year stability of IQ in the Black population relative to that in the White population, a decrease which is typically not reported.
Overall, these “active” cultural influences do not seem promising explainers of the differential, especially when one takes into consideration the Spearman/Jensen Effect and other points mentioned. We are then left with the default, which is an environmental + genetic hypothesis of group differences.
The probability that Spearman’s Effect does not hold for the Black-White gap is under one in a billion. However, in their critique of Jensen’s method of correlated vectors, which has been used to establish Spearman’s correlations, Dolan et al. (2001) argue that a repeated finding of Spearman’s Effect is necessary but not sufficient to establish a general intelligence difference (see also Wicherts and Dolan, 2005). It has been argued by te Nijenhuis et al. (2007), though, that the method of correlated vectors may be more robust when used with meta-analyses. The authors note: “The fact that our meta-analytical value of r=−1.06 is virtually identical to the theoretically expected correlation between g and d of −1.00 holds some promise that a psychometric meta-analysis of studies using MCV is a powerful way of reducing some of the limitations of MCV…Additional meta-analyses of studies employing MCV are necessary to establish the validity of the combination of MCV and psychometric meta-analysis. Most likely, many would agree that a high positive meta-analytical correlation between measures of g and measures of another construct implies that g plays a major role, and that a meta-analytical correlation of −1.00 implies that g plays no role. However, it is not clear what value of the meta-analytical correlation to expect from MCV when g plays only a modest role.” Meta-analyses on the Black-White difference (e.g. Roth 2001) show a high positive correlation between g and the B-W difference.
Lee (2009) tells us: “Three studies examining the factorial nature of the black–white IQ difference have found that the difference does not arise from measurement bias (Dolan, 2000; Dolan & Hamaker, 2001; Lubke, Dolan, Kelderman, & Mellenbergh, 2003). This implies that the black–white difference is indeed a difference in very general abilities. In contrast, a study of stereotype threat employing similarly sized samples found measurement bias to be an important contributor to the differences between treatment groups (Wicherts, Dolan, & Hessen, 2005).”
Holloway (2008) tells us: “In the late 1970s and early 1980s, I collected autopsy data from the Pathology department at Columbia’s College of Physicians and Surgeons (now CUMS). I was interested in age, sex, and ethnic effects on brain size changes through time as might be found in cross-sectional data. Roughly 2000 cases were collected, without personal identifications, and all cases of brain pathology were culled out of the data set. The results, unpublished, were roughly the same as found in the Ho et al. (1980, 1981) work on a sample from Milwaukee, which indicated that African American brains were statistically significantly lower in weight than were European American brains, that is, of course referring to the mean values. Ho et al. (1980) concluded that cultural effects were the reason behind the difference. Interestingly, Lieberman (2005) in his review of Rushton’s (2000, 2002) claims regarding ethnic (racial) differences in brain sizes and behaviors ignored this work by Ho et al. Needless to say, Tobias’s oft-cited paper on brain weight collecting methods (Tobias 1970) was cited to claim that autopsy data on brain weights are useless. Unfortunately, however problematic such data are, one tends to forget that autopsies are not done discriminately. Once the body is on the morgue slab, the autopsy is conducted in exactly the same fashion irrespective of the cadaver’s race, and thus comparisons of such data collected by the same anatomist or medical examiner are surely valid, depending on which variables are being compared. Comparing data collected by different examiners may of course be difficult, and perhaps statistical metaanalyses would be in order. To my knowledge, none exists.” (The Human Brain Evolving: A Personal Retrospective has an interesting discussion on the brain controversy.)
Decreasing the ratio of Black to White mental retardation is, in part, accomplished by lowering the threshold for classification. This works because the rate of organic MR is similar in both groups. Lowering the threshold, then, decreases the proportion of familial MR, which is a product of the normal distribution of intelligence, without affecting the proportion of organic MR. This leads to a decreased ratio of total MR. With regard to efforts to decrease the MR rate, the author of a recent paper notes:
“There have been several important efforts made to reduce the unequal representation of CLD students in special education. These include two National Research Council reports (Donovan & Cross, 2002; Heller et al., 1982), which offered valuable guidance and also generated considerable debate about how to explain disproportionality, political pressure, and litigation made by major professional organizations, such as the Council for Exceptional Children, and federally funded projects, such as NCCRESt. Furthermore, the 1997 and 2004 reauthorizations of the Individuals with Disability Education Act (IDEA) recognized the disproportionate representation of CLD students in special education. IDEA instructs states to have policies and procedures that prevent disproportionality and to collect, examine, and report data to determine significant disproportionate rates. In addition, states must revise policies, procedures, and practices when significant disproportionality is determined, reserving 15% of federal funds under Part B of IDEA for early intervening services. However, states can define significant disproportionality in their own terms, making disproportionality comparisons.”
 Flynn (2010) states:
I used the Flynn Effect to break this steel chain of ideas: (1) the heritability of IQ both within the present and the last generations may well be 0.80 with factors relevant to group differences at 0.12; (2) the correlation between IQ and relevant environment is 0.33; (3) the present generation is analogous to a sample of the last selected out by a more enriched environment (a proposition I defend by denying a significant role to genetic enhancement); (4) enter regression to the mean — since the Dutch of 1982 scored 1.33 SDs higher than the Dutch of 1952 on Raven’s Progressive Matrices, the latter would have had to have a cognitive environment 4 SDs (4×0.33=1.33) below the average environment of the former; (5) either there was a factor X that separated the generations (which I too dismiss as fantastic) or something was wrong with Jensen’s case. When Dickens and Flynn developed their model, I knew what was wrong: it shows how heritability estimates can be as high as you please without robbing environment of its potency to create huge IQ gains over time.
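The regression arithmetic in step (4) of the passage above can be checked directly. This is a minimal sketch of that single step: under a linear regression of IQ on environment with correlation r, an IQ gap of a given size implies an environmental gap of that size divided by r. The function and variable names are hypothetical, and the sketch takes the quoted figures at face value.

```python
def implied_environment_gap(iq_gap_sd, r_iq_environment):
    """Environmental gap (in SDs) needed to produce a given IQ gap (in SDs)
    if IQ regresses on environment with standardized slope r."""
    if r_iq_environment == 0:
        raise ValueError("correlation must be non-zero")
    return iq_gap_sd / r_iq_environment


# Figures quoted by Flynn: a 1.33 SD gain and a correlation of 0.33.
env_gap = implied_environment_gap(1.33, 0.33)
print(round(env_gap, 1))
```

The result is close to the 4 SDs stated in the quotation (4 x 0.33 is 1.32, so the quoted figures are rounded).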
Flynn’s counter argument is flawed on three accounts. 1) Secular differences, unlike the Black-White difference, have repeatedly been shown to not be directly comparable. Some undetermined percent of the difference is due to psychometric bias. 2) There is no a priori reason why x-factors are implausible in the case of cohort differences. In the case of the Black-White difference they had to be empirically disconfirmed. (Flynn himself was a proponent of X-factors in the 80s. As was Sandra Scarr, who argued that Black culture was the X-factor.) As it is, the secular rise in the US has been shown to behave like an X-factor, being not significantly different across ethnic groups, sexes, urbanity, age, family interaction, household income, and income x race (Ang et al., 2009. The Flynn Effect within subgroups in the U.S.: Gender, race, income, education, and urbanization differences in the NLSY-Children data). 3) While between family differences may be the only relevant factors for the Black-White difference, as Flynn suggests, there is no reason why this would be the case across cohorts. Given this, we have the following: on average, there has been a three point rise in IQ test scores per decade; some undefined percent of this is due to measurement bias; and an unidentified amount of this is in general intelligence. Since the correlation between the secular rise and g is about -0.3, we might infer that one third of the difference is an actual g difference. This inference is supported by Wai and Putallaz (2011)’s massive study, “The Flynn effect puzzle: A 30-year examination from the right tail of the ability distribution provides some missing pieces,” which looked at over 1.7 million scores. The authors state:
For example, for tests that are most g loaded such as the SAT, ACT, and EXPLORE composites, the gains should be lower than on individual subtests such as the SAT-M, ACT-M, and EXPLORE-M. This is precisely the pattern we have found within each set of measures and this suggests that the gain is likely not due as much to genuine increases in g, but perhaps is more likely on the specific knowledge content of the measures. Additionally, following Wicherts et al. (2004), we used multigroup confirmatory factor analysis (MGCFA) to further investigate whether the gains on the ACT and EXPLORE (the two measures with enough subtests for this analysis) were due to g or to other factors. Using time period as the grouping variable, we uncovered that both tests were not factorially invariant with respect to cohort which aligns with the findings of Wicherts et al. (2004) among multiple tests from the general ability distribution. Therefore, it is unclear whether the gains on these tests are due to g or to other factors, although increases could indeed be due to g, the true aspect, at least in part.(a)
(a)…Under this model the g gain on the ACT was estimated at 0.078 of the time 1 SD. This result was highly sensitive to model assumptions. Models that allowed g loadings and intercepts for math to change resulted in Flynn effect estimates ranging from zero to 0.30 of the time 1 SD. Models where the math intercept was allowed to change resulted in no gains on g. This indicates that g gain estimates are unreliable and depend heavily on assumptions about measurement invariance. However, all models tested consistently showed an ACT g variance increase of 30 to 40%. Flynn effect gains appeared more robust on the EXPLORE, with all model variations showing a g gain of at least 30% of the time 1 SD. The full scalar invariance model estimated a gain of 30% but showed poor fit. Freeing intercepts on reading and English as well as their residual covariance resulted in a model with very good fit: χ2 (7) = 3024, RMSEA=0.086, CFI=0.985, BIC=2,310,919, SRMR=0.037. Estimates for g gains were quite large under this partial invariance model (50% of the time 1 SD). Contrary to the results from the ACT, all the EXPLORE models found a decrease in g variance of about 30%. This demonstrates that both the ACT and EXPLORE are not factorially invariant with respect to cohort which aligns with the findings of Wicherts et al. (2004) investigating multiple samples from the general ability distribution. Following Wicherts et al. (2004, p. 529), “This implies that the gains in intelligence test scores are not simply manifestations of increases in the constructs that the tests purport to measure (i.e., the common factors).” In other words, gains may still be due to g in part but due to the lack of full measurement invariance, exact estimates of changes in the g distribution depend heavily on complex partial measurement invariance assumptions that are difficult to test. Overall the EXPLORE showed stronger evidence of potential g gains than did the ACT.
So if one third of the difference is a real difference in g and the environmentality of IQ (between and within families) is .25, IQ affecting environmental conditions would rise about 2 points per decade, which is not implausible. That the Black-White difference is solely due to environmental factors is implausible, on the other hand, because it’s implausible that the difference, as Flynn notes and regression to the mean and SEM findings agree with, arises out of the within family environmental variance at a single point in time.
 Sesardic (2005) discusses a number of lines of evidence which disconfirm a gene x environment correlation model of heritability. Briefly, an rGE explanation runs accordingly: G (genetic difference) –> C (characteristic not related to trait) –> E (environmental influence) –> P (trait differences). (1) A GE model implies C will correlate with P. Scarr and Carter-Salzmann (1982) compared MZ twins to DZ twins incorrectly classified (because they were phenotypically very similar) as MZ twins. Contrary to GE predictions, incorrectly classified DZ twins were less phenotypically similar than MZ twins. (2) Such a model implies P differences are systematically related to E differences. Loehlin and Nichols (1976) collected information about twins’ different treatment and found that the average correlation between a composite of these factors and psychological traits was close to zero. (3) Such a model predicts a correlation between a given environmental measure of one twin and another’s phenotype, since G –> E –> P. Loehlin (1992) and Neale & Cardon (1992) compared twins reared apart and found no cross correlation between E and P. Plomin (2001) compared parent-child pairs and found little evidence of active or reactive G-E covariance. (4) Such a model would predict that similarity would decrease with age of separation, if shared MZ twin environment causes the phenotypic similarity: Lykken (1995) found no effect. (5) Such a model is limited. Many GE interactions can be ruled out a priori as the between family environmentality is close to zero, which means that between family differences are uncorrelated with intelligence. Additional evidence against an rGE model comes from studies on the relation between IQ and endophenotypes. The rGE model implies E will correlate with endophenotypes (van Leeuwen et al., 2009). Posthuma et al. (2003), De Moor et al. (2008), van Leeuwen et al. (2009), and Betjeman et al. (2009) found this not to be the case. (This is discussed here in detail.)
Ultimately the issue is an empirical one, though. And there are a number of methods to test for COVGE. In the few studies that have been done, at older ages, the % variance explained by COVGE is low. For example, Plomin et al (1997) found negligible COVGE in the Colorado adoption sample:
“Environmental transmission from parent to child is negligible. The inconsequential estimate of genotype-environment correlation is to be expected given the negligible influence of environmental transmission from parent to child” (Plomin et al., 1997. Nature, Nurture, and Cognitive Development from 1 to 16 Years: A Parent-Offspring Adoption Study)
Unfortunately, no one has conducted a meta-analysis of all studies.
 For reference, the expected mean correlation between IQ and skin color (SC) would be the square root of the product of the reliabilities (i.e., square) of the correlation between IQ and individual ancestry (IA) and SC and individual ancestry (IA), assuming some between group heritability (BGH) of IQ. The average SC-IA correlation for African Americans is around .44 (ranging from .34 to .54); the reliability of skin color as a predictor of African American Ancestry is, therefore, .19. The average IQ-IA correlation obviously has yet to be determined. According to Zakharia, et al. (2009):
Numerous studies have estimated the rate of European admixture in African Americans; these studies have documented average admixture rates in the range of 10% to 20%, with some regional variation, but also with substantial variation among individuals . For example, the largest study of African Americans to date, based on autosomal short tandem repeat (STR) markers, found an average of 14% European ancestry with a standard deviation of approximately 10%, and a range of near 0 to 65% , whereas another study based on ancestry informative markers (AIMs) found an average of 17.7% European ancestry with a standard deviation of 15.0% .…
…These results were confirmed in the estimation of IA by using the program frappe (also in Figure 1).(Zakharia, et al., 2009. Characterizing the admixed African ancestry of African Americans)
If this is the case, US Blacks, who are 20% White, differ in White Ancestry from hypothetical US Blacks who are 99% White by 5.3 standardized differences. If we propose that there is a genotypic IQ difference of 1 standard deviation, at maximum, between US Blacks and hypothetical Blacks who are 99% White, we might suppose that the correlation between IQ and ancestry in the US Black population is 1/(5.3) or 0.19, since the correlation would be the change in X (IQ) over the change in Y (White Ancestry). Using .44 as the SC-IA correlation and 0.19 as the IQ-IA correlation, the SC-IQ correlation would be around 0.08.
The difference between the upper and lower fourths of the Black population, who differ by approximately 3 to 4 standard deviations in skin color, would be 0.08 x 3 to 0.08 x 4 SD or 0.24 to 0.32 SD. The differences found then are somewhat larger than those predicted. The excess could be a result of “colorism” or cross assortative mating for color and IQ in the African American population.
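The arithmetic in this footnote can be retraced with a short sketch. It only reuses the figures quoted above (the .44 SC-IA correlation, the 15.0% ancestry SD, and the assumed maximum genotypic gap of 1 SD); the variable names are illustrative, and the 1 SD gap is the footnote's own assumption, not a finding. The product of the two correlations comes out near 0.08, which is the value the 0.24 to 0.32 SD range implies.

```python
# Retrace the footnote's arithmetic. All inputs are figures quoted in the text;
# the 1 SD genotypic gap is the text's stated assumption, not a measured value.
r_sc_ia = 0.44                  # skin color vs. individual ancestry correlation
ancestry_sd = 15.0              # SD of European ancestry, in percentage points
ancestry_gap = 99.0 - 20.0      # hypothetical ancestry gap, in percentage points
assumed_genotypic_gap = 1.0     # assumed maximum genotypic IQ gap, in SD units

gap_in_ancestry_sds = ancestry_gap / ancestry_sd          # about 5.3
r_iq_ia = assumed_genotypic_gap / gap_in_ancestry_sds     # about 0.19
r_sc_iq = r_sc_ia * r_iq_ia                               # about 0.08

# Expected IQ difference for groups 3 to 4 skin-color SDs apart
low, high = r_sc_iq * 3, r_sc_iq * 4

print(f"ancestry gap in SDs: {gap_in_ancestry_sds:.2f}")
print(f"IQ-IA correlation:   {r_iq_ia:.2f}")
print(f"SC-IQ correlation:   {r_sc_iq:.3f}")
print(f"expected difference: {low:.2f} to {high:.2f} SD")
```

Carrying unrounded values through gives a slightly wider range than rounding the correlation to 0.08 first, so small discrepancies with the figures in the text are rounding effects.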
 To see this, one can just look at the results of the better of the two studies.
Scarr et al. (1977) tested the genetic hypothesis in two ways. First, they looked to see if g was associated with an index of African ancestry and found a statistically non-significant -.05 correlation and, second, they divided their subjects into thirds based on their index of ancestry and compared the g scores of the top third to the bottom third; the latter analysis showed a non-significant difference of .11 SD between the groups. Scarr et al. concluded: “An extrapolation from the contrast between extremes within the hybrid group to the average differences between the races predicts that not more than one third of the observed difference between the races could be due to genetic differences. In view of the negligible correlations between estimated ancestry and intellectual skills even this seems unlikely.”
It’s easy to see why the results don’t support the conclusion:
a. According to Scarr et al. the difference between the upper and lower thirds of the distribution was 0.11 SD. Assuming a normal curve approximation, the upper and lower thirds of a distribution are 2.2 SD apart. If the upper and lower thirds are 2.2 SD apart and the correlation between IQ and ancestry is, according to Scarr et al., 0.05, we would expect that the difference between the thirds would be 0.11 SD (2.2 x 0.05). So the “negligible” correlation is consistent with the mean difference, which Scarr et al. tell us is consistent with a population difference of 1/3rd of a standard deviation – which is hardly negligible.
b. Now it’s clear that Scarr et al.’s index of ancestry was unreliable so we have to correct for that. Based on partial correlations, Jensen calculated the validity of the index to be 0.49. We can calculate it alternatively by simply dividing the skin color-ancestry correlation which Scarr et al. found (.27) by the skin color-ancestry correlation found in the US Black population when using modern methods (0.44). We get a validity of .61, which might be an overestimate, as some of the correlation between skin color and Scarr et al.’s index of ancestry could be due to the correlation between blood groups and skin color genes, as the authors noted.
c. Using the higher estimated reliability (.61), the corrected mean difference is 0.18 (0.11/.61).
d. Plugging this into Scarr et al.’s formula above (expected BGH x 0.23 = 0.18), we get a between group heritability of 0.78.
Quote: “The rough calculation for the estimate of the difference between upper and lower thirds of the black group proceeds as follows. If the resultant difference in standard deviations is 0.9 between the races when the mean difference in degree of Caucasian ancestry is about 0.77 (0.99 – 0.22 = 0.77) then the difference between upper and lower thirds of the black group alone should be about 0.23SD when the difference in Caucasian ancestry is about (0.35 – 0.15) = 0.20. Furthermore, if three-fourths of that mean difference is due to racial genetic differences alone the smallest expected difference is (0.75 x 0.23) = 0.18.”
Following Scarr et al. we could conclude that given the mean difference between the upper and lower thirds of the Black population and given the commensurate correlation between IQ and ancestry, a genetic difference over one half of a standard deviation seems likely.
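Steps a through d can be retraced numerically. The sketch below is only a check of the arithmetic as stated, under the same normal-curve assumption; the inputs (0.05, 0.11, .27, .44, 0.23) are the figures given above, and the variable names are illustrative.

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal, per the normal-curve assumption in step a

# Step a: mean of the upper third of a standard normal, and the gap between thirds
cutoff = nd.inv_cdf(2 / 3)                  # z-score bounding the upper third
upper_third_mean = nd.pdf(cutoff) / (1 / 3)
separation = 2 * upper_third_mean           # about 2.2 SD between the thirds' means
expected_diff = separation * 0.05           # with the reported |r| of 0.05

# Step b: validity of the ancestry index as a ratio of the two correlations
validity = 0.27 / 0.44                      # about 0.61

# Step c: observed difference between thirds, corrected for that validity
corrected_diff = 0.11 / validity            # about 0.18

# Step d: between group heritability implied by "expected BGH x 0.23 = 0.18"
implied_bgh = corrected_diff / 0.23         # about 0.78

print(f"separation of thirds: {separation:.2f} SD")
print(f"expected difference:  {expected_diff:.2f} SD")
print(f"validity:             {validity:.2f}")
print(f"corrected difference: {corrected_diff:.2f} SD")
print(f"implied BGH:          {implied_bgh:.2f}")
```

The result inherits every assumption in the chain, in particular the validity estimate, which the text itself flags as a possible overestimate.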
 Others paint a more optimistic picture. For example, Steve Barrett summarizes 9 studies in ”Long-Term Effects of Early Childhood Programs on Cognitive and School Outcomes,” which show an n-weighted IQ increase of 4.5 points (in the studies he presents that allow data to be pooled.) The pooled number of individuals in the treatment group, at follow up, was 635. And the mean age at follow up was 13.
Carolina Abecedarian (1972–1985)
Florida Parent Home Education project (1966–1970)
Milwaukee Project (1968–1978)
Early Training Project (1962–1967)
Experimental Variation of Head Start (1968–1969)
Harlem Training Project (1966–1967)
High/Scope Perry Preschool Project (1962–1967)
Philadelphia Project (1963–1964)
Verbal Interaction Project (1967–1972)
Notably, Barrett made no mention of the studies discussed by Brody, even though he included a Head Start study. Similar curious omissions can be seen in other reviews (e.g., Fryer, 2010). Another odd omission was the Infant Health and Development Program, which was an extension of the Carolina Abecedarian project, itself included in the review. The experimental sample size in this study was relatively large, at 377, which was seven times larger than the sample in the Abecedarian project. At ages 5 and 18, no significant difference was found between the control group and the intervention group. B & B note:
When the IHDP presented their 5-year follow-up analysis (Brooks-Gunn et al., 1994), one general conclusion comes through distinctly: Whatever effects they claim to have obtained when the children were 3 years of age had evaporated. WPPSI full-scale scores were 91.6 and 91.4 for the treatment and control groups, respectively. The one small remaining effect they attempted to salvage (as reported by Brooks-Gunn et al., 1994) was a 3.7 mean IQ difference (p = .03) favoring heavier infants in the treatment group. But when we examined their summary data deposited with the National Auxiliary Publication Service (NAPS), the mean difference between heavier birth weight babies was 3.0 points (p = .09).” (Early Generic Educational Intervention Has No Enduring Effect on Intelligence and Does Not Prevent Mental Retardation: The Infant Health and Development Program)
If we pool the results of IHDP with the studies listed by Barrett, the n-weighted longitudinal effect is 3 points. Adding the Head Start and Project Follow studies drops the effect size to under 1 point.
 For example, in “Early Childhood Care and Education: Effects on Ethnic and Racial Gaps in School Readiness,” Magnuson and Waldfogel estimate that Head Start (transiently) narrows the early childhood achievement gap by 0.12 SD.
If Head Start boosts skills as much as CPC, then with 19 percent of black children in Head Start, black children’s skills would be about 0.12 of a standard deviation lower, on average, if they did not attend Head Start or other early education programs. Since the black-white test score gap is estimated at close to 0.50 of a standard deviation, such a reduction implies that the black-white test score gap would be about 24 percent larger (at 0.62 of a standard deviation) in the absence of Head Start.
 Fryer and Levitt’s argument:
The Black-White gaps by age are: Adulthood 1 SD; Childhood .85 SD; Infancy .077 SD. The Phenotypic correlations are as follows: Mother-Infant .3, Mother Child .39, Infant-Child .30. Given the inter-correlations the magnitude of the infant Black-White gap is too small to be consistent with a genetic hypothesis of any significant magnitude.
In Testing for Racial Differences in the Mental Ability of Young Children, Fryer and Levitt note that:
“An alternative model is one in which an individual’s intelligence has not yet fully developed at age 1, but otherwise the restrictions in the simple model are maintained. In the most extreme conceptualization of this view, one would assume that I = 00 for all individuals. Such a model can generate a small racial test score gap early in life and a one-standard deviation gap later in life if W > B. Most importantly, the model cannot explain how one observes a similar degree of correlation between parental test scores and their children’s test scores both early and later in life. Given the high degree of heritability in intelligence, one would expect that if early test scores do not reflect intelligence, then they should correlate much less strongly with parental test scores than later tests. (Fryer and Levitt, 2006. Testing for Racial Differences in the Mental Ability of Young Children.)
It’s not clear what Fryer and Levitt mean when they speak of intelligence being “not yet fully developed” or speak of tests which do not fully “reflect intelligence.” The relevant questions are (a) “Do the test score differences at age 1 capture the magnitude of the general intelligence difference at age 1?” (b) “What are the within group heritabilities at the ages in question?” (c) “What are the between group heritabilities at the ages in question?” (d) “What are the genetic correlations between IQ at the various ages?” and so on.
As for (b) and (d), a quick search of the literature gives the relevant parent-offspring correlations. For example, Plomin et al. (1997):
Correlations between biological parents (weighted averages for mothers and fathers) and their adopted-away offspring also start off at modest levels at 3 and 4 years (.12). However, unlike adoptive-parent/ adopted-child correlations, correlations between biological parents and their adopted-away offspring increase during middle childhood (.18), early adolescence (.20), and late adolescence (.38). The increasing resemblance between adopted children and their biological parents, with correlations rising from about .1 in early childhood to about .2 in middle childhood to about .3 in adolescence, suggests increasing genetic influence. Results for control parents are similar to those for biological parents: Correlations of .19 in early childhood, .24 in middle childhood, .28 in early adolescence, and .31 in late adolescence indicate again that parent-offspring resemblance for general cognitive ability is largely due to genetic influences (Plomin et al 1997 Nature, Nurture, and Cognitive Development from 1 to 16 Years: A Parent-Offspring Adoption Study)
You can find this in a standard behavioral genetics book (Plomin et al., 2006):
“For example, the IQ correlation between biological mothers and their adopted-away children in infancy is about .10” (Plomin et al., 2006, pg. 144).
The mid-parent-offspring correlation can be seen as a rough index of the portion of the correlation that is genetically mediated. Alternatively, you can merely multiply the 0.5 genetic correlation between parent and child by the estimated heritability of IQ at the given ages. It’s difficult for me to understand how Fryer and Levitt missed this. Whatever the case, a slightly more informed version of the argument pops back up in Fryer (2010):
A model in which parents’ scores influence their offspring’s environment is, however, equally consistent with mean racial gaps in G of one standard deviation. For this to occur, G must exert little influence on the baby’s test score, but be an important determinant of the test scores of children. Take the most extreme case in which G has no influence on the baby’s score (i.e., _b = 0). If genetic factors are not directly determining the baby’s test outcomes, then environmental factors must be important. Assuming…. (Fryer, 2010. Racial inequality in the 21st century: The declining significance of discrimination)
An important consideration that Fryer neglects is the genetic correlation between IQ at age 1 and other ages. IQ at two ages can be perfectly heritable but genetically uncorrelated, such that the genes that explain variance at one age do not explain variance at another. The genetic correlation between age 1 and later ages is not clear. Typically, it’s found to be significantly less than 1 (Brody, 1992 pg. 160; see also Plomin et al. 2008 pg. 169). To the extent different genes are involved at different ages, genetic variance between populations at one age does not necessitate genetic variance at another.
Whatever the case, we then have the following:
Phenotypic correlations (Fryer, 2010)
Mother Child .39
Correlations mediated by genes (Plomin et al., 1997; Brody, 1992)
Infant-Child undetermined but <.3
Gaps (Fryer, 2010)
Now what is readily noticeable is that the Black-White Infant phenotypic gap is noticeably smaller than one would predict given the Adult phenotypic gap and the portion of the correlation attributable to the environment. It should be 1SD x (Mother-Infant Phenotypic correlation – Mother-Infant genotypic correlation). Following Fryer and Levitt’s own logic, we can conclude that there are factors boosting the performance of Black infants relative to White infants. This positive influence, of course, would work against any negative influence due to general intelligence genes.
Anyway, after imputing the correct values we see that there is no inconsistency, except that the childhood gap is larger than it should be were g genes alone involved. But we already know this.
 It can’t be argued that heritability estimates are insensitive to shared influences since at younger ages and especially at young ages at lower SES levels, shared influences explain a large portion of variance. The problem is not with the heritability estimates, per se.
There are two possible escapes from the conclusion:
(1) It could be that adult shared influences, in general, are underestimated.
(2) It could be that adult shared influences for one of the two populations in question are underestimated.
The theoretical ground for the second possibility is that shared environmentality estimates seem to vary as a function of SES, being greater at lower SES levels (e.g., Tucker-Drob et al., 2011; Harden, 2007), and that Blacks have, on average, lower levels of SES. It is plausible, therefore, that shared influences have a greater impact in the Black population. This escape is blocked, though, by the fact that the Black-White difference increases with SES (Rowe, 1994; Jensen, 1998). As a result, consideration can be restricted to upper SES individuals of both races and any race x SES x heritability interaction can be ignored.
If there is to be an escape, it must be along the first path. The major problem here is that the evidence converges on a low adult shared environmentality (Lee, 2009; Bouchard, 2009). This conclusion has been debated though. Nisbett et al., for example, write:
Shared environment effects are sometimes reported to be very low or even zero by adulthood (Bouchard, 2004; Johnson, 2010; McGue & Bouchard, 1998). If shared environment effects were really this low in adulthood, it would prompt pessimistic conclusions about the degree to which interventions in childhood would have enduring effects. One basis for the claim that shared environment effects are zero in adulthood is a review of three studies in 1993 by McGue, Bouchard, Iacono, and Lykken (1993), which has been frequently cited since (e.g., Bouchard & McGue, 2003; Rushton & Jensen, 2005a). But a large range of shared environment effects has been reported. Bouchard and McGue (2003) reproduced the 1993 review figure with its assessment of zero adult shared environment effects, but they also found a shared environment effect in excess of .25 for 16–20-year-olds. Johnson (2010) reported that shared environment effects are zero in adulthood (but did not provide sources) and in the same year reported a study showing that the shared environment effect was .07 for 17-year-olds in Minnesota and .26 for 18 year-olds in Sweden (Johnson, Deary, Silventoinen, Tynelius, & Rasmussen, 2010). Another recent study found shared environment effects of .26 for 20-year-olds and .18 for 55-yearolds (Lyons et al., 2009), and yet another found shared environment effects of .20 for Swedish conscripts (Beauchamp, Cesarini, Johannesson, Erik Lindqvist, & Apicella, 2011). A recent review of six well-conducted studies found shared environment effects in adulthood to be .16 on average (Haworth et al., 2009).
There is some misreporting here by Nisbett et al., though. For example, Haworth et al. (2009) provided shared environment effects of 0.16 for young adults (mean age 17), not adults. That figure, along with the young adult estimates given by Bouchard and McGue (2003) and Johnson et al. (2010), is consistent with the claim that shared environmentality drops to near zero by adulthood (i.e., above age 25).
It might be, though, that the shared environmental effects are high enough, in general, to allow for a shared environment explanation of the between race difference. Perhaps, on average, shared environment explains 20% of the IQ variance in adulthood and the adult heritability is, in fact, only 0.55 (h = 0.55, C = 0.20, E = 0.25). (The amount of shared effect between races would have to be massive at 1.1 divided by the square root of 0.2 – but at least an explanation would be possible.) But here again we are constrained by the relation between SES and the IQ differences. Nisbett et al. forcefully argue that heritability is related to SES. So if the average shared effect is 0.2 by young adulthood and adulthood, it should be less for those at the upper SES levels. As a result, consideration can be restricted to upper SES individuals of both races and any shared effects in adulthood can be treated as insignificant.
For a critical discussion on shared environmental influence from an environmentalist perspective, readers are referred to: Kaplan (2013) “The Effects of Shared Environment on Adult Intelligence: A Critical Review of Adoption, Twin, and MZA Studies“. With regards to the correlation between unrelated siblings, a correlation which is a direct measure of influence of shared environment, Kaplan states:
The average correlation between unrelated adopted siblings, weighted by sample size, for roughly 700 pairs of participants who were mostly or entirely 16 years or younger is .26. (The 95% confidence interval is .19 to .33.) The average correlation, again weighted by sample size, for about 300 pairs who were mostly or entirely between the ages of 16 and 22 years is minus .01, virtually zero. (The 95% confidence interval is 0 to .12.) These numbers provide evidence that shared environment has a significant effect on intelligence for children and early adolescents but not for participants once they have reached late adolescence.
So, taking the n-weighted average of all unrelated sibling pairs gives us a c^2 (shared environmentality) of 0.18. This can be considered the upper bound estimate for c^2 by adulthood (in developed countries). To note: estimates based on MZ-DZ correlation differences tend to overestimate c^2, as the effect of assortative mating, which tends to elevate DZ correlations but not MZ correlations, is often not taken into account.
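As a quick check, the 0.18 figure follows from pooling Kaplan's two age groups by sample size. The sketch below uses only the pair counts and correlations quoted above (roughly 700 pairs at .26 and about 300 pairs at minus .01); both counts are approximate in the source, so the result is approximate too.

```python
# (number of pairs, correlation) for the two age groups quoted from Kaplan (2013).
# The pair counts are the approximate figures given in the quotation.
groups = [
    (700, 0.26),   # mostly or entirely 16 years or younger
    (300, -0.01),  # mostly or entirely between 16 and 22 years
]

total_pairs = sum(n for n, _ in groups)
pooled_r = sum(n * r for n, r in groups) / total_pairs

print(f"pooled unrelated-sibling correlation: {pooled_r:.3f}")
```

Because the pooled value mixes children with late adolescents, it sits well above the older group's own correlation, which is why the text treats it as an upper bound.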
 For those two studies, the following variables were looked at: (NLSY) Mother’s education, Age of child, HOME Cognition, HOME Emotion, School self-esteem, Self-worth, Math achievement, Reading recognition, Reading comprehension, Problem behavior; (Richmond Youth Project) Participation with father, Supervision by father, Overall GPA, Mother’s education, Participation with mother, Supervision by mother, Peer orientation, IQ, Standard self-report, delinquency. Official offenses to 1967.
 The n-weighted heritabilities, based on the 5 studies prior to 2005 which have reported results for Blacks and Whites who were a part of the same sample and who took the same test under the same conditions, are 0.51 (1273 pairs) for Blacks and 0.58 (1730 pairs) for Whites. This summary includes the results from the four studies reported by Jensen (1998) pg. 367 and the results from Guo and Stearns (2002). As the individuals in these studies were mostly in the pre-adolescent to late adolescent age range, and as the heritability of IQ increases with age, these estimates are consistent with that which is typically found. Note, the lower Black heritability coefficient is driven by Guo and Stearns (2002), who report Black and White heritabilities of 0.57 (n, pairs = 309) and 0.74 (n, pairs = 950), respectively. They looked at the Add Health sample. The same sample was analyzed by Rowe et al. (1999). Based on a more complete set of kinship pairs, Rowe et al. (1999) found nearly identical heritabilities for Blacks and for Whites:
Were one to use Rowe et al. (1999) in place of Guo and Stearns (2002), one would get a n-weighted heritability of 0.58 for Blacks (n, pairs = 1551) and of 0.57 for Whites (n, pairs = 2110).
 What would a genetic hypothesis predict? To determine this, we need to ascertain the average and standard deviation of admixture in the population. In table 1 of Cheng et al. 2012, we see that the African admixture is 84% and we can deduce that the standard deviation is 8.5, based on the interquartile range.
From the above, we can determine that Blacks in this sample differ from Whites by 9.88 SD of African admixture, based on the sample admixture SD, e.g., (84-0)/8.5 SD. Were genetic IQ d to equal 1, then a shift in 1 genetic IQ SD would be associated with a shift in 9.88 SD of African admixture. The predicted genetic IQ, ancestry correlation would then be 0.1. And 0.1 x 8.5% would give us the predicted % African Admixture difference between sub samples that deviate by 1 standardized unit in genotypic IQ.
We can see that the admixture difference between the Black SES-defined sub populations, who differ phenotypically probably by no more than 2 SD in IQ, is, at the extremes, 4%. This is well above the association predicted by a genetic hypothesis (i.e., less than 2%). The found differences, then, while consistent with a strong genetic-IQ hypothesis, require that additional factors be involved.
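The "less than 2%" figure can be retraced from the numbers above. This is a sketch of the footnote's own arithmetic, not an independent estimate: it takes the 84% mean and 8.5 SD as given, assumes a genetic IQ d of 1 as the text does, and treats the 2 SD IQ spread between sub populations as an upper bound. Variable names are illustrative.

```python
# Inputs as stated in the text (Cheng et al. 2012, table 1, as summarized above).
mean_african_admixture = 84.0   # percent
admixture_sd = 8.5              # percentage points, deduced in the text from the IQR
assumed_genetic_d = 1.0         # the text's assumed genetic IQ gap, in SD units

# Group gap expressed in admixture SDs, and the implied IQ-ancestry correlation
gap_in_admixture_sds = (mean_african_admixture - 0.0) / admixture_sd
predicted_r = assumed_genetic_d / gap_in_admixture_sds

# Predicted admixture shift (percentage points) per 1 SD of genotypic IQ
shift_per_iq_sd = predicted_r * admixture_sd

# Upper bound used in the text: sub populations at most 2 IQ SDs apart
predicted_max_difference = 2 * shift_per_iq_sd

print(f"gap in admixture SDs:  {gap_in_admixture_sds:.2f}")
print(f"predicted correlation: {predicted_r:.2f}")
print(f"predicted difference:  {predicted_max_difference:.2f} percentage points")
```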
(15) References for B/W secular trends in IQ
Yerkes, R. M. (Ed.). (1921). Psychological examining in the U.S. Army: Memoirs of the National Academy of Sciences (Vol. 15). Washington, DC: U.S. Government Printing Office.
Shuey, A. M. (1966). The Testing of Negro Intelligence. New York: Social Science Press.
Gottfredson, L. S. (2005). Implications of cognitive differences for schooling within diverse societies. Pages 517-554 in C. L. Frisby & C. R. Reynolds (Eds.), Comprehensive Handbook of Multicultural School Psychology. New York: Wiley.
Loehlin, J. C., Lindzey, G., and Spuhler, J. N. (1975). Race Differences in Intelligence. San Francisco, CA: Freeman.
Coleman, J. S. (1966). Equality of Educational Opportunity. Washington, D.C.: U. S. Office of Education.
Osborne, R. T., and McGurk, F. C. (1982). The Testing of Negro Intelligence. Athens, GA: Foundation for Human Understanding
Broman, S. H., Nichols, P. L., and Kennedy, W. A. (1975). Preschool IQ. New York: J. Wiley.
Arthur R. Jensen. Educability and Group Differences . New York: Harper and Row, 1973
Roth, P. L., Bevier, C. A., Bobko, P., Switzer, F. S., and Tyler, P. (2001). Ethnic group differences in cognitive ability in employment and educational settings: a meta-analysis. Personnel Psychology, 54, 297-330.
Dickens, W. T., and Flynn, J. R. (2006). Black Americans reduce the racial IQ gap: Evidence from standardization samples. Psychological Science, 17, 913–920.
Kaufman, A. S., and Doppelt, J. E. (1976). Analysis of WISC-R standardization data in terms of the stratification variables. Child Development, 47 165-171.
Murray, C. (2007). The magnitude and components of change in the black–white IQ difference from 1920 to 1991: A birth cohort analysis of the Woodcock–Johnson standardizations. Intelligence, 35, 305–318.
Avolio, B. J., and Waldman, D. A. (1994). Variations in cognitive, perceptual and psychomotor abilities across the working life span: examining the effects of race, sex, experience, education and occupational type. Psychology and Aging, 9, 430-442.
Mercer, J. R., and Lewis, J. F. (1984). System of Multicultural Pluralistic Assessment: Manual. San Antonio, TX: Psychological Corporation.
Reynolds, C. R., Chastain, R. L., Kaufman, A. S., and McLean, J. E. (1987). Demographic characteristics and IQ among adults: analysis of WAIS-R standardization sample as a function of the stratification variables. Journal of School Psychology, 25, 323-342.
Herrnstein, R. J., and Murray, C. (1994). The Bell Curve: Intelligence and Class Structure in American Life. New York: Free Press.
Dunn, L. M. (1988). Bilingual Hispanic Children on the U. S. Mainland. Honolulu: Dunn Educational Services.
Thorndike, R. L., Hagen, E. P., and Sattler, J. M. (1986). Stanford-Binet Intelligence Scale: Fourth Edition Manual. Chicago: Riverside.
Nyborg, H., and Jensen, A. R. (2000). Black-white differences on various psychometric tests: Spearman’s hypothesis tested on American armed services veterans. Personality and Individual Differences, 28, 593-599.
Murray, C. (2006). Changes over time in the Black–White difference on mental tests: Evidence from the children of the 1979 cohort of the National Longitudinal Survey of Youth. Intelligence, 34, 527-540.
Prifitera, A., Lawrence, L. G., and Saklofske, D. H. (1998). The WISC-III in context. In A. Prifitera and D. H. Saklofske (Eds.), WISC-III Clinical Use and Interpretation. San Diego, CA: Academic.
Kaufman, J. C., McLean, J. E., Kaufman, A. S., and Kaufman, N. L. (1994). White-black and white-Hispanic differences on fluid and crystallized abilities by age across the 11 to 94 year range. Psychological Reports, 75, 1279-1288.
Kramer, R. A., Allen, L., and Gergen, P. J. (1995). Health and social characteristics and children’s cognitive functioning: results from a national cohort. American Journal of Public Health, 85, 312-31
Rowe, D. C. (2002). IQ, birth weight, and number of sexual partners in white, African American, and mixed race adolescents. Population and Environment, 23, 513-524.
Weiss, L. G. (2010). WAIS-IV clinical use and interpretation: Scientist-practitioner perspectives. London: Academic.