Lifting all boats

James R. Flynn is a social scientist at the University of Otago, New Zealand. Beginning in 1981, Flynn performed a series of statistical analyses on the results of IQ tests in Western populations. The analyses showed that IQ scores in every population measured are systematically rising over time, with each succeeding generation apparently smarter than its predecessor. These IQ gains have been dubbed the "Flynn effect," and the causes and patterns underlying these changes are still obscure. Flynn recounts the story of his findings in his article "Searching for justice: the discovery of IQ gains over time," in the January 1999 issue of American Psychologist, as well as discussing the current state of research into IQ changes and the moral and sociological corollaries of the research.

The initial finding of rising IQ came from looking at the pattern of correlations between successive versions of IQ tests. When publishers came up with revised versions of tests, they would apply both the new and old test to a set of the same subjects. If the subjects scored similarly on both versions of the test, it would demonstrate that the new test was measuring the same skills as the old test. But Flynn noticed something else about these results: the subjects who took both versions of the test would average a significantly higher score on the older version. Since the older version's scores were standardized at the time it was written, Flynn noted that "the only possible explanation was that representative samples of White Americans were setting higher standards of test performance over time" (5). In this first analysis, the IQ gain from 1947 to 1972 was 8 points. In a broader sample of tests, "the rate of gain was about 0.30 IQ points per year, roughly uniform over time and similar for all ages" (6, citing Flynn 1984).

In light of these results, Flynn gathered data on IQ surveys in as many nations as had routine military induction testing or other large-scale testing organizations (6). The startling result was that IQ has been increasing in every one of these countries, and that the increases were especially noted in tests that were thought to be less susceptible to biases from educational or cultural factors. The longest-term sample was available from Britain, where it became clear that the fifth percentile among persons born in 1967 was equivalent to the ninetieth percentile of those born in 1877 (Raven et al. 1998).
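To get a feel for the size of the British result, the percentile comparison can be converted into IQ points. The sketch below is only a rough illustration under assumptions that are mine rather than Flynn's: that scores in both birth cohorts are normally distributed with the same spread, and that one standard deviation corresponds to the conventional 15 IQ points. The function name is hypothetical.

```python
from statistics import NormalDist

IQ_SD = 15  # assumption: conventional IQ scale, 15 points per standard deviation


def mean_shift_in_sd(later_percentile, earlier_percentile):
    """Mean shift (in SD units) implied when the same raw score sits at
    `earlier_percentile` in the earlier cohort and at `later_percentile`
    in the later cohort, assuming equal-variance normal distributions."""
    std_normal = NormalDist()
    return std_normal.inv_cdf(earlier_percentile) - std_normal.inv_cdf(later_percentile)


# 90th percentile of the 1877 cohort equals the 5th percentile of the 1967 cohort
shift_sd = mean_shift_in_sd(later_percentile=0.05, earlier_percentile=0.90)
shift_points = shift_sd * IQ_SD
years = 1967 - 1877

print(f"implied shift: {shift_sd:.2f} SD, or about {shift_points:.0f} IQ points")
print(f"implied rate: {shift_points / years:.2f} points per year over {years} years")
```

Under these assumptions the implied shift is close to three standard deviations, which is what makes the literal reading of the scores so hard to accept.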

What should we make of this phenomenon? According to Flynn (7):

This deals a stunning blow to our confidence in the ability of IQ tests to compare groups for intelligence, at least when those groups are separated by cultural distance. Can anyone take seriously the notion that the generation born in 1937 was that much more intelligent than the generation born in 1907, to say nothing of the generation born in 1877? It also deals a blow to the Spearman-Jensen theory of intelligence. That theory is based in g, the general intelligence factor derived from the tendency of the same people to excel on a wide range of IQ tests.

Flynn relies on common sense to justify this intuition, but to the extent that the results would seem to indicate very surprising conclusions, his common sense appears to be sound. As he notes, it seems unlikely that high proportions of earlier generations would have lacked the intelligence to understand the rules of popular sports. And as he notes (7), "[achievement gains] fall away the closer we come to the content of school-taught subjects" such as "arithmetic, information, and vocabulary."

He also gives a quick rundown of the reasons suggested to explain the IQ rise over time, along with reasons why they cannot explain the entire problem. His passage on the effect of better nutrition is well worth reading. Each section is a reminder of the nature of the problem: first, can these reasons explain IQ gains that were progressive, gradual, and constant, and second, are we really to believe that earlier generations actually had intelligence as low as their scores compared to today's scores would indicate?

The social justice element of the article comes from the consideration of how apparent IQ gains affect the interpretation of race differences in IQ test performance. One section considers the ways in which a group's real-world achievement may differ from what its IQ scores, compared to those of other groups, would lead one to expect. Flynn focuses on the cultural and educational differences among groups as a way to explain differences in outcome that are not predicted by IQ scores.

In a second section, Flynn lays out the issues surrounding the interpretation of race differences in IQ as applied to American Blacks and Whites (his terminology). In this, he gives a thoughtful presentation of the Jensen position. This contains a very clear presentation of the concept of regression that is worth quoting (13):

The key to what follows is the concept of correlations as measures of regression toward the mean. To provide a simple example: Imagine that the correlation between height and between-family environment was perfect, or 1.00. The significance of that would be this: If we found a group one standard deviation below the mean for height, and environmental factors were solely responsible for their height deficit, then they should be one standard deviation below average in terms of environment. Now imagine the correlation is less than perfect, perhaps only .35. In that case, it would take several standard deviations of environmental deprivation to account for a one standard deviation height deficit. A bit of arithmetic shows it would take 2.86 standard deviations, because 2.86 times .35 equals one standard deviation. In summary, the correlation determines precisely how many standard deviations of environmental deficit it takes to account for a one standard deviation height deficit, or to be technical, it determines how far toward the mean the below-average group will regress as each standard deviation of environmental deficit is eliminated: clearly only 35% of the way.

As Flynn notes, Jensen's (e.g. 1972) case for Black-White IQ differences rested on the observation that between-family environmental differences had a correlation of .35 with IQ within Whites. According to the example above, it would require a deficit of 2.86 standard deviations of environmental conditions in Blacks compared to Whites in order to explain the IQ difference. According to a normal distribution, this would imply that "the average environment of Black Americans would have to be inferior to that enjoyed by 99.79% of White Americans" (13). Jensen argues that such a difference is unlikely, but in particular examines the idea that such an environmental difference might exist by discussing what its likely effects would be. In his view, a factor like racism might account for IQ differences by affecting self-image, confidence, and other social factors. But as Flynn summarizes his argument, "Who could argue that these same factors do not vary significantly within the Black population? ... If these factors both are potent and vary among Blacks, why do they explain so little IQ variance within the Black population?"
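The two figures in this argument, 2.86 standard deviations and 99.79%, follow from simple arithmetic. Here is a minimal Python sketch of that arithmetic, assuming (as the argument does) a trait deficit of exactly one standard deviation and a normal distribution of environments. The function names are mine, chosen for illustration.

```python
from statistics import NormalDist


def required_environmental_deficit(correlation, trait_deficit_sd=1.0):
    """Standard deviations of environmental deficit needed to account for
    a trait deficit, given the environment-trait correlation."""
    return trait_deficit_sd / correlation


def share_with_better_environment(deficit_sd):
    """Fraction of a standard normal population lying above a point that is
    `deficit_sd` standard deviations below the mean."""
    return NormalDist().cdf(deficit_sd)


deficit = required_environmental_deficit(0.35)
share = share_with_better_environment(2.86)

print(f"required deficit: {deficit:.2f} SD")
print(f"share of comparison group better off: {share:.2%}")
```

Dividing by the correlation is the whole trick: the weaker the correlation between environment and trait, the larger the environmental gap needed to produce the same trait gap.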

Flynn discusses Lewontin (1976) in particular as someone who argued for an environmental explanation by means of a thought experiment. Certainly there is no theoretical difficulty in imagining a nongenetic difference that would cause differences between groups without causing significant within-group differences. But one must imagine that the environmental difference was uniform within each group, which is arguably not true of any real-life human environmental variable. In Flynn's view, Lewontin's example merely begs the question of the real differences between races, since Lewontin provided no real-world example that would credibly avoid Jensen's objections.

Flynn describes his own attempts to address this issue. The major piece of evidence he drew upon (Flynn 1980) involved the IQ scores of children of the American occupation of Germany after World War II. Both Black and White American soldiers fathered children with German women, and the IQ scores of both groups of children were almost identical. Flynn argued that direct evidence of this kind, reflecting what actually occurs when different races are subject to qualitatively similar environments, is more important than the kind of indirect evidence that comes from attempting to correct samples for socioeconomic status or other environmental variables.

In addition to this kind of direct evidence, Flynn argues here that the observation of IQ test gains over time further reduces the relevance of the indirect evidence of environmental differences. For one, the gain of IQ from one generation to the next must certainly be almost all, if not entirely, due to environmental change. Flynn presents this as a real-world example that fits Lewontin's model, above, where within-group differences are mostly genetic, while between-group differences are mostly environmental in origin. (This also plays into a continuing theme in Flynn's research examining how IQ can continue to increase while remaining fairly strongly heritable. This itself appears paradoxical, since continued environmental gains would normally be expected to reduce the heritability by decreasing potential environmental variance.) Second, the magnitude of the IQ gains is so large that the Black-White average IQ difference seems comparatively minor (15):

As for the environmental gap one must posit to explain the Black-White IQ gap, IQ gains over time pull this out of the stratosphere and down to earth. It appears that Blacks have enjoyed a slightly higher rate of gain on Wechsler-type tests than Whites (Herrnstein and Murray 1994, pp. 277, 289). This implies that since 1945, Blacks have gained at an average rate of over 0.30 points per year and have gained a total of 16 points over 50 years. Therefore, the Blacks of 1995 should have matched the mean IQ of the Whites of 1945. Therefore, an environmental explanation of the racial IQ gap need only posit this: that the average environment for Blacks in 1995 matches the quality of the average environment for Whites in 1945. I do not find that implausible.
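The rates quoted in this passage and earlier in the article can be checked against each other with a few lines of Python. This is just a sketch using the figures already given above (8 points from 1947 to 1972, 16 points over 50 years, and the benchmark of about 0.30 points per year). The helper name is hypothetical.

```python
def annual_rate(points_gained, years):
    """Average IQ points gained per year over a span of years."""
    return points_gained / years


# Figures quoted in the text
first_analysis_rate = annual_rate(8, 1972 - 1947)  # 8 points, 1947 to 1972
black_gain_rate = annual_rate(16, 50)              # 16 points over 50 years
benchmark_total = 0.30 * 50                        # 50 years at 0.30 per year

print(f"1947-1972: {first_analysis_rate:.2f} points per year")
print(f"16 points over 50 years: {black_gain_rate:.2f} points per year")
print(f"50 years at 0.30 per year: {benchmark_total:.0f} points")
```

Both quoted gains work out to a little over 0.30 points per year, which is consistent with Flynn's description of a rate "over 0.30 points per year."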

Of course, this still leaves unexplained just why the IQ gains have occurred and whether they actually reflect differences in some mental properties beyond the process of psychometry. So as an account of race differences, it is not entirely satisfactory. But it does show quite clearly that the kind of environmental differences that would be sufficient to explain racial differences in IQ as measured today are very much within the range of environmental changes that must have occurred in a historical context. For that reason, we have every reason to think that the environmental differences between groups today might be very large, and sufficient to explain observed differences in IQ scores. Or as Flynn puts it (16): "The appropriate rejection of Black genetic inferiority is this: Nothing at present coerces rational belief."

The penultimate section of Flynn's paper concerns the relationship of IQ with class membership, the thesis of Herrnstein and Murray's The Bell Curve. Here, Flynn does something very interesting (this section is derived from a longer 1996 paper in the Journal of Biosocial Science). Most critiques of The Bell Curve have focused on the race differences aspect of the book. But Flynn takes a greater interest in the aspect of the book that focuses on the idea of a natural meritocracy, for which IQ scores are assumed to be a correlate (16):

[Race differences are] a distraction from the real challenge The Bell Curve poses. The humane-egalitarian concept of social justice includes more than compensating people who suffer because of their group membership. It gives high priority to certain ideals, such as reducing environmental inequality and social privilege to tolerable levels. Herrnstein and Murray (1994) went beyond race to level the most devastating possible critique of those ideals, namely, that they self-destruct in practice. I refer to the meritocracy thesis, which runs as follows. The closer we come to environmental equality, the more all talent differences become caused by genetic differences. The more we eliminate privilege, the more we have total social mobility, and good genes for talent rise to the top and bad genes sink to the bottom. The tendency to marry those of similar IQ produces mating couples whose social status correlates with genetic quality. The result is an elite class whose children replicate their parents' high status, because of luck in the genetic lottery, and a large immiserated underclass whose children, handicapped by their bad genes, cannot escape low status.

Flynn presents an argument against this "meritocracy thesis" that claims the thesis is psychologically incoherent. It is not enough to show that an IQ elite is not already emerging, for although the evidence clearly shows that class differences in IQ are not increasing, that does little to assuage the moral difficulties that emerge from the concept of a true meritocracy. For there can be little argument that a reduction in environmental differences between people is a goal of "enlightened" social policy; it is, after all, the very meaning of "equality of opportunity." If the emergence of strong and permanent class differences based on genetic differences were a natural consequence of true equality of opportunity, one might well question the social value of such policies.

But Flynn argues that the meritocracy thesis is internally incoherent. It proposes that if social and environmental inequalities were eliminated, a strong ordering of people by class according to their innate talents would result. Because of equal environments, variation in talents would be largely genetic, and would therefore be increasingly resistant to change. But consider that social stratification occurs precisely because of the striving of individuals for greater status, wealth, prestige, and other indices of social inequality. Flynn essentially argues that Herrnstein and Murray (1994) unjustifiably project the current value of materialism and elitism into a hypothetical future, one which is predicated on the absence of the current level of materialism and elitism. In other words, the meritocratic future depends on strong notions of social competition based on wealth (or other markers of status), but the establishment of such a future depends on eliminating most differences in wealth and status. Or more directly (18):

The case against meritocracy can also be put sociologically: (a) Allocating rewards irrespective of merit is a prerequisite for meritocracy, otherwise environments cannot be equalized; (b) allocating rewards according to merit is a prerequisite for meritocracy, otherwise people cannot be stratified by wealth and status; (c) therefore, a class-stratified meritocracy is impossible.

Thus, the idea of an "immiserated underclass" seems inconsistent with the assertion that environments are qualitatively equal. But "if all have decent work, housing, education, health care, security in old age, what remains is not essential for happiness. Many people of talent may want more than the not-unattractive minimum, but how many will care about shaking the last dollar out of the money tree?" (18). Flynn concludes (18):

Analysis of the meritocracy thesis provides not only a rebuttal but also a better understanding of the dynamics of humane-egalitarian ideals. The truth is that we cannot push equality much beyond our capacity to humanize. Every significant step toward equality must be accompanied by an evolution of values unfriendly to success as defined by the present class structure. Humane-egalitarian ideals possess a great glory: a self-correcting mechanism that avoids meritocratic excess. Whatever dark spirits lurk in the depths of equality, meritocracy is not among them.

I find this part the most intriguing, because it invites some expectations about ancient human societies, and the probable correlates of intelligence. Clearly, intelligence (broadly construed) in humans has both genetic and environmental components. Likewise, other traits including status (wealth is less relevant in a Pleistocene context) and of course fitness have both genetic and environmental components. Each of these traits was likely correlated to some extent with the others, and to the extent that each trait was correlated with fitness and was heritable, it would be under selection.

In this context, humans had every reason to increase their fitness through systematic alteration of the environmental component of these traits. Some aspects of the environment would be largely outside their control. For example, maximizing nutrition must have been a constant struggle for all people largely at the mercy of local ecological conditions. Other aspects could have emerged from interesting patterns of social interaction. For example, practical intelligence (as applied to problems of survival and reproduction) must have been greatly influenced by other people, through teaching, observation, learning feedbacks, storytelling, and other opportunities. It seems plausible that the environmental component of this kind of intelligence would have been higher in Pleistocene societies than today. One reason is the likely diversity of social contexts in small groups with high mortality rates (such as the increased chance of absence of one or both parents).

This kind of interaction among variables creates a behavioral context in which not only the brain functions underlying intelligence-related skills would have been under selection, but also those functions related to enabling those functions to their maximum under the existing environmental regime. This latter selection would be highly kin-mediated, as the inclusive fitness of individuals depended partly on the intelligence of their relatives. For that matter, the direct fitness of individuals would depend on the realized intelligence of members of their groups in the long-term struggle for survival and differential reproduction. Consider:

  1. These processes predict that group effects in human evolution might have been largely intelligence-mediated. Individual survival depends in part on the hunting effectiveness of group members, on their ability to maintain social ties with kin in neighboring groups, on whether they remember important information about ecological or climatic variability, etc. So it is in every individual's best interest to contribute materially to the education (meaning environmental maximization of operational intelligence) of other members of his or her group, related or not.
  2. Any genetic advantages in intelligence mean little without substantial environmental equality within the population at large. This is because environmental differences in intelligence might easily outweigh any genetic advantage.
  3. It goes without saying that people should have competed to the greatest extent possible for those material (or behavioral) circumstances that are associated with maximal environmental benefits for fitness. But if intelligence was correlated with fitness, then those circumstances most favorable for the development of intelligence should also have been subject to strong competition. These might include:
    1. preserving the lives of elders or others with valuable information
    2. recruiting new group members with certain properties conducive to greater group learning, such as good storytellers
    3. developing strong cultural justifications for the transmission of certain kinds of knowledge
    4. exercising mate choice based on behaviors related to intelligence
    and certainly others.

These ideas are suggestive. People were not merely involved in a struggle for survival and reproduction, in which intelligence may have been a factor. They were simultaneously locked in a meta-struggle, in which the determinants of intelligence were themselves judged among people and their social standing and other characteristics became dependent on them because of their value for the group, present and future kin, and their risk observed for neighboring peoples. These conditions were only genetic to a minimal extent. For the most part, this struggle took place within populations based on the fundamentally environmental variation that was always present and could not be eliminated (because it would always have been maintained by population pressure if nothing else). So the natural selection underlying the evolution of the brain was itself a side effect of a very powerful social structure with behaviors devoted to detecting intelligence and promoting it for basically selfish reasons.


Flynn JR. 1984. The mean IQ of Americans: massive gains from 1932 to 1978. Psych Bull 95:29-51.

Flynn JR. 1996. Group differences: is the good society impossible? J Biosoc Sci 28:573-585.

Flynn JR. 1999. Searching for justice: the discovery of IQ gains over time. Am Psych 54:5-29.

Jensen AR. 1972. Genetics and education. New York: Harper and Row.