Serial founder effects

I've been trying to think of the best way to approach last week's "serial founder effects" paper by Ramachandran and colleagues (abstract). The paper has been publicized as support for the out-of-Africa theory.

I always find science-by-press-release a bit irritating, because it is impossible to examine the claims to see if the data support them. I guess I should adjust my expectations: if there is a press release and no paper yet, I should just assume the data are weak.

The short answer is, the paper doesn't prove out-of-Africa. It doesn't even present any new data that support out-of-Africa. It presents some new simulations of how an out-of-Africa dispersal might work, but it doesn't test those simulations by comparing them to data that might differentiate their preferred model ("serial founder effect") from other hypotheses that might explain the same observations.

In the end, I really don't have a problem with the paper. You see, it doesn't actually mention the words "out-of-Africa". It doesn't claim to support out-of-Africa. All it does is show a correlation between their simulation results and some genetic data. Personally, I wouldn't have written the paper without testing the hypothesis with data that might refute it, but that's just me.

Of course I'm very interested in modern human origins and genetic information about the subject, so I'll try to give a record of my thought process. I include some discussion of isolation-by-distance as a model for genetic variation, different scenarios that would produce the pattern, and the kind of data that would test those scenarios.

The press

What I could find out last week at this time came from the press, much of which can be traced to the October 18 University of Michigan press release:

Small groups of settlers expanding outward from Africa are the most likely progenitors of the modern human population worldwide, according to a new study by researchers at the University of Michigan and Stanford University.

This led to a National Geographic News article on the same date, which says this:

"When we searched over 4,000 points around the world, we found that no point outside of Africa had as high a fit as any point inside of Africa," [University of Michigan geneticist Noah] Rosenberg said. "So this seems to support an 'Out of Africa' historical model for human evolution."
Genetic diversity is highest, and thus oldest, in Africa. This fact has led many geneticists to point to the continent as the birthplace of humankind.

A Discovery Channel news brief picked up the story October 21, starting like this:

Modern humans left Africa in waves and colonized the Mideast first and then Europe, according to a new study that traced early human migration patterns through variations in DNA.
The study, which supports the "Out of Africa" theory that humans first emerged in Africa before migrating to other parts of the world, determined that South America was the last settled region.

But these were the only two news sources to bite, apparently. That itself is usually a bad sign -- "supports out-of-Africa" tends to gather more press attention.

The paper

The paper itself appeared in PNAS Early Edition on October 21, three days after the press release. A text search shows no mention of "out of Africa". Or "recent African origin". Or anything of the sort.

Hmmm.... I'm confused.

The dataset in the paper includes 783 microsatellites sampled in 1027 individuals from different source populations. Like many other genetic samples from humans, this sample has two notable characteristics: overall variation is higher within Africa than elsewhere, and geographically distant populations are more genetically different than geographically close populations. This correlation of geographic distance and genetic difference is often related to the model of "isolation-by-distance", in which the movement of individuals between populations is a function of distance.

Isolation-by-distance

Isolation-by-distance makes perfect sense for human populations. Historically, people have tended to mate with other people close to them, and the chance that they will move a long distance to mate is much less than the chance they will move only a short distance. But isolation-by-distance is no support for any kind of recent mass migration, out of Africa or anywhere else. People could have always lived where they are now, and the genes would still show isolation-by-distance.

There is a long story here that is increasingly only of historical interest. During the early 1990s, a series of comparisons of genetic variation in Africa versus variation in Europe and Asia made a really bad assumption -- they assumed that people in these regions never interbred with each other. If this were true, and if none of the differences between these populations were the result of natural selection, then you could figure out how long ago these "separate" populations must have shared a common ancestry. These studies were among the earliest supports for the idea that modern humans had a recent African ancestry -- owing to the fact that the "date" of population divergence between Africans and non-Africans was between 50,000 and 100,000 years ago or so. By the late 1990s, there were studies that tried to trace the population history within China, or Europe, or Africa using the same methods. Assume the populations never interbred, work out the dates, and presto! There's your population history.

Of course, the assumptions behind these estimates were basically bunkum. A global correlation of geographic distance and genetic difference is compatible with lots of hypotheses of population history -- from long-term isolation-by-distance to "demic diffusion" to recent mass migrations. But what it is not consistent with is a complete lack of interbreeding among human groups. And when you examine the "fit" between these "tree" models of population history (the no-interbreeding models) and real genetic data, you find that they just don't fit very well (Templeton 1998 reviews this issue).

So what about those "dates" of population divergences? Turns out they aren't necessarily divergence dates at all. In fact, if you just assume that the populations never diverged but always interbred, the genetic distances can be explained by different rates of migration (Relethford 1995).
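To see how the same number can be read two ways, here is a back-of-the-envelope sketch. It is not the calculation in Relethford (1995); it just uses two textbook approximations for the fixation index between populations: Fst ~= 1 - exp(-t / 2Ne) for populations that split t generations ago and never exchanged genes, and Fst ~= 1 / (1 + 4Nm) for populations at equilibrium under Wright's island model with Nm migrants per generation. The Fst, effective size, and generation length below are illustrative values I picked, not estimates from any dataset.

```python
import math

def divergence_time_generations(fst, n_e):
    """Generations of complete isolation needed to reach `fst` by drift alone.

    Inverts Fst ~= 1 - exp(-t / (2 * Ne)); assumes no migration at all.
    """
    return -2.0 * n_e * math.log(1.0 - fst)

def migrants_per_generation(fst):
    """Nm implied by `fst` at island-model equilibrium.

    Inverts Fst ~= 1 / (1 + 4 * Nm); assumes the populations never split.
    """
    return (1.0 / fst - 1.0) / 4.0

if __name__ == "__main__":
    fst = 0.10            # illustrative value, not taken from any paper
    n_e = 10_000          # assumed effective population size
    years_per_gen = 25    # assumed generation length

    t = divergence_time_generations(fst, n_e)
    print(f"Read as a split: {t:.0f} generations, about {t * years_per_gen:.0f} years")
    print(f"Read as ongoing gene flow: Nm = {migrants_per_generation(fst):.2f}")
```

With these made-up inputs, the "tree" reading gives a split a bit over 2,100 generations ago (roughly 53,000 years), while the gene-flow reading gives 2.25 migrants per generation with no split at all. The genetic distance alone cannot tell you which story is right.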

This entire controversy consumed a lot of ink (and now pixels), all based on a single faulty assumption.

Subsets of diversity

A better strain of argument was first proposed by Tishkoff et al. (1996). By this time, it was known that many genes were more variable in Africa than elsewhere, and that Native Americans lacked much of the variation present in Eurasia. But in an analysis of linkage disequilibrium around the CD4 gene, Tishkoff et al. (1996) showed that diversity itself formed a gradient, or cline, in which some African populations were at or near linkage equilibrium, and populations geographically more distant from Africa showed stronger disequilibrium. This mirrored the pattern of variation of mtDNA, and appeared to indicate a contrast: variation was continuous across space, but discontinuous across time. The greater disequilibrium in populations farther from Africa could be explained as a consequence of recent genetic movement (at least of CD4 genes) into those populations from more variable populations.

Several other genes were later found to show similar patterns: variation was high in Africa, and became systematically lower in populations further from Africa. Sometimes this pattern was characterized as "subsets of diversity", in which Eurasian populations contained only a "subset" of the alleles present in Africa. "Subset" was a misnomer for the actual pattern (at least in every case I looked at), since it implied that no uniquely Eurasian variants occur. The actual pattern is generally more complex, with a smaller number of uniquely Eurasian variants (so-called "private" alleles) than African variants, and a greater average age for African variants than for Eurasian variants.

One hypothesis to account for this pattern of diversity is a "serial founder effect". The idea is that a small group left Africa to found a population in West Asia. Then a small group from that population left to found a population in, say, India. Then a small group from India left to found a new population in Thailand. And so on, until the entire world was populated.

Under this hypothesis, a substantial number of African alleles would be left behind in Africa. Even more of these alleles would be left behind in West Asia. More would be left behind in India. In the end, the diversity of populations would reflect the series of founding events that trace their ancestors' movement out of Africa and into the rest of the world. A serial founder effect from Africa across the globe could account for the decline in genetic variation in populations further and further from Africa.
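A minimal sketch of that decline, under strong simplifying assumptions of my own: every colony is founded by the same number of diploid founders, each founding is a single generation of random sampling, and there is no mutation or migration afterward. In that case expected heterozygosity shrinks by a factor of (1 - 1/2N) at each founding event. The starting heterozygosity and founder count are illustrative, and this is not the simulation used by Ramachandran and colleagues.

```python
def heterozygosity_after_founders(h0, founders, steps):
    """Expected heterozygosity along a chain of serially founded colonies.

    h0       -- heterozygosity in the source population (assumed value)
    founders -- diploid founders per colonization event (assumed constant)
    steps    -- number of successive founding events

    Returns a list of length steps + 1; element k is the expected
    heterozygosity k founding events away from the source.
    """
    keep = 1.0 - 1.0 / (2.0 * founders)  # fraction retained per founding event
    return [h0 * keep ** k for k in range(steps + 1)]

if __name__ == "__main__":
    chain = heterozygosity_after_founders(h0=0.80, founders=50, steps=40)
    for k in (0, 10, 20, 30, 40):
        print(f"{k:2d} founding events from the source: H = {chain[k]:.3f}")
```

The decline is smooth and monotonic, which is why a long chain of small founder events can mimic a cline. Notice what the function leaves out: there is no term for later gene flow between neighboring colonies.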

But there is a problem: notice that the serial founder effect, once again, assumes no interbreeding between human groups. If Africans could move out of Africa into West Asia not only 100,000 years ago but every date after that as well, then their alleles ought to be a lot less likely to have been left behind.

Now this problem is not so pronounced as for the model with a small number of branches. With enough steps (i.e. individual founder effects), it is much easier to make a serial founder effect consistent with human variation, which is clinal.

Even better, if you actually channel comparisons of geographic and genetic distances through a small set of "waypoints", you can get the two to match really well. These "waypoints" represent chokepoints of human movement -- that is, you can't get from Asia to Africa without passing near Cairo; you can't get from Asia to Europe without crossing the Bosporus, etc.

Except, well, you can get from Asia to Europe without crossing the Bosporus, if you can go north of the Black Sea. And you can get from Asia to Africa without going near Cairo if, like the Austronesians, you take a boat by way of Madagascar.
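For anyone who wants to see what routing through waypoints does to the distances, here is a rough sketch using the standard haversine formula for great-circle distance. The function names and the city coordinates are my own illustrative choices (approximate latitude and longitude in degrees), not the values used in the paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))

def routed_km(start, end, waypoints=()):
    """Distance from start to end, forced through the waypoints in order."""
    path = [start, *waypoints, end]
    return sum(great_circle_km(p, q) for p, q in zip(path, path[1:]))

if __name__ == "__main__":
    # Approximate coordinates, for illustration only.
    addis_ababa = (9.0, 38.7)
    cairo = (30.0, 31.2)
    istanbul = (41.0, 29.0)
    paris = (48.9, 2.3)

    direct = routed_km(addis_ababa, paris)
    via = routed_km(addis_ababa, paris, waypoints=[cairo, istanbul])
    print(f"direct: {direct:.0f} km, via Cairo and Istanbul: {via:.0f} km")
```

The routed distance can never be shorter than the direct one, and the choice of which waypoints to force is a modeling decision. Change the list (say, a route north of the Black Sea) and the geographic distances change with it.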

All this is just to say that if you make your model of founder effects complicated enough -- say, by including a huge number of steps -- then you can come close to matching the overall pattern of human genetic variation. But building a complicated model along one set of assumptions (in this case, the idea that all genetic variation can be explained by drift and founder effects) doesn't confirm the hypothesis that this scenario really happened.

Nor does it test other possible hypotheses for human genetic variation.

Other hypotheses

The data that the paper attempts to explain are (1) the correlation of genetic distance and geographic distance among human populations, and (2) the decrease in genetic diversity in populations farther from Africa. We may ask, what other hypotheses would explain the same data? And what kind of evidence could test these hypotheses, instead of just asserting that they "match" the pattern of evidence?

One scenario that matches the evidence is multiregional evolution with a recent African dispersal of some adaptive genes. This is the hypothesis presented by Eswaran (2002). The idea is that human populations interacted for a long time in Africa and Eurasia, and that during the Late Pleistocene, adaptive changes within Africa allowed those populations to spread alleles into existing populations in Eurasia. The strength of the "founder effect" in this scenario depends on the genetic structure and selective advantage of the new African adaptive complex. Ramachandran et al. (2005) actually cite Eswaran (2002) as an example of a serial founder effect. So the idea that there was widespread genetic movement out of Africa does not necessarily imply an out-of-Africa population replacement. The data do not require a replacement, and some -- even many -- of the genetic variants outside of Africa may have nothing to do with recent genetic movement out of Africa.

A second hypothesis is presented by Templeton (2002), who proposed that several founder effects happened at different times in the Pleistocene, each carrying one or more genetic variants out of Africa. The pattern of genetic variation appears to indicate that some genes left Africa during the Lower or Middle Pleistocene, while others dispersed later, during the Late Pleistocene. For Templeton (2002), this pattern indicates multiple dispersals, none of which was sufficient to wipe out the genetic contribution of earlier dispersals. This scenario also would lead to a pattern of correlation of genetic and geographic distance (because most genes have been affected by isolation-by-distance for a long time), while the recurrent dispersals would explain the decline in genetic variation outside of Africa.

A third hypothesis is that population size was simply greater within Africa than within Eurasia. The smaller population size (along with isolation-by-distance) would explain the difference in genetic variation; the correlation of genetic and geographic distance would be explained by isolation-by-distance. We may consider a fourth hypothesis also: that natural selection has tended to create slightly more genetic uniformity within Eurasia and slightly more genetic diversification in Africa. Such a scenario might be justified on ecological grounds: African populations cover a wider range of ecologies and have historically had a greater exposure to zoonotic disease, for example.
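The population-size idea (the third hypothesis) is easy to put in numbers. As a sketch, assume microsatellites evolve by the stepwise mutation model and that each population sits at mutation-drift equilibrium; then expected heterozygosity is H = 1 - 1/sqrt(1 + 8 Ne mu), where Ne is effective size and mu is the per-generation mutation rate. The mutation rate and the two population sizes below are illustrative assumptions, not estimates.

```python
import math

def smm_equilibrium_heterozygosity(n_e, mu):
    """Expected heterozygosity at mutation-drift equilibrium.

    Stepwise mutation model: H = 1 - 1 / sqrt(1 + 2 * theta),
    with theta = 4 * Ne * mu. Assumes a constant-size, randomly
    mating population with no selection.
    """
    theta = 4.0 * n_e * mu
    return 1.0 - 1.0 / math.sqrt(1.0 + 2.0 * theta)

if __name__ == "__main__":
    mu = 5e-4  # assumed microsatellite mutation rate per generation
    for label, n_e in (("larger population", 10_000), ("smaller population", 5_000)):
        h = smm_equilibrium_heterozygosity(n_e, mu)
        print(f"{label} (Ne = {n_e}): H = {h:.3f}")
```

A population half the size ends up with lower equilibrium diversity without any founder event at all. That is the whole point of the hypothesis: a difference in long-term size produces a difference in variation on its own.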

Apart from the serial founder effect with population replacement, these hypotheses are not mutually exclusive. In other words, some genes might have been influenced by natural selection, most might have been somewhat influenced by differences in population size, but the largest effect might have been recurrent population dispersals.

Now the question is whether other sources of genetic evidence can exclude one or more of these hypotheses. The serial founder effect with replacement is the simplest to test, because it not only makes predictions about the pattern of variation (which Ramachandran et al. (2005) consider) -- it also makes predictions about the date of population movement. If variation outside of Africa is inconsistent with a single dispersal that was very recent, then the recent serial founder effect with replacement must be wrong. So the test of the hypothesis is the date of movement -- if any genes preserve evidence of more ancient population movement than the Late Pleistocene, they reject the recent serial founder effect with replacement.

Ramachandran et al. (2005) do not discuss the date of population movement. There is no occurrence of the words "date" or "age" in the paper. There is one relevant occurrence of the word "time":

There clearly has not been time to reach equilibrium between the extremes of man's inhabited range, or even within continents, in the very short evolutionary history of modern humans (29) (Ramachandran et al. 2005:15945).

I left in citation 29 to point out that it is Cavalli-Sforza and Feldman (2003). In other words, their only citation or mention of "recent" population movement is from themselves!

A review of other recent work shows that many genes don't match the time required by the replacement scenario. Templeton (2002) traces evidence of population movements dating to well over 300,000 years ago, with some evidence of much older movement. Eswaran et al. (2005) argue that the diversity of genes outside of Africa is inconsistent with any recent replacement.

I discussed this evidence in my earlier post on mtDNA selection; the data haven't changed since then. The bottom line is that two pieces of information -- the genetic-geographic distance correlation and the cline of lower diversity out of Africa -- match the replacement scenario with serial founder effects. But other hypotheses also match these pieces of information, and none of them require a recent population replacement. A third piece of information does not match a recent population replacement -- the apparent antiquity of genetic variation outside of Africa. This piece of evidence is the crucial test.

References:

Eswaran V, Harpending H, Rogers AR. 2005. Genomics refutes an exclusively African origin of humans. J Hum Evol 49:1-18.

Ramachandran S, Deshpande O, Roseman CC, Rosenberg NA, Feldman MW, Cavalli-Sforza LL. 2005. Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Proc Nat Acad Sci USA 102:15942-15947.

Templeton AR. 1998. Human races: a genetic and evolutionary perspective. Am Anthropol 100:632-650.

Templeton AR. 2002. Out of Africa again and again. Nature 416:45-51.