Selection's genome-wide effect on population differentiation

Alon Keinan and David Reich (2010) have tested an obvious prediction of the hypothesis that recent selection has had a major effect on variation across the genome, and in doing so have provided some strong support for our hypothesis of a recent acceleration.

A new mutation that increases rapidly under positive selection will carry with it a lot of nearby variants that are physically linked to it. The region of this “genetic hitchhiking” will depend on the local rate of recombination – the lower the recombination rate, the longer the extent of the hitchhiking region.
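To get a feel for that inverse relationship, here is a back-of-envelope sketch in Python. It is not taken from Keinan and Reich. It assumes a simple deterministic sweep lasting roughly 2 ln(2N)/s generations, and it treats the hitchhiking region as the distance from the selected site over which about one recombination event per lineage is expected during that time. The function name, the selection coefficient `s`, and the population size `pop_size` are illustrative assumptions.

```python
import math


def hitchhiking_span_bp(s, pop_size, recomb_cm_per_mb):
    """Rough distance (bp) from a selected site that hitchhikes with it.

    Heuristic only: assumes a deterministic sweep of about
    2 * ln(2N) / s generations and a uniform local recombination rate.
    """
    if s <= 0 or pop_size < 1 or recomb_cm_per_mb <= 0:
        raise ValueError("s, pop_size and recomb_cm_per_mb must be positive")
    # Approximate number of generations the sweep takes.
    sweep_generations = 2.0 * math.log(2 * pop_size) / s
    # 1 cM/Mb is 0.01 crossovers per 1e6 bp, i.e. 1e-8 per bp per generation.
    recomb_per_bp = recomb_cm_per_mb * 1e-8
    # Distance at which about one crossover is expected during the sweep.
    return 1.0 / (recomb_per_bp * sweep_generations)


# Hypothetical values: same sweep, two different local recombination rates.
low_recomb = hitchhiking_span_bp(s=0.01, pop_size=10_000, recomb_cm_per_mb=0.5)
high_recomb = hitchhiking_span_bp(s=0.01, pop_size=10_000, recomb_cm_per_mb=1.0)
print(low_recomb / high_recomb)  # halving the rate doubles the span
```

Under these assumptions the span scales as 1 over the recombination rate, which is the whole point: the same sweep leaves a wider footprint where recombination is rare.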

Meanwhile, a new mutation takes a while, sometimes many thousands of years, to spread widely beyond its population of origin. We can measure population differences for a single locus as FST. The FST attained by a new selected variant depends on what frequency it has reached in different populations. Many selected alleles have not yet attained high frequencies anywhere, and so their FST is low. For a few, though, the selected variant has reached a high frequency in a few populations while remaining rare elsewhere. These are recognizable as high FST loci.
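A small Python sketch makes the two cases concrete. It uses the textbook heterozygosity form of Wright's FST for one biallelic locus, FST = (H_T - H_S) / H_T, and assumes equally weighted populations; this is not necessarily the estimator used in the paper. The function name and the allele frequencies are made up for illustration.

```python
def fst_biallelic(freqs):
    """Wright's FST for one biallelic locus, populations weighted equally.

    freqs: frequency of the same allele in each population.
    """
    if not freqs:
        raise ValueError("need at least one population frequency")
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    return (h_total - h_within) / h_total


# Hypothetical allele that is still rare in both populations: low FST.
rare_everywhere = fst_biallelic([0.10, 0.05])
# Hypothetical allele that is common in one population, rare in the other.
common_in_one = fst_biallelic([0.90, 0.10])
print(rare_everywhere, common_in_one)
```

With these made-up frequencies, the allele that is rare in both populations gives an FST close to zero, while the one that has swept in only one population gives a much higher value.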

What is true of the selected allele itself will also be true, to a lesser extent, of the linked haplotype that is hitchhiking along with it. And so, if selection has been sufficiently common in recent human history, there should be a relationship between the local rate of recombination and measures of population differentiation like FST.

Which is exactly what Keinan and Reich found.
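One way to put a number on a relationship like this is to regress FST on the local recombination rate and express the slope as a percentage of the fitted FST at zero recombination. The Python sketch below does that with an ordinary least-squares line. It is an assumption about how such a percentage could be computed, not a description of the paper's procedure, and both the function name and the toy inputs are invented for illustration.

```python
def percent_change_per_unit(recomb_rates, fst_values):
    """Least-squares slope of FST on recombination rate (cM/Mb),
    expressed as a percentage of the fitted intercept."""
    n = len(recomb_rates)
    if n != len(fst_values) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mean_x = sum(recomb_rates) / n
    mean_y = sum(fst_values) / n
    sxx = sum((x - mean_x) ** 2 for x in recomb_rates)
    if sxx == 0:
        raise ValueError("recombination rates must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(recomb_rates, fst_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if intercept == 0:
        raise ValueError("fitted intercept is zero; percentage undefined")
    return 100.0 * slope / intercept


# Toy, perfectly linear data (not real measurements).
toy_rates = [0.0, 1.0, 2.0, 3.0]
toy_fst = [0.100, 0.094, 0.088, 0.082]
print(percent_change_per_unit(toy_rates, toy_fst))  # negative: FST falls as rate rises
```

A negative value means FST falls as recombination rate rises, which is the direction the hitchhiking argument predicts.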

Further, they found that this relationship holds in regions of the genome that contain a lot of coding loci, and much less so in gene-poor regions:

We cannot envision any demographic or mechanistic explanation that would produce a correlation between recombination rate and allele frequency differentiation as observed and we hypothesize that our observations reflect a history of natural selection. Natural selection is usually expected to increase population differentiation at linked neutral sites, an effect that is expected to extend over longer physical distances in regions of lower recombination rate. A prediction of an explanation based on natural selection is that the effect would be more marked in regions that are more likely to be influenced by selection, such as genes.

The observed FST in these categories is not super-high – we’re not looking predominantly at genes for which more than 20 percent of the variation is the between-population component. Therefore, the comparison can encompass quite a bit more of the selected variation in the genome, instead of the extremely stringent cutoffs required to identify an individual candidate gene. It’s a bit like getting a measure of wind speed as opposed to looking at the few highest-flying kites.

Perhaps the most interesting aspect of the study is that they compared the Phase 3 HapMap samples, which include some pairs of nearby populations. They found that the apparent effect of selection on population differentiation was much higher for those nearby pairs of populations:

In addition to qualitatively replicating our findings, analysis of HapMap 3 data allows us to generalize them to additional populations. A striking result is that the relationship between FST and recombination rate is stronger for FST between pairs of closely-related populations, whether within or outside Africa: FST between a West African sample and Maasai (of mixed West African and East African ancestry [57]) decreases by an average of 6% for every 1 cM/Mb (Figure 4D), FST between Italians and individuals of North-Western European ancestry decreases by 10% for every cM/Mb (Figure 4E), and FST between Japanese and individuals of Chinese ancestry decreases by 4% (Figure 4E). In view of the large effective population size in recent human history since each of these pairs of populations have split, these observations support the possibility that the different patterns observed between different pairs of populations are due to natural selection operating more efficiently in the context of larger population sizes.

That’s a direct sign, in other words, of the recent acceleration of positive selection in human populations. There are a lot more genes that are geographically circumscribed and low in frequency affecting FST at a more localized level, and fewer affecting major allele frequencies between continental regions. It’s a neat comparison, and it helps to answer the comment that selection is somehow “weak”, or insignificantly different from drift, because the new selected alleles haven’t spread very far. The point is, most of them are so new that they haven’t had time to disperse widely and reach appreciable frequencies very far from their origins.

UPDATE (2010-03-28): A reader pointed out an error in the post; I had written “lower” recombination rate at one point that should have been “higher”. I have corrected the text.