Brian McEvoy and colleagues report in Genome Research that recent natural selection accounts for many of the largest differences between today’s English, Irish, Dutch, and Scandinavian populations:
Geographical structure and differential natural selection amongst North European populations
...There is evidence from FST based analysis of genic and non-genic SNPs that differential positive selection has operated across these populations despite their short divergence time and relatively similar geographic and environmental range. The pressure appears to have been focused on genes involved in immunity, perhaps reflecting response to infectious disease epidemic [sic]...
Two things. First, I have to write a paper with the word “amongst” in the title!
Second, if we look closer in the paper, we find that the evidence for selection is more diffuse – not limited to the “Immunity and Defense” category – but that category is the only one with a disproportionate increase. The case for natural selection rests on the observation that the high-FST SNPs between their samples fall disproportionately in genes. If these differences were neutral, high-FST SNPs would be no more likely to be genic than SNPs in general.
In fact, there should be a slight deficit of genic SNPs, since these are constrained by purifying selection. This is a good argument against a couple of papers from last year which suggested that Europeans had undergone intense bottlenecks, leading to a disproportionate number of deleterious SNPs. If that were actually true, the genic SNPs would not look different from the non-genic SNPs, or if anything purifying selection in the last few thousand years would have made them more similar, not more different.
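The logic of that genic-versus-non-genic comparison can be put in a few lines of code. This is only a minimal sketch of the idea, not the method used in the paper: I am assuming a simple two-population FST computed from allele frequencies (equal weights, no sample-size correction) and a one-sided hypergeometric test for whether genic SNPs are over-represented in the high-FST tail. The function names and the numbers in the usage lines are made up for illustration.

```python
from math import comb


def fst_two_pops(p1, p2):
    """FST for one biallelic SNP from allele frequencies in two populations.

    Assumption: both populations get equal weight and there is no
    sample-size correction. Computed as (H_total - H_within) / H_total.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        # SNP is fixed for the same allele in both populations.
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def enrichment_p_value(n_total, n_genic, n_high, n_high_genic):
    """One-sided hypergeometric p-value for genic enrichment.

    Probability of seeing at least n_high_genic genic SNPs among the
    n_high high-FST SNPs, if high-FST SNPs were drawn at random from
    n_total SNPs of which n_genic are genic (the neutral expectation).
    """
    upper = min(n_genic, n_high)
    tail = sum(
        comb(n_genic, k) * comb(n_total - n_genic, n_high - k)
        for k in range(n_high_genic, upper + 1)
    )
    return tail / comb(n_total, n_high)


# Toy usage with invented counts, purely to show the call:
# 1000 SNPs, 400 of them genic, 50 in the high-FST tail, 30 of those genic.
toy_p = enrichment_p_value(1000, 400, 50, 30)
toy_fst = fst_two_pops(0.2, 0.8)
```

Under neutrality the genic share of the high-FST tail should match the genic share overall (or fall a little below it, given purifying selection), so a small p-value here is the signature being argued for.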
OK, so genome-wide, the high-FST SNPs are more likely to be genic. Then, they compared all functional classes to see if selection was concentrated in any particular categories:
The Immunity and Defense (BP00148) category is the only one of the 23 ontological terms (with sufficient numbers to be tested) to show a significant enrichment of high FST SNPs after correction for multiple testing (p=0.0012, Adjusted Significance Level = 0.0022).
which itself is pretty surprising – it took a lot of hits on many SNPs to make that signal. This comparison is really only going to pick up SNPs that are very tightly linked to a selected variant, and won’t pick them up if the histories of the populations have been too similar – that’s why lactase doesn’t show up, for example.
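As a sanity check on the numbers in that quote, the adjusted significance level of 0.0022 is about what you get from a standard correction for 23 tests. The sketch below assumes a family-wise alpha of 0.05, which the quoted passage does not state, and tries both the Bonferroni and Šidák formulas, since I can't tell from the quote which one the authors used.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


def sidak_threshold(alpha, n_tests):
    """Per-test significance level under a Sidak correction."""
    return 1 - (1 - alpha) ** (1 / n_tests)


n_categories = 23    # ontological terms tested, from the quoted passage
reported_p = 0.0012  # Immunity and Defense p-value, from the quoted passage
alpha = 0.05         # assumption: family-wise level is not given in the quote

bonf = bonferroni_threshold(alpha, n_categories)
sidak = sidak_threshold(alpha, n_categories)

# Both thresholds round to the reported 0.0022, and the reported
# p-value falls below either one.
passes_bonferroni = reported_p < bonf
passes_sidak = reported_p < sidak
```

Either way, the immune category clears the bar with some room to spare, while none of the other 22 categories does.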
Anyway, the paper discusses several other genes, not immune-related, that show high differentiation among these Northern European samples, and the authors give a list of some rather long regions that appear to have been selected but where they cannot identify the selected locus.
I think a more powerful test could be applied to these data. The FST analysis is the first step, using geography as a test of differential change over space. But we can probably do better than this with a test that actually considers the dispersal pattern of selected genes, in comparison with the population history. That will take some involved demographic modeling, but selected genes ought to really stand out.
McEvoy BP and 26 others. 2009. Geographical structure and differential natural selection amongst North European populations. Genome Res (early online) doi: 10.1101/gr.083394.108