natural selection

I keep seeing people, who really ought to know better, saying that the new Neandertal genome results show that the gene flow must have been Neandertal men mating with modern human women, and not the other way around.

You see, they're fixated on the idea that the mtDNA showed no signs that the Neandertal clade survived into the present-day population. That result really convinced some people that interbreeding was impossible. They're flummoxed that some of the rest of the genome has significant signs of intermixture. It's like their world is spinning out of control. I'm not naming any names, but if you've followed much of the press around the Neandertal genome, you've probably seen this suggestion.

I don't know why it hasn't occurred to them that the Neandertal mtDNA type was probably lost because of natural selection.

To avoid raising the awful specter of Darwin, they've been talking about weird mating restrictions. Well, I suppose that if you really have to find a way to get Neandertal nuclear genes into us, without bringing mtDNA along, a total lack of Neandertal women contributing genes is formally one way to get that.

I'd just like to see these people explain how exactly we managed not to get any Neandertal Y chromosomes, either.

Is it safe to talk about selection, now?

UPDATE (2010-05-11): A reader writes:

With regard to your latest blog post on lack of neanderthal mitochondrial and Y chromosome DNA in humans: yes, it's possible natural selection had a part. However, given that only a small proportion of our ancestors seem to have been neanderthals at the appropriate time, it strikes me that this is a case where drift could be the correct explanation - despite the fact that I'm usually not a big fan of drift as an explanation.

Much depends on the size of the ancestral population and the pace of population growth in the generations surrounding the pickup of Neandertal genes. Drift is less likely to eliminate alleles in a growing population, but it depends how many copies there were to begin with. The key questions -- where and when the population was growing -- are unlikely to be the same as assumed by the modeling that showed drift couldn't have eliminated the Neandertal mtDNA, as most assumed the location of contact would be Europe and the time would be late.
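To put rough numbers on how much the starting copy count and the growth rate matter, here is a minimal sketch using a simple branching process. It is a toy illustration only, not the modeling discussed above. It assumes each copy of a neutral lineage (say, an mtDNA type) leaves a Poisson-distributed number of descendant copies, and the function name and parameters are hypothetical.

```python
import math

def lineage_loss_probability(copies, mean_offspring, tol=1e-12, max_iter=100_000):
    """Chance that every one of `copies` initial copies eventually dies out.

    Assumption: each copy independently leaves a Poisson number of
    descendant copies with mean `mean_offspring` (1.0 = constant
    population size, above 1.0 = growing population).
    """
    if mean_offspring <= 1.0:
        # Without growth, a neutral lineage is eventually lost with certainty.
        return 1.0
    # The extinction probability q of a single copy is the smallest root of
    # q = exp(m * (q - 1)); iterating from q = 0 converges to that root.
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(mean_offspring * (q - 1.0))
        if abs(q_next - q) < tol:
            q = q_next
            break
        q = q_next
    # Copies are independent, so all of them must fail.
    return q ** copies
```

Under these assumptions, with `mean_offspring=1.5` a single copy is lost a bit over 40 percent of the time, while ten copies are almost never all lost. The answer is very sensitive to both inputs, which is the point of the paragraph above.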

There were other deficiencies with the modeling, also. Here we've been working on a source-sink model as a possible demographic scenario for Pleistocene humans; that kind of metapopulation dynamic might easily explain allele losses without selection, and becomes more and more credible as we learn the variance of contribution of Neandertal-like alleles across the genome. It's a different world this week than last week.

These are all mathematically tricky answers, clever, but academic unless we have good matches to genome-wide variation. Meanwhile a very simple answer, easy to explain to anyone, lies fallow. Exceedingly curious.

I'd be happy to be proven wrong about the Y chromosome, by the way -- we don't really know that Neandertals didn't have a human-like type, although we do know that today's human population has an exceedingly recent coalescent time. Could be bad estimates of mutation rate. Maybe we'll have more surprises in store.

Mailbag: Holding on to ancient DNA

Hi,

It is often claimed that ancient genes that were once very adaptable are discarded over time by drift, bottlenecks etc. What if an ancient trait were again valuable as climate swings or other environmental opportunities and are now again favorable. My point is that if an organism, especially in a variable climate, that carried this gene would be at a selective advantage if that trait were inherited. The inheritable “trait” being the ability to retain ancient DNA. Also, this trait could be inherited in pieces spread over more than one organism, which are recombined through hybridization with the same results.

The most basic version of this is frequency-dependent polymorphism. Suppose that an allele is useful when rare, and harmful when common. Over the long term, it will never approach fixation, but nor will it become extinct unless the advantages are weak relative to the size of the population.
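A minimal deterministic sketch of that case, assuming a haploid population with no drift and a fitness that falls linearly as the allele becomes common. The function name, the linear fitness rule, and the default values are all illustrative assumptions.

```python
def frequency_dependent_trajectory(p0, equilibrium=0.5, strength=0.5, generations=500):
    """Allele frequency over time under negative frequency dependence.

    Assumed fitness rule: the allele's fitness is 1 + strength * (equilibrium - p),
    so it is favored below `equilibrium` and disfavored above it. The
    alternative allele always has fitness 1.
    """
    p = p0
    trajectory = [p]
    for _ in range(generations):
        w_allele = 1.0 + strength * (equilibrium - p)
        w_other = 1.0
        mean_w = p * w_allele + (1.0 - p) * w_other
        p = p * w_allele / mean_w  # standard haploid selection update
        trajectory.append(p)
    return trajectory

rare_start = frequency_dependent_trajectory(0.01)[-1]
common_start = frequency_dependent_trajectory(0.99)[-1]
```

Whether the allele starts rare or common, this recursion settles at the same intermediate frequency. Drift in a finite population is what would make loss possible when the advantage is weak.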

Now, suppose that the allele is advantageous only some of the time, and otherwise neutral. Now it can drift to fixation. If the times when it is useful are far enough apart, it can drift to loss. But anytime the environment is favorable for the allele, it will get a little boost. The tendency will be toward fixation, biased just to the extent of the strength of selection and duration of the favorable time intervals.

OK, add another element of complexity. The allele is favored during some intervals, and disfavored during others. Motoo Kimura described the dynamics of this scenario; the ultimate fate of the allele depends on the duration of the time intervals, of course, and may lead to an unstable polymorphism, fixation or loss.
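The dependence on interval length can be seen in a deliberately simplified version: a deterministic haploid model with no drift, where the odds p/(1-p) are multiplied by the allele's relative fitness each generation. This is not Kimura's treatment, only a sketch with made-up fitness values and a hypothetical function name.

```python
import math

def allele_frequency_after(p0, fitness_schedule):
    """Frequency after applying one relative fitness value per generation.

    In a haploid model without drift, log-odds of the allele change by
    log(w) each generation, so the long-run fate depends on the geometric
    mean of the fitness values in the schedule.
    """
    log_odds = math.log(p0 / (1.0 - p0))
    for w in fitness_schedule:
        log_odds += math.log(w)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Ten favorable generations alternate with ten unfavorable ones, 20 cycles.
harsh_bad_times = ([1.10] * 10 + [0.90] * 10) * 20   # geometric mean below 1
mild_bad_times = ([1.10] * 10 + [0.95] * 10) * 20    # geometric mean above 1

declining = allele_frequency_after(0.5, harsh_bad_times)
rising = allele_frequency_after(0.5, mild_bad_times)
```

The same 10 percent advantage in good times leads to decline in the first schedule and near-fixation in the second. Changing the lengths of the intervals shifts the balance in the same way.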

You propose a "reserve" mechanism, where the genome holds on to old variants to resurrect them at some later time when they become useful.

Of course, we potentially have such a mechanism now, as we can dig up ancient DNA and experiment with it in vivo. But you suggest that a reserve of ancient genetic material might be adaptive.

I believe the dynamics of such a mechanism would be the same as if the population were merely larger. In that case, drift (and selection against recessives) would be much slower to eliminate alleles that had lost their advantage. So when the environment changed, the population could respond more quickly without waiting for the old variants to reappear by de novo mutations.

Also, a larger population makes it much more likely for mutations to happen.
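The arithmetic behind that last point is short enough to sketch. Assuming a diploid population and a per-copy, per-generation mutation rate for one specific variant (the rate used below is an illustrative value, not an estimate from any study), the chance that the variant arises at least once in a generation is:

```python
import math

def prob_new_mutation(pop_size, mutation_rate):
    """Chance that a specific mutation arises at least once in one generation.

    Assumes a diploid population, so there are 2 * pop_size gene copies,
    each mutating independently with probability `mutation_rate`.
    Computes 1 - (1 - mu) ** (2N) in a numerically stable way.
    """
    gene_copies = 2 * pop_size
    return -math.expm1(gene_copies * math.log1p(-mutation_rate))

small_pop = prob_new_mutation(1_000, 1e-8)
large_pop = prob_new_mutation(100_000, 1e-8)
```

While the probability is small it scales almost linearly with population size, so a hundredfold larger population waits roughly a hundredth as long for the same variant to reappear.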

There's no evidence that a store of ancient genetic variants lies silent in our genomes, but I think you might look at actual gene silencing mechanisms as a parallel to your suggestion. We do retain functional genes within our genomes that we turn off by methylation early in development. The genes either act early in development, are imprinted by maternal or paternal origin, or are turned off in tissues that don't need them. That's a way of maintaining variations for use in some circumstances but not all.

An (old) interview with Warren Ewens

I ran across an interview between Anya Plutynski and population geneticist Warren Ewens.

I cannot say enough about Ewens' book, Mathematical Population Genetics. If you can work through it, you can do population genetics. It doesn't cover every au courant topic, but those will change next week anyway. And it's on Kindle now. Which I suppose probably looks pretty good on the DX, assuming the math displays well -- the book's format is just the right size for it.

Anyway, this interview from 2004 was probably conducted around the time the book was released. It covers pretty much the gamut of his career. I have to select some part to quote for you, so I'll select the passage that would be most likely to come out of my own mouth in my genetics class:

WE: Of course there is a strong possibility that the neutral theory is assumed not because it is appropriate but because the math of that theory is so very simple compared to the math applying for any selective theory.

AP: Can I follow that up? Do you think that that has lead to models of phylogenetic change that is not very well supported by the evidence?

WE: I think that that is quite possible. However, here we enter into another question. In mathematical population genetics theory you know from the very start that you are making big simplifying assumptions. You are in a very different position from a physicist, who might believe that his mathematical models describe reality exactly. No sensible population geneticist would make any claim along those lines. He or she is forced to simplify, because reality is so complicated that you don’t know it in any detail, and even if you did know it and used math describing it faithfully, the analysis would be impossible to carry through. So simplification is unavoidable. I do not know whether the use of the neutral theory is too much of a simplification and has lead us to incorrect and distorted views about the true evolutionary tree, it’s shape and dimensions, but I suspect that there has been quite a significant distortion.

There is much more at the link, some history of association testing, genetic draft, a lot on Ewens sampling theory, and a touch about his work here in Madison.

Dude, it's called "relaxed selection"

This is a doofy story running on MSNBC without an author byline: "Shrinking of Scottish sheep tied to warming". Why do I say "doofy"? Take a look at the way it describes natural selection:

The study upends the belief that natural selection is a dominant feature of evolution, noting that climate can trump that card.

"According to classic evolutionary theory," [study author Tim] Coulson added, the sheep "should have been getting bigger, because larger sheep tend to be more likely to survive and reproduce than smaller ones, and offspring tend to resemble their parents."

Yes, and classic evolutionary theory also says that if you stop killing the small ones, the population average is going to get smaller. Duh. A reduction will happen in a single generation as small individuals remain to become adults who would otherwise have been removed. The reduction may continue for a few generations, either by chance, or by changing the environmental component of variance in size. It can go on for many generations if there is a heritable component to size that confers a disadvantage on the largest individuals. Plausibly, larger individuals take longer to develop, there may be advantages to smaller size in females that are no longer opposed as strongly by antagonistic selection for larger size in males, or any number of other possibilities.
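The single-generation effect needs nothing more than an average. A toy sketch, with invented lamb weights and a hypothetical survival threshold (none of these numbers come from the sheep study):

```python
from statistics import mean

# Hypothetical lamb weights in kg, invented for illustration.
LAMB_SIZES = [18, 20, 22, 24, 26, 28, 30]

def adult_mean_size(lamb_sizes, survival_threshold=None):
    """Mean size of the lambs that survive to adulthood.

    If `survival_threshold` is given, lambs smaller than it die (harsh
    winter). If it is None, every lamb survives (mild winter).
    """
    if survival_threshold is None:
        survivors = list(lamb_sizes)
    else:
        survivors = [s for s in lamb_sizes if s >= survival_threshold]
    return mean(survivors)

harsh = adult_mean_size(LAMB_SIZES, survival_threshold=22)  # small lambs removed
mild = adult_mean_size(LAMB_SIZES)                          # selection relaxed
```

The same cohort of lambs produces a smaller average adult when the small ones stop dying, with no genetic change at all.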

Has climate "trumped natural selection"? No. Cold and consequent food scarcity in this case is one cause of selection (killing small lambs). Possibly, one or more causes of stabilizing selection remain in force (maybe longer development time, but there are other possibilities). Or maybe not. Climate change has caused a change in the pattern of selection, by relaxing selection against small individuals who would otherwise have died from food scarcity.

The way the article describes selection is an old-time fallacy -- "survival of the fittest" is recast as "survival of the strongest", where strongest means "biggest". If the small somehow fail to be eliminated, then natural selection is failing at its work. It's the eugenic fallacy, brought to 21st century climate change. It makes an eye-catching headline -- "Climate Change Overpowers Natural Selection". But it's false.

A more accurate headline would be "Wee Lambs, Once Doomed to Starve, Saved by Climate Change"

I happen to have been reading some of the earlier research on these sheep, so I know that the work is interesting because researchers actually know about the fitness outcomes for individuals across their study duration. The observed fitness outcomes indicate that larger individuals have more offspring within each generation, but the population nevertheless became smaller over time. That comes down to viability of small young individuals and the non-heritable (environmental) component of variance in size, in a fairly complicated way. I'll revisit the paper later to describe the study more fully. I just wanted to point out that this news story gets it totally wrong. Climate change is one of the big causes of the pattern of natural selection; it doesn't magically repeal it.

Elliott Sober's book, The Nature of Selection, discusses the philosophical underpinnings of evolutionary explanation in relation to other sciences. I turn to it once in a while when I need to sharpen a definition, and today ran across this passage (p. 50-51):

The source laws of physical theory have the austere beauty of a desert landscape. Just four types of force are recognized, and some scientists hope to make this list even shorter (Davies 1979). By contrast, the theory of natural selection exhibits the lush foliage of a tropical rain forest. The physical circumstances that can generate fitness differences are many. Perhaps someday these will be regimented and reduced in number. But at present evolutionary theory offers a multiplicity of models suggesting a thousand avenues whereby the morphology, physiology, and behavior of organisms can be related to the environment in such a way that a selection process is set in motion.

Mitochondrial DNA selection review

I was reading through an excellent review of the recent literature about mtDNA and selection, from Damian Dowling and colleagues (2008). The review focuses on the patterning of evidence for selection in ecological and phylogenetic terms, and to some extent upon the function of mtDNA or the mito-nuclear complex of proteins involved in oxidative metabolism. It includes a long passage covering the significant mismatch between mtDNA variation and effective population sizes across animals (but not mammals). A short section discusses the possibility of adaptive polymorphism maintained by mito-nuclear interactions:

Knowing that deleterious mutations in mtDNA can accumulate within populations because of genetic drift [21], there certainly seems to be scope for mito-nuclear co-evolution to proceed via a ‘compensatory’ model. Under this model, deleterious mutations accumulate in the mitochondrial genome, with selection then favouring an adaptive response in the nuclear genome to restore any compromised metabolic function [24]. In effect, mtDNA mutations will act as the drivers of adaptive evolution in nuclear genes. This scenario is not unlikely, given that more than 1000 nuclear-encoded proteins, which are essential for metabolism, are transported into the mitochondrion [25].

Additionally, given that at least some mtDNA polymorphism might have been shaped via positive selection [7] and [8], scope might also exist for mito-nuclear co-evolution to proceed via a model in which adaptive mutations in one genome select for a response in the other.

There has been recent interest in the coinheritance of sex chromosomes and mtDNA. Because the sex-determining chromosome is opposite in birds from mammals, a number of natural experiments may be available to examine the role of coevolution for the mtDNA and co-inherited sex chromosomes. Further, a number of studies have identified a substantial cytoplasmic contribution to fitness and lifespan variance in Drosophila, suggesting that adaptive variation in mtDNA may be segregating within populations.

The review discusses the possible importance of the adaptive perspective for aspects of biology ranging from life history and aging to speciation (where fast-evolving mtDNA genes may induce hybrid incompatibilities). And sperm are a surprising focus of research -- mtDNA mutations affect motility, fertility, and the outcome of sperm competition. On that topic, more later.

References:

Dowling DK, Friberg U, Lindell J. 2008. Evolutionary implications of non-neutral mitochondrial genetic variation. Trends Ecol Evol 23:546-554. doi:10.1016/j.tree.2008.05.011


A new study of genetic introgression and human ancestry

Fed up on hobbit news? Well, I'm going to do my best this week to scoop the science journalists, covering stories in paleoanthropology that ought to get some more attention but might be drowned out by otherwise hobbitrocious stories.

I'll start with a story in which I have a special interest -- a new paper by Jeff Wall, Kirk Lohmueller, and Vincent Plagnol, titled, "Detecting ancient admixture and estimating demographic parameters in multiple human populations."

A couple of years ago, Plagnol and Wall (2006) looked at a sample of genes in the Environmental Genome Project (EGP). At that time, the sample consisted of 135 genes in 12 Yoruba and 22 CEPH individuals. It's not a large sample by today's 3.9-million genotype standards. But the EGP sample has one important thing going for it -- with resequencing data, we have access to a much larger number of mutational differences at very small map distances from each other. Tight linkage between sites means that we can use the genealogical properties of samples to examine much more ancient events. The HapMap gives us a vast number of genotypes from a large sample of individuals, but the density of loci is quite low -- an average of nearly 1000 base pairs between loci. The EGP doesn't sample as many loci, but it gives a denser representation of the variation at each locus. Only this kind of sample is sufficient to test for genetic ancestry of modern human populations in ancient populations of the Middle Pleistocene.

Plagnol and Wall applied a simple admixture model to these data, and found that the complete out-of-Africa replacement model did not adequately explain the variation in the European-derived sample. Instead, they found that a model with 5 percent admixture of some non-African Middle Pleistocene ancestral population was a much better fit for the current diversity of European gene trees. In other words, the low variation of recent humans cannot be explained by small size in a single ancient population; instead, there must have been several populations, partly isolated from each other, one or more of which gave regionally-specific alleles to modern Europeans. Multiregional evolution fits those observations very well -- this is not one or two introgressive genes, and there is no specific evidence of selection on them (although selection may be responsible).

A number of people picked up on that study in the course of later work. Gregory Cochran and I discussed it in our own 2006 paper about genetic introgression. In late 2005, Dan Garrigan and colleagues had published their own analysis of a pseudogene region on the X chromosome, called RRM2P4. Garrigan reviewed this work together with Mike Hammer (2006) and again with Sarah Kingan (2007). Early last year, I also reviewed the evidence together with Cochran, Henry Harpending and Bruce Lahn (2008).

We and many other people are following up on this research, trying to discover the ancestry of human populations beyond the simple out-of-Africa replacement scenario. In the new study, Wall and colleagues extend their analysis to a more recent release of the EGP, including 222 genes, and adding 24 Chinese individuals to the 12 Yoruba and 22 CEPH individuals. It's a simple paper and relatively short. In a word, they find that their data reject the simple out-of-Africa replacement scenario, and that the genetic variation of coding genes in their sample must be explained in part by long-standing population structure.

It's not proof that the Neandertals, or any other particular group of ancient humans, survived and passed their genes on to more recent people. This is a study of the genes of recent human populations, and it merely concludes that their ancestors could not have lived in a single small population. Maybe every Neandertal became extinct, and present-day Europeans got this genetic variation from somewhere else. But it is logical to figure that ancient non-African populations may have been among the contributors to present non-African peoples -- particularly since the statistical test focuses on region-specific gene frequencies. The study also finds evidence that today's African population has a complex ancestry -- a kind of multiregional scenario playing out inside Africa (or potentially involving gene flow back into Africa from elsewhere).

Testing for admixture

Wall and colleagues reasoned that an allele coming in from an ancient, partially isolated human population would vary in a distinctive pattern. Because of the long history of partial isolation in an ancient subpopulation, they expected that such an allele would come in with multiple mutational differences from the non-introgressive allele. And if it came in from some non-African population, it ought to show relatively strong differences in frequency between populations. So they devised a statistic, mathematically combining FST and a linkage measure -- the idea being to detect alleles that differentiate populations and that are surrounded by large sets of tightly linked polymorphisms.
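For readers who haven't worked with FST, here is a minimal sketch of the differentiation half of that idea for a single two-allele site. It is not the statistic from the paper, which also folds in linkage; it assumes two populations of equal weight and uses a hypothetical function name.

```python
def fst_two_populations(p1, p2):
    """Wright's FST for one biallelic site in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    FST = (H_total - H_within) / H_total, where H is the expected
    heterozygosity 2p(1 - p).
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # the site is not variable in either population
    return (h_total - h_within) / h_total

modest = fst_two_populations(0.1, 0.3)   # allele somewhat more common in one group
```

An allele found at 10 percent in one population and 30 percent in the other gives an FST of about 0.06 here. The test described above looks for such differentiated sites that also sit inside blocks of tightly linked polymorphisms.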

This kind of pattern might also occur under positive selection. But a new mutation under positive selection would start out weakly linked to nearby polymorphisms, each of which already exists at some substantial frequency in the population. An introgressive allele might be linked to several other unique mutations that happened during the long period of limited gene flow between ancient populations. And a new mutation would not tend to be surrounded by high FST polymorphisms, until it got to be very common in the population -- up above 50 percent. In contrast, an introgressive allele coming into the population with several nearby mutations would generate a cluster of relatively high FST polymorphisms even at low frequencies. It may not be a perfect test for any individual locus -- there's a lot of uncertainty. But applied to more than 200 loci, it should be possible to test the hypothesis that "archaic admixture" is zero.

Wall and colleagues do test that hypothesis, and they are able to refute it strongly for each of the three groups. Living European and Chinese samples refute the out-of-Africa replacement model with p<0.01. The Yoruba sample refutes the hypothesis of panmixia in ancient Africans at p<0.0000001.

The authors also provide a supplementary table with a list of genes that may be candidates for introgression. I didn't see any really obvious genes on the list, but each of them bears some examination. I expect that we will be able to use more detailed analytical techniques to look at the regions around these genes and see what is going on. Or at least, in the next couple of years more and more resequencing data will become available, allowing us to test the same hypotheses with larger samples.

It's worth pointing out that nothing in the approach of Wall and colleagues implies that any of the putative introgression occurred under natural selection. I've argued that introgression may have occurred under selection in ancient humans, but so far few other people have looked at the question with the idea of ancient selection in mind. No doubt we can improve a bit on the methods in the paper if we are willing to make some assumptions about the evolutionary dynamics involved in Late Pleistocene populations.

Lingering uncertainty

So what's not to like about this study? After all, here we have what appears to be strong evidence against an exclusive out-of-Africa replacement. It suggests that the ancestry of recent Europeans and Asians owes something to the Middle Pleistocene populations of those regions, and gives an estimate of that contribution consistent with what we know so far about the Neandertal genome.

But I have to approach this study as critically as I would any other piece of population genetics. In this case, there is a clear weakness to their model. The authors tested for significance of a single parameter, which they call "archaic admixture." Consider their Figure 1, a schematic of their population model:

Population model schematic from Wall et al. 2009

Is "archaic admixture" significantly different than zero? Well, you can see that must depend on the values of no less than six other parameters. When did the European population start growing significantly -- was it after the Last Glacial Maximum? During the Neolithic? The Aurignacian? How about the African population? Was there really a long bottleneck in the ancestry of Europeans?

The reason why I'm so critical of population models used in genetics is simple. The authors of studies almost never make even the simplest effort to justify these kinds of parameters against the archaeological or fossil record. Their conclusions -- in this case, the significant finding of ancient admixture -- depend on some range of values for these other parameters.

Now, Wall and colleagues take a fundamentally different approach than I would use. I would draw upon non-genetic sources of information about these parameters, to increase confidence about the others. In contrast, they performed a broader range of simulations, attempting to find maximum likelihood estimates for all the parameters simultaneously.

The problem with that approach is that it's hard to say that some other parameters may not have been more important. Consider recent positive selection. As I mentioned above, a recent positively selected mutation could in principle create a pattern like that described for an introgressive allele -- at least under the statistics used in this paper. The chances are low for any randomly chosen mutation under positive selection, because a new positively selected mutation isn't likely to be linked to other rare mutations -- it's much more likely to be linked to common polymorphisms. But if we actually have many hundreds, or even thousands, of recently selected alleles (as we do in humans), then there is a pretty good chance that some of them will look like introgression under the test used here. Another scenario that could mimic introgression under this statistical approach is long-standing balancing selection.

There are probably too many genes on these lists for all of them to reflect selective balances or recent positive selection -- there are a lot of recently selected genes, but few of them will have the specific kind of linkage that would show up as significant in this study. But I think the authors could do more to validate the demographic model against non-genetic evidence. Besides that, there is plenty of morphological evidence for gene flow among these ancient human populations. The authors would be well-served to work more directly with the morphological record of human evolution -- when they write that:

To our knowledge, the question of ancient admixture in other parts of the world has been relatively neglected by the evolutionary genetics community

it is both true and sad. There is abundant anatomical evidence addressing the issue of genetic continuity or gene flow in parts of the world other than Europe.

UPDATE (2009-05-08): Dienekes also looks at the paper, and suggests that finding evidence for ancient population structure in Europe and East Asia may be no big deal, because it may simply derive from population structure within Africa before the putative out-of-Africa migration. I'd have to review the data to be sure, but it seems to me there are two arguments against that explanation:

1. The East Asian and European comparisons come up with different genes showing evidence of putative introgression. There's not a lot of overlap between the sets. If this were merely ancient East African genes, we'd expect the populations outside Africa to have the same ones. And if the numbers had actually been cut down by the serial founder effect scenario (Chinese having undergone more and larger bottlenecks), then we'd expect China to have a subset of the European introgressive genes. I wouldn't go out on a limb about this without looking at the actual frequencies of the supposed ancient alleles, but the pattern isn't consistent with Europe and China being drawn randomly from the same ancient African population.

2. The entire point of the out-of-Africa replacement idea is to draw humans from an unstructured ancient population. Humans have to be inbred to explain the low genetic variation today. A long bottleneck in Africa is one explanation for this inbreeding -- but the bottleneck has to have been severe, down to an effective size around 10,000, and it has to be very long. A long history of population structure within Africa works against that bottleneck -- population structure featuring several partially isolated populations would prevent the kind of inbreeding that a long bottleneck could create. If Wall and colleagues are correct, we would have to scrap the long bottleneck idea and come up with some other explanation for high inbreeding. There are some others, as I've pointed out before.
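To see why the bottleneck has to be both severe and long, a back-of-envelope sketch helps. It uses the textbook result that, in an idealized isolated population of constant effective size and ignoring new mutation, the inbreeding coefficient after t generations is 1 - (1 - 1/(2Ne))^t. The function name and the generation counts below are illustrative assumptions.

```python
def inbreeding_coefficient(effective_size, generations):
    """Inbreeding accumulated by drift alone in an isolated population.

    Idealized assumptions: constant effective size, random mating,
    no migration and no new mutation.
    """
    retained = (1.0 - 1.0 / (2.0 * effective_size)) ** generations
    return 1.0 - retained

short_bottleneck = inbreeding_coefficient(10_000, 1_000)    # about 5 percent
long_bottleneck = inbreeding_coefficient(10_000, 20_000)    # about 63 percent
```

At an effective size of 10,000, a thousand generations barely dents the variation, while it takes on the order of 20,000 generations to lose most of it. Gene flow among partially isolated populations would slow that accumulation further.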

There are other arguments against exclusive continuity outside Africa, and in favor of some significant -- perhaps overwhelming -- gene flow from Africa into the rest of the world during the late Pleistocene. But no other argument is exclusive of some continuity outside Africa. And if we don't need the bottleneck anymore, accepting some continuity is the reasonable explanation for the facts that don't fit, including the observations in this paper and the morphological and archaeological evidence suggesting continuity.

References:

Evans PD, Mekel-Bobrov N, Vallender EJ, Hudson RR, Lahn BT. 2006. Evidence that the adaptive allele of the brain size gene microcephalin introgressed into Homo sapiens from an archaic Homo lineage. Proc Nat Acad Sci doi:10.1073/pnas.0606966103

Garrigan D, Kingan SB. 2007. Archaic human admixture: A view from the genome. Curr Anthropol 48:895-902. doi:10.1086/523014

Garrigan D, Mobasher Z, Severson T, Wilder JA, Hammer MF. 2005. Evidence for archaic Asian ancestry on the human X chromosome. Mol Biol Evol 22:189-192. doi:10.1093/molbev/msi013

Hardy J, Pittman A, Myers A, Gwinn-Hardy K, Fung HC, de Silva R, Hutton M, Duckworth J. 2005. Evidence suggesting that Homo neanderthalensis contributed the H2 MAPT haplotype to Homo sapiens. Biochemical Society Transactions 33:582-585.

Hawks J, Cochran G. 2006. Dynamics of adaptive introgression from archaic to modern humans. PaleoAnthropology 2006:101-115. Open access

Hawks J, Cochran G, Harpending HC, Lahn BT. 2008. A genetic legacy from archaic Homo. Trends Genet doi:10.1016/j.tig.2007.10.003

Plagnol V, Wall JD. 2006. Possible ancestral structure in human populations. PLoS Genet 2:e105. doi:10.1371/journal.pgen.0020105

Wall JD, Lohmueller KE, Plagnol V. 2009. Detecting ancient admixture and estimating demographic parameters in multiple human populations. Mol Biol Evol (early online) doi:10.1093/molbev/msp096

Zietkiewicz E, Yotova V, Gehl D, Wambach T, Arrieta I, Batzer M, Cole DE, Hechtman P, Kaplan F, Modiano D, Moisan JP, Michalski R, Labuda D. 2003. Haplotypes in the dystrophin DNA segment point to a mosaic origin of modern human diversity. Am J Hum Genet 73:994-1015.

"Hundreds of natural selection studies could be wrong"

Happily, though, the study isn't about our method for finding recent selection!

Instead, Masatoshi Nei and colleagues at Penn State have the long knives out for tests of selection based on excess amino acid substitutions:

Nei said that for many years he has suspected that the statistical methods were faulty. "The methods assume that when natural selection occurs the number of nucleotide substitutions that lead to changes in amino acids is significantly higher than the number of nucleotide substitutions that do not result in amino acid changes," he said. "But this assumption may be wrong. Actually, the majority of amino acid substitutions do not lead to functional changes, and the adaptive change of a protein often occurs by a rare amino acid substitution. For this reason, statistical methods may give erroneous conclusions." Nei also believes that the methods are inaccurate when the number of nucleotide substitutions observed is small.

Well, that's not us -- we're studying much more recent events, based on linkage disequilibrium. Hey, the observation that selection was rare through most of human evolution actually strongly supports our observation that the recent rate of selection represents a massive acceleration over the long-term rate.

Still, I'm skeptical about Nei's conclusions. According to the press release, they identify a number of cases in which sites inferred to be under selection are actually not the functional change, because other functional changes have been identified by experiment. That's hardly a general argument that selection has been overcounted in these analyses.

I find that in most counts of selection based on amino acid substitutions, the criteria for counting selection are ridiculously conservative. Often, you see the inference of selection only for cases where the number of amino acid changes actually exceeds the number of silent changes. That's silly -- there's a strong bias against amino acid substitutions because of purifying selection. Only in repeated instances of positive selection are you ever going to see more amino acid substitutions than silent ones.

Meanwhile, the press release mis-states some research into human-chimpanzee genetic differences:

"These statistical methods have led many scientists to believe that natural selection acted on many more genes in humans than it did in chimpanzees, and they conclude that this is the reason why humans have developed large brains and other morphological differences," said Nei. "But I believe that these scientists are wrong. The number of genes that have undergone selection should be nearly the same in humans and chimps. The differences that make us human are more likely due to mutations that were favorable to us in the particular environment into which we moved, and these mutations then accumulated through time."

In fact, Margaret Bakewell and colleagues (2007) in the same journal showed that chimpanzees have more selected amino acid substitutions than humans. Nei's got it completely backward.

Now, I think Bakewell and colleagues might be wrong. The chimpanzee genome draft had many more sequencing artifacts at that time than the human genome, and these might account for the apparent excess in chimpanzees. But it's simply not true that researchers have shown "many more genes" under selection in humans than chimpanzees.

Well, except for us, referring to very recent human evolution. But in that case, as Nei notes, we're talking about "mutations that were favorable to us in the particular environment into which we moved." It's the massive environmental and demographic changes of the last 50,000 years that have made the difference. For most of the six million years before that, human genetic evolution seems to have gone at almost the same rate as in chimpanzees.

(via Gene Expression)

African origins and phenotypic variance

I just read the new paper by Philipp Gunz and colleagues, titled, "Early modern human diversity suggests subdivided population structure and a complex out-of-Africa scenario". That's a mouthful.

The late Middle Pleistocene population of Africa was genetically variable, and that genetic variability is probably the biggest component of genetic variation still remaining in living humans. Moreover, the phenotypic variability of the Levantine sample has been recognized since its initial description by McCown and Keith (1939). So to read this is not surprising:

Seemingly ancient contributions to the modern human gene pool (36) have been explained by admixture with archaic forms of Homo, e.g., Neanderthals. Although we cannot rule out such admixture (37), the clear morphological distinction between AMH and archaic forms of Homo in the light of the proposed ancestral population structure of early AMH to us suggests another underestimated possibility: the genetic exchange between subdivided populations of early AMH as a potential source for "ancient" contributions to the modern human gene pool (9, 36).

I've stressed the importance of African population structure before (e.g., Hawks et al. 2008). So I agree completely with this part of the interpretation in the paper: African variation was larger than in other regions, and it was important.

But that being said, these morphometric comparisons are not very straightforward. Some comments:

1. Phenotypic variance is not a measure of genetic variance. If we see a population that has a large measure of phenotypic variability, it does not mean that the population had much genetic variability. Perversely, genetic variability can sometimes be lower in a population that has greater phenotypic variance -- often because genetic drift can cause a loss of epistases that once constrained the phenotype. In some cases environmental variance may actually increase when the additive genetic variance declines, because of a loss of developmental robusticity. In any event, we can't just go from a variable phenotype and infer that there's variation in genotypes.

2. There's no evidence for subdivision here. They measure a high phenotypic variance within the sample they refer to as early modern humans. But that variance is expressed not mainly between geographic locations in the sample, but within them. Qafzeh 6 and 9 are far apart; Jebel Irhoud 2 and Skhul 5 are close together. The East African fossils Omo 2 and LH 18 are far apart. This isn't subdivision, it's just high within-population variance.

3. Weird sample composition. The early modern human sample includes the African and Levantine crania complete enough for analysis. But why lump these? Why is the South African Fish Hoek skull lumped with Upper Paleolithic Europeans?

4. Temporal range. There are two samples here that have a high average distance between nearest neighbors in the sample: "archaic" humans and early modern ones. What these two samples have in common is that they each cover a much larger range of time than the other samples. The early modern sample spans more than 100,000 years by current dates. That's more than 80,000 years longer than the Upper Paleolithic sample, 50,000 years longer than the Neandertal sample -- a huge component of variance that is uncontrolled in the other samples.

5. Principal components. PC axes are those that account for the largest covariances in the sample. If two samples are lumped together, there is a within-population component of variance and a between-population component. These may be partly independent in their effects on the total variance, or they may not be. In any event, if we derive the PC structure from the total sample, or even from the individual samples pooled together, the larger samples will weight the PC structure more toward the factors that explain their within-sample covariances. In this case, we have many more recent humans than fossil ones, and many more archaic humans and Neandertals than "early modern" humans. It's hard to have an intuitive idea about the biases that can result from sample composition, and that's a big reason for caution.

Those are all reasons for re-examining the results in different ways. In particular, if I were doing this kind of analysis, I would repeat it for subsets of the cranium, where I could include a larger number of fragmentary fossils. If the African-Levantine sample is really unusually variable, that should hold up strongly when we examine parts as well as the whole cranium.

Well, although I listed several reasons for caution, we can ask how to interpret the study's conclusion:

Any model consistent with our data requires a more dynamic scenario and a more complex population structure than the one implied by the classic Out-of-Africa model.

If we take the high variance of their "early modern" sample at face value, what we have to conclude is that later humans evolved substantially less phenotypic variance than African-West Asian people who lived between 200,000 and 90,000 years ago. Genetics tells us that there was no massive genetic drift during the time span after 90,000 years ago within Africa. Thus we must conclude that some other force resulted in a significant restriction of the phenotypic variation of recent humans, including people who lived as long as 40,000 years ago.

My hypothesis would be natural selection on some significant subset of phenotypic characters, which reduced the phenotypic variance of most of the cranium by pleiotropy. An out-of-Africa migration is not sufficient to explain the reduction in variance, because all modern humans are limited in phenotypic variance, not only non-Africans. Selection on some significant set of genes would help to explain why the ancestral African population predominated within the last 100,000 years. This selection would have predated most of the recent acceleration we observe in the genomic variation of current populations -- indeed, whatever set of genes was strongly selected before 50,000 years ago might have been fixed long ago.

A wave of selection can promote dispersal and demographic growth without the necessity of complete population replacement (cf. Eswaran 2002). A substantial transition in the genetic background would alter the phenotypic effects of any genes that remained in non-Africans from their local ancestors. In other words, the answer about what happened to fossil humans outside of Africa depends on the kind of events that happened inside Africa. So from that perspective, this research is very interesting.

References:

Gunz P, Bookstein FL, Mitteroecker P, Stadlmayr A, Seidler H, Weber GW. 2009. Early modern human diversity suggests subdivided population structure and a complex out-of-Africa scenario. Proc Nat Acad Sci USA (early online) doi:10.1073/pnas.0909160106

Eswaran V. 2002. A diffusion wave out of Africa: the mechanism of the modern human revolution? Curr Anthropol 43:749-774.

McCown TD, Keith A. 1939. The Stone Age Man of Mount Carmel: The fossil human remains from the Levalloiso-Mousterian. Clarendon Press, Oxford.

mtDNA selection in Iceland?

Leave it to me to have readers unwilling to ignore selection in recent populations! Here's an e-mail:

Why couldn't the Icelandic genetic changes have been the result of selection that favored some mtDNA lineages rather than others? We know the population of Iceland derived from settlers that were transplanted into a relatively alien climate and ecology, and had to adjust agriculture and subsistence activity to survive there. We know that there were dramatic environmental insults to the population: disease, starvation, eruptions. At least some of these insults would have likely been more severe than the ancestral populations would have encountered, whether they were Scandinavian or "Celtic".

So why isn't there at least a token mention of selection, either by you or by the authors? Is "genetic drift" that much more likely than selection? Is selection a more academically risky proposition than the comforting mathematics of "drift"?

Could genetic drift really break your heart?

Are these people crazy?

The combination of such a large risk with such a high frequency is, fortunately, unique. "How can such a harmful mutation be so common?" asks Chris Tyler-Smith from The Wellcome Trust Sanger Institute, Hinxton, UK. "We might expect such a deleterious change to have 'died out'.

"We think that the mutation arose around 30,000 years ago in India, and has been able to spread because its effects usually develop only after people have had their children. A case of chance genetic drift: simply terribly bad luck for the carriers."

This is a 25-bp deletion in a muscle protein gene, MYBPC3. The current allele frequency in India is estimated to be 4 percent; it is estimated to be carried by 60 million people. The paper suggests that it originated 30,000 years ago. Carriers of the deletion have a massive increase in their chance of cardiomyopathy.

Here's the relevant passage from the paper:

The presence of a disease-associated variant at substantial frequency raises an evolutionary question: if it is disadvantageous, how did it become so common? In principle, it could be evolutionarily neutral, manifesting its disadvantages only late in life; alternatively, its disadvantages could be outweighed by advantages early in life, or in a different environment, so that it could have been positively selected. To address this question, we examined the haplotype structure surrounding the deletion. Using five short tandem repeat (STR) markers, spanning ca. 3.4 Mb surrounding the deletion in 287 heterozygous individuals, we found similar high degrees of variation in the inferred haplotypes from chromosomes with and without the deletion (Supplementary Fig. 7 and Supplementary Table 6 online). We then used allele-specific amplification to resequence ca. 10-kb haplotypes centered on the 25-bp deletion from nine heterozygous individuals (Supplementary Tables 7 and 8 online). The chromosomes carrying the 25-bp deletion showed five closely related haplotypes (Supplementary Fig. 8 online). After excluding variants likely to have arisen by recombination, we estimated a time to most recent common ancestry (TMRCA) of ca. 33 ± 23 thousand years for the deletion haplotypes (Supplementary Methods). This time slightly postdates the initial peopling of the subcontinent 30,000–50,000 years ago and together with its restricted geographical distribution suggests that the deletion did not arrive with the first modern human settlers from Africa [more than] 50,000 years ago, but arose subsequently within the subcontinent. Its occurrence in two populations from Southeast Asia can be explained by recent gene flow from India (Supplementary Note online). Collectively, these observations provide no evidence for rapid spread of a recent founder haplotype or any departure from neutral evolution (Dhandapany et al. 2009:4).

The issue is not really whether a gene could go from 1 copy to 4 percent in 1200 generations by chance. That wouldn't be so terribly unlikely in Pleistocene humans -- in fact, the mean time for a mutation to go from 1 copy to 4 percent by drift in a population of effective size 10,000 individuals is not 30,000 years, but only around 20,000 years. On the other hand, mtDNA variation today suggests that South Asia experienced early and rapid population growth -- so we're not likely talking about a population of 10,000, but more like a minimum of 100,000 effective individuals through the past 30,000 years at least. It would take genetic drift at least 10 times longer to accomplish the requisite frequency change given that demographic history. Still, a single allele at a single gene locus might be exceptional.

But that scenario, however unlikely, is simply not the situation we have here. Here we have a deletion that must have some disadvantage, because it gives people a fatal disease. This disadvantage is apparently dominant in effect, based on the case-control study. Yet the deletion has managed to persist within the large South Asian populations of the last 10,000 years so that today it is still around 4 percent.

People mainly die of cardiac problems after age 40. But human reproductive lives aren't over until they're done investing in their children. Further, a weakened heart may reduce work potential or health even if it kills slowly. The fitness cost of this deletion is smaller than if it gave people a chance at a fatal disease when they are 17, but a smaller fitness cost is still a fitness cost. In a large population, that small fitness cost is going to whittle away the frequency of the allele over time.

A thousand generations is a lot of potential whittling. Using some quick calculations, it looks like selection against the deletion as low as 0.001 to 0.0015 in heterozygotes should have been enough to cut the frequency down to around 1 percent, from an initial value of 4 percent. So even if drift increased the deletion early after its origin, it ought to be much rarer today. Meanwhile, drift looks even more unlikely, since the chances of a mutation growing from 1 copy to 4 percent against such selection are nil.
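For anyone who wants to check the whittling, here is a minimal sketch of that kind of quick calculation. It is not the exact calculation behind the numbers above. It assumes a purely deterministic model (no drift), random mating, and a dominant cost, so that carriers of one or two copies both have fitness 1 - s. The function names are mine, chosen for illustration.

```python
def next_freq(q, s):
    """One generation of selection against a dominant deleterious allele.

    Assumed fitnesses (illustrative, not taken from the paper):
    non-carriers 1, heterozygotes 1 - s, homozygotes 1 - s.
    """
    p = 1.0 - q
    # Mean fitness under random mating (Hardy-Weinberg proportions).
    mean_fitness = p * p + (2.0 * p * q + q * q) * (1.0 - s)
    # Every copy of the deletion sits in a carrier with fitness 1 - s.
    return q * (1.0 - s) / mean_fitness


def freq_after(q0, s, generations):
    """Allele frequency after the given number of generations."""
    q = q0
    for _ in range(generations):
        q = next_freq(q, s)
    return q


if __name__ == "__main__":
    for s in (0.001, 0.0015):
        q = freq_after(0.04, s, 1000)
        print(f"s = {s}: frequency after 1000 generations = {q:.4f}")
```

Under these assumptions, starting at 4 percent, the frequency after 1000 generations lands between roughly 1 and 1.5 percent for s between 0.0015 and 0.001. A different dominance assumption or generation count would shift the numbers somewhat.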

Did this deletion have a fitness cost as high as one in a thousand? It increases cardiomyopathy by 5-fold or more compared to the wild type. So it seems very plausible. But really, we don't have any good estimates of the fitness costs of chronic diseases in pre-industrial populations.

If the deletion was favored by some selection, that would probably be antagonistic, that is, acting against the fitness cost of the deletion late in life. The authors briefly investigated this hypothesis, as described above. They found no evidence for a recent expansion of a single haplotype around the deletion. That means that if there was strong selection favoring this deletion, it must have happened early after its origin and then petered out. If the expansion had been late in South Asian history, it would show more LD around it, and most of the deletion-carrying chromosomes would share a single long-range haplotype. So this deletion has not been increasing rapidly in the past few thousand years.

I would hypothesize that the disadvantages of the deletion have actually increased over time. The average lifespan increased into the Upper Paleolithic and probably later as well. Meanwhile, as the population grew, larger completed family sizes became more important to fitness. As people became more sedentary, the accumulation and inheritance of possessions and land became an important means of investing in children. The increasing importance of later survival and investment in children should have raised the fitness cost of chronic disease. That would explain a pattern of evolution in which this deletion increased in frequency early in its history, but later remained static or declined.

So, I don't suppose I can say people are crazy for thinking genetic drift could explain this deletion's current high frequency. But considering the powerful effect of weak selection over the many generations involved here, and the very large size of the South Asian population during most of that time, genetic drift seems pretty unlikely.

References:

Dhandapany PS and 23 others. 2009. A common MYBPC3 (cardiac myosin binding protein C) variant associated with cardiomyopathies in South Asia. Nat Genet (online early) doi:10.1038/ng.309

Reading through P. A. P. Moran's book, The Statistical Processes of Evolutionary Theory, I found this passage (p. 12):

It should be pointed out that the above stochastic models [of density dependence] usually result in there being a non-zero probability that the population will die out altogether. In genetic problems this is an unmitigated nuisance. In population genetics we are concerned with the variation and distribution of gene frequencies and it is very difficult to make stochastic models in which both the gene frequency and the population size are random variables (see Feller (1951), p. 242 for a beginning in this direction, in which, however, the population may die out). Many genetic phenomena do depend on the population size and the models we shall consider later nearly all assume that this size is held constant. It is true that if we have, for example, a situation in which a new mutant gene takes over the whole population by reason of some selective advantage, the total population size, which is held in check by density-dependent forces, can usually be expected to increase somewhat, or at any rate to change slightly, but this is not likely to have an important effect.

That's interesting for several reasons. Recently I've been investigating the connections between selection and demographic growth. In humans, there are a number of recently selected genes whose advantage comes from relaxing density dependence (that is, increasing carrying capacity), for example by allowing greater resource extraction from the environment. In those cases, the effect of a selective fixation on population size will not be negligible. Examples of that kind may not be rare in nature, although in many instances selection may increase population size only to result in added pressure to various prey species, which then reduce the carrying capacity.

Another reason why this is interesting is that it reveals a fairly unusual way of thinking about selection. From one point of view selection is just a condition of the demography of alleles. In particular, both selection and genetic drift (and for that matter, mutation) are described by the same equations that describe demography. Under genetic drift, these allelic demographies are in all cases of similar form to the demography of the population in which those alleles are embedded. Selection, on the other hand, is notable for showing the demography of alleles to be inconsistent with the demography of the population. The most commonly considered case is where one allele increases while the population remains the same size. But balancing selection, for example, can be reduced to density-dependence on an allele's frequency.

One of the easiest ways for selection to set itself apart from stochastic changes in populations is to be deterministic. But the results of selection are nonetheless stochastic, and it is good to be reminded.

Darwin, in The Variation of Animals and Plants Under Domestication, volume 2, pp. 248-249.

Throughout this chapter and elsewhere I have spoken of selection as the paramount power, yet its action absolutely depends on what we in our ignorance call spontaneous or accidental variability. Let an architect be compelled to build an edifice with uncut stones, fallen from a precipice. The shape of each fragment may be called accidental; yet the shape of each has been determined by the force of gravity, the nature of the rock, and the slope of the precipice,—events and circumstances, all of which depend on natural laws; but there is no relation between these laws and the purpose for which each fragment is used by the builder. In the same manner the variations of each creature are determined by fixed and immutable laws; but these bear no relation to the living structure which is slowly built up through the power of selection, whether this be natural or artificial selection.

If our architect succeeded in rearing a noble edifice, using the rough wedge-shaped fragments for the arches, the longer stones for the lintels, and so forth, we should admire his skill even in a higher degree than if he had used stones shaped for the purpose. So it is with selection, whether applied by man or by nature; for though variability is indispensably necessary, yet, when we look at some highly complex and excellently adapted organism, variability sinks to a quite subordinate position in importance in comparison with selection, in the same manner as the shape of each fragment used by our supposed architect is unimportant in comparison with his skill.

The unbearable hotness of Neandertals

According to the Telegraph (UK), Neandertals became extinct because their mitochondria leaked excess heat:

Professor Patrick Chinnery, a neurogeneticist at Newcastle University, believes the differences in this mitochondrial DNA could have caused Neanderthals to be inefficient at producing energy, meaning their cells leaked heat.

He said: "The question is why did Neanderthals disappear? There are lots of explanations to do with changes in climate and the food supply.

"Differences in these mitochondrial DNA sequences might explain why modern humans were able to survive while Neanderthals were not.

So, is it true? Did Neandertals go panting into that long good night?

Well, Siberians may have mtDNA alleles that leak extra heat, and they're not extinct. It seems like a good idea if you don't live in the tropics and have enough food. Because it's not like Europeans lack opportunities to take off a layer to deal with the heat.

Plus, as the Neandertal morphology waned in Europe, the climate was getting colder, not warmer.

The real mistake here is assuming that the mtDNA necessarily shared the same fate as the rest of the genome. Sure, there aren't any living members of the Neandertal mtDNA clade, at least that we know of. But that suggests selection favoring human mtDNA, not necessarily Neandertal extinction. The idea of selection is supported by the finding of functional variations between human and Neandertal COX2, which I discussed in August.

The current research seems like it probably adds detail to this comparison, but that's not an argument for Neandertals going extinct in the heat.

Information theory and mutual information between genetic loci

This is the second in a series on information theory and tests for recent selection. The first entry, "Information theory: a short introduction," reviewed the basic concepts of information measures and their background.

The International HapMap is a massive project to determine the genotypes for up to 3 million single nucleotide polymorphisms (SNPs) in people from 11 population samples around the world. The current data release (Phase 3) includes genotypes for a subset of over 1.5 million SNPs in 1,115 people. The 11 population samples include people of African ancestry from the US Southwest, Utah residents of Northern and Western European ancestry, Han Chinese from Beijing, people of Chinese ancestry from Denver, people in the Houston Gujarati Indian community, Japanese people from Tokyo, Luhya and Maasai people from Kenya, people of Mexican ancestry from Los Angeles, Italians in Tuscany, and Yoruba from Ibadan, Nigeria.

As impressive as this effort is, we may wonder why exactly SNP genotyping of so many people is a valuable enterprise in itself. The project’s homepage includes this short statement:

The goal of the International HapMap Project is to compare the genetic sequences of different individuals to identify chromosomal regions where genetic variants are shared. By making this information freely available, the Project will help biomedical researchers find genes involved in disease and responses to therapeutic drugs.

There are theoretical and practical objections to this simple explanation (as I discussed here last month). However, what no one involved with the project seems to have expected is the extent to which the data would demonstrate the importance of recent adaptive evolution in human populations.

Here, I am describing some of the ways that we can test hypotheses about natural selection by using the SNP genotypes from the HapMap. This is a theory-centric description, with some digression into practical aspects of handling the genotype data. First, I consider how we might use information theoretic concepts to test the hypothesis of independence between two genetic loci.
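As a rough illustration of what a test of independence between two loci could look like in information-theoretic terms, here is a minimal sketch that computes the mutual information between two loci from a list of two-locus haplotypes. It assumes phased haplotypes given as pairs of allele labels; HapMap genotypes would need phasing first, and the function name and toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(haplotypes):
    """Mutual information (in bits) between two loci.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2)
    pairs, one per sampled chromosome. Assumes phased data.
    """
    n = len(haplotypes)
    joint = Counter(haplotypes)
    locus1 = Counter(a for a, _ in haplotypes)
    locus2 = Counter(b for _, b in haplotypes)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # Ratio p(a,b) / (p(a) p(b)), written in terms of raw counts.
        ratio = count * n / (locus1[a] * locus2[b])
        mi += p_ab * math.log2(ratio)
    return mi


if __name__ == "__main__":
    linked = [("A", "G")] * 50 + [("a", "g")] * 50
    independent = [(x, y) for x in "Aa" for y in "Gg"] * 25
    print(mutual_information(linked))       # complete association
    print(mutual_information(independent))  # no association
```

Mutual information is zero when the two loci are independent in the sample and grows with the strength of their association, which is why it can stand in for a measure of linkage disequilibrium. Deciding whether an observed value is larger than expected by chance needs a separate significance test.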

Positive selection and the tip of the iceberg

Razib points to a new paper by Johansson and Gyllensten, in which they develop a comparison of FST and haplotype block length as a test of positive selection. The paper is interesting enough (and open access) -- they give a list of a few variants that are likely selected in different populations.

What I wanted to point to was this figure:

Selection versus drift, Johansson and Gyllensten 2008

That pretty much encapsulates the problem of detecting recent positive selection, with current methods. The distribution of selected alleles looks significantly different from the distribution of neutral alleles, but there is a tremendous overlap. That overlap is a particular problem when it comes to choosing an arbitrary cutoff between the two distributions.

Imagine if you had a sample of men and women, and you chose an arbitrary cutoff of stature to distinguish them. Say, everyone over 5 foot 7 is a man. Well, that will do better than chance, but you've included a lot of women in your sample of men, and vice versa. Now, suppose you thought that men were inherently rare compared to women. Say, 100 women for every man. A cutoff of 5 foot 7 inches is going to include many more false positives (i.e., tall women) than genuine men. So you choose a very conservative cutoff, one that is not likely to include very many women. Maybe 6 foot 5. The people you see who are over 6 foot 5 are extremely likely to be men -- not certainly, you still will catch some very tall women -- but quite likely men. But you've excluded 95 percent of the men to do this.
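The stature analogy is easy to put numbers on. This is a minimal sketch, assuming stature is normally distributed in each sex with made-up parameters (men: mean 69.5 inches, SD 3; women: mean 64 inches, SD 2.75). Those values are illustrative assumptions, so the percentages will not match the round numbers in the analogy exactly.

```python
import math


def fraction_above(cutoff, mean, sd):
    """Fraction of a normal distribution lying above `cutoff`."""
    z = (cutoff - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def cutoff_summary(cutoff, women_per_man=100.0,
                   men=(69.5, 3.0), women=(64.0, 2.75)):
    """Return (share of people above the cutoff who are men,
    share of all men who fall below the cutoff).

    The (mean, sd) pairs in inches are illustrative assumptions.
    """
    men_above = fraction_above(cutoff, *men)
    women_above = women_per_man * fraction_above(cutoff, *women)
    share_men = men_above / (men_above + women_above)
    men_missed = 1.0 - men_above
    return share_men, men_missed


if __name__ == "__main__":
    for label, inches in (("5 foot 7", 67.0), ("6 foot 5", 77.0)):
        share_men, men_missed = cutoff_summary(inches)
        print(f"{label}: {share_men:.1%} of those above are men, "
              f"{men_missed:.1%} of men are missed")
```

With these assumed parameters, the lenient cutoff catches most men but swamps them with tall women, while the stringent cutoff yields an almost pure sample of men at the price of missing nearly all of them.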

That's the situation we are in with respect to detecting selection. There is an enormous set of false negatives -- truly selected alleles that are indistinguishable by means of an arbitrary cutoff from neutral alleles. In the figure above, Johansson and Gyllensten suppose that each ascertained variant (at s=0.01) represents almost 5 in the population. So far, few have made much of the point that a small number of selected alleles under a very stringent cutoff must correspond to a large number that don't make the cutoff (our 2007 paper being an exception). The issue is not only ascertainment; it is the shape of the non-ascertained distribution.

Still, one may hope to do better at identifying selected alleles. So far, people have been hacking at these statistical distributions with a hatchet. We need a scalpel.

References:

Johansson A, Gyllensten U. 2008. Identification of local selective sweeps in human populations since the exodus from Africa. Hereditas 145:126-137. doi:10.1111/j.2008.0018-0661.02054.x

Evo-devo and HACNS1

Science has a very important paper in the current issue about the evolution of a gene enhancer in hominids, expressed in forelimb development and concentrated toward the first digit. The enhancer is a conserved sequence named HACNS1; it exhibits a stronger signature of recurrent selection on the human lineage than any other conserved enhancer sequence. In transgenic mice, the human version of this enhancer triggers gene expression in the forelimb, concentrated toward the thumb side, and some other parts of the body, notably the pharyngeal arches (which give rise to elements of mouth, throat and larynx), eye and ear. The research is by Shyam Prabhakar and others at Lawrence Berkeley National Lab, and involves Edward Rubin and James Noonan, otherwise prominent in the Neandertal genome sequencing.

I think this is an extraordinarily important result. You don't see me write those words very often. This is a paper that every biological anthropologist should read. It gives an extremely good example of the importance of developmental regulation to human evolution. We will see many more papers like this one in the coming years. This is one of the genes that makes us human.

Ed Yong of Not Exactly Rocket Science has written a nice online review of the research, and Science has accompanied it with a perspective piece by Gregory Wray and Courtney Babbitt. Here's a quote from that article:

To test the function of this region, they genetically engineered mouse embryos to express a construct composed of human HACNS1, the promoter element of a heat shock gene, and a reporter gene. Their results show that human HACNS1 drives expression in the mesenchyme of the early developing forelimb, and later developing hindlimb, in these mouse embryos. A comparison of expression patterns driven by macaque, chimpanzee, and human orthologs of HACNS1 revealed that consistently strong forelimb expression is a unique property of the human version. By testing various combinations of human and chimpanzee HACNS1 sequences, the authors narrowed down the relevant functional mutations to an 81-base pair region containing 13 substitutions that arose during human evolution. This concentration of substitutions is highly unusual relative to the genome as a whole, implying positive selection on this region during human origins.

The press are going with the story that the evolution of this gene may underlie the unique evolution of human manual dexterity. It's a good hypothesis, but I think there is a more accurate way of putting the situation. We see that the enhancer has effects in different areas of the developing embryo. Its action is therefore pleiotropic: changing its function in one area might well screw up its action somewhere else. So at the very least, this is an enhancer that must satisfy multiple constraints. Strong evolutionary change in its sequence may reflect changes in one of those functions, or more than one. Either way, it implies not only that the hominid developmental program satisfies different fitness constraints than in the human-chimpanzee common ancestor, but also that meeting those constraints required repeated substitutions.

We don't know how long it would have taken all these nucleotide substitutions to happen. But we might find signs in the fossil record of such a sequence of events, if we had enough bones, and if we had more information about the effects of different forms of the gene on the adult phenotype. For example, the relatively long thumbs of the Hadar hominids (compared to chimpanzees and gorillas) suggest that the sequence of changes started early in hominid evolution. There's a hypothesis.

But like I said, I wouldn't rule out other possible functions of the enhancer as targets for selection. It is plausible (as a hypothesis) that the enhancer with the most selected substitutions on the human lineage might be more likely than others to have been selected for multiple functions. And we have plenty of reasons to suspect selection on its other targets, particularly the developing mouth, throat and ear.

It may even be that the evolution of human thumbs was a side effect of evolution in the throat, or vice versa. That's the kind of weird world evo-devo makes for us!

References:

Prabhakar S and 9 others. 2008. Human-specific gain of function in a developmental enhancer. Science 321:1346-1350. doi:10.1126/science.1159974

Wray GA, Babbitt CC. 2008. Enhancing gene regulation. Science 321:1300-1301. doi:10.1126/science.1163568

Sample sizes and the "Neandertal haplogroup"

I have an excellent e-mail question about last week’s Neandertal mtDNA paper, which has provoked a lot of commentary.

I just skimmed over your comments on the recent paper and I have a couple questions. First, how many Neanderthals did they receive mitochondrial DNA from? I think I read somewhere that it was fewer than ten.

Second if that is true, what the hell does it mean? I wouldn’t try and predict anything based on even fifty humans from that long ago much less 8 or 9 in genetic terms. I don’t think that anyone else would either unless they are grandstanding. You can’t prove a negative so they really can’t say that no modern humans have any Neanderthal DNA. Did they study Neanderthals from Asia? I just think they don’t have a good enough sample and until we can resequence a Neanderthal nucleus and bring the little tyke to term and wait for him or her to marry then wait for those kids to have kids will we really be sure we’ve got the goods.

Krause et al. (2007) list 15 Neandertal partial mtDNA sequences. Ten of these at that time presented relatively long portions, including the central Asian Okladnikov and Teshik Tash specimens, Mezmaiskaya, Feldhofer 1 and 2, Vindija 75 and 80, Scladina, Monte Lessini, and El Sidrón 1252. The same paper lists five additional specimens for which only a very short sequence had been recovered (just enough to diagnose as part of the Neandertal clade), including Vindija 77, El Sidrón 441, Engis 2, Rochers de Villeneuve, and La Chapelle-aux-Saints.

We do not know that every Neandertal belonged to the same mtDNA clade as those 15 sequences. Some of them may have looked different, possibly including the new clade otherwise present in later Upper Paleolithic and living people. But based on the 15 sequences we have, we can say that a large fraction of Neandertals must have carried the “Neandertal haplogroup.” Exactly how large a fraction depends on what we are willing to believe about contamination, preservation, and the randomness of our sample.

Now, let’s consider the question: Can we predict anything about Neandertal evolution and relationships based on this small, possibly unrepresentative sample of mtDNA?

The answer is that it doesn’t matter very much whether we have 5 sequences or 500. If 15 out of 15 specimens from different sites across Europe preserve a single mtDNA haplogroup, we can’t say it was universal, but we can say it was common. If 40 out of 50, or 400 out of 500 specimens had the same haplogroup, that would increase the precision, but not change the basic fact: Neandertals had at least one common haplogroup that is now so rare it has never been found in a sample of 100,000 or more people. We deserve some explanation.

The possible explanations are:

  1. Random genetic drift
  2. Accelerated genetic drift due to demographic turnover
  3. Population extinction and replacement
  4. Natural selection


Drift

Random genetic drift is fairly easy to refute, although it might not appear so at first. In favor of drift: There were few Neandertals, and the population size of the succeeding Upper Paleolithic, up through the Last Glacial Maximum, was also small. The best estimates are on the order of 2000 people for Western Europe and 5000 for continental Europe to the Urals (Bocquet-Appel et al. 2005). There would have been perhaps twice or more that number across the entire Neandertal range. The effective population size represented by this population would have been smaller: perhaps 3000–5000 for Neandertals and Aurignacian-era people, of whom only half, or around 2000, would have been females. Genetic drift in this small mtDNA population would have been much stronger than for autosomal genes, and very much stronger than in most recent human populations.

But when we plug these numbers into a model of random genetic drift, it starts to appear very unlikely that drift alone could explain the observations. Let’s assume (falsely) that our Neandertal genetic samples all dated to 40,000 years ago, that the female effective size was 2000 individuals between then and 15,000 years ago, and that the population of Neandertal country was a random mating pool. Following these assumptions, on average all the mtDNA genomes at 15,000 years ago would descend from only 4 or 5 ancestral copies in the population 40,000 years ago. If these four or five ancestral copies were, by chance, a different haplogroup from the 15 copies we’ve already found, then drift could explain the data.
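The "4 or 5 ancestral copies" figure can be checked against a standard coalescent expectation. The sketch below is an illustration, not the original calculation. It assumes a haploid, random-mating population of constant female effective size 2000 over the 25,000-year span. The generation time is an assumption (the text does not give one), so three values are tried.

```python
import math

def expected_ancestors(n, t):
    """Expected number of ancestral lineages of n sampled copies,
    t time units in the past, with time measured in units of N_f
    generations (haploid coalescent, constant size, random mating)."""
    total = 0.0
    ratio = 1.0  # falling factorial of n over rising factorial of n, order k
    for k in range(1, n + 1):
        if k > 1:
            ratio *= (n - k + 1) / (n + k - 1)
        total += (2 * k - 1) * ratio * math.exp(-k * (k - 1) * t / 2)
    return total

N_F = 2000                 # female effective size, from the text
YEARS = 40_000 - 15_000    # time span, from the text
for gen_time in (20, 25, 30):   # ASSUMPTION: years per generation
    t = (YEARS / gen_time) / N_F
    print(gen_time, round(expected_ancestors(N_F, t), 2))
```

With generation times in this range the expected number of ancestral copies comes out between roughly 3.5 and 5, which is consistent with the figure quoted above.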

However, this still doesn’t appear very likely. So far, every one of the Neandertals shares a single haplogroup. The frequency of this haplogroup was apparently very high, making it very unlikely that all five ancestral copies would have belonged to some other haplogroups of which we have never found any trace.

Notice that this argument does not depend very much on the number of Neandertal mtDNA sequences that we have found. The fact that there are 15 helps to constrain the frequency of the haplogroup within the population 40,000 years ago, in our model. That frequency is unlikely to be less than around 85%, assuming random sampling. But suppose there were only five. We would still know that the Neandertal haplogroup was very common in its population, even if we thought it was only 50%. It would still be unlikely to draw four or five ancestral copies and have all of them be some other haplogroup that we haven’t found.
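The arithmetic behind this argument is short enough to sketch. The code assumes random sampling, as the text does. The 10% cutoff is an assumption chosen for illustration, and it happens to give a bound close to the 85% mentioned above.

```python
def min_frequency(n_observed, alpha):
    """Smallest haplogroup frequency p for which finding the haplogroup
    in all n_observed random draws still has probability >= alpha,
    i.e. the solution of p ** n_observed == alpha."""
    return alpha ** (1.0 / n_observed)

def prob_all_other(freq, n_ancestors):
    """Chance that every one of n_ancestors randomly drawn copies is
    NOT the haplogroup, given its frequency in the population."""
    return (1.0 - freq) ** n_ancestors

for n in (5, 15):
    bound = min_frequency(n, alpha=0.10)   # ASSUMPTION: 10% cutoff
    print(n, round(bound, 3), prob_all_other(bound, 5))

# Even at a frequency of only 50%, five ancestors all being "other":
print(prob_all_other(0.5, 5))
```

At a frequency of 50%, the chance that five ancestral copies all belong to some other haplogroup is about 3%. At frequencies near 85% it is well below one in ten thousand.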

This gives us a considerable confidence margin against drift. We need it. After all, the Neandertals were not randomly sampled at a single time, and it is possible that some of them actually carried a human-like mtDNA sequence, which we now falsely interpret as contamination. But even with these shadows hanging over us, it would still be unlikely that none of the ancestors of today’s mtDNA variation were like the Neandertal haplogroup.

Also, the population was not a random-mating pool. When we add geographic structure to the story, which tends to reduce the importance of genetic drift, we find that the probability that drift alone could explain the observations is almost zero. It remains very unlikely that a single migration of modern humans interbreeding with Neandertals under random drift could explain the observations, either (Currat and Excoffier 2004).

Extinction

It is at this point that most geneticists turn to the hypothesis of complete Neandertal extinction. They have a point. Genetic drift apparently cannot explain what we have observed. In their point of view, if genetic drift alone cannot explain the Neandertal mtDNA disappearance, then the only other random process at hand is extinction.

I think that hypothesis is false. It does not account for morphological similarities between Neandertals and later people, for genetic evidence that suggests a strong ancient population structure with introgression, or for the apparent behavioral continuity in the Upper Paleolithic.

Happily, I don’t have a commitment to random processes. Instead, I think that the mtDNA evolution of Europe was driven by nonrandom processes of demographic turnover and selection.

Demographic turnover

Here we come to an important point. No one believes that later Europeans evolved from earlier Neandertals by a random process of genetic drift. Yet that is precisely the hypothesis that most studies have set up to refute. Without question it is valuable to set up boundary conditions under the hypothesis of random genetic drift. But the time has come to investigate more interesting models.

Personally, I am surprised that more complicated metapopulation dynamics have not gotten more attention as an explanation for the Neandertal mtDNA results. Population sources and sinks are a hot topic in biology, and you would think that anthropologists would have picked up on this. To my knowledge, the only time anyone has examined a population sink model was in 2001, when Milford Wolpoff and I worked with mathematician Per Enflo on such an idea for Neandertals (Enflo et al. 2001). This idea deserves a fuller treatment (I think I’ll suggest it as a project for one of my classes this year!).

In a nutshell, a population sink is a region where the average rate of reproduction is below replacement levels. This region can remain populated only if individuals migrate in from other places. The places that reproduce above replacement are called population sources. The continual migration from sources to sinks creates a genetic gradient. Individuals sampled at any given time in the population sink are overwhelmingly likely to have ancestors not in the sink but in one or more source populations.

Europe today is a population sink. The population of the continent does not produce enough children to replace itself, and immigration from other parts of the world is high. There are several reasons to suggest that Europe may have been a population sink in prehistory as well. In Neandertal and Upper Paleolithic times, climate fluctuations created unique challenges in Europe, where caloric expenditures were high and food harder to obtain than some other regions.

Continual migration into Europe would provide a simple explanation for why none of today’s mtDNA haplogroups derive from the European Neandertals. The mtDNA population of 15,000 years ago had a few ancestors 40,000 years ago, and none of these ancestors lived in the sink population—all came from the source population in Africa or West Asia. The Neandertal mtDNA variation would have been a short-lived phenomenon, continually being turned over from source populations. Some Neandertal genes would have survived in Europe for hundreds of thousands of years, but some would have come in with more recent migrants from the population source.
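A minimal sketch of the turnover logic, under loudly stated assumptions: the sink stays at constant size, each generation a fraction equal to one minus the replacement rate is made up of new immigrants from the source, and the span is 1000 generations (25,000 years at an assumed 25 years per generation). The replacement rates are hypothetical values, not estimates.

```python
def resident_ancestry(replacement_rate, generations):
    """Probability that a lineage sampled in the sink traces back
    through local parents only for the given number of generations.
    Each generation a fraction (1 - replacement_rate) of the sink
    population is assumed to be new immigrants from the source."""
    return replacement_rate ** generations

GENERATIONS = 1000   # ASSUMPTION: 25,000 years at 25 years per generation
for rate in (0.999, 0.99, 0.95):   # HYPOTHETICAL replacement rates
    print(rate, resident_ancestry(rate, GENERATIONS))
```

Even a shortfall of one percent per generation leaves almost no chance that a lineage sampled at the end of the period stayed in the sink throughout. A shortfall of a tenth of a percent leaves about a one-in-three chance, so the outcome is sensitive to the rate assumed.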

There are points that argue against this source-sink hypothesis. The Neandertal-human divergence time for mtDNA is not very different from that estimated for the autosomal genome. If a European population sink had made genetic drift more powerful, that should have affected mtDNA more than the autosomes, so we might expect a more recent mtDNA divergence. Still, there is no reason why the source-sink dynamic need have been constant over Neandertal evolution, and there may have been multiple sources in the Pleistocene, not only Africa and West Asia. Investigating the boundary conditions of the source-sink model and its correspondence to autosomal genetic results would be helpful.

I should note that mtDNA is not special. Neandertals had lots of traits that are now very rare. The horizontal-oval, or “bridged” mandibular foramen is a prominent example. Out of the relatively small sample of Neandertal mandibles, half have this derived form. Fewer than one percent of recent European mandibles have this form. As for mtDNA, a once-common variant is now very rare. And as for mtDNA, we deserve some explanation. A source-sink model would appear consistent with the continued evolution of such traits during the Upper Paleolithic—a time when the extinction and replacement hypothesis predicts no change in these characters.

Natural selection

The other nonrandom hypothesis is natural selection, which would presumably have favored one or more modern human types while eliminating the original Neandertal haplogroup. I won’t say much about that hypothesis here, since I discussed it in my initial post about the whole-mtDNA-genome sequencing. Selection has a leg up over the other hypotheses now because it seems like there’s good evidence it happened.

Still, selection on mtDNA alone could not explain the total pattern of observations about Neandertals. Physical traits that were once frequent in Neandertals were much less common or absent in later Europeans, and some continued to reduce in frequencies over time. To explain these changes, we must invoke either selection on other traits, or continued demographic turnover in the post-Neandertal population (probably more immigration into Europe) or both.

So selection on mtDNA has never been a sufficient or necessary hypothesis, even if we assume that other genes carried by Neandertals still survive. But given the current evidence that suggests something distinctive about the mtDNA of recent humans, natural selection may receive renewed attention as a factor in the disappearance of the Neandertal mtDNA haplogroup.

References


   Bocquet-Appel JP, Demars PY, Noiret L, Dobrowsky D. 2005. Estimates of Upper Palaeolithic meta-population size in Europe from archaeological data. J Archaeol Sci 32:1656–1668. doi:10.1016/j.jas.2005.05.006.

   Currat M, Excoffier L. 2004. Modern humans did not admix with Neanderthals during their range expansion into Europe. PLoS Biol 2:e421.

   Enflo P, Hawks J, Wolpoff MH. 2001. A simple reason why Neanderthal ancestry can be consistent with current DNA information. Am J Phys Anthropol 114:S62.

   Krause J, et al. 2007. Neanderthals in central Asia and Siberia. Nature 449:902–904. doi:10.1038/nature06193.

Life history and disease in Tasmanian devils

The keywords to the article include, "carnivorous marsupial" and "precocious breeding." What better teaser could you possibly hope for?

Tasmanian devils are dying because of a transmissible cell line infection, or "cancer," decimating their population. In fact, in some places it's killing 9 out of 10, which is way beyond decimation.

The new paper by Menna Jones and colleagues claims that the population is evolving toward a radical life history solution to the problem: Tasmanian devils are starting to mate and have large litters after a single year, before they have a chance to succumb to the disease:

This change in life history is associated with almost complete mortality of individuals from this infectious cancer past their first year of adult life. Devils have shown their capacity to respond to this disease-induced increased adult mortality with a 16-fold increase in the proportion of individuals exhibiting precocious sexual maturity. These patterns are documented in five populations where there are data from before and after disease arrival and subsequent population impacts. To our knowledge, this is the first known case of infectious disease leading to increased early reproduction in a mammal.

It's a simple response: young breeders used to have lower fitness, because of competition from older adults. Now, the high mortality after the first year has made it a losing strategy to wait to reproduce. When the early breeders are the only ones having many offspring, the population will evolve quickly to early breeding.

References:

Jones ME, Cockburn A, Hamede R, Hawkins C, Hesterman H, Lachish S, Mann D, McCallum H, Pemberton D. 2008. Life-history change in disease-ravaged Tasmanian devil populations. Proc Nat Acad Sci USA (in press) doi:10.1073/pnas.0711236105

Carl Zimmer puts in a nice entry on the new flounder evolution paper, covering the history of the question including the debate between Darwin and Mivart about the evolution of the upward-facing flounder eye position. It's a recommended read. Here's the end:

Amphistium and Heteronectes now join the transitional fossil hall of fame, along with a fish with limbs, Tiktaalik, and the limbed cousin of whales, Indohyus. They’re also a reminder that the argument, “It can’t possibly have evolved because I can’t imagine it evolved” is not an argument at all. It may be hard to imagine Amphistium and Heteronectes, but they are real. In fact, they’ve been sitting around in museums for centuries, waiting for someone to recognize their true wonder.

I especially like the aspect of "sitting around in museums," because the truth is that there are a lot of discoveries still waiting to be made on material removed from the ground decades ago. In this case, the ability to CT-scan the fossils is a nice new addition, but in fact there are lots of things that an eye trained in modern systematics will see that someone many years ago may have missed. Of course, in science fiction novels, it's usually some horrible ancient truth waiting to be discovered, but scientists are doing the real thing all the time!

Weed species (part 1)

This is the first in a series of essays titled, "Practical Evolution." Here are links to the whole series and the series introduction. I've decided to break the articles up into two parts, so that a full essay will appear in two successive weeks. So if you enjoy the current installment, by all means come back on Friday, when I will follow the threads of dispersal by way of an obnoxious animal pest right back to hominids.

Dandelion seeds

Evolution of the monkeyflowers

Spring has finally come to us here in the North, and it's time to start thinking about planting. So, when I went to a seminar yesterday by John Willis, it was with dual motives.

Naturally, I was interested in hearing about his work relating the evolutionary ecology of Mimulus species to their genomics. As Willis and his many former and current lab members made clear in a recent review article in Heredity, monkeyflowers have become a really interesting model system for studying the dynamics of natural selection on genomes -- particularly, with relation to local ecological adaptation, and also with relation to speciation.

But I was also thinking about whether I could find a nice flower variety for my garden. I'm not particularly excited about peas, and I tolerate Arabidopsis when it comes up, but let's face it, it's not exactly a show flower. I'd love to get one of the prettier hawkweeds going (these have eponymical appeal as well as botanical interest) but the common ones are pretty boring.

Well, Willis's lab has been a center of development for Mimulus genetics. They have developed a store of SNPs and other markers (available at the Mimulus evolution website) for QTL mapping, and are using them to find genes responsible for ecological adaptations in different wild Mimulus populations. In the talk, Willis featured some of his collaborators' work finding genes involved in wet versus dry habitat adaptations and in early versus late flowering. These traits are connected to each other, as well as to other life history traits, plant size and flower size.

I left having my prior belief abundantly confirmed: botany is awesome. I mean, think about it. You can go outside, in your own neighborhood, and study biology. You can uproot your subjects and transplant them somewhere else, to watch how well they do. If they die, well, that's a data point, not an ethical emergency! Worried about gene-environment interactions? No problem, just put samples of all your subjects in the same greenhouse and wait. Need to isolate a QTL against a uniform genetic background? Cool, just repeatedly backcross it into an inbred line for a few generations, selecting for the trait each time. Want to study genetic correlations? Well, you can breed a thousand plants and select for any trait you want!

Oh, and if you want to, you can clone them.

Let's look at an example, from the Heredity review:

Recent work on floral evolution demonstrates that fundamental evolutionary questions can be addressed in Mimulus through the combination of field experiments and modern genomic approaches. Bradshaw et al. (1995, 1998) pioneered the application of genome mapping to study of ecologically important traits in Mimulus using RAPD and allozyme markers to map floral QTLs underlying the divergence between red-flowered, hummingbird-pollinated M. cardinalis and pink-flowered, bee-pollinated M. lewisii. The initial mapping experiments, with hybrid phenotypes measured in controlled greenhouse environments, revealed QTLs with major effects on virtually every floral character studied, from coloration and morphology to nectar production. To determine the effect of these QTLs on pollinator visitation and discrimination, Schemske and Bradshaw (1999) moved the genotyped hybrids to a field site near one of the few regions where the species coexist, and observed bee and hummingbird visitation behavior. Amazingly, the M. cardinalis allele at a single QTL, YELLOW UPPER (YUP), was responsible for an 80% loss of visitation by bee pollinators, and the M. cardinalis allele at a QTL responsible for variation in nectar production doubled hummingbird visitation (Schemske and Bradshaw, 1999). Bradshaw and Schemske (2003) subsequently created near-isogenic lines (NILs), where heterospecific alleles at YUP were reciprocally introgressed into the parental genetic backgrounds, and evaluated the response of pollinators to the NILs in the field. They observed an even clearer pattern of pollinator discrimination due to this locus, with a 74-fold increase in bee visitation in M. cardinalis NILs that carried the M. lewisii YUP allele, and a 68-fold increase in hummingbird visitation in M. lewisii NILs with the M. cardinalis YUP allele. 
Although the ecological context, in this case the community of potential pollinators, is certainly important to the evolution of new pollinator associations, these results also demonstrate that single genomic regions can have a large effect on major evolutionary transitions (Wu et al. 2008: 224-225).

The talk was mostly focused on the Mimulus guttatus complex, where some of the most pressing issues are life history, drought tolerance, and tolerance of high mineral concentrations, such as salt or copper. They were able to trace many QTLs of small effect with relation to the major differences in life history and moisture requirements in ecogeographic races of M. guttatus, to show that the within-population variation for these traits is caused by high-frequency (likely balanced) alleles rather than mutation-selection balance or rare alleles, and to find the correlated responses to selection of different plant traits based on different QTLs.

With respect to the genetics of speciation and ecogeographic race formation, they are helped by a long history of research on Mimulus. For example:

Macnair and Christie (1983) performed the first direct genetic analysis of hybrid incompatibilities in Mimulus. While studying the genetic basis of copper tolerance in California populations of M. guttatus, they noticed that some crosses between plants from the copper mines and certain other populations resulted in F1s that died as young seedlings. Further crossing studies revealed that the F1 lethality was caused by a deleterious epistatic interaction between the copper tolerance allele from the mine populations (or a gene tightly linked to it) and alleles at an unknown number of different loci from the other populations. Such deleterious interlocus interactions, usually referred to as Dobzhansky–Muller (D-M) incompatibilities, are thought to be the major cause of low hybrid fitness in plants and animals (reviewed in Coyne and Orr, 2004). Remarkably, it appeared that natural selection for copper tolerance had indirectly resulted in the evolutionary origin of the hybrid incompatibility (Wu et al. 2008:226).

So yes, say what you want, botany is awesome. Plus, there's one more thing: I sat through an entire lecture about natural selection and ecological differentiation of species and races, and never once heard the word, "bottleneck." It was like traveling to some kind of bizarro world where biologists still read Darwin!

So we come down to the really difficult question: which variety am I going to plant? Mimulus glabratus is native here in Wisconsin, including Dane County, but it is not very showy, and prefers wet habitat. That makes it a poor fit for my native plant patch, which is dry/mesic, and which I never water unless the black-eyed Susans and bee balms start to wilt. Mimulus ringens is prettier, with bigger, lavender flowers, but also likes it wet.

I guess I'll have to keep looking. M. lewisii is a pretty variant, if I can find a good source for it, and I can keep it in one of the wetter corners of the yard. I would try for M. cardinalis, since we have hummingbirds sometimes, but I'd like to get Lobelia cardinalis going also, and it's a lot easier to find. Besides, it hardly looks like a monkey!

References:

Wu CA, Lowry DB, Cooley AM, Wright KM, Lee YW, Willis JH. 2008. Mimulus is an emerging model system for the integration of ecological and genomic studies. Heredity 100:220-230. doi:10.1038/sj.hdy.6801018

FOXP2 is really recent, it really did introgress (if it's not contamination)

That's the thrust of a technical comment by Graham Coop and colleagues, now online in Molecular Biology and Evolution. The letter refers to the extraction of FOXP2 from two Neandertal specimens from El Sidrón, by Johannes Krause and colleagues, reported last year (I wrote about the paper here).

First, the bad news. The current letter raises the prospect of contamination. Notably, the controls applied by Krause et al. (2007) may be relatively weak evidence against contamination, because of polymorphism within large human comparative samples. The tests rely on the assumption that there is little DNA from living humans in the samples. But if we cannot distinguish Neandertal from human DNA with great accuracy, then we will be mistaken some proportion of the time. Krause et al.'s test, based on derived human alleles absent from the Neandertal genome draft, can still go wrong if the human contaminants happen to have all the ancestral (non-derived) human alleles.

Well, that seems to be the story these days with Neandertal DNA extraction. No test of contamination is good enough. (And remember that every "test" of contamination is really a procedure for excluding the hypothesis that ancient sequences are identical to recent ones.)

Now, the more interesting news. Coop and colleagues verify that the selective sweep affecting human FOXP2 was indeed recent -- they estimate 42,000 years ago:

To demonstrate this, we estimated the time of the most recent common ancestor (tMRCA) of the selected haplotype (see Figure 1), using an approach sometimes called phylogenetic dating (Thomson et al. 2000; Hudson 2007). This method does not make assumptions about demography and selection, but only requires that the mutations in the intron be neutral or nearly neutral. Taking this approach, we obtained a mean tMRCA of 42 Kya (see SOM for details). While there is considerable uncertainty associated with this estimate, it is surprisingly recent if selection took place over 300 Kya (see SOM). In other words, the selective scenario proposed by the authors cannot account readily for patterns of variation in modern humans. Given that we have no power to detect a beneficial substitution that occurred over 250 Kya, (cf. Sabeti et al. 2006) yet we see a footprint of positive selection at FOXP2, the conclusion of a recent selective sweep at FOXP2 is not surprising (Coop et al. 2008:3-4).
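The basic arithmetic of this kind of dating can be sketched in a few lines: count the mutations separating each sampled copy from the inferred root, average them, and divide by the mutation rate of the whole region. Every input below is hypothetical and chosen only to show the calculation. None of it comes from Coop and colleagues' data, and the result is not meant to reproduce their estimate.

```python
def tmrca_years(mutations_from_root, site_rate_per_year, n_sites):
    """Mean number of mutations between each sampled sequence and the
    inferred root, divided by the mutation rate of the whole region."""
    mean_mutations = sum(mutations_from_root) / len(mutations_from_root)
    region_rate = site_rate_per_year * n_sites
    return mean_mutations / region_rate

# HYPOTHETICAL inputs, for illustration only
counts = [0, 1, 0, 2, 0, 0, 1, 0]   # mutations on each sampled haplotype
rate = 1e-9                          # ASSUMPTION: mutations per site per year
sites = 20_000                       # ASSUMPTION: length of the region in bp
print(round(tmrca_years(counts, rate, sites)))
```

The estimate scales inversely with the assumed mutation rate, which is one reason the quoted passage stresses the uncertainty attached to it.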

FOXP2 is in one of the ENCODE regions, so its variation is pretty well known. This is not a problematic case: it has a very limited amount of variation around it, and has a strong excess of rare alleles, both signs of a recent sweep.

Coop and colleagues suggest that the beneficial human allele spread into Neandertals (or vice versa) by low levels of gene flow coupled with its selective advantage -- in other words, introgression.

They do allow for an alternative -- perhaps the two amino-acid-coding mutations were not the target of selection, but instead some linked locus. This would not erase the necessity of gene flow from Neandertals, but would question whether this gene flow had involved the FOXP2-language scenario, since it might be some linked gene unrelated to language.

(CORRECTION (2008/04/18): If selection were on a linked site, then Neandertals might share the human-derived amino acids as a result of ancient shared ancestry with humans, while the linked selective sweep might be absent in Neandertals, not necessitating any gene flow.)

I doubt this hypothesis of a linked sweep, since the two sites with human-derived substitutions are otherwise very strongly conserved among mammals. This looks like a credible target for recent selection. But the hypothesis of selection on a linked site cannot presently be tested.

So that's the story. It seems very likely that Neandertals got the language gene from us, or us from them, long after many other genes in the two populations diverged. I write "many" rather than "most" because we haven't really been able to assess the proportion of derived alleles shared by humans and Neandertals. The completion of the draft sequence may help, but I'm afraid that the specter of contamination is going to keep on being raised whenever a part of the Neandertal draft genome looks humanlike.

(via Dienekes)

References:

Coop G, Bullaughey K, Luca F, Przeworski M. 2008. The timing of selection at the human FOXP2 gene. Mol Biol Evol (in press) doi:10.1093/molbev/msn091

Why have variants influencing recombination rate been selected in non-Africans?

A complicated story is tangled through this paper by Augustine Kong and colleagues, and I don't see where it may end. But here's the abstract:

The genome-wide recombination rate varies between individuals, but the mechanism controlling this variation in humans has remained elusive. A genome-wide search identified sequence variants in the 4p16.3 region correlated with recombination rate in both males and females. These variants are located in the RNF212 gene, a putative ortholog of the ZHP-3 gene that is essential for recombinations and chiasma formation in Caenorhabditis elegans. It is noteworthy that the haplotype formed by two single-nucleotide polymorphisms (SNPs) associated with the highest recombination rate in males is associated with a low recombination rate in females. Consequently, if the frequency of the haplotype changes, the average recombination rate will increase for one sex and decrease for the other, but the sex-averaged recombination rate of the population can stay relatively constant.

Perhaps it's not so curious that alleles of this gene have opposite effects on recombination in males and females. The mechanisms of gamete production are obviously different in the two sexes, and we might expect some kind of frequency-dependent mechanism to regulate recombination. At least, it's a hypothesis.

What I find mysterious is this:

A phylogenetic analysis of a 55-kb region containing rs3796619 and rs1670533 in the HapMap data (24) revealed three well-differentiated clusters of haplotypes showing notable differences in frequency between the Yoruban Nigerians (YRI) and CEU and East Asians (CHB and JPT) (fig. S6). The [C,T] and [T,C] haplotypes that associate most strongly with recombination rate have a combined frequency of only 17% in the YRI sample, but reach a frequency of 91% and 98% in the CEU and East Asian samples, respectively. Several SNPs in this region show an unusual degree of divergence among the HapMap groups, on the basis of the rank percentile of their FST values (Wright's coefficient, a measure of variance in allele frequencies among populations) among all autosomal SNPs with the same overall frequency in the HapMap. Specifically, we identified eight SNPs whose FST values are in the top 0.5% for differences between the YRI and East Asian HapMap samples and also in the top 5% of differences between the YRI and CEU samples. Each of these SNPs differentiated a subset of [T,T] haplotypes from the rest, perhaps indicating an episode of positive selection (or a severe founder effect) that increased the frequency of [C,T] and [T,C] haplotypes in the ancestors of European and East Asian populations.

The [C,T] and [T,C] haplotypes are the ones associated with increased recombination rate in males and females, respectively. The markers are in strong disequilibrium (no [C,C] haplotypes were observed), and seem to have been selected outside of Africa.

I have no idea why.

The recombination rates were all inferred from a large Icelandic sample, so maybe the rates don't really characterize the haplotypes in other populations. Maybe recombination rate is incidental to the real reason for the selection. Or maybe in populations roaring with positive selection on many genes at once, it is a good thing to break them apart more often.

References:

Kong A and 16 others. 2008. Sequence variants in the RNF212 gene associate with genome-wide recombination rate. Science 319:1398-1401. doi:10.1126/science.1152422

Bees R Us

The PNAS Early Edition this week includes a paper by bee genome researchers Amro Zayed and Charles Whitfield. After a short review of honeybee phylogeny, they demonstrate two things:

1. An ancient dispersal of honeybees from Africa into Europe was accompanied by a pulse of positive selection on coding genes, amounting to selection on approximately 10 percent of bee genes.

2. As Africanized bees have spread across South and into North America, adaptive genes from the existing populations of European bees have introgressed into the Africanized population, increasing under positive selection.

These are remarkable parallels to the worldwide evolution of humans. In bees, the geographic pattern is not the same, and the timescale is different, but the overall genetic impact is quite similar.

Here's the bee history:

In its native range, A. mellifera is classified into approximately two dozen subspecies, which are further organized into four major geographically and genetically distinct groups: African, Western and Central Asian (hereafter referred to as Asian), Eastern European, and Western and Northern European (hereafter referred to as West European) (9-11). European honey bees were introduced by humans to the New World by European settlers as early as the 1600s. In Brazil in 1956, an intentional introduction of African honey bees (A. mellifera scutellata), which hybridized with previously introduced European bees, led to the establishment and spread of the highly invasive and economically devastating Africanized honey bees in North America and South America (12). Subsequent studies have shown that Africanized bees are predominantly African in ancestry with minor but consistent contribution from European genotypes (11, 12). Using recently developed SNP panels, Whitfield et al. (11) demonstrated that the honey bee originated in Africa and subsequently expanded into Eurasia in two or more independent ancient expansions. One expansion gave rise to Western European honey bees, and at least one other independent expansion gave rise to Asian and Eastern European honey bees. Honey bee subspecies vary in a host of phenotypic traits, such as morphology, behavior, physiology, and gene expression (9-11, 13, 14) (Zayed and Whitfield 2008:3421).

I was not aware of the initial dispersals of bees into Europe and Asia. The genetic data show that the Western European strains are the ones with the most adaptive evolution since their dispersal from Africa. The separate ancient bee dispersals were documented by Whitfield et al. (2006), but they were not able to provide date estimates for the ancient dispersals, and none are attempted in this study.

This is the kind of test that ought to fail in most wild populations. Without a shift in the adaptive landscape, the fraction of new mutations with potential adaptive value is bound to be small -- because species are optimized to the environments that they have occupied for a long time. But European bees have undergone a number of recent environmental changes, including the simple effect of moving from a tropical to a temperate environment, the need to use new and different flora, and the effects of domestication. In a very numerous, rapidly dispersing species, these effects led to a rapid adaptive response in a large proportion of genes. These are the basic principles underlying the recent acceleration of positive selection in our lineage also.

The introgression of European genes into the dispersing Africanized bees in the Americas is interesting, because it seems counter-intuitive. The main differences between Africanized bees and European bees involve adaptations to climate. European bees put up lots of honey for the winter, and swarm less frequently, in addition to being more sedate. African bees don't bother with as much honey, which together with their more frequent swarming would seem to be a good fit for the tropical pattern of seasonality. These African traits explain why the African bees have spread at the expense of the European bees across the tropical New World. But Africanized bees have picked up a lot of genes from the European bees in the New World.

The authors propose some possible explanations:

The adaptive value of functional (coding) portions of Western European genomes could be related to positive selection on novel variation in West European bees, to positive selection on novel hybrid gene combinations, and/or to selection for heterozygous genotypes. Our study thus provides direct evidence that invasive populations can exploit hybridization in an adaptive fashion -- a finding of immense relevance to understanding the dynamics of biological invasions (Zayed and Whitfield 2008:3424).

In other words, behavioral correlates of climate may be a target of selection and introgression -- I would speculate because of the intrinsic rarity of adaptive mutations in these functions.

This is a relatively coarse-grained analysis of positive selection, since the study basically averages within SNP categories, determining FST between pairs of populations. For non-coding SNPs, the Africanized bees are very similar to African bees (FST = 0.05), while for coding SNPs they are twice as divergent (FST = 0.10). That's a lot of difference in allele frequencies over a short time; it must have been caused by strong positive selection across a broad sample of loci. They do not attempt the same kind of "10% of genes" estimate for the introgression, but their figures show that it is quite significant across their data.
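For readers who want to see what an FST comparison between two populations involves, here is a minimal sketch using the textbook heterozygosity form of Wright's FST, (HT - HS) / HT, for biallelic SNPs in two equally weighted populations. This is an assumption made for illustration: the post does not say which estimator Zayed and Whitfield used, and a real analysis would also correct for sample size. The function names and the allele frequencies are hypothetical.

```python
def fst_two_pops(p1, p2):
    """Wright's FST for one biallelic SNP, given the allele frequency
    in each of two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)             # heterozygosity if pooled
    h_within = p1 * (1.0 - p1) + p2 * (1.0 - p2)      # mean within-population
    if h_total == 0.0:
        return 0.0  # monomorphic in both populations: no divergence to measure
    return (h_total - h_within) / h_total


def category_fst(freq_pairs):
    """FST averaged over a category of SNPs (e.g. coding or non-coding),
    taken as a ratio of summed numerators to summed denominators."""
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in freq_pairs:
        p_bar = (p1 + p2) / 2.0
        h_total = 2.0 * p_bar * (1.0 - p_bar)
        h_within = p1 * (1.0 - p1) + p2 * (1.0 - p2)
        numerator += h_total - h_within
        denominator += h_total
    return numerator / denominator if denominator > 0.0 else 0.0


# Hypothetical allele frequencies (population A, population B) for a few SNPs.
noncoding = [(0.30, 0.40), (0.55, 0.50), (0.10, 0.20)]
coding = [(0.30, 0.60), (0.55, 0.35), (0.10, 0.30)]
print("non-coding FST:", round(category_fst(noncoding), 3))
print("coding FST:", round(category_fst(coding), 3))
```

Comparing the category averages, rather than single SNPs, is what makes the analysis coarse-grained: it can show that coding SNPs as a class are more divergent without saying which genes were the targets.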

It may be a while before this initial study can be followed up with recombination-based selection tests, because of this little-known fact: bees have a recombination rate of 19 cM/Mb -- roughly 15 times higher than that of humans. Still, Whitfield et al. (2006) found an excess of linkage disequilibrium in the West European subspecies of bees. It now seems likely that some of this LD is explained by the widespread selection documented in the current study.
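To get a feel for what a 15-fold difference in recombination rate does to linkage disequilibrium, here is a rough sketch using the standard expectation that LD between two loci decays by a factor of (1 - c) each generation, where c is the recombination fraction between them. It assumes c can be approximated as rate times distance, which is only reasonable over short distances, and it takes the human rate to be 19/15 cM/Mb, simply the bee figure divided by the ratio quoted above. The 100 kb distance, the 50 generations and the function names are hypothetical choices for illustration.

```python
def recomb_fraction(rate_cm_per_mb, distance_kb):
    """Approximate recombination fraction between two loci.

    Converts cM/Mb and kb to a per-generation probability; valid only
    when the result is small (short distances)."""
    return rate_cm_per_mb * (distance_kb / 1000.0) / 100.0


def ld_remaining(rate_cm_per_mb, distance_kb, generations):
    """Expected fraction of the initial LD left after some generations,
    ignoring drift, selection and new mutation."""
    c = recomb_fraction(rate_cm_per_mb, distance_kb)
    return (1.0 - c) ** generations


BEE_RATE = 19.0            # cM/Mb, from the post
HUMAN_RATE = 19.0 / 15.0   # cM/Mb, implied by "roughly 15 times higher"

for label, rate in (("bee", BEE_RATE), ("human", HUMAN_RATE)):
    left = ld_remaining(rate, distance_kb=100.0, generations=50)
    print(f"{label}: {left:.2f} of initial LD left at 100 kb after 50 generations")
```

Under these assumptions, an association spanning 100 kb is largely erased in bees over a time span in which most of it survives in humans, so the long haplotypes that recombination-based tests look for should be much shorter and harder to detect.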

In other words, the genetic structure of global bee populations provides another strong example of the importance of rapid evolution in abundant species, coupled with ecological changes. Bees also now provide a strong example of adaptive introgression -- in this case, within a very tightly timed dispersal with known climatic conditions.

References:

Zayed A, Whitfield CW. 2008. A genome-wide signature of positive selection in ancient and recent invasive expansions of the honey bee Apis mellifera. Proc Nat Acad Sci USA 105:3421-3426. doi:10.1073/pnas.0800107105

Whitfield CW and 9 others. 2006. Thrice out of Africa: Ancient and recent expansions of the honey bee, Apis mellifera. Science 314:642-645. doi:10.1126/science.1132772
