As far as cladistics can take mtDNA analysis

In the early access online edition of Genetics, there is a new paper by Toomas Kivisild and (many) colleagues, titled "The role of selection in the evolution of human mitochondrial genomes" (via Dienekes).

The conclusion of the paper is that the appearance of many nonsynonymous mtDNA changes in certain populations may be the consequence of hotspots, sites where mutations happen repeatedly. The rapid mutation rate at these hotspots means that they saturate more quickly than other sites, and their variation in recently founded populations is therefore higher than expected compared to their variation in more ancient populations. They suggest that the appearance of many nonsynonymous variants in "Arctic" populations (found by Ruiz-Pesini 2004) should be explained by the recent colonization of these regions, as opposed to new adaptations to cold in these populations.

The study was a phylogenetic analysis of human mtDNA variation, from a sample of 277 individuals. After deriving a most parsimonious tree, they looked for sites that underwent recurrent mutations in different branches of the phylogeny. These "hotspots" make up a disproportionately large number of the changes within and between human mtDNA lineages. Thus, the high proportion of nonsynonymous changes in certain populations may well be due to these hotspots.
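
The recurrence count at the heart of that analysis can be sketched roughly as follows. This is only an illustration, not the paper's pipeline: the branch names and site positions are invented, the function name is hypothetical, and it assumes mutations have already been assigned to branches of a parsimony tree.

```python
from collections import Counter

def recurrent_sites(branch_mutations, min_hits=2):
    """Return {site: hits} for sites that change on at least min_hits branches.

    branch_mutations maps a branch label to the list of sites inferred
    to have mutated along that branch.
    """
    hits = Counter()
    for sites in branch_mutations.values():
        hits.update(set(sites))  # count each site at most once per branch
    return {site: n for site, n in hits.items() if n >= min_hits}

# Hypothetical toy data: three branches, site positions are made up.
branch_mutations = {
    "branch_A": [101, 250, 900],
    "branch_B": [250, 400],
    "branch_C": [250, 900, 1200],
}

hotspots = recurrent_sites(branch_mutations)
total_changes = sum(len(set(s)) for s in branch_mutations.values())
hotspot_share = sum(hotspots.values()) / total_changes

print(hotspots)
print(f"share of all changes at recurrent sites: {hotspot_share:.3f}")
```

In this toy tree only two of the five variable sites are recurrent, yet they account for most of the inferred changes, which is the kind of imbalance the paper describes.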

Within-human coding variation

So does it matter whether or not some human population has a higher number of nonsynonymous variants? If a population did have a higher proportion of nonsynonymous variants, would that be a good sign of local selection?

I would suspect the answer to both questions is no. It certainly makes sense to me, as Kivisild et al. (2005) claim, that the excess of nonsynonymous changes in some populations may be an overrepresentation of nonsynonymous hotspots compared to more limited variation at other sites. So there is a statistical reason besides selection for this observation.

But considering the low global variation of human mtDNA, there shouldn't have been too much opportunity for different regions to become very different in their mtDNA variants. All of them have a recent common mtDNA ancestor, so locally adaptive variants probably don't differ by a large number of substitutions. And if they don't, then we shouldn't expect to see a significant increase in the proportion of nonsynonymous substitutions for those locally adaptive variants. So this is just not a very good test for local selection.

But there is a pretty good test for whether a variant might be a target of selection: look at its functional consequences. And we now know that many of the variants that are common in different parts of the world actually have functional consequences for life history, degenerative disease, metabolic efficiency, and high-energy tissues like the brain. Some variants are associated with higher cancer rates, some with higher Alzheimer's and Parkinson's rates, some with longer lifespan, and others with greater energy conversion. When these variants differ significantly in their frequencies in different regions, it is reasonable to suggest that they were selected.

Of course testing the hypothesis of selection depends on demonstrating a fitness advantage for the variants, so it remains at least theoretically possible that different individuals have mtDNA with higher or lower cancer risk, lifespan, and energy efficiency without any difference in fitness.

But I don't think that we would make that assumption for any other gene -- it would be silly. And we don't need to know the proportion of nonsynonymous mutations to make that judgement; we just need to know that the gene does something differently in different places.

So I think the paper goes about as far as anybody can in demonstrating the rates of different kinds of mutations from phylogenetic comparisons. But that still doesn't tell us what we want to know: do the genes do anything differently in different populations? And in fact we already know that they do. The phylogenetic comparisons might inform us about how many selected changes there have been since the mtDNA coalescent, but in fact that number must be small because the coalescent is recent.

Comparison of different primate species

This comparison is discussed to some extent in the paper, but it does not become one of the major foci of the conclusion. I think there is more interesting stuff to be found here, and it points to the possibility of significant adaptive evolution in mtDNA sequences across primates.

You might not get this from the conclusion, which suggests that there is little evidence of positive selection in hominoids on the coding regions of the mtDNA as a whole. But read the criteria:

In these tests, maximum likelihood ratios of non-synonymous to synonymous mutations (omega) exceeding 1 are consistent with the hypothesis of positive selection, while values close to 1 indicate selective neutrality, and values converging on 0 suggest strong purifying selection. We conducted both lineage and site specific tests. For the lineage-specific tests, we used a model in which all lineages have the same omega (hereafter referred to as M0) and compared that with a model in which omega is estimated for each lineage (hereafter referred to as M1). To test for the action of selection among amino acid sites within a specific lineage, we compared a model that allows for heterogeneity in omega among sites, but not among lineages, with a model that allows for variation in omega along a predefined lineage (as in (YANG and NIELSEN 2002)) (Kivisild et al. 2005:8).

Negative selection reduces the number of amino-acid coding substitutions (nonsynonymous substitutions) compared to synonymous substitutions. Positive selection increases it. This test assumes that either negative selection or positive selection has happened, but not both. Of course, there's no easy test to tell whether both might have happened. They alter the ratio of NS/S substitutions in opposite directions, so the actual NS/S ratio must reflect their force relative to each other. The paper recognizes this problem (p. 18), but doesn't explore it. Is it credible to think that a site that evolves by positive selection in some lineages is not constrained by negative selection in others? If evolution involves the occasional positive selection of variants at sites usually under negative selection, then the test of selection used here will be extraordinarily weak. Indeed, it is significantly stacked against detecting positive selection.
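
A toy calculation shows how this dilution works. The site classes and omega values below are invented for illustration, and treating the overall ratio as a simple weighted average of per-class values is a simplifying assumption, not the way the likelihood models in the paper estimate omega.

```python
def average_omega(site_classes):
    """Weighted mean omega over (fraction_of_sites, omega) pairs."""
    total = sum(frac for frac, _ in site_classes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("site fractions must sum to 1")
    return sum(frac * omega for frac, omega in site_classes)

# Hypothetical mixture: most sites constrained, a few positively selected.
classes = [
    (0.90, 0.05),  # strong negative (purifying) selection
    (0.05, 1.00),  # effectively neutral
    (0.05, 3.00),  # positive selection
]

overall = average_omega(classes)
print(f"overall omega = {overall:.3f}")
```

Even with one site in twenty under fairly strong positive selection, the overall ratio stays far below 1, so a test that looks for omega above 1 across a whole gene or lineage would report nothing but constraint.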

Even so, the phylogenetic comparison of hominoid (+ macaque) mtDNA found that the model that incorporated positive selection at some sites was superior to neutral or purely negatively selected alternatives. Based on this model, the study found that 16 amino-acid codons in hominoids were likely (i.e. p > .95) to have been under positive selection. That seems to me like a bare minimum, as there were probably positively selected sites in individual lineages that wouldn't show up in the cross-hominoid comparison.

The total possibility for positive selection on the human lineage seems large. The study found 167 amino acid substitutions separating humans and chimpanzees, compared to 452 between chimpanzees and orangutans (and only 96 between cats and dogs, which seems incredibly low to me). They tabulate the proportion of substitutions from one amino acid to another (e.g. Ala <> Thr, Ile <> Val, etc.), and find that these proportions differ in some cases from the proportion of segregating variants within humans.

Suppose we assume that 84 of those 167 substitutions are human-specific (the paper doesn't include this information). If six of those were positively selected, that's one per million years. If twelve, one per 500,000 years. And there's no reason to rule out multiple substitutions at some of these sites; indeed the presence of hotspots suggests that some sites might have been recurrently selected as the genetic background at other sites changed. And it seems likely that the 414 segregating amino acid variants in humans might also include some that were selected earlier in human evolution. How many selected substitutions may have happened during recent evolution cannot yet be estimated, but how surprising should it be that the most recent one happened around 160,000 years ago?
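
The arithmetic behind those rates is simple enough to write out. The sketch below assumes roughly six million years on the human lineage since the split from chimpanzees; that figure is the one implied by "one per million years" above, not a number taken from the paper, and the function name is just for illustration.

```python
HUMAN_LINEAGE_YEARS = 6_000_000  # assumed divergence time, not from the paper

def years_per_selected_substitution(n_selected, lineage_years=HUMAN_LINEAGE_YEARS):
    """Average interval between selected substitutions on one lineage."""
    if n_selected <= 0:
        raise ValueError("n_selected must be positive")
    return lineage_years / n_selected

for n in (6, 12):
    interval = years_per_selected_substitution(n)
    print(f"{n} selected substitutions: one per {interval:,.0f} years")
```

On those assumptions, even a handful of selected substitutions implies intervals of the same order of magnitude as the time back to the human mtDNA coalescent.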

An aside

Here's an interesting suggestion; I wonder if it's true:

One factor that could, theoretically at least, explain the different amino acid replacement patterns observed between populations and between humans and other mammals is diet. Threonine and valine, essential amino acids that must be taken in the diet, are abundant in meats, fish, peanuts, lentils, and cottage cheese, but deficient in most grains (Kivisild et al. 2005:17-18).

It's another possible reason for selection based on diet during the last 10,000 years, if it affects metabolism strongly enough, which remains to be demonstrated.

Do I have to keep writing about mtDNA?

I'm sure some readers are beginning to think this is mtDNA Selection Central. Believe it or not, I've gotten a lot of requests to cover this topic, which of course is one of the central issues in the Neandertal problem as well as the unraveling of human origins.

And it's an exciting developing story: it shows how medical genetics is steamrolling the human genetics of the past thirty years. Finding mutations that actually do things has great medical interest, and the search is accelerating. This work is being undertaken by people who have no investment in the idea that variation among humans should be completely neutral.

After all, what's more important: that a neutral mtDNA lets us trace human migrations, or that understanding mtDNA selection helps us find treatments for Alzheimer's disease? There's no way that obsolete lineage tracing can survive this kind of conflict. Finding out the history of mtDNA variability is telling us something very important, but it isn't about the movements of people around the globe 100,000 years ago. It's about the evolutionary tradeoffs that led to advantages and disadvantages for different variants.


Kivisild T et al. 2005. The role of selection in the evolution of human mitochondrial genomes. Genetics (online before print).