mtDNA

French Neolithic discontinuities

Marie-France Deguilloux and colleagues [1] present a short analysis of ancient mtDNA recovered from a Neolithic burial at Prissé-la-Charrière, between the Loire and Garonne valleys of western France.

The mtDNA sample in the end was only three individuals -- one haplogroup X2, one U5b and one N1a. Each is intriguing, as far as a single sequence can be, because all are rare or absent from France today. I think one shouldn't go far interpreting three samples, but they contribute to the view that Neolithic mitochondrial variation in Europe was very different from that of recent Europeans. The N1a and U5b sequences fit within the already-known Neolithic (and for U5b, Mesolithic) variation in central and northern Europe.

It is from the U5b that Deguilloux and colleagues make a point about possible Mesolithic population continuity.

Subhaplogroup U5b has also been encountered in German Neolithic remains from the Corded Ware Culture (Haak et al., 2008) and in the hunter-gatherers studied by Bramanti et al. (2009), although in both instances, the branches concerned were distinct from the U5b in the Prissé sample. It is, however, worth noting that haplogroup U5 has been encountered in surprising frequency in the hunter-gatherers studied by Bramanti et al. (2009) and could correspond to a Mesolithic heritage.

The story of N1a is that it was very common in the central European Neolithic, even though it is very rare today. That was first noted by Wolfgang Haak and colleagues [2], and has in subsequent years been joined by the observation that the pre-Neolithic hunter-gatherers had yet other common haplogroups. The population history of Europe was a lot more interesting than we suspected 10 years ago.

Deguilloux and colleagues attempt a conservative explanation for the frequencies of N1a in Neolithic samples:

The widespread distribution of the N1a lineage in Early and Middle Neolithic northwestern Europe may indicate genetic continuity from Mesolithic populations. This scenario would support a Mesolithic contribution to the earliest Neolithic of Atlantic Europe. This would imply that the N1a lineage was already common in indigenous north European populations and that the spread of the Neolithic was principally the result of cultural diffusion. Although so far the N1a lineage has not been encountered among late European hunter-gatherers in central and north Europe (Bramanti et al., 2009; Malmström et al., 2009), it is worth noting that less than half of the hunter-gatherers' paleogenetic data come indeed from the pre-Neolithic period (predating LBK expansion). Finally, no paleogenetic data currently exist for the Mesolithic period in Western Europe. This prevents any conclusion being drawn about N1a occurrence during the Mesolithic period in those regions.

I will note this -- the more that N1a is replicated across the Neolithic of Europe, the less likely it is that its subsequent vast reduction in frequency could result from genetic drift. When there were only one or two samples from Central Europe with high N1a, it was at least possible that this was a local founder population that did not spread its mtDNA diversity very far. If it were localized, even in the central Danube (a fairly big region), it might be possible to maintain that the later decline of N1a to its present low frequency had been due to drift.

Now N1a seems like a real marker of the LBK, spread widely into Western Europe. It may be, as Deguilloux and colleagues suggest, that it will be found at substantial frequencies in earlier samples somewhere in Europe. We do want some explanation for how it got to be common in this culture area.

Dienekes has written about the study. His point is a good one: If N1a were present somewhere in pre-Neolithic Europe, it would require some kind of "partition" of the pre-Neolithic population, along with its propagation -- presumably southeastward -- into the LBK of central Europe. Seems doubtful.

The study includes an illuminating paragraph about the sources of contaminating sequence in these Neolithic extractions.

Strict precautions were followed during all procedures (including precautions during excavation) and proved to be effective, because all researchers who directly participated in this study (from people working in the field to those working in the laboratory) were genotyped and their sequences were never observed during analyses. However, European sequences were randomly found in clones (28% of the sequences obtained). These specific sequences are regularly observed in the laboratory, whatever the project tackled (including samples from Polynesia or South America), in clones from samples or negative controls. They are not reproducible for a specific sample and are different from researchers' sequences. These facts lead us to suspect the contamination of PCR reagents (Leonard et al., 2007). It was relatively easy, however, to discard those contaminating sequences from our analyses because they were largely in the minority when compared with endogenous sequences.

It would not be very difficult to compare the results from different labs and do a forensic-quality analysis of these reagent contamination events. Surely a good fraction of ancient DNA results prior to the last few years must represent such contamination. Nowadays people have the expectation that Neolithic-era remains may have rare or exotic haplogroups, but it hasn't been so long since people assumed that French equals French. I expressed some concern about this criterion before -- "strange" stands in for "non-contaminated" in too many studies.

It might be very helpful to have a paper outlining the actual contamination pathways that have been found to affect multiple labs. Then the results could be compared against reports that have come out over the years. If people are reluctant to cull doubtful ancient DNA results, at the very least they can target a set for replication studies.


References

Mailbag: mtDNA "out of whack"

Re: "Time to revise the mtDNA timescale?":

You said "The timescale of mtDNA divergence is already out of whack with the rest of the genome."

What's the time scale for the rest of the genome? It seems to me it should be expected to be at least twice as much as that for mtDNA since at least half the instances of mtDNA - those in males - dead end each generation. With perfect mixing and replacement, 50% of the mtDNA instances pass from one generation to the next, while 75% of the autosomal instances do. Imperfect mixing and replacement would make both numbers lower, but the mtDNA number would still remain much lower than the autosomal number, so the coalescence time should still be expected to be much lower.

Thanks for noticing that, it's leading to something but I haven't yet described the problem. My apologies for being less than clear.

What you're describing (you probably already know) is commonly described as the "four-times rule" -- the uniparental inheritance and single copy number give mtDNA one fourth the effective size, on expectation, as an autosomal locus.

That's in a constant-sized population. Which of course we haven't been. For around the past 100,000 years, African populations were big enough that genetic drift didn't decrease their genetic diversity markedly. The mtDNA coalesces around 100,000 years before that, compared to more than 700,000 years for the typical autosomal locus -- it's 7 times instead of four. That discrepancy is probably not significant given the huge intrinsic variance of the coalescent. But I don't think it's been seriously investigated.
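The arithmetic behind the four-times rule can be sketched in a few lines. This is only an illustration under the standard neutral coalescent, with a constant population size and an equal sex ratio. The function names, the population size of 10,000 and the sample size of 50 are assumptions for the example, not estimates from any study.

```python
def expected_tmrca_generations(gene_copies: float, sample_size: int) -> float:
    """Expected generations back to the MRCA of `sample_size` sampled lineages
    at a locus with `gene_copies` transmitted copies in the population,
    under the standard neutral coalescent: 2 * copies * (1 - 1/n)."""
    return 2.0 * gene_copies * (1.0 - 1.0 / sample_size)


def autosomal_copies(breeding_individuals: float) -> float:
    # Diploid and inherited through both sexes: two copies per individual.
    return 2.0 * breeding_individuals


def mtdna_copies(breeding_individuals: float) -> float:
    # Haploid and transmitted by females only: one copy per female,
    # with females assumed to be half of the population.
    return 0.5 * breeding_individuals


N = 10_000   # hypothetical number of breeding individuals
SAMPLE = 50  # hypothetical number of sampled lineages

autosomal = expected_tmrca_generations(autosomal_copies(N), SAMPLE)
mtdna = expected_tmrca_generations(mtdna_copies(N), SAMPLE)
expected_ratio = autosomal / mtdna

# Ratio implied by the round figures quoted above (700,000 vs. 100,000 years).
quoted_ratio = 700_000 / 100_000

print(f"expected autosomal/mtDNA ratio: {expected_ratio:.1f}")
print(f"ratio from the quoted figures:  {quoted_ratio:.1f}")
```

The expected ratio does not depend on the population size or the sample size chosen, which is why the rule can be stated as a simple multiple. It says nothing about the variance around that expectation, which is large.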

The real problem is that the out-of-Africa timescale for mtDNA is now very short -- less than 65,000 years -- while the nuclear timescale looks long -- maybe up to 140,000 years. Maybe these can also be reconciled; it's not yet clear. But it's a problem.

Time to revise the mtDNA timescale?

Krzysztof Cyran and Marek Kimmel (2010) have presented a revised set of estimates of the human mtDNA most recent common ancestor (MRCA). It's an interesting theoretical paper, written for the purpose of developing a method that doesn't rely on the same assumptions as the usual coalescent models.

Their new method gives an estimate of 174,000 years ago for the human MRCA. They report an upper/lower range as 96,000 to 449,000 years ago. That range does not represent a confidence interval on the estimate; it's an upper/lower range based on extreme assumptions about human/Neandertal genetic distance and the human/Neandertal MRCA.

The Neandertal mtDNA has really affected the way we estimate human MRCA, at least for the mitochondrial genome. Chimpanzees are just too distant. When we compare human and chimpanzee mtDNA genomes, there has been a lot of parallelism and reversal on both lineages, because mutations have hit the same place multiple times. Multiple hits and purifying selection make a mess out of rate estimation -- generally, they make the human MRCA seem a lot older than it truly was. The Neandertals are closer, and are therefore less of a problem.

But the Neandertal-human MRCA itself was poorly known, as long as we had only chimpanzees to calibrate the mutation rate....

I keep seeing people, who really ought to know better, saying that the new Neandertal genome results show that the gene flow must have been Neandertal men mating with modern human women, and not the other way around.

You see, they're fixated on the idea that the mtDNA showed no signs that the Neandertal clade survived into the present-day population. That result really convinced some people that interbreeding was impossible. They're flummoxed that some of the rest of the genome has significant signs of intermixture. It's like their world is spinning out of control. I'm not naming any names, but if you've followed much of the press around the Neandertal genome, you've probably seen this suggestion.

I don't know why it hasn't occurred to them that the Neandertal mtDNA type was probably lost because of natural selection.

To avoid raising the awful specter of Darwin, they've been talking about weird mating restrictions. Well, I suppose that if you really have to find a way to get Neandertal nuclear genes into us, without bringing mtDNA along, a total lack of Neandertal women contributing genes is formally one way to get that.

I'd just like to see these people explain how exactly we managed not to get any Neandertal Y chromosomes, either.

Is it safe to talk about selection, now?

UPDATE (2010-05-11): A reader writes:

With regard to your latest blog post on lack of neanderthal mitochondrial and Y chromosome DNA in humans: yes, it's possible natural selection had a part. However, given that only a small proportion of our ancestors seem to have been neanderthals at the appropriate time, it strikes me that this is a case where drift could be the correct explanation - despite the fact that I'm usually not a big fan of drift as an explanation.

Much depends on the size of the ancestral population and the pace of population growth in the generations surrounding the pickup of Neandertal genes. Drift is less likely to eliminate alleles in a growing population, but it depends how many copies there were to begin with. The key questions -- where and when the population was growing -- are unlikely to be the same as assumed by the modeling that showed drift couldn't have eliminated the Neandertal mtDNA, as most assumed the location of contact would be Europe and the time would be late.
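How growth and initial copy number trade off can be sketched with a simple branching process. This is a rough approximation for a rare lineage, not the modeling referred to above. It assumes each female carrying the lineage leaves a Poisson-distributed number of daughters, that lineages are independent, and that the growth rate stays constant. The function names and the growth rates used are hypothetical.

```python
import math


def lineage_loss_probability(mean_daughters: float, tol: float = 1e-12,
                             max_iter: int = 1_000_000) -> float:
    """Probability that a single maternal lineage eventually dies out when
    each carrier leaves a Poisson number of daughters with the given mean.
    The answer is the smallest root of q = exp(mean * (q - 1))."""
    if mean_daughters <= 1.0:
        # Without growth, a single lineage is eventually lost in this model.
        return 1.0
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(mean_daughters * (q - 1.0))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def all_copies_lost(mean_daughters: float, copies: int) -> float:
    """Probability that every one of `copies` independent lineages is lost."""
    return lineage_loss_probability(mean_daughters) ** copies


for growth in (1.01, 1.1, 1.5):      # hypothetical mean daughters per female
    for copies in (1, 10, 100):      # hypothetical initial copy numbers
        p = all_copies_lost(growth, copies)
        print(f"growth={growth:<5} copies={copies:<4} P(all lost)={p:.4f}")
```

In this sketch, slow growth with few initial copies still leaves loss by drift quite likely, while faster growth or more copies makes it improbable. That is the sense in which the answer depends on where and when the population was growing.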

There were other deficiencies with the modeling, also. Here we've been working on a source-sink model as a possible demographic scenario for Pleistocene humans; that kind of metapopulation dynamic might easily explain allele losses without selection, and becomes more and more credible as we learn the variance of contribution of Neandertal-like alleles across the genome. It's a different world this week than last week.

These are all mathematically tricky answers, clever, but academic unless we have good matches to genome-wide variation. Meanwhile a very simple answer, easy to explain to anyone, lies fallow. Exceedingly curious.

I'd be happy to be proven wrong about the Y chromosome, by the way -- we don't really know that Neandertals didn't have a human-like type, although we do know that today's human population has an exceedingly recent coalescent time. Could be bad estimates of mutation rate. Maybe we'll have more surprises in store.

African-American mtDNA and regional populations of Africa

I'm attending a symposium on genetics and genealogy of the African Diaspora this morning. Fatimah Jackson is here giving a very interesting talk about her genetic work in Africa and African-Americans, and in particular her idea of "ethnogenetic layering" (Jackson 2008), which is basically a strategy for describing the fine-scale makeup of present-day populations by examining their genetic ancestry from different regions of the Old World.

Part of her research has involved characterizing the regional distribution of mtDNA haplotypes within African populations. She shared some newer data with us, but I thought it worth pointing people to an earlier publication by Bert Ely, Jackson and others (2006), which gave rise to some strong insights about the poverty of current sampling of African populations.

The study reports on a sample of 3725 mtDNA sequences (HVS-I) from a diversity of sub-Saharan African populations. That's quite a massive sample of sequences, certainly compared with what had been available earlier. It is substantially more numerous than

When a sample of 74 Gullah/Geechee mtDNA sequences were compared with the sub-Saharan database, approximately half of the mtDNAs were identical to two or more mtDNAs in the database and only seven mtDNAs matched mtDNAs from a single ethnic group. The remaining 28 mtDNAs were not identical to any sequence in the expanded database.

Similar results were obtained when the 97 African-American AFDIL mtDNAs were compared with the databases. Approximately half (49) of the mtDNAs were identical to multiple sequences in the original database. As with the Gullah/Geechee sample, fewer than 10% of the sequences matched a sequence from a single ethnic group, and 40% of the sequences did not have any perfect match in the database (Ely et al. 2006:3).

There are two aspects worth noting in those results. On the one hand, the common haplotypes -- the ones that the African-American samples were likely to have a match to -- were not regionally specific within Africa. They are shared by many ethnic groups, distributed across the continent.

On the other hand, 40% of the African-American sequences have no match among the nearly 4000 sequences taken from continental Africa. That's astounding to me, just from the standpoint of sampling. Most of the common haplotypes will emerge within a relatively small sample, so to find something you haven't already seen, you have to sample disproportionately more -- in fact, exponentially more -- individuals. You can just imagine how many tens or hundreds of thousands of sequences you would have to gather to have an adequate representation of African mtDNA for this purpose -- the purpose of finding matches for a large fraction (say, more than 90 percent) of African-American mtDNA haplotypes that originated in Africa (there are of course a substantial fraction whose recent maternal ancestry originated somewhere else).
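A back-of-the-envelope sketch shows why rare haplotypes demand such large databases. It assumes the database is a simple random sample from one well-mixed population, which real African sampling certainly is not, and the haplotype frequencies are hypothetical. Only the database size of 3725 comes from the study.

```python
import math


def prob_in_database(freq: float, database_size: int) -> float:
    """Chance that a haplotype at population frequency `freq` appears at
    least once among `database_size` randomly sampled sequences."""
    return 1.0 - (1.0 - freq) ** database_size


def database_size_needed(freq: float, target: float = 0.9) -> int:
    """Smallest random sample giving at least a `target` chance of
    containing a haplotype at population frequency `freq`."""
    return math.ceil(math.log(1.0 - target) / math.log1p(-freq))


DATABASE_SIZE = 3725  # size of the database reported in the study

for freq in (1e-2, 1e-3, 1e-4, 1e-5):  # hypothetical haplotype frequencies
    seen = prob_in_database(freq, DATABASE_SIZE)
    needed = database_size_needed(freq)
    print(f"freq={freq:g}  P(in database)={seen:.3f}  "
          f"sample for 90% chance={needed}")
```

Under these assumptions the required sample grows roughly in inverse proportion to the haplotype's frequency, so each tenfold rarer haplotype needs about ten times the database. Population structure would make the real requirement larger still, since a haplotype can be common in one unsampled group and absent elsewhere.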

One of the features of the symposium is a discussion of the relevance of ancestry testing. Jackson is an expert in this field and well-recognized -- she appeared in several of the "African-American Lives" episodes, for example.

With several companies and organizations now offering various kinds of ancestry tests, these have become increasingly affordable. But the results are often confusing; people don't know how to interpret them. Some of that confusion was evidenced in questions here at the symposium -- as part of a year-long discussion group, several local people submitted cheek swabs for ancestry interpretation. The results are often poor, because the sampling of recent populations is inadequate to really answer many questions. Where were today's populations 300 years ago? Have we adequately sampled the variation of present populations?

Research like Jackson's has shown that even widespread and numerous samples provide a real poverty of information about mtDNA diversity. The situation is vastly worse if we turn to autosomal variation, because the samples are smaller and more scattered.

Of course, for many anthropological purposes, the samples we have today are tremendously useful. My work on recent selection, for example, has made leaps and bounds on samples of a few hundred individuals.

But the converse case -- you take a person and ask whether you can diagnose their origin -- that task requires much larger samples to gain any statistical confidence in the general case. There may be specific haplotypes that are highly specific as to their present distribution -- but then, all of those are rare haplotypes, and you have to be lucky enough to have it within the comparative sample that the organization or company has gathered.

I'm still listening here and some of the later presentations will touch on the issues of genetic ancestry testing more directly. But I thought I would share a quote I really liked, with which Jackson ended her comments:

I'm not against genetic ancestry testing. It's fun. But in the final analysis, you have to look in the mirror, and you decide who you are.

Related posts:

Skip Gates discovers that genetic tests don't mean what he thought they meant.

Anne Wojcicki from 23andMe comments on genomics and race

Unintended consequences of genetic ancestry tests

References:

Ely B, Wilson JL, Jackson F, Jackson BA. 2006. African-American mitochondrial DNAs often match mtDNAs found in multiple African ethnic groups. BMC Biology 4:34. doi:10.1186/1741-7007-4-34

Jackson FLC. 2008. Ethnogenetic layering (EL): an alternative to the traditional race model in human variation and health disparity studies. Ann Hum Biol 35:121-144. doi:10.1080/03014460801941752

Darwin's mitochondria

I'm always skeptical when pathologists attempt to diagnose the ills of historical figures. Even if there are medical records or abundant attestations of symptoms from contemporary sources, people in the past had different ways of describing the observations that doctors today collect.

But that doesn't stop people from trying. Last month, John Hayman published a paper in the British Medical Journal that claims a new diagnosis for the lifelong malady that Darwin described in his own journals and correspondence:

Darwin’s symptoms are those of cyclical vomiting syndrome. Although this is primarily a disease of children it may persist into adulthood or may appear for the first time in adulthood. The disease is related to classic migraine and abdominal migraine but is also linked to abnormalities of mitochondrial DNA, with mutations in the MTTL1 gene. This disease is neither well known nor well recognised, particularly in adults, although it was first described in the English literature in 1882.

People with cyclical vomiting syndrome experience abdominal, circulatory, and cerebral symptoms, including headaches and anxiety. Symptoms overlap with those of classic and abdominal migraine, except for a lack of aura. Affected people may experience some or all of these symptoms, with each individual having similar symptoms with each episode. Over time, however, progression or change may occur in the most prominent feature, and episodes may coalesce. Many people report severe motion sickness, and this may be associated with a full episode.

It seems plausible enough, as much so as any retrospective diagnosis could probably be. It bears all the drawbacks of other attempts to diagnose historical figures.

A test of the hypothesis: mtDNA is maternally inherited and haploid, so symptoms are very likely to be shared by maternal relatives:

Darwin’s mother Susannah died with abdominal pain when he was 8. As a child she had vomiting and boils, experienced motion sickness, had excessive sickness during pregnancies, and "was never quite well." Her younger brother Tom had similar symptoms, with headaches, abdominal pains, and motion sickness. A sister, Sarah, considered that Charles and his uncle Tom had the same illness. Evidence of a matrilineal inheritance pattern is good, consistent with an abnormality of mitochondrial DNA.

It's a sad thing to affect a family.

Mitochondrial disorders are increasingly recognized as causes of chronic disease -- just the other day, a new study implicated defective mitochondria as causes of Parkinson's Disease. I think it's hilarious because there is a cadre of geneticists who depend on the notion that mtDNA is a neutral marker of population history.

Darwin's DNA has nothing to say about whether it was neutral on an evolutionary timescale, but every famous mtDNA functional mutation reminds us that there is biological function there, which is a target of selection under some circumstances.

(via Why Evolution Is True)

References:

Hayman JA. 2009. Darwin's illness revisited. Br Med J 339:b4968. doi:10.1136/bmj.b4968


Mailbag: mtDNA mutation rates

Regarding frozen penguin mtDNA and mutation rates:

I read your comments on the ancient penguin DNA post on your blog, and the follow up where you mentioned that you don't think this study would push back the speciation date between humans and the other primates.

But would this have an impact on the "recent out of Africa origin of modern humans theory"? Doesn't the penguin study call into question the estimated date of mitochondrial eve at 160ky BP and Y-chromosomal Adam at 60ky BP?

The basic question is how the "phylogenetic rate" relates to the "mutational rate". The mutational rate can be measured by comparing mothers and daughters, and it turns out to be very high. If these mutations were not under selection, each would be equally likely to reach fixation, and so the "phylogenetic rate" would equal the mutation rate.

But we know that it doesn't, because chimpanzee mtDNAs are an awful lot more like us than they would be if they had diverged at the mutational rate. So, purifying selection must eliminate a large fraction of the new mutations that happen between mothers and daughters.

So far, so good. The question is whether purifying selection works in a few generations, or if it takes many thousands of years.

We know that human mtDNAs have an ancestor, but the time that we estimate for that ancestor depends on the pattern of selection -- fast or slow, strong or weak? If purifying selection was very strong, then the only mutations that survive for very long must be neutral. Counting differences between populations will give us the same answer today as tomorrow, and the same differences would still be there after five million years -- the phylogenetic rate, in other words.

If purifying selection was weak, then many of the mutations we see between populations today may be deleterious, ultimately bound for extinction. So counting differences today is an overestimate of how different the populations will be tomorrow. Something between the mutational and phylogenetic rates.
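The relationship between the two rates described above can be put as a small numerical sketch. It covers only the two simple extremes: every new mutation is neutral, or some fraction is neutral and the rest is removed quickly by strong purifying selection. The mutation rate, the population sizes and the neutral fraction are made-up values for illustration, and the function name is hypothetical.

```python
def substitution_rate(mutation_rate: float, n_copies: int,
                      neutral_fraction: float = 1.0) -> float:
    """Long-term (phylogenetic) substitution rate per site per generation,
    assuming non-neutral mutations are eliminated and never fix."""
    # New neutral mutations entering the population each generation.
    new_neutral_mutations = n_copies * mutation_rate * neutral_fraction
    # A new neutral mutation starts as one copy among n_copies,
    # so its chance of eventually fixing is 1 / n_copies.
    fixation_probability = 1.0 / n_copies
    return new_neutral_mutations * fixation_probability


MU = 1e-6  # hypothetical mutation rate per site per generation

for n_copies in (1_000, 100_000):
    all_neutral = substitution_rate(MU, n_copies)
    mostly_selected = substitution_rate(MU, n_copies, neutral_fraction=0.2)
    print(f"copies={n_copies:<7} all neutral={all_neutral:.2e}  "
          f"20% neutral={mostly_selected:.2e}")
```

Population size cancels out, so with no selection the phylogenetic rate equals the mutation rate. With strong selection it equals the mutation rate times the neutral fraction. The weak-selection case, where deleterious variants linger for a while before disappearing, falls between the two and is not modeled here.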

There has been quite a bit of literature on this topic in humans, so I don't think the penguins add much except as a convenient example. I can't say we really know the answer. There's a lot of circular logic in the literature -- people trying to estimate the rate based on assumptions about when humans reached India, for example, when those assumptions are themselves derived from mtDNA. It's a mess.

The problem is worse for the Y chromosome, because the best estimates of the human most recent common ancestor are on the order of 50,000 years ago -- clearly too young, but we don't know how much too young.

Ancient penguin mtDNA and substitution rates

Here's an example of a really incomprehensible press release:

Ancient penguin DNA raises doubts about accuracy of genetic dating techniques

Penguins that died 44,000 years ago in Antarctica have provided extraordinary frozen DNA samples that challenge the accuracy of traditional genetic aging measurements, and suggest those approaches have been routinely underestimating the age of many specimens by 200 to 600 percent.

In other words, a biological specimen determined by traditional DNA testing to be 100,000 years old may actually be 200,000 to 600,000 years old, researchers suggest in a new report in Trends in Genetics, a professional journal.

You can see why I'm interested -- the Neandertal genetic samples are in the neighborhood of 44,000 years old, so if ancient DNA is saying something unusual about penguins, it might say something unusual about them, right? But what are they talking about here? Racemization? I mean, there are no "genetic dating techniques" for specimens! The rest of the release doesn't clarify matters very much, although it does say that the findings

may force a widespread re-examination of determinations about when one species split off from another, if that determination was based largely on genetic evidence

That sounds like an argument that penguin sequences didn't evolve at the rate one might estimate from a molecular clock based on penguin systematics. The quotes from the researchers involved do include the words "molecular clock", which is a good sign.

Well, enough of this, let's go straight to the research.

High mitogenomic evolutionary rates and time dependency

Using entire modern and ancient mitochondrial genomes of Adélie penguins (Pygoscelis adeliae) that are up to 44000 years old, we show that the rates of evolution of the mitochondrial genome are two to six times greater than those estimated from phylogenetic comparisons. Although the rate of evolution at constrained sites, including nonsynonymous positions and RNAs, varies more than twofold with time (between shallow and deep nodes), the rate of evolution at synonymous sites remains the same. The time-independent neutral evolutionary rates reported here would be useful for the study of recent evolutionary events.

Their sample includes 12 modern Adélie penguins and 8 ancient ones, two of which are from the maximum time interval, although some are only around 250 years old. Now, the age distribution of the rest is fairly important to their analysis, but I can't see it because it's hidden in a data supplement, and I'm reading this in a laundromat in Vienna with no internet access.

You see why I don't like these freaking online supplements? I'm in the middle of Europe and inconvenienced. Imagine if some penguin enthusiast in an underdeveloped country, with no subscription to the journal, got this paper in an e-mail attachment. They'd never be able to get a copy of the methods.

There are several problems estimating substitution rates with data like these penguin mitochondria. You really depend very strongly on neutral demographic history -- if there were big population movements or partial replacements among the penguins, the estimation of rate is totally confounded by these. The paper refers to prior work on mammoth ancient mtDNA:

A previous study on the mitochondrial genomes of the extinct mammoth also suggests that the rate based on internal calibrations (within mammoths) is ~1.6 times higher than that obtained using the external (i.e. mammoth–elephant) calibration.

...which raises a similar issue -- since the mammoths apparently did undergo a partial population replacement (or at least, an mtDNA replacement) across part of their range.

Also, you depend very strongly on the few most ancient specimens, because they sample the longest time interval. Which means, you need to know the date of these specimens with great accuracy and you need to place them accurately on the genealogy that connects the more recent specimens.

I think the biggest hangup is the genealogy. You can't assume that a 44,000-year-old penguin is a direct ancestor of any living mtDNA sequences. It's a relative, at some distance, possibly a member of an extant clade, possibly not. When we're talking about fossils that are tens of thousands of years old, it becomes very likely that most of the branches connecting with living sequences will have coalesced into very few ancient branches, and it becomes progressively less likely that you will discover a representative of one of those actual ancestral branches. In other words there's an error intrinsic to the coalescent process that really can't be corrected by sampling more extant lineages.

So you can't just convert sequence differences into substitution rates without a model involving some pretty strong assumptions.

The paper mentions two very well-known issues concerning the relationship of substitution rate, purifying selection, and saturation. Basically, deleterious mutations can hang around within a population for a while, so that a genetic sample from a living population will tend to overestimate the substitution rate. And long-term comparisons of distinct taxa may include so much time that multiple substitutions may have happened at the same site -- leading to an underestimate of substitution rate. These are the reasons, for example, why the number of mitochondrial mutations between mothers and their daughters is much higher than you would estimate from the number of differences between humans and chimpanzees.
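The saturation half of that argument can be illustrated with the Jukes-Cantor (JC69) model, the simplest textbook correction for multiple hits. This is a sketch, not the method used in the paper: JC69 assumes equal base frequencies and equal rates among all sites, which mtDNA does not satisfy, and the divergence values below are arbitrary.

```python
import math


def jc69_observed_fraction(true_subs_per_site: float) -> float:
    """Expected proportion of sites that differ between two sequences once
    `true_subs_per_site` substitutions per site have accumulated (JC69)."""
    return 0.75 * (1.0 - math.exp(-4.0 * true_subs_per_site / 3.0))


def jc69_corrected_distance(observed_fraction: float) -> float:
    """Invert the JC69 relationship: estimate substitutions per site from
    the observed proportion of differing sites (must be below 0.75)."""
    return -0.75 * math.log(1.0 - 4.0 * observed_fraction / 3.0)


for true_d in (0.01, 0.1, 0.5, 1.0, 2.0):  # arbitrary true divergences
    seen = jc69_observed_fraction(true_d)
    print(f"true={true_d:<5} observed={seen:.3f} "
          f"fraction of changes still visible={seen / true_d:.2f}")
```

At small divergences nearly every substitution is visible as a difference. As divergence grows, repeated hits at the same site hide a rising share of the changes, so a rate computed from raw differences between distant taxa comes out too low.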

What does this mean for the penguins? Or, more to the point, the Neandertals? Here's a short passage where the paper discusses the comparison:

By contrast, the synonymous substitution rate (0.054–0.073 s/s/My) estimated here is five to seven times higher than previous phylogenetic rate estimates [1–4] and significantly higher than those based on intra-specific comparisons within human (0.048–0.052 s/s/My) [14] and Neanderthal (0.036–0.042 s/s/My) [24] populations. These results clearly argue against the use of the classical 1% rate per lineage (or the ‘2% rule’ as it is commonly known) to study the evolution or genetics of individual species.

Well, the penguin rate may be significantly higher than the within-species human rate estimate, but it's not very much higher -- a minimum of 0.054 compared to a maximum of 0.052. So I don't think there is anything to get very exercised about with respect to ancient human DNA or Neandertal DNA.

Unless you really are trying to use DNA like some sort of radiocarbon method. But that would be silly.

References:

Subramanian S, Denver DR, Millar CD, Heupink T, Aschrafi A, Emslie SD, Baroni C, Lambert DM. 2009. High mitogenomic evolutionary rates and time dependency. Trends Genet 25:482-486. doi:10.1016/j.tig.2009.09.005

So what's the baboon DNA doing in that rare monkey species, anyway?

The researchers are looking into whether the baboon DNA has given the kipunji any survival advantages and could possibly explain why roughly 1,000 of the monkeys live in the Southern Highlands (the population having baboon DNA) compared with just 100 in the Udzungwas.

Once again, for most primates we only know anything at all about the variation of this one genetic locus.

The Finnish line

A new paper by Jukka Palo and colleagues investigates the population history of Finland:

The Finnish population in Northern Europe has been a target of extensive genetic studies during the last decades. The population is considered as a homogeneous isolate, well suited for gene mapping studies because of its reduced diversity and homogeneity. However, several studies have shown substantial differences between the eastern and western parts of the country, especially in the male-mediated Y chromosome. This divergence is evident in non-neutral genetic variation also and it is usually explained to stem from founder effects occurring in the settlement of eastern Finland as late as in the 16th century. Here, we have reassessed this population historical scenario using Y-chromosomal, mitochondrial and autosomal markers and geographical sampling covering entire Finland. The obtained results suggest substantial Scandinavian gene flow into south-western, but not into the eastern, Finland. Male-biased Scandinavian gene flow into the south-western parts of the country would plausibly explain the large inter-regional differences observed in the Y-chromosome, and the relative homogeneity in the mitochondrial and autosomal data. On the basis of these results, we suggest that the expression of 'Finnish Disease Heritage' illnesses, more common in the eastern/north-eastern Finland, stems from long-term drift, rather than from relatively recent founder effects.

So you've got a cline of genetic variation. How do you explain it? This paper reminds us that for a single locus there are always multiple explanations: asymmetric migration, natural selection, founder effect and population growth are the simple unicausal scenarios. Considering a cline by itself, there's no reason to prefer any of these except for assumptions that come from outside that gene -- maybe you know something about the history, maybe the gene's function gives you a clue.

If you're going to test these hypotheses with genes alone, then you need to sample multiple loci, and you need to make an adequate spatial sampling of the population. And when you do, sometimes the evidence points in a different way than you had expected.

References:

Palo JU, Ulmanen I, Lukka M, Ellonen P, Sajantila A. 2009. Genetic markers and population history: Finland revisited. Eur J Hum Genet 17:1336-1346. doi:10.1038/ejhg.2009.53

Razib posts some thoughts on how the study of human migration history has gotten more and more complex during the last fifteen years.

Sometimes I wonder if the period between the publication of The History and Geography of Human Genes and The Journey of Man, roughly from the mid-90s to the early 2000s, will be seen as a golden age for historical population genetics in hindsight. A few weeks ago I pointed to new data based on DNA extraction which really confuses the picture of how Europe was populated over the past 25,000 years. It seems the more data we get, the more interesting things get. In the late 1990s the emergence of powerful technologies to extract and amplify genetic material and sequence it shed light on several questions which had long tantalized researchers ever since Allan Wilson's group began to push the frontiers of molecular evolution in the 1970s. Where in the 1980s there was only the mitochondrial Eve story, by the year 2000 there was enough to go around for several books. The Journey of Man, Mapping Human History and The Seven Daughters of Eve all came out very close together chronologically. These scientists and writers knew that striking fast was imperative.

That's also when Colin Renfrew's "archaeogenetics" really got going, with a number of symposia and a couple of books. So what happened? As Razib points out, things got complicated -- we started adding more autosomal markers, and larger samples of mtDNA and Y chromosomes, and the trees didn't line up so cleanly. In retrospect this was predictable, as one-locus genealogies have so much variance that it's easy to confuse noise for signal.

Heck, it's not just hindsight, the problem was predicted at the time, by me and others!

Still, I've learned to appreciate science's self-correcting nature. Phylogeography grew into a serious science, not flawless, but driven increasingly by testing hypotheses instead of promoting "consistency" with them. This happened partly by extending the same genetic techniques to other species, where the basic modeling questions didn't have the same headline-grabbing emotional appeal as in humans.

Many shark-jumping moments of human genetics are still out there, waiting to be re-evaluated with new evidence.

"The worm in the fruit of the mitochondrial DNA tree"

François Balloux (2009) has a polemic in the online access area of Heredity presenting references about mtDNA selection, and arguing that the use of this single genetic marker is no longer warranted without support from other loci.

Yay! I've been saying that both here, and in peer-reviewed articles, for several years. I think serious workers know that one gene is not enough; two genes (mtDNA and Y chromosome, for example) aren't enough -- we have to integrate information across every possible source, genetic, skeletal, and anthropological, to really test hypotheses about the past.

Still, an industry of mtDNA sequencing has grown up, reviewing each others' grants and papers, and shutting down any discussion of adaptive changes. Balloux's commentary addresses this problem -- I'm going to quote the same paragraph as Dienekes:

Let us assume I gave a seminar. I would tell the audience about my latest results on the population history of the pigmy shrew. My findings would be based on a stretch of DNA comprising several metabolic genes, showing no signs of genetic recombination. Armed with sequences from a large number of individuals sampled over a broad geographical area, I would make some inference on the colonization routes and times. To make life easier, I would restrict my analysis to the mutations I liked best, with nice names having been given to related sequences, rather than relying on dull mathematical quantities. As I reach one of the key conclusions of the lecture, which would go as follows: 'It is obvious from the distribution of haplotypes Amanda, Eugenie* and Hector_2 that the Outer Hebrides were colonised about 50,000 years ago, this was followed by considerable population fluctuations, a bottleneck during the last Ice Age, a swift recovery and a dramatic recent expansion over the last 200 years and...'. Imagine that, at that climactic stage I was interrupted by someone in the audience. The impertinent would say, 'Sir, can I just ask you whether this confidence in your conclusions may not be misplaced; your analysis is based on a single genetic marker, which comprises genes with a central role in metabolism and is thus likely to have been affected by natural selection'. An awkward silence may ensue, as I would find it difficult to dismiss this criticism easily.

Well, let me tell you, I've been in dozens of audiences, and have raised that exact point. Here is a sample of the bogus responses I've gotten to this question:

Bogus answer 1: There are no functional differences between humans and chimpanzees in the mtDNA, so it can't have been selected during human evolution. False, false, false!

Bogus answer 2: Metabolic processes are highly conserved, and humans couldn't have changed much. Hello? Have you noticed that your breakfast didn't exist in the Paleolithic?

Bogus answer 3: But the pattern of variation can be equally explained by a bottleneck. Some aspects can, others can't so easily.

Bogus answer 4: We examined only noncoding parts of the mtDNA, so there could be no selection. Yes, believe it or not, this is the most common response. I guess they don't teach people about linkage anymore.

Bogus answer 5: There's little or no evidence of selection on any gene in recent human evolution. Human evolution may have stopped entirely. Oh, lord. Yes, I've gotten this one many times.

There have been others over the years. Yet mtDNA is a big business -- people seem to be worried that the slightest criticism will bring down the whole thing like a house of cards. That's not true. Even if mtDNA has sometimes been selected during human prehistory or history, that doesn't mean it isn't a useful marker for many purposes. But many seem more comfortable avoiding the issue entirely.

I think that taking the hypothesis of selection seriously would improve most of the work in this field. The possibility of selection doesn't eliminate demographic interpretation -- for example, the high ancient African mtDNA variation allows us to test hypotheses about African demography before 50,000 years ago, and there the data appear to reject the hypothesis of selection, at least after around 150,000 years ago. Gene genealogies don't allow us to see the whole past, just the time and forces that they experienced. If we ignore one of the major forces, we are reducing our knowledge.

There is an obvious problem testing the hypothesis of selection with mtDNA. When we consider any one single locus, it's always possible to find some demographic scenario that yields exactly the same predictions as selection. It's just a mathematical necessity -- selection is fundamentally a demographic phenomenon, and the increase in frequency of selected alleles looks similar to exponential growth of a small population.
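To make the resemblance concrete, here is a minimal sketch comparing the two trajectories. It assumes a simple haploid model in which the selected haplotype has a constant advantage `s` per generation; the function names, the starting frequency and the value of `s` are illustrative assumptions, not figures from any study discussed here.

```python
def selected_frequency(p0, s, generations):
    """Frequency of a haplotype with relative fitness 1 + s (haploid model).

    The odds p / (1 - p) are multiplied by (1 + s) every generation.
    """
    odds = (p0 / (1.0 - p0)) * (1.0 + s) ** generations
    return odds / (1.0 + odds)


def exponential_growth(p0, s, generations):
    """Share of a small subpopulation growing at rate s per generation
    while the rest of the population stays constant (valid while small)."""
    return p0 * (1.0 + s) ** generations


if __name__ == "__main__":
    # Hypothetical values: a haplotype starting at 0.1% with a 2% advantage.
    for t in (0, 50, 100, 200, 500):
        sel = selected_frequency(0.001, 0.02, t)
        grow = exponential_growth(0.001, 0.02, t)
        print(f"gen {t:3d}  selection={sel:.4f}  growth={grow:.4f}")
```

While the haplotype is rare the two curves are nearly indistinguishable. They diverge only when the selected haplotype becomes common, because its frequency is bounded at 1 and the growth curve is not.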

So what can we do? Fortunately we have lots of options. We can test the proposed demographic hypotheses against the historical record. When we make observations that show that people 1000 years ago had very different frequencies of common haplotypes, well, we know it was selection. There hasn't been any genetically significant bottleneck in the last 1000 years! When we see small Neolithic population samples dominated by haplotypes that are very rare today, again, no historically possible bottleneck could have caused that.

Balloux with his colleagues (2009) has shown that one aspect of mtDNA patterning -- the association of haplogroup diversity with geography -- is very unlikely to have arisen by genetic drift. Here's part of their abstract:

We show that populations living in colder environments have lower mitochondrial diversity and that the genetic differentiation between pairs of populations correlates with difference in temperature. These associations were unique to mtDNA; we could not find a similar pattern in any other genetic marker. We were able to identify two correlated non-synonymous point mutations in the ND3 and ATP6 genes characterized by a clear association with temperature, which appear to be plausible targets of natural selection producing the association with climate. The same mutations have been previously shown to be associated with variation in mitochondrial pH and calcium dynamics. Our results indicate that natural selection mediated by climate has contributed to shape the current distribution of mtDNA sequences in humans.

They took a dual approach to testing the hypothesis of selection. First, they modeled the evolution of haplotype diversity under neutrality, and showed that the empirical distribution lies significantly outside that range of results. But even so, we might imagine some bottleneck scenario that would cause low diversity in high-latitude peoples, and this would be difficult to refute historically because many of those populations have poor historical documentation. But demography should have similar effects on other genes, and they were able to show that the rest of the genome doesn't share the mtDNA pattern.

It's really not that hard to test demographic hypotheses, using comparative genomics and anthropological knowledge. That's what anthropological genetics should be doing more and more. There was a time when obtaining a reasonable sample of mtDNA was an accomplishment, and comparing that sample to other genes was not feasible. But that time is past, and hopefully the review process -- journals and grants -- will start demanding some integration of mtDNA phylogeography with results from the rest of the genome.

Back to Balloux's conclusion:

Exploiting these new resources of autosomal variation will present significant challenges, but it will not help overcoming them if a large fraction of the community of human population biologists persists in sticking to mtDNA as the marker of choice.

Mitochondrial DNA isn't the tip of the iceberg -- it's an ice cube on top of the tip of the iceberg.

Related:

"Mitochondrial DNA selection review"

"Mitochondrial DNA and sperm"

"mtDNA selection in Iceland?"

"Complete Neandertal mitochondrial sequence, and selection on human (not Neandertal) mtDNA"

"Did Neandertals need better mitochondria?"

"Has the dam broken on mtDNA selection?"

"Mitochondrial DNA adaptations in living human populations"

OK, that's enough related posts. But you can find a whole lot more by searching the topic!

References:

Balloux F. 2009. Mitochondrial phylogeography: The worm in the fruit of the mitochondrial DNA tree. Heredity (advance online): doi:10.1038/hdy.2009.122

Balloux F, Lawson Handley L-J, Jombart T, Liu H, Manica A. 2009. Climate shaped the worldwide distribution of human mitochondrial DNA sequence variation. Proc Roy Soc Lond B 276:3447-3455. doi:10.1098/rspb.2009.0752

Dienekes, on a new study of early Neolithic and earlier mtDNA variation in Europe:

This study is also a powerful argument against the idea of genetic continuity across long time spans. Most ancient DNA studies so far have reached a similar conclusion. Thus, it also destroys the supposed justification for continuity from Paleolithic Europe to modern times that early mtDNA work (of the Daughters of Eve variety) has proposed, hand in hand with the hunter acculturation hypothesis.

I'll be reading the study carefully and commenting this weekend.

Mitochondrial DNA and sperm

Tuesday, I referred to mtDNA and sperm evolution. The topic was covered in some detail in a 2004 review paper by Neil Gemmell and colleagues, entitled, "Mother's curse: the effect of mtDNA on individual fitness and population viability."

The basic idea:

1. Mitochondrial DNA is maternally inherited, so that the mtDNA germline leading to any male is female all the rest of the way back in time.

2. Hence, any male's mtDNA would have been subject to selection only for success in females, never males.

3. But male traits depend on the adequate function of mtDNA genes, possibly differently from females.

Sperm stand out as a male-only cell type that place special requirements on mitochondrial function. The midsection of each sperm cell is composed largely of mitochondria packed together to provide energy for the flagellum. In humans, several mitochondrial disorders are known to induce infertility by means of reducing sperm motility.

Mitochondrial DNA selection review

I was reading through an excellent review of the recent literature about mtDNA and selection, from Damian Dowling and colleagues (2008). The review focuses on the patterning of evidence for selection in ecological and phylogenetic terms, and to some extent upon the function of mtDNA or the mito-nuclear complex of proteins involved in oxidative metabolism. It includes a long passage covering the significant mismatch between mtDNA variation and effective population sizes across animals (but not mammals). A short section discusses the possibility of adaptive polymorphism maintained by mito-nuclear interactions:

Knowing that deleterious mutations in mtDNA can accumulate within populations because of genetic drift [21], there certainly seems to be scope for mito-nuclear co-evolution to proceed via a ‘compensatory’ model. Under this model, deleterious mutations accumulate in the mitochondrial genome, with selection then favouring an adaptive response in the nuclear genome to restore any compromised metabolic function [24]. In effect, mtDNA mutations will act as the drivers of adaptive evolution in nuclear genes. This scenario is not unlikely, given that more than 1000 nuclear-encoded proteins, which are essential for metabolism, are transported into the mitochondrion [25].

Additionally, given that at least some mtDNA polymorphism might have been shaped via positive selection [7] and [8], scope might also exist for mito-nuclear co-evolution to proceed via a model in which adaptive mutations in one genome select for a response in the other.

There has been recent interest in the coinheritance of sex chromosomes and mtDNA. Because the sex-determining chromosome is opposite in birds from mammals, a number of natural experiments may be available to examine the role of coevolution for the mtDNA and co-inherited sex chromosomes. Further, a number of studies have identified a substantial cytoplasmic contribution to fitness and lifespan variance in Drosophila, suggesting that adaptive variation in mtDNA may be segregating within populations.

The review discusses the possible importance of the adaptive perspective for aspects of biology ranging from life history and aging to speciation (where fast-evolving mtDNA genes may induce hybrid incompatibilities). And sperm are a surprising focus of research -- mtDNA mutations affect motility, fertility, and the outcome of sperm competition. On that topic, more later.

References:

Dowling DK, Friberg U, Lindell J. 2008. Evolutionary implications of non-neutral mitochondrial genetic variation. Trends Ecol Evol 23:546-554. doi:10.1016/j.tree.2008.05.011


mtDNA selection in Iceland?

Leave it to me to have readers unwilling to ignore selection in recent populations! Here's an e-mail:

Why couldn't the Icelandic genetic changes have been the result of selection that favored some mtDNA lineages rather than others? We know the population of Iceland derived from settlers that were transplanted into a relatively alien climate and ecology, and had to adjust agriculture and subsistence activity to survive there. We know that there were dramatic environmental insults to the population: disease, starvation, eruptions. At least some of these insults would have likely been more severe than the ancestral populations would have encountered, whether they were Scandinavian or "Celtic".

So why isn't there at least a token mention of selection, either by you or by the authors? Is "genetic drift" that much more likely than selection? Is selection a more academically risky proposition than the comforting mathematics of "drift"?

Genetic drift eliminated rare mtDNA haplotypes from Iceland

How powerful has genetic drift been in recent human evolution? That's the question I raised the other day with reference to the claim that a heart disease risk-inducing allele had become common by drift in India during the last 30,000 years.

Another paper released earlier this week in PLoS Genetics claims that mtDNA haplotypes have been recently lost from the Icelandic population by strong genetic drift. The evidence for such changes in haplotypes comes from sequencing the mtDNA of thousand-year-old skeletons unearthed in Iceland during the last 150 years. These ancient remains have haplotypes that are found elsewhere in Europe today, but not in Iceland. The conclusion is that the modern-day descendants of these early Iceland settlers have experienced genetic drift within the last 1000 years, relieving them of a load of rare mtDNA haplotypes.

Could genetic drift have accomplished this loss of haplotypes? Although the paper does not present any analysis of this question, a quick consideration of some theory will show that genetic drift could easily have caused the observed results. It also shows a contrast between this case and others where genetic drift has been described as "strong". Even in this case, on an island with a limited human population, genetic drift is only "strong" in the sense of eliminating alleles that are already quite rare in the population.
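A rough way to see this is a branching-process sketch, which is a standard approximation for rare neutral lineages rather than the analysis in the paper. It assumes each woman carrying the haplotype leaves a Poisson-distributed number of daughters with mean 1, and that 1000 years is about 40 generations of 25 years. Both the generation time and the copy numbers below are assumptions chosen for illustration.

```python
import math


def single_copy_loss(generations):
    """Probability that one neutral maternal lineage is extinct after
    `generations`, with Poisson(1) daughters per woman.

    Uses the generating-function recursion q[t+1] = exp(q[t] - 1), q[0] = 0.
    """
    q = 0.0
    for _ in range(generations):
        q = math.exp(q - 1.0)
    return q


def loss_probability(copies, generations):
    """Probability that a haplotype starting with `copies` independent
    carriers has vanished after `generations`."""
    return single_copy_loss(generations) ** copies


if __name__ == "__main__":
    # Assumed: 1000 years / 25 years per generation = 40 generations.
    for copies in (1, 5, 10, 50, 100):
        print(copies, round(loss_probability(copies, 40), 3))
```

Under these assumptions a haplotype carried by a single founding woman is lost roughly 95% of the time within 40 generations, while one carried by 100 women is lost only on the order of 1% of the time. That is the sense in which drift is "strong" only against haplotypes that were already rare.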

Double the migrations, double the fun

Several news stories have reported on an article by Ugo Perego and colleagues, titled "Distinctive Paleo-Indian Migration Routes from Beringia Marked by Two Rare mtDNA Haplogroups." The Discover blog, 80beats, has a good two-paragraph summary of the results:

In the study, published in Current Biology [subscription required], a team led by geneticist Antonio Torroni analyzed entire genomic sequences of mitochondrial DNA, the genetic material in cells’ energy-generating units that gets passed from mothers to children…. The researchers focused on the disparate geographic distributions of two rare mitochondrial DNA haplogroups — which are characterized by a distinctive DNA sequence derived from a common maternal ancestor — that still appear in Native Americans [Science News]. Both haplogroups appear to have arisen about 16,000 years ago.

The researchers found that all the people with the D4h3 haplogroup presently live in South America, while those with the X2a haplogroup live in Canada and the United States, which suggests that the two genetically distinct bands of early humans struck off in different directions around 16,000 years ago.

I don't have a lot to say about this. Tracking the frequencies and geographic distribution of rare haplotypes poses different issues than doing so for common alleles. Two closely related populations might nevertheless differ in the presence or absence of rare alleles.

I really just wanted to post with reference to a broader point. If the data don't distinguish between a single migration at one time and multiple migrations at different times, then it's pretty much certain that they won't distinguish between a single migration and multiple migrations at one time.

The two-simultaneous-migrations model might solve problems so far unaddressed by other models. But it's not obvious that it solves any -- there's no test here, just a discussion of the plausibility of the scenario. Each of these scenarios for New World habitation involves the dispersal of many populations across thousands of years. That means lots of free parameters, even in the simplest of the models. Given that necessary complexity, it seems pretty likely that there's a way for the simplest model to account for the frequencies of two rare alleles. It will take a whole lot more genetic comparisons to really test hypotheses about the founding population.

References:

Perego UA and 15 others. Distinctive Paleo-Indian Migration Routes from Beringia Marked by Two Rare mtDNA Haplogroups. Curr Biol 19:1-8. doi:10.1016/j.cub.2008.11.058

Glucose metabolism and memory

Roni Caryn Rabin reports on a study linking blood glucose spikes to age-related memory decline:

Researchers said the effects can be seen even when levels of blood sugar, or glucose, are only moderately elevated, a finding that may help explain normal age-related cognitive decline, since glucose regulation worsens with age.

I wonder if this physiological link may also underlie the association between some mtDNA haplogroups and Alzheimer's. In this case, the paper is able to analyze functional consequences of glucose levels because the authors were able to manipulate glucose levels in experimental animals. Remember this:

Previous observational studies have shown that physical activity reduces the risk of cognitive decline, and studies have also found that diabetes increases the risk of dementia. Earlier studies had also found a link between Type 2 diabetes and dysfunction in the dentate gyrus.

Here the causality is not necessarily clear. Maybe people who have healthy metabolic profiles are more likely to be active and less likely to exhibit cognitive declines. In that scenario, you wouldn't necessarily benefit from changing your activity pattern.


The unbearable hotness of Neandertals

According to the Telegraph (UK), Neandertals became extinct because their mitochondria leaked excess heat:

Professor Patrick Chinnery, a neurogeneticist at Newcastle University, believes the differences in this mitochondrial DNA could have caused Neanderthals to be inefficient at producing energy, meaning their cells leaked heat.

He said: "The question is why did Neanderthals disappear? There are lots of explanations to do with changes in climate and the food supply.

"Differences in these mitochondrial DNA sequences might explain why modern humans were able to survive while Neanderthals were not.

So, is it true? Did Neandertals go panting into that long good night?

Well, Siberians may have mtDNA alleles that leak extra heat, and they're not extinct. It seems like a good idea if you don't live in the tropics and have enough food. Because it's not like Europeans lack opportunities to take off a layer to deal with the heat.

Plus, as the Neandertal morphology waned in Europe, the climate was getting colder, not warmer.

The real mistake here is assuming that the mtDNA necessarily shared the same fate as the rest of the genome. Sure, there aren't any living members of the Neandertal mtDNA clade, at least that we know of. But that suggests selection favoring human mtDNA, not necessarily Neandertal extinction. The idea of selection is supported by the finding of functional variations between human and Neandertal COX2, which I discussed in August.

The current research seems like it probably adds detail to this comparison, but that's not an argument for Neandertals going extinct in the heat.

Poor Ötzi's doomed mitochondria

I must have seen a dozen stories today that started this way:

Reuters:

"Otzi," Italy's prehistoric iceman, probably does not have any modern day descendants, according to a study published Thursday.

Washington Post:

Sparking a new mystery about early man, Italian scientists have unraveled the DNA of the 5,300-year-old "Iceman" mummy, only to discover that he doesn't appear to have modern descendants anywhere near where he was found in Europe.

I didn't really think about how funny that line is, until I was talking to someone about it this evening -- there's no way that the mtDNA can be informative about Ötzi's descendants, because, well, it's maternally inherited. D'oh!

Meanwhile, we can ask what it means that a randomly picked Neolithic man would have a now-extinct mtDNA lineage. According to ScienceNews, Antonio Torroni thinks that it is a case of marker loss:

It’s possible but unlikely that Ötzi belonged to a fourth branch of K1 that is now extinct or rare, Torroni says. He considers it more probable that a random mutation in the Iceman’s mitochondrial DNA erased the only genetic marker currently used to identify members of the most common K1 branch.

Could be.

Whether the sequence was a unique branch of K or a slight variation on a well-represented subtype, there's a natural hypothesis for why it no longer exists, that somehow is mentioned in none of the reports. The Iceman is hardly singular: Remember that the mtDNA pool of Central Europe in the Neolithic was dominated by lineages that are now rare. And Medieval Danes had several mtDNA sequences that are now rare or absent in Scandinavia. And the Cambridge sequence has been increasing in frequency in Britain since medieval times. And so on.

There's no mystery here. These are large populations, and mtDNA haplogroups have been changing in frequencies between ancient DNA samples and the present. MtDNA has functions that plausibly were subject to changing environments after the Neolithic. This seems like a good candidate for recent selection.

UPDATE (2008-10-31): A reader writes:

Hi John,

You raised the hypothesis that recent selection might explain the apparent dearth of modern examples of his haplotype, with private mutations at 3513T and 8137T. Those mutations are synonymous, which would seem to rule that out, leaving drift as the better alternative.

That's a good point, and one often raised as a criticism of the selection hypothesis. If a sequence differs from some extant (still-existing) variant by only synonymous mutations, then selection can't explain why one is gone and one is still here. Only genetic drift can explain the extinction of one and the survival of the other.

But this is not the entire story. In this instance, we have synonymy with one major haplogroup (K) which has coding differences from other haplogroups. Those haplogroups have been changing in relative frequencies in Europe over the last few thousand years. A decrease in the frequency of K would naturally cause rare K variants to become rarer or extinct, even though they are neutral with respect to each other. In this instance, lineage extinction would be the result of selection, even though the extinct lineages have no disadvantage relative to some that still survive.

Now, is that the case with Ötzi's haplotype? I would say it's a good hypothesis, but not yet testable. We really would like to know the frequency of the haplogroup in the Neolithic. For that matter, further ancient DNA sampling will test many hypotheses of genetic drift and selection, because direct observation of ancient frequencies gives us a source of information that does not depend on sampling models from living populations.

Sample sizes and the "Neandertal haplogroup"

I have an excellent e-mail question about last week’s Neandertal mtDNA paper, which has provoked a lot of commentary.

I just skimmed over your comments on the recent paper and I have a couple questions. First, how many Neanderthals did they receive mitochondrial DNA from? I think I read somewhere that it was fewer than ten.

Second if that is true, what the hell does it mean? I wouldn’t try and predict anything based on even fifty humans from that long ago much less 8 or 9 in genetic terms. I don’t think that anyone else would either unless they are grandstanding. You can’t prove a negative so they really can’t say that no modern humans have any Neanderthal DNA. Did they study Neanderthals from Asia? I just think they don’t have a good enough sample and until we can resequence a Neanderthal nucleus and bring the little tyke to term and wait for him or her to marry then wait for those kids to have kids will we really be sure we’ve got the goods.

Krause et al. (2007) list 15 Neandertal partial mtDNA sequences. Ten of these at that time presented relatively long portions, including the central Asian Okladnikov and Teshik Tash specimens, Mezmaiskaya, Feldhofer 1 and 2, Vindija 75 and 80, Scladina, Monte Lessini, and El Sidrón 1252. The same paper lists five additional specimens for which only a very short sequence had been recovered (just enough to diagnose as part of the Neandertal clade), including Vindija 77, El Sidrón 441, Engis 2, Rochers de Villeneuve, and La Chapelle-aux-Saints.

We do not know that every Neandertal belonged to the same mtDNA clade as those 15 sequences. Some of them may have looked different, possibly including the new clade otherwise present in later Upper Paleolithic and living people. But based on the 15 sequences we have, we can say that a large fraction of Neandertals must have carried the “Neandertal haplogroup.” Exactly how large a fraction depends on what we are willing to believe about contamination, preservation, and the randomness of our sample.

Now, let’s consider the question: Can we predict anything about Neandertal evolution and relationships based on this small, possibly unrepresentative sample of mtDNA?

The answer is that it doesn’t matter very much whether we have 5 sequences or 500. If 15 out of 15 specimens from different sites across Europe preserve a single mtDNA haplogroup, we can’t say it was universal, but we can say it was common. If 40 out of 50, or 400 out of 500 specimens had the same haplogroup, that would increase the precision, but not change the basic fact: Neandertals had at least one common haplogroup that is now so rare it has never been found in a sample of 100,000 or more people. We deserve some explanation.

The possible explanations are:

  1. Random genetic drift
  2. Accelerated genetic drift due to demographic turnover
  3. Population extinction and replacement
  4. Natural selection


Drift

Random genetic drift is fairly easy to refute, although it might not appear so at first. In favor of drift: There were few Neandertals, and the population size of the succeeding Upper Paleolithic, up through the Last Glacial Maximum, was also small: the best estimates are on the order of 2000 people for Western Europe and 5000 for continental Europe to the Urals (Bocquet-Appel et al. 2005). There would have been perhaps twice or more that number across the entire Neandertal range. The effective population size represented by this population would have been smaller; perhaps 3000–5000 for Neandertals and Aurignacian-era people, only half, or around 2000, females. Genetic drift in this small mtDNA population would have been much stronger than for autosomal genes, and very much stronger than in most recent human populations.

But when we plug these numbers into a model of random genetic drift, it starts to appear very unlikely that drift alone could explain the observations. Let’s assume (falsely) that our Neandertal genetic samples all dated to 40,000 years ago, that the female effective size was 2000 individuals between then and 15,000 years ago, and that the population of Neandertal country was a random mating pool. Following these assumptions, on average all the mtDNA genomes at 15,000 years ago would descend from only 4 or 5 ancestral copies in the population 40,000 years ago. If these five ancestral copies were, by chance, a different haplogroup from the 15 copies we’ve already found, then drift could explain the data.
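The "4 or 5 ancestral copies" figure can be checked with a back-of-envelope coalescent sketch. The version below uses a deterministic approximation to the expected number of ancestral lineages (solving dn/dt = -n(n-1)/2 with time in units of N generations), not an exact calculation, and it assumes a generation time of 25 years, which is not stated in the text.

```python
import math


def expected_lineages(n0, generations, n_female):
    """Approximate number of ancestral mtDNA lineages remaining after
    tracing `n0` present-day copies back `generations` generations in a
    random-mating population of `n_female` females.

    Deterministic approximation: n(t) = 1 / (1 - (1 - 1/n0) * exp(-t/2)),
    where t = generations / n_female is time in coalescent units.
    """
    t = generations / n_female
    return 1.0 / (1.0 - (1.0 - 1.0 / n0) * math.exp(-t / 2.0))


if __name__ == "__main__":
    years = 40_000 - 15_000          # interval described in the text
    generation_time = 25             # assumed, in years
    gens = years // generation_time  # 1000 generations
    print(expected_lineages(n0=2000, generations=gens, n_female=2000))
```

With 2000 females and 1000 generations the approximation lands between 4 and 5, in line with the figure quoted above. A shorter generation time or a smaller effective size would push the number lower still.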

However, this still doesn’t appear very likely. So far, every one of the Neandertals shares a single haplogroup. The frequency of this haplogroup was apparently very high, making it very unlikely that all five ancestral copies would have belonged to some other haplogroups of which we have never found any trace.
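To see where a figure like "4 or 5 ancestral copies" can come from, here is a minimal sketch using a standard coalescent approximation. It is not necessarily the calculation behind the number quoted above. The 25-year generation time is my assumption, as are the function and variable names; the female effective size of 2000 and the 40,000 to 15,000 year interval come from the text.

```python
import random

def lineages_remaining(n_start, t_coalescent, rng):
    """Count ancestral lineages left after t_coalescent units of N generations."""
    k = n_start
    elapsed = 0.0
    while k > 1:
        # With k lineages, the wait until the next merger is exponential
        # with rate k*(k-1)/2 when time is measured in units of N generations.
        elapsed += rng.expovariate(k * (k - 1) / 2.0)
        if elapsed > t_coalescent:
            break
        k -= 1
    return k

FEMALE_NE = 2000            # female effective size, from the text
YEARS = 40_000 - 15_000     # interval considered in the text
GENERATION_YEARS = 25       # assumption: not stated in the text

# Elapsed time in coalescent units (generations divided by female effective size).
t = (YEARS / GENERATION_YEARS) / FEMALE_NE

rng = random.Random(1)
# Start from the whole female population and trace lineages back in time.
counts = [lineages_remaining(FEMALE_NE, t, rng) for _ in range(500)]
mean_ancestors = sum(counts) / len(counts)
print(round(mean_ancestors, 1))
```

Under these assumptions the average number of surviving ancestral lineages should land in the neighborhood of four or five. A longer generation time or a larger effective size would leave more ancestors, and a shorter generation time fewer.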

Notice that this argument does not depend very much on the number of Neandertal mtDNA sequences that we have found. The fact that there are 15 helps to constrain the frequency of the haplogroup within the population 40,000 years ago, in our model. That frequency is unlikely to be less than around 85%, assuming random sampling. But suppose there were only five. We would still know that the Neandertal haplogroup was very common in its population, even if we thought it was only 50%. It would still be unlikely to draw four or five ancestral copies and have all of them be some other haplogroup that we haven’t found.
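The arithmetic behind this can be sketched in a few lines. This is a rough illustration assuming random, independent sampling (which, as noted, is not really true of the Neandertal specimens); the function names and the 95% confidence level are my own choices.

```python
def frequency_lower_bound(n_all_same, alpha=0.05):
    """Smallest haplogroup frequency p for which seeing n of n copies
    is still plausible at level alpha, i.e. the p solving p**n == alpha."""
    return alpha ** (1.0 / n_all_same)

def prob_all_ancestors_other(frequency, n_ancestors=5):
    """Chance that every one of n_ancestors randomly drawn ancestral copies
    belongs to some haplogroup other than the common one."""
    return (1.0 - frequency) ** n_ancestors

for n in (15, 5):
    p = frequency_lower_bound(n)
    print(n, round(p, 2), prob_all_ancestors_other(p))

# Even at a frequency of only 50%, five ancestors all missing the
# common haplogroup is about a 3% event.
print(prob_all_ancestors_other(0.5))
```

With 15 of 15 specimens the lower bound comes out at about 82%, close to the figure given above; with only 5 of 5 it drops to about 55%, and drawing five ancestors that all lack the haplogroup is still improbable.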

This gives us a considerable confidence margin against drift. We need it. After all, the Neandertals were not randomly sampled at a single time, and it is possible that some of them actually carried a human-like mtDNA sequence, which we now falsely interpret as contamination. But even with these shadows hanging over us, it would still be unlikely that none of the ancestors of today’s mtDNA variation were like the Neandertal haplogroup.

Also, the population was not a random-mating pool. When we add geographic structure to the story, which tends to reduce the importance of genetic drift, we find that the probability that drift alone explains the observations is almost zero. It also remains very unlikely that a single migration of modern humans interbreeding with Neandertals under random drift could explain them (Currat and Excoffier 2004).

Extinction

It is at this point that most geneticists turn to the hypothesis of complete Neandertal extinction. They have a point. Genetic drift apparently cannot explain what we have observed. In their point of view, if genetic drift alone cannot explain the Neandertal mtDNA disappearance, then the only other random process at hand is extinction.

I think that hypothesis is false. It does not account for morphological similarities between Neandertals and later people, genetic evidence that suggests a strong ancient population structure with introgression, or the apparent behavioral continuity in the Upper Paleolithic.

Happily, I don’t have a commitment to random processes. Instead, I think that the mtDNA evolution of Europe was driven by nonrandom processes of demographic turnover and selection.

Demographic turnover

Here we come to an important point. No one believes that later Europeans evolved from earlier Neandertals by a random process of genetic drift. Yet that is precisely the hypothesis that most studies have set up to refute. Without question it is valuable to set up boundary conditions under the hypothesis of random genetic drift. But the time has come to investigate more interesting models.

Personally, I am surprised that more complicated metapopulation dynamics have not gotten more attention as an explanation for the Neandertal mtDNA results. Population sources and sinks are a hot topic in biology, and you would think that anthropologists would have picked up on this. To my knowledge, the only time anyone has examined a population sink model was in 2001, when Milford Wolpoff and I worked with mathematician Per Enflo on such an idea for Neandertals (Enflo et al. 2001). This idea deserves a fuller treatment (I think I’ll suggest it as a project for one of my classes this year!).

In a nutshell, a population sink is a region where the average rate of reproduction is below replacement levels. This region can remain populated only if individuals migrate in from other places. The places that reproduce above replacement are called population sources. The continual migration from sources to sinks creates a genetic gradient. Individuals sampled at any given time in the population sink are overwhelmingly likely to have ancestors not in the sink but in one or more source populations.
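A toy calculation shows how quickly a sink loses its local ancestry. This is only a sketch under simplifying assumptions of my own: the sink stays at constant size, migration is one-way from source to sink, and a fixed fraction of each generation consists of immigrants. It is not the model of Enflo et al. (2001), and the migrant fractions are hypothetical.

```python
def prob_local_ancestry(migrant_fraction, generations):
    """Chance that a lineage sampled in the sink still traces to an ancestor
    inside the sink after going back the given number of generations.

    Each generation back, the lineage stays in the sink only if that
    individual's parent was a local rather than an immigrant.
    """
    return (1.0 - migrant_fraction) ** generations

# Hypothetical immigrant fractions per generation, over 1000 generations.
for m in (0.001, 0.01, 0.05):
    print(m, prob_local_ancestry(m, 1000))
```

Even when only one percent of each generation are immigrants, the chance that a lineage has stayed inside the sink for a thousand generations is well below one in ten thousand.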

Europe today is a population sink. The population of the continent does not produce enough children to replace itself, and immigration from other parts of the world is high. There are several reasons to suggest that Europe may have been a population sink in prehistory as well. In Neandertal and Upper Paleolithic times, climate fluctuations created unique challenges in Europe, where caloric expenditures were high and food was harder to obtain than in some other regions.

Continual migration into Europe would provide a simple explanation for why none of today’s mtDNA haplogroups derive from the European Neandertals. The mtDNA population of 15,000 years ago had a few ancestors 40,000 years ago, and none of these ancestors lived in the sink population—all came from the source population in Africa or West Asia. The Neandertal mtDNA variation would have been a short-lived phenomenon, continually being turned over from source populations. Some Neandertal genes would have survived in Europe for hundreds of thousands of years, but some would have come in with more recent migrants from the population source.

There are points that argue against this source-sink hypothesis. The Neandertal-human divergence time for mtDNA is not very different from that estimated for the autosomal genome. If a European population sink had made genetic drift more powerful, that should have affected mtDNA more than the autosomes, so we might expect a more recent mtDNA divergence. Still, there is no reason why the source-sink dynamic need have been constant over Neandertal evolution, and there may have been multiple sources in the Pleistocene, not only Africa and West Asia. Investigating the boundary conditions of the source-sink model and its correspondence to autosomal genetic results would be helpful.

I should note that mtDNA is not special. Neandertals had lots of traits that are now very rare. The horizontal-oval, or “bridged” mandibular foramen is a prominent example. Out of the relatively small sample of Neandertal mandibles, half have this derived form. Fewer than one percent of recent European mandibles have this form. As for mtDNA, a once-common variant is now very rare. And as for mtDNA, we deserve some explanation. A source-sink model would appear consistent with the continued evolution of such traits during the Upper Paleolithic—a time when the extinction and replacement hypothesis predicts no change in these characters.

Natural selection

The other nonrandom hypothesis is natural selection, which would presumably have favored one or more modern human types while eliminating the original Neandertal haplogroup. I won’t say much about that hypothesis here, since I discussed it in my initial post about the whole-mtDNA-genome sequencing. Selection has a leg up over the other hypotheses now because it seems like there’s good evidence it happened.

Still, selection on mtDNA alone could not explain the total pattern of observations about Neandertals. Physical traits that were once frequent in Neandertals were much less common or absent in later Europeans, and some continued to reduce in frequencies over time. To explain these changes, we must invoke either selection on other traits, or continued demographic turnover in the post-Neandertal population (probably more immigration into Europe) or both.

So selection on mtDNA has never been a sufficient or necessary hypothesis, even if we assume that other genes carried by Neandertals still survive. But given the current evidence that suggests something distinctive about the mtDNA of recent humans, natural selection may receive renewed attention as a factor in the disappearance of the Neandertal mtDNA haplogroup.

References


   Bocquet-Appel JP, Demars PY, Noiret L, Dobrowsky D. 2005. Estimates of Upper Palaeolithic meta-population size in Europe from archaeological data. J Archaeol Sci 32:1656–1668. doi:10.1016/j.jas.2005.05.006.

   Currat M, Excoffier L. 2004. Modern humans did not admix with Neanderthals during their range expansion into Europe. PLoS Biol 2:e421.

   Enflo P, Hawks J, Wolpoff MH. 2001. A simple reason why Neanderthal ancestry can be consistent with current DNA information. Am J Phys Anthropol 114:S62.

   Krause J, et al. 2007. Neanderthals in central Asia and Siberia. Nature 449:902–904. doi:10.1038/nature06193.

The mtDNA sequence of Paglicci 23

Is there anything surprising about finding the Cambridge Reference Sequence in Paglicci 23?

As far as cladistics can take mtDNA analysis

In the early access online edition of Genetics, there is a new paper by Toomas Kivisild and (many) colleagues, titled "The role of selection in the evolution of human mitochondrial genomes" (via Dienekes).

The conclusion of the paper is that the appearance of many nonsynonymous mtDNA changes in certain populations may be the consequence of hotspots where mutations happen repeatedly. The rapid mutation rate at these hotspots means that they saturate more quickly than other sites, and their variation in recently-founded populations is therefore higher than expected compared to their variation in more ancient populations. They suggest that the appearance of many non-synonymous variants in "Arctic" populations (found by Ruiz-Pesini 2004) should be explained by the recent colonization of these regions, as opposed to new adaptations to cold in these populations.

The study was a phylogenetic analysis of human mtDNA variation, from a sample of 277 individuals. After deriving a most parsimonious tree, they looked for sites that underwent recurrent mutations in different branches of the phylogeny. These "hotspots" make up a disproportionately large number of the changes within and between human mtDNA lineages. Thus, it is likely that the high proportion of nonsynonymous changes in certain populations might be due to these hotspots.

Within-human coding variation

So does it matter whether or not some human population has a higher number of nonsynonymous variants? If a population did have a higher proportion of nonsynonymous variants, would that be a good sign of local selection?

I would suspect the answer to both questions is no. It certainly makes sense to me, as Kivisild et al. (2005) claim, that the excess of nonsynonymous changes in some populations may be an overrepresentation of nonsynonymous hotspots compared to more limited variation at other sites. So there is a statistical reason besides selection for this observation.

But considering the low global variation of human mtDNA, there shouldn't have been too much opportunity for different regions to become very different in their mtDNA variants. All of them have a recent common mtDNA ancestor, so locally adaptive variants probably don't differ by a large number of substitutions. And if they don't, then we shouldn't expect to see a significant increase in the proportion of nonsynonymous substitutions for those locally adaptive variants. So this is just not a very good test for local selection.

But there is a pretty good test for whether a variant might be a target of selection: Look at its functional consequences. And we now know that many of the variants that are common in different parts of the world actually have functional consequences on life history, degenerative disease, metabolic efficiency, and high-energy tissues like the brain. Some variants are associated with higher cancer rates, some with higher Alzheimer's and Parkinson's rates, some with higher lifespan, and others with greater energy conversion. When these variants differ significantly in their frequencies in different regions, it is reasonable to suggest that they were selected.

Of course testing the hypothesis of selection depends on demonstrating a fitness advantage for the variants, so it remains at least theoretically possible that different individuals have mtDNA with higher or lower cancer risk, lifespan, and energy efficiency without any difference in fitness.

But I don't think that we would make that assumption for any other gene -- it would be silly. And we don't need to know the proportion of nonsynonymous mutations to make that judgement; we just need to know that the gene does something differently in different places.

So I think the paper goes about as far as anybody can in demonstrating the rates of different kinds of mutations from phylogenetic comparisons. But that still doesn't tell us what we want to know: do the genes do anything differently in different populations. And in fact we already know that they do. The phylogenetic comparisons might inform us about how many selected changes there have been since the mtDNA coalescent, but in fact that number must be small because the coalescent is recent.

Comparison of different primate species

This comparison is discussed to some extent in the paper, but it does not become one of the major foci of the conclusion. I think there is more interesting stuff to be found here, and it points to the possibility of significant adaptive evolution in mtDNA sequences across primates.

You might not get this from the conclusion, which suggests that there is little evidence of positive selection in hominoids on the coding regions of the mtDNA as a whole. But read the criteria:

In these tests, maximum likelihood ratios of non-synonymous to synonymous mutations (omega) exceeding 1 are consistent with the hypothesis of positive selection, while values close to 1 indicate selective neutrality, and values converging on 0 suggest strong purifying selection. We conducted both lineage and site specific tests. For the lineage-specific tests, we used a model in which all lineages have the same omega (hereafter referred to as M0) and compared that with a model in which omega is estimated for each lineage (hereafter referred to as M1). To test for the action of selection among amino acid sites within a specific lineage, we compared a model that allows for heterogeneity in omega among sites, but not among lineages, with a model that allows for variation in omega along a predefined lineage (as in (YANG and NIELSEN 2002)) (Kivisild et al. 2005:8).

Negative selection reduces the number of amino-acid coding substitutions (nonsynonymous substitutions) compared to synonymous substitutions. Positive selection increases it. This test assumes that either negative selection or positive selection has happened, but not both. Of course, there's no easy test to tell whether both might have happened. They alter the ratio of NS/S substitutions in opposite directions, so the actual NS/S ratio must reflect their force relative to each other. The paper recognizes this problem (p. 18), but doesn't explore it. Is it credible to think that a site that evolves by positive selection in some lineages is not constrained by negative selection in others? If evolution involves the occasional positive selection of variants at sites usually under negative selection, then the test of selection used here will be extraordinarily weak. Indeed, it is significantly stacked against detecting positive selection.
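A quick numerical illustration of why a pooled ratio hides positive selection. The site fractions and omega values here are hypothetical, not taken from the paper, and treating the overall omega as a site-weighted average is a simplification that assumes the synonymous rate is the same at every site.

```python
def mean_omega(site_classes):
    """Site-weighted average omega.

    site_classes is a list of (fraction_of_sites, omega) pairs.
    """
    total = sum(fraction for fraction, _ in site_classes)
    return sum(fraction * omega for fraction, omega in site_classes) / total

# Hypothetical gene: most codons strongly constrained, a few positively selected.
hypothetical_gene = [
    (0.95, 0.1),  # purifying selection, omega well below 1
    (0.05, 3.0),  # positive selection, omega well above 1
]

print(mean_omega(hypothetical_gene))
```

In this made-up case one codon in twenty is under strong positive selection, yet the pooled omega sits far below 1 and would read as purifying selection.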

Even so, the phylogenetic comparison of hominoid ( + macaque) mtDNA found that the model that incorporated positive selection at some sites was superior to neutral or purely negatively selected alternatives. Based on this model, the study found that 16 amino-acid codons in hominoids were significantly likely (i.e. p > .95) to have been under positive selection. That seems to me like a bare minimum, as there were probably positively selected sites in individual lineages that wouldn't show up in the cross-hominoid comparison.

The total possibility for positive selection on the human lineage seems large. The study found 167 amino acid substitutions separating humans and chimpanzees, compared to 452 between chimpanzees and orangutans (and only 96 between cats and dogs, which seems incredibly low to me). They tabulate the proportion of substitutions from one amino acid to another (e.g. Ala <> Thr, Ile <> Val, etc.), and find that these proportions differ in some cases from the proportion of segregating variants within humans.

Suppose we assume that 84 of those 167 mutations are human-specific (the paper doesn't include this information). If six of those were positively selected, that's one per million years. If twelve, one per 500,000 years. And there's no reason to think that some of these might not have undergone multiple substitutions; indeed the presence of hotspots suggests that some sites might have been recurrently selected as the genetic background at other sites changed. And it seems likely that the 414 amino acid segregating variants in humans might include some that had been selected previously during human evolution also. How many selected substitutions may have happened during recent evolution cannot yet be estimated, but how surprising should it be that the most recent one happened around 160,000 years ago?
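The rate arithmetic above is simple enough to write out. The six-million-year length of the human lineage is an assumption on my part: it is the figure implied by "six selected substitutions, one per million years" rather than one stated outright, and the function name is only illustrative.

```python
HUMAN_LINEAGE_YEARS = 6_000_000  # assumption implied by the rates quoted above

def years_per_selected_substitution(n_selected, lineage_years=HUMAN_LINEAGE_YEARS):
    """Average spacing between selected substitutions along the lineage."""
    return lineage_years / n_selected

# Six and twelve are the cases discussed in the text.
for n in (6, 12):
    print(n, years_per_selected_substitution(n))
```

Seen this way, a most recent selected substitution around 160,000 years ago is not out of line with an average spacing of half a million to a million years.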

An aside

Here's an interesting suggestion; I wonder if it's true:

One factor that could, theoretically at least, explain the different amino acid replacement patterns observed between populations and between humans and other mammals is diet. Threonine and valine, essential amino acids that must be taken in the diet, are abundant in meats, fish, peanuts, lentils, and cottage cheese, but deficient in most grains (Kivisild et al. 2005:17-18).

It's another possible reason for selection based on diet during the last 10,000 years, if it affects metabolism strongly enough, which remains to be demonstrated.

Do I have to keep writing about mtDNA?

I'm sure some readers are beginning to think this is mtDNA Selection Central. Believe it or not, I've gotten a lot of requests to cover this topic, which of course is one of the central issues in the Neandertal problem as well as the unraveling of human origins.

And it's an exciting developing story: it shows how medical genetics is steamrolling the human genetics of the past thirty years. Finding mutations that actually do things has great medical interest, and the search is accelerating. This work is being undertaken by people who have no investment in the idea that variation among humans should be completely neutral.

After all, what's more important: that a neutral mtDNA lets us trace human migrations, or that understanding mtDNA selection helps us find treatments for Alzheimer's disease? There's no way that obsolete lineage tracing can survive this kind of conflict. Finding out the history of mtDNA variability is telling us something very important, but it isn't about the movements of people around the globe 100,000 years ago. It's about the evolutionary tradeoffs that led to advantages and disadvantages for different variants.

References:

Kivisild T et al. 2005. The role of selection in the evolution of human mitochondrial genomes. Genetics (online before print).
