john hawks weblog

paleoanthropology, genetics and evolution

mtDNA migrations

  • The H preparation

    Tue, 2012-05-08 08:48 -- John Hawks

    Razib Khan comments on the current round of Henry Louis Gates ancestry programming: "Finding fake roots", and "Reification is alright by me!" Razib notes that the criteria that tell many subjects that their ancestry is a mixture of different populations are conditioned on assumptions that don't work at all for South Asians. From the latter:

    In my post below some commenters argued that obviously implausible inferences from a thin set of reference populations are acceptable considering Henry Louis Gates Jr’s target audience. But that really wasn’t my main point. Rather, it was that he was eliding the distinction between uniparental markers, and the clusters generated by modeled based ancestry assignment algorithms, and ascribing the phylogenies of the former to the latter. It is important to note that categories like “Europeans” are only approximations. But they’re damn good approximations today! Nevertheless, note the qualification of time: they may have basically no meaning at some point in the recent past. They’re powerful when it comes to precisely partitioning modern variation, but they don’t tell us the history of that variation.

    The uniparental marker "interpretations" given to people doing genealogical work have become increasingly comical in their distance from what we now know about ancient variation. For example, I carry mtDNA haplogroup H, and here's what the Genographic Project tells me about that history in their "Atlas of the Human Journey":

    Around 15,000 to 20,000 years ago, colder temperatures and a drier global climate locked much of the world's fresh water at the polar ice caps, making living conditions near impossible for much of the northern hemisphere. Early Europeans retreated to the warmer climates of the Iberian Peninsula, Italy, and the Balkans, where they waited out the cold spell. Their population sizes were drastically reduced, and much of the genetic diversity that had previously existed in Europe was lost. Beginning about 15,000 years ago -- after the ice sheets had begun their retreat -- humans moved north again and recolonized western Europe. By far the most frequent mitochondrial lineage carried by these expanding groups was haplogroup H. Because of the population growth that quickly followed this expansion, this haplogroup now dominates the European female landscape.

    Here, a very common mtDNA haplogroup today is given its own origin myth, complete with a glacial refugium and massive expansion and dispersal. The text goes on to explain how this European haplogroup spread right out of southern Europe into central Asia, where today -- surprisingly -- it is even more variable and shows less sign of expansion. Notice how precise the story sounds, a fleshed-out history for people looking to connect their roots to European prehistoric events.

    Why do I say comical? We have ancient mtDNA from all over Europe now, from Neolithic and pre-Neolithic people, showing that haplogroup H was barely there before farming.

    I don't mean to single out Genographic for this issue; in fact, the whole edifice of genealogical interpretation is built on assumptions about history that are currently known to be false. We can do much better than this, I think. But many of the same characters who failed five or six years ago keep plugging at it, persisting in describing a distorted version of human history.

    UPDATE (2012-05-08): The thing that really bugs me is that the amount of money spent producing a season of one of these programs would be more than enough to get some of us to straighten some of these problems out. Population genetics is a lot cheaper than media. Or, to put it in a more inspiring way: any media organization that is willing to spring for a couple of postdocs along with their program can show some real science instead of making stuff up. Just saying...

  • Neolithic discontinuity in Hungary

    Thu, 2011-09-22 16:53 -- John Hawks

    Dienekes comments on a new paper finding another strange mixture of haplotypes in a Neolithic-era sample of mtDNA from central Europe ("Unexpected ancient mtDNA from Neolithic Hungary").

    I don't think even a science fiction writer could have predicted the kinds of ancient DNA results we are getting from Europe. We have genetic discontinuity between Paleolithic and Neolithic, and between Neolithic and present, and, apparently, discontinuity between Neolithic cultures themselves, and wholly unexpected links to East Asia all the way to Central Europe.

    The paper is by Zsuzsanna Guba and colleagues [1]. The final phrase of the abstract:

    Our investigation is the first to study mutations form Neolithic of Hungary, resulting in an outcome of Far Eastern haplogroups in the Carpathian Basin. It is worth further investigation as a non-descendant theory, instead of a continuous population history, supporting genetic gaps between ancient and recent human populations.

    Past populations had incredible dynamism across Eurasia. Of course, as shown later, we need not maintain that the haplogroups presently common in East Asia have necessarily been there all that long.


  • The sign of four

    Thu, 2011-06-16 18:30 -- John Hawks

    A post at Gene Expression this morning is worth some thought. It covers the mtDNA of Andaman Islanders and their connections to mainland Asian populations: "Present genetic variation is a weak guide to past genetic variation". In a nutshell, some anthropologists and geneticists had hoped that Andaman Island people were a kind of "time capsule" of the original migration of people out of Africa. The mtDNA lineages are inconsistent with that hypothesis.

    On a final note, if the Andaman Islanders arrived ~20 thousand years before the present from the South Asian mainland they don’t tell us very much about the “Out of Africa” people. They’re not “living fossils,” and it was frankly somewhat stupid probably to think they would be.

    I don't have time at the moment to do my own review but definitely there is a deeper issue at play. It is extremely interesting that we're finding the Andaman Island population fits into the genetic landscape of South Asia at the Last Glacial Maximum, and not earlier. Even if the islands were first inhabited at the LGM, we might expect early inhabitants to preserve variation that had later been supplanted within South and Southeast Asia by the spread of agriculturalists. Apparently, they don't. It is likewise extremely interesting that Neolithic European mtDNA is dominated by haplogroups that are rare or absent in earlier Europeans. With a fuller review, I think we could likely come up with several more instances where fairly large pre-agricultural turnover was happening...I have two or three in mind.

    These observations show that the present distribution of genetic variation is in some ways completely unrepresentative of the patterns in the past. The thing that strikes me: It takes a pretty massive demographic turnover to make this happen. And what we're looking at in today's populations is many, many instances of such turnovers during the last 20,000 years.

    I've spent a good part of my career as a voice in the wilderness, saying that things just aren't simple enough to use genetics and a Wright-Fisher population model to reconstruct events before the Neolithic. But in many ways, mainstream geneticists weren't making an unreasonable assumption that one might reconstruct those events in a straightforward way using mtDNA or the Y chromosome. It's just that reality is stranger than they expected.

  • Agriculture, population expansion and mtDNA variation

    Mon, 2011-05-23 11:50 -- John Hawks

    Earlier this spring, I wrote about a paper by Brenna Henn and colleagues that presented new data on SNP variation in recent African hunter-gatherer populations [1] ("Population structure within Africa: has 'modern human origins' become a non sequitur?").

    Another paper that came out this spring from the same research group is also very interesting. Christopher Gignoux, Henn and Joanna Mountain [2] examined the evidence for Holocene population growth in Europe, Africa and Southeast Asia, from within-haplogroup variability of mtDNA haplogroups. The idea is that earlier samples were not finely resolved enough to examine events of the last few thousand years, either because they included only small sequences (e.g., control region) with limited variation, or because they included whole mtDNA genomes with too few individuals to look at within-haplogroup coalescents. So here they add more individuals. It is still a small number (425 total) and so I expect that we will see better ones in the next few years.

    The results are nonetheless useful because they provide some nice matches for the archaeology of early agriculture. For example, in Africa:

    We find two periods of population expansion within our sample of lineages originating during the Holocene in western Africa. Although the majority of coalescent events occur during the Holocene, a number of lineages from this sample also coalesce during the Upper Paleolithic. The earliest growth begins at ≈38,000 ya (CI: 33,500–45,000 ya) (Table 1 and Fig. S1) and the second period begins at ≈4,600 ya (CI: 3,000–10,000 ya) (Table 1 and Fig. 1B). The correspondence between the timing of genetic evidence for a sharp increase in population size at 4,600 ya in our Holocene sample of sub-Saharan Africans and the archaeological evidence for origins of agriculture in western Africa is quite close (Fig. 1B and Table 1). In contrast, our southern African Upper Paleolithic sample representative of hunter-gatherers shows no growth over the past 20,000 y. We suggest Bantu-speaking farmers and other pastoralist groups migrated throughout southern Africa 2,000 ya (27) without impacting southern African mtDNA lineages (Fig. 1B).

    We can't really understand the pattern of genetic variation within Africa without understanding when the population grew. In Africa, Middle Stone Age genetic variation must have been more extensive than that in other regions of the world. But the survival of that MSA variation to the present day depends on the demography of populations over the past 50,000 years. In a growing population, fewer lineages will be lost by random genetic drift. So if Gignoux, Henn and Mountain are right about the growth of West African populations by 35,000 years ago, we might expect that region to preserve some extensive variation from MSA times. That might explain why that population preserves very deep Y chromosome lineages [3]. Regarding only mtDNA, one might conclude that a historical paucity of migration between hunter-gatherer and agricultural groups would be the most important reason why MSA variation remains in the present-day African population. This has been the explanation for survival of deep mtDNA lineages in southern Africa, for example. The Y chromosome result and the current paper remind us that population growth can also preserve variation from earlier time periods.

    I think this proposal of African population history matches very well the model that we assumed in our acceleration paper [4], which we based on the archaeological record. We suggested early population growth in Africa by 35,000 years ago followed by an agricultural expansion after 5000 years ago. The evidence for relatively late agricultural intensification, within the last 4000-5000 years in sub-Saharan Africa, is very clear archaeologically. Less clear: How big was the earlier, pre-agricultural human population? The LSA might correspond to a demographic intensification, generally after 45,000 years ago. Genetics has certainly seemed to support such a view, and we found it consistent with the evidence that positive selection had increased in rate much earlier in Africa than in other regions. Still, the more detailed study by Gignoux and colleagues helps to clarify this picture.

    The results also show agricultural population growth to have been late in Southeast Asia.

    Direct archaeological evidence for rice agriculture in southeastern Asia dates to only ≈4,400 ya in Thailand (28). Agriculture spread throughout Island Southeast Asia, with evidence of rice in Taiwan again dating to ≈4,400 ya. Our Southeastern Asian Holocene population size curve indicates expansion beginning ≈4,700 ya (CI: 3,000–5,700 ya) (Fig. 1C and Table 1).

    Again, useful. I think we need to exert some effort making sure that the initial dispersal of people into South/Southeast Asia can be differentiated from the post-agricultural history. But assuming that Gignoux and colleagues are correct, it makes sense in an overall picture of slowly adapting early crops to tropical climate regimes, or replacing early domesticates with different ones in those areas.

    I am less sanguine about their results for Europe. They show a gradual period of growth associated in time with the Younger Dryas (around 12,000 years ago), which could make sense in the archaeology. But I am not convinced that the "European" haplogroups here are really European to that time depth. We know that the Neolithic and post-Neolithic saw some large-scale shifts in the frequencies of mtDNA haplogroups in Central and Western Europe. Some Upper Paleolithic Europeans probably contributed mtDNA to this later population, but I have no confidence that the proportion was great enough to accurately infer the demography of that pre-Neolithic population. (This is also a problem with the current paper in Current Anthropology by Peter Rowley-Conwy. I'll discuss this sometime soon.)

    The next frontier in reconstructing the population history of Europe will be ancient DNA. A good sample of Neolithic and pre-Neolithic whole mtDNA genomes would settle this question and allow inferences about the kind of demographic recovery Europe underwent after the Last Glacial Maximum.

    An open question is to what extent the other populations have similar problems. The European population of today reflects West Asian population dynamics 10,000 years ago. The East African population today reflects West African population dynamics from before the Bantu expansion, possibly to a similar extent. The population of Southeast Asia reflects the population dynamics of early rice agriculturalists in South China. And so on.

    Adding large-scale migration and partial population replacement to this kind of demographic analysis is not easy, but it will be essential if we want a better picture of how agriculture affected human populations. Considering these problems, I think it's easy to see why I started working on Holocene population dynamics. Evidence about Late Pleistocene populations, like MSA Africans and Neandertals, still lies within our genomes. But we see it through a lens. Holocene population dynamics -- movements and population growth -- distort that lens. If we don't account for those Holocene dynamics, we will conclude wrongly about the earlier dynamics.

    I like this a lot, because this is what anthropology is really good for. We can bring a lot of archaeological and historical knowledge to bear on the question of post-agricultural population dynamics. But it's a deep, deep field with a lot of specialized literature.


    Synopsis:
    A study of mtDNA variation attempts to find the times and magnitudes of population expansions in early agriculturalists.

  • mtDNA, purifying selection and "distorted" genealogies

    Sat, 2010-10-23 11:13 -- John Hawks

    I'm going to pass along this paper without much comment. It's by Jon Seger and colleagues and it came out earlier this year in Genetics [1]:

    Gene Genealogies Strongly Distorted by Weakly Interfering Mutations in Constant Environments

    Neutral nucleotide diversity does not scale with population size as expected, and this "paradox of variation" is especially severe for animal mitochondria. Adaptive selective sweeps are often proposed as a major cause, but a plausible alternative is selection against large numbers of weakly deleterious mutations subject to Hill–Robertson interference. The mitochondrial genealogies of several species of whale lice (Amphipoda: Cyamus) are consistently too short relative to neutral-theory expectations, and they are also distorted in shape (branch-length proportions) and topology (relative sister-clade sizes). This pattern is not easily explained by adaptive sweeps or demographic history, but it can be reproduced in models of interference among forward and back mutations at large numbers of sites on a nonrecombining chromosome. A coalescent simulation algorithm was used to study this model over a wide range of parameter values. The genealogical distortions are all maximized when the selection coefficients are of critical intermediate sizes, such that Muller's ratchet begins to turn. In this regime, linked neutral nucleotide diversity becomes nearly insensitive to N. Mutations of this size dominate the dynamics even if there are also large numbers of more strongly and more weakly selected sites in the genome. A genealogical perspective on Hill–Robertson interference leads directly to a generalized background-selection model in which the effective population size is progressively reduced going back in time from the present.

    The topic arises for me at the moment because of some inconsistencies between the apparent timing of events from mtDNA estimates compared to nuclear DNA estimates. Across the crucial "out of Africa" time interval between 200,000 and 50,000 years ago, the mtDNA is not really giving the same chronology as might be expected from nuclear DNA comparisons.

    The mutation rate of mtDNA genome-wide is very high, giving rise to the possibility of interaction between weakly deleterious mutations on the same sequence. It is widely known that the apparent rate of mtDNA mutation depends on the timescale of the comparison in humans. Mothers and their offspring differ by much more than would be predicted by longer pedigrees or by comparisons between populations. Recently diverged populations (such as those in island Polynesia) differ much more than would be predicted from the difference between humans and Neandertals or humans and chimpanzees.

    This apparent "speed-up" of rate as we get closer to the present is consistent with the action of strong purifying selection. So establishing the other genealogical effects of this selection should help us understand the patterns of mtDNA sequence differences found in humans.


  • Neolithic milk fog

    Sun, 2010-10-17 14:11 -- John Hawks

    Razib points today to an article in Der Spiegel about the revival of folk migration as an explanation for the Neolithic in Europe. His post ("Völkerwanderung back with a vengeance") is worth reading. The general issues here are very interesting right now because the increase in data has made it possible to propose and test more and more complex scenarios. The simple scenario, gradual demic diffusion, appears wrong in many details. Archaeological cultures appeared and spread in spurts, which we now know were often composed of genetically very different people.

    The article in Der Spiegel is titled, "How Middle Eastern Milk Farmers Conquered Europe".

    The main idea of the article is that our understanding of the spread of Neolithic cultures into Europe has been revolutionized by ancient DNA and more sophisticated chemical analysis of artifacts. That's more or less correct. We really are thinking much more these days about folk migrations bringing new people into Europe. We know that lactase persistence was a recent evolutionary phenomenon in European groups, which was absent before the early Neolithic.

    Problem is: from the standpoint of ancient DNA samples, the lactase persistence mutation was also absent within the early Neolithic! The article is full of details that are wrong or misleading. Most important, it links the appearance and proliferation of the lactase persistence trait with the LBK. This might appear to make sense. The chemical analyses have supported the importance of dairying and presumably milk consumption in the LBK. But the genes of the LBK skeletons don't have the lactase persistence marker.

    The absence of lactase persistence in these early Neolithic people is entirely to be expected. Such an allele couldn't become common until the selection pressure was in place. People had to be drinking milk habitually at key times of vulnerability to establish this selection pressure. Even when the selection pressure is very strong, as it was for lactase persistence, the initial growth of a selected allele is very slow. It did not become common in Europe until thousands of years after it first appeared.
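
    The slow start is easy to see in a toy calculation. Here is a minimal sketch, assuming simple genic selection (each copy of the allele multiplies fitness by 1 + s) in a deterministic model with no drift. The starting frequency, the selection coefficient and the generation length are hypothetical values chosen for illustration, not estimates for lactase persistence, and the function names are my own.

```python
def allele_trajectory(p0, s, generations):
    """Frequency of a favored allele under simple genic selection.

    Each generation the frequency p is updated as
    p' = p * (1 + s) / (1 + s * p). The starting frequency p0 and the
    selection coefficient s are illustrative assumptions, not estimates.
    """
    freqs = [p0]
    for _ in range(generations):
        p = freqs[-1]
        freqs.append(p * (1.0 + s) / (1.0 + s * p))
    return freqs


def generations_to_reach(p0, s, target, max_generations=100_000):
    """Count generations until the allele frequency first reaches target."""
    p = p0
    for generation in range(max_generations + 1):
        if p >= target:
            return generation
        p = p * (1.0 + s) / (1.0 + s * p)
    raise ValueError("target frequency not reached")


if __name__ == "__main__":
    P0, S = 0.001, 0.05            # hypothetical values
    YEARS_PER_GENERATION = 25      # hypothetical generation length
    for target in (0.01, 0.10, 0.50):
        g = generations_to_reach(P0, S, target)
        print(f"{target:.2f}: {g} generations (~{g * YEARS_PER_GENERATION} years)")
```

    Under these assumptions the allele spends a large share of its rise at frequencies too low to turn up in a handful of ancient skeletons, even though selection is strong by most standards. A real analysis would add dominance and drift, both of which matter most when the allele is rare.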

    So lactase persistence did not distinguish early Neolithic people in Europe from agriculturalists in the Near East, because neither of those populations had it at any detectable frequency. All the stuff in the article about how lactase persistence originated in Central Europe? It's irrelevant to whether these ancient populations were connected or not.

    What does distinguish the early Neolithic in central Europe is the mitochondrial DNA. I've discussed this several times in the last few years ("Early European mtDNA: only mysterious if you want it to be", and most recently "French Neolithic discontinuities"). The early Neolithic in Central Europe and France is characterized by several common haplogroups that are absent or rare in both earlier and later Europeans.

    It remains to be seen whether we can document a clear analogue of this mtDNA observation with nuclear genetic data. We know a lot about the variation of present-day Europeans, but most attention to geographic relationships has been run through coarse filters -- maps of the first two principal components are very striking in their correspondence to geography, but they really don't address the timing of movements that may have contributed to the pattern.

    The differences between early Neolithic and later Europeans suggest that post-Neolithic migrations -- real Völkerwanderung -- actually had a major impact on the European gene pool. What we see today is not a pattern established 6000 years ago, but a palimpsest richly painted with strokes from successive migrations.

    One aspect of this scenario: There's no reason to link the early Neolithic with Indo-European languages. There were many later widespread population movements that might have carried this language family, and we know that these later movements were genetically decisive -- at least, as concerns the maternal genealogy. The relation of Y chromosome haplogroups with mtDNA haplogroups is a critical question, but even more necessary is the development of an effective means of testing these hypotheses with nuclear genotype data.

  • French Neolithic discontinuities

    Sun, 2010-08-22 19:47 -- John Hawks

    Marie-France Deguilloux and colleagues [1] present a short analysis of ancient mtDNA recovered from a Neolithic burial at Prissé-la-Charrière, between the Loire and Garonne valleys of western France.

    The mtDNA sample in the end was only three individuals -- one haplogroup X2, one U5b and one N1a. Each is intriguing, as far as a single sequence can be, because all are rare or absent from France today. I think one shouldn't go far interpreting three samples, but they contribute to the view that Neolithic mitochondrial variation in Europe was very different from recent Europeans. The N1a and U5b sequences fit within the already-known Neolithic (and for U5b, Mesolithic) variation in central and northern Europe.

    It is from the U5b that Deguilloux and colleagues make a point about possible Mesolithic population continuity.

    Subhaplogroup U5b has also been encountered in German Neolithic remains from the Corded Ware Culture (Haak et al., 2008) and in the hunter-gatherers studied by Bramanti et al. (2009), although in both instances, the branches concerned were distinct from the U5b in the Prissé sample. It is, however, worth noting that haplogroup U5 has been encountered in surprising frequency in the hunter-gatherers studied by Bramanti et al. (2009) and could correspond to a Mesolithic heritage.

    The story of N1a is that it was very common in the central European Neolithic, even though it is very rare today. That was first noted by Wolfgang Haak and colleagues [2], and has in subsequent years been joined by the observation that the pre-Neolithic hunter-gatherers had yet other common haplogroups. The population history of Europe was a lot more interesting than we suspected 10 years ago.

    Deguilloux and colleagues attempt a conservative explanation for the frequencies of N1a in Neolithic samples:

    The widespread distribution of the N1a lineage in Early and Middle Neolithic northwestern Europe may indicate genetic continuity from Mesolithic populations. This scenario would support a Mesolithic contribution to the earliest Neolithic of Atlantic Europe. This would imply that the N1a lineage was already common in indigenous north European populations and that the spread of the Neolithic was principally the result of cultural diffusion. Although so far the N1a lineage has not been encountered among late European hunter-gatherers in central and north Europe (Bramanti et al., 2009; Malmström et al., 2009), it is worth noting that less than half of the hunter-gatherers' paleogenetic data come indeed from the pre-Neolithic period (predating LBK expansion). Finally, no paleogenetic data currently exist for the Mesolithic period in Western Europe. This prevents any conclusion being drawn about N1a occurrence during the Mesolithic period in those regions.

    I will note this -- the more that N1a is replicated across the Neolithic of Europe, the less likely it is that its subsequent vast reduction in frequency could have resulted from genetic drift. When there were only one or two samples from Central Europe with high N1a, it was at least possible that this was a local founder population that did not spread its mtDNA diversity very far. If it were localized, even in the central Danube (a fairly big region) it might be possible to maintain that the later decline of N1a to its present low frequency had been due to population replacement.

    Now N1a seems like a real marker of the LBK, spread widely into Western Europe. It may be, as Deguilloux and colleagues suggest, that it will be found at substantial frequencies in earlier samples somewhere in Europe. We do want some explanation for how it got to be common in this culture area.

    Dienekes has written about the study. His point is a good one: If N1a were present somewhere in pre-Neolithic Europe, it would require some kind of "partition" of the pre-Neolithic population, along with its propagation -- presumably southeastward -- into the LBK of central Europe. Seems doubtful.

    The study includes an illuminating paragraph about the sources of contaminating sequence in these Neolithic extractions.

    Strict precautions were followed during all procedures (including precautions during excavation) and proved to be effective, because all researchers who directly participated in this study (from people working in the field to those working in the laboratory) were genotyped and their sequences were never observed during analyses. However, European sequences were randomly found in clones (28% of the sequences obtained). These specific sequences are regularly observed in the laboratory, whatever the project tackled (including samples from Polynesia or South America), in clones from samples or negative controls. They are not reproducible for a specific sample and are different from researchers' sequences. These facts lead us to suspect the contamination of PCR reagents (Leonard et al., 2007). It was relatively easy, however, to discard those contaminating sequences from our analyses because they were largely in the minority when compared with endogenous sequences.

    It would not be very difficult to compare the results from different labs and do a forensic-quality analysis of these reagent contamination events. Surely a good fraction of ancient DNA results prior to the last few years must represent such contamination. Nowadays people have the expectation that Neolithic-era remains may have rare or exotic haplogroups, but it hasn't been so long since people assumed that French equals French. I expressed some concern about this criterion before -- "strange" stands in for "non-contaminated" in too many studies.

    It might be very helpful to have a paper outlining the actual contamination pathways that have been found to affect multiple labs. Then the results could be compared against reports that have come out over the years. If people are reluctant to cull doubtful ancient DNA results, at the very least they can target a set for replication studies.


    Synopsis:
    Study of mtDNA from a Neolithic-era burial in France contributes to an overall picture of Neolithic population replacement in Europe.

  • Mailbag: mtDNA "out of whack"

    Fri, 2010-08-20 10:13 -- John Hawks

    Re: "Time to revise the mtDNA timescale?":

    You said "The timescale of mtDNA divergence is already out of whack with the rest of the genome."

    What's the time scale for the rest of the genome? It seems to me it should be expected to be at least twice as much as that for mtDNA since at least half the instances of mtDNA - those in males - dead end each generation. With perfect mixing and replacement, 50% of the mtDNA instances pass from one generation to the next, while 75% of the autosomal instances do. Imperfect mixing and replacement would make both numbers lower, but the mtDNA number would still remain much lower than the autosomal number, so the coalescence time should still be expected to be much lower.

    Thanks for noticing that, it's leading to something but I haven't yet described the problem. My apologies for being less than clear.

    What you're describing (you probably already know) is commonly described as the "four-times rule" -- the uniparental inheritance and single copy number give mtDNA one fourth the effective size, on expectation, of an autosomal locus.

    That's in a constant-sized population. Which of course we haven't been. For around the past 100,000 years, African populations were big enough that genetic drift didn't decrease their genetic diversity markedly. The mtDNA coalesces around 100,000 years before that, compared to more than 700,000 years for the typical autosomal locus -- it's 7 times instead of four. That discrepancy is probably not significant given the huge intrinsic variance of the coalescent. But I don't think it's been seriously investigated.
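
    For anyone who wants the four-times rule spelled out, here is a minimal sketch under the textbook neutral coalescent in a constant-sized population with equal numbers of males and females. The population size and sample size are hypothetical placeholders, and the function name is my own. The two dates in the last step are simply the round figures quoted above.

```python
def expected_tmrca(gene_copies, sample_size):
    """Expected generations back to the MRCA of a sample under the
    standard neutral coalescent with `gene_copies` transmitting copies."""
    return 2.0 * gene_copies * (1.0 - 1.0 / sample_size)


N = 10_000       # hypothetical number of breeding individuals
SAMPLE = 50      # hypothetical number of sampled sequences

autosomal_copies = 2 * N   # two copies per individual, both sexes transmit
mtdna_copies = N // 2      # one effective copy per female, half are female

# Ratio of expected coalescence times: the "four-times rule".
expected_ratio = (expected_tmrca(autosomal_copies, SAMPLE)
                  / expected_tmrca(mtdna_copies, SAMPLE))

# Ratio implied by the round figures quoted in the text above.
observed_ratio = 700_000 / 100_000

if __name__ == "__main__":
    print(f"expected autosomal/mtDNA ratio: {expected_ratio:.1f}")
    print(f"ratio from the quoted dates:    {observed_ratio:.1f}")
```

    The expected ratio does not depend on the values picked for N or the sample size, since both cancel. What the sketch leaves out is the variance, which is large enough for a single locus that a ratio of 7 against an expectation of 4 is not obviously surprising.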

    The real problem is that the out-of-Africa timescale for mtDNA is now very short -- less than 65,000 years -- while the nuclear timescale looks long -- maybe up to 140,000 years. Maybe these can also be reconciled; it's not yet clear. But it's a problem.

  • Time to revise the mtDNA timescale?

    Wed, 2010-08-18 23:35 -- John Hawks

    Krzysztof Cyran and Marek Kimmel (2010) have presented a revised set of estimates of the human mtDNA most recent common ancestor (MRCA). It's an interesting theoretical paper, written for the purpose of developing a method that doesn't rely on the same assumptions as the usual coalescent models.

    Their new method gives an estimate of 174,000 years ago for the human MRCA. They report an upper/lower range as 96,000 to 449,000 years ago. That range does not represent a confidence interval on the estimate; it's an upper/lower based on extreme assumptions about human/Neandertal genetic distance and the human/Neandertal MRCA.

    The Neandertal mtDNA has really affected the way we estimate human MRCA, at least for the mitochondrial genome. Chimpanzees are just too distant. When we compare human and chimpanzee mtDNA genomes, there has been a lot of parallelism and reversal on both lineages, because mutations have hit the same place multiple times. Multiple hits and purifying selection make a mess out of rate estimation -- generally, they make the human MRCA seem a lot older than it truly was. The Neandertals are closer, and are therefore less of a problem.

    But the Neandertal-human MRCA itself was poorly known, as long as we had only chimpanzees to calibrate the mutation rate....

    That's what we discovered earlier this year with the mtDNA genome of the Denisova specimen [1] ("The Denisova mtDNA sequence: The X-Woman"). The Denisova mtDNA is an outgroup to the human-Neandertal mtDNA clade; it diverged from our mtDNA ancestors around a million years ago. Sliding in that branch redated the human-Neandertal MRCA down to 460,000 years ago. Unfortunately, that paper came too late for Cyran and Kimmel [2] to use the revised human-Neandertal MRCA in their calculations. They assumed a date of 511,000 years ago for the human-Neandertal MRCA.

    Still, the paper gives enough detail to work out the effect of a lower human-Neandertal MRCA on their estimate. They obtained their lower bound (96,000 years) by assuming a human-Neandertal MRCA of 389,000 years. If we substitute in the Denisova-informed human-Neandertal MRCA, we can figure that the human MRCA will be around 130,000 years ago or so.

    That's awfully recent.
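    To see roughly where a figure like that comes from, here is a back-of-envelope sketch. It assumes the human MRCA estimate scales in direct proportion to the assumed human-Neandertal MRCA, which is a simplification of my own for illustration and not Cyran and Kimmel's actual procedure. The `rescale` helper is hypothetical; the input dates are the ones given above.

```python
# Back-of-envelope rescaling of the human mtDNA MRCA.
# ASSUMPTION: the human MRCA estimate is directly proportional to the assumed
# human-Neandertal MRCA. This is NOT the method used in the paper.

def rescale(human_mrca: float, assumed_split: float, new_split: float) -> float:
    """Scale a human MRCA estimate from one assumed split date to another."""
    return human_mrca * new_split / assumed_split

DENISOVA_INFORMED_SPLIT = 460_000  # years, from the Denisova mtDNA work

# Central estimate (174,000 years), obtained with a 511,000-year split.
central = rescale(174_000, 511_000, DENISOVA_INFORMED_SPLIT)

# Lower bound (96,000 years), obtained with a 389,000-year split.
lower = rescale(96_000, 389_000, DENISOVA_INFORMED_SPLIT)

print(round(central), round(lower))
```

    Under that assumption the central estimate drops to roughly 157,000 years and the lower bound rises to roughly 114,000 years, which bracket the "around 130,000" figure. The lower bound also reflects an extreme assumption about genetic distance, so the second number is not a pure rescaling.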

    I don't want to go too far with these numbers. My first objection is that they all assume the total absence of selection, when we have long known that some human mtDNA clades have been selected in some parts of the world. It's entirely possible that the human MRCA is recent because of natural selection on some mitochondrial-linked phenotype ("Complete Neandertal mitochondrial sequence, and selection on human (not Neandertal) mtDNA", "Has the dam broken on mtDNA selection?", "Selection, nuclear genetic variation, and mtDNA").

    And even if we assume no selection at all, there's not a lot to be gained by increased precision of these estimates. Branch lengths of an mtDNA genealogy give only extremely wide estimates of ancient events. If we say that something happened "around 50,000 years ago, plus or minus 35,000", it hardly matters whether we change that to "around 43,200 years ago, plus or minus 35,000." I would even argue that the round estimate is better, because it doesn't communicate a misleading impression of precision.

    Still, it does a lot of good to know whether estimates are systematically biased in one direction. And this work, combined with what we know about the Neandertal and Denisova complete mtDNA genomes, suggests that our mtDNA branch lengths may have been biased too high.

    It remains to be seen how much of the human mtDNA tree will be affected by this logic. The most recent branches can in many cases be calibrated against historical events, and ultimately parent-offspring comparisons. So those aren't likely to change much. What worries me is that critical period around 30,000--80,000 years ago, when human mtDNA lineages were diversifying worldwide. The timescale of mtDNA divergence is already out of whack with the rest of the genome. Pushing these divergences more recent will make the fit between mtDNA and autosomal estimates worse. But given the wide variance on coalescence times, Cyran and Kimmel's estimates are consistent with the hypothesis that these might be substantially higher -- so it's hard to guess whether the apparent mismatch is real or not.

    I might have missed this paper if it weren't for the press release about it from Rice University. But what a misleading release! It's headlined, "Mother of all humans lived 200,000 years ago" -- which the paper doesn't conclude. If that were the conclusion, it wouldn't be news, because it's confirming a widely-used estimate that's more than 20 years old.

    But there are actually several interesting angles to the story that the press release fails to mention. Their estimation method may prove useful for many species for which we have no good demographic model -- a problem that the release alludes to, but doesn't feature. The method they develop came from a similar process, which had formerly led to a much, much higher estimate of human MRCA. Their estimate is a lot lower -- in large part because they can exploit the Neandertal genetic information. And then there's the likely possibility that the actual MRCA may be much lower, which would truly be unexpected compared to most earlier work.

    At the end of their paper, Cyran and Kimmel give a short discussion of the history of the Out of Africa mtDNA story. They mention the idea that some people favoring the multiregional hypothesis had suggested older dates for the human mtDNA MRCA. Aside from O'Connell [3], however, they didn't cite this literature. The conclusion of a short timescale, with a MRCA around 200,000 years ago, was challenged by a number of geneticists [4],[5]. The most common point was that the upper confidence limit on the MRCA estimate must be very high -- potentially 800,000 years ago or more, because of the great uncertainty about rates, coming from the chimpanzee-human branch length. This remains a problem, although the availability of a Neandertal outgroup helps to clarify which changes on the human lineage are actually recent.

    It's sort of interesting that even in the current paper, we still have an upper estimate of the human MRCA that's nearly 450,000 years ago! I don't think that the assumptions going into that value are realistic, but there's no real upper confidence bound on the central estimate. It might well go as high as 450,000 years, given the huge uncertainty in the depth of the deepest branches of that African mtDNA genealogy.

    So I guess I'm not really sure we've advanced very far in 20 years!


    References

    Synopsis: 
    A study of human variation adds precision to the human mtDNA mutation rate; I compare to results from archaic humans.
  • The other story about the mammoth DNA

    Wed, 2010-03-24 00:04 -- John Hawks

    I got to writing about a story a couple of years ago, and then stalled out. That happens every so often -- remember, most of my research-related entries are my own notes. You can only imagine how many half-written posts I have, but the AI on my computer has gotten better and better at archiving them.

    In this case, the half-written post lately has grown in relevance, so I've revisited it. In the summer of 2008, Thomas Gilbert and (many) colleagues reported on a phylogenetic analysis of 18 mtDNA genomes from extinct woolly mammoths.

    That's pretty cool, by the way. We now know a lot more about woolly mammoth mtDNA variation than we knew about human mtDNA variation in 1980.

    The mammoth mtDNA is an example of something slightly different than the usual phylogeography -- it adds the dimension of time. Call it phylotemporogeography, if you like. The best comparison? Neandertals -- a group for which the number of mtDNA sequences is very similar, over a similarly wide Palearctic geographic range. I wrote about Neandertal phylogeography last year ("Neandertal races?"), and the topic will surely return sometime this year.

    Different mammoth mtDNA clades, which originated millions of years ago, apparently became extinct at different times. The paper divided the mammoth mtDNA variation into two clades, which diverged approximately 1.7 million years ago. These two clades have different geographic distributions. One, which the authors termed "clade I," was broadly distributed across Siberia and Beringia. The other, "clade II," appears to have been restricted to one area of Arctic Siberia, between the Taymyr Peninsula and the Lena River. Each of these clades has highly restricted diversity, and taking all the mammoth mtDNA sequences together, they are roughly as diverse as the within-subspecies diversity in living elephants. So that deep branch dividing the two clades accounts for a lot of the restricted diversity within mammoths.

    The interesting thing is that the two clades also have different temporal distributions, based on the radiocarbon dates associated with the remains. The geographically restricted clade II is systematically earlier. The time distributions overlap somewhat, but there is no clade II mtDNA after 30,000 years ago, while clade I lasts up to the extinction of the mammoths in the early Holocene.

    First question: why the deep branch? The simple answer is probably that it's just one of those things. It's difficult to weigh the importance of different parts of the geographic range of mammoths, so I hesitate to guess whether the relatively smaller region of clade II mammoths is "peripheral". It's not at a geographic extreme, but it's hard to judge the migration potential among these regions.

    The region occupied by a minor clade doesn't have to be peripheral or geographically isolated. The oldest branch point in a mtDNA tree is unlikely to be evenly balanced, and given that one clade is likely to be less numerous than the other, it is also likely to be geographically restricted. For all we know, the spatial distribution found among these mammoth mtDNAs is perfectly consistent with neutrality.

    Moreover, given the disappearance of clade II after 30,000 years ago, there aren't very many clade I sequences that are contemporary with clade II. We don't really know that they weren't evenly balanced at that time -- nor do we know what mtDNA clades may have been present in the broader range of mammoths across Europe and Beringia (although subsequent papers may have given some information on this).
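    The point about unbalanced deep branches can be checked with a small simulation. This is a minimal sketch, assuming a simple neutral coalescent in which any two lineages are equally likely to merge and all samples are the same age (which the radiocarbon-dated mammoths are not). The function names and the cutoff of four samples for a "minor" clade are arbitrary choices of mine.

```python
import random

def root_split_sizes(n_samples: int, rng: random.Random) -> tuple[int, int]:
    """Simulate one neutral coalescent topology and return the sample counts
    (smaller, larger) on the two sides of the deepest branch point."""
    clades = [1] * n_samples  # each entry is the number of samples in a lineage
    while len(clades) > 2:
        i, j = rng.sample(range(len(clades)), 2)  # two lineages coalesce
        merged = clades[i] + clades[j]
        clades = [c for k, c in enumerate(clades) if k not in (i, j)]
        clades.append(merged)
    return min(clades), max(clades)

def minor_clade_fraction(n_samples: int, max_minor: int,
                         trials: int, seed: int = 0) -> float:
    """Fraction of simulated trees whose smaller root clade has at most
    max_minor samples."""
    rng = random.Random(seed)
    hits = sum(root_split_sizes(n_samples, rng)[0] <= max_minor
               for _ in range(trials))
    return hits / trials

# 18 samples, matching the number of mammoth mtDNA genomes in the paper.
frac = minor_clade_fraction(n_samples=18, max_minor=4, trials=20_000)
print(frac)
```

    Under this model the size of one side of the root is uniformly distributed, so for 18 samples the smaller clade should hold four or fewer of them in about 8/17 of trees, or roughly 47 percent. A lopsided deepest split is the ordinary case under neutrality, not a sign that something special happened.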

    Second question: why the replacement of one clade by another? The authors first considered whether the mammoth mtDNA might have undergone a selective sweep:

    All of the observed substitutions appear to be between closely related amino acids. For those proteins having a close homolog with an experimentally determined structure (namely, COX1, COX2, COX3, and Cytb), we also modeled the structure of the mammoth proteins. All substitutions appear in regions on the surface or in loop regions that neither seem essential for proper folding nor would be expected to alter protein function in any obvious way. Therefore, the evidence from the modeled structures suggest [sic] that it is unlikely that the nonsynonymous differences found in the mitochondrial genomes of the two mammoth clades have resulted in any physiological disparities, and thus a selective advantage for clade I based on mtDNA sequence differences alone is not expected (Gilbert et al. 2008:8331).

    I think the authors have done as much analysis of this question as possible, given the available data, but I still think this is very weak evidence against selection as an explanation for the clade II extinction. After all, positively selected mtDNA variants are unlikely to change function in a major way -- big changes being much more likely to be bad under the usual Fisher model of adaptation.

    At any rate, the alternative hypothesis is local extinction, taking a geographically-localized clade with it.

    A more likely alternative is that the loss of clade II is a consequence of its restricted geographical distribution, because taxa with small ranges are generally more prone to extinction compared with widespread taxa. It is therefore conceivable that clade II was lost because of a demographic bottleneck resulting in genetic drift or a local population extinction.

    This seems contradictory. Given that there are no noticeable phenotypic differences between these clades, and that mtDNA clades I and II coexisted in the Lena-Kolyma region, a purely local demographic bottleneck doesn't make much sense. Now, there are alternatives that retain mtDNA neutrality -- for example, a demographic replacement of the Arctic Siberian mammoths by populations expanding from elsewhere (either east or south). This might have been driven by selection involving other aspects of physiology, enhanced by climate forcing. For instance, a long-lasting locally adapted population might give way to a more generalized form due to climate oscillations.

    Bottom line: mammoths were a dynamic population, capable of high mobility and rapid clade replacements on the scale of tens of thousands of years. And the Late Pleistocene was a time of high population turnover even across what should have been ideal mammoth habitat. That dynamism is not unusual for large, long-lived mammals, and is something we should be looking for in the DNA phylogeography of Late Pleistocene hominins.

    References:

    Gilbert MTP and 32 others. 2008. Intraspecific phylogenetic analysis of Siberian woolly mammoths using complete mitochondrial sequences. Proc Nat Acad Sci USA 105:8327-8332. doi:10.1073/pnas.0802315105
