john hawks weblog

paleoanthropology, genetics and evolution

Asia

  • Mailbag: North and South China

    Mon, 2012-11-19 09:08 -- John Hawks

    I read with interest your post on:
    http://johnhawks.net/weblog/reviews/neandertals/pigmentation/neandertal-...

    in particular:
    "People of Han Chinese ethnicity sampled in Beijing appear to have on
    average a half percent more Neandertal ancestry than people of the
    same ethnicity sampled in southern China."

    Apologies if you know this already but Han Chinese civilization
    started in the Yellow River area and only later expanded south. The
    original people in the south of China are Viet people and have more in
    common with modern Vietnamese. They all became "Han" people after
    their kingdoms were conquered by the north and are really Han in name
    only. Northern and Southern Chinese people look different and their
    spoken dialects (languages) are mutually incomprehensible to each
    other.

    Chinese people from the province of Shantung have the reputation of
    being the biggest in size, always attributed to their diet of wheat,
    but they are probably the last purest reservoir of Neandertal genes in
    the East. Shantung people generally have big noses, fair skin and big
    bones.

    Yes indeed, these are very deep differences, at least as great as between northern and southern Europe genetically, and maybe more. That's why we find the contrast so useful in comparison with the archaic human genomes. The current samples are not ideal because the "South Chinese" were sampled in Beijing based on ancestry, and so are a diverse set. We are hoping soon to have data from many more Southeast and Northeast Asian populations, which will give us some resolution on when things changed.

  • Denisova at high coverage

    Thu, 2012-08-30 15:25 -- John Hawks

    Science today has released the new paper on the Denisova high-coverage genome by Matthias Meyer and colleagues from Svante Pääbo's group [1]. There is a lot of material in the supplements of the new paper, and it will take some time to work through the implications.

    The basics are quite simple: The paper confirms the initial interpretation of the genome by David Reich and colleagues [2] in most respects. The mixture with a whole-genome sample from Papua New Guinea is estimated at 6% Denisovan ancestry. Confirming the later paper by Reich and colleagues [3], the new analysis finds no significant evidence of Denisovan ancestry in a mainland south Chinese (Dai) individual, and can exclude it down to a very small fraction:

    However, in contrast to a recent study proposing more allele sharing between Denisova and populations from southern China, such as the Dai, than with populations from northern China, such as the Han (17), we find less Denisovan allele sharing with the Dai than with the Han (although non-significantly so, Z = –0.9) (Fig. 4B) (table S25). Further analysis shows that if Denisovans contributed any DNA to the Dai, it represents less than 0.1% of their genomes today (table S26).

    That is a mystery to be explained. How did Asians end up lacking any evidence of Denisovan ancestry, when the peoples of Sahul (Australia and New Guinea) have six percent? It's nutty! The early modern humans who were the ancestors of present Sahulian peoples surely came from Asia, and they surely mixed with Denisovans there somewhere, right? But today there's no sign that present Asian peoples descended from those early Asian peoples.

    We must, I think, conclude that there was at least one, and possibly several episodes of massive population movement across South and Southeast Asia.

    I have recently completed a review of the analogous problem for Neandertals in Europe -- late and early Neandertals themselves appear to have been a dynamic population. I'm now working on a review of the situation in Southeast Asia. We may fundamentally have to look at the archaeological record in a new, and much more dynamic, way than has been the case.

    Neandertal gene flow

    To me at the moment, this is the most interesting paragraph of the new paper:

    Interestingly, we find that Denisovans share more alleles with the three populations from eastern Asia and South America (Dai, Han, and Karitiana) than with the two European populations (French and Sardinian) (Z = 5.3). However, this does not appear to be due to Denisovan gene flow into the ancestors of present-day Asians, since the excess archaic material is more closely related to Neandertals than to Denisovans (table S27). We estimate that the proportion of Neandertal ancestry in Europe is 24% lower than in eastern Asia and South America (95% C.I. 12–36%). One possible explanation is that there were at least two independent Neandertal gene flow events into modern humans (18). An alternative explanation is a single Neandertal gene flow event followed by dilution of the Neandertal proportion in the ancestors of Europeans due to later migration out of Africa. However, this would require about 24% of the present-day European gene pool to be derived from African migrations subsequent to the Neandertal admixture.

    This is a very interesting result, partially because it is the opposite of what we are finding. As I explained earlier this year, we are finding Europeans to share more Neandertal alleles than Asians do. The difference in our results has been much smaller than 24%; really only an increase of less than 0.5% on the whole genome, or maybe 10% relative to the overall amount in Europe (which is on the order of 3%).
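
    The arithmetic behind the dilution scenario in the quoted passage can be sketched in a few lines. This is only an illustration under a simplifying assumption: the later migrants carry no Neandertal ancestry at all. The function names and the 3% starting value are hypothetical placeholders, not figures from Meyer and colleagues.

```python
def diluted_fraction(fraction_before, migrant_share):
    """Neandertal fraction after a share of the gene pool is replaced
    by migrants assumed to carry no Neandertal ancestry."""
    return fraction_before * (1.0 - migrant_share)


def required_migrant_share(reference_fraction, observed_fraction):
    """Share of the gene pool that must come from such migrants to
    reduce reference_fraction to observed_fraction."""
    return 1.0 - observed_fraction / reference_fraction


# Hypothetical starting value, used only for illustration.
asian_fraction = 0.03
european_fraction = diluted_fraction(asian_fraction, 0.24)
share = required_migrant_share(asian_fraction, european_fraction)
print(f"European fraction: {european_fraction:.4f}, migrant share: {share:.2f}")
```

    Under that assumption the relative deficit equals the migrant share, which is why a 24% lower proportion translates directly into about 24% of the gene pool.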

    My initial reaction to this difference is that it reflects the sharing of Neandertal genes in Africa. Meyer and colleagues filtered out alleles found in Africa, as a way of decreasing the effect of incomplete lineage sorting compared to introgression in their comparison. But if Africans have some gene flow from Neandertals, eliminating alleles found in Africans will create a bias in the comparison. If (as we think) some African populations have Neandertal gene flow, that probably came from West Asia or southern Europe. So as long as the present European and Asian (and Native American) samples have undergone a history of genetic drift, or if (as mentioned in the quote) they mixed with slightly different Neandertal populations, this bias will tend to make Asians look more Neandertal and Europeans less so.

    Anyway, this demands further investigation. The Denisova genome makes a more compelling outgroup for these kinds of comparisons, because it is much closer to us than chimpanzees are. But it isn't really an outgroup because it shares alleles by descent with Neandertals. So it takes some clever genetics to compare the distributions of derived alleles in these genomes in terms of introgression versus incomplete lineage sorting.

    Denisovan demography

    It has become possible to make some good estimates of demographic history using only a single diploid genome, using a technique developed by Li and Durbin [4]. Meyer and colleagues applied this technique to the Denisova genome, finding that its genetic history contrasts with that of living human populations:

    To estimate how Denisovan and modern human population sizes have changed over time we applied a Markovian coalescent model (22) to all genomes analyzed. This shows that present-day human genomes share similar population size changes, in particular a more than two-fold increase in size before 125,000–250,000 years ago (depending on the mutation rates assumed (23), Fig. 5B). Denisovans, in contrast, show a drastic decline in size at the time when the modern human population began to expand.

    There is not yet enough data from Neandertal genomes to apply the same method, but to the extent that we understand their diversity, they show a similar picture. These archaic humans in Eurasia had much, much smaller effective population sizes than the ancient population of Africa. That's not surprising, given what we understand about ancient hunter-gatherer population dynamics.

    What may be a bit more surprising is the geography. We know that Neandertals of Europe and Central Asia lived in an environment that was relatively marginal for their technology and subsistence pattern. The Denisovan population could well have lived in parts of South or Southeast Asia -- subtropical and tropical areas comparable to Africa in their ecological diversity and resource richness.

    We might have imagined that the Denisovan population would be more diverse than Neandertals -- that it might have been comparable in diversity to part of Africa, if not the entirety of Africa. The genome is inconsistent with that picture.

    How can we explain the apparent contrast?

    1. Maybe Denisovans didn't live in South or Southeast Asia at all. If not, that demands that we explain how Australians got their genes.

    2. Maybe the population was geographically extensive and diverse, but the genome from Denisova Cave doesn't represent it well. If so, we might discover that Sahulians actually have even more ancestry from this group. Alternatively, we might find that the early history of the population was widely shared, but the recent history diverged between Siberian and other branches of the Denisovan-inhabited region.

    3. Maybe African diversity emerged from a much more complex series of interactions than we now appreciate. The demographic model of Li and Durbin doesn't encompass admixture, just the probability of gene coalescence across time. We have recently begun to appreciate the reality of ancient African population structure. If those initial African populations were more divergent from each other than Neandertals and Denisovans, their later mixture would give rise to a picture of early population expansion, even if each of them had relatively low (Denisovan-like) diversity.

    This picture is already complicated. It will get more so. We have a long way to go before the archaeology of MSA and Middle Paleolithic peoples will be reconciled with these genetic models.

    The "modern human" catalog

    I think it's tremendously interesting that the authors have compiled a list of gene variants shared by living humans that are absent from this high-coverage archaic human genome. It's a first step to identifying networks of genes that have been subject to recent evolutionary change in human ancestors.

    That being said, the list of genes itself doesn't lend itself to concrete conclusions:

    One way to identify changes that may have functional consequences is to focus on sites that are highly conserved among primates and that have changed on the modern human lineage after separation from Denisovan ancestors. We note that among the 23 most conserved positions affected by amino acid changes (primate conservation score ≥ 0.95), eight affect genes that are associated with brain function or nervous system development (NOVA1, SLITRK1, KATNA1, LUZP1, ARHGAP32, ADSL, HTR2B, CNTNAP2). Four of these are involved in axonal and dendritic growth (SLITRK1, KATNA1) and synaptic transmission (ARHGAP32, HTR2B) and two have been implicated in autism (ADSL, CNTNAP2). CNTNAP2 is also associated with susceptibility to language disorders (27) and is particularly noteworthy as it is one of the few genes known to be regulated by FOXP2, a transcription factor involved in language and speech development as well as synaptic plasticity (28). It is thus tempting to speculate that crucial aspects of synaptic transmission may have changed in modern humans.

    Interesting. I can imagine a Ph.D. dissertation looking into the function of each of those genes. It is surely true that in the last 300,000 years, human brains have been evolving. But why these genes as opposed to others? And how many regulatory changes (as opposed to amino acid changes) may have been further involved?

    Maybe even more interesting: How many times will the human alleles be found in some other Denisovan (or Neandertal) genomes, and how often will the "archaic" allele be found in anyone living now?

    A limited series of comparisons is too small to exclude that the range of variation will overlap, as fossil analysts have known for a long time. So we will need to work on extending our knowledge of the range of variation within living people, by increasing the sample of genomes representing populations around the world, particularly in Africa.

    The technology

    Of course, the most exciting thing about the new paper is the proof of concept for future high-coverage archaic genomes. The lab was able to generate the high-coverage sequence using its existing samples, by sequencing single-strand DNA instead of requiring double-strand DNA. This is a massive advantage when working with ancient DNA, because damage to the sequence often prevents double-stranded DNA from being amplified.

    The paper makes explicit that the Denisova phalanx simply has better endogenous DNA preservation than any other specimen known. That being said, the new sequencing method has greatly increased the sequence yield from the sample:

    We applied this method to aliquots of the two DNA extracts (as well as side fractions) that were previously generated from the 40 mg of bone that comprised the entire inner part of the phalanx (2, 8). Comparisons of these newly generated libraries to the two libraries generated in the previous study (2) show at least a 6-fold and 22-fold increase in the recovery of library molecules (8), which is particularly pronounced for longer molecules (fig. S4).

    It would be too soon to say that a similar increase in yield will happen for other specimens, but obviously, this may bring higher coverage into reach for several specimens that are currently only sequenced at very low coverage, including the Vindija, Mezmaiskaya, and El Sidrón Neandertals. We will have to wait and see how the new technique affects ancient DNA recovery going forward.

    I keep telling people that I think it's exciting that research into human evolution is now pushing technology forward. It has often been that paleoanthropology uses technological advances in other fields. But with ancient DNA, we really see an organic growth of technology along with research questions about our evolution. In our work on the ancient genomes, we're making some progress pushing forward knowledge about human biology by understanding human evolution. Evolution really is the fundamental principle of biology, but using evolution to learn about biology sometimes requires traveling through time. Ancient DNA gives us a time machine bringing new insights into reach.


    References

    Synopsis: 
    A technological advance in library preparation gives rise to much better knowledge of the ancient Denisovans
  • Afrasia djijidae: coolest monkey name ever

    Mon, 2012-06-11 20:07 -- John Hawks

    Ann Gibbons explains the importance of the new possible stem anthropoid fossil teeth from Myanmar: "Out of Asia? New Primate Fossils Pose Origin Riddle".

    The four molars were enough to show [paleontologist Christopher] Beard and team leader Jean-Jacques Jaeger of the University of Poitiers in France that Afrasia was closely related to another primitive anthropoid that lived at about the same time, but in Africa—Afrotarsius libycus from Libya. When the researchers examined the teeth from the two primates under a microscope, they were so similar in size, shape, and age that they could have belonged to the same species of primate, says Beard. Such close resemblance between an Asian and African fossil anthropoid has “never been demonstrated previously,” the authors write online today in the Proceedings of the National Academy of Sciences.

    The paper [1] mentions an earlier candidate for earliest-known anthropoid, Algeripithecus from the Early Eocene of North Africa. They cite recent work claiming that Algeripithecus is an adapoid primate rather than an anthropoid.

    Meanwhile, I wish that frame, "Out of Asia", would go away. People are already confused enough about the idea of "Out of Africa". Here we're talking about a time period literally 400 times older than the "Out of Africa" dispersal associated with the origin of modern humans. The evolution of early anthropoids was a process that unfolded over millions of years, not a sudden event.

    Why do we care about the location? Knowing where early anthropoids lived helps us to better describe the conditions that enabled that process, including the forest and faunal community in which they evolved. So why not focus on that community itself, instead of the continent? If we have similar tarsier-like animals living in north Africa and southeast Asia, whether that's interesting or not depends on whether primates are unique or share that geographic distribution with many other orders.


    References

    1. Chaimanee Y, Chavasseau O, Beard CK, Kyaw AA, Soe AN, Sein C, Lazzari V, Marivaux L, Marandat B, Swe M, et al. Late Middle Eocene primate from Myanmar and the initial anthropoid colonization of Africa. Proceedings of the National Academy of Sciences of the United States of America. 2012.
  • Neandertal introgression, 1000 Genomes style

    Sat, 2011-12-10 18:16 -- John Hawks

    For our project to understand pigmentation genetics in archaic humans, we had to find a good comparative sample of sequence data from recent humans. The original publication on the draft Neandertal genomes compared them to five low-coverage genomes from different Old World populations, along with the publicly available genomes from Craig Venter and others [1]. The first publication on the Denisova genome added an additional handful of genomes to these comparisons [2].

    Some of this handful of genomes from living people are more similar to the Neandertal and Denisova genomes than others. That simple fact is the proof that some living people have Neandertal and Denisovan ancestors.

    But until now, the comparison has been limited to a very small number of human genomes. That became a focus for critics of the Neandertal and Denisovan results. How could three or four genome sequences possibly provide an adequate representation of human variability? We could imagine scenarios in which the similarities between Neandertals and humans could be explained by some unsampled population, for example, northeast Africans [3]. Denisova does not present the same problem, because African population structure cannot possibly explain its resemblance to populations in Wallacea, Australia, and Oceania [2] [4]. But to compare either of these genomes, we should seek a broader sampling of genomes from living people.

    As I wrote yesterday, my students and I have been working to understand pigmentation genetics of the archaic human genomes ("Pigmentation of archaic humans: introduction"). I've emphasized the need to break the analysis into small steps. For this question, we need to examine whether the pattern of introgression around pigmentation genes is characteristic of the genome as a whole. If genes involved in pigmentation have systematically higher or lower levels of Neandertal ancestry, that will tell us a lot about the evolutionary history of pigmentation in recent and archaic humans. For this, we need a good comparative sample, and the 1000 Genomes Project provides the best sample available.

    The first step in assessing the pattern of introgression for pigmentation genes is to characterize the pattern of introgression across the whole genome.

    Yes, a whole-genome introgression analysis sounds awfully big for my "small steps" concept. But actually this is simpler than it might sound. Here's a teaser:

    The figures in this post are not from a whole-genome analysis; they include data from eight chromosomes that we prioritized because of our pigmentation analysis. I am licensing all of them under a Creative Commons ShareAlike license so that anyone can use them anywhere.

    UPDATE (2011-12-10): I finished the whole genome analysis and am updating this post and figures accordingly. The results are the same throughout, with the exception of the Europe-East Asia comparison, which now shows these populations to be significantly different across the genome as a whole. I have partially updated the figures and will finish these later today.

    The value of sequences

    The 1000 Genomes Project data have been updated several times in the last year, as both sequencing and analysis of the genomes have progressed (more information on 1000 Genomes Project website). We downloaded a release of SNP genotype calls from 1094 individuals, based on the low-coverage (average 4x) sequencing that has been carried out on the sample.

    A SNP (single nucleotide polymorphism) is a nucleotide site with at least two alleles present in the global human sample. These sites represent only one kind of genetic variation in today's populations. Many of the differences between people's genes are caused by insertions, duplications, deletions, transpositions, or inversions. But those kinds of polymorphisms can be challenging to study in low-coverage genomes, and we already understand quite a lot about SNPs in human populations from the earlier HapMap project [5] [6]. The HapMap provided the data underlying our 2007 paper on the acceleration of recent human evolution ("Why human evolution accelerated") [7].

    The drawback of earlier SNP variation projects is that they examined only a subset of SNP variation in a sample of people. To design a microchip that could provide a million or more SNP genotypes from a saliva sample, somebody first had to discover where in the genome SNPs could be found. So they took small samples of people, sometimes only a single person's two copies of the genome, and sequenced. Adding together SNPs found by several methods, they could get a representation of SNP variation across the whole genome in a population. But this process introduced a bias: the SNPs were ascertained in a sample that inevitably could not represent humans in other samples with the same accuracy. Initially, SNP samples were heavily biased toward people of European ancestry (upon whom most genetic work was originally done), and the HapMap project went to great efforts to increase the representation of other populations. But even with the best possible ascertainment, interpreting SNP variation requires us to jump through some theoretical hoops.

    Sequence data make life much easier for the population geneticist. Seriously, working on this stuff on the whiteboard is fun instead of a constant nightmare of sampling biases and spaces between markers. I have a bias myself, in that I find recombination hard to deal with. I love reticulation among populations, but I'd rather work with genealogies that look like proper trees instead of a liana-strewn mess. So looking at sequence data over short intervals makes me happy. Not as happy as beer aged in bourbon barrels, but happy.

    The 1000 Genomes Project SNP files represent every SNP mutation observed in the sample. In other words, these are sequence data, just with all the fixed (and therefore redundant) sites removed. Even so, these sequence data are not perfect. Low coverage means that some rare mutations in the sampled individuals will go unreported. We aren't typically interested in singleton mutations in the sample, except that missing them will introduce a bias upon our estimates of the time that common ancestors lived. Next-gen sequence reads are usually fairly riddled with errors. High coverage allows these errors to be removed with some confidence, but low-coverage genomes risk throwing out real SNPs along with the spurious ones. The publicly available files represent some analytical steps that we do not here control, so we have to work with the understanding that the data are not perfect.
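
    To make the idea of "sequence data with the fixed sites removed" concrete, here is a minimal sketch that scans a set of aligned sequences and keeps only the sites where more than one allele is present. The function name and the toy sequences are invented for illustration; this is not how the 1000 Genomes files were actually produced, which involves genotype calling from reads.

```python
def segregating_sites(sequences):
    """Return (position, alleles) for every aligned site that carries
    more than one allele in the sample; fixed sites are skipped."""
    sites = []
    for position, column in enumerate(zip(*sequences)):
        alleles = set(column)
        if len(alleles) > 1:
            sites.append((position, "".join(sorted(alleles))))
    return sites


# Toy alignment of four haploid sequences (made-up data).
toy_sequences = ["ACGTAC", "ACGTAC", "ATGTAC", "ACGTCC"]
snps = segregating_sites(toy_sequences)
print(snps)
```

    In this toy sample only two of the six sites vary, so the SNP list carries all the information in the alignment apart from the invariant positions.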

    The 1000 Genomes SNP files have had a phasing algorithm applied to them, which attempts to assign genotypes to chromosomes. In essence, phasing tries to figure out whether adjacent SNP alleles belong to the same copy or to different copies of the same chromosome. The details of this phasing are not yet apparent, and for many reasons I am cautious about using phased data. The inference is often inaccurate for rare mutations, and the whole process tends to sneak assumptions about population history into the resulting dataset. I hate being forced to live with someone else's assumptions about human population history, and I typically try to avoid needing phased data. In this case, it looks like the data over short intervals are as accurate as they can be, given the limitations on coverage and sampling. We have moved forward by applying methods that make a bare minimum of assumptions.

    Counting derived SNP alleles

    David Reich and colleagues came up with an appealingly simple test of introgression, which they applied to both the Neandertal and Denisovan genomes. Eric Durand, Reich, Nick Patterson and Monty Slatkin described the method formally this year [8], which they call the D-statistic. Informally, this has become known as the ABBA-BABA test, after their labels for the discordant genealogies that the test compares. By and large, across the genome, humans living today share many more new mutations with each other than they do with an archaic human like a Neandertal. But sometimes two genomes are different from each other, and one of them shares a new mutation with the Neandertal.

    A human might share a mutation with a Neandertal because it actually isn't very new, and both inherited the mutation from some much more ancient population of humans. This scenario is called "incomplete lineage sorting", because humans today have multiple gene lineages that existed within some very ancient population, instead of these having been "sorted" cleanly into the different human and Neandertal populations. Incomplete lineage sorting does happen a lot between humans, Neandertals, and Denisovans. ILS is the normal mode of variation among recent human populations, who trace their genealogical histories back much further than the earliest "modern" humans. So if one human has a Neandertal allele, and another human has a different allele, it's probably no big deal. They both just inherited gene variants that already existed in our distant common ancestors.

    You can probably see already that if we had a way to estimate the age of an allele, we could tell whether incomplete lineage sorting is a credible explanation for any particular site. I'll leave that point for another post.

    In the meantime, if we pretend that we know nothing at all about the ages of alleles, we must find some other way to tell whether incomplete lineage sorting can explain Neandertal similarities. Reich and colleagues recognized that incomplete lineage sorting from ancient pre-Neandertal ancestors ought to be distributed equally among living people. If we look at every site in the genome where we have data from Neandertals, we should find that one living human genome should look like the Neandertal just as often as another.

    This insight led to their test. Take a pair of humans, count the number of times sequence A is like the Neandertal and sequence B is like a chimpanzee, and then do the inverse — B then A. ABBA-BABA.

    Why a chimpanzee? In most cases the chimpanzee allele will represent the ancestral state for humans. Living people can inherit ancestral alleles from Neandertals as well as derived ones, but the derived ones tend to be rarer and younger within human populations. If one living genome shares an ancestral allele with the Neandertal genome, we don't need incomplete lineage sorting or introgression to explain the pattern. For all we know, such a mutation originated after Neandertals were already gone. So we need to pay attention to the derived mutations, ones that are present in Neandertals but not in chimpanzees. Do a count of these across the genome, and if you find a living genome with significantly more than another, you've found evidence for introgression.
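
    The counting described above can be sketched as follows. This is a simplified illustration, assuming one allele per site for each of two human sequences, the Neandertal, and the chimpanzee, with no missing data. The function name and toy sequences are hypothetical. The summary D = (ABBA - BABA) / (ABBA + BABA) follows the usual form of the D-statistic; the significance testing that the published method includes is left out here.

```python
def abba_baba(human_a, human_b, neandertal, chimp):
    """Count ABBA and BABA sites and return (abba, baba, d).

    A site is informative only when the Neandertal allele is derived,
    i.e. differs from the chimpanzee (assumed ancestral) allele.
    ABBA: human_a ancestral, human_b shares the Neandertal derived allele.
    BABA: human_a shares the Neandertal derived allele, human_b ancestral.
    """
    abba = baba = 0
    for a, b, n, c in zip(human_a, human_b, neandertal, chimp):
        if n == c:
            continue  # Neandertal carries the ancestral allele: skip
        if a == c and b == n:
            abba += 1
        elif a == n and b == c:
            baba += 1
    total = abba + baba
    d = (abba - baba) / total if total else 0.0
    return abba, baba, d


# Toy data (made up): eight aligned sites.
chimp      = "AAAAAAAA"
neandertal = "GGGGAAGG"
human_a    = "AGAAAAGA"
human_b    = "GGGAAAAA"
print(abba_baba(human_a, human_b, neandertal, chimp))
```

    A D near zero is what incomplete lineage sorting alone predicts; a D consistently above or below zero across the genome means one of the two human sequences shares more derived alleles with the Neandertal than the other does.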

    Ed Green, David Reich and colleagues [1] [2] did a comparison of every possible pair of genomes in their modern human sample. These sequence data were gappy, so that sequence A might share different coverage with B than with sequence C. So it was necessary to consider each pair separately, counting all the sites where both human sequences and the Neandertal and chimpanzee sequences had data.

    The 1000 Genomes Project sample reports genotypes for every SNP for every sampled individual. So in principle, every pair of sequences should have data for every one of these sites. Again, we have to be cautious about the nature of the sequencing, attending to the possibility of systematic biases due to low coverage. But we really don't have to take the time-consuming step of comparing every possible pair of the 2188 resulting haploid genomes. We can just find the derived SNP alleles that are present in Neandertals and count how many of them are in each of the human sequences. If one sequence has significantly more Neandertal derived alleles than another, it had to get them somehow.
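
    The shortcut in the paragraph above, counting Neandertal derived alleles in each sequence instead of comparing pairs, might look something like this. It is a sketch under simple assumptions: every haplotype has a call at every site, and the ancestral allele is known. The function name, sample labels, and sequences are all hypothetical.

```python
def neandertal_derived_counts(haplotypes, neandertal, ancestral):
    """For each haplotype, count sites where it carries the Neandertal
    allele, restricted to sites where that allele is derived."""
    informative = [
        i for i, (n, a) in enumerate(zip(neandertal, ancestral)) if n != a
    ]
    return {
        name: sum(1 for i in informative if sequence[i] == neandertal[i])
        for name, sequence in haplotypes.items()
    }


# Toy data (made up): six SNP sites, three haploid sequences.
ancestral  = "AAAAAA"
neandertal = "GGAGGA"
haplotypes = {
    "YRI_1a": "AAAAAA",
    "CEU_1a": "GAAGAA",
    "CHB_1a": "AGGAGA",
}
counts = neandertal_derived_counts(haplotypes, neandertal, ancestral)
print(counts)
```

    Each sequence is scored once against the Neandertal, so the work grows with the number of sequences rather than with the number of pairs.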

    That magic three percent

    The figure at the top of the post represents that count. Every individual in the 1000 Genomes Project dataset has two copies of the autosomal genome. Separating these two copies of the genome (basically arbitrarily) and counting up the shared derived features between each of those copies and the genome of Vindija 33.16, we obtain the histogram. Here it is again:

    The African genomes in the 1000 Genomes sample include Yoruba from Nigeria and Luhya from Kenya. The Asian populations sampled are Japanese and Chinese, including people of Han Chinese ethnicity in Beijing and southern China. The European ancestry samples include the CEU sample from Utah, as well as British, Tuscan, Spanish and Finn samples.

    The histogram shows that Asian and European genomes have significantly more Neandertal derived SNP alleles than do the African genomes. The averages for the Asian and European samples are around 3% higher than the average for the African samples. Whatever gave Africans some degree of similarity to Neandertals, non-Africans seem to have gotten around 3% more of it.

    Green and colleagues [1] assumed conservatively that Africans share derived SNP alleles with Neandertals only because of incomplete lineage sorting from the human-Neandertal ancestral population. This fraction should be the same in all human populations, under the assumption that Africans were mostly isolated from Neandertals for some period of time. The 3% Neandertal bonus outside Africa should then represent introgression from Neandertals into recent populations outside Africa.

    Both previous studies noted that genomes outside Africa are not significantly different in the fraction of derived SNP alleles shared with Neandertals. A genome from China and a genome from France carried the same fraction of shared derived SNP alleles with Neandertals. Here, we've confirmed that basic identity in the level of introgression in these populations.

    I have told several people now that I find the distributions in China and Europe spookily similar. On parts of the genome, the two distributions have means that are not significantly different. Indeed, I worked for a week with an analysis of eight chromosomes, in which the East Asian and European means were fewer than 100 SNP alleles apart. Even across the whole genome, Europeans average only 700 derived SNP alleles more than the East Asian sample. This small difference (a bit more than a tenth of a percent) is strongly significant on these sample sizes. A t-test yields a p-value of 1.1 × 10^-26 on the difference in means. Even so, the distributions of these two populations overlap across most of their ranges.
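
    Why such a small difference in means can be so strongly significant is easy to see with a toy calculation. The sketch below computes Welch's t statistic for two made-up samples, then repeats it with the same values replicated 100 times. The numbers are invented for illustration and are not the 1000 Genomes counts, and the post does not say which variant of the t-test was used.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference in means of two samples."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a)
        + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Made-up counts: means differ by 1, with heavily overlapping ranges.
east_asian = [100, 102, 98, 101, 99]
european = [101, 103, 99, 102, 100]

small_t = welch_t(european, east_asian)
large_t = welch_t(european * 100, east_asian * 100)
print(f"n=5: t = {small_t:.2f}; n=500: t = {large_t:.2f}")
```

    The difference in means and the overlap of the ranges are identical in both cases; only the sample size changes, and the t statistic grows roughly with its square root.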

    Seeing these hundreds of genomes arrayed on a histogram provides much more information than we had from a handful of genomes. It is remarkable how much dispersion there is among genomes from a single population. Although the means of these two samples are nearly the same, you can see that each of them has a large range of variation in the shared derived SNP alleles with Neandertals. This variation means that people within a single population have very different proportions of Neandertal ancestry.

    This is not a graph of people, but a separation of the two copies of SNP alleles carried by these people. That separation is phased at short scales but arbitrary on the scale of a whole chromosome, so the histogram likely understates the variance among single genomes while it overestimates to some extent the variation among people with their diploid genomes. Still, it looks likely from these comparisons that some people in Europe carry more than a percent higher Neandertal ancestry than the average, and some carry a percent less. We can use statistical methods to test this hypothesis directly as applied to individuals in the sample.

    Neandertal genes in recently admixed populations

    A sample of hundreds of people allows us to demonstrate significant differences among the genomes of different populations. Some of the 1000 Genomes Project samples are from populations that represent historically recent admixture of people who trace their ancestry to different parts of the world.

    For example, the "ASW" population sample includes African-American people who live in the Southwest United States. We know from many other genetic studies that African-Americans vary in the fraction of ancestry they derive from Europeans and from Africans. The average amounts of African and European ancestry vary among African-Americans who live in different parts of the U.S., with the European fraction as low as 3% and as high as 20% or more in some parts of the country. The proportion among individuals varies even more. So when we consider the ASW sample, we should expect to see a lot of variation in the number of shared derived SNP alleles with Neandertals, with a mean higher than in African populations.

    Which is exactly what we do see:

    The ASW sample overlaps substantially with the Yoruba sample from West Africa (Nigeria) and slightly with the CEU sample, which includes people of European ancestry in Utah. The total in the ASW genomes is more variable than either the Yoruba or CEU population samples. If the higher mean in the ASW genomes reflects European ancestry from a population like CEU, the proportion of European ancestry would be around 17% for that sample of people. It would be hard to tell from these numbers alone how much of the variation in ASW is attributable to variation in ancestry fraction, and how much is expected within a population of homogeneous ancestry. As we'll see in some other populations, there are some appreciable differences among populations within a given region, and ancestry differences may add to the variation among individuals within populations.
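    The ancestry estimate above follows from treating the mixed sample's mean as a weighted average of the two source means. Here is a minimal sketch of that arithmetic, assuming a simple two-source linear mixture; the function name and the three input values are hypothetical, and the ASW value was picked only so that the calculation lands on the 17% quoted above.

```python
def admixture_fraction(mixed_mean, source_a_mean, source_b_mean):
    """Fraction of ancestry from source B under a simple linear mixture model.

    Assumes mixed_mean = (1 - f) * source_a_mean + f * source_b_mean.
    """
    span = source_b_mean - source_a_mean
    if span == 0:
        raise ValueError("source means are identical; the fraction is undefined")
    return (mixed_mean - source_a_mean) / span


# HYPOTHETICAL placeholder means, in percentage points of shared derived
# alleles relative to the Yoruba mean. These are not values from the data.
yri_mean = 0.0   # reference point
ceu_mean = 3.0   # the post's rounded "around 3%" excess outside Africa
asw_mean = 0.51  # chosen for illustration only

european_fraction = admixture_fraction(asw_mean, yri_mean, ceu_mean)
print(f"Estimated European ancestry fraction: {european_fraction:.2f}")
```

    The estimate is only as good as the assumption that the source samples stand in for the actual ancestral populations, and it says nothing about how ancestry varies among individuals.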

    We see a similar pattern when we look at the Puerto Rican sample. Individuals in this sample have some ancestry from European, Native American and African ancestors. The comparisons by Reich and colleagues [2] and Green and colleagues [1] suggested that Native American populations have the same fraction of Neandertal ancestry as other people outside Africa. In the comparison with YRI and CEU samples, Puerto Rican (PUR) genomes are intermediate, with a mean suggesting around 15% ancestry from the West African population.

    The two outlier points in the Puerto Rican sample are the two genome copies from one individual, who we would hypothesize had much higher African ancestry than the average in the sample.

    Next...

    This post has taken me much longer than I expected to get to the point of talking about variation among samples within continental regions. It turns out that, despite the similarity of European and East Asian samples in their averages, there are substantial differences between samples within each of these regions.

    For example, here's a comparison of north and south Chinese samples:

    People of Han Chinese ethnicity sampled in Beijing appear to have on average a half percent more Neandertal ancestry than people of the same ethnicity sampled in southern China. I found these kinds of differences almost everywhere I looked within regions. More later...


    References

    1. Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH, et al. A Draft Sequence of the Neandertal Genome. Science [Internet]. 2010;328:710–722. Available from: http://dx.doi.org/10.1126/science.1188021
    2. Reich D, Green RE, Kircher M, Krause J, Patterson N, Durand EY, Viola B, Briggs AW, Stenzel U, Johnson PLF, et al. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature [Internet]. 2010;468:1053–1060. Available from: http://dx.doi.org/10.1038/nature09710
    3. Hodgson JA, Bergey CM, Disotell TR. Neandertal genome: the ins and outs of African genetic diversity. Current biology : CB. 2010;20(12):R517-9.
    4. Reich D, Patterson N, Kircher M, Delfin F, Nandineni MR, Pugach I, Ko AM-S, Ko Y-C, Jinam TA, Phipps ME, et al. Denisova admixture and the first modern human dispersals into southeast Asia and oceania. American journal of human genetics. 2011;89(4):516-28.
    5. The International HapMap Consortium. A Haplotype Map of the Human Genome. Nature [Internet]. 2005;437:1299–1320. Available from: http://dx.doi.org/10.1038/nature04226
    6. McVean G, Spencer CCA, Chaix R. Perspectives on human genetic variation from the HapMap Project. PLoS genetics. 2005;1(4):e54.
    7. Hawks J, Wang ET, Cochran G, Harpending HC, Moyzis RK. Recent acceleration of human adaptive evolution. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2007;104:20753–20758. Available from: http://dx.doi.org/10.1073/pnas.0707650104
    8. Durand EY, Patterson N, Reich D, Slatkin M. Testing for ancient admixture between closely related populations. Molecular biology and evolution [Internet]. 2011. Available from: http://dx.doi.org/10.1093/molbev/msr048
    Synopsis: 
    We're quantifying the amount of Neandertal ancestry in whole genome data from living people.
  • Asian Homo erectus

    Mon, 2011-11-07 23:59 -- John Hawks
    Synopsis: 
    Examining a sample of crania from the Early and Middle Pleistocene of Asia and Indonesia

    Homo erectus entered Asia as early as 1.8 million years ago. One of the earliest specimens of the species is the Modjokerto skull, from Java. The spread of this species across the tropical Old World was a major event in our evolution. After Homo erectus reached East and Southeast Asia, it had a long history — up to 200,000 years ago or even more recently.

    This station has several representatives of this Asian dispersal of early humans.

    • Trinil 2, Java, 1.2 million years old.
    • Sangiran 2, Java, 1.0 million years old.
    • Zhoukoudian L2, China, 700,000 years old.
    • Zhoukoudian L1, China, 700,000 years old.
    • Ngandong 10, Java, 200,000 years old.
    • Ngandong 8, Java, 200,000 years old.
    • Ngandong 4, Java, 200,000 years old.

    What to do: Overall, these fossils are very similar. However, they come from a wide range of times. Make an attempt to seriate the fossils by cranial size. List the results of your seriation. Does it correlate with time?

    Try seriating the skulls according to the form of their frontal bone or supraorbital torus. This feature differs between fossil specimens from Java and China. Does your seriation indicate this difference in geography?

  • Denisovan DNA in the islands, and an Australian genome

    Thu, 2011-09-22 18:09 -- John Hawks

    David Reich and colleagues today report on the persistence of Denisova-like ancestry in island Southeast Asia and Australia (citation not yet available). Meanwhile, Morten Rasmussen and colleagues (citation not yet available) report on the whole-genome sequencing of hair from an Aboriginal Australian who lived some 100 years ago.

    The most obvious story: These data utterly destroy the hypothesis of a single out-of-Africa colonization of Southeast Asia by modern humans. Many human geneticists have argued our present pattern of diversity originated in a wave of successive founder effects coming from a single recent African origin. They were wrong.

    Instead, we can turn to a complex model with successive dispersals and episodes of population mixture. This is not a static model of isolation-by-distance; it is a dynamic model in which populations grow and spread across large spans of the Old World, again and again and again. By my count, at least three massive episodes of population dispersal and mixture are necessary in Reich and colleagues' model. A picture of their admixture hypothesis:

    Denisova admixture model from Reich et al. 2011

    This model depicts (a) an early divergence of African (represented by Yoruba) and Asian/Australasian populations. These mix first with Neandertals and then (for the Australian/New Guinea/Mamanwa populations) with Denisova-like people. Later (b), after the initial habitation of the Philippines by the ancestors of Mamanwa, a population like Andamanese Onge pushes into the islands, mixing with the ancestors of New Guinea and Australian populations. Later still (c), a population ancestral to today's Chinese people mixes with Philippines and other Southeast Asian people.

    As complicated as it looks, even this model must be a vast oversimplification. I don't like or attribute much belief to mixture models like this, as they assume too much about relative population sizes and the timing of mixture. Many recent hunting and gathering populations of Southeast Asia are not included in the current samples, and the Chinese sample is itself the result of very recent demographic events, covering what once may have been a wider diversity of peoples. Depicting Australian and New Guinean populations as monolithic is an artifact of the small sample; these places themselves housed a tremendous diversity of peoples. Nevertheless, the true model won't be simpler than this one; it will involve many more events that the data cannot yet resolve.

    Hints of that complexity emerge from the Aboriginal Australian whole genome. Rasmussen and colleagues show that this individual shares some ancestry with East Asian peoples, but on the whole, populations in Europe and East Asia are much more genetically similar to each other than to this genome. The picture from the whole genome is essentially the same as that drawn by the SNP comparisons by Reich and colleagues, but with the potential (in the long run) to actually trace the histories of individual genes. And I think the gene-by-gene account of history will be important, because we already have some evidence that a few Denisovan genes do persist in mainland Asia, even though most are gone.

    To explain why, we can look at the proportion of Denisovan ancestry in different populations as depicted in a map by Reich and colleagues. The pie charts are confusing here, because they report the fraction of ancestry from Denisovans in each population relative to the 5% estimate for New Guinea. So Australians also have 5% in this figure, Timorese have around 2.5%, and Bougainville has more than 4%.

    Notice the apparent lack of Denisovan ancestry in anyone who lives anywhere that was once connected by land with mainland Asia. I say "apparent" deliberately: Abi-Rached and colleagues reported last month on the widespread distribution of Denisovan HLA types among today's Asian populations, and those may well be products of Denisovan genes that were later selected. I've already identified a handful of other loci that seem to reflect Denisovan ancestry in mainland Asian people. According to the comparisons by Reich and colleagues, such loci must be exceptions.

    At the same time, the mixture model presents an important idea: Once there were people in Southeast Asia who had much more Denisovan ancestry than any populations still remaining today. Both Australian/New Guinea populations and Philippine populations like the Mamanwa have subsequently mixed with new immigrants who lacked any sign of Denisovan ancestry. Prior to this later mixture, the ancestors of those populations must have been more Denisovan -- Reich and colleagues estimate 7%. This is the first evidence that ancestry from archaic people of Eurasia was diluted to a lower value by later population movements. If the population mixture originally happened somewhere in mainland Asia, any traces of Denisovan ancestry in those areas have been diluted almost to nonexistence. But the persistence of some genes would be predicted if natural selection were maintaining them in the face of demographic pressure from elsewhere.

    About the Australian genome, there will be much more interesting analyses to come, I expect. As whole-genome data come to represent more of the variation within human populations, we get a larger store of information about how we came to be variable. Variation traces not only to population movements and demography, but also to natural selection. Australia's population history has been very different from many populations of the Old World, and this genome should give us new perspective on the effects of that demographic history.

    Synopsis: 
    The hypothesis of a single out-of-Africa dispersal is rejected by new data about Denisovan mixture and whole-genome sequencing of an Aboriginal Australian.
  • Agriculture, population expansion and mtDNA variation

    Mon, 2011-05-23 11:50 -- John Hawks

    Earlier this spring, I wrote about a paper by Brenna Henn and colleagues that presented new data on SNP variation in recent African hunter-gatherer populations [1] ("Population structure within Africa: has 'modern human origins' become a non sequitur?").

    Another paper that came out this spring from the same research group is also very interesting. Christopher Gignoux, Henn and Joanna Mountain [2] examined the evidence for Holocene population growth in Europe, Africa and Southeast Asia, from within-haplogroup variability of mtDNA haplogroups. The idea is that earlier samples were not finely resolved enough to examine events of the last few thousand years, either because they included only small sequences (e.g., control region) with limited variation, or because they included whole mtDNA genomes with too few individuals to look at within-haplogroup coalescents. So here they add more individuals. It is still a small number (425 total) and so I expect that we will see better samples in the next few years.

    The results are nonetheless useful because they provide some nice matches for the archaeology of early agriculture. For example, in Africa:

    We find two periods of population expansion within our sample of lineages originating during the Holocene in western Africa. Although the majority of coalescent events occur during the Holocene, a number of lineages from this sample also coalesce during the Upper Paleolithic. The earliest growth begins at ≈38,000 ya (CI: 33,500–45,000 ya) (Table 1 and Fig. S1) and the second period begins at ≈4,600 ya (CI: 3,000–10,000 ya) (Table 1 and Fig. 1B). The correspondence between the timing of genetic evidence for a sharp increase in population size at 4,600 ya in our Holocene sample of sub-Saharan Africans and the archaeological evidence for origins of agriculture in western Africa is quite close (Fig. 1B and Table 1). In contrast, our southern African Upper Paleolithic sample representative of hunter-gatherers shows no growth over the past 20,000 y. We suggest Bantu-speaking farmers and other pastoralist groups migrated throughout southern Africa 2,000 ya (27) without impacting southern African mtDNA lineages (Fig. 1B).

    We can't really understand the pattern of genetic variation within Africa without understanding when the population grew. In Africa, Middle Stone Age genetic variation must have been more extensive than that in other regions of the world. But the survival of that MSA variation to the present day depends on the demography of populations over the past 50,000 years. In a growing population, fewer lineages will be lost by random genetic drift. So if Gignoux, Henn and Mountain are right about the growth of West African populations by 35,000 years ago, we might expect that region to preserve some extensive variation from MSA times. That might explain why that population preserves very deep Y chromosome lineages [3]. Regarding only mtDNA, one might conclude that a historical paucity of migration between hunter-gatherer and agricultural groups would be the most important reason why MSA variation remains in the present-day African population. This has been the explanation for survival of deep mtDNA lineages in southern Africa, for example. The Y chromosome result and the current paper remind us that population growth can also preserve variation from earlier time periods.
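    The claim that a growing population loses fewer lineages to drift can be checked with a toy simulation. The sketch below assumes a neutral haploid Wright-Fisher model in which every founder starts with a unique lineage label; the population size, growth rate and number of generations are arbitrary choices for illustration, not estimates for any African population.

```python
import random


def surviving_lineages(n0, generations, growth_rate, rng):
    """Count founder lineages still present after neutral reproduction.

    Each generation, every individual picks one parent uniformly at random
    from the previous generation (haploid Wright-Fisher sampling). The
    population size is multiplied by (1 + growth_rate) each generation.
    """
    population = list(range(n0))  # one unique lineage label per founder
    size = float(n0)
    for _ in range(generations):
        size *= 1.0 + growth_rate
        n_next = max(1, round(size))
        population = [rng.choice(population) for _ in range(n_next)]
    return len(set(population))


# Arbitrary illustrative parameters (assumptions, not fitted values).
rng = random.Random(1)
constant = surviving_lineages(n0=200, generations=50, growth_rate=0.0, rng=rng)
growing = surviving_lineages(n0=200, generations=50, growth_rate=0.05, rng=rng)
print(f"constant size: {constant} lineages; growing: {growing} lineages")
```

    In the constant-size run, most founder lineages disappear within a few dozen generations. In the growing run, losses are concentrated in the first generations, before the population becomes large enough that drift slows down.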

    I think this proposal of African population history matches very well the model that we assumed in our acceleration paper [4], which we based on the archaeological record. We suggested early population growth in Africa by 35,000 years ago followed by an agricultural expansion after 5000 years ago. The evidence for relatively late agricultural intensification, within the last 4000-5000 years in sub-Saharan Africa, is very clear archaeologically. Less clear: How big was the earlier, pre-agricultural human population? The LSA might correspond to a demographic intensification, generally after 45,000 years ago. Genetics has certainly seemed to support such a view, and we found it consistent with the evidence that positive selection had increased in rate much earlier in Africa than in other regions. Still, the more detailed study by Gignoux and colleagues helps to clarify this picture.

    The results also show agricultural population growth to have been late in Southeast Asia.

    Direct archaeological evidence for rice agriculture in southeastern Asia dates to only ≈4,400 ya in Thailand (28). Agriculture spread throughout Island Southeast Asia, with evidence of rice in Taiwan again dating to ≈4,400 ya. Our Southeastern Asian Holocene population size curve indicates expansion beginning ≈4,700 ya (CI: 3,000–5,700 ya) (Fig. 1C and Table 1).

    Again, useful. I think we need to exert some effort making sure that the initial dispersal of people into South/Southeast Asia can be differentiated from the post-agricultural history. But assuming that Gignoux and colleagues are correct, it makes sense in an overall picture of slowly adapting early crops to tropical climate regimes, or replacing early domesticates with different ones in those areas.

    I am less sanguine about their results for Europe. They show a gradual period of growth associated in time with the Younger Dryas (around 12,000 years ago), which could make sense in the archaeology. But I am not convinced that the "European" haplogroups here are really European to that time depth. We know that the Neolithic and post-Neolithic saw some large-scale shifts in the frequencies of mtDNA haplogroups in Central and Western Europe. Some Upper Paleolithic Europeans probably contributed mtDNA to this later population, but I have no confidence that the proportion was great enough to accurately infer the demography of that pre-Neolithic population. (This is also a problem with the current paper in Current Anthropology by Peter Rowley-Conwy. I'll discuss this sometime soon.)

    The next frontier in reconstructing the population history of Europe will be ancient DNA. A good sample of Neolithic and pre-Neolithic whole mtDNA genomes would settle this question and allow inferences about the kind of demographic recovery Europe underwent after the Last Glacial Maximum.

    An open question is to what extent the other populations have similar problems. The European population of today reflects West Asian population dynamics 10,000 years ago. The East African population today reflects West African population dynamics from before the Bantu expansion, possibly to a similar extent. The population of Southeast Asia reflects the population dynamics of early rice agriculturalists in South China. And so on.

    Adding large-scale migration and partial population replacement to this kind of demographic analysis is not easy, but it will be essential if we want a better picture of how agriculture affected human populations. Considering these problems, I think it's easy to see why I started working on Holocene population dynamics. Evidence about Late Pleistocene populations, like MSA Africans and Neandertals, still lies within our genomes. But we see it through a lens. Holocene population dynamics -- movements and population growth -- distort that lens. If we don't account for those Holocene dynamics, we will conclude wrongly about the earlier dynamics.

    I like this a lot, because this is what anthropology is really good for. We can bring a lot of archaeological and historical knowledge to bear on the question of post-agricultural population dynamics. But it's a deep, deep field with a lot of specialized literature.


    References

    Synopsis: 
    A study of mtDNA variation attempts to find the times and magnitudes of population expansions in early agriculturalists.
  • Older and younger Acheulean in India

    Sun, 2011-03-27 00:37 -- John Hawks

    Shanti Pappu and colleagues [1] report on date estimates resulting from new excavations at the old site of Attirampakkam, India. The news element is that they date an Acheulean occurrence to as old as 1.5-1.6 million years ago. At the oldest, these dates would make the Acheulean in India equal in age to the earliest occurrences in Africa.

    The dates depend on the decay of cosmogenic nuclides in the artifacts themselves. This is a kind of exposure dating -- as the artifacts are exposed to cosmic rays at the Earth's surface, they build up radioactive isotopes of beryllium and aluminum (10Be and 26Al), which have half-lives of 1.39 million and 717,000 years, respectively. When they are buried deep underground, their exposure to cosmic rays stops, and the radioactive isotopes can only decay. Because the two isotopes decay at different rates, the ratio of the two in the sample then reflects the time since deep burial. But like other exposure methods, in practice this depends on a model of exposure time, burial speed, and radioactivity within the soil, which lends substantial uncertainty to the dates. The lower bound of the 95% confidence interval for each of the date estimates reported in the paper is still over a million years, leading to the minimal conclusion that the site is that age or older.
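    The logic of reading a burial time from the isotope ratio can be sketched in a few lines. This is a simplified illustration, not the model used in the paper: it assumes burial was deep and fast enough that no new isotopes were produced afterward, and it takes a starting 26Al/10Be ratio of 6.75, a commonly used surface production ratio that does not come from this post. The half-lives are the ones given above.

```python
import math

HALF_LIFE_BE10 = 1.39e6  # years, as given in the text
HALF_LIFE_AL26 = 7.17e5  # years, as given in the text

# ASSUMPTION: conventional surface production ratio of 26Al to 10Be.
ASSUMED_INITIAL_RATIO = 6.75


def decay_constant(half_life):
    """Convert a half-life into an exponential decay constant (per year)."""
    return math.log(2) / half_life


# 26Al decays faster than 10Be, so the ratio falls at the difference in rates.
RATIO_DECAY = decay_constant(HALF_LIFE_AL26) - decay_constant(HALF_LIFE_BE10)


def ratio_after_burial(initial_ratio, years):
    """26Al/10Be ratio after `years` of burial with no further production."""
    return initial_ratio * math.exp(-RATIO_DECAY * years)


def burial_age(measured_ratio, initial_ratio=ASSUMED_INITIAL_RATIO):
    """Simple burial age in years implied by a measured 26Al/10Be ratio."""
    if measured_ratio <= 0 or measured_ratio > initial_ratio:
        raise ValueError("measured ratio must be in (0, initial_ratio]")
    return math.log(initial_ratio / measured_ratio) / RATIO_DECAY


# A ratio that has fallen to half its starting value implies roughly
# 1.48 million years of burial under these assumptions.
example_age = burial_age(ASSUMED_INITIAL_RATIO / 2)
print(f"Implied burial age: {example_age / 1e6:.2f} million years")
```

    Because the ratio itself halves only about every 1.5 million years, small measurement errors in either isotope translate into large uncertainties in age, which is consistent with the wide confidence intervals described above.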

    Robin Dennell has written an accompanying short essay that gives a broader view of the Acheulian in South Asia [2]. The essay includes a great paragraph summarizing the now-obsolete idea that Acheulean reached India only a half million years ago:

    How does this new evidence affect our understanding of the South Asian Acheulian? Previously, the general consensus was that the Indian Acheulian was less than 0.6 to 0.5 Ma (5) and was thus much younger than that in the Levant (eastern Mediterranean). There, the earliest dates of 1.4 Ma, from ‘Ubeidiya in Israel, probably indicate a dispersal of hominins from Africa (6). A second influx of African immigrants is indicated by the discovery of African types of cleavers and hand axes at Gesher Benot Ya'aqov (GBY), in Israel, dated to 0.78 Ma (7). This evidence implied that the Acheulian dispersed eastward toward South Asia only several hundred millennia after it first appeared in the Levant. It also implied that the spread of Acheulian bifacial technologies into South Asia was broadly contemporaneous with its first appearance in Europe, where the earliest sites date from ∼0.5 to 0.6 Ma (8). Some have attributed this expansion of the Acheulian into South Asia and Europe to Homo heidelbergensis. This Middle Pleistocene type of hominin is known mostly from Europe, where it was first defined, but is also recognized by some (but not all) researchers at African sites such as Bodo, Ethiopia, and Kabwe, Zambia, and even at some sites in China (9).

    The "Homo heidelbergensis" model is in such utter disarray right now, I'm not sure many paleoanthropologists have realized the full extent of the problems. You should know that I don't believe in Homo heidelbergensis, never have. A couple of months ago, I was discussing some of the issues about mutation rate estimation with a very prominent geneticist, and the conversation turned to Homo heidelbergensis. What a shock the Denisova sequence should have been to those itching to see a H. heidelbergensis incursion into Asia!

    Notice, however, the intrinsic nuttiness of archaeological interpretation. Oh, we have the first evidence for Acheulean in India around 600,000 years ago? Well, that's around the same age as the Bodo fossil from Ethiopia! What a coincidence! Maybe this new kind of hominin expanded from Africa and carried the Acheulean to India! And Sima de los Huesos is around 600,000 years old, too -- and there's a handaxe in the pit! My gosh, we need a name for those hominins!

    Well, the nice thing about a hypothesis built on mere coincidence is that it only takes one observation to falsify it. Million-year-old handaxes in India ought to do it, and how. That's the message of Dennell's essay, and the subtext of the paper by Pappu and colleagues. What I find interesting is the extent to which the fact was hinted at by earlier discoveries in South Asia but hampered by weaknesses in stratigraphic control and dating. From Pappu and colleagues:

    Sparse radiometric ages from sites in India have situated the Acheulian within the Middle Pleistocene, with a few dates suggesting an early Middle to Early Pleistocene age. However, these ages often exceed the limits of confidence of the methods used (2). They include an electron spin resonance (ESR) mean age of 1.27 ± 0.17 Ma, assuming linear U uptake, on two herbivore teeth from Isampur (23); an ESR age of ~0.8 Ma (lacking uncertainty envelopes) on calcrete from the Amarpura formation, Rajasthan (24), which has been correlated with the Acheulian site of Singi Talav (4); dates ranging from ~1.4 to 0.67 Ma for the tephra at Bori (Kukdi river) (25); and paleomagnetic measurements with evidence of reversals at the sites of Bori, Morgaon, Gandhigram, Andora, and Nevasa (26). However, the reliability of these ages has, in each case, been questioned on various grounds (5, 27, 28). Likewise, the age and stratigraphic position of artifacts and faunal remains from the Early Pleistocene Dhansi formation along the river Narmada are yet to be firmly established (29). Based on data from controlled excavations and two independent dating methods, our ages from Attirampakkam show that the Acheulian in India is older than previously thought. Evidence from other sites in South Asia should be reconsidered and redated.

    Much evidence already exists in the South Asian Acheulean that could be more accessible. The Acheulean in the region has been a long block of undifferentiated time, despite some very well-resolved sites. In addition to this much older dating for early Acheulean, India also has some of the youngest Acheulean assemblages anywhere -- for example, Haslam and colleagues [3] earlier this month reported on an Acheulean assemblage from around 130,000 years ago in northeastern India. That's long after the large biface tradition begins to give way to Middle Paleolithic and MSA toolkits in Europe and Africa.

    On the topic of Denisova, Haslam and colleagues were writing before that genome was reported. But they did know about the Neandertal genetic results, including the evidence of Neandertal ancestry within India. Nevertheless, they assert a scenario in which the makers of earlier and later Acheulean in South Asia are the same biological population, without substantial gene flow from regions to the west, including the Neandertals.

    Recent reports of the draft Neanderthal genome suggest that Neanderthals and H. sapiens likely did interbreed successfully soon after the latter had left Africa (Green et al., 2010), with the probable location of such contact to the west of India, in the Middle East. The southern limit of the Neanderthal range is unknown (Dennell and Roebroeks, 2005), but we emphasise that the continuity seen in the Middle Pleistocene South Asian technological record suggests that taxa derived from earlier hominin dispersals, and not Neanderthals, were the creators of the Indian Late Acheulean. Greater biological separation between dispersing humans and resident Indian hominins may have precluded viable genetic mixing (although see Liu et al., 2010 for an alternate view from East Asia), while similarities in certain technological strategies may have rendered cultural exchange a somewhat more likely occurrence.

    Well, the Denisovans didn't have to live in India when the ancestors of Melanesians ran across them and intermarried. But Denisova and the Neandertal genomes now make it very likely that the inhabitants of South Asia were one or the other. And even if South Asians were yet a third group, as yet unattested from genomes, it is no longer credible to suppose that they were isolated from Europe or Africa for a million years previous. The tools just don't have that much to do with the populations.


    References

    Synopsis: 
    Long known from India, new papers are adding detail to the temporal extent of the Acheulean.
  • Orangutan dynamics of Borneo

    Wed, 2010-11-24 01:46 -- John Hawks

    Bornean and Sumatran orangutans are the most highly divergent subspecies within any of the living species of great apes. The two are farther apart even than chimpanzees and bonobos, which are good biological species. The time of the Bornean-Sumatran orangutan divergence as estimated from mtDNA is around 3.5 million years ago.

    This is old enough that many primatologists consider the two populations as separate biological species. The species distinction is supported by some aspects of morphology, but as yet we have no good nuclear DNA information about the extent of divergence. In chimpanzees, nuclear genetic comparisons suggest a relatively recent founding of one subspecies and recurrent gene flow between the others, despite high mtDNA divergence between the subspecies. So information from across the genomes of Bornean and Sumatran orangutans may be necessary to substantiate the hypothesis of long isolation suggested by mtDNA.

    Within Borneo, different local populations of orangutans have strong genetic differentiation, with few shared mtDNA haplotypes among them. A new study by Natasha Arora and colleagues [1] has provided further detail about these relationships within Borneo. Based on earlier work, they expected to find high population differentiation within Borneo, and that is what they found:

    [O]ur analyses revealed high and significant mitochondrial differentiation, with populations within currently recognized subspecies generally displaying as much differentiation as those between subspecies. Of notable interest is the great extent of subdivision and lack of reciprocal monophyly for the morphologically recognized subspecies P. p. morio and P. p. wurmbii. MtDNA haplotype sharing is uncommon and for populations separated by rivers occurs only in two instances: (i) for SA and GP and (ii) for the northern and southern populations across the Kinabatangan river. In both cases, very recent common ancestry could explain the incomplete mtDNA lineage sorting. For North Kinabatangan (NK) and SK, Jalil et al. (27) proposed an expansion from a recent common refugium further west in Mount Kinabalu, as posited for other Bornean species (46, 47, 49). DV, with its low haplotype diversity, might also be the result of a recent range expansion. GP is located proximally to the Bangka–Belitung–Karimata–Schwaner divide, from where orangutans are presumed to have dispersed to the rest of Borneo (12) and where we might expect a rich haplotype diversity. However, the presence of only one mtDNA haplotype shared with populations further east suggests that the current population in GP is recent and/or underwent a severe recent bottleneck. This and other local bottlenecks make it impossible to reconstruct a colonization of Borneo through the southwestern “choke point” (52).

    They were able to confirm the relatively strong differentiation of Bornean populations by examining nuclear microsatellites. These do not give a great indication of the time period over which the populations may have developed their differentiation, but the microsatellites do document the relative lack of allele sharing between the populations, attesting to a history of low gene flow in the recent past. The populations they identify as strongly differentiated do not correspond entirely with the subspecies recognized along morphological lines, but there are strongly differentiated populations here.

    The "news" aspect of the paper is the one unexpected observation: the mtDNA ancestor of Bornean orangutans lived relatively recently, only around 176,000 years ago (with a range of error stretching from 72,000 to 320,000 years ago). The data in the study do not allow us to distinguish whether this was a time when the Bornean population may have been founded, or whether instead the mtDNA lineage spread through pre-existing populations. The authors pursue the hypothesis that Bornean orangutans were limited to a refugium sometime during the early Late Pleistocene:

    Assuming that orangutans arrived in Borneo around the same time as gibbons and macaques, the recent coalescence of Bornean orangutans could be explained by a bottleneck through a severe rainforest contraction. Such a bottleneck would have had a more dramatic impact on the mtDNA structure of orangutans compared with other species as a result of their low densities and slow life histories (18) as well as habitat requirements.

    The comparison with gibbons and macaques is necessary because both have substantially deeper mtDNA coalescence times within their Bornean populations. If the forest had been substantially reduced to a small area where orangutans could survive, we might expect the other primates to reflect this event -- and they don't. Nevertheless, a grab-bag of climate change scenarios appears next:

    Geomorphological and palynological data indicate the presence of dryer, more open vegetation in southern and western Borneo during the last glaciation (2, 41), and by extrapolation also during other glaciations (but c.f. refs. 42, 43). Climate change was especially severe during an extended cold period within the penultimate glaciation between 130 and 190 ka (44, 45), which occurred approximately at the time of mean coalescence of Bornean mtDNA haplotypes. More recently, the last Toba eruption approximately 74 ka resulted in a short, albeit significant, decrease in regional temperatures, ensued by a 1,800-y cold stadial (9, 10). Our data do not provide clear signals to make conclusive statements about potential Toba effects. Nonetheless, the coldest period of the penultimate glaciation (44, 45) was more prolonged than the cold period following the last Toba eruption, suggesting more severe effects of the former on the extent of rainforest across Sundaland. In any event, suitable rainforest habitat for orangutans should have existed in certain regions in Borneo where a refugium population survived the dry glacial conditions.

    A coalescence time of 176,000 years ago does not point to a short-duration bottleneck that began 74,000 years ago. If orangutans in the Middle Pleistocene of Borneo had high genetic differentiation, a crash would have to have been very severe -- eliminating all but one small regional population -- to have effected the present distribution. Still, the great uncertainty in the actual coalescence time leaves open many possibilities, and the refugium hypothesis in the general case is worth testing, even if the Toba eruption in particular cannot explain the data.

    Given the uncertainty about the habitat structure of the now-submerged areas of Sunda, we may also want to consider the hypothesis that the present orangutans arrived recently on Borneo from mainland Southeast Asia. Even if orangutans had lived on Borneo during the Middle Pleistocene, they may not have been the current orangutans. Or even better, they may have been Neanderorangs -- an initial population that was genetically swamped by migrants arriving from elsewhere. The deep Sumatra-Borneo divergence means that the Bornean population was probably not recently derived from Sumatra, but that's a very restricted source compared to the Late Pleistocene distribution of orangutans across mainland and island East and Southeast Asia.

    Some other animals walked from Sumatra to Borneo repeatedly during the Pleistocene, including humans. In the human case, we know that a large fraction of the genetic ancestry of Bornean and Javan people was derived from Asia within the last 100,000 years -- in other words, Late Pleistocene gene flow. The movement of genes may have happened in the context of a dispersal of Asian (or ultimately, African-derived) populations into island Southeast Asia. The paper includes some discussion of other primate species:

    For instance, the south Bornean gibbon Hylobates albibarbis and the Sumatran–Malaysian gibbon Hylobates agilis have a TMRCA of 1.56 Ma (36), and Bornean and Sumatran pig-tailed macaques have one of 3 to 4 Ma (37). By contrast, the Bornean–Sumatran common ancestor of both the silvered langur (39) and clouded leopard (40) is much more recent than that of orangutans, gibbons, and pig-tailed macaques, probably because of a higher flexibility in habitat use.

    The pig-tailed macaque divergence time is more or less the same as the orangutan divergence; the others are more like the time range for human dispersals into island Southeast Asia. We can add to the primates a few other medium-sized mammals; for example, clouded leopards are highly differentiated between Sumatran and Bornean populations, and their mtDNA divergence occurred sometime after 3 million years ago.

    There may be no contradiction between the recent mtDNA common ancestor and the high degree of population structure in Bornean orangutans; the mtDNA could have been selected. We really would want resequencing of a lot more loci in these orangutan populations, for which we may not have to wait too long. Mitochondrial DNA is convenient in many ways, including its greater sensitivity to restricted population size and higher mutation rate. But the intrinsic variance of a single gene system under genetic drift is so high that this disadvantage probably outweighs all advantages for reconstructing population sizes.
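
    To make the single-locus variance point concrete, here is a minimal sketch in Python. It is not taken from the Arora paper. It assumes the textbook neutral coalescent for a constant-size, randomly mating population, with time measured in coalescent units, and the function names (simulate_tmrca, tmrca_spread) are made up for illustration. It draws the time to the most recent common ancestor for a sample of lineages, then asks how much the average of that time varies when you have one locus versus many independent loci.

```python
import random
import statistics


def simulate_tmrca(n_lineages, rng):
    """Draw one TMRCA under the standard neutral coalescent (assumed model).

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k*(k-1)/2, in coalescent time units.
    """
    t = 0.0
    for k in range(n_lineages, 1, -1):
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)
    return t


def tmrca_spread(n_lineages, n_loci, n_reps, seed=1):
    """Coefficient of variation of the mean TMRCA over n_loci independent loci.

    Each replicate averages n_loci independent TMRCA draws; the spread of
    those averages across n_reps replicates is returned as stdev / mean.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_reps):
        draws = [simulate_tmrca(n_lineages, rng) for _ in range(n_loci)]
        means.append(sum(draws) / n_loci)
    return statistics.stdev(means) / statistics.mean(means)


if __name__ == "__main__":
    # Compare one locus (the mtDNA-like case) with 10 and 100 independent loci.
    for loci in (1, 10, 100):
        print(loci, round(tmrca_spread(20, loci, 500), 3))
```

    Under these assumptions the spread for a single locus is a large fraction of the mean, and it shrinks roughly with the square root of the number of independent loci. Real populations with structure, bottlenecks, or selection will behave differently, so treat this only as an illustration of why one gene system is a weak basis for estimating population size.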

    At any rate, the orangutans now provide an additional case where the subspecies-level history of hominoids is more complex than depicted five or six years ago. Uncovering these kinds of dynamics highlights the need for better modeling of demography and dispersal within a geographically widespread species. Isolation-by-distance and long-lasting subspecies are well-defined models, but when they are refuted, we have a lack of well-defined alternatives.


    References

  • Mailbag: The Neandertal fraction

    Tue, 2010-09-07 15:22 -- John Hawks

    Re: Neandertal DNA

    I have a question about your "Neandertals Live!" entry written on May 8, 2010.

    When you say that living non-African populations (ancestry) derive
    1-4% of their genomes from Neandertals, does this mean all living
    individuals of non-African descent have some genomic contribution from
    Neandertals? In other words, could one say if you or myself
    specifically have some kind of Neandertal DNA contribution? Or, does
    the 1-4% only refer to certain populations outside of Africa, while
    nothing can be said about individual non-Africans?

    For example, would having Neandertal genes be analogous to certain
    populations, like certain ethnicities, having a particular founder
    mutation on a haplotype, like sickle-cell anemia in people of African
    descent? In other words, some living groups of individuals have them,
    but not all living individuals have them?

    The comparison results from the greater similarity of European (and other non-African) people to the Neandertal sequence, compared to African people. It takes 1-4% genetic contribution to explain this similarity.

    That's an unusual comparison, and it leads to unusual limitations. The number is genome-wide and we don't know (yet) whether some parts of the genome are more consistently Neandertal than others. We also don't know (yet) whether Africans have no Neandertal at all, or just 1-4% less than non-Africans.

    We know nothing at all about individuals (at this moment) although I expect we'll be able to say something about the heterogeneity of Neandertal contribution fairly soon.

    I expect that some genes will have a very common Neandertal-derived haplotype outside of Africa because of selection, and that these will account for a predominant fraction of the admixture. But I can't say we know this yet empirically.
