john hawks weblog

paleoanthropology, genetics and evolution

metagenomics

  • Population gut metagenomics

    Fri, 2012-05-11 08:37 -- John Hawks

    The new research by Tanya Yatsunenko and colleagues examining gut microbiomes in different human populations is just incredibly cool work [1]. I don't have time to write much about it this morning, but Ed Yong's report is an excellent place to start: "Three nations divided by common gut bacteria".

    The population genomics of these gut microbes is a great topic also, but what I find most interesting is the parallel ontogenetic changes among populations from infants to adults:

    The guts of babies are dominated by Bifidobacterium – the group that’s commonly found in probiotic foods. They’re also loaded with genes for producing folate, an essential B-vitamin that’s involved in creating and repairing DNA. These folate-making genes decline as babies grow up, and get more of the vitamin from their diets. At the same time, the genes for making other vitamins, like B1, B7 and especially B12, become more common. “This similarity across cultures in building up the gut microbiome in childhood has been touched on before but it’s much more convincing here,” says Peer Bork, from the European Molecular Biology Laboratory.

    Adam Van Arsdale also has written up some thoughts about the research: "The human gut microbiome".



  • Scanning the ape fecome

    Mon, 2010-09-27 17:00 -- John Hawks

    Donald McNeil, Jr., has written up some background detail about last week's story that falciparum malaria came from gorillas: "A finding on malaria comes from humble origins". It's one of many research findings coming out of a systematic collection of fecal samples from African ape field projects:

    Dr. Hahn, a virologist at the University of Alabama at Birmingham, is an expert not in malaria but in S.I.V., or simian immunodeficiency virus, the precursor to the virus that causes AIDS in humans. But she has made deals with primate researchers all across Africa who collect fecal samples for their own projects, to have them take extras for her.

    They go into vials with a special solution, called RNAlater, that preserves the nucleic acids of all the cells in the sample — which includes not only what apes eat, but cells sloughed off their gut linings, which contain all the things infecting them. She has systematically sequenced the genes of many of those infective agents: S.I.V., simian foamy virus, hepatitis and now malaria parasites.

    Poop metagenomics. I wonder to what extent pathogens in meat may pass through the gut with DNA intact. Probably not a big issue with African apes, as meat consumption is fairly sporadic even in chimpanzees. But you'd want to be cautious doing certain things with carnivores.

  • NEANDERTALS LIVE!

    Thu, 2010-05-06 12:53 -- John Hawks

    I, for one, welcome my Neandertal ancestry.

    It may not sound like a lot -- between 1 and 4 percent. But that's the equivalent of one great-great-great grandparent's DNA contribution. In the case of the Neandertal contribution, more than 1500 generations ago, it's an enduring legacy of an ancient group of people, spread across many lines of the genealogies of living people. Beyond their genealogical interest, Neandertal genes might have made a big difference to our evolutionary potential.
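    The grandparent comparison is simple halving arithmetic. As a rough sketch (the function name is mine, and it gives only the expected share, since real inheritance scatters around that value because of recombination):

```python
def expected_ancestry_fraction(generations_back: int) -> float:
    """Expected share of autosomal DNA inherited from a single ancestor
    who lived `generations_back` generations ago (parent = 1)."""
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    # Each generation halves the expected contribution of any one ancestor.
    return 0.5 ** generations_back


# parent = 1, grandparent = 2, great-grandparent = 3,
# great-great = 4, great-great-great = 5
GREAT_GREAT_GREAT_GRANDPARENT = 5

share = expected_ancestry_fraction(GREAT_GREAT_GREAT_GRANDPARENT)
print(f"{share:.3%}")  # 3.125%
```

    One ancestor five generations back contributes 1/32 on average, which sits inside the 1 to 4 percent range.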

    In case you wonder what the heck I'm talking about, here's the story: Two new papers in Science describe the full draft sequence of the Neandertal genome, and perform additional analyses to understand the pattern of adaptive evolution in the population ancestral to living people.

    Richard Green and colleagues report on the genome, demonstrating very convincingly that present-day people have Neandertal ancestors. It is not entirely obvious when and where the gene flow between Neandertals and other ancient populations happened -- whether it was associated with the dispersal of most of our ancestry from Africa, or whether it may have been earlier. The gene flow was not limited to Europe, and evidence for Neandertal ancestry occurs in East Asian and Australasian populations.

    The paper is full of other good stuff, including some evidence about which gene regions changed under selection in the ancestral human population.

    Meanwhile, the second paper by Burbano and colleagues applies new microarray techniques to assess how much of the human legacy of amino acid changes has arisen in the latest, post-Neandertal period of our evolution.

    So there's a lot about the pattern of evolution and gene flow leading to living people, and a lot about adaptive and functional evolution. That makes a lot for me to cover -- and while I have the papers a little early, time is short. Let's see how much I can help clarify what's in this new research.

    If you had to sum up in a few words, what does this mean for paleoanthropology?

    These scientists have given an immense gift to humanity.

    I've been comparing it to the pictures of Earth that came back from Apollo 8. The Neandertal genome gives us a picture of ourselves, from the outside looking in. We can see, and now learn about, the essential genetic changes that make us human -- the things that made our emergence as a global species possible.

    And in doing so, they've taken a forgotten group of people -- whom even most anthropologists had given up on -- and they've restored them to their rightful place in our heritage.

    Beyond that, they've taken all of their data and deposited it in a public database, so that the rest of us can inspect them, replicate results, and learn new things from them. High school kids can download this stuff and do science fair projects on Neandertal genomics.

    This is what anthropology ought to be.

    What did they sequence?

    The Max Planck group obtained most of their genomic sequence from three specimens from Vindija -- Vi33.16, Vi33.25, and Vi33.26. These are all postcranial fragments with minimal anatomical information. Green and colleagues were able to establish that the three bones represent different women, and that Vi33.16 and Vi33.26 may represent maternal relatives.

    From these skeletons they got 5.3 billion bases of sequence. All this from an amount of bone powder about equal in mass to an aspirin pill.

    Amazing. I mean, I know the folks at Max Planck are reading this. It's inspiring to see what they've been able to do. These are three pieces of barely diagnostic hominin bone, and they've obtained literally hundreds of times more information than we have ever gotten from the fossil record of Neandertals.

    I'll describe the analyses of genetic similarity with humans in more detail below. As a brief summary, of those positions where the human genome differs from chimpanzees, Neandertals have the chimpanzee version around 12.7 percent of the time -- meaning that across the genome, a Neandertal and a human will share a genetic ancestor an average of around 800,000 years ago. This is a couple hundred thousand years older than the same figure for two humans compared to each other. The greater age of genetic common ancestors reflects partial isolation between the Neandertal population and the African populations that gave rise to most of our current genetic variation.
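    The step from 12.7 percent to roughly 800,000 years is a scaling by the age of the human-chimpanzee split. A minimal sketch, assuming a constant rate of change along the human lineage; the split dates used here (6.5 million years, plus a lower and higher value for comparison) are my assumptions and are not given in this post:

```python
def divergence_time_years(fraction_of_human_lineage: float,
                          human_chimp_divergence_years: float) -> float:
    """Scale the human-chimpanzee divergence time by the fraction of
    human-lineage changes that postdate the Neandertal split."""
    return fraction_of_human_lineage * human_chimp_divergence_years


NEANDERTAL_ANCESTRAL_FRACTION = 0.127  # from the text above
ASSUMED_HUMAN_CHIMP_YEARS = 6.5e6      # assumption, not from the post

estimate = divergence_time_years(NEANDERTAL_ANCESTRAL_FRACTION,
                                 ASSUMED_HUMAN_CHIMP_YEARS)
print(round(estimate))  # 825500

# The estimate moves in proportion to the assumed split date.
for assumed_years in (5.5e6, 6.5e6, 7.5e6):
    years = divergence_time_years(NEANDERTAL_ANCESTRAL_FRACTION, assumed_years)
    print(f"{assumed_years:,.0f} -> {years:,.0f}")
```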

    The team were able to identify 111 candidate duplications, almost all of which have some evidence of copy number variation in humans or other primates. They tentatively show that Neandertals have a bit more copy number variation than present-day humans, and identify a few loci with substantially higher copy numbers in one group or the other.

    A substantial part of the paper is dedicated to finding evidence of positive selection on the human lineage after the emergence of Neandertals. The idea is to look for fixed selective sweeps -- regions where humans are likely to have SNPs absent in Neandertals and a relatively shallow gene tree. They identify 212 regions like this -- as I discuss below, a surprisingly low number.

    The second paper, by Hernán Burbano and colleagues, describes the application of a targeted microarray to probe Neandertal genetic samples for protein-coding variants that separate humans from chimpanzees. They identify 88 amino acid substitutions that seem fixed in the known sample of living humans, but not present in the Neandertal sequence. Those 88 are not necessarily all functionally important, although this list will include a number of "structural" genetic changes that make a difference to proteins expressed worldwide today. There is much to come in analyzing the categories and genes represented in both lists, which may tell us very interesting things about our Late Pleistocene evolution.

    What is the evidence for interbreeding?

    From their initial work sequencing the nuclear genome in Neandertals, the Max Planck group has followed a clever strategy: Don't look at the Neandertal sequence to see what humans share, look at human variation to see which version the Neandertal sequence has.

    The strategy is smart because it helps to obviate some major problems with ancient DNA -- you don't have all the parts, and the parts you do have probably contain a lot of sequencing errors of various kinds. By looking first at sites that vary within humans (or, in some comparisons, between humans and chimpanzees), we can focus on a very simple question -- did the Neandertal have one version, or the other?

    Applied to human variation today, there are several ways we might use a Neandertal genome to test the hypothesis of no interbreeding. Green and colleagues focus on two complementary approaches.

    1. If Neandertals contributed no genes to living populations, then they should be equally related to all living people, no matter where in the world those people live.

    Green and colleagues show that the Neandertal genome is closer to some humans than others. People whose ancestry lies outside Africa are significantly more like Neandertals than are people who live in Africa today. In this study, the authors include whole genomes from people in France, China and Papua New Guinea outside Africa, and Yoruba and San inside Africa. The Africans are not as close to the Neandertal as any of the non-Africans.

    That doesn't mean that non-Africans derive most of their genes from Neandertals -- in fact, as I describe below, the proportion is quite small. Living people are more like each other -- even non-Africans and Africans -- than any of them are like Neandertals.

    The point is that despite this great similarity of living people, we have genetic variants that we share with the Neandertal genome, and that proportion is a lot higher outside Africa than inside it. The natural conclusion is that Neandertals contributed more genes to non-Africans than to Africans.

    One thing is for sure: You can't explain this observation under the hypothesis that a small, African population expanded out of Africa without interbreeding with Neandertals along the way.

    2. Look at the genes most likely to represent ancient population structure, the ones with deep roots outside Africa.

    This is an idea that we came up with to look for genes in living humans that might have come in from Neandertals or other ancient populations (for example, we described it in our 2008 review). Look for the parts of the genome with the deepest genealogical roots outside of Africa. Those are candidates for Neandertal gene flow -- a high chance that one of the two sides of that deep root was present outside of Africa for hundreds of thousands of years.

    Green and colleagues took this idea to the next level. They found parts of the genome where non-Africans have a deep root and Africans don't. Then they looked at the Neandertal sequence. Out of the 12 regions they identified with deep roots outside Africa, they found that the Neandertals had the deep, non-African specific version in 10 of those.

    I mean, there's really not any other way you can explain this. We got those genes from Neandertals. Every one of those loci is a region where some people have a Neandertal-derived allele, and others don't. Those particular 10 loci are a small fraction of the overall Neandertal-derived element of our heritage -- because they used Perlegen SNPs to find them, they ended up with regions that are fairly long (100 kb or more in length). Those are probably all really interesting, but there will be more of them when we can reliably identify smaller segments with deep genealogies.
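    The first of the two tests above boils down to counting. Here is a minimal sketch with made-up toy data: at sites where two present-day genomes differ, count how often the Neandertal carries the derived (non-chimpanzee) allele along with one person or the other. With no gene flow, the two counts should be about equal. The normalized difference is often called an ABBA-BABA or D statistic; the function name, the 0/1 coding, and the example sites are illustrative assumptions, not the authors' pipeline.

```python
from typing import Iterable, Tuple

# One site = (human1, human2, neandertal).
# 0 = ancestral (chimpanzee-like) allele, 1 = derived allele.
Site = Tuple[int, int, int]


def d_statistic(sites: Iterable[Site]) -> float:
    """(ABBA - BABA) / (ABBA + BABA); positive values mean the Neandertal
    shares derived alleles with human2 more often than with human1."""
    abba = 0  # human2 and Neandertal derived, human1 ancestral
    baba = 0  # human1 and Neandertal derived, human2 ancestral
    for human1, human2, neandertal in sites:
        # Only sites where the humans differ and the Neandertal is derived
        # say anything about which human the Neandertal is closer to.
        if neandertal != 1 or human1 == human2:
            continue
        if human2 == 1:
            abba += 1
        else:
            baba += 1
    informative = abba + baba
    if informative == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / informative


# Toy data only: human1 stands in for an African genome,
# human2 for a non-African genome.
toy_sites = (
    [(0, 1, 1)] * 55    # Neandertal matches the non-African
    + [(1, 0, 1)] * 45  # Neandertal matches the African
    + [(1, 1, 1)] * 20  # uninformative: both humans derived
    + [(0, 1, 0)] * 30  # uninformative: Neandertal ancestral
)
print(d_statistic(toy_sites))  # 0.1
```

    A value near zero is what the no-interbreeding hypothesis predicts; a consistent excess on the non-African side is the pattern described above.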

    Could the results have been caused by contamination?

    Green and colleagues are utterly convincing about the level of contamination in their sequence. They have employed several independent checks, all of which arrive at the same conclusion: The modern human contamination in almost all their comparisons is limited to significantly less than one percent -- and for autosomal sequence they can give a tight estimate of 0.7 percent contaminating sequence.

    The methods that Green and colleagues used to test for a Neandertal contribution to non-African populations are not likely to be strongly influenced by contamination. The probe for deep roots in particular is extremely unlikely to be influenced by contamination in the Neandertal sequence.

    The very low contamination rate, and methods that should be robust to some contamination, mean that we can be very confident in their result.

    How much Neandertal ancestry do we have?

    The Neandertal contribution does not make up a major proportion of any population, even outside of Africa. Green and colleagues apply a population model that involves isolation between ancestral Neandertal and African populations, a dispersal from Africa into Eurasia, and subsequent mixture with the Neandertals. Under this model, the estimated fraction of Neandertal ancestry for non-African populations today is between 1 and 4 percent.

    Now, let's put on our skeptics' hats. Is this the right model?

    If Neandertal and African populations had not been isolated, then the amount of mixture after an out-of-Africa dispersal would be lower. On the other hand, the dispersing African population would already be part Neandertal, because of genetic mixture. The proportion of ancestry from ancestral Neandertals would be around the same amount; it would just be distributed across a longer time.

    They did not examine the question of how much of the genome came in from Neandertals because of selection. The estimate they have, between 1 and 4 percent, is so high that this is not just a few genes introgressing in from Neandertals -- it is a big fraction of the neutral, non-coding part of the genome. So selection doesn't explain the similarity, nor can parallelism -- the similarity is genome-wide, not just coding or functional changes, and not as far as we know clustered into regions that might have hitchhiked with adaptive alleles.

    But there's clearly a lot more to do, characterizing the functional implications of some regions, testing for selection, and finding Neandertal variants that might have reached very high frequencies in later populations. To the extent that selection has influenced the pattern, it will also throw off the simple population model. But it doesn't throw off the fraction of Neandertal ancestry -- if it's three percent, it doesn't matter whether it was selected or neutral, it's still three percent.

    So the bottom line is, the fraction is going to be about right, regardless of the mechanism by which the genetic mixture happened.

    Can we please take off our skeptics' hats? They're getting in the way of my Neandertal victory dance.

    No. All the cool paleoanthropologists wear hats.

    What about population structure within Africa? Could that explain the apparent Neandertal contribution?

    We've known about the occasional deep-rooted genealogies outside Africa for a long time (and Jeff Wall's work, as an example among others, has explained that pattern as archaic human mixture into non-Africans). Those researchers have been talking about something like five percent of the human genome coming from admixture with ancient groups outside of Africa. So this shouldn't come as a shock.

    Until now, though, it has been possible for some people to wave these results away. We didn't really know that any of those deep roots were in archaic humans, and after all, who's to say that they aren't variants that originated in Africa and have since been lost there, or that we haven't found them yet? African variation is great, and if you imagine that some variation might have once existed in northeastern Africa and was subsequently lost within African populations, that might look like admixture with archaic humans outside of Africa.

    This line of argument is now special pleading. Why would we posit a cryptic mystery population in Africa, which happens to look genetically identical to Neandertals, but has subsequently disappeared? A big fraction of deep genealogies outside Africa really are in Neandertals. By far the simplest explanation is that today's non-Africans got them from ancient non-Africans. This is no surprise -- that's where the data have been pointing now for five years.

    Yet Africans are a lot more diverse than other populations, and this diversity itself does reflect the dynamics of the ancient African population. The Neandertals aren't so different from that pattern that now still exists within Africa -- they're extending the notion that "modern" is something that's been evolving for a long time. I expect we'll be able to come to a better understanding of ancient population interactions within Africa, by understanding the parts of the genome that have come from Neandertals outside of Africa.

    Could the gene flow be due to ancient interactions between West Asia and Africa?

    Green and colleagues suggest that at most a few genes from modern humans ended up in Neandertals.

    That is, although they find lots of evidence of old-looking genes in us that are shared with the Neandertal genome, they find few cases of new-looking genes in us that are shared with that genome.

    That might suggest several things about interactions between Africa and West Asia and Europe during the Middle to Late Pleistocene. For example, if there had been high gene flow from Africa into West Asia after the first appearance of a distinct Neandertal population, maybe 200,000 to 400,000 years ago, we might expect to find some new-looking genes in humans that Neandertals also got.

    On the other hand, the data are from European Neandertals, who are at the end of a fairly long chain of populations from Northeast Africa. If gene flow had been ongoing into the Levant or further into West Asia during the last 200,000 years, it's not obvious how many of these genes would have made it into Europe. The rapid mitochondrial DNA coalescence of Neandertals does suggest substantial mobility in the population across Central Asia to Western Europe. But maybe that apparent dynamism had a boost from mtDNA selection.

    So just on the data, I don't think we know yet whether this is gene flow in the Levant 200,000 or 100,000 years ago, or whether it's genes coming from West Asian Neandertals into dispersing Africans after 100,000 years ago. I expect all are likely. I have some ideas how to test some of these things, and we will get started immediately.

    The lack of apparent mixture of "modern" genes into Neandertals -- what does it mean?

    It means that a model of one-way gene flow from Neandertals into us can explain the pattern of genetic similarity.

    The authors explain this as a function of population expansion. The expanding population (us) picks up some Neandertal genes that expand in numbers, while the contracting population (Neandertals) doesn't have a chance to pick up as many genes because it is declining in numbers. That model seems plausible, particularly in comparison with historical cases of population contact.

    On the other hand, the three Neandertals from which most of the genome sequence was derived all date to before 40,000 years ago. There weren't any modern humans around for them to have interacted with around Vindija at that time. So should we be surprised that they don't have genes of modern humans?

    A more interesting question was posed to me by a very sharp journalist: What would we expect the result to have been if they had sequenced a Near Eastern Neandertal, like Amud, for example?

    The answer seems obvious -- the admixture fraction should have been higher. That population, which is the most likely to have been the source of mixture, must have been somewhat genetically different from the European Neandertals. Any extent of genetic differentiation between them would make the European Neandertals look less like non-Africans today than the Near Eastern ones.

    I'll have more to say about these Near Eastern Neandertals in the next few days.

    But wait a minute. I thought the mitochondrial DNA proved that Neandertals are extinct!

    Selection. Selection. Selection.

    I've been saying it for years. I've published it. Will you learn to listen to me, already?

    The mtDNA of Neandertals is gone because it conferred some disadvantage. There are many reasons to suspect this -- the Neandertal variation is itself apparently recently derived; the human variation is clearly in disequilibrium, especially outside Africa; the mtDNA genes affect functions that differ greatly in Neandertal and recent populations, including energetics, longevity, and brain; there are clear signs of mtDNA selection in many recent human populations.

    Mitochondrial DNA is useful for a lot of reasons, but nobody should ever have relied on it alone as evidence of Neandertal population dynamics.

    Is it really true that there is no variation in Neandertal ancestry outside Africa?

    The comparisons in the paper are highly convincing because of the sheer amount of sequence taken from the sampled individuals. A single gene locus from an individual may be unrepresentative of the person's population, but averaged across the whole genome, the difference between two people from distant populations is very, very close to the difference between the two populations.

    But they sampled very few individuals. So we are left with a question -- do we really know we've sampled variation outside Africa enough to make regional estimates of Neandertal gene flow?

    I think we could do better with more genomes. For example, when it comes to finding deep genealogies, we need to be able to find shorter regions than the ones used by Green and colleagues. That will expand the sample of candidate loci, and will catch some Neandertal-derived genes that we're missing now. Moreover, if gene flow was really around 1-4 percent, many SNPs that came in from Neandertals will be rare enough to be missing from the big SNP genotyping samples. We may find some variants with whole-genome sequencing on larger samples that will be worth examining.

    But most important, we'll be able to develop strategies based on this success to find ancient population structure involving groups where we don't yet have the DNA -- like populations of South and East Asia. Some of those may give us the chance to test those methods soon, as for the Denisova individual.

    Is this multiregional evolution, or just out-of-Africa with some leakage of earlier Eurasian genes?

    Out-of-Africa movement was a major mechanism of recent human evolution. The genetic ancestry of living people is multiregional.

    I see no contradiction between those statements. From now on, we are all multiregionalists trying to explain the out-of-Africa pattern.

    There was clearly a dispersal of African genes into the rest of the world during the Late Pleistocene, sometime between 50,000 and 100,000 years ago. Living people everywhere on Earth derive more than 90 percent of their genes from African populations who lived 100,000 years ago. That much is plain.

    (Why did I not write "more than 96 percent"? See below.)

    These genetic observations require some kind of out-of-Africa event. This event was not limited to a few genes, and selection of a few genes even with substantial hitchhiking of surrounding genome cannot account for the pattern. There must have been some kind of demographic expansion including African-derived populations and preferentially excluding the genes of Eurasian populations like the Neandertals. Selection on a gene network might have mediated the expansion, as suggested by Eswaran (2002). Or the expansion might have been culturally or technologically mediated, as many other people have suggested.

    Those are hypotheses about mechanisms. How did it come to be that living people trace the overwhelming majority of their ancestry to Africa within the last 100,000 years? These explanations may answer that question.

    The present study shows that Neandertals were at a minimum partially isolated from their contemporaries in Africa, and that the genetic divergence between those populations was larger than the genetic differences between European, Asian, and African populations today.

    Yet those Neandertals are among our ancestors. Late Pleistocene humans had multiregional origins, and the evolution of the Neandertals was itself a case of relatively recent population dispersal from Africa or West Asia. Human and Neandertal genes mostly derive from common genetic ancestors between 400,000 and a million years ago -- much, much later than the initial habitation of Eurasia 1.8 million years ago.

    But 1-4 percent is so minor, can it be an important part of our evolution?

    There are three things you have to ask about the fraction of Neandertal ancestry.

    1. How much gene flow would it take to guarantee that anything adaptive in the Neandertal population survived into later people?

    The answer to that question is simple -- it takes a few dozen matings to get most adaptive genes into our population. If there was a lot of interference with the genetic background, it might take more -- just to make sure that the advantageous alleles had a chance to be de-linked from the genetic background.

    If Neandertals are one percent of the ancestry of non-Africans, we can be very sure that any gene in a Neandertal that had adaptive value in the later population is here now. That means they were important in an evolutionary sense.

    2. What fraction of the human population 50,000 years ago were Neandertals?

    This is very important -- when it comes to neutral genetic loci, the essential question is how much the Neandertals may be underrepresented today relative to their numbers in the past. Is three percent too low? It seems very unlikely that the fraction of Neandertals compared to the rest of humans was as high as 10 percent -- we know that Africa already had a large population 50,000 years ago, and everything we know about Neandertals suggests a very low population density, an effective size much smaller than 10,000 individuals. Were five percent of the people on Earth 50,000 years ago Neandertals?

    We don't really know the answers, but now we have a chance to test hypotheses about ancient population size and expansion in Neandertals. My point at the moment is only this: If today Neandertal genes make up only one percent of the gene pool of the 5 billion people outside Africa, that's the genetic equivalent of 50 million Neandertals.

    In relative terms, their contribution to our population may be a reduction from their fraction of the Late Pleistocene population. Not that great a reduction, not a massive crash to zero. A reduction in the wake of the out-of-Africa movement, possibly from five percent to three.

    3. How many Neandertals are living today?

    You might think the answer to this is obviously zero. But in genetic terms, we can ask, how many times has the average Neandertal-derived gene been replicated in our present gene pool? Those aren't Neandertal individuals -- that is, a forensic anthropologist wouldn't classify them as Neandertals. They're the genetic equivalent.

    The answer to this is also simple: In absolute terms, the Neandertals are here around us, yawping from the rooftops.

    There are more than five billion people living outside of Africa today. If they are one percent Neandertal, that's the genetic equivalent of fifty million Neandertals walking the Earth around us.

    Does that sound minor? If I told you that your average gene would be replicated into fifty million copies in the future, would you be satisfied? Maybe your ambition is greater, but I think the Neandertals have done very well for themselves.
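    Two of the back-of-envelope claims in the three questions above can be checked in a few lines. This is a sketch under assumptions of mine: the "few dozen matings" calculation uses Haldane's classic approximation that a single new copy of an allele with selective advantage s escapes loss with probability of about 2s, treats each introduced copy as independent, and picks s = 0.05 purely for illustration. The population size and ancestry fraction are the ones used in the text.

```python
def prob_at_least_one_copy_survives(copies_introduced: int, s: float) -> float:
    """Chance that at least one of several independently introduced copies
    of an advantageous allele escapes loss (Haldane's ~2s approximation)."""
    per_copy_survival = min(1.0, 2.0 * s)
    return 1.0 - (1.0 - per_copy_survival) ** copies_introduced


def genetic_equivalents(population: float, ancestry_fraction: float) -> float:
    """Whole-genome equivalents represented by an ancestry fraction."""
    return population * ancestry_fraction


ILLUSTRATIVE_S = 0.05  # assumption: 5 percent advantage, not from the post

# A dozen, three dozen, and five dozen introduced copies.
for copies in (12, 36, 60):
    chance = prob_at_least_one_copy_survives(copies, ILLUSTRATIVE_S)
    print(copies, round(chance, 3))

# Five billion people outside Africa at one percent Neandertal ancestry.
print(f"{genetic_equivalents(5e9, 0.01):,.0f}")  # 50,000,000
```

    With this illustrative advantage, three dozen introduced copies already give a survival chance close to 98 percent; weaker selection would need proportionally more copies.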

    Does this mean that Neandertals belong in our species, Homo sapiens?

    Yes.

    Interbreeding with fertile offspring in nature. That's the biological species concept.

    Now, some paleontologists might still disagree -- maintaining that species are units that can be distinguished morphologically, or by one or more derived features, or any number of other definitions. That's fine with me, as long as they're clear. But understand: It does define all non-Africans today as an interspecific hybrid population.

    So maybe they want to rethink that one?

    If Eurasians got less than 4 percent from Neandertals, doesn't that mean that they got more than 96 percent from Africa?

    I look at the 1-4 percent estimate as a minimum, for several reasons. As I'll note below, this estimate mainly refers to the excess Neandertal ancestry outside Africa, which means there may be some additional amount that both recent African and non-African populations share.

    But more important, Neandertals weren't the only people living in Eurasia 100,000 years ago. China didn't have Neandertals, nor did Southeast Asia and Java. India was full of hominins, which might or might not have shared substantial genetic similarity with Neandertals. They're close enough to the known Neandertal range to speculate that they may have been close, but the only available fossil, the Middle Pleistocene Narmada skull, is not very informative. Any of these populations might have been genetically different from Neandertals, and might have also contributed genes to present-day human populations -- genes that wouldn't show up by scanning the Neandertal genome.

    The recent genetic sequencing of the Denisova pinky (a.k.a. the X-woman) from the Altai Mountains reminds us that these populations outside of Africa may have been quite a bit closer to us, genetically, than we might have expected from the 1.8-million-year record of humans outside Africa. These populations were dynamic in ways that many paleoanthropologists haven't yet appreciated.

    Do living Africans have Neandertal ancestry, too?

    I think that the present study doesn't have the power to answer this question, at least with the design that the authors used. The fact that living Africans are less genetically similar to the Neandertals is extremely important evidence of the Neandertals' genetic contribution to populations outside Africa. But it doesn't bear on how much back-migration into Africa may have happened.

    We know that the answer is nonzero, because Africa has received immigrants from other parts of the world during historic times. The same genetic patterns that reflect population contacts up and down the East African coast, and across the Sahara into West Africa, show the possible conduits for the flow of Neandertal-derived genes into African populations.

    But how much genetic dispersal into Africa happened in Later Stone Age (LSA) or late Middle Stone Age (MSA) times? Mitochondrial and Y chromosome distributions in Northeast Africa suggest there has been some. Nevertheless, Africa would have been a very difficult place to return to, for humans who had begun adapting to different ecological and disease environments.

    I think that some Neandertal genes might have made it back into Africa, even in ancient times, but I wouldn't be surprised if that number was small.

    The big shoe left to drop is the extent of population differentiation within Africa during MSA times. So far we've seen hints that these populations might have been nearly as differentiated from each other as they were from Neandertals, with substantial gene flow homogenizing them in the last 30,000 years. This paper includes an additional Bushman genome, after the four published earlier this year. Comparing that new genome to the Neandertals, its modal difference from the human reference (Hg18) genome is between the other humans and the Neandertal. Not quite halfway between, but nearly so. There's a lot of genomic variation within Africa, and exploring the population history that explains that variation may turn up some surprises.

    What about recent selection?

    One of the really exciting aspects of this work is that both Green and colleagues and Burbano and colleagues look for things that all humans today share but Neandertals lack.

    You might call these "the genes that make us modern," although functionally we have little idea what any of them do.

    Both papers show one thing that is extremely interesting: There aren't very many such genetic changes.

    Burbano and colleagues put together a microarray including all the amino acid changes inferred to have happened on the human lineage. They used this to genotype the Neandertal DNA, and show that out of more than 10,000 amino acid changes that happened in human evolution, only 88 of them are shared by humans today but not present in the Neandertals.

    That's amazingly few.

    Green and colleagues did a similar exercise, except they went looking for "selective sweeps" in the ancestors of today's humans. These are regions of the genome that have an unusually low amount of incomplete lineage sorting with Neandertals, and therefore represent shallow genealogies for all living people. They identify 212 regions that seem to be new selected genes present in humans and not in Neandertals. This number is probably fairly close to the real number of selected changes in the ancestry of modern humans, because it includes non-coding changes that might have been selected.

    Again, that's really a small number. We have roughly 200,000-300,000 years for these to have occurred on the human lineage -- after the inferred population divergence with Neandertals, but early enough that one of these selected genes could reach fixation in the expanding and dispersing human population. That makes roughly one selected substitution per 1000 years.
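
    The arithmetic behind that rate can be sketched in a few lines. This is only a back-of-envelope check, assuming each of the 212 regions counts as exactly one selected substitution and that the 200,000-300,000 year window is right; the function and variable names are illustrative, not from either paper.

```python
# Back-of-envelope rate of selected substitutions on the human lineage.
# Assumption: each of the 212 candidate sweep regions is one selected event.
SELECTED_REGIONS = 212
TIME_WINDOW_YEARS = (200_000, 300_000)  # rough window quoted in the text


def years_per_substitution(n_events: int, span_years: int) -> float:
    """Average number of years between selected substitutions."""
    return span_years / n_events


low, high = (years_per_substitution(SELECTED_REGIONS, t) for t in TIME_WINDOW_YEARS)
print(f"one selected substitution every {low:.0f} to {high:.0f} years")
```

    Either end of the window lands on the order of one substitution per thousand years.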

    Which is more or less the rate that we infer by comparing humans and chimpanzees. What this means is simple: The origin of modern humans was nothing special, in adaptive terms. To the extent that we can see adaptive genetic changes, they happened at the basic long-term rate that they happened during the rest of our evolution.

    Now from my perspective, this means something even more interesting. In our earlier work, we inferred a recent acceleration of human evolution from living human populations. That is a measure of the number of new selected mutations that have arisen very recently, within the last 40,000 years. And most of those happened within the past 10,000 years.

    In that short time period, more than a couple thousand selected changes arose in the different human populations we surveyed. We demonstrated that this was a genuine acceleration, because it is much higher than the rate that could have occurred across human evolution, from the human-chimpanzee ancestor.

    What we now know is that this is a genuine acceleration compared to the evolution of modern humans, within the last couple hundred thousand years.

    Our recent evolution, after the dispersal of human populations across the world, was much faster than the evolution of Late Pleistocene populations. In adaptive terms, it is really true -- we're more different from early "modern" humans today, than they were from Neandertals. Possibly many times more different.
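
    To get a feel for the size of that acceleration, here is a rough comparison of the two rates. It is a sketch under loud assumptions: "a couple thousand" is read as 2,000, the older window is taken at a 250,000-year midpoint, and recent selected variants (which arose in different populations and mostly are not fixed) are treated as comparable to the 212 older sweep regions. The names are hypothetical.

```python
# Rough fold-change between recent and Late Pleistocene rates of selected change.
RECENT_EVENTS = 2_000         # assumption: "a couple thousand" read as 2,000
RECENT_SPAN_YEARS = 40_000    # window quoted for the recent acceleration
ANCIENT_EVENTS = 212          # sweep regions reported by Green and colleagues
ANCIENT_SPAN_YEARS = 250_000  # assumption: midpoint of 200,000-300,000 years


def rate_per_millennium(events: int, span_years: int) -> float:
    """Selected changes per 1,000 years."""
    return 1000 * events / span_years


recent_rate = rate_per_millennium(RECENT_EVENTS, RECENT_SPAN_YEARS)
ancient_rate = rate_per_millennium(ANCIENT_EVENTS, ANCIENT_SPAN_YEARS)
fold_change = recent_rate / ancient_rate
print(f"recent: {recent_rate:.1f} per 1000 years; ancient: {ancient_rate:.2f} per 1000 years")
print(f"fold change: about {fold_change:.0f}x")
```

    The exact multiple depends heavily on those assumptions, so it is the order of magnitude that matters here, not the specific number.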

    More?

    That's what I have time for now, if I want to get this posted. There is much, much more to say on the topic, and you can bet it will be all Neandertals all the time here for the foreseeable future.

    References:

    Green RE and many others. 2010. A draft sequence of the Neandertal genome. Science (in press) doi:10.1126/science.1188021

    Burbano HA and many others. 2010. Targeted investigation of the Neandertal genome by array-based sequence capture. Science (in press) doi:10.1126/science.1188046

  • Microbial extinction

    Thu, 2009-12-31 00:19 -- John Hawks

    Scientific American asks: "What happens when the microbes that keep us healthy disappear?"

    If we're starting a pool, I want to put my money on, "nothing much."

    The article is interesting -- it describes how people are trying to unravel the "microbial ecology" of the human microbiome, and how new sequencing technologies are rapidly accelerating the research.

    I think it goes a bit into science fiction by combining these with the "clean childhoods cause allergies" hypothesis, generating the idea that modern life and antibiotics may cause extinctions of gut flora. So far, there's little evidence that losing a microbial species has any negative health impact. The article mentions that widespread vaccination against Streptococcus pneumoniae may create an opening for more Staphylococcus aureus infections, but that's hardly an argument against vaccination -- it's an additional reason to find a vaccine for S. aureus.

    UPDATE (2009-12-31): A nice AP story reviews Norway's method of stamping out MRSA: Don't prescribe antibiotics:

    In Norway, MRSA has accounted for less than 1 percent of staph infections for years. That compares to 80 percent in Japan, the world leader in MRSA; 44 percent in Israel; and 38 percent in Greece.

  • The Neandertal genome FAQ, February 2009 edition

    Tue, 2009-02-17 16:38 -- John Hawks

    I was out of town last week when the Max-Planck Institute made its announcement about the completion of 1x coverage of the Neandertal genome. It was an exciting day for me. Already, I had scheduled a number of radio shows and a public lecture to commemorate Darwin Day. Several press interviews regarding the news of the Neandertal sequencing project added to the hectic nature of the day, so I didn't get a chance at the time to sit down and write my reactions.

    So, nearly a week later, I've finally caught up. I've answered many questions about the Neandertal genome before, so I'm focusing these on the current announcement.

    For answers to other kinds of questions, try these posts:

    And now, some new questions arising this week:

    Has the Neandertal genome now been reconstructed?

    No. This announcement is a milestone, not an endpoint.

    Much work remains to put together an entire genome sequence. The ongoing work represents a massive technical achievement, and is well worth celebrating. But we are not yet at the point where we can talk about structural variants in the Neandertal genome compared to humans, length polymorphisms, or a number of other things. Plus, as noted below, only 63 percent of the nucleotides have been sequenced once -- leaving a lot of basic sequencing left to get even a single pass over the whole genome.

    Some stories have used the term "decoded" -- that also would be a misstatement. We don't know the import of the variations that might so far have been found. That is, we cannot yet convert the information that Neandertal sequences provide to us about their genome into information about their phenotypes. Keeping that in mind, "decoding" the human genome is an ongoing process. With the Neandertals, we have barely begun.

    I heard that this was a whole Neandertal genome, but then the fine print says that it's only 60 percent completed. What gives?

    They set up an announcement when they knew they would be past sequencing 3 billion bases. And in fact they've reached 3.7 billion.

    That would be more than the whole genome, if they could pick out exactly which parts they are sequencing. But the shotgun sequencing approach they are using means that some parts of the genome are represented several times in their 3.7 billion bases, while others are not represented at all.

    It's sort of like painting your house. You could calculate how many gallons it would take for "full coverage" with a paintbrush, but if you shoot that many gallons out of a paint gun there are going to be a lot of gaps that didn't get painted.

    For the Neandertal sequence right now, the gaps add up to around 36 percent of the whole genome. Which is an awful lot of missing data.
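
    The paint-gun effect can be sketched with the standard Poisson approximation for shotgun coverage. This is a minimal sketch, not the project's own calculation: it assumes reads land uniformly at random across the genome, ignores read-length edge effects and any non-Neandertal reads, and treats the effective depth as a free parameter. The function names are illustrative.

```python
import math


def fraction_covered(depth: float) -> float:
    """Expected fraction of sites hit at least once at a given average depth.

    Poisson approximation: P(site never sampled) = exp(-depth).
    """
    return 1.0 - math.exp(-depth)


def depth_for_fraction(fraction: float) -> float:
    """Average depth needed to expect a given fraction of sites covered."""
    return -math.log(1.0 - fraction)


for depth in (0.5, 1.0, 2.0, 4.0):
    print(f"{depth:.1f}x average depth -> {fraction_covered(depth):.1%} of sites covered")
```

    Under this model, 1x average depth leaves a bit over a third of sites untouched, which is in line with the 63 percent figure; closing most of the remaining gaps takes several times more sequencing.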

    So why make an announcement now? I dunno. Darwin's birthday makes a good occasion? They could easily have published last year or the year before on many different genes, just as they published the whole mtDNA last year. It seems likely to me that they've been holding off announcing or publishing until they were sure they had worked out a solution to the contamination problems they were having.

    I think they deserve to pop some champagne bottles and celebrate. When there is a public data release, we can all celebrate!

    What about those contamination problems?

    If you've been around a while, you'll remember that I thought the initial report of contamination was a bit overblown. Nevertheless, the possibility of substantial contamination, documented by comparisons between sequencing methods, stopped almost all work on the publicly available data. It was a serious problem, and the research groups responded seriously to the presence of contamination in the samples. Few details of this response were made public, but clearly there was a concern that the longer fragments coming out of the 454 machine didn't originate in the Neandertal sample.

    According to the Max-Planck press release, they've taken a number of steps to eliminate contamination. I'll quote the relevant sections:

    One essential element developed by Pääbo’s group was the production of sequencing libraries under “clean-room” conditions to avoid contamination of experiments by human DNA. They also designed DNA sequence tags that carry unique identifiers and are attached to the ancient DNA molecules in the clean room. This makes it possible to avoid contamination from other sources of DNA during the sequencing procedure, which was a problem in the initial proof-of-principle experiments in 2006. They also used minute amounts of radioactively labeled DNA to identify and modify those steps in the sequencing procedure where losses occur. Together with other advances implemented during the project, these innovations drastically reduced the need for precious fossil material so that less than half a gram of bone was used to produce the draft sequence of 3 billion base pairs.

    In order to reliably compare the Neandertal DNA sequences to those of humans and chimpanzees, the Leipzig group has performed detailed studies of where chemical damage tends to occur in the ancient DNA and how it causes errors in the DNA sequences. The researchers found that such errors occur most frequently towards the ends of molecules and that the vast majority of them are due to a particular modification of one of the bases in the DNA that occurs over time in fossil remains. They then applied this knowledge to identify which of the DNA fragments from the fossils come from the Neandertal genome and which from microorganisms that have colonized the bones during the thousands of years they lay buried in the caves. They have also developed novel and more sensitive computer algorithms to put the Neandertal DNA fragments in order and compare them to the human genome.

    I'm satisfied that they've done everything possible to eliminate contaminants. The examination of the chain of events from extraction to the final sequence is especially important. In many ancient bones, the steps taken to sterilize and extract from deep within the bone somehow still don't eliminate contamination in the final sequence data. Most of that contamination must arise during the processing and sequencing steps, despite the oft-quoted "clean room" conditions in ancient DNA labs. So the methodological advances toward understanding the sources of contamination are very scientifically significant.

    There's a hint in some of the earlier press coverage that the pace of sequencing has vastly sped up in the last few months. For example, in December, Ewen Callaway reported that the genome was halfway done:

    Half the Neanderthal genome has been decoded and the rest should be sequenced by year's end, a scientist involved in the project told a human evolution conference last week.

    Researchers will roll out a rough draft of the Neanderthal nuclear genome after their sequencers have read every letter in the genome on average once - "1x coverage" in genomics speak.

    Callaway is a careful reporter, but we should keep in mind that the comments in the story might not quite have conveyed the full situation. Still, if we take that assessment at face value, we can speculate that the process of working out the contamination issue took a long time during which sequencing was relatively slow or paused. If they actually had only sequenced half the 3 billion bases by December, that's pretty fast work since then (a perception that was echoed in some press reports prior to the announcement).

    The switch to the Illumina platform seems like an underreported aspect of the story. The press release claims that a billion reads were done on the Solexa, compared to only 100 million from 454 -- that also suggests a switch later in the process, since we know that they were using 454 initially and through early 2008. The press release doesn't explain why they moved from the 454 machine to Illumina. Maybe it's just efficiency of the current platform, but there must be a story there.

    What was the most boring aspect of the announcement?

    I was talking to a reporter on Tuesday before the press conference, and I said,

    "They're no doubt going to give us a list of some genes, with well-known variations in living people, that they've genotyped in Neandertals. And, aside from FoxP2, which we already know about, and microcephalin, I don't know what those will be. I think it would be the most boring possible outcome if they told us that the lactase persistence allele wasn't there. Because there's no news there."

    Well, I gave a big belly laugh when I saw the press release. Gee, Neandertals didn't have lactase persistence. Big surprise there! What did they think, they were secretly milking goats?

    OK, I admit, that's overly snarky. I mean, what if they'd found the opposite? It would be contamination, of course. So finding the wild-type lactase allele is worth something.

    But it's sort of like if your friend was looking through a telescope on Christmas Eve and caught the first-known glimpse of Santa and his reindeer. And you asked her, "What does he look like?" And she says, "He's wearing a red coat!"

    It's like being trapped in a Laurel and Hardy routine. And I'm Hardy.

    Does the Neandertal genome show that they were "distinct from us"?

    Experts on Neandertal bone morphology can readily distinguish them from later Europeans, assuming that the correct parts of the skeleton have survived. So from that perspective, Neandertals were clearly a "distinct" population. They had a morphological configuration no longer found anywhere in the world, and not found in the Europeans who immediately followed them in Europe.

    On the other hand, the bones of early Upper Paleolithic Europeans share some interesting similarities with the Neandertals. You wouldn't call the Oase 1 cranium a Neandertal. It lacks nearly all of the features that set Neandertals apart. But it has a mandibular foramen shaped like a small horizontal oval -- like a bit over half of Neandertals, and nearly a quarter of early Upper Paleolithic mandibles. This is a very rare morphology today, and it is rare elsewhere in the human fossil record, although it has been found in the very early Homo erectus sample from Dmanisi. There are two hypotheses for why this feature and others should be most common in two populations living in the same place in adjacent time periods: descent or parallel evolution.

    Looking only at the morphology, we have only our personal limit of credulity to argue one way or the other. How many features does it take to be convinced that descent must explain some of the similarities? Sadly, the answer to this question is different for different researchers.

    I think that the most reasonable explanation for the morphology is gene flow between Neandertal and other populations. But I have to say that others disagree.

    Genetic evidence may be most useful because we are much more likely to agree on the score. A unique gene sequence is unlikely to arise twice in parallel, and in any event the probability of such parallelism can be calculated in real numbers, not shopworn guesses. With 3 billion base pairs to compare between our populations, we have a good chance of finding and quantifying even low levels of genetic exchanges.

    However, these conclusions still depend on assumptions and models that not all anthropologists agree about. At the moment, the state of the science is such that the meaningful distinction is not whether Neandertals and humans may have interbred, but instead whether such interbreeding was common enough to be evolutionarily important, or to establish Neandertals as a "distinct" population. Since "important" and "distinct" do not have quantifiable meanings in evolutionary theory, you can see that we have a long way to go before paleoanthropology agrees on testable models of Neandertal population history.

    I think the science will be lively for the next few years, as the focus goes away from details of morphological characters and toward details of evolutionary models. The morphology will still remain important -- particularly as the observable evidence of variation within ancient populations. It will take many years before we have a good picture of genetic variability within these samples. But questions of "distinctness," which depend on shared characters and levels of interbreeding, must be answered at the level of models, not features.

    What about microcephalin?

    According to the press conference, the human-derived allele of MCPH1 was not found in the Vindija sequence. Bruce Lahn and colleagues had suggested that this allele might have come into the recent human population from Neandertals, based on its present pattern of variability. This allele is quite divergent from the rest of human variation at the locus, it is common outside of Africa but rare inside of Africa, and it appears to have been under positive natural selection for around 30,000 years. I have an FAQ on MCPH1 and introgression, and I've published on the topic. If the human-derived allele is not in the Neandertal genome, that obviously weakens the argument for introgression of this gene from Neandertals.

    We have interpreted this gene cautiously from the beginning. Neandertals are one likely source for such introgression, but not the only one. In my FAQ, I wrote this:

    Well, the D haplogroup [of MCPH1] is common in many areas outside of Africa in addition to Europe. So it isn't possible to really specify in what archaic population it may have originated. There is some chance that it may be found in the Neandertal genome sequence, when that becomes available. In fact, that would be the ultimate test for many candidate introgressive alleles.

    But there is a good chance that it won't be found in the Neandertal sequence. After all, Neandertals were probably pretty thin on the ground -- especially in Europe. A sampling of their genes would be sort of unlikely to yield a high proportion of archaic alleles that may have survived to the present day. So there is hope that we will find and document such alleles, but the best evidence for many of them may remain their current pattern of variation in living people.

    I think those points are important. There were not many Neandertals, and it may be much more likely for present-day humans to have genetic variation that originated in South or West Asia, or even multiple regions of Africa (a hypothesis suggested for some other gene loci).

    But I still think it very likely that out of the 20,000 genes in the human genome, some will have derived variants that were also present in the Neandertal genome. Human evolution over the last 50,000 or more years was driven by new variation, and multiple human populations would have been one of the largest potential reservoirs of adaptive variation for selection to work upon.

    What is the most important aspect of this announcement?

    Paleoanthropology is a science that generates huge public interest. But it gives very few chances for public participation. Those of us who are close to paleoanthropology know how much our science is driven by good ideas from many other fields. The pathways by which those insights enter our science tend to be highly constrained -- radiocarbon dating, scanning electron microscopy, isotopic analysis of enamel, and now genetics have all been brought into paleoanthropology by extremely skilled scientists from outside the field. I think that the Neandertal genome has the potential of breaking new ground.

    One year from now, there will be high school students working with sequences from the Neandertal genome. Who knows what they will discover?

    I just think that is tremendously exciting. For the first time, the primary data of paleoanthropology will be available to everyone.

  • Neandertal genome in one week's time?

    Wed, 2009-02-04 15:02 -- John Hawks

    Rex Dalton reports that Svante Pääbo's presentation at the AAAS meetings next week will have a little surprise:

    Project leader Svante Pääbo will announce the results of the preliminary genomic analysis at the American Association for the Advancement of Science annual meeting in Chicago, Illinois, which starts on 12 February.

    "We are working like crazy at the moment," says Pääbo, adding that his Max Planck colleague, computational biologist Richard Green, is coordinating the analysis of the genome's 3 billion base pairs.

    ...

    Pääbo says that his group will publish a first draft of the entire Neanderthal genome later this year, as a single read of all base pairs. However, some published human genomes had all their base pairs read eight to ten times before publication. The team says that its single-read of the Neanderthal genome is sufficient for publication because the technique used does not rely on the same DNA reassembly process used in conventional 'shotgun' sequencing.

    Three billion base pairs. Perhaps 4000 amino acid substitutions between them and us, and an unknown number of regulatory changes. There are likely to be some surprising similarities, as well as many surprising differences.

    We're going to have plenty of work.

    Anyway, the story doesn't specify what will be new about next week's announcement. I'm sure there will be some surprises.

  • Mailbag: Resurrect extinct species?

    Mon, 2008-12-22 10:35 -- John Hawks

    In today's mail, this question:

    Stupid question that I wish you would address: Are the tissue samples left from recently extinct species such as the Auroch, passenger pigeon, moa, dodo etc etc of sufficient quality to use it to resurrect the species? I would much rather see an Auroch than a pet cat cloned. Of course a wooly mammoth or Neanderthal would be even more interesting but also more problematic.

    My reply:

    It seems that those pursuing the idea of such resurrection are more interested in constructing artificial chromosomes. Once the technology is sufficient to do that, all you need is a genome sequence of the extinct organism and a suitable (closely related) host species to carry the pregnancy—of course with the attendant possible problems of immunocompatibility, etc.

    So, the barrier now is not the amount of tissue or the availability of genomic data, both of which seem to be sufficient for any recently extinct organism.

    I also mentioned the topic last month, after the NY Times carried an article about mammoth cloning. The idea raised there by George Church (which he thought would "alarm a minimal number of people") was constructing a Neandertal genome from a chimpanzee prototype. Is he imagining that people aren't ooked out by a Neandertal baby C-sectioned from a female chimpanzee?

    OK, so I'm ooked out. Meanwhile, I think you're going to want to construct a diploid genome, not two identical ones, because there are going to be some recessive lethals in there. So it takes more knowledge of variation than a single genome, and ideally quite a bit more. That's a limit too.

  • Complete Neandertal mitochondrial sequence, and selection on human (not Neandertal) mtDNA

    Sat, 2008-08-09 17:39 -- John Hawks

    In the current Cell, the Max-Planck group, in coordination with 454 Life Sciences, report the sequence of a complete Neandertal mtDNA. I'm out of town right now, so I'm writing fairly quickly, and I haven't seen any of the reporting. Keeping that in mind, I wanted to set out a few of the interesting things about the paper.

    I've been waiting a long time for this sequence to come out. I know they've had the basic data for a long time: because the mtDNA copy number is very high, the 454 process kicks out a lot of mitochondrial sequence. The reward for the wait is that Green and colleagues have done a very careful job of comparative analysis, with some very interesting results.

    If I leave something obvious out, please forgive me, since I'm just dashing this off as quickly as I can.

    Where we left off...

    All previously reported sequences of Neandertal mtDNA have been fragments of the control region. The control region of the mtDNA (hypervariable regions I and II) is very helpful for working out phylogenetic relations among recent humans. True to its name, it varies a lot, and its high mutation rate allows a fine discrimination among lineages that have differentiated only within the recent past.

    The high mutation rate of the hypervariable regions also means that closely related populations have accumulated many differences. That's very convenient for identifying Neandertal mtDNA, where only small fragments (up until recently) have been practical to obtain. A small fragment of the mtDNA control region is sufficient to assess whether a specimen is like other known Neandertal sequences or not. Up to now, this has been an important way of authenticating Neandertal DNA sequence results --- although it has the obvious drawback that it might falsely exclude some genuine sequences that really do look like the modern human form.

    So far every Neandertal mtDNA sequence looks like a member of the same mtDNA clade. (More carefully, every specimen with good biological preservation that has produced DNA has yielded at least some mtDNA sequences that form a clade distinct from all recent humans. Others are presumed to be contamination -- which I have no reason to doubt.) No recent human -- out of the many thousands that have been sampled so far -- has produced a mtDNA control region sequence like any known Neandertal. The two populations, so far as we can tell, possessed distinct mtDNA clades.

    Divergence time

    A complete mtDNA sequence provides a lot of sites, which allows a more precise estimate of the divergence time between recent human and Neandertal mtDNA lineages. The paper reports this time as 660,000 years ago, with a confidence interval from 520,000 to 800,000 years ago. That range of dates substantially overlaps with the prior estimates of divergence time, and is a pretty good match to the initial estimate based on a single HVR1 sequence in 1997.

    The availability of a complete sequence has also removed a remaining piece of ambiguity from earlier comparisons. Because the hypervariable regions are so variable, it has always been the case that comparisons of hundreds or thousands of recent humans have included some pairs of individuals who are really divergent in their control region sequences. The result: some people living today are more different from each other than Neandertals are from recent people.

    Now, that particular fact is not meaningful in a cladistic sense. Neandertal sequences share derived mutations, as do recent humans. But the concept of a "range" of genetic divergence has confused comparisons. Comparing the control region alone, it may appear that Neandertals were not so very different from living humans, even if they have a few derived mutations that no longer exist. As long as some humans were also very different from each other, it remained possible that the tree had been wrongly reconstructed. An equally parsimonious tree (or even a more parsimonious one) might link the Neandertal clade with some modern human, even if not a recent European. When comparing humans to chimpanzees and more distantly related primates, the hypervariable regions are somewhat saturated with mutations, meaning that parallel mutations between different species are very common. This makes it even harder to reconstruct the tree of mtDNA relationships based on the hypervariable regions alone.

    Comparing the complete mtDNA genomes of a Neandertal and many recent humans presents a very different picture. Humans are all more similar to each other, when comparing the complete mtDNA genome, than any human is to a Neandertal. And in fact the Neandertal sequence is three or more times as different, on average, from us as we are from each other. This change from the earlier picture is a purely statistical one: more sites, with a more regular mutation rate. But it makes a clearer picture, and one that supports the phylogenetic model more clearly.

    Selection on COX2?

    Even though the control region is so helpful for analysis of recent humans, and easy identification of Neandertals, it's only a small fragment of the complete mtDNA. The mitochondrial genome is inherited as a single unit, so different mutations on a single mtDNA are co-inherited with each other. That means that the diversity of the noncoding control region is shaped by both genetic drift (due to demography) and selection. The selection includes purifying selection on coding sites across the entire mtDNA genome, and the possibility of positive selection on one or more ancient mutations.

    I believe that positive selection on mtDNA in ancient humans has a lot of indirect support (and I wrote as much here). To give a brief list:

    • Mitochondrial haplotypes in living humans correlate with functional variation in disease, longevity, and performance -- all areas that have undergone recent biological shifts in humans.
    • Some mtDNA haplotypes in humans appear to have been under recent positive selection, as indicated by their geographic distributions.
    • Some mtDNA haplotypes have vastly changed in frequencies within the past few thousand years, as evidenced by ancient DNA samples.
    • Nuclear genes involved in mitochondrial function have been under recent positive selection.
    • MtDNA from Neandertals is completely absent today, despite the other evidence for genetic survival of that population. This combination is very unlikely if mtDNA was neutral.

    So I think that positive selection is not only a reasonable hypothesis, it is extremely likely. But that is not to say that it has been demonstrated. Others might say that my final reason, that positive selection can explain the apparent contradiction between mtDNA and other data (such as skeletal comparisons and apparent nuclear introgression), is a case of wishful thinking. They might argue that all this other evidence of Neandertal-modern gene flow is an illusion, and not a problem to be explained.

    I don't think they're right, but in the spirit of honest advertising, that's what they think.

    It would be unreasonable for me to expect that a Neandertal mtDNA genome would provide strong evidence of positive selection on the human lineage. Finding such evidence would require repeated selected substitutions, probably within a single gene. Otherwise there would never be statistical evidence of positive selection. The available tests for positive selection in a two-genome (or in this case, two-clade) comparison are very weak.

    Only a single selected mutation would be sufficient to explain the complete replacement of Neandertal mtDNA by an advantageous modern human type. No test of selection is powerful enough to refute neutrality based on a single selected site in a comparison of two mtDNA genomes. And repeated selection on a single gene just doesn't seem as likely as one or a few instances of selection, potentially on many mtDNA coding regions.

    So imagine my surprise when, reading this paper, I discovered that the authors found repeated substitutions on a single mtDNA gene in the human lineage, and statistical evidence of positive selection!

    The gene is cytochrome oxidase subunit 2 (COX2). Using the chimpanzee mtDNA sequence as an outgroup, there were 18 human-specific and 20 Neandertal-specific nonsynonymous coding substitutions. Out of the 18 human-specific substitutions, 4 were in COX2. Only three synonymous substitutions occurred in humans for this gene (the synonymous-to-nonsynonymous ratio of 3:4 differs from the ratio for other mtDNA coding regions, 54:14). In contrast, Neandertals had no coding substitutions in COX2 -- every difference between Neandertal and human sequences in this gene is inferred to have occurred in ancient humans. These data are unlikely unless COX2 was recurrently selected in ancient humans.
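    To make the 3:4 versus 54:14 comparison concrete, here is a minimal sketch that treats the counts above as a 2x2 table and runs a one-sided Fisher's exact test. The choice of test is my assumption for illustration; it is not necessarily the statistic used in the paper, and the function and variable names are made up for this example.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns the probability of seeing a count of `a` or larger in the
    top-left cell when all row and column totals are held fixed.
    """
    row1 = a + b          # substitutions in the gene of interest
    col1 = a + c          # all nonsynonymous substitutions
    n = a + b + c + d     # all substitutions
    denom = comb(n, row1)
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / denom

# Human-lineage counts taken from the text above.
cox2_nonsyn, cox2_syn = 4, 3       # COX2
other_nonsyn, other_syn = 14, 54   # all other mtDNA coding regions

odds_ratio = (cox2_nonsyn * other_syn) / (cox2_syn * other_nonsyn)
p_value = fisher_one_sided(cox2_nonsyn, cox2_syn, other_nonsyn, other_syn)

print(f"odds ratio = {odds_ratio:.2f}")
print(f"one-sided Fisher p = {p_value:.4f}")
```

    With these counts the p-value lands close to the conventional 0.05 threshold, so the excess of nonsynonymous changes in COX2 is suggestive rather than overwhelming on this particular test.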

    More evidence will be necessary to establish positive selection. The paper includes multiple comparisons of different genes, so a significant result for this one is necessarily weakened by the multiple-comparisons correction.
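    As a rough illustration of how much a multiple-comparisons correction costs, the sketch below applies a Bonferroni adjustment across the 13 protein-coding genes of the human mitochondrial genome. I am assuming one test per gene and a 0.05 family-wise level; the paper's actual number of comparisons and correction method may differ, and the example p-values are hypothetical.

```python
# Bonferroni sketch: one test per mtDNA protein-coding gene (assumption).
N_GENES = 13          # protein-coding genes in human mtDNA
FAMILY_ALPHA = 0.05   # assumed family-wise error rate

per_test_alpha = FAMILY_ALPHA / N_GENES

def survives_bonferroni(p, threshold=per_test_alpha):
    """True if a raw p-value stays significant after the correction."""
    return p < threshold

print(f"per-test threshold = {per_test_alpha:.5f}")

# Hypothetical raw p-values, for illustration only.
for raw_p in (0.01, 0.001):
    print(raw_p, survives_bonferroni(raw_p))
```

    A result that looks comfortable at the usual 0.05 level can easily fail a per-gene threshold this strict, which is why a single significant gene needs corroborating evidence.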

    But in a very interesting part of the paper, the authors did a functional analysis of the human-specific changes in COX2. Functional analysis of coding sites has come a long way in the last few years. Last fall, we saw it applied to the Neandertal-specific mutation of the MC1R gene. It was the functional analysis that argued that the mutation likely resulted in a red hair phenotype. These functional analyses consider the position of a mutation within the protein sequence, the extent to which that part of the protein interacts with other proteins, and whether the coding changes are otherwise conserved in other species.

    Here is the paper's conclusion about COX2:

    Another interesting observation is that COX2 stands out among proteins encoded in the mitochondrial genome as having experienced four amino acid substitutions on the modern human mtDNA lineage. Further work is warranted to elucidate the functional consequences of these amino acid substitutions. However, all these substitutions are in regions of the protein that, based on the crystal structure, do not have any obvious function, and they are variable among primates. Hence, they may represent either minor adaptive advantages, perhaps of regulatory relevance, or have no significant functional consequences for mitochondrial function. Unless other evidence for their importance becomes available, we see no need to invoke positive selection to account for the evolution of COX2 on the human lineage (Green et al. 2008:423).

    To me, a very persuasive finding is that each of the four human-specific mutations of COX2 is also found in some other primate species. In other words, where humans differ from chimpanzees and Neandertals (and generally, gorillas and orangutans), humans are like baboons or macaques. The authors of the paper read this finding as evidence that the changes have little functional importance. But I see this as a suggestion that these substitutions are functionally salient. Different primates have different energetic and dietary constraints, and it should be no surprise if they exhibit functional convergences in mtDNA. Humans changed at four separate sites, within the last half-million years, to be similar to some cercopithecoids and different from most other hominoids. Neandertals exhibited no evolution in this gene. This makes sense under a hypothesis of mtDNA selection in accordance with functional requirements, which we have good reason to believe were different in humans and Neandertals.

    But as the authors say, we need more evidence about the function of these genes. I think the comparative evidence now supports the hypothesis of selection very strongly, and is consistent with the pattern of evidence from the nuclear genome and from the anatomy of early Upper Paleolithic Europeans.

    Contamination

    This paper advances our understanding of contamination within the Neandertal sequences. The authors acknowledge Wall and Kim's (2007) interpretation of a high contamination rate in the earlier reported nuclear genetic data off the 454 platform, and provide additional information to support a relatively high contamination rate:

    Contamination with extant human DNA is the other dominant source of erroneous Neandertal sequences. Given the high coverage and the fact that the best estimate of the contamination rate here is 0.5% (with an upper 95% confidence limit of 0.87%), we do not expect contamination to affect the mtDNA sequence assembly to any appreciable level. Under the assumption that the Neandertal mtDNA sequence is reliable, it is a useful tool for gauging contamination when sequencing the Neandertal nuclear genome. Previously, assays to determine contamination within Neandertal fossil extracts were limited to the HVRI, which carry few positions where extant humans differ from Neandertals. By contrast, the complete Neandertal mtDNA now offers 133 such positions. This enables a reliable estimation of mtDNA contamination by analyzing sequence reads from 454 libraries, rather than by PCR-based assays of the DNA extracts. For example, when we do this in a small preliminary data set initially published from this fossil (Green et al., 2006), 10 of 10 sequences are classified as Neandertal. However, in further unpublished sequencing runs from that library, 8 out of 75 diagnostic sequences derive from extant human mtDNA, suggesting a contamination rate of ~11% (CI = 4.7%–20%). This is in agreement with the suggestion (Wall and Kim, 2007) that contamination occurred in that experiment. That library was constructed outside our cleanroom facility and before the introduction of the Neandertal-specific key, which is crucial for the detection of contamination by other 454 libraries, and was therefore not used for the subsequent Neandertal genome sequencing project (Briggs et al., 2007). However, with the help of the mtDNA presented here, such levels of contamination are now easily detectable from 454 sequencing runs (Green et al. 2008:424).

    So the mtDNA from the same sequence library as the previously reported 1 Mb of Neandertal nuclear genome shows a high contamination rate. That's really disappointing, since it means we have no reliable Neandertal nuclear data to work with yet. We'll just have to wait.
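    The contamination estimate quoted above (8 human-type reads out of 75 diagnostic reads) is easy to reproduce. Here is a minimal sketch that computes the point estimate and an exact binomial (Clopper-Pearson) 95% interval by bisection. The quoted passage does not say how its interval was calculated, so the Clopper-Pearson method is my assumption, and the function names are invented for this example.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve_decreasing(f, target, iters=60):
    """Find p in (0, 1) with f(p) == target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid   # f still too large, so p must increase
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes in n."""
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

human_reads, diagnostic_reads = 8, 75   # counts from the quoted passage
rate = human_reads / diagnostic_reads
ci_low, ci_high = clopper_pearson(human_reads, diagnostic_reads)

print(f"contamination = {rate:.1%}, 95% CI = {ci_low:.1%} to {ci_high:.1%}")
```

    The same function works for any count of diagnostic reads, so it can be reused as more sequencing runs from a library become available.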

    OK, that's all I have time to post; more later...

    References:

    Green RE and 24 others. 2008. A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing. Cell 134:416-426. doi:10.1016/j.cell.2008.06.021

  • Neandertal, other ancient DNA review

    Wed, 2008-07-16 11:25 -- John Hawks

    Last week, a short article in Science by Rachel Mackelprang and Edward Rubin discussed some of the recent advances in ancient DNA extraction. Of most interest is the paragraph that discusses ways to probe for particular genes while avoiding some drawbacks of PCR amplification:

    Microarray-based hybridization, coupled with high-throughput sequencing of recovered DNA, has recently been used to capture thousands of targets in parallel from modern DNA samples. With these strategies, a DNA sample is directly applied to an array of specifically designed oligonucleotide probes immobilized on a chip. Complementary fragments hybridize to the probes while the remaining nonbound DNA is washed away. The hybridized DNA can then be eluted from the chip and sequenced, resulting in enrichment of targeted genomic regions (11). Alternatively, chip-synthesized oligonucleotide probes have been released from the chip and used to capture molecules in solution (12). A purely solution-based method, where sets of probes are designed against a reference genome and used as a bait to "hook" corresponding sequences from a DNA pool (13), has been used to recover specific regions of nuclear DNA from Neandertal and cave bear genomic sequence libraries (1). These various capture approaches hold promise for economically investigating the same sequence in multiple different samples as well as examining multiple independent molecules of an allele isolated from a single sample.

    The short review also mentions problems with contamination and some of the results that indicate contamination of the Neandertal sequences, which I've discussed before (Complete Neandertal DNA files).
