john hawks weblog

paleoanthropology, genetics and evolution

Africa

  • Oldowan hunting behaviors at Kanjera South

    Mon, 2013-04-29 16:28 -- John Hawks

    Joseph Ferraro and colleagues have done some neat analyses of the faunal remains from Kanjera South, Kenya [1]. Kanjera South is an archaeological assemblage of Oldowan artifacts and associated animal bones from around 2 million years ago. The site was once a plain next to a lake, and gradually built up clay and silt sediments over years and years of flooding and soil formation. Stone tools and bones stand out in the sediments, representing recurrent activities of ancient humans over a few hundreds or thousands of years. As a result, the site has a good statistical representation of fauna that were hunted by early humans, relatively early in the evolution of our genus.

    This is not the earliest site with evidence for meat acquisition by stone toolmakers. We know that people were butchering animals with stone tools around 2.6 million years ago. But the first really good evidence for hunting strategies is much more recent -- around 1.8 million years ago at Olduvai Gorge. There are actually very few Oldowan-era faunal assemblages large enough to study hunting behaviors. Kanjera South shows that the activities documented at Olduvai Gorge were happening a bit earlier, and the site helps to clarify the kind of context in which we might expect to find more evidence of hunting behavior.

    Hunting versus scavenging is the tiredest chestnut in anthropologists' Oldowan arsenal. Were early hunters really competent enough to bring down a duiker on their own? Or did they steal away pieces of half-eaten zebra carcasses when the lions took a break?

    In reality, there is no contradiction here. Undefended meat doesn't last a day in the open, whether on the plains or near waterholes. So scavenging meat from other carnivores usually means facing them down -- not a job for an incompetent killer. Meanwhile, present-day peoples who hunt and gather rely quite a lot on "power scavenging", or taking advantage of other carnivores' successes. The present value of a dead carcass is higher than that of a live animal, as long as it may still escape you. Whether the hunter has to predict prey behavior, or the scavenger has to predict competitors' behavior, both strategies require a depth of planning. So, when it comes to Oldowan-era sites, we should expect to see a mixture of hunted and scavenged remains.

    In that context, we can make some inferences about hominin hunting practices by assessing which kinds of animals they hunted, and which they scavenged. Looking at tooth mark and cutmark evidence is not a perfect way of sorting hunting and scavenging -- because both kinds of marks are rare on faunal elements in archaeological contexts. But sometimes those comparisons lead to clear results. For example, here is the chart showing the number of tooth-marked midshaft fragments from long bones at Kanjera South, in comparison to experimental bone assemblages:

    Figure 3 from Ferraro et al. 2013. Original caption: Tooth-marked mid-shaft fragments: results from experimental assemblages and excavations at KJS. Figure follows a published model [26]. Hominin-first assemblages refer to remains initially defleshed and demarrowed by hominins, then subsequently exposed to large-bodied carnivores (primarily hyenas). Carnivore-first assemblages refer to remains initially defleshed and/or demarrowed by large-bodied carnivores (primarily hyenas and/or lions). Data for body sizes 1–4 [21]. Modern data (with single standard deviations where available) derived from the literature [23]–[26], [56]–[58]. KJS frequencies are from Table 2 and Table S1. Multiple symbols for KJS indicate the results of multiple analysts. X’s indicate minimum and maximum estimates of damage (see Table S1). doi:10.1371/journal.pone.0062174.g003

    These are cool data. Carnivores who get to chew on bones for a while tend to leave the middle of them covered in tooth marks. If humans get access to the carcass early, they will strip off the meat from those midshafts, break them into bits, and otherwise prevent the taphonomic pathway to carnivore tooth marking. And in the graph we see that the Kanjera South faunal assemblage looks like cases where humans were the agents of defleshing and butchering.

    If humans had primary access to the carcasses, then the transport decisions of ancient hunters should have shaped the bone assemblage at Kanjera South. It is very common in analyses of the fauna from African Oldowan-era sites to divide the prey animals into three size classes -- small, medium and large. The majority of prey species were bovids, ranging from small antelopes to water buffalo, although most were in the small and medium size categories at Kanjera South. Ferraro and colleagues show that for medium-sized bovids, the hominins were taking two strategies. These bovids were too big to carry wholesale to a central place for sharing. So the hunters disarticulated the animals and carried back the legs, leaving the axial skeleton for the most part behind.

    Except for the heads:

    But why acquire, transport, and process an abundance of medium-sized heads? In living animals, these remains contain a wealth of fatty, calorie-packed, nutrient-rich tissues: a rare and valuable food resource in a grassland setting where alternate high-value foodstuffs (fruits, nuts, etc.) are often unavailable [2], [3], [29], [49], [52], [63], [76]–[78]. Medium-sized heads are also relatively dense and durable elements, and their internal contents are generally inaccessible to all but hyenas and tool-wielding hominins [63], [79], [80]. As a result, they are often seasonally-available as scavengable resources in East African grasslands [63], [76], [79]–[83]. Additionally, bone surface modification studies at KJS clearly demonstrate that hominins accessed internal head contents: several cranial vault and mandibular fragments bear evidence of percussion striae. Considered in sum, the presumed availability of these isolated remains across the landscape, the relative abundance of these remains in the KJS assemblages, and unambiguous material evidence that hominins exploited their contents on-site is most parsimoniously interpreted as reflecting very early archaeological evidence of a distinct hominin scavenging strategy – one that included a strong focus on acquiring and exploiting fatty, nutrient-rich, energy-dense within-head food resources (e.g., brain matter, mandibular nerve and marrow, etc.) [e.g., 24,63,76,82,84–86].

    This is John Speth's scenario for fat acquisition from lean animals. The brain is the last part of the body to become fat-depleted during times of stress. If hunters are energy-limited, further lean meat is not going to be valuable to them because protein takes energy to digest. What they need most is fat, and the most ready source of fat is the brain. Accumulation of head elements, whether from hunted or scavenged sources, is an effective behavioral strategy in those circumstances. It's one that we think Neandertals pursued at the end of winter in some parts of Europe, and a strategy followed by hunters in ethnographic and historic contexts as well.

    The paper's conclusion is well-framed as a summary of the overall value of evidence from Kanjera South.

    With regard to evolutionary ecology, the relative uniformity of hominin activities documented through the KJS sequence indicates an evolved foraging adaptation well-tuned to local ecological contexts. This point implies that hominin involvement with, and their presumed consumption of, animal remains had substantial fitness implications. In turn, sufficiently strong selective pressures are implicated as having favored the evolution of persistent hominin carnivory no later than 2.0 million years ago. This date is approximately 200,000–500,000 years earlier than previously documented [11], [20], [33], [45], and increases the known time depth of this adaptation within the hominin lineage (range of dates reflects varied interpretations of faunal materials from Olduvai [20]–[42]).

    This one was fun to read, because the data being built up at Kanjera South are really capable of testing hypotheses about hunting behavior in a way that only some of the Olduvai Gorge assemblages have done up to now. Putting the faunal exploitation together with the stone tool evidence, we see a really interesting picture. As I reported a few years ago ("Plant processing with early Oldowan tools"), Kanjera South is one of the locations where we have good evidence of plant exploitation of some kind by Oldowan peoples. The site has also provided evidence about stone material transport decisions and the planning depth of stone flaking ("Technological sophistication of the earliest toolmakers"). It is a good illustration of how deep knowledge of a single site, with teams returning to excavations over multiple seasons, can yield a richness of statistical information about hominin behavior.


    References

    Synopsis: 
    A faunal exploitation study finds clues about brain consumption and prey choices
  • Behavior of the first North African humans

    Thu, 2013-03-07 23:49 -- John Hawks

    Mohamed Sahnouni and colleagues describe the archaeology of El-Kherba, Algeria [1]. This locality is a paleontological exposure associated with the nearby Ain Hanech site, and Sahnouni and colleagues have excavated an Oldowan archaeological assemblage with large mammals such as hippos, rhinos and horses.

    Dated to 1.78 Ma, the El-Kherba cut marks and usewear traces represent the earliest North African evidence showing a clear causal link between Oldowan stone technology and processing of large animal carcasses for meat, broadening the geographic range of Plio-Pleistocene hominin subsistence activities to include the Mediterranean fringe. As was shown in the East African Plio-Pleistocene archaeofaunas, early hominins were foraging for large mammals in northern Africa by circa 1.8 Ma. The evidence from the modified bones at these sites indicates that early hominins were involved in evisceration, disarticulating and removing meat, and breaking bones of large mammals to extract marrow.

    It's a great site because it is the first to document human activity in North Africa. Australopithecines were present in Chad by 3.4 million years ago, and given their mobility and range it seems likely they would have been present to the north of the Sahara also. But none have ever yet been found. As it stands, humans were at Dmanisi by 1.78 million years ago and also in Java by that time. The extent of human migration outside of Africa makes it clear that the Mediterranean coast of Africa itself should have been well within their range.

    And yet, stone tools are known from Ethiopia from 2.6 million years ago, and nearly as old in Kenya. Did the earliest stone toolmakers range beyond the Rift Valley? So far there's no equivalently early evidence of tool manufacture in South Africa. And in North Africa, the earliest tool assemblage is at El-Kherba.

    It would sure be useful to uncover evidence of A. boisei or related robust australopithecines in the Ain Hanech area. In East and South Africa, early Homo lived alongside late robust australopithecines, sharing the same landscape. No robust australopithecine has ever been found outside East or South Africa, while Homo erectus spread across the Old World tropics and into the temperate zone. What kept robust australopithecines, otherwise seemingly adaptable, out of Eurasia? If they truly never lived near the Mediterranean coast, we would probably conclude that they weren't as tolerant of different habitats as we might have expected.

    The cutmark evidence described in the paper is fairly clear and comparable to that known from East Africa well before this date. The cutmarks on animal bones, including hippopotamus, along with a "meat polish" on some of the stone flakes, indicate that ancient humans had access to animal carcasses very shortly after the animals' death and were using stone flakes to process them. Again, basically like Oldowan evidence that has long been known from Olduvai Gorge and other sites. I would like to see a better comparison of where this assemblage fits compared to both large and small archaeological assemblages from Olduvai.

    The question of whether and to what extent early humans hunted large mammals involves a long debate that wouldn't fit well in this paper. Still, the evidence here adds to that literature. The ancient people who left these remains were relying upon large mammal acquisition within a broader hunted diet including smaller prey species. Together with sites from across Africa and Eurasia, this one shows that early humans maintained this diet pattern across a range of ecologies and geographies.


    References

    Synopsis: 
    Archaeological report from El-Kherba, Algeria, with implications for human occupation range
  • Quote: Craig Stanford on gorilla habitat threats

    Tue, 2013-01-22 11:02 -- John Hawks

    Primatologist Craig Stanford was interviewed about habitat threats to gorilla populations by a public radio station: "The Human Threat to Great Apes":

    Cell phones, like many other electronic devices, are built with capacitors, which require tantalum extracted from coltan. Eighty percent of the world’s coltan supply is mined in the Democratic Republic of Congo, in the heart of the remaining habitat of eastern lowland gorillas. With an increasing demand for electronics driving a worldwide hunger for coltan, miners in the DRC are polluting and consuming gorilla habitat while extracting the ore. Compounding the problem, miners hunt the apes for food. The situation is grim, and these gorilla populations will go extinct soon without a sustained effort to intervene.

    Cell phones aren't the most common devices with capacitors, but they certainly help to personalize the issue.

  • Recent evolution of coding variants

    Wed, 2012-12-05 01:00 -- John Hawks

    How did I get myself quoted in a story as the skeptic about recent human evolution? ("Human Evolution Enters an Exciting New Phase"). After all, I've been a huge advocate of the idea that recent human evolution was a lot faster and more interesting than anthropologists used to think ("Why human evolution accelerated").

    The story, by Brandon Keim, is a good account of a new paper in Nature by Wenqing Fu and colleagues, "Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants" [1]. It's a pretty cool study, which has identified protein-coding alleles in large samples of European-American and African-American individuals.

    Fu and colleagues compared all the coding variants they found in large samples of European-Americans and African-Americans, and discovered that the European-ancestry people have a higher fraction of rare coding variants. They propose that the rate of new coding variants entering and persisting within the population actually accelerated in the ancestral European population. Why would this happen? In their view, demography is the most likely explanation. As European populations expanded during the Neolithic and later time periods, the rate at which new mutations are lost by genetic drift began to decline. These new mutations have pooled up within the European population, giving them a glut of new changes to protein-coding sequences. Many of these mutations may be deleterious, just not bad enough for natural selection to have weeded them out in the growing ancient population.

    I think in large part this explanation is correct. In some ways it is incomplete.

    The effect of population history on our evolution was the theme of our 2007 paper on positive selection in recent humans [2]. We relied on exactly the same mathematical relations used in this new paper: More people means more different mutations entering the population. In our case, the increase in the total number of mutations meant that we could expect more potential adaptive mutations to be selected within a growing population. In this case, the increase in the total number of mutations means more mutations remain to be picked up by resequencing rare neutral or deleterious variations in present samples.
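
    The shared relation can be sketched numerically. In a diploid population of N individuals, with a mutation rate mu per site per generation, roughly 2 x N x mu x L new mutations enter a target of L sites every generation. The function name and the parameter values below are illustrative assumptions, not figures from either paper.

```python
def new_mutations_per_generation(n_individuals, mu_per_site, n_sites):
    """Expected number of new mutations entering a diploid population each generation."""
    # Each of the 2N chromosome copies can mutate independently at every site.
    return 2 * n_individuals * mu_per_site * n_sites


# Assumed values for illustration only; neither number comes from the papers discussed.
MU = 1e-8      # mutations per site per generation
TARGET = 3e7   # number of protein-coding sites

for n in (10_000, 1_000_000, 100_000_000):
    count = new_mutations_per_generation(n, MU, TARGET)
    print(f"N = {n:>11,}: {count:,.0f} new mutations per generation")
```

    The supply of new mutations scales linearly with population size, which is the only point the sketch is meant to make. Whether those mutations persist depends on drift and selection, which this calculation ignores.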

    One of the senior authors of the study, Joshua Akey, commented:

    Most of the mutations that we found arose in the last 200 generations or so. There hasn’t been much time for random change or deterministic change through natural selection. We have a repository of all this new variation for humanity to use as a substrate. In a way, we’re more evolvable now than at any time in our history.

    (This is quoted by Punnett Square; I am not sure about the original source.)

    That's a cool concept. These rare protein-coding variations may be mostly unimportant to fitness today, and many are slightly deleterious. Still they provide a store of variability that increases the potential range of responses to future adaptive challenges. Or, they give us room to examine the effects of small differences, which will help us to understand better how genes work. For the past few thousand years, a small proportion of those have come under positive selection, the part that we have been studying in my lab since 2007.

    The current study has some drawbacks. For one, it isn't evident from the results how these new coding mutations are distributed among individuals. Under population growth alone, we should expect that the number of these new coding variants carried by any one individual should be approximately the same as any other individual, regardless of the population size. Where a big population differs from a small population is in the variety of mutations carried by different individuals, with the average number per individual being equal. That may be true in this study, but it isn't possible to tell from the results presented.

    To the extent that some of these mutations are deleterious, their distribution matters. In Europeans, there may be a greater number of deleterious mutations that are on average more rare; all things being equal, this pattern should make it harder to find statistical evidence for association of these rare variants with complex disorders. By contrast, in Africans, the higher average frequencies of such variants should make them easier to tie to phenotypic variation. All this can be concluded from frequencies alone, without a need to relate frequency to age.

    Probably the biggest shortcoming of the paper is in its estimation of ages for these rare mutational variants. Estimating the ages of mutations in human populations has been a real problem for those of us working with genotyping or sequencing data from small samples. When we depend on the linkage between a rare allele and nearby genetic loci, we run into a sampling problem: Estimating the proportion of recombinants in a population fundamentally has a lot of error when you are working with a sample of 10 copies of the rare allele.

    Estimating dates by LD is bad enough, but this paper doesn't even go that far. Instead, it estimates the ages of alleles from their frequency.

    Frequency estimation of age is OK if the genome sequences have come from a Wright-Fisher population (that is, a random-mating, constant size population). More common alleles tend to be older; new alleles tend to be very rare. This isn't a very accurate means of dating any particular mutation, because the relationship of age and frequency under genetic drift has a tremendous variance. But when pooling large sets of alleles into frequency classes, the age-by-frequency approach gives a rough idea of whether mutations have accelerated or stayed at a constant rate over time.
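
    For a sense of scale, the classical diffusion result of Kimura and Ohta (1973) gives the expected age of a neutral allele at frequency x in a constant-size population as -4 x Ne x x x ln(x) / (1 - x) generations. The sketch below evaluates that expectation. It is not the simulation-based estimator used by Fu and colleagues, and the effective size of 10,000 is an assumed value for illustration.

```python
import math


def expected_allele_age(freq, n_e):
    """Expected age, in generations, of a neutral allele at frequency `freq`.

    Uses the constant-size diffusion approximation -4*Ne*x*ln(x)/(1-x).
    """
    if not 0.0 < freq < 1.0:
        raise ValueError("freq must lie strictly between 0 and 1")
    return -4.0 * n_e * freq * math.log(freq) / (1.0 - freq)


N_E = 10_000  # assumed effective population size, for illustration only

for freq in (0.001, 0.01, 0.1, 0.5):
    age = expected_allele_age(freq, N_E)
    print(f"frequency {freq:>5}: about {age:,.0f} generations")
```

    This is only an expectation. The variance around it for any single allele is very large, which is why the approach is more defensible for pooled frequency classes than for individual mutations.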

    But there's one obvious thing missing from the model that may have a large effect on the frequencies of rare coding variants: Introgression from Neandertals! If we want to know why Europeans have a large store of rare coding variants relative to Africans, their ancient mixture of a small fraction of a very divergent human population is one obvious reason. None of the Neandertal alleles in Europeans today are new; they are all old. But a method that estimates their ages by allele frequency alone will always conclude that these rare Neandertal alleles are very young.

    In the current paper, the relation of frequency and age is derived from simulations that are based on a model of human population history. Like all recent papers that apply a model of human population history, this one is both overcomplicated (lots of parameters to which we have no good estimates) and oversimplified (too few events to accommodate known historical phenomena). Here's the population model used to derive allele ages in the paper:

    Population model from Figure S5 in the supplementary information from Fu et al. 2012

    The parameters for population divergence times and ancient population sizes are estimated from genetic data, so any systematic error will propagate through to the estimation of allele ages. The exclusion of Neandertal introgression in the model really does bias the allele age estimates badly, as Neandertal genes today are mostly rare, and mostly very old. This year's shift in our assumptions about mutation rates (to a much slower rate than previously assumed) will also affect the estimates of the demographic parameters in the model. An older coalescence time for most genes means a larger ancestral effective size for these populations, and much older allele ages when frequency is the estimator.

    Our lab is working very hard on allele ages, and I hope to be able to share some of that work soon.

    This study is not alone in demonstrating the real importance of rare coding variation in human populations. This line of research has substantial value, as it helps to show why so much of the additive genetic variation underlying variation in human phenotypes has not yet been assigned to genes. We know that many traits are heritable by comparing genetic relatives with each other. Finding the genetic loci that explain similarity among relatives is relatively easy when the genes involved are common, because the same gene variants will be shared across many families. But pooling many families doesn't help us find very rare mutations, as these are likely carried only by a few pedigrees even in a very large sample. By showing the large store of rare coding variation, these studies help to establish that much of the genetic variation underlying disease may be there for us to discover, if we change our discovery approach.


    References

    Synopsis: 
    Probing the pattern of rare coding variation in whole exome data.
  • The North African Neandertal descendants

    Thu, 2012-10-18 16:25 -- John Hawks

    A new paper by Federico Sánchez-Quinto and colleagues reports on comparisons of North African population samples with the Neandertal DNA project data [1]. The paper shows that North African populations also carry a substantial trace of Neandertal ancestry, like living populations outside of Africa, much more than populations of sub-Saharan Africa.

    One of the main findings derived from the analysis of the Neandertal genome was the evidence for admixture between Neandertals and non-African modern humans. An alternative scenario is that the ancestral population of non-Africans was closer to Neandertals than to Africans because of ancient population substructure. Thus, the study of North African populations is crucial for testing both hypotheses. We analyzed a total of 780,000 SNPs in 125 individuals representing seven different North African locations and searched for their ancestral/derived state in comparison to different human populations and Neandertals. We found that North African populations have a significant excess of derived alleles shared with Neandertals, when compared to sub-Saharan Africans. This excess is similar to that found in non-African humans, a fact that can be interpreted as a sign of Neandertal admixture. Furthermore, the Neandertal's genetic signal is higher in populations with a local, pre-Neolithic North African ancestry. Therefore, the detected ancient admixture is not due to recent Near Eastern or European migrations. Sub-Saharan populations are the only ones not affected by the admixture event with Neandertals.
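
    The kind of test described in the abstract can be sketched with an ABBA-BABA count. For each site, take one allele from a sub-Saharan African sample (P1), one from a North African sample (P2), and one from the Neandertal, coding the ancestral state as 0 and the derived state as 1. The D statistic below is the standard form of that comparison. The abstract does not spell out the exact statistic the authors used, and the function name and toy data are hypothetical.

```python
def d_statistic(sites):
    """Compute D = (ABBA - BABA) / (ABBA + BABA) from (p1, p2, neandertal) allele triples.

    Alleles are coded 0 (ancestral, matching the outgroup) or 1 (derived).
    """
    abba = sum(1 for site in sites if tuple(site) == (0, 1, 1))
    baba = sum(1 for site in sites if tuple(site) == (1, 0, 1))
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Toy data invented for illustration: 6 ABBA sites, 4 BABA sites, 3 uninformative sites.
toy_sites = [(0, 1, 1)] * 6 + [(1, 0, 1)] * 4 + [(1, 1, 1), (0, 0, 1), (0, 0, 0)]

# A positive value means P2 shares more derived alleles with the Neandertal than P1 does.
print(d_statistic(toy_sites))
```

    Under a simple tree with no gene flow, the two site patterns are expected to be equally common, so D should sit near zero. A real analysis would also need a standard error, for example from a block jackknife across the genome, before calling any excess significant.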

    The interesting aspect of the paper is that the authors attempted to separate the ancestry of North African samples into a pre-Neolithic indigenous African component, and a residual component that represents more recent gene flow into North Africa, from all sources. The historic movement into North Africa has been fairly cosmopolitan, involving sub-Saharan Africans, Arabs, Medieval Europeans, Romans, Carthaginians and many other peoples. Sánchez-Quinto and colleagues used the ADMIXTURE program to try to sort out a pre-Neolithic indigenous component and analyze that specifically for Neandertal similarity.

    Unsurprisingly, the fraction of estimated sub-Saharan African ancestry in each population sample was inversely correlated with the estimated Neandertal ancestry. That is, the more a population looks like sub-Saharan Africans, the less Neandertal it has.

    Here's what's surprising: When they sorted out parts of the genome in Tunisians that ADMIXTURE determines to be most likely from pre-Neolithic North Africans, they found these parts of the genome had more Neandertal ancestry than typical of the CEU sample of northern European ancestry. Is it possible that ancient North Africans had more Neandertal similarity than today's Europeans?

    Sánchez-Quinto and colleagues suggest that the Neandertal ancestry in this population came in Upper Paleolithic times from the Near East. That is possible, or some of the Neandertal similarity may reflect ancient African population structure. Really I think we will have to do a finer analysis of chromosome blocks to examine the subset of shared Neandertal derived alleles that reflect introgression versus incomplete sorting from the ancestral African population. It will be very interesting to examine more closely the mixture of population history within Egypt, through which most Near Eastern pre-Neolithic population movement must have come.

    The authors note that Neandertal similarity outside Africa increases with distance from Africa.

    A previous study [26] observed that the similarity to Neandertals increases with distance from Africa and suggested this could be explained by SNP ascertainment bias plus a strong genetic drift in East Asian populations. Nonetheless more complex, population-biased, ascertainment schemes might have additional effects (i.e bottlenecks), but these are not expected to significantly increase the rate of false positives in admixture tests [31]. The Tunisian population has been reported to be a genetic isolate [17] so it is plausible that part of the signal detected is actually due to genetic drift. However, this should not affect the other North African groups in our study. Finally, given that SNP arrays are based on common alleles and probably the relevant admixture information is encoded within the rare and very rare alleles, the potential bias, if anything, will underestimate ancient hominid admixture signals, as shown in previous studies [2],[3].

    This pattern was also observed by Meyer and colleagues earlier this year [2], and I discussed it in my post on that paper ("Denisova at high coverage"). Both papers note that ascertainment bias may contribute to this pattern. I added that Meyer and colleagues had assumed that genes found in sub-Saharan African populations could not have come from Neandertals, which greatly biased their estimates against Europe and West Asia, considering historical and prehistoric gene flow across the Sahara and along the Indian Ocean coast. So I'm not yet accepting the relative numbers of Neandertal ancestry from different populations, as we don't know that they have all come from consistent assumptions. In particular, an elevated amount of Neandertal ancestry in China -- this paper puts it at almost double the amount of Neandertal ancestry in northern Europeans -- is unlikely. There is no pattern of bottlenecks that can give rise to that excess without additional population mixture, and it is hard to see where such population mixture would have happened without also affecting the ancestors of Europeans. Instead, we have some work to do in reducing the biases on these comparisons.


    References

    Synopsis: 
    A study of North African genetic variation shows that Neandertal genes were widespread in the area before the Neolithic.
  • Quote: Lederberg on Haldane

    Sun, 2012-09-30 00:16 -- John Hawks

    J. B. S. Haldane has typically been assigned credit for the first suggestion that human hemoglobinopathies are adaptations to malaria. In 1999, Joshua Lederberg examined the history of this question [1].

    Haldane's most often remembered attribution, to malaria, oddly enough does not appear at all in the formal article but in the discussion footnotes. Therein, Montalenti acknowledges a verbal communication from Haldane suggesting that thalassemia heterozygotes may be more resistant to malaria. In his rejoinder, Haldane goes on to suggest that “microcythemic heterozygotes may be at an advantage on diets deficient in iron or other substances, thus leading to anemia” (HALDANE 1949, p. 76). This has been widely viewed as an anticipation of much later research on heterozygote advantage of blood dyscrasias in relation to malaria.

    In this regard, the work of A. C. ALLISON (1954) is well known. However, he remarks (private e-mail communication, April 26, 1999):

    At the time of publication of my finding that sickle-cell heterozygotes have some protection against malaria (1954), I was unaware that J. B. S. Haldane had made a similar suggestion for thalassemia. After my publication I was invited to make a presentation at University College, London, and we had a friendly discussion. Haldane said that he had recognized that heterozygotes for the thalassemia gene are likely to have some advantage to counter-balance selection against homozygotes and suggested several possible candidates, among them malaria and better absorption of iron. He added that to speculate about the problem was one thing and to provide experimental evidence for a solution was altogether another. This was the first evidence that natural selection operates in humans.

    Meanwhile, Allison himself [2] cited the earlier work of Beet, who showed in 1946 and 1947 that the blood of East African peoples with the sickle-cell trait carried a lower incidence of malaria parasites than the blood of normal individuals [3][4]. Beet's articles are of interest because they precede Haldane's oblique suggestion about the adaptive value of thalassemia. Still, the relatively weak observation that sickle-cell individuals have a slightly lower incidence of parasites was not a sufficient proof that the sickle-cell trait actually protected its carriers.

    Allison demonstrated the connection between sickle-cell and malaria resistance in two ways. He undertook an epidemiological survey among children, showing a very strong statistical association between non-sicklers and parasites in the blood. Then, he performed an experiment in which 15 sickle-cell trait and 15 normal individuals were injected with the malaria parasites in a controlled way. These two groups were starkly different in their parasite response, with only two of the sickle-cell trait individuals showing any parasites at all, and then at low blood counts; while 14 out of 15 of the normal individuals had parasite infections. His article is notable not only for this clear demonstration, but because of its direct discussion of the other major arguments in favor of the malaria resistance hypothesis, including the close examination of the geographic distribution of the sickle-cell trait in relation to endemic malaria, and the rejection of the alternative hypothesis of a high mutation rate. This paragraph is exceptionally clear:

    The main problem can be stated briefly: how can the sickle-cell gene be maintained at such a high frequency among so many peoples in spite of the constant elimination of these genes through deaths from the anaemia? Since most sickle-cell anaemia subjects are homozygotes, the failure of each one to reproduce usually means the loss of two sickle-cell genes in every generation. It can be estimated that for the lost genes to be replaced by recurrent mutation so as to leave a balanced state, assuming that the sickle-cell trait -- that is, the heterozygous condition -- is neutral from the point of view of natural selection, it would be necessary to have a mutation rate on the order of 10^-1. This is about 3,000 times greater than naturally occurring mutation rates calculated for man and, with rare exceptions, in many other animals.

    Interesting: the mechanism by which the sickle-cell trait deters the parasites is even today not fully understood.


    References

  • Into Africa

    Fri, 2012-07-27 00:59 -- John Hawks

    I have a lot to say about the new study of African genomes by Joseph Lachance and colleagues [1], which I think is tremendously exciting, along with the new preprint from Joseph Pickrell and colleagues on the arXiv, which includes some similar analyses with SNP data. But I'm on my way to Africa myself today for a week, and don't have time to post all my thoughts about the new papers until I arrive there. So I'll try to post these over the weekend.


    References

  • Neandertal introgression, 1000 Genomes style

    Sat, 2011-12-10 18:16 -- John Hawks

    For our project to understand pigmentation genetics in archaic humans, we had to find a good comparative sample of sequence data from recent humans. The original publication on the draft Neandertal genomes compared them to five low-coverage genomes from different Old World populations, along with the publicly available genomes from Craig Venter and others [1]. The first publication on the Denisova genome added an additional handful of genomes to these comparisons [2].

    Some of this handful of genomes from living people are more similar to the Neandertal and Denisova genomes than others. That simple fact is the proof that some living people have Neandertal and Denisovan ancestors.

    But until now, the comparison has been limited to a very small number of human genomes. That became a focus for critics of the Neandertal and Denisovan results. How could three or four genome sequences possibly provide an adequate representation of human variability? We could imagine scenarios in which the similarities between Neandertals and humans could be explained by some unsampled population, for example, northeast Africans [3]. Denisova does not present the same problem, because African population structure cannot possibly explain its resemblance to populations in Wallacea, Australia, and Oceania [2] [4]. But to compare either of these genomes, we should seek a broader sampling of genomes from living people.

    As I wrote yesterday, my students and I have been working to understand pigmentation genetics of the archaic human genomes ("Pigmentation of archaic humans: introduction"). I've emphasized the need to break the analysis into small steps. For this question, we need to examine whether the pattern of introgression around pigmentation genes is characteristic of the genome as a whole. If genes involved in pigmentation have systematically higher or lower levels of Neandertal ancestry, that will tell us a lot about the evolutionary history of pigmentation in recent and archaic humans. For this, we need a good comparative sample, and the 1000 Genomes Project provides the best sample available.

    The first step in assessing the pattern of introgression for pigmentation genes is to characterize the pattern of introgression across the whole genome.

    Yes, a whole-genome introgression analysis sounds awfully big for my "small steps" concept. But actually this is simpler than it might sound. Here's a teaser:

    The figures in this post are not from a whole-genome analysis; they include data from eight chromosomes that we prioritized because of our pigmentation analysis. I am licensing all of them under a Creative Commons ShareAlike license so that anyone can use them anywhere.

    UPDATE (2011-12-10): I finished the whole genome analysis and am updating this post and figures accordingly. The results are the same throughout, with the exception of the Europe-East Asia comparison, which now shows these populations to be significantly different across the genome as a whole. I have partially updated the figures and will finish these later today.

    The value of sequences

    The 1000 Genomes Project data have been updated several times in the last year, as both sequencing and analysis of the genomes have progressed (more information on the 1000 Genomes Project website). We downloaded a release of SNP genotype calls from 1094 individuals, based on the low-coverage (average 4x) sequencing that has been carried out on the sample.

    A SNP (single nucleotide polymorphism) is a nucleotide site with at least two alleles present in the global human sample. These sites represent only one kind of genetic variation in today's populations. Many of the differences between people's genes are caused by insertions, duplications, deletions, transpositions, or inversions. But those kinds of polymorphisms can be challenging to study in low-coverage genomes, and we already understand quite a lot about SNPs in human populations from the earlier HapMap project [5] [6]. The HapMap provided the data underlying our 2007 paper on the acceleration of recent human evolution ("Why human evolution accelerated") [7].

    The drawback of earlier SNP variation projects is that they examined only a subset of SNP variation in a sample of people. To design a microchip that could provide a million or more SNP genotypes from a saliva sample, somebody first had to discover where in the genome SNPs could be found. So they took small samples of people, sometimes only a single person's two copies of the genome, and sequenced. Adding together SNPs found by several methods, they could get a representation of SNP variation across the whole genome in a population. But this process introduced a bias: the SNPs were ascertained in a sample that inevitably could not represent humans in other samples with the same accuracy. Initially, SNP samples were heavily biased toward people of European ancestry (upon whom most genetic work was originally done), and the HapMap project went to great efforts to increase the representation of other populations. But even with the best possible ascertainment, interpreting SNP variation requires us to jump through some theoretical hoops.

    Sequence data make life much easier for the population geneticist. Seriously, working on this stuff on the whiteboard is fun instead of a constant nightmare of sampling biases and spaces between markers. I have a bias myself, in that I find recombination hard to deal with. I love reticulation among populations, but I'd rather work with genealogies that look like proper trees instead of a liana-strewn mess. So looking at sequence data over short intervals makes me happy. Not as happy as beer aged in bourbon barrels, but happy.

    The 1000 Genomes Project SNP files represent every SNP mutation observed in the sample. In other words, these are sequence data, just with all the fixed (and therefore redundant) sites removed. Even so, these sequence data are not perfect. Low coverage means that some rare mutations in the sampled individuals will go unreported. We aren't typically interested in singleton mutations in the sample, except that missing them will introduce a bias upon our estimates of the time that common ancestors lived. Next-gen sequence reads are usually fairly riddled with errors. High coverage allows these errors to be removed with some confidence, but low-coverage genomes risk throwing out real SNPs along with the spurious ones. The publicly available files represent some analytical steps that we do not here control, so we have to work with the understanding that the data are not perfect.

    The 1000 Genomes SNP files have had a phasing algorithm applied to them, which attempts to assign genotypes to chromosomes. In essence, phasing tries to figure out whether adjacent SNP alleles belong to the same copy or to different copies of the same chromosome. The details of this phasing are not yet apparent, and for many reasons I am cautious about using phased data. The inference is often inaccurate for rare mutations, and the whole process tends to sneak assumptions about population history into the resulting dataset. I hate being forced to live with someone else's assumptions about human population history, and I typically try to avoid needing phased data. In this case, it looks like the data over short intervals are as accurate as they can be, given the limitations on coverage and sampling. We have moved forward by applying methods that make a bare minimum of assumptions.

    Counting derived SNP alleles

    David Reich and colleagues came up with an appealingly simple test of introgression, which they applied to both the Neandertal and Denisovan genomes. Eric Durand, Reich, Nick Patterson and Monty Slatkin formally described the method, which they call the D-statistic, this year [8]. Informally, this has become known as the ABBA-BABA test, after their labels for the discordant genealogies that the test compares. By and large, across the genome, humans living today share many more new mutations with each other than they do with an archaic human like a Neandertal. But sometimes two genomes are different from each other, and one of them shares a new mutation with the Neandertal.

    A human might share a mutation with a Neandertal because it actually isn't very new, and both inherited the mutation from some much more ancient population of humans. This scenario is called "incomplete lineage sorting", because humans today have multiple gene lineages that existed within some very ancient population, instead of these having been "sorted" cleanly into the different human and Neandertal populations. Incomplete lineage sorting does happen a lot between humans, Neandertals, and Denisovans. ILS is the normal mode of variation among recent human populations, who trace their genealogical histories back much further than the earliest "modern" humans. So if one human has a Neandertal allele, and another human has a different allele, it's probably no big deal. They both just inherited gene variants that already existed in our distant common ancestors.

    You can probably see already that if we had a way to estimate the age of an allele, we could tell whether incomplete lineage sorting is a credible explanation for any particular site. I'll leave that point for another post.

    In the meantime, if we pretend that we know nothing at all about the ages of alleles, we must find some other way to tell whether incomplete lineage sorting can explain Neandertal similarities. Reich and colleagues recognized that incomplete lineage sorting from ancient pre-Neandertal ancestors ought to be distributed equally among living people. If we look at every site in the genome where we have data from Neandertals, we should find that one living human genome should look like the Neandertal just as often as another.

    This insight led to their test. Take a pair of humans, count the number of times sequence A is like the Neandertal and sequence B is like a chimpanzee, and then do the inverse — B then A. ABBA-BABA.

    Why a chimpanzee? In most cases the chimpanzee allele will represent the ancestral state for humans. Living people can inherit ancestral alleles from Neandertals as well as derived ones, but the derived ones tend to be rarer and younger within human populations. If one living genome shares an ancestral allele with the Neandertal genome, we don't need incomplete lineage sorting or introgression to explain the pattern. For all we know, such a mutation originated after Neandertals were already gone. So we need to pay attention to the derived mutations, ones that are present in Neandertals but not in chimpanzees. Do a count of these across the genome, and if you find a living genome with significantly more than another, you've found evidence for introgression.
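
    The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the use of "N" for missing data, and the toy alignment are all assumptions made for the example. It counts sites where only the second human shares the Neandertal's derived allele (ABBA) and sites where only the first does (BABA), and reports the normalized difference.

```python
def d_statistic(h1, h2, neandertal, chimp):
    """Count ABBA and BABA sites and return (abba, baba, D).

    Each argument is an aligned sequence of alleles, one character per site.
    'N' marks missing data (an assumption of this sketch). The chimpanzee
    allele is treated as the ancestral state.
    """
    abba = baba = 0
    for a, b, n, c in zip(h1, h2, neandertal, chimp):
        if "N" in (a, b, n, c):
            continue  # skip sites lacking data in any of the four sequences
        if n == c:
            continue  # Neandertal carries the ancestral allele: uninformative
        if a == c and b == n:
            abba += 1  # only h2 shares the derived allele with the Neandertal
        elif a == n and b == c:
            baba += 1  # only h1 shares the derived allele with the Neandertal
    total = abba + baba
    d = (abba - baba) / total if total else 0.0
    return abba, baba, d


# Toy alignment, invented for illustration only.
chimp      = "AAAAAAA"
neandertal = "GGGGAAG"
human_1    = "GAGAGAA"
human_2    = "AGGAAGG"

abba, baba, d = d_statistic(human_1, human_2, neandertal, chimp)
print(abba, baba, round(d, 3))  # 2 1 0.333
```

    Under incomplete lineage sorting alone, the two counts should be about equal and D should sit near zero. A real test also needs a standard error for D, which this sketch does not attempt.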

    Ed Green, David Reich and colleagues [1] [2] did a comparison of every possible pair of genomes in their modern human sample. These sequence data were gappy, so that sequence A might share different coverage with B than with sequence C. So it was necessary to consider each pair separately, counting all the sites where both human sequences and the Neandertal and chimpanzee sequences had data.

    The 1000 Genomes Project sample reports genotypes for every SNP for every sampled individual. So in principle, every pair of sequences should have data for every one of these sites. Again, we have to be cautious about the nature of the sequencing, attending to the possibility of systematic biases due to low coverage. But we really don't have to take the time-consuming step of comparing every possible pair of the 2188 resulting haploid genomes. We can just find the derived SNP alleles that are present in Neandertals and count how many of them are in each of the human sequences. If one sequence has significantly more Neandertal derived alleles than another, it had to get them somehow.
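
    A rough sketch of that shortcut follows. The data layout (a dictionary of haploid sequences as strings, with "N" for missing archaic or chimpanzee data) and the names are hypothetical, chosen only to keep the example self-contained; the real files are far larger and need their own parser.

```python
def shared_derived_counts(haplotypes, neandertal, chimp):
    """Return, for each haploid sequence, the number of sites where it
    carries the Neandertal's derived allele.

    A site is informative when the Neandertal and chimpanzee both have data
    and differ, so the Neandertal allele is taken to be derived.
    """
    informative = [
        i for i, (n, c) in enumerate(zip(neandertal, chimp))
        if n != "N" and c != "N" and n != c
    ]
    return {
        name: sum(1 for i in informative if seq[i] == neandertal[i])
        for name, seq in haplotypes.items()
    }


# Toy data, invented for illustration only.
chimp      = "AAAAAAA"
neandertal = "GGGGAAG"
haplotypes = {
    "sample1_copy1": "GAGAGAA",
    "sample1_copy2": "AGGAAGG",
}

print(shared_derived_counts(haplotypes, neandertal, chimp))
# {'sample1_copy1': 2, 'sample1_copy2': 3}
```

    Each haploid genome is scored once against the same list of informative sites, so the work grows with the number of sequences rather than with the number of pairs.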

    That magic three percent

    The figure at the top of the post represents that count. Every individual in the 1000 Genomes Project dataset has two copies of the autosomal genome. Separating these two copies of the genome (basically arbitrarily) and counting up the shared derived features between each of those copies and the genome of Vindija 33.16, we obtain the histogram. Here it is again:

    The African genomes in the 1000 Genomes sample include Yoruba from Nigeria and Luhya from Kenya. The Asian populations sampled are Japanese and Chinese, including people of Han Chinese ethnicity in Beijing and southern China. The European ancestry samples include the CEU sample from Utah, as well as British, Tuscan, Spanish and Finn samples.

    The histogram shows that Asian and European genomes have significantly more Neandertal derived SNP alleles than do the African genomes. The averages for the Asian and European samples are around 3% higher than the average for the African samples. Whatever gave Africans some degree of similarity to Neandertals, non-Africans seem to have gotten around 3% more of it.

    Green and colleagues [1] assumed conservatively that Africans share derived SNP alleles with Neandertals only because of incomplete lineage sorting from the human-Neandertal ancestral population. This fraction should be the same in all human populations, under the assumption that Africans were mostly isolated from Neandertals for some period of time. The 3% Neandertal bonus outside Africa should then represent introgression from Neandertals into recent populations outside Africa.

    Both previous studies noted that genomes outside Africa are not significantly different in the fraction of derived SNP alleles shared with Neandertals. A genome from China and a genome from France carried the same fraction of shared derived SNP alleles with Neandertals. Here, we've confirmed that basic identity in the level of introgression in these populations.

    I have told several people now that I find the distributions in China and Europe spookily similar. On parts of the genome, the two distributions have means that are not significantly different. Indeed, I worked for a week with an analysis of eight chromosomes, in which the East Asian and European means were fewer than 100 SNP alleles apart. Even across the whole genome, Europeans average only 700 derived SNP alleles more than the East Asian sample. This small difference (a bit more than a tenth of a percent) is strongly significant on these sample sizes. A t-test yields a p-value of 1.1 × 10^-26 on the difference in means. Even so, the distributions of these two populations overlap across most of their ranges.
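
    To see how a difference this small can still be strongly significant, here is a sketch of a Welch t-statistic computed from summary numbers. Only the 700-allele difference comes from the text; the means, standard deviations and sample sizes below are invented placeholders, and the p-value uses a normal approximation rather than the exact t distribution.

```python
import math


def welch_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Welch's t-statistic from summary statistics, with a two-sided
    p-value from the normal approximation (reasonable for large samples)."""
    se = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    t = (mean_1 - mean_2) / se
    p_approx = math.erfc(abs(t) / math.sqrt(2))
    return t, p_approx


# Hypothetical inputs: two samples of 500 haploid genomes each, means
# 700 alleles apart, standard deviation of 1000 alleles in both.
t, p = welch_t(100_700, 1_000, 500, 100_000, 1_000, 500)
print(round(t, 2), p < 1e-20)  # 11.07 True
```

    With hundreds of genomes per sample the standard error of the mean shrinks to a few dozen alleles, so a gap of 700 is many standard errors wide even though the two distributions overlap heavily.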

    Seeing these hundreds of genomes arrayed on a histogram provides much more information than we had from a handful of genomes. It is remarkable how much dispersion there is among genomes from a single population. Although the means of these two samples are nearly the same, you can see that each of them has a large range of variation in the shared derived SNP alleles with Neandertals. This variation means that people within a single population have very different proportions of Neandertal ancestry.

    This is not a graph of people, but a separation of the two copies of SNP alleles carried by these people. That separation is phased at short scales but arbitrary on the scale of a whole chromosome, so the histogram likely understates the variance among single genomes while it overestimates to some extent the variation among people with their diploid genomes. Still, it looks likely from these comparisons that some people in Europe carry more than a percent higher Neandertal ancestry than the average, and some carry a percent less. We can use statistical methods to test this hypothesis directly as applied to individuals in the sample.

    Neandertal genes in recently admixed populations

    A sample of hundreds of people allows us to demonstrate significant differences among the genomes of different populations. Some of the 1000 Genomes Project samples are from populations that represent historically recent admixture of people who trace their ancestry to different parts of the world.

    For example, the "ASW" population sample includes African-American people who live in the Southwest United States. We know from many other genetic studies that African-Americans vary in the fraction of ancestry they derive from Europeans and from Africans. The average proportions of African and European ancestry vary among African-Americans who live in different parts of the U.S., with the European fraction as low as 3% and as high as 20% or more in some parts of the country. The proportion among individuals varies even more. So when we consider the ASW sample, we should expect to see a lot of variation in the number of shared derived SNP alleles with Neandertals, with a mean higher than African populations.

    Which is exactly what we do see:

    The ASW sample overlaps substantially with the Yoruba sample from West Africa (Nigeria) and slightly with the CEU sample, which includes people of European ancestry in Utah. The total in the ASW genomes is more variable than either the Yoruba or CEU population samples. If the higher mean in the ASW genomes reflects European ancestry from a population like CEU, the proportion of European ancestry would be around 17% for that sample of people. It would be hard to tell from these numbers alone how much of the variation in ASW is attributable to variation in ancestry fraction, and how much is expected within a population of homogeneous ancestry. As we'll see in some other populations, there are some appreciable differences among populations within a given region, and ancestry differences may add to the variation among individuals within populations.
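
    The arithmetic behind an estimate like this is simple linear interpolation between the two source populations. The sketch below assumes the admixed mean is a weighted average of the source means; the function name and the three mean counts are invented placeholders, picked so that the result comes out near the 17% mentioned above.

```python
def ancestry_fraction(mean_admixed, mean_source_a, mean_source_b):
    """Fraction of ancestry from source B, assuming the admixed mean is a
    simple weighted average of the two source means."""
    return (mean_admixed - mean_source_a) / (mean_source_b - mean_source_a)


# Hypothetical mean counts of shared derived alleles per haploid genome.
mean_yri = 100_000   # West African source (placeholder value)
mean_ceu = 103_000   # European source (placeholder value)
mean_asw = 100_510   # admixed sample (placeholder value)

print(round(ancestry_fraction(mean_asw, mean_yri, mean_ceu), 2))  # 0.17
```

    This gives one number for the whole sample. It says nothing about how much ancestry varies from person to person, which is the harder question raised above.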

    We see a similar pattern when we look at the Puerto Rican sample. Individuals in this sample have some ancestry from European, Native American and African ancestors. The comparisons by Reich and colleagues [2] and Green and colleagues [1] suggested that Native American populations have the same fraction of Neandertal ancestry as other people outside Africa. In the comparison with YRI and CEU samples, Puerto Rican (PUR) genomes are intermediate, with a mean suggesting around 15% ancestry from the West African population.

    The two outlier points in the Puerto Rican sample are the two genome copies from one individual, who we would hypothesize had much higher African ancestry than the average in the sample.

    Next...

    This post has taken me much longer than I expected to get to the point of talking about variation among samples within continental regions. It turns out that, despite the similarity of European and East Asian samples in their averages, there are substantial differences between samples within each of these regions.

    For example, here's a comparison of north and south Chinese samples:

    People of Han Chinese ethnicity sampled in Beijing appear to have on average a half percent more Neandertal ancestry than people of the same ethnicity sampled in southern China. I found these kinds of differences almost everywhere I looked within regions. More later...


    References

    1. Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH, et al. A Draft Sequence of the Neandertal Genome. Science [Internet]. 2010;328:710–722. Available from: http://dx.doi.org/10.1126/science.1188021
    2. Reich D, Green RE, Kircher M, Krause J, Patterson N, Durand EY, Viola B, Briggs AW, Stenzel U, Johnson PLF, et al. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature [Internet]. 2010;468:1053–1060. Available from: http://dx.doi.org/10.1038/nature09710
    3. Hodgson JA, Bergey CM, Disotell TR. Neandertal genome: the ins and outs of African genetic diversity. Current biology : CB. 2010;20(12):R517-9.
    4. Reich D, Patterson N, Kircher M, Delfin F, Nandineni MR, Pugach I, Ko AM-S, Ko Y-C, Jinam TA, Phipps ME, et al. Denisova admixture and the first modern human dispersals into southeast Asia and oceania. American journal of human genetics. 2011;89(4):516-28.
    5. The International HapMap Consortium. A Haplotype Map of the Human Genome. Nature [Internet]. 2005;437:1299–1320. Available from: http://dx.doi.org/10.1038/nature04226
    6. McVean G, Spencer CCA, Chaix R. Perspectives on human genetic variation from the HapMap Project. PLoS genetics. 2005;1(4):e54.
    7. Hawks J, Wang ET, Cochran G, Harpending HC, Moyzis RK. Recent acceleration of human adaptive evolution. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2007;104:20753–20758. Available from: http://dx.doi.org/10.1073/pnas.0707650104
    8. Durand EY, Patterson N, Reich D, Slatkin M. Testing for ancient admixture between closely related populations. Molecular biology and evolution [Internet]. 2011. Available from: http://dx.doi.org/10.1093/molbev/msr048
    Synopsis: 
    We're quantifying the amount of Neandertal ancestry in whole genome data from living people.
  • The risk gradient

    Wed, 2011-11-09 23:58 -- John Hawks

    Ann Gibbons reports [1] from the International Congress of Human Genetics, on papers that examine GWAS risk alleles for type 2 diabetes: "Diabetes Genes Decline Out of Africa" (paywall).

    At the poster session, Stanford graduate student Erik Corona stood in front of a Google Earth map of the world that he finds surprising. On this map he had plotted the frequency of 12 gene variants known to be associated with type 2 diabetes in 51 populations from Australia to Zaire. It shows “a clear gradient of red to green from west to east, from Africa to Asia,” Corona says (see map). “Something strange is going on with type 2 diabetes.”

    This is of course a challenging problem because risk alleles identified in one population may not replicate in other populations. The most well-known example is ApoE4, strongly associated with Alzheimer's Disease in Europeans, but not in Africans. More generally, looking at a set of risk variants that are identified in one population introduces an ascertainment bias that constrains their likely frequencies in other populations. An allele is more likely to yield a statistically significant association with a trait if the allele is not too rare. If we take many alleles associated with a trait, we're likely to see some gradient across populations due to this bias alone.

    Hidden ascertainment bias is a problem we run up against quite a lot. It may not apply in this case, depending on where the risk alleles were identified, in particular since many risk alleles for type 2 diabetes appear to be linked to recent positive selection (explaining why I got interested).


    References

    1. Gibbons A. Diabetes Genes Decline Out of Africa. Science. 2011;334(6056):583.
  • African Homo erectus

    Tue, 2011-11-08 00:14 -- John Hawks
    Synopsis: 
    African specimens from the Early Pleistocene are compared

    This station includes several casts of early fossil Homo erectus, from the Early Pleistocene of Africa. These include:

    • OH 9, from Olduvai Gorge, Tanzania, around 1.2 million years old.
    • KNM-ER 3733, from Ileret, Kenya, 1.65 million years old.
    • KNM-ER 3883, from Koobi Fora, Kenya, 1.6 million years old.
    • KNM-WT 15000, from Nariokotome, Kenya, 1.5 million years old.

    In addition to these specimens, the station has a few comparative casts from earlier hominid species and from other parts of the world.

    What to do: First, consider the issue of sexual dimorphism in these specimens. Which are male and which are female? What features lead you to that conclusion?

    Second, why are the differences between these specimens and Homo habilis, for example, KNM-ER 1813, reflective of a species distinction, instead of sex?
