john hawks weblog

paleoanthropology, genetics and evolution

acceleration

  • My review of "Paleofantasy"

    Thu, 2013-03-14 16:22 -- John Hawks

    I have a review of Marlene Zuk's new book, Paleofantasy, in this week's Nature: "Evolutionary biology: Twisting the tale of human evolution" [1].

    I can't replicate my review here, but for people who have access to Nature I thought I'd bring attention to it. And if you don't have access, I wanted to share a couple of my reactions.

    It was a fun book for me to read. Zuk brings a light-hearted skepticism to a broad array of topics in human evolution. She took as her focus a collection of "paleo-advice" ideas: barefoot running, paleo diet, back-to-nature parenting advice. She then added some uncritically accepted scientific notions about our evolution, such as the idea that agriculture was "the worst invention ever devised". To each of these topics, she brings an array of recent science questioning or disproving the assumptions. The result is not to debunk ideas, but to give a fuller (and more nuanced) perspective on how much we know (and don't know) about our evolution.

    The serious issue underlying all these topics, which Zuk recognizes, is the difficulty of reconstructing Pleistocene environments. Some hypotheses assume a fairly detailed model of ancient environments -- the so-called "environment of evolutionary adaptedness". But ancient humans lived in an array of environments, in many ways more different from each other than are different parts of today's globalized world. We are unquestionably living in environments no ancient humans knew, in population size, density, disease, lifespan, and many other ways. But in other ways, our difference from some ancient people is trivial compared to their diversity. Are we well-adapted to live in cities? Perhaps not in some ways, but maybe in others.

    Probably the best part of my review to share is the end:

    As an anthropologist, I observe that Zuk's use of the term 'fantasy' is just an emphatic way of describing the hypothesis-forming that is essential to evolutionary science. We play with hypotheses, explore their predictions and try very hard to falsify them. So it is, in a way, unremarkable that so many hypotheses proposed by anthropologists about ancient environments now seem to be wrong — and, in a few cases, even ridiculous.

    It means that science is working. Genomics, high-resolution climate records, and microscopic and isotopic evidence have changed our understanding of what the past has to offer. With that in mind, let the next round of palaeofantasies begin.

    Zuk's "very brief" overview of human evolution is a lot shorter than in other recent books on the topic. I found this to be a merciful change -- how many times do I really need to read about the Australopithecus-to-humans timeline? Readers who don't already know the basic timeline are unlikely to pick up the book, I would guess. Still, if you're looking for a "latest news" about early humans, this book is not directed that way. Where it excels is its coverage of recent evolutionary changes and the shifts in Holocene environments and genetics.

    The book is not without its weak points. Without quite enough of the "paleo-advice" topics to carry the whole story, there were some real differences in tone across the chapters, with some a bit drier than others.

    People coming to this book for "the right answer" about ancient environments are not going to find it. There is no right answer, at least not a scientific one, for many of the topics covered here. Zuk has done well to talk to a range of scientists, covering these different aspects of our evolutionary history, and discuss the reasons for their disagreement.

    I wish scientists would do that for themselves more often!


    Synopsis: 
    A new book by Marlene Zuk challenges some paleo advice mongers.
  • Recent evolution of coding variants

    Wed, 2012-12-05 01:00 -- John Hawks

    How did I get myself quoted in a story as the skeptic about recent human evolution? ("Human Evolution Enters an Exciting New Phase"). After all, I've been a huge advocate of the idea that recent human evolution was a lot faster and more interesting than anthropologists used to think ("Why human evolution accelerated").

    The story, by Brandon Keim, is a good account of a new paper in Nature by Wenqing Fu and colleagues, "Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants" [1]. It's a pretty cool study, which has identified protein-coding alleles in large samples of European-American and African-American individuals.

    Fu and colleagues compared all the coding variants they found in large samples of European-Americans and African-Americans, and discovered that the European-ancestry people have a higher fraction of rare coding variants. They propose that the rate of new coding variants entering and persisting within the population actually accelerated in the ancestral European population. Why would this happen? In their view, demography is the most likely explanation. As European populations expanded during the Neolithic and later time periods, the rate at which new mutations are lost by genetic drift began to decline. These new mutations have pooled up within the European population, giving it a glut of new changes to protein-coding sequences. Many of these mutations may be deleterious, just not bad enough for natural selection to have weeded them out in the growing ancient population.

    I think in large part this explanation is correct. In some ways it is incomplete.

    The effect of population history on our evolution was the theme of our 2007 paper on positive selection in recent humans [2]. We relied on exactly the same mathematical relations used in this new paper: More people means more different mutations entering the population. In our case, the increase in the total number of mutations meant that we could expect more potential adaptive mutations to be selected within a growing population. In this case, the increase in the total number of mutations means more mutations remain to be picked up by resequencing rare neutral or deleterious variations in present samples.

    One of the senior authors of the study, Joshua Akey, commented:

    Most of the mutations that we found arose in the last 200 generations or so. There hasn’t been much time for random change or deterministic change through natural selection. We have a repository of all this new variation for humanity to use as a substrate. In a way, we’re more evolvable now than at any time in our history.

    (this is quoted by Punnett Square, not sure about the original source)

    That's a cool concept. These rare protein-coding variations may be mostly unimportant to fitness today, and many are slightly deleterious. Still they provide a store of variability that increases the potential range of responses to future adaptive challenges. Or, they give us room to examine the effects of small differences, which will help us to understand better how genes work. For the past few thousand years, a small proportion of those have come under positive selection, the part that we have been studying in my lab since 2007.

    The current study has some drawbacks. For one, it isn't evident from the results how these new coding mutations are distributed among individuals. Under population growth alone, we should expect that the number of these new coding variants carried by any one individual should be approximately the same as any other individual, regardless of the population size. Where a big population differs from a small population is in the variety of mutations carried by different individuals, with the average number per individual being equal. That may be true in this study, but it isn't possible to tell from the results presented.
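
    The contrast between what one individual carries and what the whole population holds can be sketched for the mutations arising in a single generation. This is only an illustration: the per-site mutation rate and the size of the coding target below are assumed round numbers, not values taken from Fu and colleagues.

```python
# Assumed, illustrative values (not estimates from the paper).
MU_PER_SITE = 1e-8          # new mutations per site, per haploid genome, per generation
CODING_SITES = 30_000_000   # rough size of the sequenced coding target, in base pairs


def new_coding_mutations_per_person(mu=MU_PER_SITE, sites=CODING_SITES):
    """Expected new coding mutations in one diploid newborn.

    Population size does not appear here: each newborn draws from the
    same per-genome mutation process however many other people exist.
    """
    return 2 * mu * sites


def new_coding_mutations_in_population(n_people, mu=MU_PER_SITE, sites=CODING_SITES):
    """Expected new coding mutations across a whole birth cohort of n_people."""
    return n_people * new_coding_mutations_per_person(mu, sites)


for n_people in (10_000, 1_000_000, 100_000_000):
    per_person = new_coding_mutations_per_person()
    total = new_coding_mutations_in_population(n_people)
    print(f"N={n_people:>11,}  per person={per_person:.2f}  population total={total:,.0f}")
```

    The per-person number stays flat while the population total scales with the number of people, which is the sense in which a big population differs in the variety of mutations rather than in the load carried by each individual.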

    To the extent that some of these mutations are deleterious, their distribution matters. In Europeans, there may be a greater number of deleterious mutations that are on average more rare; all things being equal, this pattern should make it harder to find statistical evidence for association of these rare variants with complex disorders. By contrast, in Africans, the higher average frequencies of such variants should make them easier to tie to phenotypic variation. All this can be concluded from frequencies alone, without a need to relate frequency to age.

    Probably the biggest shortcoming of the paper is in its estimation of ages for these rare mutational variants. Estimating the ages of mutations in human populations has been a real problem for those of us working with genotyping or sequencing data from small samples. When we depend on the linkage between a rare allele and nearby genetic loci, we run into a sampling problem: Estimating the proportion of recombinants in a population fundamentally has a lot of error when you are working with a sample of 10 copies of the rare allele.
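
    A rough sketch of that sampling problem follows. It assumes the textbook decay of linkage, in which the share of allele copies still on the ancestral haplotype after t generations is (1 - c)^t for recombination fraction c. The sample counts and the value of c are made up for illustration.

```python
import math


def binomial_standard_error(nonrecombinant, copies):
    """Standard error of the observed nonrecombinant proportion."""
    p_hat = nonrecombinant / copies
    return math.sqrt(p_hat * (1.0 - p_hat) / copies)


def ld_age(proportion_nonrecombinant, recomb_fraction):
    """Age in generations, solving (1 - c)**t = proportion for t."""
    return math.log(proportion_nonrecombinant) / math.log(1.0 - recomb_fraction)


def ld_age_range(nonrecombinant, copies, recomb_fraction):
    """Point estimate plus ages at one standard error either side."""
    p_hat = nonrecombinant / copies
    se = binomial_standard_error(nonrecombinant, copies)
    younger = ld_age(p_hat + se, recomb_fraction)
    point = ld_age(p_hat, recomb_fraction)
    older = ld_age(p_hat - se, recomb_fraction)
    return younger, point, older


# Hypothetical sample: 10 copies of the rare allele, 7 still on the
# ancestral haplotype, marker at recombination fraction 0.001.
younger, point, older = ld_age_range(nonrecombinant=7, copies=10, recomb_fraction=0.001)
print(f"age estimate: {point:.0f} generations (range {younger:.0f} to {older:.0f})")
```

    With only ten copies, one standard error in the recombinant proportion already stretches the age estimate over a several-fold range.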

    Estimating dates by LD is bad enough, but this paper doesn't even go that far. Instead, it estimates the ages of alleles from their frequency.

    Frequency estimation of age is OK if the genome sequences have come from a Wright-Fisher population (that is, a random-mating, constant size population). More common alleles tend to be older, new alleles tend to be very rare. This isn't a very accurate means of dating any particular mutation, because the relationship of age and frequency under genetic drift has a tremendous variance. But when pooling large sets of alleles into frequency classes, the age-by-frequency approach gives a rough idea of whether mutations have accelerated or stayed at a constant rate over time.
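
    For the constant-size case, the classic result of Kimura and Ohta (1973) gives the mean age of a neutral allele now at frequency x as -4N x ln(x) / (1 - x) generations. The sketch below evaluates that expression; the effective size of 10,000 is an assumed round number, and the formula says nothing about the large variance around each mean.

```python
import math


def expected_age_generations(freq, n_effective):
    """Mean age of a neutral allele at frequency `freq` (Kimura & Ohta 1973).

    Valid only for a constant-size, random-mating population; returns
    the expected age in generations, not the age of any one allele.
    """
    if not 0.0 < freq < 1.0:
        raise ValueError("freq must be strictly between 0 and 1")
    return -4.0 * n_effective * freq * math.log(freq) / (1.0 - freq)


N_EFFECTIVE = 10_000  # assumed effective population size

for freq in (0.001, 0.01, 0.1, 0.5):
    age = expected_age_generations(freq, N_EFFECTIVE)
    print(f"frequency {freq:>5}: expected age about {age:,.0f} generations")
```

    Under these assumptions a rare allele is always assigned a young expected age, whatever its real history, which is why the choice of population model matters so much for this kind of dating.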

    But there's one obvious thing missing from the model that may have a large effect on the frequencies of rare coding variants: Introgression from Neandertals! If we want to know why Europeans have a large store of rare coding variants relative to Africans, their ancient mixture of a small fraction of a very divergent human population is one obvious reason. None of the Neandertal alleles in Europeans today are new; they are all old. But a method that estimates their ages by allele frequency alone will always conclude that these rare Neandertal alleles are very young.

    In the current paper, the relation of frequency and age is derived from simulations that are based on a model of human population history. Like all recent papers that apply a model of human population history, this one is both overcomplicated (lots of parameters for which we have no good estimates) and oversimplified (too few events to accommodate known historical phenomena). Here's the population model used to derive allele ages in the paper:

    Population model from Figure S5 in the supplementary information from Fu et al. 2012

    The parameters for population divergence times and ancient population sizes are estimated from genetic data, so any systematic error will propagate through to the estimation of allele ages. The exclusion of Neandertal introgression in the model really does bias the allele age estimates badly, as Neandertal genes today are mostly rare, and mostly very old. This year's shift in our assumptions about mutation rates (to a much slower rate than previously assumed) will also affect the estimates of the demographic parameters in the model. An older coalescence time for most genes means a larger ancestral effective size for these populations, and much older allele ages when frequency is the estimator.

    Our lab is working very hard on allele ages, and I hope to be able to share some of that work soon.

    This study is not alone in demonstrating the real importance of rare coding variation in human populations. This line of research has substantial value, as it helps to show why so much of the additive genetic variation underlying variation in human phenotypes has not yet been assigned to genes. We know that many traits are heritable by comparing genetic relatives with each other. Finding the genetic loci that explain similarity among relatives is relatively easy when the genes involved are common, because the same gene variants will be shared across many families. But pooling many families doesn't help us find very rare mutations, as these are likely carried only by a few pedigrees even in a very large sample. By showing the large store of rare coding variation, these studies help to establish that much of the genetic variation underlying disease may be there for us to discover, if we change our discovery approach.


    Synopsis: 
    Probing the pattern of rare coding variation in whole exome data.
  • Human population history makes a difference

    Thu, 2012-05-10 16:18 -- John Hawks

    Alon Keinan and Andrew Clark have a short report in the current Science examining the effects of recent human population growth on the expected spectrum of human genetic variation [1]. Population growth skews the variation in a population so that there are many more rare alleles than would be expected in a constant-sized population.

    Why is this? In a constant-sized population, individuals have an average of two offspring who survive to have offspring of their own. Many people have no children at all, or only one, while only a small proportion of people have more than four children. In the constant-sized population, a person born with a new mutation would have a 50% chance of passing it on to each child. In such a population, more than a third (about 37%) of mutations aren't passed on even once. The same fraction are inherited by only one child, and these face the same odds of extinction in the next generation. This isn't natural selection; it is random genetic drift -- and its net result is that most new mutations are lost.
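
    The figure of a bit more than a third can be reproduced under one common simplifying assumption, which the text does not state outright: that the number of surviving offspring per person follows a Poisson distribution with mean two. Each child inherits the mutation with probability one half, so the number of children who inherit it is Poisson with mean one.

```python
import math


def copies_passed_distribution(mean_offspring=2.0, transmission=0.5, max_copies=3):
    """Poisson probabilities that 0, 1, ..., max_copies children inherit a new mutation.

    Assumes Poisson-distributed offspring number; thinning a Poisson count
    by the transmission probability gives another Poisson with the product mean.
    """
    lam = mean_offspring * transmission
    return [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(max_copies + 1)]


probabilities = copies_passed_distribution()
for copies, probability in enumerate(probabilities):
    print(f"inherited by {copies} children: {probability:.3f}")
```

    Under this assumption the chance of leaving no copies and the chance of leaving exactly one copy are equal, both e^-1, which is close to the fraction quoted above.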

    In a growing population, individuals average more than two offspring. Every additional offspring increases the chance that a new mutation will be passed on to the next generation. In other words, more people means less genetic drift. As a population grows, new mutations begin to stack up at low frequencies in the population.
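
    The effect of growth on the loss of new mutations can be sketched with a simple branching process, again assuming Poisson offspring numbers. The probability q that a new neutral mutation is eventually lost satisfies q = exp(m (q - 1)), where m is the mean number of children inheriting it. This ignores the finite size of real populations, so it is an approximation for the early generations after a mutation appears.

```python
import math


def eventual_loss_probability(mean_offspring, transmission=0.5, iterations=10_000):
    """Probability that a new mutation's lineage dies out (Poisson branching process).

    Iterates q <- exp(m * (q - 1)) from q = 0, which converges to the
    smallest fixed point, i.e. the extinction probability.
    """
    m = mean_offspring * transmission
    q = 0.0
    for _ in range(iterations):
        q = math.exp(m * (q - 1.0))
    return q


for mean_offspring in (2.0, 2.2, 3.0):
    loss = eventual_loss_probability(mean_offspring)
    print(f"mean offspring {mean_offspring}: eventual loss probability {loss:.3f}")
```

    With exactly two offspring on average the loss probability approaches one; even a modest excess over two leaves a real chance that the mutation persists.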

    This is a very basic point in population genetic theory, and it interacts in a troubling way with the current generation of sequencing technology. Short-read shotgun sequencing yields a high number of false positive mutations, which must be aggressively filtered out of whole genome data. If we don't filter these out, we will arrive at incorrect conclusions about many aspects of human biology. The simplest means of filtering require some understanding of how many rare mutations you expect to find, in particular how many should be found in only one person in a sample of people. That expectation is different in a growing population, resulting in a potentially large bias.

    Despite an improvement in the accuracy of sequencing technologies, some errors remain unavoidable. For example, with a sequencing error rate of 1 in 10,000 bases, in a sample of 10,000 individuals, each base pair will exhibit two errors on average across the sample and the majority of monomorphic sites will appear polymorphic (most often as a singleton or a doubleton; i.e., with the rare allele present in one or two copies in the sample). On the other hand, strict filtering of the data will lead to missing many rare variants because they are not observed as reliably. Hence, any analysis of large sample sizes must account for the uncertainty inherent in sequencing by considering the variant calls probabilistically, and secondary validation of rare variants by an alternate sequencing procedure is essential.
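
    The arithmetic in that example can be checked directly. The sketch assumes diploid individuals and treats each chromosome's base call at a site as one independent chance of error; those modeling choices are an interpretation of the quoted numbers, not something spelled out in the passage.

```python
def expected_errors_per_site(individuals, error_rate, ploidy=2):
    """Expected number of erroneous base calls at one site across the sample."""
    return individuals * ploidy * error_rate


def prob_site_looks_polymorphic(individuals, error_rate, ploidy=2):
    """Chance that a truly monomorphic site shows at least one erroneous call."""
    return 1.0 - (1.0 - error_rate) ** (individuals * ploidy)


sample_size = 10_000
error_rate = 1 / 10_000

print("expected errors per site:", expected_errors_per_site(sample_size, error_rate))
print("P(monomorphic site appears polymorphic):",
      round(prob_site_looks_polymorphic(sample_size, error_rate), 3))
```

    Two expected errors per site means that most truly invariant sites pick up at least one spurious variant call, matching the "majority of monomorphic sites" in the quoted passage.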

    Keinan and Clark present some models that show how much it matters to consider a growing population compared to the usual null model of constant population size.

    It's so interesting to me to see human geneticists catching up to where anthropologists have been for a long time. Of course, we wrote about the effects of recent population expansions in 2007, noting the apparent acceleration of positive selection in post-agricultural populations ("Why human evolution accelerated") [2].

    Large-scale sequencing projects have moved beyond simply categorizing common genetic variation. They are now at a stage where thousands of individuals need to be examined, to find increasingly rare genetic variations and determine their collective effects on phenotypes. That means that the next version of the 1000 Genomes Project really needs to involve many of us who are directly concerned with human population history. The growth and dynamics of actual historic human populations are going to matter to how we understand their genetic variation and its effects on phenotypes. Fortunately, archaeology and written history can help -- if anthropologists are involved in this work from the start!


    References

    1. Keinan A, Clark AG. Recent Explosive Human Population Growth Has Resulted in an Excess of Rare Genetic Variants. Science. 2012;336(6082):740–743.
    2. Hawks J, Wang ET, Cochran G, Harpending HC, Moyzis RK. Recent acceleration of human adaptive evolution. Proceedings of the National Academy of Sciences, U.S.A. 2007;104:20753–20758. Available from: http://dx.doi.org/10.1073/pnas.0707650104
    Synopsis: 
    Human genetics has reached the point where population history is essential to further progress
  • Kaku cockup

    Thu, 2011-02-17 00:16 -- John Hawks

    I can't bear to watch it again, and I don't see why I should tolerate anyone else having to watch it. But I can't sit quietly while physicist Michio Kaku tells us how human evolution has stopped.

    I'm telling you, don't go watch it. DON'T DO IT!

    Oh, heck, how did that get there?

    Don't press play, whatever you do. I'm warning you.

    Kaku wants to tell you all about how life in the forest used to make us run fast, but now we don't have to do that anymore. He says that life on isolated island continents, like Australia, would rapidly accelerate our evolution. But today jet planes will spread your genes across the world, so our evolution has stopped.

    Or, no, it's not all our evolution that's stopped -- Kaku says that's still going on because our molecules can change. No, it's gross evolution that has stopped. You know, like making our brains twice as big -- that would be gross.

    What about genetic engineering, you ask? Well, Kaku says that changing genes is very painful. And we can't make pigs with wings, so why would we bother? No, many decades from now, humans will look pretty much the way they do now.

    Well, you can't say I didn't warn you. That's today's "Big Think" for you -- timely news you can use. But no flying pigs.

    DERP!

    (via Pharyngula)

  • Spatial dispersal, parallel adaptation, and the "Stooge effect"

    Thu, 2010-10-14 00:06 -- John Hawks

    Peter Ralph and Graham Coop have an interesting paper in the current Genetics, titled, "Parallel Adaptation: One or Many Waves of Advance of an Advantageous Allele?" [1]

    Fisher [2] famously considered the case in which an advantageous allele is spreading through a spatially distributed population, showing that the spread forms a "wave of advance". This work was the foundation for a lot of progress in understanding spatial dynamics of organisms.

    As I discussed in 2008 ("Overstating the obvious"), one of the consequences of the Fisher wave model for human evolution is that advantageous alleles will spread very slowly through the population. During the course of the Holocene, a strongly selected mutation might move only across a radius of a thousand or so kilometers. That provides one explanation for why new advantageous alleles haven't spread very far beyond their points of origin -- they just haven't had time yet.
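
    A back-of-the-envelope version of that estimate uses the standard asymptotic speed of a Fisher wave, sigma * sqrt(2s) per generation, where sigma is the parent-offspring dispersal distance and s the selective advantage. The parameter values below (sigma of 10 km, s of 0.05, 25-year generations, a 10,000-year Holocene) are assumptions chosen for illustration.

```python
import math


def fisher_wave_speed(sigma_km, s):
    """Asymptotic wave speed in km per generation: sigma * sqrt(2 * s)."""
    return sigma_km * math.sqrt(2.0 * s)


def distance_covered_km(sigma_km, s, years, generation_years=25):
    """Radius reached by the wave after `years`, ignoring the start-up lag."""
    generations = years / generation_years
    return fisher_wave_speed(sigma_km, s) * generations


# Assumed values: strong selection, modest dispersal, roughly Holocene-length span.
radius = distance_covered_km(sigma_km=10.0, s=0.05, years=10_000)
print(f"wave speed: {fisher_wave_speed(10.0, 0.05):.2f} km per generation")
print(f"radius after 10,000 years: {radius:.0f} km")
```

    With these inputs the radius comes out on the order of a thousand kilometers; larger dispersal distances or actual population movements would change the answer considerably.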

    Another reason why an allele might not have spread widely is interference from other alleles with similar effects. I mentioned this process last year ("Spatial variation and near-fixed selected alleles"):

    Greg Cochran and I have been discussing this idea for some time. We call it the "Stooge effect". Think of the Three Stooges all trying to run through a door at the same time and getting stuck in the middle. That's what these genes are doing -- all of them are competing to respond to selection, but each is slowed by the presence of the others.

    Ralph and Coop have cleverly combined the "Stooge effect" phenomenon with spatial dispersal. They suppose a case in which two separate advantageous mutations arise in different geographic locations, each affecting the same trait. Each begins to spread independently as a Fisher wave of advance. What happens when they meet?

    As they show, the dynamics in this case give rise to a static equilibrium -- once the "waves of advance" meet, they stop moving, forming a stable boundary. A new favorable mutation makes headway only so long as it has no equally favorable mutation to compete against.

    I like the way they used both analytical approaches and simulations to come to this outcome. The appearance of stable boundaries in a reaction-diffusion system has long been known (demonstrated first by Alan Turing, actually!). But to my knowledge, no one has considered this specific case from an analytical perspective.

    The Fisher equation is not all that simple for most students to work with. If you become familiar with the equation, you will notice the key aspect is that it has two separate components -- a logistic (or reaction) component representing the increase in frequency at a single point in space, and a diffusion component representing the dispersal across space.

    The muscle of the dispersal process comes from the logistic component. Without the intrinsic growth of the selected allele, the dispersal of individuals along the boundary would not carry many copies of the selected allele into new geographic areas. If the local selective advantage dies, the wave of advance rapidly stalls. A static equilibrium arises, with the frequency of the selected allele forming a cline that correlates with the local selection pressure.
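
    The two components can be seen at work in a small numerical sketch. It assumes the usual one-dimensional form of the equation, dp/dt = (sigma^2 / 2) d2p/dx2 + s p (1 - p), and integrates it with a plain explicit finite-difference scheme. The grid, time step, starting condition, and parameter values are arbitrary choices for illustration, not anything taken from Ralph and Coop.

```python
def simulate_front(s, sigma=1.0, generations=400, length=300, dx=1.0, dt=0.5):
    """Explicit finite-difference run of dp/dt = (sigma^2/2) d2p/dx2 + s p (1 - p).

    Returns the rightmost position where the allele frequency is still
    at least 0.5, or 0.0 if no position reaches that frequency.
    """
    diffusion = sigma ** 2 / 2.0
    cells = int(length / dx)
    # The allele starts fixed in the leftmost 10 cells and absent elsewhere.
    p = [1.0 if i < 10 else 0.0 for i in range(cells)]
    for _ in range(int(generations / dt)):
        nxt = []
        for i in range(cells):
            left = p[i - 1] if i > 0 else p[i]            # no-flux edges
            right = p[i + 1] if i < cells - 1 else p[i]
            laplacian = (left - 2.0 * p[i] + right) / dx ** 2   # diffusion term
            growth = s * p[i] * (1.0 - p[i])                     # logistic (reaction) term
            nxt.append(p[i] + dt * (diffusion * laplacian + growth))
        p = nxt
    front_cells = [i for i, freq in enumerate(p) if freq >= 0.5]
    return front_cells[-1] * dx if front_cells else 0.0


with_selection = simulate_front(s=0.05)
without_selection = simulate_front(s=0.0)
print("front position with selection:   ", with_selection)
print("front position without selection:", without_selection)
```

    Setting s to zero leaves only diffusion: the initial block of copies spreads out and thins, and no wave forms. With selection switched on, the same dispersal carries a front steadily across the grid, which can be compared against the asymptotic speed sigma * sqrt(2s).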

    Ralph and Coop's model approximates this case, in a dynamical sense. Each new selected mutation forms an increasing zone in which the selective advantage of other mutations is zero. When those other mutations encounter this zone, they form a stable cline. The cline is stable in the short term, but the diffusion component still disperses copies of an allele; they just lack the muscle to continue their deterministic expansion.

    The most interesting simulations by Ralph and Coop show the two-dimensional case, in which the stable boundaries emerge in a "tessellation" pattern.

    Figure 6 from Ralph and Coop (2010), showing "tessellations" in 2-d simulations of waves of advance.

    The lower three panels in the figure show the stability of the boundaries between the selected alleles. They proceed to fixation locally, but their dispersal stops where they come into contact with other adaptive alleles. Over the very long term, the population will mix -- the diffusion process will slowly carry all these alleles throughout the species' range. Look at the process after a million generations and the entire zone will be gray. But this dispersal occurs at the neutral rate, where the diffusion term is the only factor driving the dispersal.

    What about humans?

    My graduate student Zach Throckmorton and I have been working in this area for a while now. One of the things that impresses us is the way that much more interesting dynamics can emerge when you alter the assumptions. I learned some of this stuff by talking to Frank Livingstone, who gave a lot of thought to these issues of spatial dispersal and selection as applied to malaria resistance alleles.

    In particular, Frank thought about the case where one allele has a slightly larger advantage than another. In some contexts, this allows the "better" allele to overtake and swamp the expansion of the "weaker" (but nonetheless adaptive) one. In others, the two come to a near standstill, one displacing the other only very gradually. Much depends on the timing of the two mutations and the local conditions controlling their initial dispersal.

    Ralph and Coop briefly consider this case in their paper, noting that the difference in fitness advantage of two alleles will allow one to advance into the range of the other, albeit at a slower rate. In humans, we may be seeing a smaller subset of cases, where one or more of the alleles have not yet established a wavefront. In these cases, the arrival of another wave can disrupt the spatial pattern of the rarer allele. The diploid case gives rise to the possibility of more complex epistases. Well-defined boundaries between selected alleles are rare, and where they occur (as may be the case with HbC and HbS in Africa), many have focused on negative epistasis as an explanation.

    Also, alleles are unlikely to substitute perfectly for each other. In many cases, they may work synergistically -- individuals carrying two selected alleles that affect the same function may outperform those carrying only one such allele. At some point, new selected mutations may start to have diminishing returns, even on a trait like skin pigmentation where dozens of alleles may have been selected in widespread human populations. So the current distribution may to some extent be "frozen", but by a more complicated dynamic than the simple intersection of waves of advance.

    As Coop and colleagues showed last year [3], and we discussed in 2007 [4], there are really only a few genes that have approached local fixation in recent human evolution. The current spatial pattern of recently selected alleles doesn't look like a tessellation with many alleles near local fixation. Over most of the Old World, it looks like populations have a very large number of very new alleles, far from fixation, and few up over 70 percent in frequency.

    So the specific scenario in this paper by itself probably does not explain the overall empirical pattern in humans. But if we consider the current pattern as a transient, approximating the early stages of dispersal for many selected alleles, we may not be terribly far off the mark.

    Mutation-limited evolution

    This is a long, dense paper and there's a lot in it. One further aspect of the paper that I think is essential is the way that Ralph and Coop reiterate the basic point that more people means more mutations. In their case, they focus on population density over space (which, multiplied by area, gives population number) as a constraint on the number of possible adaptive mutations. They apply this idea as a hypothesis to account for parallel adaptations that may have emerged in recent human evolution.

    Multiple mutational origins are likely if the characteristic length is shorter than the physical dimensions of the region. Eurasia measures >8000 km across, and so Table 1 suggests that multiple origins at a single base pair are very unlikely at the lower population density. On the other hand, if the mutational target is large, then multiple origins are likely at low densities, while at high densities independent origins are ubiquitous. The complementary cases of (rho = 2, µ = 10⁻⁸) and (rho = 0.002, µ = 10⁻⁵) give identical characteristic lengths of 3000 km, although the timescale on which the mutations spread differs. Thus for these two parameter combinations we can expect a few mutations to dominate within continents and for multiple mutations to be common in a population spread across an area the size of Eurasia. Obviously these calculations are very crude, as population densities vary through space and time, and dispersal across continents is not simply a function of geographic distance and individual dispersal. Nevertheless, these calculations suggest that it is plausible that for adaptive traits with reasonable mutational targets (e.g., a change anywhere within a gene or pathway) even low population densities can lead to parallel adaptation across an area the size of Eurasia, and higher densities almost certainly will.

    We note that as human population densities have increased dramatically over time, so too has the probability of parallel adaptation. It is interesting therefore to note that a number of recent human adaptations (e.g., sickle cell alleles) involve repeated changes at very small mutational targets in relatively small geographic areas, while older adaptations from single changes (e.g., skin pigmentation) are more broadly spread.

    They are describing a scenario in which small human populations would have been mutation-limited -- that is, the number of new mutations is small, making it unlikely that adaptive mutations will happen in any given generation. In such populations, the rate of adaptation is limited by the availability of new mutations. In an extreme -- in the very small effective sizes of Pleistocene human populations -- the rate of adaptation may be extremely slow and regional populations may come to differ at many weakly selected loci, which spread very slowly.

    As the population grows, strongly adaptive mutations become more and more likely to happen somewhere in the species' range. Yet they are still relatively rare -- meaning that they have an opportunity to spread fairly far before encountering another equally strongly selected mutation affecting the same trait.

    This process can give rise to very large differences on a continental scale, even when the selection pressures in different regions do not differ. In humans, the dispersal of selected alleles across space may have been significantly accelerated by actual dispersals of populations. It is not a mere coincidence that very widespread alleles in Eurasia also tend to be much older than 20,000 years old -- long-distance dispersals prior to that time had a higher chance of leaving a lasting influence on subsequent populations.

    But as the population gets bigger and bigger, parallel mutations are more and more likely to happen. As Ralph and Coop point out, at the extreme of large population size and likely mutations, you shouldn't see any new mutations emerging and spreading over very large areas. Any of these mutations would be very likely to encounter other new mutations that do the same thing.

    Is this likely in humans? Clearly some mutations have happened recurrently. Making a broken gene is easy -- there's a large mutational target, since a large fraction of nonsynonymous substitutions might do the job. So if there's a net selective advantage to breaking a gene, we ought to see that happen recurrently in human populations.

    In contrast, if the mutational target is very small, then mutations will still be rare even in a very large population. If only one base change can have an adaptive effect, that precise change will happen less than once in 10⁹ births (remember that not just any mutation at a site, but some particular mutation is what we may need). If a rare duplication or gene conversion is the necessary change, then it may be much rarer.

    Looking across the last few million years, when human population numbers were much smaller than the Holocene, we can be pretty sure that some aspects of our evolution were mutation-limited. The changes that took hold in our ancestors were the ones that happened, and that survived the winnowing of genetic drift. Many changes that would have been adaptive didn't happen in our ancestors. They just weren't lucky enough.

    But some of those changes would still be adaptive now, if we could get them. And we have had much larger numbers in the last 10,000 years. Homo erectus needed these mutations, but we only now are seeing them selected in the human population.

    Malaria adaptation

    Hemoglobinopathies are among the cases of easy mutations -- where breaking a gene is adaptive. It's not just any broken version of alpha- or beta-globin that does the job, though. The hemoglobin needs to be impaired in certain ways to impede the parasites while maintaining blood function. This provides many of the classic cases of human adaptation, and Ralph and Coop turn to this system for examples of parallel adaptation:

    The sickle cell allele HbS at the β-globin gene in humans provides a particularly interesting case of putative parallel adaptation. The HbS allele (β6 Glu-Val) has been driven to intermediate frequencies by selection within the past 10,000 years due to increased resistance to malaria of heterozygotes for the allele (HALDANE 1949; ALLISON 1954; CURRAT et al. 2002; KWIATKOWSKI 2005). The HbS allele is present on at least four major distinct haplotypes in Africa, each at intermediate frequency within a different geographic region; the haplotypes are named after the population sample where they were first discovered (Central African Republic, Senegal, Benin, and Cameroon). This is consistent with multiple origins of this single-base-pair change. Note that a distinct, malaria resistance allele, HbC (β6 Glu-Lys), has also arisen in Africa at the same codon as the HbS allele (TRABUCHET et al. 1991; AGARWAL et al. 2000; WOOD et al. 2005a), increasing our confidence that the mutational input was high enough to allow multiple types to arise. However, FLINT et al. (1998) thought the hypothesis of multiple new mutations arising at a single base pair was extremely unlikely and proposed that it was more likely that gene conversion had spread a single mutation across multiple haplotypes.

    The theory we have developed can be used to assess the plausibility of the multiple mutational origins of the sickle cell allele, by exhibiting parameter combinations that yield characteristic lengths consistent with the separation of the sample locations. [Recall that the wave of advance, and thus also our model, works in the case of heterozygote advantage (ARONSON and WEINBERGER 1975).] The different HbS haplotypes co-occur within a few thousand kilometers of each other (see Table 5 of FLINT et al. 1998) (noting that these locations are unlikely to reflect the geographic mutational origins, and mutations will have been spread by large population movements). As the HbS changes occur at a single base pair, the mutation rate would have been 10^-8, and we take an s = 0.05 (as in CURRAT et al. 2002). If human dispersal at that time was well approximated by a Gaussian kernel with sigma = 100 km, then a characteristic length of 1000 km would require an effective density of individuals of rho = 25 km^-2, while if sigma = 10 km, then we would require only rho = 2.5 km^-2. This latter set of parameters does not seem unrealistic, considering our knowledge of population density and dispersal parameters, so our model suggests that the hypothesis of multiple origins is not unreasonable.
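    The two parameter sets in that passage can be compared with a back-of-envelope calculation. The sketch below is not the paper's derivation: it assumes a dimensional form for the characteristic length, (v / (pi * lambda))^(1/3), where v = sigma * sqrt(2s) is the classic Fisher wave speed and lambda = 2 * s * mu * rho is the rate at which mutations that escape early loss arise per square kilometer per generation. The function name and the exact constants are assumptions made here for illustration.

```python
import math


def characteristic_length_km(sigma_km: float, s: float, mu: float,
                             rho_per_km2: float) -> float:
    """Rough length scale over which one wave spreads before meeting another.

    Assumed form (a dimensional sketch, not the published formula):
    (wave speed / (pi * rate of established mutations)) ** (1/3).
    """
    # Fisher wave speed in km per generation.
    wave_speed = sigma_km * math.sqrt(2.0 * s)
    # New mutations per km^2 per generation that escape early loss,
    # using the approximate 2s establishment probability.
    established_rate = 2.0 * s * mu * rho_per_km2
    return (wave_speed / (math.pi * established_rate)) ** (1.0 / 3.0)


# Parameter values quoted in the passage above.
wide_dispersal = characteristic_length_km(sigma_km=100.0, s=0.05, mu=1e-8,
                                          rho_per_km2=25.0)
narrow_dispersal = characteristic_length_km(sigma_km=10.0, s=0.05, mu=1e-8,
                                            rho_per_km2=2.5)

print(f"sigma = 100 km, rho = 25 per km^2: {wide_dispersal:.0f} km")
print(f"sigma = 10 km, rho = 2.5 per km^2: {narrow_dispersal:.0f} km")
```

    Under this assumed form the length depends only on the ratio of dispersal distance to density, which is why the two quoted parameter sets give the same answer, several hundred kilometers and so the same order as the 1000 km in the passage.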

    I think they've got the basic idea correct here, but there are some additional details to consider. The distribution of HbE is not quite so easy to understand if parallel mutations are really so likely, and of course there is the negative epistasis of different alleles (and the thalassemias) which impacts their dispersal ability when they become moderately common. The dynamic may be of similar form to the one described here, but boundaries between alleles may be reinforced by the fitness costs of carrying multiple ones.

    This situation raises the issue of path dependence. Some mutations have "first mover" advantages. Once they are common, other adaptive mutations may still occur -- even mutations that are better from the standpoint of fitness -- but be lost or grow very slowly because their net fitness advantage over the common mutant is slight. Where HbE is common, new HbS alleles are unlikely to invade quickly. Where HbS is common, new HbE mutants are similarly unlikely to invade -- even though HbE has a higher fitness.

    Network effects among genes may also dominate the spatial dynamics. HbS spread most widely in the context of populations that were already Duffy null, and in which G6PD deficiency was rapidly increasing. The first conditioned the parasite environment -- P. vivax had a strong disadvantage in Duffy null populations, so P. falciparum made up most of the parasite load. G6PD deficiency should have impacted the relative advantage of HbS, more and more as it became more common. Those are two loci among many that alter malaria dynamics in Africa compared to South and Southeast Asia.

    Conclusions

    There is much more to say about this paper -- it's 22 journal pages. But I think I've given an impression of what's there and how the ideas may impact our interpretation of recent human evolution. Many of the central concepts were presaged by earlier work in 2007 and 2008, as reviewed here on the blog. The new analytical and simulation work, I really like.

    Hopefully we can get out some shorter papers that will focus on aspects of these problems as applied to humans. A message that comes across very clearly in our work and this new paper is that different time periods in our evolutionary history must have had very different selection dynamics. Pleistocene humans were not only in a different ecology than us, they experienced a radically lower potential for adaptation.


    References

  • Recent selection, the new paradigm

    Mon, 2010-07-19 23:15 -- John Hawks

    Nicholas Wade gives some recent highlights of research into ongoing selection in humans.

    We are at the center of this research [1], as we connected the widespread pattern of positive selection to human demographic history -- a growing population, with major ecological changes, has both the pressure and opportunity to respond by new adaptive mutations. The result was an acceleration of the rate of positively selected mutations, so that a large proportion of the genome shows evidence of ongoing selective sweeps in one or more human populations. So I'm excited to see the continuing interest in this topic.

    According to Wade's account, the initial skepticism of many geneticists to this idea seems to have mostly evaporated. I think that much of the caution was reasonable conservatism -- few people expected to see such widespread effects of selection. Only those of us who were thinking of the changes in the Neolithic and later were really prepared to interpret the evidence. But now, the sheer accumulation of studies has shown that our initial estimates may have been too conservative.

    About 21 genome-wide scans for natural selection had been completed by last year, providing evidence that 4,243 genes — 23 percent of the human total — were under natural selection. This is a surprisingly high proportion, since the scans often miss various genes that are known for other reasons to be under selection. Also, the scans can see only recent episodes of selection — probably just those that occurred within the last 5,000 to 25,000 years or so. The reason is that after a favored version of a gene has swept through the population, mutations start building up in its DNA, eroding the uniformity that is evidence of a sweep.

    Unfortunately, as Joshua M. Akey of the University of Washington in Seattle, pointed out last year in the journal Genome Research, most of the regions identified as under selection were found in only one scan and ignored by the 20 others. The lack of agreement is “sobering,” as Dr. Akey put it, not least because most of the scans are based on the same Hap Map data.

    From this drunken riot of claims, however, Dr. Akey believes that it is reasonable to assume that any region identified in two or more scans is probably under natural selection. By this criterion, 2,465 genes, or 13 percent, have been actively shaped by recent evolution. The genes are involved in many different biological processes, like diet, skin color and the sense of smell.

    That's 13 percent with statistical evidence in two or more studies. Keep in mind that our present sample size is small enough that we can't reject the hypothesis of genetic drift on things that have frequencies lower than ten percent in a given population. So probably the variants we know about are the tip of a larger iceberg of rare selected variants, which originated within the last few thousand years and haven't had time to increase to higher frequencies. Some may have stalled out at lower frequencies, because of epistases or changes in the environment.
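    A quick way to see why recently arisen variants are still rare is to track a favored allele's frequency over time. This is a minimal deterministic sketch, assuming genic selection with a constant advantage s per copy and ignoring drift entirely; the starting frequency, the value of s, and the function name are hypothetical choices for illustration.

```python
import math


def generations_to_reach(target: float, start: float, s: float) -> int:
    """Generations for a favored allele to rise from `start` to `target`.

    Deterministic genic selection: each generation the allele's odds
    p / (1 - p) are multiplied by (1 + s). Drift is ignored.
    """
    p = start
    generations = 0
    while p < target:
        p = p * (1.0 + s) / (1.0 + s * p)
        generations += 1
    return generations


# Hypothetical case: a single new copy among one million diploid people,
# with a 5 percent advantage per copy.
start_frequency = 1.0 / (2 * 1_000_000)
advantage = 0.05

to_ten_percent = generations_to_reach(0.10, start_frequency, advantage)
to_half = generations_to_reach(0.50, start_frequency, advantage)

print(f"generations to reach 10%: {to_ten_percent}")
print(f"generations to reach 50%: {to_half}")
```

    With these assumed values the allele needs on the order of 250 generations to reach ten percent, which is several thousand years at 20 to 30 years per generation. A weaker advantage or a larger population lengthens the wait.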

    The proportion of affected genes should approach some asymptote, as lower-frequency variants will be likely to hit the same gene categories again and again. Diet, skin color, smell, disease, brain, all systems that have been under strong selection pressure in recent human evolution. That may provide a promising way to uncover functional relationships among genes. Wade's description of Anna Di Rienzo's work seems to be along those lines.

    Many workers seem to realize now that humans don't live in hunter-gatherer environments. But a disappointment for me is that the article doesn't discuss the role of demography in generating this unique evolutionary pattern. Demography provides an important filter on the results of genome-wide analyses, also. The power of statistical methods is not uniform across different ages of adaptive alleles. Some methods miss older events while all methods miss very recent ones.

    Statistical power is an important reason why some studies find more evidence of selection in Europe and East Asia compared to Africa. The demography of those regions means that Africa has a broader distribution of ages of positively selected mutations: more older events, fewer events corresponding to the peak population growth of early agriculturalists.

    There is some stuff in the article about "soft sweeps" -- the hypothesis that much recent phenotypic change may result from selection on standing genetic variation in ancient populations. An allele that already existed neutrally in the population can come under new selection, and that kind of selection won't trigger the criteria for genome-wide selection scans.

    I have some thoughts about this phenomenon that I'll write up and share. We know that there were some big phenotypic changes in the Late Pleistocene and early Holocene, and initially these changes should mostly have involved standing genetic variation. New adaptive mutations were coming into these populations at a relatively slow rate. When a new mutation is still rare, it doesn't have much impact on the average phenotype in the population. So if we see a fast change to the average phenotype, we know that new mutations aren't responsible, at least not initially.

    But it doesn't take very many genes to cause phenotypic changes. And if small populations have few new adaptive mutations, they also have relatively little standing variation. So the importance of soft sweeps to our evolution may be great, even if their numbers are ultimately small.


    References

    1. Hawks J, Wang ET, Cochran G, Harpending HC, Moyzis RK. Recent acceleration of human adaptive evolution. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2007;104:20753–20758. Available from: http://dx.doi.org/10.1073/pnas.0707650104
  • Using the Neandertal genome to uncover human evolutionary history

    Sun, 2010-05-16 11:20 -- John Hawks

    Before the Neandertal genome release last week, I was reading (thanks to a correspondent) an essay that James Noonan wrote for the current Genome Research. The piece, titled "Neanderthal genomics and the evolution of modern humans," is well worth reading. It's a snapshot of what we might reasonably have anticipated would come out of the efforts to sequence Neandertal genomes, without the punchline -- no recognition that we would ultimately turn out to have Neandertal genes.

    It will take a while for paleoanthropologists to come to any kind of informed opinion about the importance of the current genome results. The quotes I've gathered from various newspaper sources include a pretty wide range of silly ideas. Maybe some of mine fall in that category. But generally I try to be informed by both archaeology and genetics, and I find that tends to avoid some of the silliest statements.

    Note however, there is really no excuse at all for archaeologists saying silly things about the archaeological record.

    Noonan's point of view is that of a mainstream geneticist, and is clearly stated. It represents a widespread school of thought about Neandertal genetics, but (understandably) is mostly uninformed by the archaeological record. For example,

    The primary motivation behind generating a Neanderthal reference genome is to determine how distinct modern humans really are from all earlier versions of humanity. We are the only remaining human species, and thus we do not know if Neanderthals or our other extinct relatives shared our capacity for invention, abstract reasoning, or language. We have had to speculate on these matters based on the bones, the settlements, and the artifacts Neanderthals left behind. The question of modern human and Neanderthal biological similarity is particularly compelling given the recent common ancestry of both species: Based on both genomic and mitochondrial sequence comparisons, the lineages leading to modern humans and Neanderthals likely diverged in Africa ∼300,000–700,000 yr ago (Krings et al. 1997; Serre et al. 2004; Green et al. 2006, 2008; Noonan et al. 2006). This genetic evidence has become folded into a narrative of modern human and Neanderthal evolutionary history that continues to frame comparative studies of both species. In its simplest form, the modern human and Neanderthal lineages continued on parallel evolutionary tracks subsequent to their divergence, with the descendants of one branch migrating to Europe and giving rise to Neanderthals, and the other branch remaining in Africa and eventually producing us (White et al. 2003; Mellars 2004; Hublin 2009; Tattersall 2009). The modern human colonization of Europe ∼40,000 yr ago potentially brought both lineages back into widespread contact (Mellars 2004).

    Given their very recent common ancestry, how much did the species have in common at this point? Were modern humans and Neanderthals capable of interbreeding, and, if so, did it happen to any appreciable extent? Or were the species so different that no meaningful exchange of information could occur?

    Well, you know my answer to those questions.

    I quoted this part because I think the earlier part of the passage deserves comment. Will the genetics tell us more about the cognitive relations of Neandertals and their contemporaries? Maybe eventually, but for the time being there is a tremendous void in our understanding of functional genetics. We really know nothing about the relationship of genetic variants to the "capacity for invention, abstract reasoning or language."

    Compare the situation to "personalized genomics." If we sequence somebody's genome and find new variants, for the most part we have no way of predicting what they do. And even when the genes have functionally apparent properties -- for example, a stop codon -- there still may be no practical way to test the hypothesis that a variant influences a given phenotype.

    The archaeological record is actually pertinent to cognition in a way that the genetic evidence isn't yet. That doesn't mean we have many answers -- we're still groping in the dark. But if I want to know about the evolution of human cognition, the archaeology is a much better place to start.

    What we know about the archaeology seems very clear: Most of the things that later MSA Africans did, Neandertals also did. There were differences, which may have been important -- but those differences don't exceed the variation of material culture in later human populations.

    That doesn't rule out that Neandertals may have been cognitively different from us in some important ways. But when we look at the complexity of the material record within Africa, I think it is fair to say that Neandertal behavior fits comfortably within the continuum represented by MSA people. "Behavioral modernity" is broadly shared, and doesn't clearly track lines of biological differences. Rachel Caspari and Sang-Hee Lee's work on mortality differences is another concrete illustration of the ways that material culture and behavior do not track with anatomy in these populations.

    In the short term, the most important influence of understanding the Neandertal genome will be what it tells us about phylogenetics and demographic history. That is what got all the attention last week, and will continue to occupy many of us in the next few months.

    Even though the news of interbreeding is fascinating, working out the phylogenetic relationships of Pleistocene humans is only a first step towards understanding their evolutionary history. Noonan focuses on strategies for uncovering which genetic changes were important to recent human and Neandertal phenotypic evolution. In this respect, the essay could serve as an introduction to the two papers released in Science last week. It explains a bit about why the Neandertal genome is useful for uncovering functional changes in the human genome, and what may prove useful to drive this inquiry further. For example, from near the end of the essay:

    These studies illustrate a general strategy toward an understanding of biological differences between modern humans and Neanderthals, in which the first step is the reverse genetic analysis of genes and gene regulatory elements showing human-specific or Neanderthal-specific sequence changes. In this approach, changes in basic molecular functions, such as enhancer activity, protein-DNA interactions, or receptor-ligand binding affinity are identified in synthetic assays. The phenotypic consequences of these molecular changes can then be assessed in mouse models: A recent study describing the introduction of a "humanized" version of FOXP2 into the mouse genome by gene targeting is one early example (Enard et al. 2009). The data from such studies, combined with a growing body of information on human gene function, the effects of genetic variation on human phenotypes, and comprehensive efforts to functionally annotate the human genome, would provide the foundation for more sophisticated hypotheses concerning the biological similarity of modern humans and Neanderthals than can be generated from the paleoanthropological record alone.

    Now, in light of last week's data release, we know some things about these general topics. The evolution of human-specific changes in conserved regions, for example, apparently mostly preceded the human-Neandertal common ancestor. There are few amino acid changes in recent (post-Neandertal) evolution that have become fixed worldwide -- the new studies counted only 88. There are only 212 estimated selective sweeps not present in the Neandertal genome.

    Those are manageable numbers.

    Of course, we shouldn't underestimate how hard it will be to untangle the interactions among these human-specific changes. It may require testing not each change one by one, but many possible combinations of the changes, since we don't necessarily know their order. And it is not only the fixed changes that are important to morphological and behavioral evolution; polymorphisms will also be important. Among those polymorphisms will be later, strongly selected changes that may substantially modify the "fixed" substitutions -- in a few cases, may even reverse them.

    But this isn't a hopeless prospect anymore, it's a practical research program. The genetic changes that are nearly fixed in living people but absent in Neandertals represent one of the earliest -- possibly the first -- instances of geographic isolation and selection in Homo sapiens. They are one aspect of a pattern that has become increasingly important in later human populations, as the pace of adaptation has accelerated beyond the ability of gene flow to disperse adaptive alleles. Reconstructing this history will tell us about the shared evolutionary dynamics of humans and Neandertals, and the ecological particularities that may have made both populations phenotypically different.

    References:

    Noonan JP. 2010. Neanderthal genomics and the evolution of modern humans. Genome Res 20:547-553. doi:10.1101/gr.076000.108

  • Selection's genome-wide effect on population differentiation

    Sun, 2010-03-28 08:30 -- John Hawks

    Alon Keinan and David Reich [1] have tested an obvious prediction of the hypothesis that recent selection has had a major effect on variation across the genome, and in doing so have provided some strong support for our hypothesis of a recent acceleration.

    A new mutation that increases rapidly under positive selection will carry with it a lot of nearby variants that are physically linked to it. The region of this "genetic hitchhiking" will depend on the local rate of recombination -- the lower the recombination rate, the longer the extent of the hitchhiking region.

    Meanwhile, a new mutation takes a while, sometimes many thousands of years, to spread widely beyond its population of origin. We can measure population differences for a single locus as FST. The FST attained by a new selected variant depends on what frequency it has reached in different populations. Many selected alleles have not yet attained high frequencies anywhere, and so their FST is low. But a few selected variants have reached a high frequency in some populations while remaining rare elsewhere. These are recognizable as high FST loci.
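    The dependence of FST on allele frequency can be made concrete with a small calculation. This is a minimal sketch using Wright's heterozygosity-based definition for one biallelic locus in two equally weighted populations; the allele frequencies are made-up examples and the function name is hypothetical. It is not the estimator used in any of the studies discussed here.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's FST for a biallelic locus in two equally weighted populations.

    FST = (H_total - H_within) / H_total, where H is the expected
    heterozygosity 2 * p * (1 - p).
    """
    p_mean = (p1 + p2) / 2.0
    h_total = 2.0 * p_mean * (1.0 - p_mean)
    if h_total == 0.0:
        # The locus is not variable in either population.
        return 0.0
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


# Made-up frequencies: a young selected allele that is rare everywhere,
# and one that is common in one population but still rare in the other.
rare_everywhere = fst_two_populations(0.02, 0.01)
common_in_one = fst_two_populations(0.80, 0.05)

print(f"rare in both populations: FST = {rare_everywhere:.4f}")
print(f"common in one population: FST = {common_in_one:.4f}")
```

    In this sketch the allele that is rare in both populations contributes almost nothing to FST even though its frequency differs twofold between them, while the allele that is common in only one population gives a high value.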

    What is true of the selected allele itself will also be true, to a lesser extent, of the linked haplotype that is hitchhiking along with it. And so, if selection has been sufficiently common in recent human history, there should be a relationship between the local rate of recombination and measures of population differentiation like FST.

    Which is exactly what Keinan and Reich found.

    Further, they found that this relationship is true of regions of the genome that contain a lot of coding loci, and much less true of gene-poor regions:

    We cannot envision any demographic or mechanistic explanation that would produce a correlation between recombination rate and allele frequency differentiation as observed and we hypothesize that our observations reflect a history of natural selection. Natural selection is usually expected to increase population differentiation at linked neutral sites, an effect that is expected to extend over longer physical distances in regions of lower recombination rate. A prediction of an explanation based on natural selection is that the effect would be more marked in regions that are more likely to be influenced by selection, such as genes.

    The observed FST in these categories is not super-high -- we're not looking predominantly at genes for which more than 20 percent of the variation is the between-population component. Therefore, the comparison can encompass quite a bit more of the selected variation in the genome, instead of the extremely stringent cutoffs required to identify an individual candidate gene. It's a bit like getting a measure of wind speed as opposed to looking at the few highest-flying kites.

    Perhaps the most interesting aspect of the study is that they compared the Phase 3 HapMap samples, which include some pairs of nearby populations. They found that the apparent effect of selection on population differentiation was much higher for those nearby pairs of populations:

    In addition to qualitatively replicating our findings, analysis of HapMap 3 data allows us to generalize them to additional populations. A striking result is that the relationship between FST and recombination rate is stronger for FST between pairs of closely-related populations, whether within or outside Africa: FST between a West African sample and Maasai (of mixed West African and East African ancestry [57]) decreases by an average of 6% for every 1 cM/Mb (Figure 4D), FST between Italians and individuals of North-Western European ancestry decreases by 10% for every cM/Mb (Figure 4E), and FST between Japanese and individuals of Chinese ancestry decreases by 4% (Figure 4E). In view of the large effective population size in recent human history since each of these pairs of populations have split, these observations support the possibility that the different patterns observed between different pairs of populations are due to natural selection operating more efficiently in the context of larger population sizes.

    That's a direct sign, in other words, of the recent acceleration of positive selection in human populations. There are a lot more genes that are geographically circumscribed and low in frequency affecting FST at a more localized level, and fewer affecting major allele frequencies between continental regions. It's a neat comparison, and it helps to answer the comment that selection is somehow "weak", or insignificantly different from drift, because the new selected alleles haven't spread very far. The point is, most of them are so new that they haven't had time to disperse widely and reach appreciable frequencies very far from their origins.

    UPDATE (2010-03-28): A reader pointed out an error in the post; I had written "lower" recombination rate at one point that should have been "higher". I have corrected the text.


    References

  • Genes and archaeology

    Tue, 2010-02-23 21:33 -- John Hawks

    Current Biology has released a special issue titled "Global genetic history of Homo sapiens". There is much of interest in this issue, with seven papers, mostly regionally focused in different parts of the world, but one paper by Jonathan Pritchard and colleagues discussing recent adaptive evolution.

    The geneticists to varying extents in this volume depend on archaeological observations, but in many cases read the archaeology very selectively. Speaking as someone who takes archaeology seriously, I find this very frustrating. With more genetic data, we need to demand

    An editorial by archaeologist Colin Renfrew leads off the special issue ("Archaeogenetics -- towards a 'new synthesis'?").

    Today, we have an abundance of data about the genetic variation of living people that we did not have ten years ago. In addition to our samples from living populations, we are beginning to find a trove of information about ancient people, from DNA extracted directly from skeletal material. But despite the attempts of geneticists and (rather pitifully few) archaeologists, I don't see a "new synthesis" emerging.

    Reading the first paragraph of his editorial, it seems to me that Colin Renfrew agrees:

    It seems a timely moment to review human population history of the five continents as it emerges from recent archaeogenetic studies, as summarised in the reviews of this special issue of Current Biology. Has the ‘new synthesis’ — between genetics, archaeology and linguistics — arrived which I, perhaps incautiously, heralded a few years ago [1]? These highly informative reviews document, it seems to me, both achievement and uncertainty: the achievement relates to the remarkably consistent picture which has now emerged about the out-of-Africa emergence of our own species Homo sapiens and the initial peopling of the Earth. The uncertainty involves the application of archaeogenetics to the more recent, Holocene period, when most of the planet was already peopled — except much of Oceania — and sedentary, farming-based communities emerged. Here, it appears that much of our current understanding still depends on archaeological or, sometimes, linguistic evidence. And, with a few exceptions, the archaeogenetic evidence has not yet been assimilated into a genuine synthesis; but, let us begin with the good news.

    I find it a markedly bad sign that Renfrew thinks the best of "archaeogenetics" is the part with the least archaeological evidence. If the genetics doesn't seem to work where there is abundant archaeology, why should we believe the genetics in cases where the archaeology is poor?

    I write that quite seriously, as someone engaged directly with the genetics. It's too easy to make stuff up. How can you test a hypothesis that seems consistent with genetic data? The obvious approach is to try to falsify the hypothesis with archaeological observations -- but sadly, archaeology is often pitifully silent on the subject of demography and gene flow, or there are many scenarios equally consistent with the same archaeological record.

    In the Holocene, archaeology has a lot of power to rule out hypotheses about demography and population movement. So this is where I want to see serious attempts to falsify archaeological models using genetics. And that's what we're starting to get! The finding from ancient DNA that early European farmers were neither closely related to earlier hunter-gatherers nor to later agriculturalists has been very surprising. It seems to reject the hypothesis that today's gene distributions come from an initial dispersal of farmers with their Indo-European languages -- the European component of the so-called "language-farming hypothesis".

    Why? Well, because a later massive genetic change suggests that the language transition may well have happened a lot later (as suggested by much of the linguistic evidence itself), and the mtDNA haplotypes carried by the early European farmers have no clear relationship to Near Eastern or central Asian populations.

    It's no surprise that Colin Renfrew would find disagreements with this genetic work; he's the biggest supporter of the "language-farming hypothesis".

    But I think that the current situation is very healthy. Geneticists are testing hypotheses and showing them to be false. At the same time, they're proposing models that archaeology can easily show to be false. For example, many recent evaluations of adaptive evolution have looked for genetic outliers against a "neutral" population model that involves very small Holocene population size. From the genetic perspective, this small population size assumption is conservative -- it means that some genuine cases of adaptive evolution will look less statistically significant. But archaeology can actually inform us about these cases. Any scenario in which the Holocene population was smaller than millions of individuals must be false. In many cases, a less conservative model is in order.

    I think there are tremendous opportunities for integrating adaptive evolution with our understanding of demography. I don't put a lot of faith in the current storyline about genetics and the earlier part of prehistory. That story will continue to develop as we deepen our understanding of the demographic and adaptive factors that have shaped human genetic variation within the last 50,000 years.

    References:

    Renfrew C. 2010. Archaeogenetics -- towards a 'New Synthesis'? Curr Biol 20:R162-R165. doi:10.1016/j.cub.2009.11.056
