john hawks weblog

paleoanthropology, genetics and evolution

theory

  • Anthropological Theory and Ethnography

    Mon, 2012-05-28 22:24 -- John Hawks
    Synopsis: 
    The homepage for Anthropological Theory at the University of Wisconsin-Madison

    Welcome to the homepage for Anthropology 300, Anthropological Theory and Ethnography

    Here you'll find all the readings, links and essential materials for the course. This homepage is a relatively simple outline of the course requirements and information from the syllabus, with a schedule of classes linked to readings as they become available.

    This course is based on readings and discussion. There are only ten sessions total, so attendance and participation are crucial to learning the material.

    If you like to follow RSS feeds, the site has a feed with all the material listed below, updated automatically as new items are posted. The feed is not in outline format, but you can follow it just as if it were a course newsletter with all the readings.

    Part 1: Culture

    Part 2: Evolution

    Part 3: Structure and function

    Part 4: Transfer of information

    Part 5: Status and power

  • Quote: E. E. Evans-Pritchard on social anthropology and humanities

    Fri, 2012-04-06 17:10 -- John Hawks

    From "Social anthropology: Past and present" [1]:

    The thesis I have put before you, that social anthropology is a kind of historiography, and therefore ultimately of philosophy or art, implies that it studies societies as moral systems and not as natural systems, that it is interested in design rather than in process, and that it therefore seeks patterns and not scientific laws, and interprets rather than explains. These are conceptual, and not merely verbal, differences. The concepts of natural system and natural law, modelled on the constructs of the natural sciences, have dominated anthropology from its beginnings, and as we look back over the course of its growth I think we can see that they have been responsible for a false scholasticism which has led to one rigid and ambitious formulation after another. Regarded as a special kind of historiography, that is as one of the humanities, social anthropology is released from these essentially philosophical dogmas and given the opportunity, though it may seem paradoxical to say so, to be really empirical and, in the true sense of the word, scientific.

    This passage is often cited in anthropological theory courses as an early statement of how cultural anthropology came to be seen by its practitioners as an interpretive and fundamentally humanistic discipline. The end of the passage, in which Evans-Pritchard predicts that the social anthropologists of the future will mainly be humanists, is indeed a polemic for an interpretive approach. But his argument for humanism is not actually anti-science in today's terms; instead it is anti-normative.

    As he described the agenda of a humanistic anthropology, Evans-Pritchard effectively described what later would be known as "historical science". Evolutionary biology, for example, is fundamentally historical rather than experimental. "Laws" are a part of evolutionary biology only in the sense that they may provide useful generalizations about the outcomes of historical (and contingent) natural processes. After the passage above, Evans-Pritchard described a research agenda for social anthropology basically akin to evolutionary biology:

    What more do we do, can we do or should we want to do in social anthropology than this? We study witchcraft or a kinship system in a particular primitive society. If we want to know more about these social phenomena we can study them in a second society, and then in a third society, and so on, each study reaching, as our knowledge increases and new problems emerge, a deeper level of investigation and teaching us the essential characteristics of the thing we are inquiring into, so that particular studies are given a new meaning and perspective. This will always happen if one necessary condition is observed: that the conclusions of each study are clearly formulated in such a way that they not only test the conclusions reached by earlier studies but advance new hypotheses which can be broken down into fieldwork problems.

    You can see that Evans-Pritchard equated a scientific approach with a positivist approach. In those days, the equation was not unreasonable. Although philosophers of science had long been probing alternatives to positivism, most working scientists -- and particularly anthropologists and archaeologists -- used a kind of naive positivist epistemology. In Evans-Pritchard's view, this kind of inquiry had tainted anthropological inquiry throughout its history by encouraging anthropological hubris. If anthropologists could find and understand natural laws of culture, they could improve the effectiveness of social policy.

    This normative element in anthropology is, as we have seen, like the concepts of natural law and progress from which it derives, part of its philosophical heritage. In recent times the natural-science approach has constantly stressed the application of its findings to affairs, the emphasis in England being on colonial problems and in America on political and industrial problems. Its more cautious advocates have held that there can only be applied anthropology when the science is much more advanced than it is today, but the less cautious have made far-reaching claims for the immediate application of anthropological knowledge in social planning; though, whether more or less cautious, both have justified anthropology by appeal to utility. Needless to say, I do not share their enthusiasm and regard the attitude that gives rise to it as naive. A full discussion of it would take too long, but I cannot resist the observation that, as the history of anthropology shows, positivism leads very easily to a misguided ethics, anaemic scientific humanism or - Saint Simon and Comte are cases in point - ersatz religion.

    If the lecture had stopped here, it might have been remembered as an early statement in favor of anthropology as a humanistic science, rather than as humanities opposed to science. The lecture was nine years before the famous "Two cultures" lecture by C. P. Snow, but obviously takes a similar theme. But Evans-Pritchard did not take the daring route of redefining anthropological science. Instead, he observed that most future anthropologists would no longer be drawn from the sciences at all (emphasis added):

    There is, however, an older tradition than that of the Enlightenment with a different approach to the study of human societies, in which they are seen as systems only because social life must have a pattern of some kind, inasmuch as man, being a reasonable creature, has to live in a world in which his relations with those around him are ordered and intelligible. Naturally I think that those who see things in this way have a clearer understanding of social reality than the others, but whether this is so or not they are increasing in number, and this is likely to continue because the vast majority of students of anthropology today have been trained in one or other of the humanities and not, as was the case thirty years ago, in one or other of the natural sciences. This being so, I expect that in the future there will be a turning towards humanistic disciplines, especially towards history, and particularly towards social history or the history of institutions, of cultures and of ideas. In this change of orientation social anthropology will retain its individuality because it has its own special problems, techniques and traditions. Though it is likely to continue for some time to devote its attention chiefly to primitive societies, I believe that during this second half of the century it will give far more attention than in the past to more complex cultures and especially to the civilizations of the Far and Near East and become, in a very general sense, the counterpart to Oriental Studies, in so far as these are conceived of as primarily linguistic and literary -- that is to say, it will take as its province the cultures and societies, past as well as present, of the non-European peoples of the world.

    Not a bad prediction. Evans-Pritchard did not anticipate that Orientalism would give rise to a backlash, and that anthropology would become much more reflexive and inward-looking, focused on subcultures within Western societies nearly as much as non-European peoples. But the field's actual history followed from Evans-Pritchard's basic prediction about the students of the future. Anthropology began to draw students who did not speak the language of science, and thus became more humanistic. The human sciences always have had use for cultural information, drawing in anthropologists concerned with psychological and sociological interests, but leaving students in anthropology often as a residue of those with more humanistic than scientific interests.

    A science of culture could be, and was partially, constructed along the lines of a historical science as Evans-Pritchard nearly described, but that science has been attempted more often in psychology or biology than in anthropology.


    References

  • Measuring differences between populations

    Mon, 2011-11-28 00:28 -- John Hawks
    Synopsis: 
    Fst and its relationship to the number of migrants among populations

    When individuals mate locally, different populations tend to diverge from each other in the frequencies of their alleles. Genetic differences between populations are therefore differences in allele frequencies — and these differences in allele frequencies may have consequences in terms of phenotypic or adaptive differences. But not every difference in allele frequencies is equal. When populations encompass great genetic variation, large differences in allele frequencies still leave much overlap — the individuals in the different populations may not be very different from each other. In contrast, slight differences in allele frequencies might be very important between populations that are not variable, because individuals from different populations might then differ far more from each other than individuals within the same population do.

    Geneticists measure the differences between populations by comparing the difference in allele frequencies to the amount of variation within the populations. When people mate with their neighbors, they tend to become more inbred — that is, they are more likely to mate with distant relatives. This means that people will tend to have greater genetic similarity than they would have if they mated equally with people who were born across the world.

    The increase in the level of inbreeding due to low gene flow is often summarized by a statistic called FST, which relates the increase in inbreeding in the subpopulation to that in the total population. When gene flow is high, FST is low, and vice versa. FST represents the proportion of differences between two individuals taken randomly from two subpopulations that are due to the differences in allele frequency between subpopulations alone. Other differences between the individuals are those that could be found between individuals taken randomly from the same subpopulation. FST therefore provides a comparison between the between-subpopulation and within-subpopulation components of genetic variation.

    The relationship of FST and migration between populations. When the forces causing genetic divergence between subpopulations are balanced by gene flow, the reduction of heterozygosity within subpopulations is a function of the number of people who move between subpopulations each generation, expressed by FST = 1 / (1 + 4Nm).

    Comparing human populations taken from different continents, FST is between 0.1 and 0.15, meaning that only between 10 and 15 percent of genetic differences between individuals are attributable to their geographic origins. This difference is relatively small compared to many other large mammal species spread among different continents, such as wolves or bears [1]. This level of similarity among human populations means that they have shared high levels of gene flow in the past. However, the meaning of these numbers depends on the relationship of gene flow and the other evolutionary forces.

    Because they are opposite in direction, gene flow and genetic drift will reach an equilibrium over time. At equilibrium, FST = 1 / (1 + 4Nm), where Nm is the number of migrants moving into each subpopulation per generation. Neglecting the forces of selection and mutation, then, an FST of 0.1 for human continental populations means an average of about 2 migrants (Nm = 2.25) have been entering each continent per generation for a long period of time. Many more than two people are moving from place to place today, so one prediction of this relationship is that the level of genetic differences among continents will decrease in the future. In the face of this gene flow, it is likely that most of the differences in allele frequencies that persist in humans are in fact affected by selection. Indeed many of the most obvious differences, related to physical appearances in different places, appear to bear this out.
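
    The equilibrium relationship can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes the same idealized drift-migration balance as the text, neglecting selection and mutation.

```python
def fst_from_migrants(nm: float) -> float:
    """Equilibrium FST for Nm migrants entering each subpopulation per generation."""
    if nm < 0:
        raise ValueError("Nm must be non-negative")
    return 1.0 / (1.0 + 4.0 * nm)


def migrants_from_fst(fst: float) -> float:
    """Invert FST = 1 / (1 + 4Nm) to recover Nm."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in the interval (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    # The range of FST values quoted for human continental populations.
    for fst in (0.10, 0.15):
        print(f"FST = {fst:.2f} -> Nm = {migrants_from_fst(fst):.2f}")
```

    Because only the product Nm appears in the formula, the same number of migrants yields the same equilibrium FST whether the subpopulations are large or small.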


    References

    1. Templeton AR. Human races: a genetic and evolutionary perspective. American Anthropologist. 1998;100:632–650.
    Study questions: 
    1. If the present FST among human continental groups is consistent with two migrants among populations each generation, what do you predict will happen to human FST in the future?
    2. It is remarkable that genetic drift and migration balance each other at a given number of actual individuals migrating, so that large and small populations are held in equilibrium by the same number of migrants. Are there any differences between large and small populations?
  • From genes to numbers: effective population sizes in human evolution

    Mon, 2011-08-22 16:02 -- John Hawks
    Research authors: 
    Publication information: 

    This is a pre-review manuscript version of the book chapter published in Recent Advances in Paleodemography, J-P Bocquet-Appel, ed., Springer, doi:10.1007/978-1-4020-6424-1_1 (citation information)

    Work status: 

    This manuscript represents the completed work before peer review. It is posted here in accordance with the Springer copyright agreement. All citations and references to this work should direct readers to the final published version in the edited volume by Jean-Pierre Bocquet-Appel.

    Abstract: 

    The effective population size has become a central aspect of our understanding of the ancient structure of human populations. It is through this concept that the genetic variation of present-day humans may inform us about the number and relationships of humans in the past. However, effective population size itself is not a demographic parameter. If the theoretical model does not apply accurately to human evolution, then inferences based on the estimates of effective population size may be in error. Here, I present the theoretical basis of effective population size, including many of the demographic and evolutionary conditions that can confound the relationship of genetic variation and population size.

    Demography is the engine of evolution. Changes in allele frequencies require differential births and deaths of the individuals who carry the alleles. Under natural selection, these births and deaths approximate a deterministic process favoring the survival and reproduction of carriers of a particular allele. The histories of alleles themselves are demographic phenomena: the fitness advantage of a selected allele may be expressed as a relative intrinsic growth rate; its frequency over time follows a logistic growth curve.
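
    As a minimal sketch of the logistic curve mentioned above, the snippet below assumes the simplest case: continuous time and a constant selective advantage s acting on the allele itself (genic selection). The function names and parameter values are illustrative assumptions, not taken from the chapter.

```python
import math


def allele_frequency(p0: float, s: float, t: float) -> float:
    """Logistic trajectory of an allele with intrinsic growth advantage s.

    Solves dp/dt = s * p * (1 - p) with p(0) = p0.
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    odds_against = (1.0 - p0) / p0
    return 1.0 / (1.0 + odds_against * math.exp(-s * t))


def time_to_frequency(p0: float, s: float, target: float) -> float:
    """Time needed for the allele to rise from p0 to the target frequency."""
    logit = lambda p: math.log(p / (1.0 - p))
    return (logit(target) - logit(p0)) / s


if __name__ == "__main__":
    p0, s = 0.01, 0.05  # illustrative starting frequency and advantage
    for t in (0, 50, 100, 150, 200):
        print(f"t = {t:3d}  p = {allele_frequency(p0, s, t):.3f}")
```

    The trajectory is slow at low frequency, fastest near a frequency of 0.5, and slow again as the allele approaches fixation.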

    In the absence of selection, allele frequencies vary as a stochastic process. The parameters influencing this process are themselves demographic: population size and mating pattern. Ultimately, the rate of evolution of a population must be constrained by these parameters. This means that the observable genetic characteristics of populations are to some extent natural estimators of demographic characteristics. The relationship between the demographic parameters of a population and its genetic characteristics may in some cases be approximated by a single parameter: the "effective population size." Effective population size refers the demographic complexity of some real population to the simplicity of some ideal population --- in other words, it is a measure of the extent to which a natural population corresponds to some theoretical population model.

    The effective population size has become a central aspect of our understanding of the ancient structure of human populations. It is through this concept that the genetic variation of present-day humans may inform us about the number and relationships of humans in the past. However, effective population size itself is not a demographic parameter. If the theoretical model does not apply accurately to human evolution, then inferences based on the estimates of effective population size may be in error. Here, I present the theoretical basis of effective population size, including many of the demographic and evolutionary conditions that can confound the relationship of genetic variation and population size.

    The Wright-Fisher model

    The mathematical theory of population genetics was developed early in the twentieth century, principally by Ronald A. Fisher, Sewall Wright, and J. B. S. Haldane [1]. The initial success of population genetics was the development of a mathematical account of inheritance that reconciled Mendelian inheritance with continuous traits [2]. This development made possible a deterministic model of Darwin's natural selection in terms of change in gene frequencies [3][4][5]. However, the deterministic model depends on differential equations that are strictly true only in an infinite population. In a finite population, stochastic factors also change gene frequencies. The evolution of natural populations is caused by a hierarchy of factors, some of which are deterministic in their effect on the gene frequency, others predictable only in their variance, and yet others unique or idiosyncratic [6]. The importance of the stochastic factor was considered by both Fisher (1930) [4] and Wright (1931) [5]; their disagreement about its importance became a major focus of theoretical population genetics.

    Many phenomena in finite populations may amplify or dampen stochastic change in gene frequencies. In an infinite population, the variance in the time or number of events such as births, deaths, and matings does not matter to the gene frequency. Absent selection or mutation, an infinite population does not evolve. In a finite population, variance in the times or numbers of births, deaths, and matings causes evolution even in the absence of selection and mutation, as gene frequencies fluctuate slightly from generation to generation. Other factors may increase or decrease the variance in births, deaths or matings, such as assortative instead of random mating, high variance in mating success, or inbreeding instead of outbreeding.

    In the course of several publications, Wright and Fisher explored the stochastic factor by application of a simple population model (e.g. [5][4]), which became known as the Wright-Fisher model. In this model, the population consists of N diploid individuals. These individuals mate randomly, die immediately upon reproduction, and are monoecious (i.e., no sex-specific effects of alleles, selfing possible). The population therefore contains 2N genes in each generation, which are assumed to be sampled randomly from the 2N genes in the preceding generation, with replacement.

    The main feature of this model is that it is mathematically tractable. The gene frequency in any given generation is a binomial random variable based on the frequency in the previous generation [7]. The expectation of a gene frequency p_t is simply its frequency in the preceding generation, p_{t-1} --- that is, no change in frequency on expectation. The variance in the gene frequency is equal to p_{t-1}(1 - p_{t-1})/(2N) --- this variance is larger for smaller N and for gene frequencies near 0.5. The probability of fixation of a given allele is equal to the initial frequency of the allele, so that the fixation probability of a newly introduced mutation is 1/(2N). Likewise the probability that two genes taken at random in the population are descendants of a single parent gene is 1/(2N). The model is a Markov process in which the transition matrix (the probabilities of p_t given p_{t-1}) has a maximum nonunit eigenvalue equal to 1 - 1/(2N). As can be seen from these relations (summarized in Ewens 2004 [7]), stochastic evolution in the Wright-Fisher model is determined by the single parameter of population size --- indeed, the model assumes all other possible factors constant.
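
    These relations are easy to check by simulation. The following is a minimal sketch rather than a reference implementation: it uses only the Python standard library, the function names are hypothetical, and the population sizes and replicate counts are arbitrary choices made to keep the run short.

```python
import random
from statistics import mean, pvariance


def next_generation(p: float, n_diploid: int, rng: random.Random) -> float:
    """Sample 2N genes with replacement from a parental pool at frequency p."""
    copies = 2 * n_diploid
    count = sum(1 for _ in range(copies) if rng.random() < p)
    return count / copies


def one_step_moments(p: float, n_diploid: int, replicates: int, seed: int = 1):
    """Mean and variance of the frequency after one generation of drift."""
    rng = random.Random(seed)
    draws = [next_generation(p, n_diploid, rng) for _ in range(replicates)]
    return mean(draws), pvariance(draws)


def fixation_fraction(p0: float, n_diploid: int, replicates: int, seed: int = 2) -> float:
    """Fraction of replicate populations in which the allele reaches fixation."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        p = p0
        while 0.0 < p < 1.0:  # run until the allele is lost or fixed
            p = next_generation(p, n_diploid, rng)
        fixed += p == 1.0
    return fixed / replicates


if __name__ == "__main__":
    p, n = 0.3, 50
    m, v = one_step_moments(p, n, replicates=5000)
    print(f"mean {m:.4f} (theory {p:.4f})")
    print(f"variance {v:.5f} (theory {p * (1 - p) / (2 * n):.5f})")
    print(f"fixation {fixation_fraction(0.25, 10, 1000):.3f} (theory 0.250)")
```

    The simulated mean should stay close to the starting frequency, the variance close to p(1 - p)/(2N), and the fixation fraction close to the initial frequency, within sampling error.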

    Mutation may be added to the model, at a rate u per gene, in which case the expected number of new mutations in any given generation is 2Nu [4]. When mutations are included in the model, it is possible to derive expectations for sample characteristics such as the frequency spectrum of alleles and the probability of gene identity [8]. Such values involve the parameter θ=4Nu, which indicates that mutation and finite population size are inversely related stochastic factors: A small population with a high mutation rate may have similar sample characteristics to a large population with a low mutation rate.
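
    A short numerical illustration of this trade-off follows. It is a sketch under an added assumption: expected heterozygosity is computed with the standard infinite-alleles result H = θ/(1 + θ), which is not stated in the text above, and the population sizes and mutation rates are invented for the example.

```python
def theta(n_diploid: float, mu: float) -> float:
    """Scaled mutation parameter theta = 4Nu for N diploid individuals."""
    return 4.0 * n_diploid * mu


def expected_heterozygosity(theta_value: float) -> float:
    """Equilibrium heterozygosity under the infinite-alleles model (assumed here)."""
    return theta_value / (1.0 + theta_value)


if __name__ == "__main__":
    # Hypothetical populations: small with fast mutation, large with slow mutation.
    scenarios = {"small N, high u": (1_000, 1e-5), "large N, low u": (100_000, 1e-7)}
    for label, (n, mu) in scenarios.items():
        th = theta(n, mu)
        print(f"{label}: theta = {th:.3f}, H = {expected_heterozygosity(th):.4f}")
```

    Both scenarios give the same θ and therefore the same expected heterozygosity, which is why samples of genetic data cannot separate N from u without outside information about one of them.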

    No natural population reproduces according to this simple model. However, the model gives rise to calculations of the expectation and variance of many genetic characteristics that might be empirically observed in natural populations. Wright (1931) considered that deviations from the simple model might be treated in terms of their effects on sample characteristics. In this respect, a nonideal population with N individuals might behave in a similar way to the ideal population of some different size, Ne, which he termed the "effective population size." The effective population size of a study population is therefore the number of individuals in an ideal Wright-Fisher model with the same sample characteristics as the nonideal population under study.

    But from the considerations above, it is evident that different sample characteristics depend differently on population size in the Wright-Fisher model. In particular, the probability of identity of two randomly chosen genes depends on the probability of inbreeding (1/(2N) in the Wright-Fisher model), while the change in gene frequency over time depends on the variance in gene frequency (p_{t-1}(1 - p_{t-1})/(2N) in the Wright-Fisher model). Departures from the Wright-Fisher model may affect these two values in different directions. For example, assortative mating may greatly increase the probability of gene identity without greatly affecting the allele frequency. This insight can be important to conservation, since inducing assortative mating may allow more effective selection against deleterious recessives without materially reducing the frequencies of other genes [9].

    Evidently, a single "effective" population size cannot summarize all departures from the Wright-Fisher model: natural populations are not described by a single stochastic parameter. For this reason, three distinct concepts of effective population size are often considered. The inbreeding effective population size is the size of the Wright-Fisher population with the same probability of inbreeding as the study population. The variance effective population size is the size of the Wright-Fisher population with the same variance in gene frequencies as the study population. The eigenvalue effective population size is the size of the Wright-Fisher population in which the maximum nonunit eigenvalue is the same as the study population. It is important to note that "study population" here may refer to an empirically observed natural population, or it may apply to a population model. It is also worth noting that population models other than the Wright-Fisher model are sometimes considered, such as the Cannings model [10] or the Moran model [11]. These models sometimes give rise to different effective population sizes, because the parameterization of population size may differ from the Wright-Fisher version.

    These effective population sizes have different uses. Molecular data provide empirical estimates of sample characteristics such as the probability of gene identity and the frequency spectrum of alleles, both of which depend on the probability of inbreeding. For this reason, the inbreeding effective size is the most relevant for most studies of genetic data. Sometimes inbreeding is relevant to ecological comparisons; in other cases the variance in gene frequencies may be more relevant. In particular, the variance effective size is relevant to conservation because conservation efforts often attempt to assess the rate of gene frequency change [12]. The eigenvalue effective population size is based on the transition probabilities among gene frequencies, with a leading nonunit eigenvalue of 1 - 1/(2N) in the Wright-Fisher model. Like variances in gene frequencies, these transition probabilities are not easily estimable from empirical molecular samples, and the eigenvalue effective size has rarely been applied in human population genetics. However, it is important in modeling and has emerged recently in considerations of metapopulation dynamics (e.g. [13][14]).

    The model-dependence of effective population size is rarely considered in analyses of molecular data. Ewens (2004) [7] gives a good account of the problem:

    Except in simple cases, the concept [of effective population size] is not directly related to the actual size of the population. For example, a population might have an actual size of 200 but, because of a distorted sex ratio, have an effective population size of only 25. This implies that some characteristic of the model describing this population, for example a leading eigenvalue, has the same numerical value as that of a Wright-Fisher model with a population size of 25. It would be more indicative of the concept if the adjective "effective" were replaced by "in some given respect Wright-Fisher model equivalent." Misinterpretations of effective population size calculations frequently follow from a misunderstanding of this fact (Ewens 2004: 37-38) [7].
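
    To give a feel for how a distorted sex ratio can shrink effective size this much, here is a rough sketch using Wright's standard approximation for two sexes, Ne = 4 * Nmales * Nfemales / (Nmales + Nfemales). This is an assumption introduced for illustration: Ewens does not state the sex ratio or the calculation behind his figure of 25, and the breeding numbers below are invented.

```python
def sex_ratio_effective_size(n_males: int, n_females: int) -> float:
    """Wright's approximation for effective size with unequal numbers of breeders."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes need at least one breeding individual")
    return 4.0 * n_males * n_females / (n_males + n_females)


if __name__ == "__main__":
    # A census size of 200 under increasingly skewed (hypothetical) sex ratios.
    for males in (100, 50, 20, 7):
        females = 200 - males
        ne = sex_ratio_effective_size(males, females)
        print(f"{males:3d} males, {females:3d} females -> Ne = {ne:.1f}")
```

    With equal numbers of each sex the effective size equals the census size of 200, while a handful of breeding males among nearly 200 females brings it down to the neighborhood of 25.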

    Changing population size

    The utility of effective population size comes from the fact that it concatenates many separate stochastic phenomena into a single parameter. As an example, a gene frequency is a single value, with a single degree of freedom. It is therefore sufficient to estimate only a single parameter. This approach obviously runs into trouble when more than one stochastic factor varies in the population.

    One of the most troublesome cases is a change in population size. A population that changes in size violates a basic element of the Wright-Fisher population model. Sjodin et al. (2005) [15] assert that "effective population size" is meaningless in the context of most changes in population size, because the allele frequency spectrum, variance in gene identity, and other sample characteristics will be altered in ways that have no equivalent in the Wright-Fisher model. In their view, only changes in size that occur on a different time scale (either much shorter or much longer) than genealogical events can be reconciled with the concept of effective size. Indeed, a survey of the literature on human prehistoric population dynamics shows that changes in size create much confusion, with divergent definitions and concepts of "long-term effective population size."

    Nevertheless, the treatment of changing population size in terms of effective size originated with Wright himself and is well-entrenched. Wright (1938) [16] considered the effect of fluctuating population size on inbreeding, finding that the effective size of a population that fluctuates in size is approximated by the harmonic mean of population size taken across all generations. The harmonic mean is much closer to the smallest of a set of values than the largest; effective population size is generally closer to the minimum population size than the maximum. This is the inbreeding effective population size, which predicts gene identity and other sample characteristics that derive from it, such as allele frequency spectra.
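
    The pull of the harmonic mean toward small values is easy to see numerically. The sketch below is illustrative only; the function name and the census sizes are made up for the example.

```python
def harmonic_mean_size(sizes):
    """Harmonic mean of per-generation population sizes."""
    sizes = list(sizes)
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty sequence of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    # Nine generations at 10,000 individuals and one generation at 100.
    history = [10_000] * 9 + [100]
    arithmetic = sum(history) / len(history)
    print(f"arithmetic mean: {arithmetic:.0f}")
    print(f"harmonic mean:   {harmonic_mean_size(history):.0f}")
```

    In this hypothetical history a single generation at 100 individuals drags the harmonic mean below 1,000, even though the arithmetic mean remains above 9,000.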

    The harmonic mean approximation breaks down as changes in population size become more and more rare or exceptional. For example, we might estimate an "effective size" for a population that has undergone a bottleneck, a period of small population size flanked by periods of larger size, which would be useful for predicting the expected heterozygosity. But the coalescence times of different genetic loci would be much more variable than expected for the corresponding Wright-Fisher population. For many bottlenecks, these times might have a bimodal distribution --- some genes having been fixed by drift during the bottleneck, others having escaped fixation. This bimodal distribution may particularly characterize different gene loci that themselves have different effective numbers, for instance autosomal versus mitochondrial genes [17].

    Simple population growth induces a disequilibrium compared to the Wright-Fisher model, in which the number of new alleles arising by mutation increases more rapidly than the mean difference between individuals [18]. For growing populations, different characteristics of single molecular samples may lead to very divergent estimates of effective population size. For instance, allele number may lead to a large effective population size estimate at the same time that gene identity generates a small estimate. The discrepancy emerges from the temporal scope of inbreeding underlying the two observed values --- some are influenced by population growth more rapidly than others. The disequilibrium itself serves as a test of population growth [18][19].

    Natural selection

    Generally, analyses of effective population size assume neutrality --- that is, they attempt to quantify the stochastic factor in the absence of selection. Natural selection is a deterministic force, which itself is influenced by the stochastic factors in finite populations. Still, genes under selection are influenced by demography. For example, the long-term selective balance affecting many HLA loci has preserved their allelic diversity over millions of years, but the major functional alleles themselves occur on different haplotypes that are neutral relative to each other, and respond to the effective population size [20]. Balancing selection may mask the effects of population growth, or vice versa [21]. And the long-term survival of polymorphisms under selection assumes some demographic prerequisites (Ayala 1995), which may be used to test demographic hypotheses.

    Linkage to selected sites may impact the variation of neutral sites, distorting estimates of effective size. The relationship of recombination rate and genetic diversity may reflect these selective processes [22][23]. "Genetic hitchhiking" is a phenomenon in which neutral sites linked to a positively selected allele show vast reductions in variability [24][25]. Hitchhiking induces disequilibria that resemble those resulting from population growth, naturally because positive selection is the logistic growth of one adaptive allele. Constant purifying selection across the genome can reduce the variation of linked neutral alleles, a phenomenon called "background selection" [26][25]. Gillespie (2000) [27] showed that recurrent positive selection could restrict the variation of weakly linked neutral sites even in a population of infinite size. This gives rise to a stochastic effect called "pseudohitchhiking," which generates an estimate of effective population size even for evolutionary models where it is undefined. If the force is powerful in natural populations, it would greatly restrict genetic variation below the amount expected for the Wright-Fisher population model. Pseudohitchhiking may even generate an "effective population size" for a population of infinite numbers [28].

    As evolutionary factors, both genetic drift (influenced by population size and mating structure) and natural selection influence the genetic variability of natural populations. For any particular locus, these factors may confound each other, so that the reasons for a particular level of genetic variability may not easily be attributed to either. For any bias in the genetic parameters that might result from selection, an equivalent bias may be found as a product of some demographic history. Indeed, this equivalence marks a deep symmetry between the stochastic effects of drift and selection: ultimately, selection is a demographic phenomenon as concerns a particular allele, as opposed to a full population. It has often been assumed that the effects of drift and selection may be clearly differentiated by among-locus analyses --- while selection should affect different functional loci differently, genetic drift should affect all loci in the same way. However, pseudohitchhiking exerts stochastic effects across many loci [27]. This may explain some cross-species comparisons, which show that genetic diversity does not correlate strongly with population size [29], including mtDNA where there is no correlation between population size and diversity across large groups of animal species [30]. The importance of selection in shaping genome-wide variation remains an unresolved question.

    Genetic versus ecological estimates

    From its definition and application to theoretical populations, it should be clear that the utility of "effective population size" is that it provides a way of relating the genetic characteristics of a population to those expected of an ideal population under the Wright-Fisher model. Yet, the genetic characteristics of a population always trail to some extent the demographic and ecological factors that influence them. Because genetic variation "looks to the past" in this way, a discrepancy arises between estimates of effective size based on genes and so-called "ecological" estimates based on observations of demography and behavior.

    Nunney and Elam (1994) [31] reviewed genetic approaches to estimating effective population size, compared to approaches based on field observations of ecology. Genetic approaches are very straightforward: mathematical expressions derived from the Wright-Fisher model generally include population size. Genetic data from a natural population may be entered into these expressions, yielding a solution for population size. This solution is the effective population size, the value of population size in the Wright-Fisher model that corresponds to the observed genetic data. Nunney and Elam (1994) divided genetic approaches into "long-term" and "short-term" methods. Long-term methods track the changes in gene frequencies over time, and require recurrent sampling of populations over timescales long relative to their generation lengths. Such surveys may be plausible for genes that are phenotypically apparent (e.g., coat color polymorphisms), although estimates must ensure that such traits are neutral. Sampling of molecular characteristics is more costly, and tracking gene frequency change in long-lived populations may be impractical; for example, no such study has been performed on a human population. Nevertheless, such long-term studies have great relevance to conservation because they assess the variance effective size. Most important, they estimate the current variance effective size, without being confounded by the cumulative effects of genetic drift in the past.

    The vast majority of studies that estimate effective population size from genetic data are short-term studies. These use the characteristics of a single genetic sample, taken at one time, and the result is generally an estimate of the inbreeding effective size. This estimate entails all of the potential confounding factors that have influenced gene frequencies over a long, long time in the study population; generally over a period spanning four times as many generations as the estimate of effective size. Thus, an estimated effective size of 10,000 individuals is an assertion that the gene frequencies have been changing by drift in a population of this size for a time period on the order of 40,000 generations. Such estimates obviously have weaknesses as applied to conservation: although they may assess the current level of variation, they do not inform about the current rate of change in gene frequencies. Most important, because the potential confounding effects include both ancient demographic changes and ancient selection over a very long time period, these estimates have a necessarily uncertain connection to current or historic demography.
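
    As a rough illustration of how a single-sample estimate works, the sketch below inverts the standard neutral expectation for nucleotide diversity under the Wright-Fisher model, pi = 4 * Ne * mu, and then applies the four-times rule of thumb for time depth described above. The choice of formula, the function names, and the input values (a diversity of 0.001 and a mutation rate of 2.5e-8 per site per generation) are illustrative assumptions, not figures taken from any of the studies cited here.

```python
def effective_size_from_diversity(pi: float, mu: float) -> float:
    """Solve pi = 4 * Ne * mu for Ne (diploid, neutral, Wright-Fisher)."""
    if pi < 0 or mu <= 0:
        raise ValueError("pi must be non-negative and mu must be positive")
    return pi / (4.0 * mu)


def time_depth_generations(ne: float) -> float:
    """Approximate span, in generations, reflected by the estimate (4 * Ne)."""
    return 4.0 * ne


# Illustrative inputs only (assumed, not taken from the text).
ne = effective_size_from_diversity(pi=0.001, mu=2.5e-8)
depth = time_depth_generations(ne)
print(f"Ne ~ {ne:,.0f}; time depth ~ {depth:,.0f} generations")
```

    Because the estimate is simply the ratio of two quantities, any past selection or demographic change that altered the observed diversity passes straight through into the inferred size, which is the weakness the paragraph above describes.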

    For this reason, ecological estimates of effective size may be more satisfactory. Such estimates require observations concerning natural population densities, migration rates, life history, sex ratio and other aspects of mating pattern. The practical interest in conserving natural populations has engendered a substantial body of theoretical work on the relationship between census and effective population sizes, considering variation in these factors. The following list discusses several classes of factors that influence the ratio of effective to census population size. The list is not intended to be comprehensive, but gives a sampling of important phenomena in natural populations and their effects on neutral genetic variation. These factors are considered in terms of their effects on the inbreeding effective population size, although for the most part they influence variance and eigenvalue effective sizes in similar ways.

    Age structure

    Age-structured populations are all those in which death is not coincident with reproduction. For mammals, the reproductive lifespan is relatively long and features intermittent births of single or multiple offspring. This life history pattern leads to an overlap of two or more generations within the population at any given time. Because a large proportion of individuals are either pre- or post-reproductive, the effective population size of an age-structured population is generally half or less the census size [32].

    1. Maturation age: A higher maturation age leads to a higher proportion of nonreproductive juveniles in the population, reducing effective size relative to census size [32][33].
    2. Variance in breeding age: Earlier breeding has a greater effect than later breeding on changes in gene frequencies [4], so that a population with a high variance in reproductive ages will have a reduced effective size.
    3. Postreproductive lifespan: A long postreproductive lifespan increases the number of individuals without increasing the birth rate, reducing effective size relative to census size. Postreproductive helpers may enable a higher birth rate than otherwise possible, but only among those females for which mothers or other postreproductive helpers have survived. In this way, helpers may also tend to decrease effective population size relative to census size.

    Population structure

    Splitting a population into partially isolated subpopulations or groups tends to impede the fixation of alleles in the population as a whole. But if these subpopulations themselves undergo evolutionary stochasticity, then the fate of alleles will be tied to the fate of the subpopulations. When the population behaves as a metapopulation [34], different subpopulations may have greatly different net reproduction, some areas of suitable habitat may be unoccupied, and the fission and subsequent growth of successful subpopulations may dominate the population history [35].

    1. Subpopulations: A population divided into partially inbred subpopulations retains more genetic variation than a panmictic population of the same size. This is a major factor increasing effective population size in geographically dispersed populations.
    2. Isolation by distance: Wright (1943) [36] defined the concept of effective population size in his isolation by distance model to encompass a finite "neighborhood" of spatially proximate individuals. The neighborhood size is used to estimate the inbreeding coefficient for this model, and is much smaller than the total population size.
    3. Source/sink dynamics: A species with static population size may nevertheless occupy geographic areas that differ in productivity. Areas where reproduction is lower than the replacement rate will contribute relatively little to the ancestry of the total population over the long term. The effective population number is reduced by such variation [37][38].
    4. Extinction and recolonization: At an extreme, local groups frequently become extinct and are replaced by colonists from other groups. The population will be derived from a small number of groups at earlier times, which may drastically reduce genetic variation and effective population size [39].
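
    The first item in this list can be illustrated with Wright's island model. Assuming d equal subpopulations of N diploid individuals, each receiving a fraction m of migrants per generation, the textbook approximations F_ST = 1 / (1 + 4Nm) and Ne = Nd / (1 - F_ST) show the effective size exceeding the total count. These formulas and all parameter values are assumptions brought in for illustration; they are not drawn from the works cited above, and they ignore the source/sink and extinction effects in items 3 and 4, which push in the opposite direction.

```python
def island_fst(n_per_deme: int, m: float) -> float:
    """Equilibrium F_ST under the island model: 1 / (1 + 4 * N * m)."""
    if n_per_deme <= 0 or m <= 0:
        raise ValueError("deme size and migration rate must be positive")
    return 1.0 / (1.0 + 4.0 * n_per_deme * m)


def island_effective_size(n_per_deme: int, demes: int, m: float) -> float:
    """Approximate effective size of the whole population: N * d / (1 - F_ST)."""
    total = n_per_deme * demes
    return total / (1.0 - island_fst(n_per_deme, m))


# Hypothetical population: 10 subpopulations of 100 individuals each.
for m in (0.01, 0.1):
    fst = island_fst(100, m)
    ne = island_effective_size(100, 10, m)
    print(f"m = {m}: F_ST ~ {fst:.3f}, Ne ~ {ne:.0f} (census 1000)")
```

    With less migration the subpopulations diverge more and the effective size of the whole rises further above the census count.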

    Family size

    Family size is simply the number of offspring per individual. Under the Wright-Fisher population model, a substantial proportion of individuals have no offspring at all, which makes genetic drift possible. But when the variance in family size exceeds the binomial variance predicted under the Wright-Fisher model, genetic drift may be substantially stronger.

    1. Variation in family size: Low variance in family size tends to increase effective size relative to census size; high variance tends to decrease effective size.
    2. Heritability of family size: If large families generate offspring that themselves tend to have large families, this inheritance can vastly decrease effective population size [40].
    3. Polygyny/polyandry: These mating systems tend to alter effective sex ratio away from 1.0, which increases the variance in family size in the population, and decreases effective population size.
    4. Distribution of family size: The Wright-Fisher model predicts that family size will follow a Poisson distribution [41]; different distributions (e.g., binomial) may increase or decrease effective population size.

    The majority of these phenomena tend to reduce genetic variability below that expected for a Wright-Fisher model of the same population size, although there are several exceptions to this trend. This bias toward factors that reduce variation may emerge as a natural consequence of fitness-seeking by organisms: if given a chance, individuals should tend to increase the representation of their own genes at the expense of other individuals. Equal representation of all individuals in the gene pool — as in the Wright-Fisher model — is an unlikely outcome. Natural factors that deviate from the Wright-Fisher model should often bias the gene pool toward a subset of individuals, which increases both inbreeding and the rate of change of gene frequency.

    Human societies

    No study of a human population has considered more than a handful of the factors that might influence the relation of effective population size and census size. Some of the factors, such as the effect of age structure or migration, are relatively visible in the ethnographic present. In a village census, the demographer can note the ages of respondents and their place of birth. She may be able to determine inbreeding patterns (e.g., cousin marriages) and factors influencing reproductive variance (e.g., polygyny). But longer-term factors such as population extinction and recolonization, imbalanced migration, or fluctuations in population size are generally beyond measuring with ecological or demographic means in humans. Although no study of ecological factors influencing effective population size in humans is comprehensive, each provides important evidence about the constraints that affect gene frequencies and gene identity over the short run. These studies may be evaluated in the context of longer-term genetic data to examine the way that human demography itself may have evolved over time.

    Wood (1987) [42] applied the ecological approach to a human society, using the methods of [32] and [43]. He estimated the ratio of effective to census population size for the Gainj tribe of highland New Guinea, a group of slash-and-burn horticulturalists numbering around 1500 individuals at the time of the study. There were two important departures in this study population compared to the Wright-Fisher model: overlapping generations and a high male reproductive variance. Both features tend to decrease effective size compared to census size; with a census count of 1318 individuals in the study, Wood estimated an effective population size of 650.5, for a ratio of Ne/N of approximately 1/2. In the Gainj, reproductive heterogeneity in males was mainly a result of polygyny. Although the male reproductive variance was approximately three times that of females, this mating structure was estimated to decrease effective population size by a relatively modest 7 percent. Wood also noted that the estimate of approximately 1/2 for Ne/N is substantially higher than the value of 1/3 that had often been taken for humans. He interpreted this discrepancy in terms of reproductive lifespan: in his sample, individuals of reproductive age made up a larger proportion than 1/3 of the population. High infant mortality and higher adult mortality rates tend to increase the ratio of effective to census population size.

    Austerlitz and Heyer (1998) [44] (see also [45]) examined pedigrees from French Canadian families, finding an autocorrelation in family size from one generation to the next. In this population, large families themselves tended to beget large families, leading to a strong reduction in the effective population size. They estimated the harmonic mean size of this growing population to have been ca. 17,000, but the inheritance of family size reduces the effective size to only ca. 1,000 individuals. This leads to an estimate of the ratio of effective to census size well under 1/10. Sibert et al. (2002) [46] found that such intergenerational correlations in family size could affect gene genealogies in a pattern similar to that of population size bottlenecks. It is not known to what extent family size may be inherited in most human populations. The Québécois may be an extreme example where rapid growth is concentrated in large families, or perhaps stationary populations may also have such strong intergenerational correlations.
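
    The ratios quoted in the two studies above are easy to check, and a small sketch also shows why a harmonic mean is the natural summary for a growing population: it is dominated by the small early generations. The series of census sizes below is hypothetical, chosen only to show the behavior; the Gainj and French Canadian figures are the ones reported in the text.

```python
def harmonic_mean(sizes):
    """Harmonic mean of a sequence of positive population sizes."""
    sizes = list(sizes)
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty sequence of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


# Hypothetical population that doubles every generation for six generations.
growing = [1000 * 2 ** g for g in range(6)]
arith = sum(growing) / len(growing)
harm = harmonic_mean(growing)
print(f"arithmetic mean {arith:.0f}, harmonic mean {harm:.0f}")

# Ratios of effective to census size as reported in the text.
gainj_ratio = 650.5 / 1318      # Wood (1987): approximately 1/2
quebec_ratio = 1000 / 17000     # Austerlitz and Heyer (1998): well under 1/10
print(f"Gainj Ne/N ~ {gainj_ratio:.3f}; French Canadian Ne/N ~ {quebec_ratio:.3f}")
```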

    Migration is an important influence on genetic diversity in most human populations. It is very difficult to examine the effect of migration apart from other factors, because migration patterns have depended strongly on local population growth. Cavalli-Sforza (1959) [47] considered the effect of migration on effective population size for village isolates in Parma, Italy. With a unique knowledge of the historical context of migration among these villages, Cavalli-Sforza was able to demonstrate that their present genetic differentiation was a product of their history. This genetic differentiation does not characterize all human populations, but provides an important reason why genetic diversity may exceed estimates based on other demographic observations.

    Social stratification by cultural mechanisms may affect genetic differentiation within and among human groups. A single society with little gene flow from outside will tend to have a reduction in heterozygosity if stratification affects mating, just as for assortative mating and other deviations from panmixia. Estimates of effective population size will be more strongly influenced by differential gene flow into different social strata. For example, Bamshad et al. (2001) [48] found that genetic samples from higher-ranking castes in India tended to share more alleles with Europeans than samples from lower-ranking castes, which share more alleles with other Asians. Since gene flow from different source populations appears to have been correlated with caste, the overall effect of stratification has been to inflate the overall genetic diversity of the population while limiting within-caste variation. Likewise, differences in admixture rates between Africans and other populations within the New World have influenced the genetic diversity of local geographic regions. For example, Parra et al. (2001) [49] assessed the frequencies of genetic markers in African Americans in different parts of South Carolina, finding that European gene flow increased with distance from the Atlantic Coast, and exhibited a historic sex bias. The net effect was an increase in genetic diversity and differentiation with geographic location. Boundaries between living hunter-gatherers and agricultural populations may exhibit differential gene flow that generates similar patterns of differentiation. This may be an important reason for the apparent high genetic diversity of living hunter-gatherer populations within Africa, despite their current small census sizes [50][51].

    Pleistocene human populations

    Ancient human material and skeletal remains have been found across large parts of Africa, Asia, and Europe. By the beginning of the Middle Pleistocene, some 780,000 years ago, ancient humans occupied at least 35 million square kilometers [52][53][54]. This estimate includes large parts of the tropical and subtropical Old World, but excludes constant and periodic desert, rain forest, inundated continental shelf, and the northern tier of steppe and boreal forest. Although there were likely substantial fluctuations in geographic range over time, the estimate of 35 million km² is conservatively low for the past 500,000–800,000 years.

    To arrive at an estimate of population numbers, the geographic range must be multiplied by some population density. The range includes areas with varying resource densities, some of which may have been marginal for ancient hunter-gatherers without projectile weapons or sophisticated organizational strategies [55][56]. Therefore, the population density applied across this entire range would be substantially lower than might have obtained within long-lasting local breeding populations. Observations of population densities in ethnographic hunter-gatherers vary substantially. Weiss (1984) [54] applied estimates of population density based on ethnographic observations in recent Native Australian groups [57][58]. The overall estimate of Australian population density before European contact was approximately 0.28 persons per square kilometer [54]. However, this overall continental estimate includes groups with widely varying ecologies, from those living in subtropical rainforests, to temperate open woodlands or desert. Birdsell (1993) [59] estimated that the range of population densities among Australian groups may have varied from 1 person per square kilometer in areas of dense resource availability to 1 person per 100 square kilometers in marginal desert regions. Applying the minimum estimate of 1 person per 100 km² yields a global census size estimate of 350,000 individuals. This is likely to have been near the minimum of a long-term fluctuating population of Pleistocene humans.

    This figure of 350,000 individuals is an estimate of the global census population size of humans during the Middle Pleistocene. In strong contrast, the effective population size of humans globally during this time period has been estimated from many sources at only 10,000 individuals.
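
    The arithmetic behind the census estimate, and its contrast with the genetic estimate, can be laid out in a few lines. The range, the three densities, and the effective size of 10,000 are the figures given in the text; the labels and variable names are only for illustration, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

RANGE_KM2 = 35_000_000          # occupied range, square kilometers
EFFECTIVE_SIZE = 10_000         # genetic estimate quoted in the text

# Persons per square kilometer, as quoted for Australian groups.
DENSITIES = {
    "marginal desert": Fraction(1, 100),
    "continental average": Fraction(28, 100),
    "dense resources": Fraction(1, 1),
}


def census_estimate(area_km2: int, density: Fraction) -> int:
    """Census size implied by applying one density across the whole range."""
    return int(area_km2 * density)


estimates = {label: census_estimate(RANGE_KM2, d) for label, d in DENSITIES.items()}
minimum = min(estimates.values())
ratio = Fraction(EFFECTIVE_SIZE, minimum)

for label, n in estimates.items():
    print(f"{label}: {n:,} persons")
print(f"Ne / N at the minimum census estimate: {ratio} (~{float(ratio):.3f})")
```

    Even at the lowest density, the effective size is only about 1/35 of the census estimate; at the higher densities the gap is wider still.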

    The earliest studies of variation used protein polymorphisms to arrive at this figure [60][61][62]. Haigh and Maynard Smith (1972) proposed that the slight amount of human polymorphism might be explained by an ancient bottleneck of population size — a period of time during which human populations were very small compared to their present numbers. This hypothesis was later applied to a broader range of protein polymorphism data [29], and then RFLP data from the mitochondrial DNA [63]. Later studies discovered consistent levels of variation for Y chromosome [64] and autosomal genes [65][66]. The Wright-Fisher equivalent of the ancestral human population would have contained 10,000 persons.

    Considering the number of ways that natural populations may differ from the Wright-Fisher model, there might have been many reasons that human populations had such low genetic variation compared to their census numbers. It is important to note that this discrepancy between census and effective sizes characterizes most mammal species to some extent, with carnivores and primates in particular showing low genetic variation compared to their census sizes [29]. A number of phenomena may explain this discrepancy, at the same time providing valuable information about the dynamics of Pleistocene human groups.

    One explanation for low human genetic variation is that ancient population structures resulted in higher inbreeding than typical today. Takahata (1994) [67] applied a model of extinction and recolonization of subpopulations to human evolution. In this model, the human population is assumed to have consisted of small groups that frequently became extinct and were replaced by other groups. Eller et al. (2004) [68] extended the model to demographic parameters drawn from the ranges observed in recent hunter-gatherers. This kind of model can account for a severe reduction in genetic variation compared to the expectations for the census size of a population, because most of the population will be descended from a few ancestors at any earlier time. Considering the fluidity of hunter-gatherer groups, it may be unclear whether a model of recurrent extinctions and low migration is appropriate [69].

    In many other respects, it seems likely that the ratio of effective to census population size actually decreased over time. For example, overlapping generations present more of a limit on genetic variability today than at any time during the Pleistocene, because the human lifespan is much longer [70], generating a much larger number of postreproductive individuals. Likewise, migration distances were greatly reduced after the advent of agricultural economies, increasing the genetic differentiation of local populations from each other.

    A second explanation for low genetic variation relative to census population size is that the census population size used to be much smaller. A bottleneck with a short duration can explain some aspects of human genetic variation, such as the much lower variation of mtDNA and Y chromosome compared to autosomes and the X chromosome [71]. However, a short bottleneck can have only a slight effect on the overall level of genetic variation. A number of researchers adopted the hypothesis that current human genetic variation is the product of a very long history of small population size in equilibrium [72][73][74]. In this view, the reason why human genetic systems have an inbreeding effective size on the order of 10,000 is that the number of breeding individuals in the human species was in fact near 10,000 during most of the Pleistocene. A corollary of this hypothesis is that many ancient human fossils must represent different species not ancestral to any living people; otherwise, their genes should remain with us today and inflate the current level of genetic variation.

    Since the population size is clearly much larger than 10,000 today, the bottleneck hypothesis also requires a massive expansion of population size during the late Pleistocene. It is clear from archaeological data that human populations did expand massively during the Late Pleistocene [75]. But there is little genetic evidence for such an expansion, aside from the mtDNA and Y chromosome [76][21]. Instead, autosomal variation suggests at best a very slight bottleneck during the past 70,000 years [77][78]. And a long-term bottleneck down to as few as 10,000 individuals is inconsistent with anatomical and genetic evidence for gene flow among Pleistocene human populations [79][80][81]. This evidence supports the hypothesis that a substantial proportion of Pleistocene human remains represent ancestors of living people instead of extinct species.

    A third hypothesis is that selection has limited the genetic variation of humans and other species. In order to affect both functional and apparently nonfunctional sites, this selection would involve widespread hitchhiking or pseudohitchhiking. Theoretical models suggest that pseudohitchhiking may explain some empirical results, such as the lack of relationship of mtDNA variation and census size across animal species [30], or the association of genetic diversity and local recombination rate in Drosophila [82]. It is now known that recent selection was very widespread in human prehistory [83][84]. However, there is no strong association of local recombination rate and genetic diversity in humans [85], even though hitchhiking would predict such an association [23].

    None of these three hypotheses yet provides a compelling account of human effective population size. It is clear today that an effective size of 10,000 individuals refers only to a theoretical model that is inaccurate in many possible ways. But we do not know whether a more correct population model would have 30,000 individuals or 300,000, or even more. Therefore, it is not yet obvious whether human genetic variation can inform us about the geographic location or mating systems of ancient people. The few estimators available are very coarse in their resolution. Deciding which factors actually operated on Pleistocene humans remains an active area of theoretical interest.

    Summary

    Effective population size is one of the central concepts of population genetics, but its complexity is seldom fully understood. The concept pertains to an ideal population model, the Wright-Fisher model. The primary purpose of the model is mathematical simplicity, and no natural population conforms to its predictions. However, the model forms a kind of baseline against which the variation in natural populations of the same size can be measured. The genetic evolution of a population is predicted to be constrained by demography in accordance with the effective size. However, at least three different effective population sizes (inbreeding, variance, and eigenvalue) predict different aspects of the genetic evolution of a population.

    Several demographic and evolutionary factors may deviate from the Wright-Fisher model. Most of these tend to reduce effective population size compared to the census size. Of these, the largest effects relevant to human evolution come from fluctuations in population size, hitchhiking due to selection on linked sites, overlapping generations, and between-generation autocorrelation of family sizes.

    Human populations during the Middle Pleistocene and later appear to have had census numbers of 350,000 persons or more. In contrast, human genetic variation is consistent with a Wright-Fisher population of only 10,000 persons. The apparent discrepancy between these values has led to much theoretical and empirical investigation of human genetic variation. At present, the relative importance of demography, selection, and changing environments to human genetic variation during the past million years remains unclear.


    References

    1. Provine WB. The Origins of Theoretical Population Genetics. Chicago: University of Chicago Press; 1971.
    2. Fisher RA. The Correlation Between Relatives on the Supposition of Mendelian Inheritance. Transactions of the Royal Society of Edinburgh. 1918;52:399–433.
    3. Haldane JBS. A Mathematical Theory of Natural and Artificial Selection. Transactions of the Cambridge Philosophical Society. 1927;23:19–41.
    4. Fisher RA. The Genetical Theory of Natural Selection. Oxford: Clarendon Press; 1930.
    5. Wright S. Evolution in Mendelian Populations. Genetics. 1931;16:97–159.
    6. Wright S. Classification of the Factors of Evolution. Cold Spring Harbor Symposia in Quantitative Biology. 1955;20:16–24D.
    7. Ewens WJ. Mathematical Population Genetics. Cambridge, UK: Cambridge University Press; 2004.
    8. Ewens WJ. The Sampling Theory of Selectively Neutral Alleles. Theoretical Population Biology. 1972;3:87–112.
    9. Templeton AR, Read B. Inbreeding: one word, several meanings, much confusion. In: Loeschcke V, Tomiuk J, Jain SK, editors. Conservation Genetics. Basel: Birkhäuser Verlag; 1994. pp. 91–106.
    10. Cannings C. The Latent Roots of Certain Markov Chains Arising in Genetics: A New Approach. 1. Haploid Models. Advances in Applied Probability. 1974;6:260–290.
    11. Moran PAP. Random Processes in Genetics. Proceedings of the Cambridge Philosophical Society. 1958;54:60–71.
    12. Crow JF, Denniston C. Inbreeding and Variance Effective Numbers. Evolution. 1988;42:482–495.
    13. Whitlock MC, Barton NH. The effective size of a subdivided population. Genetics. 1997;146:427–441.
    14. Lehmann L, Perrin N. On Metapopulation Resistance to Drift and Extinction. Ecology. 2006;87:1844–1855.
    15. Sjödin P, Kaj I, Krone S, Lascoux M, Nordborg M. On the Meaning and Existence of an Effective Population Size. Genetics [Internet]. 2005;169:1061–1070. Available from: http://dx.doi.org/10.1534/genetics.104.026799
    16. Wright S. Size of a population and breeding structure in relation to evolution. Science. 1938;87:430–431.
    17. Fay JC, Wu CI. A human population bottleneck can account for the discordance between patterns of mitochondrial versus nuclear DNA variation. Molecular Biology and Evolution. 1999;16:1003–1005.
    18. Tajima F. Statistical method for testing the neutral mutation hypothesis of DNA polymorphism. Genetics. 1989;123:585–595.
    19. Fu Y, Li W. Estimating the age of the common ancestor of a sample of DNA sequences. Molecular Biology and Evolution. 1997;14:195–199.
    20. Takahata N, Satta Y. Footprints of intragenic recombination at HLA locus. Immunogenetics. 1998;47:430–441.
    21. Harpending HC, Rogers AR. Genetic perspectives on human origins and differentiation. Annual Review of Genomics and Human Genetics. 2000;1:361–385.
    22. Nachman MW, Bauer VL, Crowell SL, Aquadro CF. DNA variability and recombination rates at X-linked loci in humans. Genetics. 1998;150:1133–1141.
    23. Nachman MW. Single Nucleotide Polymorphisms and Recombination Rate in Humans. Trends in Genetics. 2001;17:481–485.
    24. Braverman JM, Hudson RR, Kaplan NL, Langley CH, Stephan W. The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics. 1995;140:783–796.
    25. Kim Y, Stephan W. Joint effects of genetic hitchhiking and background selection on neutral variation. Genetics. 2000;155:1415–1427.
    26. Charlesworth B, Morgan MT, Charlesworth D. The effect of deleterious mutations on neutral molecular variation. Genetics. 1993;134:1289–1303.
    27. Gillespie JH. Genetic drift in an infinite population: the pseudohitchhiking model. Genetics. 2000;155:909–919.
    28. Gillespie JH. Is the population size of a species relevant to its evolution? Evolution. 2001;55:2161–2169.
    29. Nei M, Graur D. Extent of protein polymorphism and the neutral mutation theory. Evolutionary Biology. 1984;17:73–118.
    30. Bazin E, Glémin S, Galtier N. Population Size Does Not Influence Mitochondrial Genetic Diversity in Animals. Science [Internet]. 2006;312:570–572. Available from: http://dx.doi.org/10.1126/science.1122033
    31. Nunney L, Elam DR. Estimating the Effective Population Size of Conserved Populations. Conservation Biology. 1994;8:175–184.
    32. Hill WG. Effective Size of Populations with Overlapping Generations. Theoretical Population Biology. 1972;3:278–289.
    33. Nunney L. The influence of mating system and overlapping generations on effective population size. Evolution. 1993;47:1329–1341.
    34. Levins R. Some Demographic and Genetic Consequences of Environmental Heterogeneity for Biological Control. Bulletin of the Entomological Society of America. 1969;71:237–240.
    35. Gilpin M. The Genetic Effective Size of a Metapopulation. Biological Journal of the Linnaean Society. 1991;42:165–175.
    36. Wright S. Isolation by Distance. Genetics. 1943;28:114–38.
    37. Beerli P, Felsenstein J. Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2001;98:4563–4568. Available from: http://dx.doi.org/10.1073/pnas.081068098
    38. Wakeley J. The Coalescent in an Island Model of Population Subdivision with Variation among Demes. Theoretical Population Biology [Internet]. 2001;59:133–144. Available from: http://dx.doi.org/10.1006/tpbi.2000.1495
    39. Maruyama T, Kimura M. Genetic variability and effective population size when local extinction and recolonization of subpopulations are frequent. Proceedings of the National Academy of Sciences, U. S. A. 1980;77:6710–6714.
    40. Nei M, Murata M. Effective population size when fertility is inherited. Genetical Research. 1966;8:257–260.
    41. Hudson RR. Gene genealogies and the coalescent process. Oxford Surveys in Evolutionary Biology. 1990;7:1–44.
    42. Wood JW. The genetic demography of the Gainj of Papua New Guinea. 2. Determinants of effective population size. American Naturalist. 1987;129:165–187.
    43. Emigh TH, Pollak E. Fixation Probabilities and Effective Population Numbers in Diploid Populations with Overlapping Generations. Theoretical Population Biology. 1979;15:86–107.
    44. Austerlitz F, Heyer E. Social transmission of reproductive behavior increases frequency of inherited disorders in a young-expanding population. Proceedings of the National Academy of Sciences, U. S. A. 1998;95:15140–15144.
    45. Gagnon A, Heyer E. Intergenerational Correlation of Effective Family Size in Early Québec (Canada). American Journal of Human Biology [Internet]. 2001;13:645–659. Available from: http://dx.doi.org/10.1002/ajhb.1103
    46. Sibert A, Austerlitz F, Heyer E. Wright-Fisher Revisited: The Case of Fertility Correlation. Theoretical Population Biology. 2002;62:181–197.
    47. Cavalli-Sforza LL. Some Data on the Genetic Structure of Human Populations. In: Proceedings of the 10th International Congress on Genetics. Vol. 1. Toronto: University of Toronto Press; 1959. pp. 389–407.
    48. Bamshad M, Kivisild T, Watkins SW, Dixon ME, Ricker CE, Rao BB, Naidu MJ, Prasad RBV, Reddy GP, Rasanayagam A, et al. Genetic Evidence on the Origins of Indian Caste Populations. Genome Research [Internet]. 2001;11:994–1004. Available from: http://dx.doi.org/10.1101/gr.GR-1733RR
    49. Parra EJ, Kittles RA, Argyropoulos G, Pfaff CL, Hiester K, Bonilla C, Sylvester N, Parrish-Gause D, Garvey WT, Jin L, et al. Ancestral Proportions and Admixture Dynamics in Geographically Defined African Americans Living in South Carolina. American Journal of Physical Anthropology [Internet]. 2001;114:18–29. Available from: http://dx.doi.org/10.1002/1096-8644(200101)114:1%3C18::AID-AJPA1002%3E3.0.CO;2-2
    50. Chen Y-S, Olckers A, Schurr TG, Kogelnik AM, Huoponen K, Wallace DC. mtDNA Variation in the South African Kung and Khwe – and Their Genetic Relationships to Other African Populations. American Journal of Human Genetics [Internet]. 2000;66:1362–1383. Available from: http://dx.doi.org/10.1086/302848
    51. Tishkoff SA, Williams SM. Genetic Analysis of African Populations: Human Evolution and Complex Disease. Nature Reviews Genetics [Internet]. 2002;3:611–621. Available from: http://dx.doi.org/10.1038/nrg865
    52. Hawks JD. The Evolution of Human Population Size: A Synthesis of Fossil, Archaeological, and Genetic Data. 1999.
    53. Biraben JN. Essai sur l'evolution du nombre des hommes. Population. 1979;1:13–25.
    54. Weiss KM. On the number of members of the genus Homo who have ever lived, and some evolutionary implications. Human Biology. 1984;56:637–649.
    55. Whallon R. Elements of cultural change in the later Paleolithic. In: Mellars P, Stringer CB, editors. The Human Revolution: Behavioural and Biological Perspectives on the Origins of Modern Humans. Edinburgh: Edinburgh University Press; 1989. pp. 433–454.
    56. Gamble CS. Timewalkers. The Prehistory of Global Colonization. Cambridge, MA: Harvard University Press; 1994.
    57. Birdsell JB. The human numbers game. In: Supplement No. 9 in Human Evolution. Chicago: Rand McNally; 1972. pp. 291–294.
    58. Tindale NB. Distribution of Australian Aboriginal tribes: a field survey. Transactions of the Royal Society of South Australia. 1940;64:140–231.
    59. Birdsell JB. Microevolutionary Patterns in Aboriginal Australia: A Gradient Analysis of Clines. Oxford, UK: Oxford University Press; 1993.
    60. Nei M. Effective size of human populations. American Journal of Human Genetics. 1970;22:694–696.
    61. Haigh J, Maynard Smith J. Population size and protein variation in man. Genetical Research [Internet]. 1972;19:73–89. Available from: http://dx.doi.org/10.1017/S0016672300014282
    62. Nei M, Roychoudhury AK. Genetic relationship and evolution of human races. In: Hecht MK, Wallace B, Prance GT, editors. Evolutionary Biology. Vol. 14. New York: Plenum; 1982. pp. 1–59.
    63. Cann RL, Stoneking M, Wilson AC. Mitochondrial DNA and human evolution. Nature. 1987;325:31–36.
    64. Underhill PA, Li J, Lin AA, Mehdi QS, Jenkins T, Vollrath D, Davis RW, Cavalli-Sforza L, Oefner PJ. Detection of numerous Y-chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography. Genome Research. 1997;7:996–1005.
    65. Wang DG, Fan J, Siao C, Berno A, Young P, Sapolsky R, Ghandour G, Perkins N, Winchester E, Spencer J, et al. Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science. 1998;280:1077–1081.
    66. The International HapMap Consortium. A Haplotype Map of the Human Genome. Nature [Internet]. 2005;437:1299–1320. Available from: http://dx.doi.org/10.1038/nature04226
    67. Takahata N. Repeated failures that led to the eventual success in human evolution. Molecular Biology and Evolution. 1994;11:803–805.
    68. Eller E, Hawks J, Relethford JH. Local Extinction and Recolonization, Species Effective Population Size, and Modern Human Origins. Human Biology. 2004;76:689–709.
    69. Yellen J, Harpending H. Hunter-Gatherer Populations and Archaeological Inference. World Archaeology. 1972;4:244–253.
    70. Caspari R, Lee S-H. Older Age Becomes Common Late in Human Evolution. Proceedings of the National Academy of Sciences, U. S. A. 2004;101:10895–10900.
    71. Fay JC, Wu C-I. Hitchhiking under positive Darwinian selection. Genetics. 2000;155:1405–1413.
    72. Harpending HC, Sherry ST, Rogers AR, Stoneking M. The genetic structure of ancient human populations. Current Anthropology. 1993;34:483–496.
    73. Harpending HC, Batzer MA, Gurven M, Jorde LB, Rogers AR, Sherry ST. Genetic traces of ancient demography. Proceedings of the National Academy of Sciences, U. S. A. 1998;95:1961–1967.
    74. Sherry ST, Harpending HC, Batzer MA, Stoneking M. Alu evolution in human populations: using the coalescent to estimate effective population size. Genetics. 1997;147:1977–1982.
    75. Stiner MC, Munro ND, Surovell TA. The Tortoise and the Hare: Small-Game Use, the Broad-Spectrum Revolution, and Paleolithic Demography. Current Anthropology. 2000;41:39–73.
    76. Hawks J, Hunley K, Lee SH, Wolpoff MH. Bottlenecks and Pleistocene human evolution. Molecular Biology and Evolution. 2000;17:2–22.
    77. Marth G, Schuler G, Yeh R, Davenport R, Agarwala R, Church D, Wheelan S, Baker J, Ward M, Kholodov M, et al. Sequence Variations in the Public Human Genome Data Reflect a Bottlenecked Population History. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2003;100:376–381. Available from: http://dx.doi.org/10.1073/pnas.222673099
    78. Marth GT, Czabarka E, Murvai J, Sherry ST. The Allele Frequency Spectrum in Genome-Wide Human Variation Data Reveals Signals of Differential Demographic History in Three Large World Populations. Genetics. 2004;166:351–372.
    79. Evans PD, Mekel-Bobrov N, Vallender EJ, Hudson RR, Lahn BT. Evidence that the Adaptive Allele of the Brain Size Gene microcephalin Introgressed Into Homo sapiens from an Archaic Homo Lineage. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2006;103:18178–18183. Available from: http://dx.doi.org/10.1073/pnas.0606966103
    80. Frayer DW, Wolpoff MH, Thorne AG, Smith FH, Pope GG. Getting it straight. American Anthropologist. 1994;96:424–438.
    81. Wolpoff MH, Hawks J, Frayer DW, Hunley K. Modern Human Ancestry at the Peripheries: A Test of the Replacement Theory. Science. 2001;291:293–297.
    82. Betancourt AJ, Kim Y, Orr AH. A Pseudohitchhiking Model of X vs. Autosomal Diversity. Genetics. 2004;168:2261–2269.
    83. Wang ET, Kodama G, Baldi P, Moyzis RK. Global Landscape of Recent Inferred Darwinian Selection for Homo sapiens. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2006;103:135–140. Available from: http://dx.doi.org/10.1073/pnas.0509691102
    84. Voight BF, Kudaravalli S, Wen X, Pritchard JK. A Map of Recent Positive Selection in the Human Genome. PLoS Biology [Internet]. 2006;4. Available from: http://dx.doi.org/10.1371/journal.pbio.0040072
    85. Hellmann I, Ebersberger I, Ptak SE, Pääbo S, Przeworski M. A Neutral Explanation for the Correlation of Diversity with Recombination Rates in Humans. American Journal of Human Genetics [Internet]. 2003;72:1527–1535. Available from: http://dx.doi.org/10.1086/375657
  • Overdominance and rapid adaptation

    Mon, 2011-08-22 13:19 -- John Hawks
    Publication information: 

    This is a pre-publication manuscript. Please contact the authors for information about how to cite this article.

    Work status: 

    This paper is substantially complete but is being transitioned from a LaTeX manuscript by pieces, and so is not yet completely on the website.

    Abstract: 

    We suggest that in diploid organisms, most adaptive mutations of large fitness effect are overdominant. R. A. Fisher's geometric model of adaptation, which has been fruitfully applied to investigate the size and dynamics of adaptive mutations, has previously been limited to the haploid case. By extending it to the diploid case for mutations of additive effect on phenotype, we show that only a small fraction of mutations that bring the heterozygote phenotype closer to the optimum will also bring the homozygote phenotype closer to the optimum. More generally, the probability that a mutation is adaptive in heterozygotes is greater than the probability that it will be equally or more adaptive in homozygotes. Together, these considerations imply that most mutations that are adaptive in heterozygotes will have lower fitness in homozygotes and will therefore reach an equilibrium frequency, not fixation. We show that this theoretical expectation is consistent with the evidence of recent adaptive mutations in human populations, where a significant deficit of fixed or near-fixed selective sweeps has been identified compared to the number of apparent new adaptive gene variants. Partial selective sweeps in human evolution should be much more common than complete selective sweeps.

    Introduction

    The last decade has seen an explosion of interest in ongoing and recent natural selection in human populations. The HapMap project and other surveys of SNPs in human populations revealed many regions in the genome where one SNP allele is surrounded by large regions of linkage disequilibrium while the other is on a more heterogeneous background [1][2][3][4]. The inference has been that the former were undergoing or had undergone a selective sweep, and had experienced an increase in frequency too great to be explained by genetic drift. There have been attempts to construct bottlenecked population histories that would explain these patterns, but the elevated concentration of this pattern in genic regions apparently makes such explanations unlikely [1][2].

    A puzzling finding of this line of research is that the sweeping SNPs are disproportionately at intermediate frequencies rather than near fixation. Part of the explanation is that the technology for discovering these regions usually relies on contrasting the linkage disequilibrium around the putative sweeping SNP with that around the alternative allele, so both kinds of chromosomes must be present. But there is also a "shortage" of fixed genomic regions that show high disequilibrium, as would be produced by complete sweeps.

    Why are there so many incomplete sweeps? Some have suggested that most phenotypic adaptation occurs in the form of "soft sweeps", in which evolution of a phenotype occurs because of slight changes in frequency of many standing genetic variants [5]. But soft sweeps of standing genetic variants do not explain the pattern of long-range LD haplotypes of intermediate frequency. These appear to be young haplotypes that have risen to intermediate frequencies quite rapidly, not old haplotypes that have changed in frequency marginally. One good possibility is what we have called the "stooge effect", after the Three Stooges all trying to get through a doorway at the same time. Selection following an environmental change will favor any allele (either standing variation or new mutation) that provides increased fitness in the new environment, perhaps leading to change at many loci that decreases the fitness advantage of any one advantageous mutation. For example, the fitness advantage of sickling hemoglobin must have been much greater long ago when there were no high frequency genetic responses to malaria. The competition among adaptive alleles may slow down and perhaps stop the sweep of any one new mutant.

    We know that some sweeping alleles, for example some of the hemoglobinopathies, are loss-of-function mutations, broken versions of the ancestral version. They can have strong negative effects in homozygotes, who have no working copy of the gene. Such overdominant mutations are easy to recognize and understand.

    Our analysis (based on Fisher's geometric model) suggests that overdominance is common in newly originated beneficial alleles of large effect, even when the mutation changes gene function rather than reducing or eliminating it. This is a natural consequence of a high degree of pleiotropy. Such overdominant alleles will never fix. While homozygote fitness will be lower than heterozygote fitness in these cases, it will usually not be extremely low: it will not cause death or obvious disease, and so will not be easily observed.

    Fisher's geometric model

    Fisher considered adaptation as an aspect of an organism's phenotype.

    An organism is regarded as adapted to a particular situation, or to the totality of situations which constitute its environment, only in so far as we can imagine an assemblage of slightly different situations, or environments, to which the animal would on the whole be less well adapted; and equally only in so far as we can imagine an assemblage of slightly different organic forms, which would be less well adapted to that environment [6].

    Fisher defined the organism's fitness as its capacity for intrinsic population growth, the 'Malthusian parameter'. He imagined an optimal phenotype from which no change in the phenotype, however slight, could increase the organism's fitness—in some particular environment, obviously. A well-adapted organism would be one whose phenotype was very close to that optimum. In his discussion, he did not distinguish between the case where the optimal form is the phenotype of an individual, or the average phenotype of a population. Fisher used this model as part of his argument that most evolutionary changes are small, suggesting that most individuals would be very near the average phenotype of the population most of the time.

    Fitness can thus be considered as a function of position in a multidimensional phenotype space. Fisher assumed that the distribution of fitness in that phenotype space has a single optimum (point O). Coordinates are normalized so that fitness is a function of the distance from O and declines monotonically as that distance increases.

    Consider a non-optimal phenotype, which we model as a point A in this phenotype space, at a distance d/2 from O. Fisher considered this distance as the radius of a multidimensional sphere centered on O. All the points on that sphere have the same suboptimal fitness; you might say that they are all equally poorly adapted.

    Now imagine a mutation that shifts the phenotype a distance r in some random direction, moving it to a new position B. If B is inside the hypersphere, it will be closer to O than is the initial point A, and so the mutation will be favored by selection. If B is on the hypersphere, the mutation will have equal fitness to the wild type; if B is outside the hypersphere, the mutation is deleterious relative to the wild type. Hence, the probability that the mutation is adaptive is the same as the probability that B is inside the hypersphere.

    We can derive this probability by considering the boundary condition in which the new position B lies exactly on the hypersphere: that is, when the distance |AO| = |BO|. The angle θ' between AO and AB in that case determines the maximum angle of a change that improves fitness: θ' = arccos(r/d).

    [Figure 1 here]

    If the angle between AB and AO is less than θ', B is closer to the optimum than A and fitness increases. If the angle is greater, B is farther from the optimum than A and fitness decreases.

    In order to determine the probability that a random change of size r will increase fitness, we need to find what fraction of the surface of a hypersphere of dimension n lies within the cap with half-angle θ'. Hartl and Taubes [7] gave an exact expression for this probability:

    \begin{equation}
    \frac{\int_0^{\theta'} \sin ^{n-2} \theta d\theta}{\int_0^{\pi}
    \sin^{n-2} \theta d\theta}
    \label{eq:surface-fraction-adaptive}
    \end{equation}
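    As a rough numerical check on this ratio, here is a minimal sketch using only the Python standard library. The function names, the Simpson's-rule integration and the choice d = 2 in the usage note are illustrative assumptions and not part of the original analysis.

```python
import math

def cap_fraction(cos_limit, n, steps=4000):
    """Fraction of random directions in n dimensions whose angle to a fixed
    axis has cosine above cos_limit, i.e. the ratio of sin^(n-2) integrals."""
    if cos_limit >= 1.0:
        return 0.0
    if cos_limit <= -1.0:
        return 1.0
    theta_max = math.acos(cos_limit)

    def simpson(upper):
        # Composite Simpson's rule for sin^(n-2) on [0, upper]; steps is even.
        h = upper / steps
        total = 0.0
        for i in range(steps + 1):
            weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
            total += weight * math.sin(i * h) ** (n - 2)
        return total * h / 3.0

    return simpson(theta_max) / simpson(math.pi)

def prob_adaptive(r, d, n):
    """Probability that a random step of length r is adaptive, with the
    critical angle arccos(r/d) taken from the text."""
    return cap_fraction(r / d, n)
```

    Assuming d = 2, so that r is measured in units of the starting distance to the optimum, prob_adaptive(0.01, 2.0, 16) comes out near 0.492.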

    Fisher argued that n, the effective dimensionality of the phenotype space, was likely to be large in real cases because many different traits influence biological success. He proceeded to develop a large-n approximation for this probability integral that gives considerable insight. Back in 1930, it must also have been considerably easier to calculate than the exact integral.

    \begin{equation}
    \frac{1}{\sqrt{2 \pi}} \int_{x}^{\infty} e^{-t^2/2}\,dt, \qquad x = \frac{r \sqrt{n}}{d}
    \end{equation}

    One can see from this expression that the probability of a favorable change is close to 1/2 when r is small, while decreasing rapidly as the size of the change becomes larger than d/√n, which one might call the "standard magnitude" of change. Fisher concluded that mutations with small favorable effects are the main players in adaptive evolution.
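    The large-n approximation is easy to evaluate today. The sketch below takes x = r√n/d, the form consistent with the standard magnitude d/√n, and uses the complementary error function for the normal tail. The function name and the sample values are illustrative assumptions.

```python
import math

def fisher_approx(r, d, n):
    """Large-n approximation to the probability that a random change of
    size r is favorable: the upper normal tail beyond x = r*sqrt(n)/d."""
    x = r * math.sqrt(n) / d
    # The upper tail of the standard normal is 0.5 * erfc(x / sqrt(2)).
    return 0.5 * math.erfc(x / math.sqrt(2.0))

if __name__ == "__main__":
    d, n = 2.0, 16                      # assumed values for illustration
    for r in (0.01, 0.1, 0.25, 0.5):
        print(r, round(fisher_approx(r, d, n), 4))
```

    At r = d/√n the argument x equals 1, so the probability has already fallen from 1/2 to about 0.16.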

    But there are other factors that Fisher did not consider in his analysis. Kimura [8] considered an additional aspect of selective dynamics that influences the effect size of adaptive phenotypic changes: new mutations that confer small increases in fitness are likely to be lost by chance, almost as likely as a neutral mutation. The probability of success of a beneficial mutation increases linearly with its fitness benefit [9]. Thus, although larger phenotypic changes are less likely to be adaptive, changes of large effect that are adaptive are more likely to persist in the population. Kimura showed that a population is most likely to undergo adaptive changes that are intermediate in effect: large enough to survive genetic drift, but small enough that they remain relatively likely to move the phenotype closer to the optimum.
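    A small sketch can illustrate why intermediate effects dominate. It rests on a simplifying assumption that is ours rather than the text's: the selective advantage, and hence the chance of escaping loss, is taken to be proportional to the scaled effect size x = r√n/d. The relative rate of adaptive substitution is then x times Fisher's probability of being favorable.

```python
import math

def prob_favorable(x):
    """Fisher's large-n probability that a change of scaled size x is favorable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def relative_substitution_rate(x):
    """Relative rate at which changes of scaled size x are both favorable and
    survive drift, ASSUMING survival probability is proportional to x."""
    return x * prob_favorable(x)

# Scan scaled effect sizes from 0.001 to 4 and find where the rate peaks.
grid = [i / 1000.0 for i in range(1, 4001)]
peak_x = max(grid, key=relative_substitution_rate)
```

    Under that assumption the rate rises from zero, peaks at an intermediate scaled effect somewhat below the standard magnitude (x = 1), and then declines as favorable changes become rare.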

    Orr [10] considered not only the first adaptive change, but the entire series of adaptive changes as a population approaches a phenotypic optimum. He showed that Kimura's relation held for the first adaptive change, bringing the population a considerable distance toward the optimum. Subsequent changes are likely to be smaller. As the phenotype nears the optimum, large phenotypic changes are less and less likely to approach it more closely, so the entire sequence is dominated by small changes. Considering the process as a whole, Orr showed that the effect sizes of adaptive changes will be exponentially distributed. The exponential distribution is also the expectation drawn from extreme value theory of the effect sizes of beneficial mutations.

    Diploid genotypes and Fisher's model

    Fisher's geometric analogy holds up well for a haploid, because the phenotypic change induced by a mutation may be thought of as a vector, just as in Fisher's model. The difference between individuals and populations need not be strictly defined in the model, because the effect of a mutation on the fitness of an individual will be the same as on the fitness of a population. Given the amount of effort put into extending Fisher's phenotype model to mutational effects, even in diploid organisms like Drosophila, it seems remarkable that nobody has observed that the analogy does not hold for diploid genotypes. Diploids are difficult to treat with this geometric model, because each genotype may have a distinct phenotypic effect. Unlike the haploid case, we cannot assume that the effect of a mutation in an individual is the same as the effect of a substitution in the population. For an autosomal mutation to proceed to fixation in the population, thereby becoming a substitution, a mutant homozygote must have fitness equal to or greater than that of a heterozygote. Even if a heterozygote has a phenotype that is intermediate between original-allele and mutant homozygotes, its fitness may not be.

    For simplicity, we assume that phenotypic change is a linear function of gene dosage. In that case, two copies of the mutant allele result in exactly twice the change in phenotype caused by one copy. We designate the original phenotype as point A and the phenotype resulting from one copy of the mutation as point B, as before. We will designate the phenotype in mutant homozygotes as point C. This means that the distance AC is exactly double AB (that is, 2r in Fisher's model). Given this constraint, we can find the angle φ at which the heterozygote and the mutant homozygote have equal fitness, that is, where |BO| = |CO|. Since the displacement is larger, φ, the critical angle for homozygotes, is smaller than θ', the critical angle for heterozygotes.
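
    The critical angles follow from the law of cosines. As a worked sketch in the notation above, with |AO| = d/2, |AB| = r, |AC| = 2r and θ the angle between AB and AO:

    \begin{equation}
    |BO|^2 = \frac{d^2}{4} + r^2 - d\,r \cos\theta, \qquad
    |CO|^2 = \frac{d^2}{4} + 4r^2 - 2\,d\,r \cos\theta
    \end{equation}

    Setting |BO| = |AO| recovers the heterozygote condition cos θ' = r/d. Setting |CO| = |BO| gives 3r^2 = d r cos φ, so that

    \begin{equation}
    \phi = \arccos(3r/d)
    \end{equation}

    By the same algebra, the homozygote phenotype C is closer to O than the original phenotype A only when cos θ > 2r/d.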

    [Figure 2 here]

    We can use this expression to calculate the probability that individuals who are homozygotes for a beneficial mutation of effect size r are fitter than heterozygotes. If r is small, 50% of mutations increase heterozygote fitness, almost all of which confer even higher fitness in homozygotes. If r is large, most mutations will not increase heterozygote fitness, and even those that do are unlikely to increase fitness further in homozygotes.

    \begin{table}[h]\centering
    \begin{tabular}{|c|ccccc|}
    \hline
    & $|r|$ & $0.01$ & $0.1$ & $0.25$ & $0.5$ \\
    \hline
    N & & & & & \\
    \hline
    16 & & 0.4924 & 0.4244 & 0.3163 & 0.1666 \\
    32 & & 0.4890 & 0.3911 & 0.2441 & 0.0803 \\
    64 & & 0.4842 & 0.3462 & 0.1606 & 0.0223 \\
    128 & & 0.4776 & 0.2868 & 0.0792 & 0.0021 \\
    \hline
    \end{tabular}
    \caption{Probability that fitness increases in heterozygotes}
    \end{table}

    \begin{table}[h]\centering
    \begin{tabular}{|c|ccccc|}
    \hline
    & $|r|$ & $0.01$ & $0.1$ & $0.25$ & $0.5$ \\
    \hline
    N & & & & & \\
    \hline
    16 & & 0.9846 & 0.8277 & 0.5266 & 0.1230 \\
    32 & & 0.9775 & 0.7411 & 0.3289 & 0.0190 \\
    64 & & 0.9675 & 0.6181 & 0.1388 & 0.0005 \\
    128 & & 0.9532 & 0.4524 & 0.0270 & 0.0000004 \\
    \hline
    \end{tabular}
    \caption{Probability that homozygotes are fitter than heterozygotes} \label{tab:second}
    \end{table}
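
    Calculations of this kind can be sketched in a few lines of Python. The sketch assumes d = 2 (r measured in units of the starting distance to the optimum) and takes N in the tables to be the number of phenotypic dimensions n; both are our reading and are not stated explicitly. The parameter k is a hypothetical convenience: a genotype displaced by a multiple of r beats its comparison genotype when cos θ > kr/d, with k = 1 for heterozygote against original phenotype, k = 2 for homozygote against original phenotype, and k = 3 for homozygote against heterozygote.

```python
import math

def cap_fraction(cos_limit, n, steps=4000):
    """Fraction of random directions in n dimensions whose angle to the
    A-to-O axis has cosine above cos_limit (ratio of sin^(n-2) integrals)."""
    if cos_limit >= 1.0:
        return 0.0
    upper = math.acos(cos_limit)

    def simpson(b):
        # Composite Simpson's rule for sin^(n-2) on [0, b]; steps is even.
        h = b / steps
        total = 0.0
        for i in range(steps + 1):
            weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
            total += weight * math.sin(i * h) ** (n - 2)
        return total * h / 3.0

    return simpson(upper) / simpson(math.pi)

def prob_het_beneficial(r, n, d=2.0):
    """Probability that one copy moves the phenotype closer to the optimum."""
    return cap_fraction(r / d, n)

def prob_hom_given_het(r, n, k, d=2.0):
    """Among mutations beneficial in heterozygotes, the fraction that also
    satisfy cos(theta) > k*r/d (k = 2 or k = 3, as described in the lead-in)."""
    return cap_fraction(k * r / d, n) / cap_fraction(r / d, n)

if __name__ == "__main__":
    for n in (16, 32, 64, 128):
        row = [round(prob_het_beneficial(r, n), 4) for r in (0.01, 0.1, 0.25, 0.5)]
        print(n, row)
```

    Under these assumptions, n = 16 and r = 0.01 give about 0.492 for heterozygotes, and the k = 2 ratio is about 0.985, in line with the first entries of the two tables. The k = 3 comparison against heterozygotes is stricter and always gives a smaller value.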

    In other words, there are three tests that an adaptive mutation must pass in order to reach fixation. First, it must increase fitness in heterozygotes, which, as Fisher showed [6], is unlikely if its phenotypic effect is large. Second, it must avoid stochastic loss when rare. Haldane showed [9] that a favorable mutation's probability of avoiding stochastic loss is about 2s, where s is the selective advantage. This means that a new allele will probably be lost if its effect size is small, because a small phenotypic effect leads to a small selective advantage. Third, it must confer higher fitness in homozygotes than in heterozygotes, which is unlikely if it has a large phenotypic effect. The first two tests are considered in most treatments of the genetics of natural selection, but the third has seldom been discussed.

    Our assumption of linearity is optimistic. If the phenotypic change induced in homozygotes is nonlinear and in a significantly different direction in phenotype space than the change in heterozygotes, fitness will almost certainly be lower than in heterozygotes, since favorable changes are possible only in a narrow range of angles in a high-dimensional space.

    True loss-of-function mutations, in which the gene's function is eliminated rather than merely reduced, make up an important class of alleles with nonlinear effects. These changes are usually nonsense mutations, frameshifts, radical amino acid changes, etc. Many such alleles have moderate phenotypic effects in single dose (effects that can increase fitness) while causing drastic fitness decreases in homozygotes, which have no working copy of the gene. Many are lethal. Such alleles cause some of the most common human genetic diseases, such as cystic fibrosis (the ΔF508 mutation) and the β0 thalassemias.

    Generally speaking, this kind of direction-changing nonlinearity should become more and more likely as effect size increases. If it is significant, the phenotypic change in homozygotes will be essentially unrelated to the beneficial change in heterozygotes: it will not even be in the same general direction in phenotype space, and so such mutations are almost always deleterious in homozygotes.

    Lower fitness in homozygotes than in heterozygotes doesn't necessarily imply that homozygote fitness is extremely low or obviously depressed. For example, consider a case in which one copy of a new allele increased fitness in past environments by 5%, while two copies increased fitness by 2%. The new allele would never go to fixation: it would eventually approach an equilibrium frequency of about 62.5%, since the heterozygote advantage is 5% over one homozygote and 3% over the other, and 0.05/(0.05 + 0.03) = 0.625. But, quite possibly, none of these genotypes (0, 1, or 2 copies) would be obviously ill or seek medical attention. This is particularly the case in modern environments, which are less harsh in many ways than those our ancestors experienced. For example, alleles that conferred protection against famine or smallpox might have reached high frequencies in modern populations, but their advantages would be unnoticeable and effectively unmeasurable today. We would have next to no chance of determining small differences in the fitness effects of such an allele in heterozygotes and homozygotes.
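
    The equilibrium in this example can be checked with a short sketch. It assumes the standard one-locus, two-allele viability model with random mating, which the text does not spell out; the function names are illustrative. With genotype fitnesses 1.00, 1.05 and 1.02 for 0, 1 and 2 copies, the closed-form equilibrium frequency of the new allele is (w_het - w_wt) / (2 w_het - w_wt - w_hom) = 0.05 / 0.08 = 0.625.

```python
def next_frequency(q, w_wt, w_het, w_hom):
    """One generation of selection on the new allele's frequency q, assuming
    random mating and viability selection (an assumed textbook model)."""
    p = 1.0 - q
    mean_w = p * p * w_wt + 2.0 * p * q * w_het + q * q * w_hom
    return (q * q * w_hom + p * q * w_het) / mean_w

def equilibrium_frequency(w_wt, w_het, w_hom):
    """Closed-form interior equilibrium when the heterozygote is fittest."""
    return (w_het - w_wt) / (2.0 * w_het - w_wt - w_hom)

def simulate(q0, generations, w_wt=1.00, w_het=1.05, w_hom=1.02):
    """Iterate the deterministic recursion from starting frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, w_wt, w_het, w_hom)
    return q
```

    Starting from a rare allele, for example simulate(0.001, 5000), the recursion settles at the same value as equilibrium_frequency(1.00, 1.05, 1.02) and never reaches fixation.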

    The X chromosome

    The fate of an adaptive variant that appears on the X chromosome is quite different in eutherian mammals — humans, for example. Males have only a single copy of the X, so their gene dosage does not vary. Females have two copies, but only one copy of the X chromosome is active in each cell, while the other copy is condensed and inactive [11]. Upwards of 85% of all genes on the condensed chromosome are inactive, except in the pseudoautosomal regions. Since one of the X chromosomes is randomly inactivated in each cell, female heterozygotes will have the wild-type allele in some cells and the adaptive variant in others, while female homozygotes will have the same effective gene dosage as males with their one copy.

    So, if a variant on the X chromosome increases fitness in males, it is likely to have the same effect in females with two copies. The effective dose of the new allele will be half as large in heterozygous females, but if the phenotypic effects are linear with gene dosage, heterozygote fitness should still be higher than wild-type. In Fisher's model, a fitness increase for a given displacement in a particular direction implies a fitness increase for any smaller displacement in that same direction.

    Under these assumptions, most X-chromosome gene variants that increase fitness in males should go to fixation, as long as they escape stochastic loss. Interestingly, the tight regulation of gene dosage on the X chromosome implies that changes in dosage do indeed influence fitness — why else would X-inactivation have evolved?

    Such X-linked beneficial recessive alleles sweep more slowly than an autosomal allele with the same selective advantage, since only one third of the copies (those in males) manifest the advantage when the allele is rare. However, the elimination of the requirement that the new variant confer higher fitness in homozygotes than heterozygotes is more important, and should result in a disproportionate number of completed sweeps on the X chromosome. As it happens, that is exactly what we see in humans.

    SNPs with high allele frequency differences are relatively rare in humans, but they are particularly common on the X chromosome [12][13]. Out of the 3.2 million SNPs in the HapMap data set, only 479 have FST greater than or equal to 0.90. Of those 479 high-FST SNPs, 379 are on the X chromosome. The majority of those highly differentiated SNPs cluster into five distinct regions. Those five regions apparently correspond to six selective sweeps. Two of these sweeps happened near the centromere (in different populations). Of the six sweeps, five are in populations outside sub-Saharan Africa, with the sweep reaching fixation in East Asia, existing at lower frequencies in Europe and West Asia, while being rare in sub-Saharan Africa. One of the two sweeps near the centromere has the opposite pattern: essentially fixed in sub-Saharan Africa and rare outside. That second pattern is unusual: there are few high-FST SNPs for which the derived alleles are near fixation in sub-Saharan Africa. The best-known example is the mutation responsible for the Duffy Fy*O blood type.

    Conclusion

    Our conclusion is that most mutations with strong effects that increase fitness in heterozygotes confer lower fitness in homozygotes — that is, are overdominant.

    This effect may not matter much in a well-adapted, stable species. Over many generations, overdominant alleles that partially solve some adaptive problem should eventually be replaced by alleles that confer high fitness in both heterozygotes and homozygotes and so go to fixation. This could occur through the evolution of modifier loci and by rare favorable mutations that are essentially additive. At steady state, there should be relatively few common overdominant alleles, except for cases of frequency-dependent selection.

    It may, however, play an important role in species that have experienced strong selection, ones whose environment has changed drastically. As it happens, this is the case for a number of species of interest: we would put humans and most domesticated species in this category.


    References

    1. Hawks J, Wang ET, Cochran G, Harpending HC, Moyzis RK. Recent acceleration of human adaptive evolution. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2007;104:20753–20758. Available from: http://dx.doi.org/10.1073/pnas.0707650104
    2. Pickrell JK, Coop G, Novembre J, Kudaravalli S, Li JZ, Absher D, Srinivasan BS, Barsh GS, Myers RM, Feldman MW, et al. Signals of recent positive selection in a worldwide sample of human populations. Genome Research [Internet]. 2009;19:826–837. Available from: http://dx.doi.org/10.1101/gr.087577.108
    3. Voight BF, Kudaravalli S, Wen X, Pritchard JK. A Map of Recent Positive Selection in the Human Genome. PLoS Biology [Internet]. 2006;4. Available from: http://dx.doi.org/10.1371/journal.pbio.0040072
    4. Wang ET, Kodama G, Baldi P, Moyzis RK. Global Landscape of Recent Inferred Darwinian Selection for Homo sapiens. Proceedings of the National Academy of Sciences, U. S. A. [Internet]. 2006;103:135–140. Available from: http://dx.doi.org/10.1073/pnas.0509691102
    5. Pritchard JK, Pickrell JK, Coop G. The Genetics of Human Adaptation: Hard Sweeps, Soft Sweeps, and Polygenic Adaptation. Current Biology [Internet]. 2010;20:R208–R215. Available from: http://dx.doi.org/10.1016/j.cub.2009.11.055
    6. Fisher RA. The Genetical Theory of Natural Selection. Oxford: Clarendon Press; 1930.
    7. Hartl D, Taubes C. Towards a theory of evolutionary adaptation. Genetica [Internet]. 1998;102-103:525–533. Available from: http://dx.doi.org/10.1023/A:1017071901530
    8. Kimura M. Some Problems of Stochastic Processes in Genetics. Annals of Mathematical Statistics. 1957;28:882–901.
    9. Haldane JBS. A Mathematical Theory of Natural and Artificial Selection, Part V: Selection and Mutation. Proceedings of the Cambridge Philosophical Society. 1927;23:838–844.
    10. Orr AH. The Distribution of Fitness Effects Among Beneficial Mutations. Genetics. 2003;163:1519–1526.
    11. Carrel L, Willard HF. X-inactivation profile reveals extensive variability in X-linked gene expression in females. Nature. 2005;434:400–404.
    12. Lambert CA, Connelly CF, Madeoy J, Qiu R, Olson MV, Akey JM. Highly Punctuated Patterns of Population Structure on the X Chromosome and Implications for African Evolutionary History. The American Journal of Human Genetics [Internet]. 2010;86:34–44. Available from: http://dx.doi.org/10.1016/j.ajhg.2009.12.002
    13. Casto AM, Li JZ, Absher D, Myers R, Ramachandran S, Feldman MW. Characterization of X-linked SNP genotypic variation in globally distributed human populations. Genome Biology [Internet]. 2010;11:R10. Available from: http://dx.doi.org/10.1186/gb-2010-11-1-r10
  • DNA relatives

    Mon, 2011-03-07 15:07 -- John Hawks

    Steve Mount works through the math of "relative finder" predictions from 23andMe (and by extension, other personal genome tests): "Genetic genealogy and the single segment".

    He does a nice short explanation of a point that is counter-intuitive to many people. You don't actually share much DNA with your relatives by descent, and because chromosomes are inherited in chunks, you quickly (within 6 generations) get to a point where you're not likely to have any DNA in common at all. Yet...you do have to have DNA from somebody, which means that if you do share DNA, you'll probably share a big chunk of it.
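
    A back-of-the-envelope version of the argument can be sketched as follows. This is not Mount's calculation: it is a rough Poisson approximation for the simpler question of whether you carry any autosomal DNA from one particular ancestor, and it assumes 22 autosomes and a round figure of about 33 crossovers per meiosis. Sharing between two cousins requires both of them to inherit overlapping pieces from the common ancestor, so it is rarer still.

```python
import math

AUTOSOMES = 22
CROSSOVERS_PER_MEIOSIS = 33  # assumed round figure, for illustration only


def expected_chunks(g: int) -> float:
    """Expected number of autosomal chunks inherited from one specific
    ancestor who is g meioses back (g = 1 is a parent)."""
    if g < 1:
        raise ValueError("g must be at least 1")
    # One parental set of chromosomes is cut into this many pieces...
    chunks_on_one_side = AUTOSOMES + CROSSOVERS_PER_MEIOSIS * (g - 1)
    # ...which are shared out among 2**(g - 1) ancestors on that side.
    return chunks_on_one_side / 2 ** (g - 1)


def prob_no_dna(g: int) -> float:
    """Poisson approximation to the chance of inheriting nothing at all."""
    return math.exp(-expected_chunks(g))


def mean_chunks_given_some(g: int) -> float:
    """Mean number of chunks, given that at least one was inherited."""
    lam = expected_chunks(g)
    return lam / (1.0 - math.exp(-lam))


if __name__ == "__main__":
    for g in (2, 6, 8, 10, 12, 14):
        print(g, round(prob_no_dna(g), 3), round(mean_chunks_given_some(g), 2))
```

    In this toy model the chance of inheriting nothing climbs steeply over a handful of generations, and once sharing becomes unlikely, the cases where something is inherited usually involve just one chunk. That is the "single segment" of the post's title.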

  • mtDNA, purifying selection and "distorted" genealogies

    Sat, 2010-10-23 11:13 -- John Hawks

    I'm going to pass along this paper without much comment. It's by Jon Seger and colleagues, and it came out earlier this year in Genetics [1]:

    Gene Genealogies Strongly Distorted by Weakly Interfering Mutations in Constant Environments

    Neutral nucleotide diversity does not scale with population size as expected, and this "paradox of variation" is especially severe for animal mitochondria. Adaptive selective sweeps are often proposed as a major cause, but a plausible alternative is selection against large numbers of weakly deleterious mutations subject to Hill–Robertson interference. The mitochondrial genealogies of several species of whale lice (Amphipoda: Cyamus) are consistently too short relative to neutral-theory expectations, and they are also distorted in shape (branch-length proportions) and topology (relative sister-clade sizes). This pattern is not easily explained by adaptive sweeps or demographic history, but it can be reproduced in models of interference among forward and back mutations at large numbers of sites on a nonrecombining chromosome. A coalescent simulation algorithm was used to study this model over a wide range of parameter values. The genealogical distortions are all maximized when the selection coefficients are of critical intermediate sizes, such that Muller's ratchet begins to turn. In this regime, linked neutral nucleotide diversity becomes nearly insensitive to N. Mutations of this size dominate the dynamics even if there are also large numbers of more strongly and more weakly selected sites in the genome. A genealogical perspective on Hill–Robertson interference leads directly to a generalized background-selection model in which the effective population size is progressively reduced going back in time from the present.

    The topic arises for me at the moment because of some inconsistencies between the apparent timing of events from mtDNA estimates compared to nuclear DNA estimates. Across the crucial "out of Africa" time interval between 200,000 and 50,000 years ago, the mtDNA is not really giving the same chronology as might be expected from nuclear DNA comparisons.

    The mutation rate of mtDNA genome-wide is very high, giving rise to the possibility of interaction between weakly deleterious mutations on the same sequence. It is widely known that the apparent rate of mtDNA mutation depends on the timescale of the comparison in humans. Mothers and their offspring differ by much more than would be predicted by longer pedigrees or by comparisons between populations. Recently diverged populations (such as those in island Polynesia) differ much more than would be predicted from the difference between humans and Neandertals or humans and chimpanzees.

    This apparent "speed-up" of rate as we get closer to the present is consistent with the action of strong purifying selection. So establishing the other genealogical effects of this selection should help us understand the patterns of mtDNA sequence differences found in humans.


    References
