john hawks weblog

paleoanthropology, genetics and evolution

heterozygosity

  • Measuring differences between populations

    Mon, 2011-11-28 00:28 -- John Hawks
    Synopsis: 
    Fst and its relationship to the number of migrants among populations

    When individuals mate locally, different populations tend to diverge from each other in the frequencies of their alleles. Genetic differences between populations are therefore differences in allele frequencies, and these differences may have consequences in terms of phenotypic or adaptive differences. But not every difference in allele frequencies is equal. When populations encompass great genetic variation, even large differences in allele frequencies leave much overlap: the individuals in the different populations may not be very different from each other. In contrast, slight differences in allele frequencies might be very important between populations that are not variable, because those differences then make up a larger share of the variation among individuals.

    Geneticists measure the differences between populations by comparing the difference in allele frequencies to the amount of variation within the populations. When people mate with their neighbors, they tend to become more inbred — that is, they are more likely to mate with distant relatives. This means that people will tend to have greater genetic similarity than they would have if they mated equally with people who were born across the world.

    Increase in the level of inbreeding due to low gene flow is often used as a statistic, called FST, relating the increase in inbreeding in the subpopulation to that in the total population. When gene flow is high, FST is low, and vice versa. FST represents the proportion of differences between two individuals taken randomly from two subpopulations that are due to the differences in allele frequency between subpopulations alone. Other differences between the individuals are those that could be found between individuals taken randomly from the same subpopulation. FST therefore provides a comparison between the between-subpopulation and within-subpopulation components of genetic variation.

    The relationship of FST and migration between populations. When the forces causing genetic divergence between subpopulations are balanced by gene flow, the reduction of heterozygosity within subpopulations is a function of the number of people who move between subpopulations each generation, expressed by FST = 1 / (1 + 4Nm).

    Comparing human populations taken from different continents, FST is between 0.1 and 0.15, meaning that only between 10 and 15 percent of genetic differences between individuals are attributable to their geographic origins. This difference is relatively small compared to many other large mammal species spread among different continents, such as wolves or bears [1]. This level of similarity among human populations means that they have shared high levels of gene flow in the past. However, the meaning of these numbers depends on the relationship of gene flow and the other evolutionary forces.

    Because they are opposite in direction, gene flow and genetic drift will reach an equilibrium over time. At equilibrium, FST = 1 / (1 + 4Nm), where Nm is the number of migrants moving into each subpopulation each generation. Neglecting the forces of selection and mutation, then, an FST of 0.1 for human continental populations means an average of about 2 migrants (Nm = 2.25) have been entering each continent per generation for a long period of time. Far more than two people move between continents each generation today, so one prediction of this relationship is that the level of genetic differences among continents will decrease in the future. In the face of this gene flow, it is likely that most of the differences in allele frequencies that persist in humans are in fact affected by selection. Indeed many of the most obvious differences, related to physical appearances in different places, appear to bear this out.
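
    The arithmetic behind that estimate can be checked with a short Python sketch. It only rearranges the equilibrium formula FST = 1 / (1 + 4Nm) to give Nm = (1/FST - 1) / 4; the function names are illustrative, and the result inherits the formula's assumptions (drift and gene flow in balance, with selection and mutation neglected).

```python
def fst_from_migrants(nm: float) -> float:
    """Equilibrium FST for Nm migrants entering each subpopulation per generation."""
    return 1.0 / (1.0 + 4.0 * nm)


def migrants_from_fst(fst: float) -> float:
    """Rearrange FST = 1 / (1 + 4Nm) to get Nm = (1/FST - 1) / 4."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must be greater than 0 and at most 1")
    return (1.0 / fst - 1.0) / 4.0


# The two ends of the range quoted for human continental populations.
for fst in (0.10, 0.15):
    print(f"FST = {fst:.2f} -> Nm = {migrants_from_fst(fst):.2f}")
```

    Because Nm falls as FST rises, the higher end of the human range corresponds to fewer migrants per generation than the lower end.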


    References

    1. Templeton AR. Human races: a genetic and evolutionary perspective. American Anthropologist. 1998;100:632–650.
    Study questions: 
    1. If the present FST among human continental groups is consistent with two migrants among populations each generation, what do you predict will happen to human FST in the future?
    2. It is remarkable that genetic drift and migration balance each other at a given number of actual individuals migrating, so that large and small populations are held in equilibrium by the same number of migrants. Are there any differences between large and small populations?
  • Measuring population subdivision

    Sun, 2011-11-27 22:58 -- John Hawks
    Synopsis: 
    The statistical measurement of differentiation among populations is Fst

    The basic measure of genetic difference between two populations is the statistic FST. In genetics, the term F generally stands for "inbreeding", which tends to reduce genetic variation in the population. Genetic variation can be measured by heterozygosity, and so F generally expresses a reduction in the heterozygosity in the population. FST is the reduction in heterozygosity in subpopulations compared to the total population of which they are part.

    To estimate FST, take the following steps:

    1. Find the allele frequencies for each subpopulation.
    2. Find the average allele frequencies for the total population.
    3. Calculate the heterozygosity (2pq) for each subpopulation.
    4. Calculate the average of these subpopulation heterozygosities. This is HS.
    5. Calculate the heterozygosity based on the total population allele frequencies. This is HT.
    6. Finally, calculate FST=(HT-HS)/HT.

    Don't forget that the HS term is the average across all subpopulations.

    Example: The gene SLC24A5 is a key part of the melanin expression pathway, which contributes to skin and hair pigmentation. A SNP that is strongly associated with lighter skin pigment in Europe is rs1426654. The SNP has two alleles, A and G, with A being associated with light skin, at a frequency of 100% in Utah European-Americans. The SNP varies in frequency in populations in the Americas with mixed European, African and American Indian ancestry. A sample in Mexico had 38% A and 62% G; in Puerto Rico the frequencies were 59% A and 41% G; and a sample of African-Americans from Charleston had 19% A with 81% G. What is the FST in this example?
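
    The six steps can be sketched in Python as follows. This is one reading of the procedure rather than a definitive implementation: it assumes a two-allele locus, weights every subpopulation equally when averaging (sample sizes are not given here), and uses only the three admixed samples from the example. Whether the Utah sample belongs in the comparison is a choice left to the reader, and the function names are illustrative.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2pq for a two-allele locus, with q = 1 - p."""
    return 2.0 * p * (1.0 - p)


def fst(subpop_freqs) -> float:
    """FST = (HT - HS) / HT from one allele's frequency in each subpopulation."""
    freqs = list(subpop_freqs)
    # Step 2: total-population frequency as an unweighted mean (an assumption).
    p_total = sum(freqs) / len(freqs)
    # Steps 3 and 4: HS is the average of the subpopulation heterozygosities.
    h_s = sum(heterozygosity(p) for p in freqs) / len(freqs)
    # Step 5: HT comes from the total-population allele frequency.
    h_t = heterozygosity(p_total)
    if h_t == 0.0:
        raise ValueError("FST is undefined when the total population has no variation")
    # Step 6.
    return (h_t - h_s) / h_t


# Frequency of the A allele of rs1426654 in the three samples from the example.
a_freqs = {"Mexico": 0.38, "Puerto Rico": 0.59, "Charleston": 0.19}
print(f"FST = {fst(a_freqs.values()):.3f}")
```

    Adding or dropping a sample changes both HS and HT, so the answer depends on which subpopulations are counted as part of the total.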

  • Genetic variation and the Hardy-Weinberg proportions

    Mon, 2011-10-03 09:54 -- John Hawks
    Synopsis: 
    Allele frequencies and genotype frequencies are connected by math

    The fundamental information about genetics for any individual is her genotype — the alleles that she has. But genes in populations can be considered in other ways as well.

    For instance, a population consists of individuals, so a geneticist may count the number of individuals with every possible genotype. Comparing these numbers with the total number of individuals, the geneticist may calculate the genotype frequencies, the proportion of individuals who have each possible genotype.

    Cystic fibrosis (CF) is a very rare disorder. Among Americans of European ancestry, only around 1 in 2500 people is born with CF. The disorder is even rarer among people of non-European origin. Geneticists have surveyed many people to discover how many of them carry the disorder, and today a number of states screen newborns for cystic fibrosis as a way of directing affected children to early medical treatment (AAP Newborn Screening Task Force 2000). From this information, geneticists have determined that while only around 0.04 percent of the population is affected by cystic fibrosis, with a genotype of ff, approximately 4 percent of people (about 1 in 25) are carriers of the allele, with a genotype of Ff. This leaves some 96 percent of the population with the genotype FF. These proportions are the frequencies of the three possible genotypes for this gene: 0.04% ff, 4% Ff and 96% FF.

    The frequency of an allele is the proportion of copies of that allele compared to the total number of copies of all alleles in a population.

    When geneticists know the frequencies of genotypes in a population, they can estimate how many copies of each allele are in the population as a whole. One recent study used genetic techniques to assess the genotypes of a group of colorectal cancer patients for the two possible alleles (A and B) of the p73 gene on chromosome 1 (Pfeifer et al. 2004). In this sample, 113 people (63 percent) were AA, 54 people (30 percent) were AB, and 12 people (7 percent) were BB. Each AA person has two copies of the A allele and each AB person has one copy, so the total sample included 280 copies of the A allele and 78 copies of the B allele. The allele frequency of the A allele is then 280/358, or 78 percent. The frequency of the B allele in this sample is 22 percent.
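
    The counting in this example can be written as a short Python sketch. The function name is illustrative; the genotype counts are the ones reported for the p73 sample above.

```python
def allele_frequencies(n_aa: int, n_ab: int, n_bb: int):
    """Return (freq_A, freq_B) from counts of AA, AB and BB individuals."""
    total_copies = 2 * (n_aa + n_ab + n_bb)  # every person carries two copies
    a_copies = 2 * n_aa + n_ab               # two per AA person, one per AB person
    b_copies = 2 * n_bb + n_ab               # two per BB person, one per AB person
    return a_copies / total_copies, b_copies / total_copies


freq_a, freq_b = allele_frequencies(113, 54, 12)
print(f"A: {freq_a:.2f}, B: {freq_b:.2f}")
```

    The two frequencies always sum to 1, because every copy in the sample is either A or B.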

    Hardy-Weinberg proportions

    While geneticists can determine the frequency of an allele from the proportions of genotypes in a population, they may also do the same calculation in reverse, figuring out the expected proportions of genotypes from the frequencies of alleles. For example, if the cystic fibrosis f allele is found at a frequency of 5 percent in a population, then the chance that an individual will have two copies of this allele is simply the 5 percent chance for each of the two copies, multiplied together. Five percent times five percent is 0.0025, which is 0.25%, or twenty-five chances out of 10,000.

    The Hardy-Weinberg genotype proportions are p² + 2pq + q² for a two-allele gene.

    The proportion of both homozygotes in the population may be determined in the same way. The probability that an individual will be a homozygote for either allele equals the frequency of the allele times itself, or squared. Thus, in the example above, the chance of f homozygotes is equal to the frequency of f squared. The proportion of heterozygotes is equal to the chance that one allele is F times the chance that the other allele is f, plus the chance that the first allele is f times the chance that the other allele is F. Since these likelihoods are the same, the total chance of an individual being a heterozygote is 2 times the frequency of one allele times the frequency of the other allele. Thus, writing p and q for the frequencies of the two alleles, the proportions of genotypes are p² and q² homozygotes, and 2pq heterozygotes.
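
    As a quick check on this reasoning, here is a minimal Python sketch that turns an allele frequency into the three expected genotype proportions. It uses the hypothetical 5 percent frequency for the f allele from the example above; the function name is illustrative.

```python
def hardy_weinberg(p: float):
    """Expected genotype proportions (p^2, 2pq, q^2) for a two-allele gene."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("an allele frequency must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


# F at 95 percent and f at 5 percent, as in the example.
prop_FF, prop_Ff, prop_ff = hardy_weinberg(0.95)
print(f"FF: {prop_FF:.4f}  Ff: {prop_Ff:.4f}  ff: {prop_ff:.4f}")
```

    The three proportions sum to 1 because p² + 2pq + q² is the expansion of (p + q)², and p + q = 1.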

    Geometric presentation of the Hardy-Weinberg proportions

    These proportions are called the Hardy-Weinberg proportions, after the British mathematician G. H. Hardy and German physician Wilhelm Weinberg, who independently formulated the relation in 1908. The proportions come from simple probability of sampling copies from a population with given allele frequencies. These proportions are expected to form an equilibrium. That is, they should stay the same over time, as long as the allele frequencies stay the same. When individuals mate without regard to the alleles they carry, every generation of a population should have genotype frequencies in approximately the Hardy-Weinberg proportions.

    The Hardy-Weinberg proportions are important for two reasons. First, the proportions of genotypes in a population may diverge from the expected ones for many reasons, including natural selection, division of populations into different subgroups, or mating that is not entirely random. Comparing the expected and observed proportions of genotypes allows biologists to determine whether these evolutionary forces may be acting on a population.
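
    A comparison of this kind might be sketched in Python as below, reusing the p73 genotype counts reported earlier in this post (113 AA, 54 AB, 12 BB). The sketch only lays the expected counts beside the observed ones. It runs no statistical test, so it cannot say on its own whether any difference is larger than chance would produce; the function name is illustrative.

```python
def expected_genotype_counts(n_aa: int, n_ab: int, n_bb: int):
    """Expected AA, AB and BB counts under Hardy-Weinberg, given observed counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the A allele in the sample
    q = 1.0 - p                      # frequency of the B allele
    return n * p * p, n * 2.0 * p * q, n * q * q


observed = (113, 54, 12)
expected = expected_genotype_counts(*observed)
for genotype, obs, exp in zip(("AA", "AB", "BB"), observed, expected):
    print(f"{genotype}: observed {obs}, expected {exp:.1f}")
```

    The expected counts are built from the sample's own allele frequencies, so they always add up to the number of people sampled.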

    The heterozygosity of a population is the expected proportion of heterozygotes, given the allele frequencies in the population.

    Second, the proportions lead to a natural definition of genetic variation in a population: the heterozygosity. A population's heterozygosity is the expected proportion of heterozygotes from the Hardy-Weinberg formula, 2pq. Two populations may be compared by their heterozygosities: the one with the higher heterozygosity has a higher chance that any single individual will have two different alleles, which means the population is genetically more variable. Variation is a consequence of evolutionary history, including the patterns of selection and genetic drift, and the amount that individuals have moved from one population to another in the past. Thus, the Hardy-Weinberg proportions give an important way to study the evolution of populations over time.

    Study questions: 
    1. Suppose a population has two alleles, with frequencies of 70% and 30%. What are the Hardy-Weinberg proportions expected for the three genotypes of these alleles?
    2. Mendelian recessive disorders are rare in most populations, but their allele frequencies may be surprisingly high. Why?
    3. For a gene with two alleles, what is the highest possible value of heterozygosity? What is the lowest?