The basic measure of genetic difference between two populations is the statistic FST. In genetics, the term F generally stands for "inbreeding", which tends to reduce genetic variation in a population. Genetic variation can be measured by heterozygosity, so F generally expresses a reduction in heterozygosity. FST is the reduction in heterozygosity in subpopulations compared to the total population of which they are a part.
To estimate FST, take the following steps:
- Find the allele frequencies for each subpopulation.
- Find the average allele frequencies for the total population.
- Calculate the expected heterozygosity (2pq, where p and q are the frequencies of the two alleles) for each subpopulation.
- Calculate the average of these subpopulation heterozygosities. This is HS.
- Calculate the heterozygosity based on the total population allele frequencies. This is HT.
- Finally, calculate FST = (HT - HS) / HT.
Don't forget that the HS term is the average across all subpopulations.
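The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a standard implementation: the function names `heterozygosity` and `fst` are made up for this example, the locus is assumed to have two alleles, and every subpopulation is assumed to carry equal weight when averaging (no correction for sample size).

```python
def heterozygosity(p):
    """Expected heterozygosity 2pq at a two-allele locus, where q = 1 - p."""
    return 2 * p * (1 - p)


def fst(subpop_freqs):
    """FST from the frequency of one allele in each subpopulation.

    Assumption: subpopulations are weighted equally.
    """
    if not subpop_freqs:
        raise ValueError("need at least one subpopulation")
    n = len(subpop_freqs)
    # Average allele frequency for the total population.
    p_total = sum(subpop_freqs) / n
    # HS: average of the subpopulation heterozygosities.
    h_s = sum(heterozygosity(p) for p in subpop_freqs) / n
    # HT: heterozygosity based on the total-population allele frequency.
    h_t = heterozygosity(p_total)
    if h_t == 0:
        raise ValueError("FST is undefined when the total population has no variation")
    return (h_t - h_s) / h_t


# Made-up frequencies for two hypothetical subpopulations.
print(round(fst([0.2, 0.8]), 3))  # 0.36
```

In the made-up case, each subpopulation has heterozygosity 0.32, so HS = 0.32, while the pooled frequency of 0.5 gives HT = 0.5, and FST = 0.18 / 0.5 = 0.36.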
Example: The gene SLC24A5 is a key part of the melanin expression pathway, which contributes to skin and hair pigmentation. A SNP that is strongly associated with lighter skin pigment in Europe is rs1426654. The SNP has two alleles, A and G; the A allele is associated with light skin and has a frequency of 100% in Utah European-Americans. The SNP varies in frequency in populations in the Americas with mixed African and American Indian ancestry. A sample in Mexico had 38% A and 62% G; a sample in Puerto Rico had 59% A and 41% G; and a sample of African-Americans from Charleston had 19% A and 81% G. What is the FST in this example?
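One way to work the example is sketched below. It rests on assumptions the text does not state: only the three admixed samples (Mexico, Puerto Rico, Charleston) are treated as subpopulations, the Utah sample is left out, and the three samples are weighted equally. Frequencies are for the A allele, and intermediate values are rounded to three decimal places.

```latex
\[
\begin{aligned}
\bar{p}_A &= \frac{0.38 + 0.59 + 0.19}{3} \approx 0.387 \\
H_S &= \frac{2(0.38)(0.62) + 2(0.59)(0.41) + 2(0.19)(0.81)}{3}
     = \frac{0.4712 + 0.4838 + 0.3078}{3} \approx 0.421 \\
H_T &= 2(0.387)(0.613) \approx 0.474 \\
F_{ST} &= \frac{H_T - H_S}{H_T} \approx \frac{0.474 - 0.421}{0.474} \approx 0.11
\end{aligned}
\]
```

Under these assumptions, about 11% of the heterozygosity expected in the pooled population is absent within the subpopulations. Including the Utah sample, or weighting by sample size, would give a different value.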