# Measuring population subdivision

27 Nov 2011

The basic measure of genetic difference between two populations is the statistic *F*_{ST}. In genetics, the term *F* generally stands for “inbreeding”, which tends to reduce genetic variation in the population. Genetic variation can be measured by heterozygosity, and so *F* generally expresses a reduction in the heterozygosity in the population. *F*_{ST} is the reduction in heterozygosity in subpopulations compared to the total population of which they are part.

To estimate *F*_{ST}, take the following steps:

- Find the allele frequencies for each subpopulation.
- Find the *average* allele frequencies for the total population.
- Calculate the heterozygosity (2*pq*) for each subpopulation.
- Calculate the *average* of these subpopulation heterozygosities. This is *H*_{S}.
- Calculate the heterozygosity based on the total population allele frequencies. This is *H*_{T}.
- Finally, calculate *F*_{ST} = (*H*_{T} - *H*_{S}) / *H*_{T}.

**Don’t forget that the H_{S} term is the average across all subpopulations.**
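The steps above can be sketched in a few lines of Python. This is only an illustration, not a standard library routine: the function names `heterozygosity` and `fst` are made up here, the locus is assumed to have two alleles, and every subpopulation is given equal weight when averaging. The input frequencies at the bottom are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2pq at a two-allele locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(subpop_freqs):
    """Estimate F_ST from the frequency of one allele in each subpopulation.

    Assumes a two-allele locus and equal weight for every subpopulation.
    """
    if not subpop_freqs:
        raise ValueError("need at least one subpopulation")
    n = len(subpop_freqs)
    p_total = sum(subpop_freqs) / n                         # average allele frequency
    h_s = sum(heterozygosity(p) for p in subpop_freqs) / n  # H_S: average of 2pq
    h_t = heterozygosity(p_total)                           # H_T: 2pq at the average
    if h_t == 0:
        raise ValueError("F_ST is undefined when the total population has no variation")
    return (h_t - h_s) / h_t


# Hypothetical frequencies, chosen only to illustrate the calculation.
print(round(fst([0.2, 0.8]), 3))  # strongly differentiated subpopulations: 0.36
print(round(fst([0.5, 0.5]), 3))  # identical subpopulations: 0.0
```

Note that `h_s` averages the heterozygosities themselves, while `h_t` is computed from the averaged allele frequency. Swapping those two operations is the easiest way to get the wrong answer.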

**Example:** The gene *SLC24A5* is a key part of the melanin expression pathway, which contributes to skin and hair pigmentation. A SNP that is strongly associated with lighter skin pigment in Europe is rs1426654. The SNP has two alleles, A and G, with A being associated with light skin, at a frequency of 100% in Utah European-Americans. The SNP varies in frequency in populations in the Americas with mixed African and American Indian ancestry. A sample in Mexico had 38% A and 62% G; in Puerto Rico the frequencies were 59% A and 41% G, and a sample of African-Americans from Charleston had 19% A with 81% G. What is the *F*_{ST} in this example?
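One way to work this example is sketched below in Python. It rests on two assumptions that the text does not state: the three samples from Mexico, Puerto Rico and Charleston are treated as the subpopulations (the Utah sample is left out), and each sample gets equal weight regardless of its size. The variable names are illustrative only.

```python
# Frequency of the A allele of rs1426654 in each sample, taken from the example.
freq_a = {"Mexico": 0.38, "Puerto Rico": 0.59, "Charleston": 0.19}

# Heterozygosity 2pq within each sample, then their average (H_S).
h_sub = {name: 2 * p * (1 - p) for name, p in freq_a.items()}
h_s = sum(h_sub.values()) / len(h_sub)

# Heterozygosity at the average allele frequency (H_T).
p_bar = sum(freq_a.values()) / len(freq_a)
h_t = 2 * p_bar * (1 - p_bar)

f_st = (h_t - h_s) / h_t

for name, h in h_sub.items():
    print(f"{name}: 2pq = {h:.4f}")
print(f"H_S = {h_s:.4f}, H_T = {h_t:.4f}, F_ST = {f_st:.3f}")
```

Under these assumptions the subpopulation heterozygosities are 0.4712, 0.4838 and 0.3078, giving *H*_{S} of about 0.421, *H*_{T} of about 0.474, and *F*_{ST} of about 0.11. Including the Utah sample as a fourth subpopulation would change the result.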