Variation in NAT1 and NAT2

There's a new paper in AJHG by Patin and colleagues that is chock full of interesting stuff. The genes studied, NAT1 and NAT2, are N-acetyltransferase genes (OMIM entry) involved in the metabolism of certain drugs and carcinogens.

For example, they detoxify some of the carcinogenic amines that result from grilling meat. Different alleles of the genes are involved in some harmful drug interactions, since they affect the rate of drug metabolism. In other words, these are the kinds of genes that matter most to people interested in "personalized medicine" -- they help to determine the response to harmful environmental agents and the outcomes of treatment.

Patin and colleagues (2006) studied the evolution of polymorphisms of the two genes. Here's a quick review of what they knew starting out:

Both genes carry functional polymorphisms whose effects on enzymatic activity have been well studied (Hein et al. 2000). Whereas the variants associated with reduced activity attain only low frequencies in NAT1, they constitute common polymorphisms in NAT2 (Upton et al. 2001). Two main classes of NAT2 phenotypes are therefore observed: the "fast-acetylation" phenotype, which refers to the wild-type acetylation activity, and the "slow-acetylation" phenotype, which results in reduced protein activity. In addition, NAT1 and NAT2 metabolize numerous common carcinogens, and variation in these genes can result in varying susceptibility to cancer (for a review, see the work of Hein [2002]). For example, the slow-acetylator NAT2 phenotype has been associated with side effects to the commonly used antitubercular isoniazid (Huang et al. 2002) and with higher risk for bladder cancer (Cartwright et al. 1982; Garcia-Closas et al. 2005). Nevertheless, most NAT2 mutations leading to the slow phenotype are found at high frequencies worldwide, calling into question the role of altered acetylation in human adaptation.

So, the polymorphism of NAT2 is a bit mysterious -- what advantage might the slow-acetylators have to keep them around?

They did the usual sampling of "geographically diverse samples" and used a chimpanzee sequence to determine site polarities. A twist makes the study a bit more complicated than usual: the genes are physically close together, so an allele for NAT2 may be significantly correlated with an allele for NAT1, for example.
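
To make "site polarities" concrete: at each variable site, the allele that matches the chimpanzee base is treated as ancestral and the other as derived. Here is a minimal sketch in Python. The sequences are toy data invented for illustration (not from the paper), `polarize` is a hypothetical helper, and the sketch assumes biallelic sites with no recurrent mutation on the chimpanzee lineage.

```python
from collections import Counter


def polarize(human_seqs, chimp_seq):
    """Return {position: (ancestral, derived, derived_count)} for sites
    that vary among the human sequences.

    Assumption: the chimpanzee base is the ancestral state. Sites with more
    than two human alleles, or where the chimp base matches neither human
    allele, are skipped as unpolarizable.
    """
    result = {}
    for pos, chimp_base in enumerate(chimp_seq):
        counts = Counter(seq[pos] for seq in human_seqs)
        if len(counts) != 2 or chimp_base not in counts:
            continue
        derived = next(base for base in counts if base != chimp_base)
        result[pos] = (chimp_base, derived, counts[derived])
    return result


# Toy aligned sequences, invented for illustration only.
humans = ["ACGTA", "ACGTA", "ATGTA", "ATGCA"]
chimp = "ACGTA"
sites = polarize(humans, chimp)
for pos, (ancestral, derived, count) in sorted(sites.items()):
    print(f"site {pos}: ancestral {ancestral}, derived {derived}, "
          f"derived count {count}/{len(humans)}")
```

Knowing which allele is derived is what lets you ask whether a new variant has risen to an unexpectedly high frequency, which is the kind of signal the selection tests look for.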

The paper finds good evidence for selection on NAT2 alleles. Different alleles in different populations appear to cause the slow-acetylator phenotype. One of these, found mainly in Europeans (NAT2*5B), has a stronger phenotypic effect (i.e., slower acetylation) and the strongest signature of recent selection. They infer that this allele came under selection roughly 5,800 to 7,000 years ago.

The footprints of natural selection identified in western/central Eurasians raise the question of which event(s) may have provoked fluctuations in the spectrum of xenobiotics inactivated/activated by NAT2 (e.g., NAT2 activates heterocyclic carcinogens found in well-cooked meat [Hein et al. 2000; Hein 2002]) in these populations. In this context, given the geographic distribution of the slow-acetylator phenotype and the estimated expansion time of the slowest-encoding 341T→C mutation (5,797–7,005 years ago in western/central Eurasians), it is tempting to hypothesize that the emergence of agriculture in western Eurasia could be at the basis of such environmental changes. Indeed, there is accumulating evidence that this major transition resulted in a profound modification of human diets and lifestyles (Cordain et al. 2005) and, consequently, in the exposure of humans to chemical environments (Ferguson 2002). Moreover, the highest frequencies of slow acetylators are observed in the Middle East (fig. 5), one of the first regions where agriculture originated 10,000 years ago, and these frequencies decrease toward western Europe, North Africa, and India, three regions where agriculture was subsequently diffused from the Fertile Crescent (Harris 1996). However, the hypothesis that the transition to agriculture influenced both the human exposure to xenobiotic environments and, consequently, the selective pressures at NAT2 remains tentative and requires a better characterization of the naturally occurring substrates of the NAT2 enzyme.

The story for NAT1 is even more interesting. The coding region as a whole is much less variable than that of NAT2. But there is a divergent haplotype, separated by 17 SNPs from the rest of the alleles, and found in only three individuals (in France, India, and Thailand). Patin and colleagues propose that the haplotype may represent ancient population structure, similar to what Garrigan et al. (2005) reported last year. Here's the relevant section of the paper:

Purifying selection may not be the only evolutionary force that has influenced NAT1 diversity. Indeed, one of the most salient observations of this study is the highly divergent tree topology and high TMRCA (2.01 ± 0.29 MYA) of this locus (fig. 2). This binary pattern is translated into significant departures from neutrality in populations presenting the divergent haplotype NAT1*11A (see table 7 and results of the HKA test). The probability of finding such a high TMRCA under a Wright-Fisher model was found to be low (P = .029). Different hypotheses can be proposed to explain such long basal branches in the NAT1 gene tree. First, long-term balancing selection can result in divergent haplotype clusters, by maintaining two or more alleles over time, provided that they result in functional differences. Nevertheless, our data do not support this hypothesis, since the two nonsynonymous mutations separating the two clusters (fig. 2) have been shown to have no significant effects on the in vivo protein activity in human cells (Hein 2002) or on the stability and activity of the recombinant protein in yeast (Hughes et al. 1998). Any kind of selection due to a hitchhiking effect with neighbor genes is equally unlikely, because the two closest genes (ASAH1 located 5′ and NAT2 located 3′) behave as independent haplotype blocks (this study and the HapMap database). Furthermore, our sequence data from the NAT1 coding region are consistent with the action of purifying selection rather than balancing selection, with the first selective regime having a minor influence on tree topologies (Williamson and Orive 2002). Second, gene conversion could also lead to such divergent haplotype patterns by the replacement of a segment of NAT1 with a tract from its nearby paralogs (NAT2 and/or NATP). This alternative is unlikely, however, since the 17 SNPs separating the two divergent NAT1 lineages are not physically clustered (fig. 2) as one would expect after gene conversion between duplicated loci (Innan 2003). Thus, if gene conversion formed the basis of such a haplotype pattern, multiple conversion events must be invoked, with some tracts of lengths <5 bp. Yet, the conversion-tract lengths have been estimated to range from 55 bp to 290 bp, through sperm-typing analyses (Jeffreys and May 2004).
In this view, an alternative and most likely scenario to explain our data is a demographic event such as ancient population structure. A number of studies have recently reported gene genealogies that present not only unexpectedly old coalescent times (2 MYA) but also long basal branches (Harris and Hey 1999; Webster et al. 2003; Barreiro et al. 2005; Garrigan et al. 2005; Hayakawa et al. 2005). Our observations at NAT1, together with these studies, further support the view that some diversity in the genome of modern humans may have persisted from a structured ancestral population (Harding and McVean 2004). In addition, NAT1*11A appears to be absent in sub-Saharan Africa, since it was not detected in either our genotyping panel of 144 sub-Saharan Africans from distinct geographic locations or 600 African American individuals reported elsewhere (Upton et al. 2001). Therefore, the observation that the NAT1 gene tree is rooted in Eurasia questions the geographic location of such a structured ancestral population (Takahata et al. 2001). The origins of NAT1*11A could thus be placed either in sub-Saharan Africa, from where it must have subsequently disappeared, or in Eurasia. Should the latter be the case, the NAT1 gene tree is at odds with the commonly accepted replacement hypothesis (Lewin 1987) and is more parsimoniously explained by the occurrence of partial hybridization between modern humans expanding from Africa and preexisting hominids in Eurasia, as recently sustained by the RRM2P4 locus (Garrigan et al. 2005). However, such inferences require further support from the analyses of multiple independent loci in increased numbers of samples and human populations.
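
For a rough sense of why a TMRCA of about 2 million years stands out, here is a back-of-the-envelope sketch in Python of the standard neutral coalescent for an autosomal locus. The expected TMRCA is 4·Ne·(1 − 1/n) generations for n sampled chromosomes. Every parameter value below is my own assumption for illustration (effective size of 10,000, 25-year generations, 100 chromosomes), not a setting from the paper. The simulated tail fraction therefore need not match the reported P = .029, which comes from the authors' own model and parameters.

```python
import random


def expected_tmrca_generations(n, ne):
    """Expected TMRCA of n chromosomes, in generations, under the standard
    neutral coalescent for a diploid population of effective size ne."""
    return 4 * ne * (1 - 1 / n)


def simulate_tmrca_generations(n, ne, rng):
    """Draw one TMRCA by summing coalescent waiting times.

    While k lineages remain, the waiting time (in units of 2*ne generations)
    is exponential with rate k*(k-1)/2, the number of lineage pairs.
    """
    total = 0.0
    for k in range(n, 1, -1):
        total += rng.expovariate(k * (k - 1) / 2)
    return total * 2 * ne


# All values below are illustrative assumptions, not taken from the paper.
NE = 10_000              # assumed effective population size
GEN_YEARS = 25           # assumed generation time in years
N_CHROM = 100            # assumed number of sampled chromosomes
OBSERVED_YEARS = 2.01e6  # point estimate of the NAT1 TMRCA quoted above

rng = random.Random(1)
draws = [simulate_tmrca_generations(N_CHROM, NE, rng) * GEN_YEARS
         for _ in range(20_000)]
expected_years = expected_tmrca_generations(N_CHROM, NE) * GEN_YEARS
tail_fraction = sum(t >= OBSERVED_YEARS for t in draws) / len(draws)

print(f"expected TMRCA: {expected_years:,.0f} years")
print(f"fraction of simulated genealogies at least as old as observed: "
      f"{tail_fraction:.3f}")
```

Under these assumptions the expected TMRCA is just under a million years, so the NAT1 estimate is about twice the expectation. That is unusual but not impossible, which is why the authors work through selection and gene conversion before settling on ancient population structure.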

And all this from some cooked meat genes.

References:

Patin E et al. 2006. Deciphering the Ancient and Complex Evolutionary History of Human Arylamine N-Acetyltransferase Genes. Am J Hum Genet (online early).