More on the X variation conundrum
Last winter I noted the contradiction between two papers that each attempted to explain variation on the X chromosome compared to the autosomes. They had come to opposite conclusions, based on discrepancies in their data. I noticed that they had used different methods of determining mutation rates for X chromosome loci:
So, for their current paper, Keinan and colleagues (2008) try to correct for the recent divergence of human and chimpanzee X chromosomes. Simple enough -- rescale all X chromosome mutation events by some ratio proportional to the human-chimp divergence discrepancies. In this case, they attempt to rescale to the human-macaque divergence. Since that divergence happened in the Oligocene, the discrepancies among chromosomes should be slight compared to the overall divergence. I'd feel better if they actually tested this idea.
Meanwhile, Mike Hammer and colleagues scaled X chromosome diversity to the human-orangutan divergence. They claimed that this gave the same results as the human-chimpanzee divergence. Which, if true, would obviously give a different outcome than the procedure followed by Keinan and colleagues, which was predicated on the idea that the human-chimpanzee X divergence is the wrong number to use.
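To make the normalization issue concrete, here is a minimal sketch of the arithmetic. The function name and every number in it are hypothetical placeholders, not values from either paper. The only point is that the same diversity estimates can give different normalized X-to-autosome ratios depending on which outgroup divergence is used as the mutation rate proxy.

```python
def normalized_x_to_autosome_ratio(pi_x, pi_auto, div_x, div_auto):
    """Return the X/autosome diversity ratio after dividing each
    diversity estimate by its divergence to an outgroup, which serves
    as a proxy for the local mutation rate."""
    if min(pi_auto, div_x, div_auto) <= 0:
        raise ValueError("diversity and divergence values must be positive")
    return (pi_x / div_x) / (pi_auto / div_auto)


# Hypothetical per-site diversity values (illustration only).
pi_x, pi_auto = 0.00060, 0.00100

# Hypothetical X and autosome divergences to two different outgroups.
# The X divergence is proportionally lower for the closer outgroup,
# mimicking a reduced X divergence between closely related species.
outgroups = {
    "closer outgroup": {"div_x": 0.0100, "div_auto": 0.0125},
    "distant outgroup": {"div_x": 0.0560, "div_auto": 0.0620},
}

ratios = {
    name: normalized_x_to_autosome_ratio(pi_x, pi_auto, **d)
    for name, d in outgroups.items()
}

for name, ratio in ratios.items():
    print(f"{name}: normalized X/A ratio = {ratio:.3f}")
```

With these made-up inputs the two normalizations disagree (about 0.75 versus about 0.66), even though the underlying diversity values are identical.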
I had sort of forgotten about this (which drove me crazy at the time), but another question led me to revisit it late this week. In the intervening time, I see that Carlos Bustamante and Sohini Ramachandran (2009) happened across the same explanation that I had offered:
It appears that the rest of the discrepancy is explained by different normalizations for background mutation rate differences between the X chromosome and autosomes (Hammer et al. [10] used human-orangutan divergence and Keinan et al. [9] used human-macaque divergence).
So you read it here first. Which I suppose means that I should submit letters to journals more often. I don't because it seems to me that all I'm doing is reading and trying to understand papers, which sometimes takes more work than it should. On the other hand, I wonder how many people are really putting much effort into their reading...
Meanwhile, Bustamante and Ramachandran add an additional explanation -- the different means of ascertainment, since Mike Hammer's group used resequencing to find variation, while Keinan and colleagues (2008) had used HapMap SNPs under a specific ascertainment model. They end their short piece by pointing out the value of further resequencing data:
In order to address continuing questions on the nature of sex-biased processes, full genome sequencing of large numbers of individuals sampled from diverse populations will be needed. The upcoming 1,000 Genomes Project (http://www.1000genomes.org/), for example, will provide orders of magnitude more data for these types of analyses. We share the enthusiasm of the population genetics community that this will bring the potential for resolving continuing questions regarding how human history and cultural practices have shaped global patterns of genomic diversity.
Ascertainment is a serious issue with the existing SNP data, because different SNPs were ascertained in different, non-commensurable ways. That's how I was led into reconsidering this issue this week: another set of data seems to have features that are partially explained by ascertainment, but partially not. It's hard to use existing data for some kinds of population genetics analysis, although others are less affected by ascertainment biases.
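A toy calculation shows why ascertainment matters. This is a sketch under simplifying assumptions of my own, not the ascertainment model used in any of the papers above: variant frequencies follow the standard neutral expectation (sites with derived allele count i occur in proportion to 1/i), and a SNP is "discovered" only if it is polymorphic in a small panel of chromosomes. The population size and panel size are hypothetical.

```python
N = 1000          # hypothetical number of chromosomes in the population
PANEL_SIZE = 2    # hypothetical number of chromosomes in the discovery panel


def discovery_probability(x, k):
    """Probability that a variant at frequency x appears polymorphic
    in a panel of k chromosomes drawn at random."""
    return 1 - x**k - (1 - x)**k


def weighted_mean(values, weights):
    """Mean of values under the given (unnormalized) weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Derived allele frequencies i/N, weighted 1/i as expected under neutrality.
freqs = [i / N for i in range(1, N)]
neutral_weights = [1 / i for i in range(1, N)]

# Reweight each frequency class by its chance of being discovered.
ascertained_weights = [
    w * discovery_probability(x, PANEL_SIZE)
    for x, w in zip(freqs, neutral_weights)
]

mean_all = weighted_mean(freqs, neutral_weights)
mean_ascertained = weighted_mean(freqs, ascertained_weights)

print(f"mean frequency, all variants:         {mean_all:.3f}")
print(f"mean frequency, ascertained variants: {mean_ascertained:.3f}")
```

In this toy setup the discovered SNPs are much more common on average than variants as a whole, because rare variants are rarely picked up in a small panel. Statistics that depend on the frequency spectrum inherit that skew, and different discovery schemes skew it differently.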
So the 1000 Genomes effort will make some kinds of analyses simpler to accomplish. I suppose if ascertainment becomes less of a problem, we may see people put more effort into understanding non-genetic sources of information, too!
References:
Bustamante CD, Ramachandran S. 2009. Evaluating signatures of sex-specific processes in the human genome. Nat Genet 41:8-10. doi:10.1038/ng0109-8







