Genetic structure of the chominids


In view of the previous post on the genetics of the human-chimpanzee divergence, it seems worth going into a bit more detail about the paper by Innan and Watanabe (2006).

I'm going to start using "chominid" for this population, because I'm tired of typing out the whole "pre-human-chimpanzee ancestral divergence population" phrase! I would keep using "chuman", but it has such a connotation of chimerism (no, not "chimperism") -- and "humanzee" is even more so!

The paper considers a model in which the isolation of the proto-chimpanzee and proto-hominid lineages increases steadily over time at a rate a. The authors then derive a maximum likelihood method for estimating a from human and chimpanzee sequence data. The time at which divergence began is a function of the rate a, so the parameters of the model are a; the speciation time T (the time of complete isolation); h, the relative size of the proto-chimpanzee and proto-hominid lineages; and theta, which equals four times the mutation rate times the effective population size of the ancestral population. The values a and h are themselves interdependent, so for the purposes of the model they may amount to a single parameter.
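To make the definition of theta concrete, here is a minimal sketch of the calculation. The function name and the input values are my own placeholders for illustration; they are not estimates from the paper.

```python
def population_mutation_rate(effective_size: float, mutation_rate: float) -> float:
    """Return theta = 4 * N_e * mu for a diploid population.

    effective_size: effective population size N_e (number of individuals)
    mutation_rate:  mutation rate mu per site per generation
    """
    if effective_size <= 0:
        raise ValueError("effective_size must be positive")
    if mutation_rate < 0:
        raise ValueError("mutation_rate must be non-negative")
    return 4.0 * effective_size * mutation_rate


# Purely illustrative placeholder values, NOT estimates from the paper.
example_theta = population_mutation_rate(effective_size=10_000, mutation_rate=1e-8)
print(f"theta per site = {example_theta:.2e}")
```

Because theta is a product, sequence data alone constrain N_e and mu only jointly; separating them requires an outside estimate of one or the other.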

With a and h counted together, this is a three-parameter model, which is very simple compared to the kinds of complexity that might actually have affected the ancestral population of the human and chimpanzee lineages. So it's appropriate to be cautious about the results. The paper discusses some of the possible violations of its assumptions.

Recombination is an interesting complicating factor, because it should tend to homogenize the coalescence times for different genomic regions within the chominid population. The authors set up their model in such a way that recombination has a conservative effect -- that is, it tends to inflate the estimate of population structure.

The paper is unique in that it doesn't even try to estimate the pre-divergence effective population size or divergence time. The point is to test the hypothesis of population structure, and these issues are separate from that question. A structure involving progressive isolation from an initially panmictic population will tend to dilate the distribution of recent coalescence times (because coalescence is less likely in a structured population) and leave more ancient coalescence times alone (because the population wasn't structured then).

This effect would be similar to an expansion of the effective population size of the ancestral population, and it might be very difficult to tell population expansion from a progressive increase of isolation.
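A rough way to see why structure and expansion look alike is the textbook symmetric two-deme island model. To be clear, this is not the paper's model: it assumes a constant migration rate rather than steadily increasing isolation, it uses the continuous-time approximation, and the function names and numbers are my own placeholders. For two lineages sampled from different demes, the expected coalescence time is 4N + 1/(2m) generations, where N is the diploid size of each deme and m is the per-lineage migration rate.

```python
def mean_pairwise_coalescence_times(deme_size: float, migration_rate: float):
    """Expected pairwise coalescence times (generations), two-deme island model.

    Assumes two demes of diploid size `deme_size` each and symmetric,
    constant migration at `migration_rate` per lineage per generation.
    Returns (within_deme_mean, between_deme_mean).
    """
    if deme_size <= 0 or migration_rate <= 0:
        raise ValueError("deme_size and migration_rate must be positive")
    within = 4.0 * deme_size                        # same as one population of 2N
    between = within + 1.0 / (2.0 * migration_rate)  # extra wait to share a deme
    return within, between


def equivalent_panmictic_size(mean_time: float) -> float:
    """Diploid size of a panmictic population with the same mean pairwise
    coalescence time, using E[T] = 2 * N_e generations."""
    return mean_time / 2.0


# Hypothetical values for illustration only.
DEME_SIZE = 5_000
for m in (1e-3, 1e-4, 1e-5):
    _, between = mean_pairwise_coalescence_times(DEME_SIZE, m)
    print(f"m = {m:g}: mean between-deme time = {between:.0f} generations, "
          f"apparent N_e = {equivalent_panmictic_size(between):.0f}")
```

As m shrinks, the apparent effective size grows without any change in the actual number of individuals, which is the confounding described above.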

It is notable that the paper also provides an estimate of the relative contribution of males and females to the mutation rate. Males are generally thought to have a much higher effective mutation rate, because males tend to reproduce a bit later and because there are many more cell generations leading to spermatogenesis in males than to oogenesis in females. Comparing the X chromosome with the autosomes, the paper finds that this is likely true:

Our ML estimate of alpha (2.7-4.4) from the human-chimpanzee comparison is close to previous estimates (circa 5) from comparisons of distantly related primates (Li 1997; Makova and Li 2002), rather than low estimates from human-chimpanzee sequence data (Bohossian, Skaletsky, and Page 2000; International Human Genome Sequencing Consortium 2001), which were obtained by ignoring the effect of the coalescent in the ancestral population. It is indicated that taking the effect of the coalescent in the ancestral population into account is important in estimating alpha.
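For readers who want the arithmetic behind alpha, here is a sketch of the simple relationship between the male-to-female mutation rate ratio and the X-to-autosome mutation rate ratio. It rests only on the fact that an X chromosome spends two-thirds of its time in females while an autosome spends half. This is the naive conversion, not the paper's maximum likelihood method, and it ignores coalescence in the ancestral population, which is exactly the effect the authors say matters. The function names are mine.

```python
def x_to_autosome_ratio(alpha: float) -> float:
    """Expected X/autosome mutation rate ratio for a male-to-female
    mutation rate ratio `alpha` (alpha = mu_male / mu_female).

    X:        (2 * mu_f + mu_m) / 3
    autosome: (mu_f + mu_m) / 2
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return (2.0 / 3.0) * (2.0 + alpha) / (1.0 + alpha)


def alpha_from_ratio(ratio: float) -> float:
    """Invert x_to_autosome_ratio. Valid only for 2/3 < ratio <= 4/3."""
    if not (2.0 / 3.0 < ratio <= 4.0 / 3.0):
        raise ValueError("ratio must lie in (2/3, 4/3]")
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


# The endpoints of the range quoted above, plus alpha = 1 (no sex difference).
for alpha in (1.0, 2.7, 4.4):
    print(f"alpha = {alpha}: X/A mutation rate ratio = {x_to_autosome_ratio(alpha):.3f}")
```

The ratio flattens toward 2/3 as alpha grows, so small errors in the observed X/autosome ratio translate into large errors in alpha.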

The higher male mutation rate is another reason for the X chromosome to be a bit different. It probably doesn't explain the X's low coalescence time compared to the autosomes, though, since it is possible to use local mutation rates in comparison to other primates to estimate those times.


Innan H, Watanabe H. 2006. The effect of gene flow on the coalescent time in the human-chimpanzee ancestral population. Mol Biol Evol 23:1040-1047.