population growth

SNPtastic India

The cover story in Nature this week is a paper about the population history of India, from David Reich's lab. It's an important contribution to our knowledge of human genetic variation, and provides a very interesting set of data for further investigation of modern human origins, the dispersal of agriculture into the subcontinent, and the history of more recent Indian populations.

Here's the abstract:

India has been underrepresented in genome-wide surveys of human variation. We analyse 25 diverse groups in India to provide strong evidence for two ancient populations, genetically divergent, that are ancestral to most Indians today. One, the 'Ancestral North Indians' (ANI), is genetically close to Middle Easterners, Central Asians, and Europeans, whereas the other, the 'Ancestral South Indians' (ASI), is as distinct from ANI and East Asians as they are from each other. By introducing methods that can estimate ancestry without accurate ancestral populations, we show that ANI ancestry ranges from 39–71% in most Indian groups, and is higher in traditionally upper caste and Indo-European speakers. Groups with only ASI ancestry may no longer exist in mainland India. However, the indigenous Andaman Islanders are unique in being ASI-related groups without ANI ancestry. Allele frequency differences between groups in India are larger than in Europe, reflecting strong founder effects whose signatures have been maintained for thousands of years owing to endogamy. We therefore predict that there will be an excess of recessive diseases in India, which should be possible to screen and map genetically.

The sample is not huge by the standards of population genetic analysis -- only 132 people from 25 groups -- but it is substantial compared with other recent samples. It is around double the effective number of individuals in any of the HapMap v.1 populations, and the individuals were genotyped at more than 560,000 SNPs.

The results of the study bear on basic population genetic issues, including the degree of endogamy, the pattern of regional differentiation, and the likelihood of discovering new recessive genetic disorders by additional sampling. Some notes:

Population mixture. The authors propose that today's groups descend in varying proportions from two ancient (and no longer existing) populations, which they call "ancestral North Indian" and "ancestral South Indian".

I'm always skeptical of mixture models, especially when the putative source populations no longer exist. There are just too many ways that structured migration or dispersal can lead to the appearance of mixture. People once thought of "Alpines" as a mixture of pure Nordic and Mediterranean elements, after all -- and that was just because their heads were mesocephalic.

Still, with a half-million SNPs, it's possible to do a better job testing the hypothesis of mixture versus structured migration. The authors in this paper didn't -- they applied a simplified "3 Population Test" that compares the empirical allele frequencies to proportions expected under only two scenarios: simple mixture or complete isolation. It seems to me that the null should be simple isolation by distance, which would give the same result as "mixture" according to their test. If you really want to look for population mixture, you need to involve the dimension of time, for example, by demonstrating the antiquity of haplotypes that have mixed together.
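To make the objection concrete, here is a minimal sketch, assuming the three-population test boils down to the average product (c - a)(c - b) of allele frequency differences between a target population C and two reference populations A and B, with a negative value read as evidence of mixture. The allele frequencies are made up, the cline is an idealized linear one, and the published test also handles sampling error and significance, none of which is reproduced here.

```python
import random

def f3(c, a, b):
    """Average of (c - a) * (c - b) across SNPs, given allele frequency lists."""
    return sum((ci - ai) * (ci - bi) for ci, ai, bi in zip(c, a, b)) / len(c)

rng = random.Random(1)
n_snps = 1000

# Hypothetical allele frequencies in two reference populations (not real data).
pop_a = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]
pop_b = [rng.uniform(0.05, 0.95) for _ in range(n_snps)]

# Scenario 1: the target is a one-time 60/40 mixture of A and B.
alpha = 0.6
mixed = [alpha * x + (1 - alpha) * y for x, y in zip(pop_a, pop_b)]
f3_mixture = f3(mixed, pop_a, pop_b)

# Scenario 2: no mixture event at all, just 11 demes strung along an
# idealized linear cline, with A and B sitting at the two ends.
cline = [
    [(1 - k / 10) * x + (k / 10) * y for x, y in zip(pop_a, pop_b)]
    for k in range(11)
]
f3_cline = [f3(cline[k], cline[0], cline[10]) for k in range(1, 10)]

print("mixture:", f3_mixture)
print("interior demes on the cline:", f3_cline)
```

Both scenarios give negative values, because in each case the target's frequencies fall between those of the two references. Under these assumptions the statistic cannot tell a discrete mixture event from an intermediate position along a gradient.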

So I don't accept this ancestral division, certainly not at face value. It does seem plausible that West Asian (and thereby European-related) genes have introgressed into India over time, perhaps in association with the growth of high-density agricultural populations. Maybe some of this gene flow occurred under the influence of positive selection, but processes of elite dominance and differential growth may have been sufficient.

Regional differences. The results show a greater degree of regional genetic differentiation in India than has been found for continental Europe. Still, with an FST of only 0.01, we're not talking about major population splits here. With that number, the subcontinent is closer to panmixia than one might expect for a region its size. The authors suggest that founder effects explain the regional differentiation:

We propose that the high FST among Indian groups could be explained if many groups were founded by a few individuals, followed by limited gene flow. This hypothesis predicts that within groups, pairs of individuals will tend to have substantial stretches of the genome in which they share at least one allele at each SNP. We find signals of excess allele sharing in many groups (Supplementary Fig. 2), which as expected tend to occur in the groups that have the highest FST values from all others (P = 0.002 for a correlation). To estimate the age of founder events, we measured the genetic distance scale over which allele-sharing decays, and verified the robustness of our procedure by simulation (Supplementary Fig. 3). Six Indo-European- and Dravidian-speaking groups have evidence of founder events dating to more than 30 generations ago (Supplementary Fig. 2), including the Vysya at more than 100 generations ago (Fig. 2). Strong endogamy must have applied since then (average gene flow less than 1 in 30 per generation) to prevent the genetic signatures of founder events from being erased by gene flow.

I don't think that explanation works. With those times in generations, we're talking about events within the last 600-2000 years. Since all these calculations are done on the whole dataset assuming complete neutrality, I think we should look more closely at the distribution of LD across loci. It seems likely that some of the high-LD loci that appear to point to founder effects will actually be found to be selected.
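The arithmetic behind those dates is easy to check. The sketch below converts the quoted founder-event ages from generations to years; the generation lengths of 20 and 25 years are assumptions, and the 600-2000 figure corresponds to 20. It also shows, under a deliberately crude model where a fixed fraction m of each generation's gene pool is replaced by migrants, how much founder ancestry would remain at the quoted ceiling of 1 in 30.

```python
# Founder-event ages in generations, taken from the quoted passage.
founder_ages = {"lower bound for six groups": 30, "Vysya": 100}

def to_years(generations, years_per_generation):
    """Convert a time in generations to years."""
    return generations * years_per_generation

for years_per_generation in (20, 25):  # assumed generation lengths
    for label, generations in founder_ages.items():
        years = to_years(generations, years_per_generation)
        print(f"{label}: {generations} generations at "
              f"{years_per_generation} yr/generation = {years} years")

def founder_fraction(m, generations):
    """Fraction of the gene pool still tracing to the founders after the given
    number of generations, if a fraction m is replaced by migrants each
    generation (a simplifying assumption, not the paper's model)."""
    return (1 - m) ** generations

m = 1 / 30  # the upper limit on gene flow given in the quoted passage
for label, generations in founder_ages.items():
    print(f"{label}: {founder_fraction(m, generations):.3f} founder ancestry left")
```

At that ceiling only about a third of the founder gene pool would remain after 30 generations, and a few percent after 100, which is why the quoted argument needs gene flow to have been well below 1 in 30.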

Relationships of Indian to non-Indian populations. One of the real problems of assuming a tree with no migration is that it leads to statements like this:

[T]he ANI [ancestral North Indian] and CEU [HapMap European sample] form a clade, and further analysis shows that the Adygei, a Caucasian group, are an outgroup (Supplementary Note 4). Many Indian and European groups speak Indo-European languages, whereas the Adygei speak a Northwest Caucasian language. It is tempting to assume that the population ancestral to ANI and CEU spoke 'Proto-Indo-European', which has been reconstructed as ancestral to both Sanskrit and European languages, although we cannot be certain without a date for ANI–ASI mixture.

Some of the common ancestors of some living Europeans and some Indians were probably speakers of proto-Indo-European. But we can easily refute the hypothesis that all of the common ancestors were -- some of those common ancestors lived more than 40,000 years ago, as is well known from the mtDNA chronology. The tree model with complete isolation does not explain the data. So as simple as it is -- and as well-used by Cavalli-Sforza and others -- it would be better to use a more accurate model.

UPDATE (2009-09-24): Gene Expression has a full review of the paper.

UPDATE (2009-09-27): Very interesting angle by Suvrat Kher at Reporting on a Revolution:

The Indian Press has made a hash of the finding....

But I can't blame the press entirely. The scientists who gave interviews to the press didn't mention this. They wimped out on reporting this potential inflammatory and politically incorrect finding. This is just poor and irresponsible science outreach on part of the scientists. How can you ignore a finding that is staring out at you from the very paper you are talking about? The press may be guilty of not digging in but it was just reporting what the scientists told them.

References:

Reich D, Thangaraj K, Patterson N, Price AL, Singh L. 2009. Reconstructing Indian population history. Nature 461:489-494. doi:10.1038/nature08365

In 2005 I wrote this:

"Unusual compared to the rest of the genome" is a phrase you should expect to hear a lot of in the next few years.

I was looking back at that old post today, as I'm writing new stuff about bottlenecks. It's about the ability to detect selection using the HapMap data -- written just as I was starting to think about recent selection:

Suppose we wanted to use a detailed topographic survey of a road to find the potholes. But for everyday roads, there is a problem -- there are lots of bumps and grooves that aren't potholes. And different parts of the road are more or less bumpy. It would help a lot if we could use the empirical distribution of bumps to simulate a section of road -- then we could figure out whether anomalies in the real road were likely to be potholes or not.

Now suppose that the road isn't just pocked with the occasional pothole -- it has a pothole every three or four feet. Remember why we're using simulations -- not only do we not know where the potholes are, we don't know how common they are. So our simulations based on the pothole-rich road will find that pothole-sized bumps are normal. If pothole-sized bumps are not unusual, then our simulation can have only one result: a pothole is not a pothole.

So I've been writing about the same problem for over three years -- the problem of ignoring history and archaeology when applying models of population history, and how they skew simulations of genetic drift. Time to do something about it, I guess.
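The pothole argument can be put in numbers with a toy simulation. Everything here is hypothetical: ordinary bumps are drawn from a normal distribution with mean 0, potholes from one with mean 5, and "unusual" simply means the deepest 1 percent of the road being surveyed. Taking the cutoff straight from the surveyed road stands in for building a simulation from its empirical distribution.

```python
import random

def road(n_segments, pothole_fraction, rng):
    """Return (depth, is_pothole) pairs for a hypothetical road.
    Ordinary bumps ~ N(0, 1); potholes ~ N(5, 1). Both are assumptions."""
    segments = []
    for _ in range(n_segments):
        is_pothole = rng.random() < pothole_fraction
        depth = rng.gauss(5.0 if is_pothole else 0.0, 1.0)
        segments.append((depth, is_pothole))
    return segments

def detection_rate(segments, tail=0.01):
    """Call the deepest `tail` share of the road itself 'unusual' and
    return the share of true potholes that end up flagged."""
    depths = sorted(depth for depth, _ in segments)
    cutoff = depths[int(len(depths) * (1 - tail))]
    potholes = [depth for depth, is_pothole in segments if is_pothole]
    return sum(depth > cutoff for depth in potholes) / len(potholes)

rng = random.Random(42)
rare = detection_rate(road(20000, 0.01, rng))    # potholes on 1% of the road
common = detection_rate(road(20000, 0.30, rng))  # potholes on 30% of the road

print(f"share of potholes flagged when potholes are rare:   {rare:.2f}")
print(f"share of potholes flagged when potholes are common: {common:.2f}")
```

When potholes are rare, nearly all of them clear the cutoff. When they cover 30 percent of the road, the cutoff sits deep inside the pothole distribution and almost none are flagged, which is the "a pothole is not a pothole" result.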

The ancient struggle for existence between humans and giant clams

Giant clams are in the news today, helping to drive the expansion of modern humans out of Africa. Can we believe it?

  • The paper (Richter et al. 2008) describes a new species of giant clam, distinct from others in reproductive cycle, habitat preference and size.
  • This new species is mainly found in shallow water reefs.
  • Today, the species makes up a very small proportion of the total Red Sea giant clam count.
  • Before the last interglacial, this species made up as much as 80 percent of the giant clam count, as assessed by shells from reef terraces. This proportion decreased around the last interglacial, and again in historic times.

This sounds like the classic megafaunal exploitation story, as it is being reported. Shells become an important part of human debris in northeastern Africa by 125,000 years ago (Walter et al. 2000), and were important elements of the MSA along the coasts of North and South Africa (McBrearty and Brooks 2000). So it would not be surprising if these people recovered giant clams, particularly if those clams were readily available in shallow water. Giant clams are similar to large tortoises in terms of their recovery and exploitation, and there is already good evidence that tortoise size decreased with overhunting as Late Pleistocene human populations grew. By the Upper Paleolithic, people in some parts of the Mediterranean began to harvest small shellfish to an extent that put pressure on their populations. The giant clams would be an early example of the same phenomenon, made more precarious by the shallow-water habits of this particular clam species.

Since refuting the Neandertal inferiority complex is a theme this week, I should point out that Neandertals who lived on the coast also exploited shellfish, an observation that I discussed here. The exploitation of coastal resources is not specifically “modern”. Coastal populations of terrestrial predators typically eat marine species; for example, coastal brown bears in Alaska systematically harvest soft-shelled and razor clams (Smith and Partridge 2004).

So the clams shouldn’t be surprising. Are they interesting? I think it is another piece of evidence that human populations in Africa during the last interglacial were already large and growing. Archaeological sites from the African Late Pleistocene have been proliferating during the last few decades, but are still underrepresented compared to the density of sites in other regions, especially Europe and the Near East. So you might not get the idea from archaeological sites that the African population was especially large. Yet, across the MSA, we see increasing breadth of faunal exploitation and some systematic recovery of small resources such as shellfish and tortoises. We also see a greater intensity of raw material exploitation and movement, and

Most important, we now have clear genetic evidence for a large and diverse African population during the Late Pleistocene. That includes the mtDNA genealogy, which now supports the interpretation of an effective population size that had perhaps doubled or more by the last interglacial (I discussed that research here). Put that together with the evidence for structure within this ancient population — either regional differentiation or ecological adaptation — and we have some very interesting demographic knowledge about Africa 100,000 years ago.

References


   McBrearty S, Brooks AS. 2000. The revolution that wasn’t: a new interpretation of the origin of modern human behavior. J Hum Evol 39:453–563.

   Richter C, Roa-Quiaoit H, Jantzen C, Al-Zibdah M, Kochzius M. 2008. Collapse of a new living species of giant clam in the Red Sea. Curr Biol 18:1–6. doi:10.1016/j.cub.2008.07.060.

   Smith TS, Partridge ST. 2004. Dynamics of intertidal foraging by coastal brown bears in southwestern Alaska. Journal of Wildlife Management 68:233–240. doi:10.2193/0022-541X(2004)068[0233:DOIFBC]2.0.CO;2.

   Walter RC, et al. 2000. Early human occupation of the Red Sea coast of Eritrea during the last interglacial. Nature 405:65–69.
