A problem of fuzzy mammoths

Paleogenomics is changing the way we study evolution. In a number of cases, it now allows us to study extinct organisms with the same methods we use for living ones. A study last year in PLoS Biology Rohland:2010 used genetic evidence from living elephants and from extinct mammoths and mastodons to reconstruct when these species diverged.

Woolly and Columbian mammoths

Mammoths are back in the news this week because of a paper by Jacob Enk and colleagues Enk:2011. I think this paper represents a very nice collaboration of paleontologists (Dan Fisher, Ross MacPhee) and paleogeneticists (led by Hendrik Poinar's lab). It's refreshing to read a paper that describes not only the way that the DNA was sampled but also the age and morphological attributes of the sampled mammoths. For example:

This 60+ year old bull is exceptionally well preserved, and exhibits the classic character suite of his species, including low molar lamellar frequency (Figure S1 in Additional file 3), broadly divergent tusk alveoli, a markedly downturned mandibular symphysis, and tremendous body size. We used tusk fragments for the shotgun sequencing, and both tusk and bone samples for PCR and Sanger sequencing.

Every genetics paper should have descriptions like that. Very nicely done.

As an anthropologist, I pay a lot of attention to studies of elephants, because they are another long-lived social mammal, in some ways closer to us in population structure and dynamics than most primates. As in the case of hominins, some taxonomists have argued that we should recognize lots of fossil elephant species, while others question that distinctiveness. And just as we are discovering for hominins, the elephants are showing evidence for population mixture among groups once considered to be different species.

Enk and colleagues sampled the mtDNA from two Columbian mammoths and one woolly mammoth from North America. The Columbian mammoth is seen by pretty much everybody as a separate species (Mammuthus columbi) from woolly mammoths (Mammuthus primigenius), and paleontologists have thought that they diverged 1-2 million years ago. Woolly mammoths were Holarctic animals, with a range that extended from Europe to North America, while Columbian mammoths were limited to the Americas south of the U.S.-Canada border, roughly. Already other researchers have recovered dozens of woolly mammoth sequences, and their phylogenetic relations are well characterized (as shown in the paper). What Enk and colleagues show is that the two Columbian mammoths both have mtDNA sequences that belong to a single, relatively young clade that is present in woolly mammoths in Alaska and Yukon.

The simplest explanation is that the Columbian and woolly mammoths of North America were exchanging genes.

The authors also suggest the possibility of incomplete lineage sorting (ILS) -- the retention of a single ancestral clade in two isolated species. This seems unlikely given the topology of the clade within woolly mammoths, but the authors omitted the crucial test: the date of the most recent common ancestor of the mtDNA within the clade. If it's truly younger than a million years, we might easily rule out ILS.
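The logic of that test can be sketched in a few lines. This is a minimal illustration, not the authors' analysis: the function name and the example intervals are hypothetical, and the only figure taken from the text is the 1-2 million year divergence range that paleontologists have suggested. Incomplete lineage sorting requires the shared mtDNA lineages to trace back to before the species split, so a clade whose common ancestor is clearly younger than the split cannot be explained that way.

```python
def ils_status(tmrca_interval, split_interval):
    """Classify whether incomplete lineage sorting (ILS) remains possible.

    Both arguments are (low, high) age intervals in years before present.
    ILS needs the shared clade's common ancestor to predate the species split.
    """
    tmrca_low, tmrca_high = tmrca_interval
    split_low, split_high = split_interval
    if tmrca_high < split_low:
        # The clade is younger than any plausible split date.
        return "ILS ruled out"
    if tmrca_low > split_high:
        # The clade is older than any plausible split date.
        return "ILS possible"
    # The two intervals overlap, so the dates cannot decide the question.
    return "inconclusive"


# Divergence range suggested by paleontologists (from the text above).
SPLIT = (1_000_000, 2_000_000)

# Hypothetical clade ages, for illustration only; not estimates from the paper.
print(ils_status((200_000, 600_000), SPLIT))
print(ils_status((800_000, 1_200_000), SPLIT))
```

With real data, the intervals would be the credible intervals from a dated mtDNA phylogeny, and an overlapping pair would leave the question open.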

Forest and savanna elephants

A lot more information about the variation within living elephantids has appeared within the past year. Looking at them compared to the fossil species, it's pretty clear that taxonomists haven't done well matching taxonomic levels in these groups. Here is a quote from the paper by Rohland and colleagues, who considered the genetic relationships of forest and savanna elephants in Africa.

We also find that savanna and forest elephants, which some have argued are the same species, are as or more divergent in the nuclear genome as mammoths and Asian elephants, which are considered to be distinct genera, thus resolving a long-standing debate about the appropriate taxonomic classification of the African elephants.

Forest and savanna elephants may deserve a species rank, but we might equally say that the mammoth-Asian elephant divergence doesn't merit the genus rank it has historically been given. As reconstructed in the paper, the forest-savanna elephant and Asian elephant-mammoth divergences both fall within ranges from 2.5 to 5.5 million years. Some widely-recognized mammalian genera (e.g., Homo) are younger, but most mammalian divergences in this range of times are recognized below the genus rank. Should mammoths be put into Elephas? That would probably be a better recognition of the adaptive radiation of Eurasian elephants.

One way to consider the question is by examining the pattern of speciation. With a large number of sampled loci, a far more detailed consideration of speciation can be achieved. This brings us back to a more careful examination of ILS.

We find a higher rate of inferred [Incomplete Lineage Sorting (ILS)] in forest and savanna elephants than in Asian elephants and mammoths: (FE+SE)/(AL+ML) = 3.1 (P = 4×10⁻⁸ for exceeding unity; Table 2), indicating that there are more lineages where savanna and forest elephants are unrelated back to the African-Eurasian speciation than is the case for Asian elephants and mammoths (Table 2). This could reflect a history in which the savanna-forest population divergence time T_FS is older than the Asian-mammoth divergence time T_AM, a larger population size ancestral to the African than to the Eurasian elephants, or a long period of gene flow between two incipient taxa. (We use upper case T to indicate population divergence time and lower case t to indicate average genetic divergence time (t ≥ T)).
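The first two explanations in that passage can be made concrete with the standard neutral coalescent. This is a rough sketch and not the model Rohland and colleagues fit: it assumes a constant-size ancestral population with no gene flow, and every number in it (split times, population sizes, generation time) is a hypothetical placeholder. Under those assumptions, two lineages sampled from sister populations fail to coalesce between their own split and the deeper African-Eurasian split with probability exp(-Δ/(2N)), where Δ is the interval in generations and N is the effective size of the ancestral population.

```python
import math


def prob_unsorted(t_split, t_deep, n_e, gen_time=25.0):
    """Probability that two lineages stay uncoalesced between two splits.

    t_split: age in years of the split between the two sister taxa.
    t_deep:  age in years of the older split (must exceed t_split).
    n_e:     diploid effective size of the population ancestral to the sisters.
    gen_time: years per generation (an assumed value).
    """
    generations = (t_deep - t_split) / gen_time
    return math.exp(-generations / (2.0 * n_e))


# Hypothetical values chosen only to show the direction of each effect.
baseline = prob_unsorted(3e6, 6e6, 20_000)
older_split = prob_unsorted(4e6, 6e6, 20_000)   # sister taxa diverged earlier
larger_pop = prob_unsorted(3e6, 6e6, 40_000)    # bigger ancestral population

print(f"baseline:     {baseline:.3f}")
print(f"older split:  {older_split:.3f}")
print(f"larger N:     {larger_pop:.3f}")
```

Either an older split or a larger ancestral population raises the probability relative to the baseline. The third explanation, prolonged gene flow, is not modeled here.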

"A long period of gene flow" would reflect a very gradual speciation event, which might argue that the two resultant species should be classified in the same genus. Or...it might suggest that the ecological differentiation actually commenced much earlier in time than the modal estimate, with later hybridization. Mammoths and Asian elephants, by contrast, seem to have a cleaner separation even though the genetic relationships are almost equally close.

We're not quite able to test these alternatives, yet, because only a single individual has been sampled from most of these species. Testing for gene flow really will require larger samples of individuals. In particular, the longer geographic distance between Asian and mammoth samples compared to forest-savanna samples may mean that population structure is hiding within this comparison. I just find it remarkable that genetics has arrived at a point where the pattern of speciation of extinct species is within reach.

The paper uses the extinct mammoth and mastodon comparisons as a frame for discussing the diversity and distinctiveness of African forest elephants. This is in a way unfortunate, because the mammoth-centric questions are probably more interesting to most readers. There's still a lot of productive biology to do there. But the status of forest elephants is a useful hook to hang a paper upon. Whether forest elephants should be given the status of a species has been a hot topic in proboscidean evolutionary biology during the past 10 years. Debruyne Debruyne:2005 gave a good historical review of the issues:

Indeed, when discovered by Matschie in 1900, [forest elephants] were described as either a potential species, or a regional race of Cameroon (Matschie, 1900). Matschie advocated the usefulness of hydrographical basins in order to subdivide African elephants into distinct units. He thus contributed to the profusion of new taxa to be defined by the turn of the 20th century, so that the taxonomy of the African elephant quickly became extravagant, the most meagre morphological evidence being used to acknowledge a new form (Lyddeker, 1907). Up to 22 forms of Loxodonta were described that were finally assigned either to the savannah or the forest elephant -- see Laursen and Bekoff (1978) for a review. Morphologists have addressed this question for decades according to their personal taxonomic perspectives. Some have considered that, although displaying a smaller size, smaller round ears -- responsible for their designation as cyclotis -- more toenail structures on both feet, thin down-pointing tusks and a flatter back and forehead, forest elephants belong to the same species -- i.e., Loxodonta africana -- as savannah elephants with whom they assumed were reproductively compatible (Backhaus, 1958; Carroll, 1988; Cousins, 1996). Many cases of intermediate morphology have supported this view, which had become prevalent (Laursen and Bekoff, 1978). Conversely, the splitter attitude led other authors to put forest elephants apart on the basis of the same anatomical distinctiveness (Frade, 1931; Frade, 1933; Allen, 1936; Petter, 1958). More doubtful morphological characters -- extent of hair-covering, color of the skin, carriage of head -- have been put forward to support this division.

The problem became complicated upon recovery of genetic information. Most early phylogeography has been done using mtDNA. The deepest mtDNA clade in the African elephants defines two haplogroups, both of which are shared by the forest and savanna populations. Judging from large samples of mtDNA alone, the two populations appear to have been exchanging genes recently.

Early analyses of nuclear microsatellites indicated the opposite pattern, with relatively little allele sharing between the two elephant varieties. I became interested in the question after a paper by Régis Debruyne (a coauthor on the current paper by Enk and colleagues as well). Debruyne emphasized the great gaps in our sampling of geographic variation in African savanna elephants. Providing some additional data, he showed a very deep mtDNA clade in many forest elephants that was also in many savanna elephants. He argued that the widespread evidence of gene flow refutes the hypothesis of different biological species of elephants.

Rohland and colleagues also addressed the discordance between mtDNA and nuclear genetic variation.

Our study also infers a strikingly deep population divergence time between forest and savanna elephant, supporting morphological and genetic studies that have classified forest and savanna elephants as distinct species [13],[16]. The finding of deep nuclear divergence is important in light of findings from mtDNA, which indicate that the F-haplogroup is shared between some forest and savanna elephants, implying a common maternal ancestor within the last half million years [21]. The incongruent patterns between the nuclear genome and mtDNA (cytonuclear dissociation) have been hypothesized to be related to the matrilocal behavior of elephantids, whereby males disperse from core social groups (herds) but females do not [13],[38]. If forest elephant female herds experienced repeated waves of migration from dominant savanna bulls, displacing more and more of the nuclear gene pool in each wave, this could explain why today there are some savanna herds that have mtDNA that is characteristic of forest elephants but little or no trace of forest DNA in the nuclear genome [13],[14],[39],[40].

The scenario may fit the facts. It was first proposed by Roca and colleagues Roca:2004, who described it as a "genomic record of ancient habitat changes" that had brought the forest and savanna populations into contact across shifting hybrid zones. They reiterated the hypothesis in a later paper Roca:2007, supported with larger samples.

Further progress will require larger samples and better models. I was interested in Debruyne's account of the geographic holes in genetic sampling across the African range of forest elephants. A highly-resolved test of recent gene flow demands finding and sampling potential contact zones between two populations. Some hypotheses can be tested surprisingly strongly using only a single individual from each population. But the power of such tests depends on the pattern of inbreeding in the past. We can imagine that the ancestry of a single individual stretches through the genealogical network of a species like a cone, widening into the past. Recent events are poorly tested by single individuals.

If geographic structure is strong enough, distant populations will approximate different species in their recent genealogical connections. So the single individuals in the more recent study by Rohland and colleagues Rohland:2010 carry a lot of weight.

There are many parallels here between hominin population dynamics and the elephants. Also, as I pointed out in 2006, the elephant situation helps to clarify how we should consider genetic samples from living great apes.

The past year has seen a real reversal in the race between data and analysis. For a long time, sequencing has been a bottleneck in serious analysis of population history. The genealogical connections among individuals double in every generation, so that the inheritance of a single gene reflects one possibility among countless trillions. If we can only afford to sequence a single gene, we are limited to a single sample of the genealogical links among individuals. Whole genomes give enormous samples of the genealogical history among individuals. But they create their own challenges of analysis.
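The doubling is easy to put numbers on. As a small sketch, the snippet below counts the ancestor positions in one individual's pedigree g generations back, which is 2^g. The 25-year generation time is an illustrative assumption, not a figure from any of the papers discussed.

```python
def pedigree_slots(generations):
    """Number of ancestor positions in a pedigree this many generations back."""
    return 2 ** generations


GEN_TIME_YEARS = 25  # assumed generation time, for scale only

for g in (10, 20, 40):
    years = g * GEN_TIME_YEARS
    print(f"{g} generations (~{years} years): {pedigree_slots(g):,} slots")
```

By 40 generations the count already exceeds a trillion, far more than the number of individuals alive at the time, so the same ancestors must fill many positions at once. A single gene traces just one path through that network, which is why one locus is such a thin sample of it.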