HLA class-I loci in Neandertals and Denisova

With draft sequences of genomes from several Neandertals and from Denisova, we can begin to investigate known human variations that affect phenotypes. In practice, this is a very simple approach -- take alleles that we know exist in recent human populations, and see if they are in the DNA sequences of these ancient people. My lab has been following this line of research, trying to get information about aspects of biology that are not evident from the skeleton. The immune system is one of the most fascinating, both because of its extensive variation in living people, and because we might be able to test hypotheses about the diseases and parasites that ancient humans faced.

Today Science has released an early manuscript edition of a paper by Laurent Abi-Rached and colleagues (bibliographic information not yet available), which identifies the HLA class-I alleles present in the three highest-coverage Neandertal genomes from Vindija (Vi 33.16, 33.25, and 33.26) and the Denisova pinky genome. The paper is very brief and fairly straightforward, providing provisional HLA class-I allele types for these individuals, discussing possible haplotype associations among these alleles that may have been in the ancient genomes, and providing the frequency of those alleles in present-day human populations.

These archaic individuals carried HLA types that are presently rare in Africa and more common outside of Africa, supporting the hypothesis that these alleles in living people originated in those archaic populations. The linkage between alleles at different HLA class-I genes also supports that hypothesis. The present immune system biology of humans was strongly shaped by the interaction of different regional populations of archaic humans.

The title of the paper calls this "multiregional admixture", and the word "introgression" appears 8 times. Good for us!

(This is the point where I grumble about the lack of citations in this paper....OK, done grumbling.)

Selected genes may have a very different pattern from neutral genes

This paper is the first demonstration that gene variants of functional importance were not only inherited from Neandertals and Denisovans but were valuable and selected in later populations.

We already knew that humans today have gene variants from these archaic humans. Neandertal genes presently account for around 3 percent of the genomes of people outside sub-Saharan Africa. My lab has been studying the pattern of frequency of these genes ("Europe and China have different Neandertal genes"). Most of the genes shared between the Neandertal genome and living people outside Africa are presently very rare -- most occur only in a single individual in our sample of Europeans and Chinese people, for example.

These HLA class-I alleles are different. Some of them are quite common today. If they came from the Neandertals and Denisovans -- that is, if they were not present in the African people who make up most of our ancestry genome-wide -- then these alleles must have increased quite a lot during the recent evolution of people outside Africa.

The best explanation for the large increase in frequency of these genes in modern human populations is selection. If readers want to get an introduction to the scientific literature on the topic of functional genes, I can suggest a detailed review paper I wrote with Greg Cochran on the dynamics of introgression and selection as applied to Neandertals (two-s), and a review paper we wrote in Trends in Genetics about identifying genes in living humans that may have come from archaic populations (Hawks:legacy:2008). In both papers, we discuss the dynamics of functional genes that may be affected by selection in modern human populations and how they differ from the predictions for neutral loci affected only by genetic drift. The new paper by Abi-Rached and colleagues follows on that line of inquiry.
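
A rough way to see why selection is the natural explanation is to ask how quickly a favored allele can rise from rarity. The sketch below uses a textbook deterministic model of genic selection, in which the odds p/(1-p) of the allele grow by a factor of (1 + s) each generation. The starting frequency, the selection coefficients, and the function names are illustrative assumptions, not values from the paper; only the 48 percent target is a frequency mentioned in this post.

```python
import math


def frequency_after(p0: float, s: float, generations: float) -> float:
    """Allele frequency after some generations of genic selection.

    Model (an assumption for illustration): each generation the odds
    p / (1 - p) of the favored allele are multiplied by (1 + s).
    """
    odds = (p0 / (1.0 - p0)) * (1.0 + s) ** generations
    return odds / (1.0 + odds)


def generations_to_reach(p0: float, p_target: float, s: float) -> float:
    """Generations needed to go from p0 to p_target under the same model."""
    odds_ratio = (p_target / (1.0 - p_target)) / (p0 / (1.0 - p0))
    return math.log(odds_ratio) / math.log(1.0 + s)


if __name__ == "__main__":
    p0 = 0.03        # hypothetical starting frequency after admixture
    p_target = 0.48  # highest present-day frequency mentioned in this post
    for s in (0.005, 0.01, 0.05):  # hypothetical selection coefficients
        t = generations_to_reach(p0, p_target, s)
        print(f"s = {s}: about {t:.0f} generations")
```

Under drift alone, the expected frequency of an allele does not change from its starting value, so a rare introgressed allele would be expected to stay rare on average. That contrast is what makes the common HLA alleles stand out.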

I think the hypothesis of adaptive introgression is very likely, and that we shouldn't be at all surprised that the immune system might house many good examples of it.

A look at the most extreme examples, involving the Denisova genome, shows the extent to which these functional genes might reflect introgression well beyond that indicated by most of the genome. The HLA class-I alleles present in the Denisova genome are most common today in South Asia (HLA-C*12:02, and HLA-C*15, which is also common in Australia) and Southeast Asia (HLA-A*11). These regions of the Old World have no substantial evidence of Denisova inheritance across their genomes. Yet they may very well have substantial frequencies (up to 48 percent for HLA-A*11) of HLA class-I alleles from the archaic Denisovan population.

Reasons to be cautious

This is the point where I have to make a note of caution. Even though I personally think it is likely that these HLA alleles really did introgress into the modern population from Neandertals and Denisovans, their geographic pattern really isn't enough to demonstrate this without question.

Reports earlier this summer described some of the work this group was doing on HLA class-I loci, including a public lecture by PI Peter Parham. I noted at the time that the geographic distribution of the alleles mentioned in that lecture seemed a mismatch for the hypothesis of a Denisovan origin for the alleles ("The immune systems of archaic humans"). For example:

HLA-A*11 is very common in Papua New Guinea, but it is also very common in north India and in China. These two areas otherwise show no significant evidence of Denisova ancestry. We might conclude that the HLA-A gene just has an unusually high level of introgression into Asian populations, not typical of the genome as a whole. That's certainly possible. But without finding any substantial number of derived mutations in the HLA-A*11 variant in the Denisova genome and in living Asians, it is hard to rule out that the sharing of HLA-A*11 in all these populations is just coincidence.
Of course, if the allele were absent in Africa, that would weigh in favor of the idea it is shared by Late Pleistocene interbreeding outside Africa. But HLA-A*11 is in Africa, just very rare. And it's in Europe. This is the kind of locus that is difficult to interpret: if it has any tiny disadvantage against malaria, for instance, its rarity in Africa is easily explained as a function of recent evolution, while its presence almost everywhere outside Africa would be no surprise even if there were never any interbreeding.

The story of HLA-C*12:02 is similar. It's common in PNG, but also broadly across South Asia and into Iran, areas where no substantial evidence of Denisovan ancestry has been demonstrated.

Introgression under selection is a good hypothesis for why these alleles should be so much more broadly distributed than the evidence from the rest of the genome would suggest. But introgression isn't the only explanation, because the alleles might have been retained by balancing selection, with recombinant haplotypes suppressed by purifying selection. We might use haplotype age to test the hypothesis. If the alleles were retained from the ancestral population by incomplete lineage sorting (ILS), they would look much older than if they came in from an archaic population by introgression. But as I'll describe below, in this case we actually have the opposite problem: these haplotypes look too young to have come in by introgression, likely a consequence of selection long after the Neandertals and Denisovans had contributed their genes to us.

The curious case of HLA-B*73

If I agree that the results of this paper are pretty likely, why am I still cautious? Well, the most confusing thing in this paper is an allele described in great detail that they didn't find in the archaic genomes. And I know from experience that not finding things is a pretty common occurrence when we go looking for odd things that might have come from Neandertals.

There's a detective story here that probably explains the initial interest of this group in the Neandertal genome, but it just didn't pan out in their search through the archaic genomes. The allele is HLA-B*73.

Parham and colleagues (Parham:HLA:1994) first characterized this allele, which is remarkably different from other HLA-B alleles. Homologs of HLA-B*73 are present in living apes, suggesting that the different human alleles originated before we diverged from gorillas. The retention of such an ancient allele in humans isn't a surprise in the HLA system, because many very divergent alleles have been kept in the population across evolutionary time by balancing selection. What's a bit surprising about HLA-B*73 is its limited diversity in living people. It appears to have persisted in humans throughout our evolution, but people today who carry the allele have very similar sequences, and it is nearly always linked to one single allele at the nearby gene, HLA-C (HLA-C*15). Also, the allele is very rare inside Africa and reaches its highest frequency in West Asia, where it occurs in only 4.5 percent of people. Because of this strange pattern, Parham and colleagues suggested that the allele may have been inherited from Neandertals.

When I was in graduate school working on modern human origins, I took a special interest in genes that had this pattern of variation. HLA-B*73 was not the only one; there are others.

The variation of the HLA-B*73 allele and its association with HLA-C*15 correspond very well to the predictions we presented in our paper on identifying introgression from archaic humans (Hawks:legacy:2008). It's a highly divergent allele in humans compared to others, and it appears not to have recombined much with nearby genes, suggesting it was sequestered in another population through much of the diversification of present-day HLA alleles. But the HLA system is actually a rotten place to look for this kind of evidence, because there are many, many instances where ancient alleles have been retained in human populations by balancing selection. As we pointed out in 2008, a deep root to the gene tree and a rarity of recombination can be good evidence of introgression, but balancing selection and inhibitions to recombination are alternatives to introgression for explaining this pattern of variation.

There's no necessary contradiction between the two processes, and ancient DNA in this case could establish that the allele was both under selection and came from archaic humans. The problem: they didn't find the allele in the archaic genomes.

So why did they spend so much time in this paper discussing this allele? My guess is that they were surprised not to find it. But they did find HLA-C*15 in the Denisova genome, which is often linked to HLA-B*73 in living people who carry it. That makes for an indirect argument:

C*12:02 and C*15 were formed before the Out-of-Africa migration (Fig. 2H and fig. S15) and exhibit much higher haplotype diversity in Asia than in Africa (fig. S16), contrasting with the usually higher African genetic diversity (20). These properties fit with C*12:02 and C*15 having been introduced to modern humans through admixture with Denisovans in west Asia, with later spreading to Africa (21, 22) (Fig. 1F and fig. S11 for C*15). Given our minimal sampling of the Denisovan population it is remarkable that C*15:05 and C*12:02 are the two modern HLA-C alleles in strongest LD with B*73 (Fig. 1E). Although B*73 was not carried by the Denisovan individual studied, the presence of these two associated HLA-C alleles provide strong circumstantial evidence that B*73 was passed from Denisovans to modern humans.

I would go one simpler: Given that HLA-B*73 is most common today in West Asia, I suspect it came from West Asian Neandertals. There's no reason why the HLA genes of European Neandertals should have been identical to West Asian Neandertals. Today's Europeans are different from today's West Asians in the frequencies of these alleles, so why not in the past as well? For that matter, we really only have two alleles from European Neandertals for HLA-B (since the paper finds that

Why do the Vindija Neandertals all have the same HLA types?

It's a pretty good question. The paper cannot distinguish the genotypes from these three individuals. That's not the same as saying they're exactly the same type, since the sequences are very low coverage, but probably they were. Here's what the paper says:

Genome-wide analysis showing three Vindija Neandertals exhibited limited genetic diversity (3) is reflected in our HLA analysis: each individual has the same HLA class I alleles (fig. S17). Because these HLA identities could not be the consequence of modern human DNA contamination of Neandertal samples, which is <1% (3), they indicate these individuals likely belonged to a small and isolated population (fig. S18).

Still, I think this indicates a pretty high degree of inbreeding among these individuals. I wonder what the organ registry for Neandertals would have looked like.

(Not so) final words

I have more to write on the topic of linkage disequilibrium among these genes. The rate of recombination between HLA-B and HLA-C is high enough that a haplotype between these genes should have mostly decayed in the time since our mixture with archaic humans. HLA-C and HLA-A are an order of magnitude further apart, so linkage between alleles of these genes should have been totally erased in the time since any archaic admixture.
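
The decay argument can be put in rough numbers using the standard expectation that linkage disequilibrium between two loci shrinks by a factor of (1 - r) per generation, where r is the recombination fraction between them. The sketch below assumes a hypothetical 2,000 generations since admixture and hypothetical recombination fractions that differ by an order of magnitude; none of these values come from the paper, and the function name is my own.

```python
def ld_remaining(r: float, generations: int) -> float:
    """Fraction of the initial LD (D_t / D_0) left after the given generations.

    Standard neutral expectation: D_t = D_0 * (1 - r) ** t.
    """
    return (1.0 - r) ** generations


if __name__ == "__main__":
    generations = 2000  # hypothetical time since admixture
    pairs = {
        "HLA-B / HLA-C": 0.001,  # hypothetical recombination fraction
        "HLA-C / HLA-A": 0.01,   # hypothetical, ten times larger
    }
    for name, r in pairs.items():
        fraction = ld_remaining(r, generations)
        print(f"{name}: {fraction:.2e} of the initial LD remains")
```

With these assumed values, only a modest fraction of the original association survives for the closer pair of genes, and essentially none survives for the more distant pair.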

That means that the extended haplotypes reported in this study must reflect selection in the period since the population mixture and introgression. The story isn't a simple case of inheritance from archaic humans; it is rather more complex. But more on that later.

I think this paper confirms that it will be really productive to look at archaic genomes for variants present in living humans. Identifying modern human alleles in a Neandertal isn't really very exciting science, though. I've been doing this on my blog for a year now. It's a tricky job to type these HLA alleles, compared to genotyping many other genes, as we discovered. Still, I never really expected that reporting on genotypes in the public domain would be sufficient to get printed in Science.

Still, this set of three genes is particularly interesting. And the paper does add evidence from one additional locus, KIR3DS1, which also has the pattern where an allele rare in Africa but common in Asia is present in the Denisova genome.

If it turns out that we have widespread adaptive introgression in Asia today from Denisovans, that will change the game of studying the origins of these populations. Based on the genome-wide comparison, it looks like the genetic interaction that led to the habitation of Asia did not involve Denisovans, who contributed only to populations at the most eastern extreme of habitation in island Southeast Asia. But the only Denisovans we know about lived near the geographic center of the Asian landmass, not at its southeastern extreme.

The HLA pattern may suggest a more widespread pattern of mixture across Asia, which was later overwritten by population movements of people who didn't have Denisovan ancestry. That means that the habitation of Asia was a process of successive migrations and replacements, which imperfectly covered up the evidence of archaic intermixture. The genes that remain as signs of this intermixture are those that had selective advantages in later populations.