Inbreeding is for slow news days


At least, AP writer Matt Crenson has taken it upon himself to explain the concept in not one, but two different articles on the subject this weekend (you can find them here, on Yahoo, and here, on MSNBC).

The Slashdot discussion on the first article is even tagged "slownewsday"! Which I have to say, inspires me to want to use tags!

Anyway, the point of both stories is to explain the conclusion of a 2004 Nature paper by Rohde, Olson and Chang (DOI link). Here are two of the concluding paragraphs:

Given the remaining uncertainties about migration rates and real-world mating patterns, the date of the MRCA for everyone living today cannot be identified with great precision. Nevertheless, our results suggest that the most recent common ancestor for the world's current population lived in the relatively recent past -- perhaps within the last few thousand years. And a few thousand years before that, although we have received genetic material in markedly different proportions from the people alive at the time, the ancestors of everyone on the Earth today were exactly the same.
Further work is needed to determine the effect of this common ancestry on patterns of genetic variation in structured populations. But to the extent that ancestry is considered in genealogical rather than genetic terms, our findings suggest a remarkable proposition: no matter the languages we speak or the colour of our skin, we share ancestors who planted rice on the banks of the Yangtze, who first domesticated horses on the steppes of the Ukraine, who hunted giant sloths in the forests of North and South America, and who laboured to build the Great Pyramid of Khufu (Rohde et al. 2004:566, emphasis added).

The study is straightforward mathematically, although it depends on certain assumptions about ancient migration patterns. The observation that humans share many recent genealogical ancestors follows as a simple consequence of the fact that every person has two biological parents. With mathematical certainty, 40 generations into the past yields a thousand times more ancestral lines (2^40 is more than a trillion) than there were people on Earth at the time (less than a billion). Hence, most ancestral lineages must be concentrated into the same small set of people. A few of these ancestral people were fairly cosmopolitan -- they ended up being ancestors of people in lots of different parts of the world today. Go back another 10 or 20 generations, and one of these cosmopolitan people will probably connect everyone in the world. Go back one more generation from this ancestor, and at least two such people exist (her parents!). In fact, all of this person's genealogical ancestry is shared by everyone living today, which over another 30 generations adds up to over a billion possible lines of descent. And so on.
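The doubling arithmetic is easy to check for yourself. Here is a minimal Python sketch; the function and constant names are my own for illustration, and the population figure is just the "less than a billion" upper bound used above, not a demographic estimate.

```python
def ancestral_lines(generations: int) -> int:
    """Potential genealogical lines: two parents per person, so 2**generations."""
    return 2 ** generations

# Assumption: the upper bound quoted in the text ("less than a billion").
WORLD_POPULATION_BOUND = 1_000_000_000

lines_40 = ancestral_lines(40)
# Lower bound on how many potential lines there are per person alive back then.
lines_per_person = lines_40 / WORLD_POPULATION_BOUND

print(f"potential lines at 40 generations: {lines_40:,}")
print(f"lines per person alive (at least): {lines_per_person:,.0f}")
print(f"potential lines at 30 generations: {ancestral_lines(30):,}")
```

Because the population bound is an upper bound, the ratio it prints is a lower bound on how heavily the lines must pile up on the same people.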

Of course, whether we all share such connections 30 generations ago or 300 generations ago depends on how people moved around. If everybody could move far enough to mate with anybody else in the world (the AP article calls this a "global meet market") then everybody was effectively cosmopolitan.

On the other hand, if a few Marco Polos were all that connected different parts of the world to each other, then it might take a lot longer to find one of these cosmopolitan ancestors to connect everyone.

This model has a lot more in common with disease modeling than genetics -- it's sort of an epidemiology of genealogy. Like a global epidemic, these close genealogical ties depend very strongly on a few cosmopolitan people. For example, HIV spread very rapidly across the globe because of a relatively small number of people who had many sexual partners (or intravenous drug injections), and even fewer that had either on more than one continent. Most people who have HIV today are not in either of these classes, but HIV has traveled along these biological pathways to all of them.

Even so, the number of biological pathways along which HIV did travel pales beside the number along which it might have traveled but didn't. A biological network with certain properties (such as nodes with large number of sexual partners) enabled its spread, but the spread of HIV actually employed only a small subset of the total network. From one perspective this means that there are a lot of lucky people who could have contracted HIV but didn't. And of course from another perspective it means that some future epidemic might exploit other parts of the same biological network, for instance arising in some other part of the world but dispersing in a similar pattern.

With the recent genealogical shared ancestors of today's humans, the question involves not the total number of sex partners, but a smaller network -- parent-offspring relationships. But against the smaller network size in each generation we can weigh a much longer total time: HIV is less than 50 years old; here we are dealing with at a minimum thousands of years of genealogy. The genealogical relations within the global population limit the network size; and the statistics are meant to illuminate certain boundary conditions on the network, such as how recently the global population could have shared single common genealogical ancestors.

Geneticists are generally concerned with a subset of this genealogical network. We care not about the number of genealogical ties among people (which must grow indefinitely large very quickly in any population), but about the actual ancestry of genes.

Now, it is quite clear that a trillion possible lines of ancestry must be apportioned among a much smaller number of people -- certainly less than a billion (and naturally, another 10 generations gives us a quadrillion possible lines of ancestry, and so on). It is equally clear that a trillion lines of ancestry cannot possibly be followed by six billion nucleotides of DNA. This means that it is quite unlikely that any particular line of your ancestry 40 generations ago has transmitted any DNA at all down to you.

In practice, even though we have billions of nucleotides, our DNA cannot follow billions of genealogical lines. Recombination over 30 to 40 generations does not divide chromosomes down to individual nucleotides. In the medium term, most human DNA is separated by recombination hotspots into lengths of around 50 kilobases. Across very short spans of 30 generations, DNA is for the most part inherited in chunks of hundreds of kilobases or longer. So dividing six billion nucleotides by 50 kilobases yields a number of around 120,000 ancestral lines at most from which any individual inherits his or her DNA. Recombination will increase this number somewhat further and further back in time, but not nearly so fast as the doubling of possible ancestral lines in every generation. This means that the vast majority of your ancestral lines more than around 17 generations ago have left no DNA to you whatsoever.
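The "around 17 generations" figure falls out of the same numbers: it is roughly where the doubling of potential lines overtakes the count of 50-kilobase blocks. A rough sketch, using the round figures from the paragraph above (the variable names are mine, and the fixed block length is a simplification):

```python
import math

GENOME_NUCLEOTIDES = 6_000_000_000  # six billion nucleotides, as in the text
BLOCK_LENGTH = 50_000               # ~50 kb between hotspots, as in the text

# Upper limit on the number of distinct lines that can contribute DNA.
max_dna_lines = GENOME_NUCLEOTIDES // BLOCK_LENGTH

# Generation at which 2**g potential lines first exceeds that limit.
crossover = math.log2(max_dna_lines)
first_generation_exceeding = math.ceil(crossover)

print(f"maximum DNA-carrying lines: {max_dna_lines:,}")
print(f"log2 of that limit: {crossover:.2f}")
print(f"potential lines exceed it from generation {first_generation_exceeding}")
```

Past that generation, every further doubling only adds lines that cannot all be represented in the genome.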

For instance, let's suppose that all humans include a single ancestor-descendant link in their genealogies that happened around 40 generations ago. This number will, of course, vary among people -- for some it might be 35 generations, for others 43, etc. But on expectation, the probability is rather less than one in a million that any single person inherited any DNA from this single link.
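A back-of-the-envelope version of that "less than one in a million" claim, as a sketch only: it assumes at most 120,000 inherited blocks (six billion nucleotides in 50-kilobase pieces), spread over the 2^40 potential lines without regard to inbreeding, so that the expected number of blocks on any one line is also an upper bound on the chance that line carries any DNA at all. The names are hypothetical.

```python
GENOME_NUCLEOTIDES = 6_000_000_000  # figure used in the text
BLOCK_LENGTH = 50_000               # ~50 kb blocks, figure used in the text
GENERATIONS = 40

dna_blocks = GENOME_NUCLEOTIDES // BLOCK_LENGTH
potential_lines = 2 ** GENERATIONS

# Expected blocks per line; this bounds P(line carries at least one block).
p_upper = dna_blocks / potential_lines
one_in = potential_lines / dna_blocks

print(f"upper bound on probability: {p_upper:.2e}")
print(f"that is about one in {one_in:,.0f}")
```

With these round numbers the bound comes out at roughly one in nine million, comfortably under one in a million.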

So from the point of view of genes, this kind of shared genealogical relationship is fairly trivial. What is important for genetics is the extent to which these genealogical lines are replicated by inbreeding. If a single ancestor-descendant link occurs only a single time in anyone's genealogy, then it is likely to be genetically unimportant. But a single link may occur hundreds or thousands -- or even millions -- of times, because many lines of genealogy may trace back to the same few individuals.
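To see how quickly the same ancestors get counted over and over, here is a toy simulation. It is not the Rohde et al. model: it assumes a constant population size, purely random mating, no sexes, and no geography, and every name and parameter in it is made up for illustration. It follows one present-day person backward and counts how many genealogical paths lead to each ancestor.

```python
import random
from collections import Counter

def trace_ancestry(pop_size: int, generations: int, seed: int = 0) -> Counter:
    """Map each ancestor (by index) to the number of genealogical paths
    connecting it to one present-day individual, `generations` back.
    Toy assumptions: constant population, random mating, no sexes."""
    rng = random.Random(seed)
    paths = Counter({0: 1})  # individual 0 today, one trivial path to itself
    for _ in range(generations):
        previous = Counter()
        for child, n_paths in paths.items():
            # Each person in the genealogy gets two distinct random parents.
            parent_a, parent_b = rng.sample(range(pop_size), 2)
            previous[parent_a] += n_paths
            previous[parent_b] += n_paths
        paths = previous
    return paths

paths = trace_ancestry(pop_size=1000, generations=20)
total_lines = sum(paths.values())
print(f"total genealogical lines: {total_lines:,}")
print(f"distinct ancestors: {len(paths):,}")
print(f"most-repeated ancestor appears on {max(paths.values()):,} lines")
```

The total number of lines is always 2^20, but they can land on at most 1,000 people, so each ancestor necessarily sits on many lines at once. How unevenly those lines are shared out is the part that matters for genes.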

This in a way is the point of the second AP article, which talks about everyone's genealogical links to medieval royalty. The point isn't only that there are so many lines of genealogy that many of them must point to royalty. After all, tracing the genealogy of Brooke Shields to the Italian aristocracy isn't exactly just a Joe Schmo example. Nor is the point only that medieval aristocrats were likely to keep a record of their genealogies (although for most genealogists that is a tremendously important fact). The interesting point is that medieval aristocrats had lots of successful offspring. So they end up in lots of genealogies because there were lots of them. Sure, people like Camilla Parker-Bowles and Brooke Shields have lots of their genealogical connections because of extra inbreeding. But many of the rest of us have them just because there were so many of them.

But this is precisely where the "global meet market" model breaks down -- different people have different proportions of genealogical links to different ancient populations. As the original article by Rohde et al. (2004) pointed out:

And a few thousand years before that, although we have received genetic material in markedly different proportions from the people alive at the time, the ancestors of everyone on the Earth today were exactly the same (ibid., emphasis mine).

This is why what matters is a matter of interpretation. In case we miss it, the AP article bludgeons us with it:

It also means that all of us have ancestors of every color and creed. Every Palestinian suicide bomber has Jews in his past. Every Sunni Muslim in Iraq is descended from at least one Shiite. And every Klansman's family has African roots.

As much as I would enjoy giving Klansmen the shivers, this line of logic doesn't appeal that much to me. You see, these one-quadrillionth portion genealogical links really can be important only under some kind of extreme one-drop-rule. Ha ha, Klansman! We can prove with mathematical certainty that 10^-15 of your ancestry is African! Never mind that you have fewer than 10^14 cells in your body: one of those cells is one tenth African! You have the curse of Ham on you!

I don't like it. It doesn't offer much in the way of explanation. Does it matter that one of your ancestors helped build the pyramids? Does it matter that you share a long common ancestry with people halfway around the world? Does it even matter that all of us share all of our ancestry before some recent time?

From a genetic point of view, the answer is that it depends. Sure, all people share some recent ancestors, but how many? What proportion of ancestry is shared between different populations? And since their common origins, how have those populations changed?

We don't know these answers, but they are clearly testable in terms of allele frequencies. More on that later, but in a statistical sense, our ability to assess inbreeding using genes is a lot better than our ability to examine genealogies. And there is a way in which a few cosmopolitan people can have an important genetic effect -- when they carry alleles that are favored by selection. Selection boosts genealogical ties by increasing the chance that some gene lineages do not become extinct. It also makes it somewhat more likely that there will be cosmopolitan people, because higher local reproduction will often lead to dispersals.

From a cultural point of view, there are two interesting points. The first is that a lot of people clearly do find a cultural utility to a one-drop-rule. That is, after all, why the AP articles are written the way they are. People do care whether they have one genealogical link to medieval royalty, regardless of the likely genetic insignificance of any single medieval ancestor. There is some logical sense to this -- since it is easier to make decisions based on simple yes-or-no facts than on fractional quanta. If the full facts aren't easily processed, people find it easier to assume that a royal link confers some status.

The other interesting point is that maybe culture is designed to provide a certain kind of solution to this genealogical problem. After all, the mathematical realities of genealogy and inbreeding didn't just arise in the Middle Ages; they have been with us forever.

Consider, for instance, that some archaeologists have assumed that hunter-gatherer bands persist with a half-life on the order of a few hundred years -- maybe 10 or 20 generations of time. This value might be an important element in the evolution of modern human behavior -- for instance, if it marks some kind of limit on the effective time period of oral traditions. Now, the potential number of ancestors of people after 10 or 20 generations is anywhere from a thousand to a million. It may make a great deal of difference how those million genealogical ties are organized -- if most of them come back by inbreeding to the same couple of hundred individuals, then the dynamics of groups will be quite a bit different than if they outbreed to several thousand or even hundreds of thousands of people. And if a large proportion of those links ultimately come back to a few group founders, then what those founders do in cultural terms might have a large material effect on their distant descendants.

Is it possible that cultures are mechanisms to facilitate an adaptive level of inbreeding? That they once functioned as cooperative blocs to promote the survival and proliferation of the genes of a few founders? That they may still do so in many circumstances (like the medieval aristocracy, for instance)? People do adopt cultures in ways that tend to reinforce this function -- integrating in ways that promote the chance that they or their offspring will have (in their view) good marriages, material comforts, and social success.

People are agents in this process, acting in what they perceive to be their own interests, but coordinating actions culturally with others. Cultural preferences are not only retrospective (the people who look like this receive special treatment) but also prospective (if you do these things you -- or your children -- will receive special treatment). And a high proportion of doing these things involves placing your own progeny into certain privileged classes (marry the right kind of girl, pay the right brideprice, etc.).

It is worth thinking about. Particularly to the extent that culture may be related to the emergence of a human pattern of aging. Old people are by their very existence multigenerational, possibly well into the area where these genealogical calculations become more involved. To the extent that he or she can influence the inbreeding pattern, an older person may directly affect the proportion of his or her own descendants that remain associated with his or her culture group. Is it too rarified to make a difference? Who knows?


Myers S, Bottolo L, Freeman C, McVean G, Donnelly P. 2005. A fine-scale map of recombination rates and hotspots across the human genome. Science 310:321-324. DOI link

Przeworski M. 2005. Genetics: Motivating hotspots. Science 310:247-248. DOI link

Rohde DLT, Olson S, Chang JT. 2004. Modelling the recent common ancestry of all living humans. Nature 431:562-566. DOI link