PhyloCode and human evolution

The April issue of Discover has a feature article on PhyloCode, focusing on the roles of Jacques Gauthier and Kevin de Queiroz in trying to revise the code of biological nomenclature. It is an interesting introduction to the issues, but is a little short on specifics, so I went to some additional resources to examine the impact of the whole PhyloCode debate on human phylogenetics.

Proliferating ranks

PhyloCode is an attempt to address two simple problems with the Linnaean system. The first is the problem of ranks. The Linnaean system provides seven ranked positions for species and higher-order taxa. These are the levels familiar to anyone who can remember King Phillip's soup, or his Peter's German origin, or any of the other mnemonics. These seven levels (kingdom, phylum, class, order, family, genus, species) have been supplemented over the years with in-between levels at almost every rank, such as suborders and infraclasses. For example, the most basic division among living anthropoid primates is into superfamilies, which is the rank occupied by hominoids (apes and humans), cercopithecoids (Old World monkeys) and ceboids (New World monkeys). The grouping of all three of these superfamilies, Anthropoidea, is a suborder, while the grouping of Old World monkeys and hominoids is the infraorder Catarrhini.

But when it gets to the level of infraorders and superfamilies, the phylogenetic pattern of relationships is already stretching the Linnaean classification to its limits. This degree of differentiation is more or less well suited to primates, but many other groups of organisms have even more complicated phylogenies with many more branches. This leads to some big confusion:

As part of their work, [Gauthier and de Queiroz] created a lizard family tree, but when they began to assign names to the important branching points on the tree, they realized there were more groups to name than there were ranks in the traditional system. "I started using these exotic ranks like parvorder, cohort, and microorder, and all that kind of crap," Gauthier says. "Then we'd learn more about the tree, and all the names would have to change. I thought, 'That sucks. All these ranks, they're a problem.'" (Foer 2005:48-49)

This is a problem I've thought about for a while also, ever since I was learning Mesozoic mammals and encountered exotic taxonomic ranks like "tribe" and "domain." Unlike suborder and infraorder, many of these give no indication at all about where they belong in the phylogenetic hierarchy. If this complication actually helped organize species, that would be forgivable. But even the extension to thirty or more ranks is not enough to encompass all the possible groupings in some phylogenies, especially where extinct species must be placed in a hierarchy including living species and their ancestors.

And of course the probability of disagreement among authorities on names increases combinatorially with more taxonomic ranks. Even within the hominoids there is at present substantial disagreement on the names of groups at almost every taxonomic level, despite the fact that almost everyone agrees about the phylogeny of the living species of apes and humans. Some of this disagreement is purely nomenclatural, while the rest comes from genuine disagreements about the phylogeny of extinct apes. It seems especially problematic that disputes about the relationships of extinct and fragmentary fossils could substantially alter our judgment about the nomenclature to apply to living species, but that is exactly where we stand.

Hominids and hominins

This leads to the second major problem of the Linnaean system, the problem that the names of groups themselves are formulated in a way that cannot be divorced from their taxonomic level. What this means is that if our hypothesis of phylogeny changes, the names of taxa must also change. The problem with this is that it subverts the goal of communication:

In the zoological code, family names must end with the four letters idae, for example, and subfamily names must end in inae. If taxonomists decide that a group once considered a family should instead be ranked as a subfamily, the group must, under the rules of the current system, get a new name. This frustrates the PhyloCoders to no end. "It's still the same tree," Gauthier says. "Nothing has changed, except how we spell the names. In a day when all this information is going onto the Internet, this is a bad idea. It's a constant change of PIN numbers." Some taxa have gone through a number of different names over the course of just a decade. Several years ago, for instance, it was decided that the great-ape family Pongidae couldn't exist at the same rank as the human family Hominidae because humans are a subset of the great apes. To fix the problem, researchers proposed that humans and their great-ape relatives be combined into a single family, Hominidae, and members of the family Pongidae became the subfamily Ponginae. This can make literature searches a real pain, Gauthier says: "To a computer, there is a world of difference between iguanidae and iguaninae" (Foer 2005:50).

In my mind, computers are the least of the problem. Replace "to a computer" with "to an undergraduate" and you are closer. Really, even this understates the problem. If we could ensure that a new taxonomy established by universal consensus today would not change in the future, then it would be well worth changing all the names. But we can be pretty sure that things will change in the future, repeatedly. It just isn't worth having a system where the names have to be changed all the time, because such changes render all past research at best confusing, or at worst nonsensical.
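To make the renaming problem concrete, here is a minimal Python sketch of rank-bound names. Only the endings "idae" and "inae" come from the passage quoted above; the other suffixes, the function names, and the two paper titles are my own assumptions for illustration, not anything taken from the zoological code's text or from Foer's article.

```python
# Suffixes tied to family-group ranks. Only "idae" and "inae" are quoted in
# the article; the remaining entries are assumed to follow the same convention.
RANK_SUFFIX = {
    "superfamily": "oidea",
    "family": "idae",
    "subfamily": "inae",
    "tribe": "ini",
}


def ranked_name(stem, rank):
    """Build the rank-bound name for a stem, e.g. 'Pong' + 'idae'."""
    return stem + RANK_SUFFIX[rank]


def rerank(stem, old_rank, new_rank):
    """Return (old name, new name) for a group whose rank was revised."""
    return ranked_name(stem, old_rank), ranked_name(stem, new_rank)


old, new = rerank("Pong", "family", "subfamily")
print(old, "->", new)  # Pongidae -> Ponginae

# Hypothetical paper titles: an exact-match search on the new name
# silently misses everything published under the old one.
literature = ["A revision of the Pongidae", "Locomotion in Ponginae"]
hits = [title for title in literature if new in title]
print(hits)  # ['Locomotion in Ponginae']
```

The group itself is untouched by the rerank. Only the string changes, and that is enough to split the literature in two.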

The Hominidae-Homininae problem is not the only one in paleoanthropology, but it is a convenient example. Foer's description of the problem is one possible reformulation, but not the most popular one. We all recognize that African apes and humans are more closely related than either is to orangutans, and chimpanzees and humans closer than either is to gorillas. Many people would apply Hominidae to all the great apes, Ponginae to orangutans, and Homininae to the African apes and humans. This leaves the human lineage (including australopithecines) in the "tribe" Hominini (the tribe Panini would therefore include tasty Italian bushmeat sandwiches). Thus, orangutans would be hominids, gorillas would be hominines, and australopithecines would be hominins.

Consider the problems with this arrangement. First, it isn't comprehensive. There is no name for the human-chimpanzee clade, for example. The taxonomic level for that clade would properly depend on the details of the evolutionary divergence among gorillas, chimpanzees, and humans. If, for instance, there was a substantial adaptive radiation between the gorilla divergence and the human-chimpanzee divergence, then these fossil lineages might be placed with chimpanzees and humans within an infrafamily, with the chimpanzee-human clade placed as a supertribe. Likewise, the branch points leading to the dryopithecines depend on their relationships with the later African apes, or even to the Asian apes. In other words, the taxonomy still hangs on currently unknown phylogenetic branchings, and the choice of taxonomic level is entirely arbitrary.

The arbitrariness of the naming system is highlighted by some other alternatives for the hominoids. For many years, molecular researchers like Morris Goodman have suggested that the genetic similarities between chimpanzees and humans are consistent with those within genera of most mammals, and the time of origin of these lineages is also consistent with the antiquity of mammalian genera. So Goodman et al. (1998) took the logical step of including both humans and chimpanzees in Homo. The great apes in this scheme are all hominins (tribe Hominini) and the living hominoids are all hominines (subfamily Homininae).

Arbitrary changes impose a cost on any researcher or student by discarding past consensus. The past fifty years or more of paleoanthropological research have shared a clear meaning for the term "hominid." Of course, one may read that literature today while remembering the past meaning of "hominid," just as we remember what "pithecanthropine" used to mean. But it is a cost that should come at some benefit. For "pithecanthropine," the loss of the genus Pithecanthropus combined with the discarding of the idea of a "pithecanthropine stage" of human evolution means that we no longer have any call to use the term. The benefit of the change is simplification and the recognition that an incorrect hypothesis of evolution has been refuted.

Many would argue that the replacement of hominid with hominin has similar benefits. After all, the use of "hominid" in the past was partly conditional on the acceptance of the family Pongidae to hold the great apes. Now that we know that humans and African apes are sister taxa, we should construe Hominidae differently. It is clear that the human lineage did not have a long independent evolution during the Miocene, that its origin is recent compared to other mammalian families, and that the gross genetic distinctiveness of humans is relatively low. Doesn't it therefore clarify our understanding of hominoid evolution to demote the human lineage from a family-level taxon to a lower taxonomic level?

The clade formerly known as Hominidae

The problem with this line of logic is that it is a purely aesthetic choice. There is no reason to suppose that a family-level taxon should have a particular date of origin or duration. One may argue that extant mammalian families have a distribution of ages, or even of genetic variation, and that this should inform our taxonomic choices. But the logical endpoint of this argument is not that the human lineage is a tribe-level or infrafamily-level taxon, but rather the conclusion of Goodman et al. (1998), that the human lineage is a subgenus-level entity and chimpanzees should be placed in Homo. The fact that this solution is viewed as "too extreme" is good evidence that this is at its core an aesthetic concern rather than a scientific one.

In fact, there is no scientific reason why a particular phylogeny should correspond to a particular range of phylogenetic ranks. Many extant families of organisms include hundreds of species, others include only one. Some extant vertebrate families originated in the Paleozoic, others in the Pliocene. And viewing only the variation of extant species is especially misleading on this issue. When we consider the relationships of extinct organisms, we find family-level groups originating across the history of the earth. The family rank has been applied to short-lived groups with uncertain affinities, to extinct collaterals of living orders or classes, and to single fossils. When it has been applied, it has usually been according to considerations of morphological adaptive pattern. On this basis, there is a good argument for the idea that the human lineage should be at the family rank, regardless of its antiquity. The adaptation to an obligate pattern of bipedalism along with the dental specializations of the australopithecines (shared with humans) set them apart from other apes to a greater extent than any great ape. These features probably mark the human lineage as being as substantially different from great apes in adaptive terms as the great apes are from hylobatids.

So what aesthetic considerations prevent us from simply continuing to call the human lineage Hominidae? That usage requires that something be done to avoid a paraphyletic taxon including orangutans, chimpanzees, and gorillas. We seemingly have a choice: accept Gorillaidae, Panidae, and Pongidae alongside Hominidae, or demote all these taxa. The demotion also helps with (although does not solve, see above) the problem of assigning taxonomic ranks to the African-European ape clades. A lower-level human clade leaves more ranks below superfamily to apply to the great ape clade, its possible progenitors among the Afropithecinae or Proconsulidae, the possible ancestors of the African ape clade among the Dryopithecinae, and the possible ancestors of the human-chimpanzee clade. Each of these clades may need a rank, and there aren't enough ranks to go around.

I have no problem with aesthetic changes in nomenclature per se. After all, I wholeheartedly support replacing "Neanderthal" with "Neandertal." And in fact, I don't find "hominin" that objectionable. It may take me a while to get used to the sound of it, but it is very clear in its now-current application. Since it merely replaces the old use of "hominid," it is a simple substitution of one unambiguous term for another. It seems to me much better than relegating the human lineage to a subgenus, which would leave no taxonomic names at all to talk about the origins of the human lineage (notice how much more awkward this becomes when we can't say "hominid origins").

What I don't like is the confusion that comes from changing the meaning of "hominid." "Hominin" means nothing special to anyone now, so it has a low conceptual cost. In contrast, "hominid" until recently meant something entirely different from its proposed meaning, inclusive of all great apes. "Hominid" is how countless interested followers of paleoanthropology recognize our ancestors, and it is how many of us have presented our science publicly throughout our careers. It is bad enough that we have to get our students to understand that "hominoids" are not "humanoids," and "hominids" do not include all "hominoids." Now we have to get them to differentiate "hominins" from the rest.

One argument is that "hominin" is qualitatively more valuable than "hominid," because it conveys a more correct view of the human phylogenetic rank in comparison to other groups of mammals. This would be the "Copernican" analogy -- noting that the sun is the center of the universe "puts humans in their place," and noting that our taxonomic level is at the tribe rather than the family rank likewise shows how our place is less special among the species of the natural world. Or at least, it does not distort our view of ourselves by giving us a higher taxonomic rank than we deserve.

But of course, if it is our goal to have every name indicate its exact rank relative to other organisms, then we must also make mammalian groups consistent with insect groups, mollusc groups, and plants, for that matter. For this purpose, it might be as well to include a number after every taxonomic name, to represent the genetic variation encompassed by the group, or age of the group in millions of years, for example.

And more to the point, the next time someone decides that the hominoids subsume too small a segment of the mammalian phylogeny, it will seem necessary to some revolutionaries to change the taxonomy yet again. When we revise terms to give a "correct" understanding of their status, there is no end to "corrections" in pursuit of this goal.

So there are good reasons to resist the shift to "hominin." It renders "hominid" inconsistent with its historical usage in the literature. It unnecessarily confuses the public, especially those who follow our science at a distance. And most important, there is no guarantee that this change will be the last.

How does PhyloCode help?

This is not a full summary of the rules of the PhyloCode. These are available online.

PhyloCode is a system for naming clades. Under this system, each clade in the phylogenetic tree of life is eligible for a unique name. These names are not ranked, so that although clades are necessarily hierarchical, their names are not systematized in a hierarchical way. There are two basic reasons for the use of rankless names:

  1. The number of clades on some phylogenies is so extensive that a rank-based classification devolves into confusion.
  2. Under a rank-based classification, any change in the rank of a single clade name requires concomitant changes to many other clade names, although neither their content nor their hierarchical placement has changed.

Thus, the PhyloCode "holds clades innocent" of changes in other clades, by retaining a single, unique, unchanging name for them.

Clades may be defined in a number of ways, including by apomorphies, by descendants of a single ancestor, or by the inclusion of all species joined by a single node. This last, node-based clade definition is probably the most common. For example, the living African apes and humans belong to a clade that we might call "Clade Homo sapiens and Gorilla gorilla", while humans and australopithecines may be joined in "Clade Homo sapiens not Pan troglodytes."

Part of the appeal of this kind of scheme is that it approximates what we do much of the time anyway. The human-chimpanzee clade has no taxonomic name, at least not that most people would know, and when we talk about it, we use the term "human-chimpanzee clade." It is understood that this clade also includes Pan paniscus, and that bonobos are nevertheless not part of the name, although "human-bonobo clade" would be no less correct. For larger taxonomic groupings, this trends toward a kind of shorthand. "Human-gorilla clade" necessarily includes chimpanzees and bonobos, and is shorter than "the clade containing extant African apes and humans." PhyloCode effectively codifies this shorthand.
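The two example definitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is a deliberately simplified hominoid phylogeny that I made up for the purpose (with a single australopithecine standing in for the whole group), the function names are hypothetical, and I read "Homo sapiens not Pan troglodytes" as what phylogenetic nomenclature usually calls a stem-based (or branch-based) definition.

```python
# A simplified, assumed phylogeny as nested tuples; strings are species.
TREE = ("Pongo pygmaeus",
        ("Gorilla gorilla",
         (("Pan troglodytes", "Pan paniscus"),
          ("Homo sapiens", "Australopithecus afarensis"))))


def leaves(node):
    """Return the set of species at or below this node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def subclades(node):
    """Yield every subtree of the tree, including single species."""
    yield node
    if not isinstance(node, str):
        for child in node:
            yield from subclades(child)


def node_based(tree, a, b):
    """Least inclusive clade containing both a and b."""
    candidates = [leaves(c) for c in subclades(tree) if {a, b} <= leaves(c)]
    return min(candidates, key=len)


def stem_based(tree, included, excluded):
    """Most inclusive clade containing `included` but not `excluded`."""
    candidates = [leaves(c) for c in subclades(tree)
                  if included in leaves(c) and excluded not in leaves(c)]
    return max(candidates, key=len)


african_apes_and_humans = node_based(TREE, "Homo sapiens", "Gorilla gorilla")
human_lineage = stem_based(TREE, "Homo sapiens", "Pan troglodytes")
print(sorted(human_lineage))
# ['Australopithecus afarensis', 'Homo sapiens']
```

On this tree, node_based(TREE, "Homo sapiens", "Pan troglodytes") and node_based(TREE, "Homo sapiens", "Pan paniscus") return the same set, which is the sense in which "human-bonobo clade" is no less correct than "human-chimpanzee clade."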

But at the same time it provides a procedure for giving each of these clades a name. Remembering that these clade names carry no rank information, it is possible to give every one of these clades a name that is at once unique and resistant to change with changes in our understanding of phylogeny within and outside of the hominoids.

PhyloCode and hominoids

Considering all this, one may wonder what the PhyloCode proposal would say about our current taxonomic problems in paleoanthropology. In the central instance, does PhyloCode provide a way out of the hominid-hominin problem?

According to the current draft (June 2004) of the PhyloCode, "phylogenetic definitions for many widely used clade names" (Cantino and de Queiroz 2004:4) will be presented in a volume resulting from the first meeting of the International Society for Phylogenetic Nomenclature, in Paris, July 2004. That volume is not yet available, but the abstracts of the meeting have been compiled and are available in a PDF online.

Representing primate systematics at the meeting was a contribution from Kaila Folinsbee and David Begun. The part pertaining to Hominidae reads as follows:

We propose to redefine Hominidae Gray 1821 (converted clade name) as the most inclusive clade containing Homo sapiens and Pongo pygmaeus. We redefine Homininae Gray 1825 (converted clade name) as the most inclusive clade containing Homo sapiens and Gorilla gorilla not Pongo pygmaeus. Hominini Gray 1825 (converted clade name) includes Homo sapiens but not Pan troglodytes. The Ponginae has traditionally been paraphyletic, separating Pongo pygmaeus, Gorilla gorilla and Pan troglodytes to the exclusion of Homo sapiens. Ponginae Elliot 1913 (converted clade name) is defined as Pongo pygmaeus but not Homo sapiens. These converted clade names preserve the established endings of the older system in order of most to least inclusive. (Folinsbee and Begun 2004:39)

In other words, this enshrines the use of "hominin" for the human lineage and "hominid" for the great apes and humans.

I think this is unfortunate, since the opportunity was there to establish a classification that would be at the same time unambiguous and maximally consistent with historic use of the term "hominid." To do so, a different term for the great ape and human clade would have to be invented or drawn from the literature. But the strength of the PhyloCode is that this name would not have to be at a higher rank than Hominidae. So for example, all the great apes and humans could be classified in Pongidae, with the human lineage assigned to Hominidae. Retaining "hominid" for the human clade would have followed the PhyloCode recommendation for converting clade names under the old system to the new one:

Recommendation 10A. Clade names should be selected in such a way as to minimize disruption of current and/or historical usage (with regard to composition, diagnostic characters, or both) and to maximize access to the literature. Therefore, when establishing the name of a clade, a preexisting name that has been applied to that clade, or to a paraphyletic group stemming from the same ancestor, should generally be selected if such a name exists. If more than one preexisting name has been applied to the clade (including those applied to paraphyletic groups stemming from the same ancestor), the name that is most widely and consistently used for it should generally be chosen (Cantino and de Queiroz 2004:26).

Under this recommendation, the wholesale switch from "hominid" to "hominin" would not be the preferred outcome. Nevertheless, the case for resisting the classification as proposed is weak, and likely futile.

The most important consequence of the PhyloCode may be in strengthening the hand of conservatives in the future. The classification of the hominoids has for the past few decades been characterized by a pressure to place the human lineage at a lower and lower taxonomic rank. This revision began with Ernst Mayr, has continued through the elevation of "Hominidae" to include all the great apes, and is expressed today by geneticists who would like to include chimpanzees in Homo. This trend has had the primary motivation of making the hominoid taxonomy "equivalent" to that of other vertebrate taxa, with a secondary, often unstated, goal of demoting the status of humans in the natural order. There is every reason to suppose that both these motivations will continue in the future.

But the PhyloCode classification helps make it possible to retain the same names even in the face of such revisions. As proposed, the PhyloCode recognizes the names "Hominidae," "Hominini," and others as rankless clade names. This means that even if the classification changes substantially in other ways (for example, placing chimpanzees in Homo), we still can use these rankless names for the clades in the hominoid phylogeny. The human lineage can be "Hominini" whether it is technically equivalent to an old-style subfamily, tribe, or subgenus, in other words. But more importantly, if rankless names are recognized widely among the mammals, then there is less of a reason to require clade names to be made consistent across the mammals. Instead, we can move to a direct reference to the age of clades, or the level of genetic differentiation they represent, or other quantitative considerations. This would be a step forward in phylogenetic classification.

Names of fossil hominid genera

Although the phylogeny of the extant hominoids is well understood, the phylogeny of fossil hominids (or hominins) is not. There are several outstanding problems, including whether the robust australopithecines are monophyletic, the relationships of the habilines, and more minor problems such as the placement of Sahelanthropus, Kenyanthropus, and Ardipithecus relative to other fossils and (arguably) extant hominoids. For these problems, PhyloCode provides some assistance.

Most important is the option to define clade names conditionally upon the acceptance of a particular phylogeny:

11.9. In order to restrict the application of a name with respect to clade composition (i.e., under alternative hypotheses of relationship), phylogenetic definitions may include qualifying clauses specifying conditions under which the name cannot be applied to any clade (see Example 1). It is also possible to restrict clade composition under alternative hypotheses of relationship through careful wording of definitions (see Examples 2 and 3) (Cantino and de Queiroz 2004:29).

This is clearly useful for the hominid phylogeny. For example, a careful definition might classify the robust australopithecines as a clade including both A. boisei and A. robustus. The node connecting these two species might well also include the species A. aethiopicus, or it might not. A definition conditioned on the inclusion of that species would encompass those phylogenetic hypotheses in which these three species are monophyletic. Such a clade might simply be named Paranthropus, or it might be desirable to give another taxonomic designation, such as "Paranthropina." The process explored by this example could easily be extended to other cases.
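A conditional definition of this kind can be sketched as follows. This is my own reading of how a qualifying clause might work, not the PhyloCode's wording: the name applies to the smallest clade holding all three robust species, unless that clade also swallows taxa named in the clause. The two trees are hypothetical alternatives invented for the example, as are the choice of A. africanus and Homo sapiens as the excluded taxa and the function names.

```python
def leaves(node):
    """Return the set of species at or below this node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def subclades(node):
    """Yield every subtree of the tree, including single species."""
    yield node
    if not isinstance(node, str):
        for child in node:
            yield from subclades(child)


def conditional_clade(tree, specifiers, must_exclude):
    """Smallest clade containing all specifiers, or None when that clade
    also contains a taxon the qualifying clause rules out."""
    candidates = [leaves(c) for c in subclades(tree) if specifiers <= leaves(c)]
    clade = min(candidates, key=len)
    return None if clade & must_exclude else clade


ROBUSTS = {"A. boisei", "A. robustus", "A. aethiopicus"}
OUTSIDERS = {"A. africanus", "Homo sapiens"}  # assumed qualifying clause

# Hypothesis 1 (assumed): the three robust species are monophyletic.
MONOPHYLETIC = (("A. aethiopicus", ("A. boisei", "A. robustus")),
                ("A. africanus", "Homo sapiens"))

# Hypothesis 2 (assumed): A. aethiopicus branches off elsewhere.
SPLIT = (("A. aethiopicus", "A. africanus"),
         (("A. boisei", "A. robustus"), "Homo sapiens"))

print(conditional_clade(MONOPHYLETIC, ROBUSTS, OUTSIDERS) == ROBUSTS)  # True
print(conditional_clade(SPLIT, ROBUSTS, OUTSIDERS))                    # None
```

Under the second hypothesis the name simply does not apply to anything, rather than silently expanding to cover the whole tree.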

One question is whether this all goes too far toward the cladistic extreme of classification. There are a number of nontaxonomic names now applied to the hominids, including "australopithecine," "habiline," "human," "Neandertal," and others. Under Simpson's classification, these would be called N2 names, and their strength is precisely that they are not taxonomic. The extension of any one of them can change according to convenience, and is not necessarily constrained by considerations such as monophyly.

There is certainly a utility to continuing to use nontaxonomic names like these, as long as adaptation is part of our consideration of evolutionary history. It is almost certainly true that humans derive ultimately from some species of australopithecine. But that does not mean that we should not talk about australopithecines, just as a definition of Dinosauria that includes birds does not mean that we should stop talking about dinosaurs.

Conclusions:

I started writing this essay while deeply considering a problem: is it time to switch to using "hominin?" This is more or less urgent to me because I have a textbook for which a decision must be made. It is not too late to search-and-replace "hominid" throughout. I have no special reason to use "hominin" myself; indeed I find it distasteful to do so. I like "hominid" -- it's the way I learned the field. And I happen to think that our adaptive differences from other primates deserve a high-rank designation, regardless of our genetic similarities.

Yet, "hominin" has a formidable position. It has swept beyond a small clique of scientists to encompass most of the new announcements of species in the field. Those most conversant in taxonomy are not the most prolific in terms of publications, but everyone who names anything must have a full understanding of these issues, and in this realm, the assault of "hominin" has been unrelenting. And within the last year popular publications have begun to regularly use "hominin." For example, National Geographic uses the term in two articles in their April 2005 issue, postfacing it as "a term for humans and their relatives."

The use of the term is no longer just an option; it is approaching the default. The PhyloCode is far from acceptance among taxonomists, but by providing a rank-free naming system for clades, it created the potential to avoid the issue. Except that the founding conference of the system introduced as an integral element the nomenclature applying "hominin" to the human clade and "hominid" to the great ape clade. So all escape routes appear to be blocked. There is only the unrelenting attrition imposed by the taxonomic cognoscenti.

All this means that if I continue to use the term "hominid," I should have a principled reason I am willing to stand by. And I don't. Nostalgia is not a principle. I myself am not confused by older literature that uses "hominid," and I am not convinced that my students will be confused, either. For undergraduates, it's just another name to learn. And if popular magazines are blithely using the term, the public is just going to have to follow. In the end, I think there will be a cost, borne by all of us, but hopefully the change will be more or less permanent and any hard feelings soon forgotten.

So sometime fairly soon, I will probably resign myself to saying "hominin," and using only my right hand on the keyboard instead of both. And maybe I'll take the edge off by writing some taxonomy myself. Any suggestions for clade names are welcome.

Afterword: Where did hominin come from?

I have never seen a review of where the usage of "hominin" came from, and how it became common in paleoanthropology. A search of journals indexed by ISI finds the first keyword reference to "hominin evolution" in a 1993 paper on Makapansgat paleoenvironment in JHE by R. J. Rayner, B. P. Moon, and J. C. Masters. The most widespread early use of the term appears to have been by Bernard Wood and his collaborators. I have not done a systematic review; if anyone has any insight on this I would be most pleased to hear of it.

References:

Cantino PD and de Queiroz K. 2004. PhyloCode: A Phylogenetic Code of Biological Nomenclature. PDF available online

Foer J. 2005. Pushing PhyloCode. Discover 26(4):47-51.

Folinsbee KE and Begun DR. 2004. Phylogenetic nomenclature of living and fossil catarrhines. In First International Phylogenetic Nomenclature Meeting Abstracts, M Laurin, ed. p. 39. PDF available online

Goodman M, Porter CA, Czelusniak J, Page SL, Schneider H, Shoshani J, Gunnell G, Groves CP. 1998. Toward a phylogenetic classification of Primates based on DNA evidence complemented by fossil evidence. Mol Phylogenet Evol 9(3):585-598.

Rayner RJ, Moon BP, and Masters JC. 1993. The Makapansgat australopithecine environment. J Hum Evol 24(3):219-231.