
john hawks weblog

paleoanthropology, genetics and evolution

Photo Credit: Contemporary human skull compared to the Kabwe cranium. John Hawks CC-BY-NC 2.0

Self-citation quantified

I always feel a little bit bad when I have to cite my own prior work for a new research paper. As scientists develop career trajectories, self-citation becomes inevitable. Early work often becomes a foundation for later work. The point of citation is to credit the work that allows you to advance; we track citations so that we can see more clearly which research efforts have led to further advances.

And so, most successful scientists will often cite their own earlier work. The more entrenched an individual’s research trajectory becomes, the higher the ratio of self-citation to citation of other research will be.

A new study in PLoS Biology by John Ioannidis and coworkers presents an author-level citation database that makes it easy to see which authors cite themselves most often: “A standardized citation metrics author database annotated for scientific field”.

Nature comments on the study, focusing on the issue of self-citation: “Hundreds of extreme self-citing scientists revealed in new database”

The data set, which lists around 100,000 researchers, shows that at least 250 scientists have amassed more than 50% of their citations from themselves or their co-authors, while the median self-citation rate is 12.7%.
The data are by far the largest collection of self-citation metrics ever published. And they arrive at a time when funding agencies, journals and others are focusing more on the potential problems caused by excessive self-citation. In July, the Committee on Publication Ethics (COPE), a publisher-advisory body in London, highlighted extreme self-citation as one of the main forms of citation manipulation. This issue fits into broader concerns about an over-reliance on citation metrics for making decisions about hiring, promotions and research funding.
“When we link professional advancement and pay attention too strongly to citation-based metrics, we incentivize self-citation,” says psychologist Sanjay Srivastava at the University of Oregon in Eugene.

When it comes to judging the impact of individual scientists, people are very sensitive to possible ways of “cheating the system”, and self-citation is widely perceived to be one of those ways. But in fact, Ioannidis and coworkers find that separating self-citation from total citation counts doesn’t change much when it comes to highly-cited researchers. There are some outliers, but they are concentrated in particular countries, fields, and institutions.
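The self-citation rate itself is simple arithmetic: the fraction of an author’s incoming citations that come from papers sharing an author with the cited work. Here is a minimal Python sketch of that calculation; the function name, record format, and data are hypothetical illustrations, not taken from the study’s actual database.

```python
# Sketch of a self-citation rate calculation, in the spirit of the
# Ioannidis and coworkers database. All names and data are hypothetical.

def self_citation_rate(citations: list) -> float:
    """Fraction of citations whose citing paper shares an author
    with the cited paper (self- or co-author citation)."""
    if not citations:
        return 0.0
    self_cites = sum(1 for c in citations if c["citing_shares_author"])
    return self_cites / len(citations)

# Hypothetical example: 8 total citations, 2 from the author's own papers
records = [{"citing_shares_author": i < 2} for i in range(8)]
rate = self_citation_rate(records)
print(f"self-citation rate: {rate:.1%}")  # 25.0%
```

The same tally over an author’s full record, with co-author citations included, is how a figure like the study’s 12.7% median would be reached.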

Another aspect of Ioannidis and colleagues’ study is quantifying the citation practices in different fields of study. Anthropology as a field is not separated out in their table, but social sciences generally have low citation numbers—less than half the numbers of biology or clinical sciences. The 90th percentile for total citations in the social sciences is 423, meaning that only 10% of researchers have career citation totals of more than that.

As a result, methods of judging scientific output that rely upon citation counts tend to underestimate the impact of social scientists. Most kinds of scientific assessment look within fields rather than across them, but in some of the cross-disciplinary organizations (like national academies) social scientists are under-represented compared to their impact in ways other than citations.

Laser surface scan accuracy tested

This is encouraging for those of us who rely upon 3D surface data collected with laser scanners: “Dimensional accuracy and repeatability of the NextEngine laser scanner for use in osteology and forensic anthropology”.

Second, as an application to osteology we examined intra-observer scanning protocol variability, using the coefficient of variation (CV) to quantify surface area and volume variance between repeated scans of eight porcine capital femoral epiphyses with undulating mammillary processes on one surface with amplitudes covering the range of the test block bas-relief offset values. The CoR showed each test-retest measurement from each instrument differed by no more than their CoR: 0.010 mm, 0.137 mm, 0.068 mm, 0.193 mm for the VHX, NE, HP and CMM, respectively. There was agreement between the instruments, but each instrument (NE, HP and CMM) overestimated bas-relief features as reported by the VHX, on average (bias) by 0.046 ± 0.038 mm, 0.025 ± 0.033 mm, 0.026 ± 0.033 mm for the NE, HP and CMM, respectively. Both scanners captured surface features as small as 0.1 mm.

Porcine capital femoral epiphyses – that’s a pretty great test of some of the small bone fragments that we study in the fossil record. Errors of less than 0.1 mm are smaller than I would have expected.
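The coefficient of variation the authors use is just the standard deviation of repeated measurements divided by their mean, which makes scan-to-scan variance comparable across objects of different sizes. A minimal sketch, with hypothetical surface-area values standing in for real scan data:

```python
# Sketch: coefficient of variation (CV = stdev / mean) for repeated
# scans, as used in the scanner-accuracy study. Values are hypothetical.
import statistics

def coefficient_of_variation(values: list) -> float:
    """CV of repeated measurements, expressed as a fraction of the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical surface areas (mm^2) from five repeated scans of one epiphysis
areas = [412.3, 412.9, 411.8, 412.5, 412.1]
cv = coefficient_of_variation(areas)
print(f"CV = {cv:.2%}")
```

A CV well under one percent across repeated scans is the kind of repeatability that makes these instruments usable for small skeletal fragments.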

The research is by Ronald Perrone, Jr. and John L. Williams in Journal of Archaeological Science: Reports.

Quote: Sherwood Washburn on intellectual traditions in human origins

Sherwood Washburn was a prominent biological anthropologist of the mid-twentieth century, best known as the architect of the “New Physical Anthropology” movement to bring population thinking and primate behavioral ecology into the study of human origins. In the early 1950s, he became involved in an exchange of views about the import of the Piltdown hoax on thinking about modern human origins. The following paragraphs are from a 1954 commentary in American Anthropologist: “An old theory is supported by new evidence and new methods”.

Just before this passage, Washburn details that “There have been two major theories concerning the origin of men anatomically like ourselves” – one theory positing a very recent appearance of modern people during the last glaciation, and a second theory positing a very ancient appearance of modern people and their long coexistence with Neanderthals and other extinct forms.

The Piltdown specimen was a primary buttress for the idea of a long existence of modern humans. When the hoax was revealed, other supposed evidence for the existence of modern humans at a very early time came under doubt, and the supposed antiquity of skeletal remains like the Galley Hill material was shown to be erroneous.

Washburn comments on the intellectual traditions that guided various scientists’ reactions to the Piltdown evidence:

Since 1946 events have moved so rapidly that everyone is readjusting and what one said even a few years ago may be quite different from present belief. But it is necessary to look at the past a little more to understand subsequent events. I have mentioned Weidenreich and Hooton, not in the spirit of praise or blame, but as representatives of different traditions. In general, the Germans never accepted early Homo sapiens or Piltdown. The English accepted early sapiens, and the Americans have followed the English tradition. One might put the matter this way: apparently it was as hard for a German to believe in early Homo sapiens as it was for an Englishman to be a skeptic. Hrdlička followed the German tradition. I am no intellectual historian and make no pretense of having read the vast literature on fossil man, but the influence of the intellectual tradition on the interpretation of human fossils is so great that the record makes little sense without considering it. As a part of these traditions, we all have built-in preconceived notions. Was it dogmatic for Weidenreich to accept the result of Friederich’s study, showing that the Piltdown jaw was that of an ape? Or was it dogmatic for Hooton to reject this conclusion? Each acted in accord with previous belief and in accord with the tradition to which he belonged. Both were right. The jaw was that of an ape, but it was impossible that such a jaw should be associated with a sapiens skull by chance. Both were wrong in that neither saw the possibility of a fake as the explanation.
It is easy to refer to the other person’s guesses as preconceived and dogmatic, but from the point of view of the developing science of human evolution the essential point is that progress comes when the area open for personal debate is narrowed. The development of chemical dating methods makes it possible to settle some of the problems which up to now have been matters of personal opinion. Frequently human bones have been found under circumstances in which there is real doubt about their associations and the more such problems can be settled by methods which are independent of intellectual traditions the more rapidly our understanding of human evolution will progress.

In textbooks and popular accounts of the history of paleoanthropology, Piltdown takes a very prominent place.

These popular accounts usually elide other supposed evidence of the great antiquity of modern humans in Europe, like the Galley Hill and Fontéchevade fossil remains, or the contribution of Louis Leakey’s Kanam and Kanjera discoveries in Africa.

A complicated story with various forms of incomplete evidence and competing intellectual traditions is harder to tell than a simple story of a hoax. But the complicated story provides a better model for understanding the ways that fossil and archaeological evidence can contribute to long-lasting scientific debates.

UPDATE (2019-08-22): It strikes me upon re-reading that Washburn is not very fair here to Gerrit Smith Miller, the American who concluded that the jaw and skull of the Piltdown “specimen” could not represent a single species. Miller also didn’t get it right, but he certainly didn’t “follow the English tradition”.

I wrote previously about the doubters: “Lessons of Piltdown doubters”.

Skeletons from the Himalayan lake

Yesterday a fascinating story came out in Nature Communications about the skeletons of Roopkund Lake, in northern India: “Ancient DNA from the skeletons of Roopkund Lake reveals Mediterranean migrants in India”. The research was authored by Éadaoin Harney and coworkers. Some background about the site is provided by the introduction of the study:

Roopkund Lake is a small body of water (~40 m in diameter) that is colloquially referred to as Skeleton Lake due to the remains of several hundred ancient humans scattered around its shores (Fig. 1). Little is known about the origin of these skeletons, as they have never been subjected to systematic anthropological or archaeological scrutiny, in part due to the disturbed nature of the site, which is frequently affected by rockslides, and which is often visited by local pilgrims and hikers who have manipulated the skeletons and removed many of the artifacts.

The Atlantic has a nice article by Rachel Gutman that reviews the research: “The Mystery of ‘Skeleton Lake’ Gets Deeper”.

Since a forest ranger stumbled across the ghostly scene during World War II, explanations for why hundreds of people died there have abounded. These unfortunates were invading Japanese soldiers; they were an Indian army returning from war; they were a king and his party of dancers, struck down by a righteous deity. A few years ago, a group of archaeologists suggested, after inspecting the bones and dating the carbon within, that the dead were travelers caught in a lethal hailstorm around the ninth century.

It turns out that the skeletons accumulated over at least three different episodes, and they reflect entirely different groups of people, one from as far away as the Eastern Mediterranean. This deepens the historical mystery: how did groups of people from three different places and times end up dead in this one place?

What I really like about the study is the multidisciplinary approach relying upon ancient DNA, stable isotopes, and radiocarbon to develop lines of evidence about the origin of the individuals. I wish that these had been combined with a deeper analysis of the skeletal biology of the individuals, including what can be said about their anatomical features, age, sex, and health status.

The study does include some information from a report on the physical anthropology of the skeletons, placed into a supplementary note (Supplementary Note 2). The note explains the connection between the work and a National Geographic documentary that was produced in 2004.

The following section is an edited version of an unpublished report generated before genetic data were available by co-author Prof. Subhash Walimbe. The goal of our edits is to synthesize the anthropological discussions included in that report with the genetic findings. Newly added statements dealing directly with the genetic results are shown in italics. Some of the content of the original reports was used as part of the script of a National Geographic television documentary that was made describing the Roopkund Lake Site, so there are similarities between parts of the text that follows and that script.

One reason why I’m interested in the skeletal information is that this case raises important questions about archaeological sites more broadly. Here’s a case where the context of the skeletal remains led scientists naturally to assume that they represented a single event or group. Now we know that the remains come from different groups who lived more than a thousand years apart, and some of whom came from very far away.

As scientists, we’re conditioned to assume that such coincidences don’t happen. Skeletal remains in one place with an ostensibly similar depositional situation should represent similar individuals. Yet more and more sites are revealing two situations that challenge the usual assumptions:

  1. Cemetery distributions that actually represent diverse people who have migrated to a single location, integrating (to varied extents) into a single society.

  2. Multiple depositional situations that occur within the same landscape or space, sometimes guided by geographic or geological conditions, but representing entirely different groups of people.

The first situation has become a widespread subject in stable isotope biology, as strontium and other stable isotopes have revealed individuals who originated far from the cemeteries where they ended up. The second situation is also being revealed more and more by technology as skeletal remains and genetic data show big differences between individuals from single archaeological sites. The recent analysis of the skulls from Apidima—one very Neanderthal-like in morphology, the other much more modern in what is preserved—is a great example of this.

Coincidences of body deposition from different times and cultures are cropping up. With a large enough record, of course, coincidences are inevitable. But my feeling is that they are turning out to be more common than we have assumed, and that means we need to apply extra critical analysis to many archaeological sites.

So a close study of the skeletal biology is very much in order. The supplementary note mentions a difference in “robusticity” among the skulls. Quantifying whether this actually reflects the population differences in the genetics, whether there are health differences that might have been noticed between groups, and whether there are other clues of different populations would be very worthwhile.

Because other cases are out there, and we’ve likely missed them by assuming that bodies found together must sample variation within a single group.

One more note on this study: This analysis included DNA sampling of 71 individuals, 35 based on whole-genome approaches. This is a huge study in terms of the history of ancient DNA analysis, but today it is a moderate sample size for an analysis of new data. We’ve come to the point where dozens of samples of genome-wide data are within the bounds of a single paper in a specialty journal, to answer a fairly specialized historical question. That’s a great thing from the standpoint of replicability of studies and building a more powerful understanding of the past. But it’s a concern if this is a standard that other studies of the past will be forced to meet. It is not always appropriate to require such large sample sizes, even if a skeletal sample numbers that many individuals.

In this case, the sample size of skeletal material was large, the skeletons themselves are in an unsecured situation where they are subject to looting, and the connection to possible descendant communities was previously unknown. It would be valuable to use this paper for a case study of how confident we can be of conclusions at varied sample sizes—possibly in combination with deeper skeletal analyses.

Quote: Loren Eiseley on Neanderthal hybridization

Loren Eiseley was an anthropologist well known in the mid-twentieth century for his popular writing about human evolution and science more generally.

In the Winter 1946 issue of Prairie Schooner, Eiseley had an essay recounting the discoveries of the Skhūl and Tabun fossil material from Mount Carmel, Israel. In 1939, Theodore McCown and Arthur Keith had published their analysis of the fossil remains, suggesting a population in the “throes of evolution” with characters of both modern humans and Neanderthals. They rejected an alternative hypothesis, that the skeletons might reflect hybridization between two anatomically divergent populations, but this hypothesis was later taken up by other authors including the geneticist Theodosius Dobzhansky.

Eiseley’s discussion of the situation is worth reading in full as a picture of its time. I wanted to share some paragraphs as examples of his use of language and his standing as a contemporary observer of a moment in science when hybridization was seen as an important topic in human evolution.

Another explanation inevitably comes to mind as we survey this assemblage of beings who dwelt on the slopes of Mount Carmel—an explanation which, though intriguing in its own right, would illuminate but little the origins of that creature in which we are intensely interested, namely, ourselves. Can it be—so runs the little disturbing thought which will not be quieted—can it be that we are dealing with a group of mixed bloods, of hybrids between Neanderthal and a type already essentially modern?

Lest anyone think that it was silly for Eiseley to describe the idea of hybridization as “a disturbing thought”: Even well into the 1990s some serious professional anthropologists were still claiming that hybridization was unlikely or impossible because these ancient populations would have been disgusted at the thought of mating with each other. Scientists carried with them an intense bias from their own social experiences and preferences, and they were unapologetic about playing upon the biases of others in their popular writings.

Eiseley ends his essay with a paragraph that could be a nice example of the “jury is out” statements at the end of many popular science articles.

The fault does not lie in this unique and invaluable discovery which, among other things, has demonstrated that an essentially modern brain and facial structure already existed in the Riss-Wurm Interglacial. It lies in our inadequate knowledge of human genetics and the processes which influence or determine the rapidity of human evolution. Only as our knowledge of the Ice Age population of Palestine increases and the science of genetics grows more sure will the vistas of human prehistory opened by the sleepers in the cave of Mugharet es Skhul be capable of interpretation by modern eyes.

Epstein's science posse

I have been following the story of the late Jeffrey Epstein very closely. The combination of politics and money for this billionaire alleged child sex trafficker continues to command huge press and public attention. I have been appalled by the sheer number of prominent scientists and intellectuals who have been revealed to be on Epstein’s private plane flight logs, or guests of his salons privées, or grantees of his various charitable foundations.

The Observer has an article by Luke Darby that looks at this aspect of the Epstein story: “Private jets, parties and eugenics: Jeffrey Epstein’s bizarre world of scientists”.

Lawrence Krauss, a physicist who retired from Arizona State University, even continued defending Epstein after his 2008 conviction, telling the Daily Beast in 2011: “As a scientist I always judge things on empirical evidence and he always has women ages 19 to 23 around him, but I’ve never seen anything else, so as a scientist, my presumption is that whatever the problems were I would believe him over other people.” He added, “I don’t feel tarnished in any way by my relationship with Jeffrey; I feel raised by it.”
Other scientists seem to have been drawn to the attention and spotlight that Epstein gave them. Evolutionary biologist George Church, one of the few researchers who has apologized for having contact with Epstein, which he attributes to “nerd tunnel vision”, told STAT News that “he is used to financiers, technologists, and celebrities seeking him out, and has become a quasi-celebrity himself”.

A year ago, I was in a discussion with a number of prominent science journalists about how universities monitor conflicts of interest. One thing that they emphasized was the vulnerability of the enterprise of science to the appearance of being bought by moneyed interests. There is no shortage of people and companies looking to buy credibility.

In the case of Epstein, it seems clear that one reason he paid the bills for various scientists is to buy himself social respectability in a certain circle. The list of scientists willing to sell themselves for this purpose is depressing.

Supplementary data loss

My inbox this morning has an article by Diana Kwon in The Scientist, looking into the data decay from the supplementary materials of published scientific articles: “The Push to Replace Journal Supplements with Repositories”.

The story leads with Vaughn Cooper, an evolutionary biologist who published a recent paper on a secondary school biology curriculum in the journal Evolution: Education and Outreach. Readers quickly discovered that the supplementary files were inaccessible.

Supplementary information for journal articles is a bad idea. It has always been a bad idea. Journals at the dawn of the World Wide Web, faced with the opportunity to publish infinite pages at low cost, chose instead to create proprietary non-edited slush piles for methods and analyses totally separate from the standard distribution format of their articles. It’s a near-miracle twenty years later that any supplementary information can still be read by today’s software.

Instead of becoming standardized rich media for data distribution, supplements became a bloated morgue where Excel spreadsheets go to die.

But it’s not just broken hyperlinks that frustrate scientists. As papers get more data-intensive and complex, supplementary files often become many times longer than the manuscript itself—in some extreme cases, ballooning to more than 100 pages. Because these files are typically published as PDFs, they can be a pain to navigate, so even if they are available, the information within them can get overlooked. “Most supplementary materials are just one big block and not very useful,” Cooper says.
Another issue is that these files are home to most of a study’s published data, and “you can’t extract data from PDFs except using complex software—and it’s a slow process that has errors,” Murray-Rust tells The Scientist. “This data is often deposited as a token of depositing data, rather than people actually wanting to reuse it.”

That is a super phrase: “Token of depositing data”. It’s exactly the concern I raised earlier this week: “Biological Anthropology association speaks out on data access”.

Going through the motions of providing data instead of actually creating a useful foundation for further work is a universal problem. After all, work is work. As the software industry has long known, there is no substitute for full and adequate documentation of code, but everyone is under pressure to produce outputs right now, and there is little incentive to do the work that benefits everyone else.

To their credit, the biological anthropologists reference several online data repositories, which grant agencies and publishers are increasingly encouraging. The Scientist article introduces the subject of repositories as a substitute for supplementary information:

Mark Hahnel, the CEO and founder of figshare, says that he started the company during his doctoral studies out of frustration with the limitations of supplemental files. “We expected to play this role for people who were producing outputs of research that didn’t fit into the model of publishing PDFs,” he tells The Scientist. But increasingly, academics also are using figshare for other reasons, he adds, such as being able to freely reuse material associated with a published paper without worrying about infringing upon copyrights. (While research outputs such as figures in a traditional journal may be subject to a publisher’s copyright policies, those deposited to repositories like figshare are usually published with a creative commons license that allows others to use the material without restrictions.)

Data repositories are a partial solution for only one of the problems of data access — providing a way for readers to get the data and code that underlie a published analysis. Building a durable foundation for further work is another task that should be recognized and valued much more.

Part of that task is publishing texts that actually provide the details of analyses. This comes back to supplements. Today, too many published scientific papers are little more than glorified abstracts. Details are hidden in hundred-page supplements, where they are poorly reviewed (if they are reviewed at all) and rarely replicable.

Some papers should be broken up into independent units. Multidisciplinary work that actually consumes hundreds of pages of detail should be formatted and published in a way that recognizes the detail, not hides it. Scientists working on multidisciplinary problems need to model good writing to enable readers to follow how the details from many analyses fit together. That would be vastly more valuable as a foundation for later work than a citation in a weekly science journal.

Making photogrammetry better with spectral imaging

Three-dimensional scanning of teeth and bones has become a bigger and bigger part of morphological analysis. When it comes to teeth, microfocus CT is the most prominent approach. This is in part because the CT can reveal internal structures such as the enamel-dentin junction, and in part because the translucency of enamel makes it very difficult to get accurate scans from other techniques, including laser surface scanning, photogrammetry, and structured light scanning.

Researchers have long dealt with the translucency of enamel in photography by spraying the teeth with substances that temporarily render them opaque, like chalk dust. Nowadays we realize that many kinds of trace evidence can persist on fossil teeth, and subjecting them to repeated surface treatments for the purposes of scanning and photography is a bad idea.

A new paper by Aurore Mathys and coworkers tries something new, using different parts of the light spectrum to scan the surfaces of teeth: “Improving 3D photogrammetry models through spectral imaging: Tooth enamel as a case study”.

Spy 2B taken at many wavelengths, from Mathys et al. 2019

Enamel is relatively less translucent at ultraviolet wavelengths, and so Mathys and coworkers were able to obtain better scans by imaging in the ultraviolet. I’m not sure this is the wave of the future, but it is a clever idea.

Biological Anthropology association speaks out on data access

For many years, biological anthropologists have been talking about data access.

This month the American Journal of Physical Anthropology is running a commentary: “Data sharing in biological anthropology: Guiding principles and best practices”.

An ad hoc committee on data access and data sharing produced the commentary, which they describe as a consensus of forty participants across the field.

I think that it is very positive that biological anthropologists are having these conversations. There is broad agreement that the data that underlie published studies should be available for replication and meta-analyses.

On the other hand, I’ve noticed over the years that many scientists who agree in principle that data should be available nevertheless find many ways to obfuscate or prevent access. I see some language in the published statement that makes me nervous. For example:

Project design should include a clear data management and sharing plan that is in place prior to the start of the project. Data sharing should be viewed over a time horizon related to the length of the research project, such that different parts of a data set may be shared at different times. For example, timelines in a grant proposal might include specific target dates for making particular data available (e.g., metadata, raw data, etc.).

I get very worried when I see this. In my experience, timelines and target dates in grant proposals do not translate into data access upon publication. In some areas of biological anthropology, projects that have been funded by our major grant agencies are less likely to archive data in ways that other researchers can access, even though they have filled in the mandatory “data access plans”.

It’s also curious that the NSF-funded data repositories for biological anthropology data, such as MorphoSource and PaleoCore, are not included on the list of recommended data repositories. I know that many projects have satisfied NSF data access plan requirements by referring to these repositories. Yet some people have worried that such data repositories are not sustainable in the long term because they rely upon continued funding.

Anyway, I recommend reading the statement and thinking about how the best practices can be improved.

The complexity of paleomagnetic pole flipping

Scott Johnson writes at Ars Technica about the Brunhes-Matuyama boundary: “The last magnetic pole flip saw 22,000 years of weirdness”.

The researchers interpret this additional data as showing a major weakening of the magnetic field starting 795,000 years ago before the pole flipped and strengthened slightly. But around 784,000 years ago, it became unstable again—a weak field with a variable pole favoring the southern end of the planet. That phase lasted until about 773,000 years ago, when it regained strength fairly quickly and moved to the northern geographic pole for good.

The Brunhes-Matuyama paleomagnetic reversal is conventionally recognized as the boundary between the Early and Middle Pleistocene. When we talk about recognizing geological time periods, it is important to realize that our understanding of the boundaries is limited by the precision of our geochronological methods, and the physical processes that give rise to geological changes themselves.

This is an example where a boundary has 22,000 years of wiggle room that we might not have expected. In the span of 780,000 years, that’s not a long time, but if we want to examine whether two events are simultaneous or one caused the other, it’s a long time.

Profile: David Reich on ancient DNA

Harvard geneticist David Reich was recently awarded a prize in Molecular Biology from the National Academy of Sciences. On the occasion, PNAS has published an interview with him, by journalist Beth Azar: “Q and As with David Reich”.

The interview may not offer much new for people following human evolution closely, but I thought it was worth sharing Reich’s comments on how the field of ancient DNA might move forward:

PNAS: What are you most excited about moving forward?
Reich: I’d like to help midwife this explosive new field into something that is mature and fully integrated into archeology. One goal is to help generate a lot more data from understudied places in the world, especially outside of Europe, and to build an ancient DNA-based atlas of human migrations all around the world. I would also like to help realize the potential of ancient DNA to provide insights into biology. To understand biological change over time, it is critical to understand how the frequencies of genetic variations change. To do that, large sample sizes of ancient people are needed. In the last two years, due to efforts by our lab and others to scale-up data production, the needed sample sizes are finally becoming available.

Dinosaur property war

Phillip Pantuso of the Guardian reports on the legal battle over the ownership of significant dinosaur fossils: “Perhaps the best dinosaur fossil ever discovered. So why has hardly anyone seen it?”

In a test case last November, the court ruled that fossils found on Montana state and private land could be considered minerals. “Once upon a time, in a place now known as Montana, dinosaurs roamed the land,” begins Judge Eduardo Robreno’s opinion. “On a fateful day, some 66m years ago, two such creatures, a 22ft-long theropod and a 28ft-long ceratopsian, engaged in mortal combat. While history has not recorded the circumstances surrounding this encounter, the remnants of these Cretaceous species, interlocked in combat, became entombed under a pile of sandstone. That was then … this is now.”

This case has been widely reported recently, and the Guardian account provides more detail and background than other stories I’ve seen. It is very different for me as a paleoanthropologist to think about the world of commercial fossil hunters. I can’t disagree with Horner’s opinion expressed in the article:

They contacted natural history museums around the world, including the Smithsonian – where the bones were offered for a reported $15m – and the Museum of the Rockies, in Bozeman, Montana, whose then head paleontologist, Jack Horner (the inspiration for the character played by Sam Neill in Jurassic Park) told them they were scientifically useless.
“In order for a specimen to be of scientific use and publishable, we have to know its exact geographic position, its exact stratigraphic position, and the specimen must also be in the public trust, accessible for study, which this specimen is not,” Horner says.

Fossils are often beautiful objects, and museums are often great showcases for these objects for public engagement and understanding. But today the science requires much more detailed examination of the sedimentary context of fossils than it did in the nineteenth century. Not every fossil is of great interest to present-day scientists. For scientific research today, separating fossils from their context should be a scientific judgement, in which we must weigh the destruction of context against the possibility of collecting and analyzing information.

For the interests of science, the best place for many fossils is to keep them in the ground. When we excavate anything, there is a loss of information and context, a destruction. As technology has developed, it has given us ways to study fossils and their context with less destruction, and to collect information that was once invisible or simply discarded. The future will bring better methods. In every case, we must consider whether today is the right historic time to separate a fossil from its context, balancing the gain to science against the loss of future opportunities—and any risks to the fossil in its present location.

For hominin fossils, the decisions are just as complex. I’m very glad that private ownership and market value of the fossils is not an issue for our work in South Africa.

Are parsimony analyses better than Bayesian methods for phylogenetics?

During the past few years, Bayesian approaches to phylogeny reconstruction have become more and more widespread, including analyses of fossil hominins. Among hominins, Bayesian approaches sometimes lead to very different results from parsimony, even when applied to exactly the same datasets.

That’s a problem. As a field, we can put much more effort into building morphological datasets of fossils. The best existing datasets still have enormous holes and gaps—they are highly biased toward cranial and dental traits, and even these traits are underrepresented in published datasets compared to fossils that preserve the traits. Researchers who have tried to include some specimens in their analyses have actually been denied permission to study them, meaning that they must rely on published studies only, which exclude many traits. So there’s much room to better document the fossil record that already exists.

Last year Robert Sansom and coworkers carried out a study to examine what difference it makes to use Bayesian methods versus parsimony upon the same datasets. Their conclusion is stated in their title: “Parsimony, not Bayesian analysis, recovers more stratigraphically congruent phylogenetic trees”.

Other scientists have looked at how these methods perform in generating phylogenetic trees for simulated data. In that artificial context, Bayesian methods do better than parsimony. But what about real data? Sansom and colleagues considered it possible that some aspects of real datasets make them different from the usual simulated datasets.

That’s tough to test, because we don’t know the real phylogeny in real datasets. But Sansom and coworkers looked at how the different methods perform with relation to the stratigraphic position of fossils — testing for stratigraphic consistency. They found that Bayesian algorithms don’t do so well:

Bayesian analyses yielded trees that were significantly less congruent with stratigraphic data. Given that the 167 empirical datasets were from a wide range of authors, clades, time periods and taxonomic levels, we can place confidence in the small but significant differences observed. Taking stratigraphic range data as a benchmark independent of morphology, therefore, indicates that parsimony should be preferred over Bayesian analyses, but these empirical results differ from simulation studies. We explore a few possible explanations for this discrepancy.

To be honest, the difference in performance between the two methods in this study is pretty slight. Both methods do badly in generating trees that are stratigraphically consistent. Parsimony was better in a statistical sense but the difference was not large. To me, the bottom line is that some real datasets have features that make Bayesian methods work badly, and others probably have features that make parsimony work badly (and many probably are unsuitable for both).
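The stratigraphic benchmark can be made concrete with a toy calculation. This is only an illustrative sketch of the general idea, not any of the congruence indices Sansom and colleagues actually computed: for a simple ladder-like (pectinate) branching order, count how many pairs of taxa branch in an order that contradicts their first appearances in the rock record. The taxon names and ages below are entirely hypothetical.

```python
def stratigraphic_inversions(branching_order, first_appearance):
    """Count pairs of taxa whose branching order contradicts their
    stratigraphic order. Zero means fully congruent.

    branching_order: taxa listed from earliest-branching to latest.
    first_appearance: taxon -> first-appearance age in Ma (larger = older).
    """
    inversions = 0
    n = len(branching_order)
    for i in range(n):
        for j in range(i + 1, n):
            # An earlier-branching taxon with a *younger* first appearance
            # than a later-branching taxon counts as one inversion.
            if first_appearance[branching_order[i]] < first_appearance[branching_order[j]]:
                inversions += 1
    return inversions

# Hypothetical first appearances, in Ma
ages = {"A": 3.5, "B": 2.8, "C": 3.0, "D": 1.9}

# B branches before C but appears later in the record: one inversion
print(stratigraphic_inversions(["A", "B", "C", "D"], ages))
```

A tree that matched the fossil record perfectly, like ["A", "C", "B", "D"] with these ages, would score zero. Real measures of stratigraphic congruence are more sophisticated than this pairwise count, but the underlying question is the same: does the branching order agree with the order of appearance?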

One thing that the study discusses is whether published trees and methods have already been biased by researchers who were aiming for a particular solution:

Cycles of revision and re-analysis of morphological data matrices during construction could lead practitioners to prioritize phylogenetic solutions that fit some preconceived ideas for final publication (either consciously or subconsciously), including stratigraphic fit. Under such circumstances, parsimony trees might exhibit artificially elevated stratigraphic congruence because parsimony is the historic default method used to evaluate morphological data.

From my experience with hominins, this kind of bias is a real possibility. Until the last few years, paleoanthropologists seemed to aim at a particular kind of phylogenetic hypothesis: one in which successive species in stratigraphic order are progressively more closely related to living humans. That’s an intrinsically unlikely pattern of relationships. Even though scientists profess to accept that human evolution was a tree, they still tend to arrange fossil species as if they formed a straight line.

In conclusion, our analyses demonstrate a clear result: Bayesian searches yield trees that have significantly lower stratigraphic congruence compared with trees from parsimony searches. We find little difference between parsimony using equal and implied character weighting—they are roughly comparable with respect to stratigraphic congruence. If stratigraphic congruence is taken as a benchmark for phylogenetic accuracy, then, maximum parsimony is the preferred method of choice for analysis of morphological data.

I’m not sure that stratigraphic congruence is anything that we should be aiming at. With hominins, it has become clear that relationships between lineages may have little to do with the age of the fossils. I’m also dubious that any change in algorithm is going to bring us closer to the “real” phylogeny. As long as we have substantial missing data from specimens that have already been found, the algorithms are garbage-in-garbage-out.

McKenna and Bell on ranked categories

I learned mammalian systematics and cladistics around the same time that Malcolm McKenna and Susan Bell published their 1997 book, Classification of Mammals: Above the Species Level. The movie-title placement of the colon in their book title suggests the epic nature of the task they took on.

McKenna began during the 1960s to undertake the task of updating Simpson’s 1945 mammal classification to accord with the rules of cladistics, and published an interim part of the work in 1975. When I learned mammal paleontology, it was with class notes drawn from mimeograph copies of old notes. McKenna’s 1975 classification had a prominent place in this—sometimes as the only available classification for some groups, sometimes as one among other conflicting alternatives. From it I learned the placement of many extinct branches of early mammals, and saw systematics as an important part of understanding the fossil record.

How can systematists generate a classification that makes sense given the phylogenetic arrangement of mammals? Much of the evolutionary diversity of mammals has historically been recognized at the level of Linnaean orders – Primates, Carnivora, Perissodactyla, and so on. It happens that many of these orders have a very similar time depth, because they originated at or shortly after the Cretaceous-Paleogene impact event 66 million years ago. But relationships below the level of these orders are diverse – some lineages diversified enormously with sudden adaptive radiations at different times, others were more conservative. And when extinct mammals come into the picture, the diversity and time depth expected of “orders” and other higher level groups become less clear.

Today, genomic evidence indicates that the order Primates is a sister to the order Dermoptera (colugos). The group including both is known as Primatomorpha. Rodentia and Lagomorpha (rabbits) are likewise sisters, grouped as Glires, and this group appears to be a sister to Scandentia (tree shrews), although possibly Scandentia is closer to Primatomorpha. All these together form a group with the name Euarchontoglires. There remain several levels above Euarchontoglires but below the class Mammalia. Primates themselves have an extinct group of relatives known as Plesiadapiformes—sometimes included as a stem group within Primates, but sometimes included within Primatomorpha as a sister to Primates. Each of these higher-level branches of the mammal tree belongs to a distinct level of the hierarchy.
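For readers who like to see the branching structure written out, here is the arrangement described above in Newick notation, the plain-text tree format used by most phylogenetics software. This is just a sketch of the named clades: the position of Scandentia is uncertain, as noted, and the little helper function is mine for illustration, not from any particular package.

```python
import re

# The grouping described above, in Newick notation. Scandentia's exact
# position is uncertain; this sketch uses the arrangement with Scandentia
# as sister to Glires + Primatomorpha's sister group left unresolved.
euarchontoglires = (
    "((Primates,Dermoptera)Primatomorpha,"
    "((Rodentia,Lagomorpha)Glires,Scandentia))Euarchontoglires;"
)

def clade_names(newick):
    """Extract internal-node (clade) labels: each follows a closing parenthesis."""
    return re.findall(r"\)([A-Za-z]+)", newick)

print(clade_names(euarchontoglires))
```

Each matched pair of parentheses is one bifurcation, which is exactly why, as discussed below, a cladistic classification needs so many more named levels than Linnaeus ever used.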

To deal with this complexity, systematists must multiply levels. But how many levels? Linnaeus could stack as many orders into a class as he liked, because he was not working with a bifurcating tree. A modern cladistic classification involves many, many bifurcations, each successive bifurcation in the tree representing a distinct level in the hierarchy. To get from one class to 40 orders requires at least six bifurcations: five hierarchical levels of classification between the class and order. Five is not enough for mammals, because of all the extinct stem branches represented by the known fossils. Each new fossil discovery of early mammals potentially introduces another level.
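The arithmetic here is just the depth of a balanced binary tree: a strictly bifurcating hierarchy that yields n terminal groups needs at least ceil(log2 n) bifurcations. A minimal sketch, with the function name my own:

```python
import math

def min_bifurcations(n_groups):
    """Minimum number of bifurcations (tree depth) for a strictly
    bifurcating hierarchy to yield n_groups terminal groups."""
    return math.ceil(math.log2(n_groups))

# One class splitting into 40 orders needs at least 6 bifurcations,
# which means 5 intermediate ranks between class and order.
print(min_bifurcations(40))  # → 6
```

The same arithmetic gives the "theoretical minimum" of 13 levels that McKenna and Bell cite later for their classification of more than 5000 genus-level taxa, since 2^12 = 4096 is too few and 2^13 = 8192 is enough.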

Simpson (1945) had included fifteen levels from class to species; McKenna and Bell (1997) increased this to 25 levels—recognizing categories such as “magnorder” and “supercohort” above the order, and “parvorder” and “subtribe” for lower levels.

They addressed the interesting difference between a classification and a tree, by discussing how prefixes relate to the hierarchy. As I re-read this passage from page 18, I thought it worth sharing:

Certain taxonomic categories came to bear prefixes suggestive of special hierarchical linkage (e.g., in the family-group, subfamilies are always subsumed in families). Others did not (e.g., tribes might alternatively have been dubbed "microfamilies" or some such term connoting subordination). Use of a prefixed category implies that the category to which the prefix applies is also used. In the Linnaean system we do not construct superfamilies directly from subfamilies without also employing families. Logically, however, above the species level there is nothing special about sub- or supercategories. They all could have received unprefixed cardinal names or simply be referred to as taxa. Such names are, after all, just labels (recognition symbols) (Mayr 1953:391). That prefixed categories did not receive unprefixed cardinal names, free of reference to another rank, seems to us to be partly a matter of practicality and memorability, and partly a function of their authors' essentialistic belief in the objective reality (beyond a construct of human language) and commensurability of various examples of such taxonomic levels as classes, orders, families, and genera (see Slaughter 1982). We employ prefixed names for the sake of stability, because they have been long in use, but we do not hesitate to allocate to an incertae sedis position some taxa whose names happen to be prefixed. For reasons of stability we might not wish to change their rank or to list the lower-ranked contents but not the valid but prefixed monophyletic taxon containing them.

It is a thoughtful observation. A tree is a logical structure that does not care whether humans can recognize and remember its parts. One advantage of a system of classification is that it is built with human memory in mind. The use of categories that bear a hierarchical relationship not only in definition but also in the form of the category names themselves has utility. Once a student learns that a parvorder is below the level of an infraorder, and a mirorder is above the order but below the grandorder, they’re not likely to confuse them.


Any set of taxonomic levels faces a problem as soon as any new stem branch emerges between two adjacent levels of the hierarchy. Taxonomists who recognize lots and lots of levels have a buffer against taxonomic changes, because there will be empty levels. But with new discoveries of stem groups, the empty levels may eventually be filled.

We are in that situation with hominins. Hominini is a “tribe”. The group used to be called “Hominidae”, at the family level, but the discovery of the branching order of the apes argued for recognizing the family at a higher level of the tree, so that Hominidae includes great apes and humans, the subfamily Homininae includes African apes and humans, and the tribe Hominini includes only humans and fossil species closer to humans than to chimpanzees and bonobos. But clearly that still does not leave enough levels. The McKenna and Bell classification only provides family, subfamily, and tribe. The branch including chimpanzees, bonobos, and humans lacks a level in this hierarchy. Some scientists advocate recognizing this branch as the tribe Hominini, which would make humans and their fossil relatives a subtribe, Hominina. A different approach would be to introduce more levels: historically, below the family level, taxonomists have used categories like “infrafamily”, “hypersubfamily”, and “supersubfamily”.

None of this would matter very much to the everyday use of these groups if their names were not tied to the level. McKenna and Bell discuss this as well. What I didn’t realize is that the use of level-specific suffixes was itself a post-Linnaean innovation, with the laudable aim of making levels more consistent:

With the proliferation of ranked categories that had increased steadily from Linnaeus's original six, came also a perceived need to encode the names of taxa themselves as signifying that the taxa for which they stood were members of some particular rank. In each particular discipline, the names of family-group taxa came to have various standardized inflected suffixes linked to the perceived rank. Thus, in zoology, a name ending in "-idae" signifies a taxon at family rank. Latreille (1796), who introduced the family category to zoology, did not use the suffix "-idae". That modification was provided later by Kirby (1815), and has not only stuck but is now legislated by the ICZN. We think of these inflective conventions as part of the "Linnaean System" but, in fact, they are arbitrary post-Linnaean additions to it, originally added for their mnemonic usefulness but now the occasion of much pedantic drudgery whenever taxonomic rank is changed or organisms are transferred from one kingdom to another.

There has been plenty of pedantic drudgery associated with changing hominin taxonomy, and that’s not counting the many holdouts.

Anyway, what if our taxonomies routinely use more and more levels? Doesn’t that get hard to keep track of? It’s fascinating to me that McKenna and Bell defend their 1997 classification of 25 levels by noting that “it’s no harder to learn than the alphabet”. But this passage really made my jaw drop:

In the present classification of more than 5000 mammalian taxa that are assigned generic or subgeneric rank, additional categories have proven useful in depicting in words a somewhat richer hierarchical arrangement of mammals than that found in Simpson's (1945) classification. There are now many more mammalian taxa to classify than was the case in 1945, both in real terms and because of the efforts of "splitters" and paleontological "apparent lineage choppers". Increasingly, most of these named organisms are made known from fossil materials only, sometimes very poorly represented. Moreover, the cladistic revolution in systematics has resulted in far more attention to phylogeny than was the case in the 1940s. The 25 taxonomic levels used in our classification actually fall closer to the theoretical minimum, 13 (see below for formula), than to the thousands that would be required if the classification reflected a completely pectinate (and very unlikely) sequence of taxa. The hierarchical level sequence is no more difficult (for humans) to learn than the alphabet, or probably less so in that some of the levels are very easy to remember because of meaningful prefixes and suffixes. We see no particular reason why, if useful, additional categories (or simply unranked taxa) should not be proposed (or revived). Computers can remember them for us. Indeed, in the program Unitaxon (TM) used to process the data resulting from this classification, facilities exist to expand and keep track of the names, number and sequence of taxonomic levels indefinitely, if deemed appropriate.

Ha! We don’t need to remember taxonomic categories because the computers can remember them for us!

If you’re interested in outsourcing your taxonomic knowledge to a computer, you can still see the Web 1.0 page for Unitaxon, listed as a “software product from yesteryear”. Here’s an excerpt:

Unitaxon Browser 2.0 is available directly from its developer, Mathemaesthetics, Inc. The application is distributed on CD for both Macintosh (System 8 and 9) and Windows (95 or later) operating systems. The Browser will work in Classic compatibility mode under Mac OS X. For maximum performance reasons, the Browser reads the entire classification into memory when you open the file. Depending on the level of taxon commenting in the database, the overhead is currently about 1MB RAM per 1200 taxa on average.
For instance, the most recent classification of the Mammals has been placed on the net in Unitaxon Browser format. It is our expectation and hope that other large taxonomic databases will follow suit.
The price per copy for the Browser is US $128, plus shipping/handling.

Well, that’s one solution.

The changes in 20 years have been enormous. Even the link in the Unitaxon website to “vertebrate paleontologists” at the AMNH no longer connects to vertebrate paleontologists — the AMNH site now redirects the link to its “Center for Biodiversity and Conservation”. Malcolm McKenna passed away in 2008.

Anyway, the McKenna and Bell introduction has a lot of really interesting and useful thoughts about taxonomy and classification. The volume was published at the height of cladistic morphological classification, just as DNA evidence was starting to become a potent source of information about the deep relationships of mammal groups. As such, the McKenna-Bell classification has become outmoded in many details, even if some of the guiding concepts behind their taxonomy remain valuable.

Quote: Blumenbach looking for the horned rabbit

I have open Johann Blumenbach’s A Short System of Comparative Anatomy, in the 1807 English translation by William Lawrence. The full text is on Google Books.

In a footnote to page 24, where he describes the various horns and antlers of the group known as the Pecora, Blumenbach describes the jackalope!

I have collected about twenty instances, from the middle of the 16th century downwards, in which horned hares are said to have been found, with small branches like those of the roebuck, both in different parts of Europe, and in the East Indies. Were this fact ascertained, it would furnish another striking point in which these animals resemble the pecora. The fact is suspicious, because I have not yet been sufficiently satisfied of a single instance in which the horns were on the hare's head, although every trouble has been taken to procure information; and they appear in the drawings, which I posses [sic], by far too large for a hare.

It seems likely that the source of this idea was the muntjac, or other small cervids. Still, it’s not hard to imagine Americans heading west, thinking that some of the large jackrabbits might turn out to be antelope-like in more ways than one.