Opening up paleontology

Ewen Callaway writes in Nature News this week on open access science in paleontology: “Fossil data enter the web period”. I write about this topic quite a lot. Me last year on the NSF data management requirements:

I mean, seriously -- they're going to "put people on notice that they have to think about it"? Give me a break.

Yeah, I’m a skeptic. Lots of entrenched interests oppose making paleontological data available to the public, and they’ve been acting as if the pressure for openness will just blow over.

The sad part is that so far they’ve been right. Data access requirements were first mandated as part of NSF and NIH reporting by a Republican Congress, signed by Bill Clinton. We’re now on our third administration, more than a dozen years later. I have been writing about these issues here for seven years, and I have seen very little progress toward making the primary research data available to the public. There’s been a lot of talk, and regrettably little action in paleoanthropology.

I wrote a long essay about this topic in 2005: “NSF and data access”. I described many of the efforts to make data access more open and to encourage digital archiving as a routine part of NSF-funded projects. My concern:

I do not think it would be overstating the problem to suggest that perhaps half the people teaching human evolution in four-year universities have never touched a cast of a Hadar fossil. I would be delighted to be proved wrong, but I don't think I am. Our field is educating students into a world in which A. afarensis is unknown in the laboratory and poorly represented in our textbooks. I'm not talking about new specimens, here, I'm talking about fossils that were found in the mid-1970's and monographed in 1982.

Looking back at that essay, I have two reactions. I’m very proud of what I wrote. I think I captured the main points while giving much credit to the structural drawbacks of open access. But I must say that I’m depressed that the situation has not changed in six years.

Callaway’s article gives me some hope. He describes how paleontologists and morphologists have begun to put some teeth into data access policies. The motivation for the article is the change in policies of the Journal of Vertebrate Paleontology to require access to certain kinds of raw data:

Propelled in part by data-sharing edicts from funding agencies such as the US National Science Foundation, the Journal of Vertebrate Paleontology announced in January that it would require authors to post raw data files on its website (A. Berta and P. M. Barrett J. Vert. Paleontol. 31, 1; 2011). It is also considering mandating storage in public repositories such as Morphobank. Meanwhile, the Paleontological Society in Boulder, Colorado, which publishes Paleobiology and the Journal of Paleontology, last month decided to archive data from its papers using a repository called Dryad. "My only concern is that archiving so far is an unfunded mandate," says Philip Gingerich, the society's president. "Archiving could easily consume an entire research budget."

In the past (and continuing in many cases), paleontology has involved a huge fixed and ongoing investment in curation. Museums have been the repositories and guardians of fossils, the primary resources of paleontological science. Digital data does not end that responsibility, and in some senses may increase the resources needed to maintain collections. So Gingerich’s point is an important one: data curation adds a large new expense, and many universities and museums are not up to the job, either because of a lack of funding or expertise.

But seeing this as an “unfunded mandate” is, in my opinion, the wrong perspective. Data curation is necessary for good science. In paleontology, results depend on reconstruction and comparison, and this process cannot be understood without access to the primary data. Digital methods make this vastly easier and more rapid, while greatly reducing the wear and damage from repetitive inspection and measurement of original specimens. More eyes on more specimens make for better morphological work. When a journal makes data access a condition of publication, that’s an enormously helpful step. It recognizes that data access supports the integrity of the science.

Unfortunately, there is the problem of phallus-swinging behavior among paleontologists:

Tensions between scientists who discover new fossils and those who analyse and synthesize their finds are not new, says Mike Benton, a vertebrate palaeontologist at the University of Bristol. For example, Jack Sepkoski of the University of Chicago, Illinois, who in the 1970s and 1980s studied mass extinctions in the global fossil record, faced criticisms for repurposing other scientists' field work. But, says Benton, "if you wanted to keep it secret, you shouldn't have published it".

Guess what: if you want to keep it secret, you need to send back your grant money and permits, and go into the collectors’ trade.