Stopping ‘wasteful recollection of data already held by other research groups’

Last week I commented on the American Association of Physical Anthropologists’ recent statement on access to data: “Biological Anthropology association speaks out on data access”.

This is a big issue to which many voices have contributed. I’d like to bring attention to a broader selection of those views—not just the loudest or most widely read.

Three years ago, Lynn Copes and coworkers published a descriptive paper in Scientific Data for a dataset of CT scans of non-human primate skeletal material. The paper is open access: “A collection of non-human primate computed tomography scans housed in MorphoSource, a repository for 3D data.”

The sample consists of 489 scans taken from 431 specimens, representing 59 species of most Primate families. These data have transformative reuse potential as such datasets are necessary for conducting high power research into primate evolution, but require significant time and funding to collect. Similar datasets were previously only available to select research groups across the world.

This has been a groundbreaking data release, providing an irreplaceable source of data not only for comparative analyses with other primate collections but also for education. The Harvard Museum of Comparative Zoology, which houses the primate material in the study, is to be congratulated for its forward thinking.

In the Scientific Data paper, Copes and colleagues commented on the overall situation with data access in biological anthropology. One paragraph is especially worth sharing:

Despite this rush to digitize, comparative morphology is experiencing a crisis as a mode of addressing large-scale evolutionary questions due to the difficulty involved in accruing datasets large enough to have high explanatory power, and the small community of researchers that can participate effectively. This presents a paradox: If so many researchers are putting large efforts into scanning, where are the massive samples? Though a few research groups have managed to generate large samples of scans comprehensively representing diversity in one clade or another, this work has been time consuming, and expensive: as a result these scans are not made widely accessible to non-collaborating researchers. This inequality in access to what is now essential, basic data clearly falls short of scientific ideals for meritocracy. Furthermore, a significant component of the unmanageable demand for 3D scan data experienced by museums may represent wasteful recollection of data already held by other research groups.

We are in an age where scientists must recognize the risk of destruction, damage, and loss of physical specimens held by museums around the world. Creating high-fidelity digital models of the physical objects and distributing those models widely is increasingly essential as a strategy for protecting objects with scientific and heritage value.

Of course, researchers cannot answer every possible question from a digital model. But even for questions that must be addressed from original specimens, providing high-resolution digital data is essential so that future researchers can see the details of those analyses and use the data for comparison when analyzing other objects.