Big data, no access, no replication possible

The New York Times has an article by John Markoff today, “Troves of Personal Data, Forbidden to Researchers,” pointing to several disputes over the standards for releasing data with scientific papers.

These cases mostly relate to data gathered by corporations about their users or customers, which raises privacy concerns that are similar in some ways to those attending biomedical research. For that reason, I don’t think that they are a good comparison with the situation in paleoanthropology, but they do overlap to a great extent with the issues in human genetics. In either case, the article has many elements that are useful to think about:

He added that corporate control of data could give preferential access to an elite group of scientists at the largest corporations. “If this trend continues,” he wrote, “we’ll see a small group of scientists with access to private data repositories enjoy an unfair amount of attention in the community at the expense of equally talented researchers whose only flaw is the lack of right connections to private data.”

Also, I did not realize this:

The data-sharing policy of the journal Science says, “All data necessary to understand, assess and extend the conclusions of the manuscript must be available to any reader of Science.”

Several paleoanthropology papers have been published in the last few years without meeting this basic standard.