White House policy on data access

The White House this week announced a new policy on public access to results from federally funded research. The announcement has gotten

Michael Eisen comments: "No celebrations here: why the White House public access policy sucks".

The administration fell hook, line and sinker for the ridiculous argument put forth by publishers that the only way for researchers and the public to get the services they provide is to give them monopoly control over the articles for a year, the year when they are of greatest potential use.
Think about how absurd this is. Publishers, whose role should be to disseminate information as widely as possible, are now the only reason why the public will continue to not have access to research results their tax dollars paid for.

Why is Eisen so exercised? Here's an excerpt from the White House policy memo describing the policy on publication access:

In developing their public access plans, agencies shall seek to put in place policies that enhance innovation and competitiveness by maximizing the potential to create new business opportunities and are otherwise consistent with the principles articulated in section 1.
Agency plans must also describe, to the extent feasible, procedures the agency will take to help prevent the unauthorized mass redistribution of scholarly publications.

In other words, it's no longer just a matter of copyright agreements with publishers; now the federal agencies themselves must help police PDF sharing among researchers. I wonder where "mass redistribution" will kick in.

Further, the memo does not set a 12-month access embargo as a maximum; it directs agencies to adopt the 12-month embargo as a guideline. There is a lot not to like in the memo.

Most of the public attention to the decision has been directed at the effects on scientific publications. I have long been interested in a second area: the public access to data generated by federally funded research.

The White House Office of Science and Technology Policy last year requested public comment on two questions: open dissemination of federally-funded research and open access to data resulting from federally-funded research. I commented last year in response to the OSTP request ("Public interests in data from federally funded research") about the value of data to scientists and others who are not members of federally funded labs. The present announcement from the White House did not indicate how these comments from last year may have contributed to the decision, but it includes general recommendations on both publication and data access.

As it stands, the text of the memo essentially keeps in place the data access requirements established under the Bush administration. That is not a bad thing, and indeed the recommendations listed in the memo seem very reasonable. I quote them here at length:

Each agency's public access plan shall:
  a) Maximize access, by the general public and without charge, to digitally formatted scientific data created with Federal funds, while:
    i) protecting confidentiality and personal privacy,
    ii) recognizing proprietary interests, business confidential information, and intellectual property rights and avoiding significant negative impact on intellectual property rights, innovation, and U.S. competitiveness, and
    iii) preserving the balance between the relative value of long-term preservation and access and the associated cost and administrative burden;
  b) Ensure that all extramural researchers receiving Federal grants and contracts for scientific research and intramural researchers develop data management plans, as appropriate, describing how they will provide for long-term preservation of, and access to, scientific data in digital formats resulting from federally funded research, or explaining why long-term preservation and access cannot be justified;
  c) Allow the inclusion of appropriate costs for data management and access in proposals for Federal funding for scientific research;
  d) Ensure appropriate evaluation of the merits of submitted data management plans;
  e) Include mechanisms to ensure that intramural and extramural researchers comply with data management plans and policies;
  f) Promote the deposit of data in publicly accessible databases, where appropriate and available;
  g) Encourage cooperation with the private sector to improve data access and compatibility, including through the formation of public-private partnerships with foundations and other research funding organizations;
  h) Develop approaches for identifying and providing appropriate attribution to scientific data sets that are made available under the plan;
  i) In coordination with other agencies and the private sector, support training, education, and workforce development related to scientific data management, analysis, storage, preservation, and stewardship; and
  j) Provide for the assessment of long-term needs for the preservation of scientific data in fields that the agency supports and outline options for developing and sustaining repositories for scientific data in digital formats, taking into account the efforts of public and private sector entities.

Essentially all of these recommendations are already part of the NSF data access policies, meaning that the new White House memo will maintain the status quo at that level.

The problem is that the current policy is toothless. The lack of continued data access is a very serious problem threatening the integrity of science. Self-archiving and institutional archiving have been sufficient to pass the data management portions of grant applications, but they have proven woefully insufficient to enable access to data. Meanwhile, some fields collect data intensively but place little or none of it in the public domain through digital repositories. The recommendations listed above do nothing to change the current situation.

Nevertheless, there is some room within the recommendations for agency directors to take bolder action on data access. Section (j) perhaps provides the best hope. If federal funding agencies actually assess the long-term needs of each field they support, many fields (including anthropology) will clearly benefit from the establishment of standard digital repositories.

I hope that NSF will not sit on its current policy but will instead work to extend access more broadly. At the same time, I wish the White House had given clearer guidance to enable the creation of digital repositories and to require their standard use as a condition of continued funding of research projects.