john hawks weblog

paleoanthropology, genetics and evolution

open access

  • White House to recognize open science

    Wed, 2013-05-22 13:53 -- John Hawks

    The White House is looking to recognize people who are leading in open science efforts, either by providing free access to data or by using data that is already publicly available. I imagine that public education efforts using open data would also qualify for this recognition: "Seeking Outstanding 'Open Science' Champions of Change". The reward is a trip to a White House event June 20.

    We are asking for your help to identify “Open Science” Champions of Change—outstanding individuals, organizations, or research projects promoting and using open scientific data for the benefit of society. For example, a Champion’s work may involve:

    Providing free access to data or publications generated from scientific research; or

    Leading research that uses publicly available scientific data.

    Anyone can nominate an “Open Science” candidate for consideration by May 23, 2013 (under “Theme of Service,” choose “Open Science”). In the “Reason for Nominating” section of the nomination form, please also include information about any upcoming open-science-related announcements or new steps that the individual or organization you are nominating has planned, which could potentially be launched at the Champions of Change event.

    I just found out about this process this morning, but it looks like a constructive step in recognizing people who are moving science in a more open direction. Earlier this year, the White House recommended a new policy on data access, which I found to be very helpful in comparison to the concurrent policy on publication access: "White House policy on data access".

    Nominations for this honor are due tomorrow (Thursday), using the short online nomination form. I hope many worthy people can be recognized in this way!

    Synopsis: 
    A call for nominations for excellent open science researchers and advocates
  • Open 3-d archive of Kromdraai

    Tue, 2013-05-14 08:38 -- John Hawks

    A new paper in the Journal of Human Evolution by Matthew Skinner and colleagues [1] announces the new availability of an open archive of microCT data from the site of Kromdraai, South Africa, with a large collection of hominin specimens curated in Pretoria at the Ditsong National Museum:

    Digital representations of vertebrate fossils are quickly becoming a standard source of data for scientific inquiry and non-destructive imaging of the internal structure of fossils is opening up new avenues of research that will further our understanding of fossil taxa. The purpose of this paper is to formally announce the availability of high-resolution microtomographic (microCT) scans of hominin fossils from the site of Kromdraai B (known as the ‘hominin site’, as opposed to Kromdraai A, the ‘faunal site’), South Africa. These microCT scans are the result of a collaborative research project between the curatorial institution of the Kromdraai fossils, the Ditsong National Museum of Natural History (DNMNH; formerly the Transvaal Museum) in Pretoria, South Africa and the Department of Human Evolution of the Max Planck Institute for Evolutionary Anthropology (MPI-EVA) in Leipzig, Germany.

    In publishing these scans, we hope to stimulate research on these important specimens using virtual representations of the original fossils. We also envision that increased access to such data will stimulate additional research requiring study of the original fossils.

    This is really an outstanding development, a rich resource for education and further research. I want to congratulate all the people involved!

    The CT archive is hosted at the Max Planck Institute website, "DITSONG - CT Archive".

    Solving the problem of access has been especially difficult for scan data of hominin fossils. The physical specimens that represent ancient hominins are curated by museums in many countries around the world. Museums and other national institutions have a mission to steward their historical and cultural resources. A central archive, whether from a government sponsor like NSF or from a commercial entity such as Google, would be more convenient for researchers and save facilities and labor costs, but might take away some of the stewardship capability exerted by the separate institutions now. So to see a large institute entering into this kind of collaboration with a prominent national museum in another country makes me hopeful that the field is being persuaded about the benefits of more open access -- especially for educational use.

    Paleoanthropology is a comparative science, and good science requires comparing a specimen to the variation across samples of fossils and living primates. This is why casts have had such an important role in the history of the field. The key fossil specimens may never be in the same room with each other, but casts can be brought together for comparisons.

    Now with CT data, technology in principle makes it possible for every paleoanthropologist to have an archive of fossil morphology. That has great potential to save morphologists time and trouble. Simply keying a publicly available 3-d scan model with a label can allow much clearer communication about the form of a trait that may appear ambiguous on a fossil fragment.

    So why has this technology taken so long to get into the hands of paleoanthropologists? In 2005, I reflected on an article that forecasted a bright future for CT data archives in paleoanthropology: "Frontiers of human origins". I wrote:

    Personally, I think CT will have a limited set of impacts. The best thing is that it will allow any lab in the world to have as full a set of comparative data as have been released. Currently, it's useless for that purpose; there's just not enough access. But that is changing, and CT scans are as useful to a practiced eye as casts -- which are much less available today even as CT increases. In fact, high-resolution CT may essentially end casting of new fossils, since that is one of the major sources of damage. We'll be doing a lot of comparative work with imaging in the future.

    There is still not enough access. There are a very limited number of scans of hominin fossils that are openly available for download -- for example, Harvard's Peabody Museum made CT data for the Skhul V cranium available several years ago. Other CT data are available for sale; some are available with a consortium membership, and some can be acquired by direct inquiry to researchers. Right now, a student cannot use these Kromdraai data in comparison with other open fossil data unless that student is well-connected in the hierarchy of paleoanthropology. The day is still far away when every laboratory has access to a useful archive of fossil hominin morphology.

    Synopsis: 
    A new resource gives unprecedented access to imagery of a fossil hominin collection
  • Anthropology's online ecology

    Fri, 2013-04-05 10:19 -- John Hawks

    Jason Antrosio has composed a short report on the "Anthropology Blogosphere 2013 – Ecology of Online Anthropology". I appreciate his kind words about my work here, and love how he has connected the new media activity of many prominent anthropologists, the move to open access by Cultural Anthropology, and the increased activity of social media networks dedicated to connecting anthropologists. It really is an ecology with many niches for people to increase their engagement and connections across fields.

  • What should be the shape of the science journal landscape?

    Tue, 2013-04-02 23:12 -- John Hawks

    Michael Eisen, one of the founders of the Public Library of Science, has thought a lot about how to make the system of scientific publishing better. He has posted the text of a presentation he recently gave, which explains many of the current problems with access and curation: "The Past, Present and Future of Scholarly Publishing". In this passage, he suggests a different system that would obviate the problems of finding appropriate research among the 10,000 different journals of the current publishing environment:

    So what would be better? The outlines of an ideal system are simple to spell out. There should be no journal hierarchy, only broad journals like PLOS ONE. When papers are submitted to these journals, they should be immediately made available for free online – clearly marked to indicate that they have not yet been reviewed, but there to be used by people in the field capable of deciding on their own if the work is sound and important.

    The journal would then organize a different type of peer review, in which experts in the field were asked if the paper is technically sound – as we currently do at PLOS ONE – but also what kinds of scientists would find this paper interesting, and how important should it be to them. This assessment would then be attached to the paper – there for everyone to see and use as they saw fit, whether it be to find papers, assess the contributions of the authors, or whatever.

    This simple process would capture all of the value in the current peer review system while shedding most of its flaws. It would get papers out fast to people most able to build on them, but would provide everyone else with a way to know which papers are relevant to them and a guide to their quality and import.

    So far, this kind of value-added curation is not happening very much with PLoS ONE. I'm an associate editor and I still can't keep track of all the research in the journal relevant to me. But even though I have access to many paywall journals through my university library, I still love the ease of just clicking on a link to a PLoS article. It just works, no library proxy, no password, just text. Creative Commons text and graphics that I can freely comment on and reuse. The way science should be.

  • A new high-coverage Neandertal genome

    Wed, 2013-03-20 00:32 -- John Hawks

    Today, Svante Pääbo's group at the Max Planck Institute for Evolutionary Anthropology released high-coverage sequence data from a toe bone from Denisova Cave. The new genome comes a year after the same group released the high-coverage genome of the Denisova finger bone, a release that itself came several months before they published the first high-coverage analysis of that genome [1]. Today's announcement is here: "A high-quality Neandertal genome sequence". It adds a second high-coverage genome from Denisova Cave. Unlike the finger bone genome, this toe has produced a genome very much like Neandertal specimens from much further west, including the Vindija Neandertals.

    Something interesting in these data: the presence of a Y chromosome.

    There's not so terribly much we can say about a toe. This particular bone was first reported in 2011 by Mednikova [2], who described the specimen's anatomy. She found the toe similar in some respects to equivalent Neandertal toe bones, but also like recent humans in a couple of details. Still, the anatomy wouldn't be enough to conclude that the bone is a Neandertal, because we don't know much about the toes of other ancient human populations.

    The genetics are fairly clear about the level of similarity of this new genome to other Neandertals. From the announcement:

    Similarity of Neandertals and Denisova genomes

    The figure shows a tree relating this genome to the genomes of Neandertals from Croatia, from Germany and from the Caucasus as well as the Denisovan genome recovered from a finger bone excavated at Denisova Cave. It shows that this individual is closely related to these other Neandertals. Thus, both Neandertals and Denisovans have inhabited this cave in southern Siberia, presumably at different times.

    This is a cluster diagram based on genome-wide similarity, which doesn't tell us about possible mixture among the populations. But it does show the high degree of similarity among the known Neandertals. This new specimen from Denisova (labeled "Altai") is a bit further from them than they are to each other, but not much. It will be interesting to assess this degree of similarity in comparison with the within-population similarity of more living human populations.

    I'm reluctant to accept a dichotomy of "Denisovan" versus "Neandertal". Distinguishing the samples in that way invites a typological assumption about the ancient people, giving an impression of distinctness that I'm not yet convinced about. It remains to seriously investigate the hypothesis that one or both of these putative samples represents some amount of gene flow from each other, or from yet more ancient populations. But I suppose we're stuck with the "Neandertal from Denisova" and the "Denisovan from Denisova".

    Unless we go for "manual genome" versus "pedal genome", which is admittedly unappealing.

    There's not much meat in this announcement; that will wait for the full published analysis, which we can expect later this year. The most important aspect of this, like the Denisova data availability from last year, is that we can now start working with the high-quality data. As someone who works with sequences, I cannot overstate the importance of having the best high-coverage data available for our work.

    I have a paper in preparation where I make a relevant analogy, in this case noting last year's high-coverage Denisovan genome in comparison to the history of ancient DNA sequencing:

    To put this into context: the original 360bp sequence from Feldhofer 1 has been memorialized on a cross-shaped plaque at the site outside Mettmann, Germany. This plaque is approximately 1 square meter in size. A similar monument to contain the Denisova high-coverage data would need to be more than 14 kilometers across. Compared to the first sequencing effort in 1997, today’s state of the art involves the generation of more than 200 million times more data.
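
    The arithmetic behind that comparison is easy to check. Here is a minimal sketch in Python, assuming the plaque is treated as a square of 1 square meter holding 360 bases at uniform density, and taking the 200-million-fold figure as a round number. The constant and function names are hypothetical, chosen only for illustration.

```python
import math

# Figures quoted in the passage above.
PLAQUE_AREA_M2 = 1.0        # approximate area of the Feldhofer 1 plaque
PLAQUE_BASES = 360          # length of the original 1997 sequence, in bp
SCALE_FACTOR = 200_000_000  # "more than 200 million times more data"


def monument_side_km(scale_factor, plaque_area_m2=PLAQUE_AREA_M2):
    """Side length, in km, of a square monument holding scale_factor
    times the plaque's data at the same density (an assumption)."""
    area_m2 = scale_factor * plaque_area_m2
    return math.sqrt(area_m2) / 1000.0


side_km = monument_side_km(SCALE_FACTOR)
total_bases = PLAQUE_BASES * SCALE_FACTOR

print(f"{side_km:.1f} km across, {total_bases:.2e} bases")
# prints "14.1 km across, 7.20e+10 bases"
```

    Because the side of a square grows with the square root of its area, a 200-million-fold increase in data stretches a 1-meter plaque to only about 14 kilometers rather than 200,000.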

    It's a pretty awesome time for those of us exploring human evolution!

    Synopsis: 
    Noting the announcement of new data availability from Denisova
  • California's online imposition

    Tue, 2013-03-12 23:11 -- John Hawks

    This is big education news, from the California legislature: "Measure Seeks Campus Credit For Web Study".

    If it passes, as seems likely, it would be the first time that state legislators have instructed public universities to grant credit for courses that were not their own — including those taught by a private vendor, not by a college or university.

    “We want to be the first state in the nation to make this promise: No college student in California will be denied the right to move through their education because they couldn’t get a seat in the course they needed,” said Darrell Steinberg, the president pro tem of the Senate, who will introduce the bill. “That’s the motivation for this.”

    So instead of increasing funding to existing campuses at sufficient levels to train the students who are seeking education, California will mandate that online courses from other institutions be accepted as part of the degree requirements at its state universities and community colleges.

    Does that mean I'll soon have Berkeley anthropology students taking my online course for degree credit? We'll see....

  • Cultural Anthropology, open access

    Tue, 2013-03-12 08:05 -- John Hawks

    From Brad Weiss: "Cultural Anthropology will go Open Access in 2014".

    The Society for Cultural Anthropology (a section of the American Anthropological Association) is excited to announce a groundbreaking publishing initiative. With the support of the AAA, the influential journal of the SCA, Cultural Anthropology, will become available open access, freely available to everyone in the world. Starting with the first issue of 2014, CA will provide world-wide, instant, free (to the user), and permanent access to all of our content (as well as ten years of our back catalog). This is a boon to our authors, whose work we can guarantee the widest possible readership —and to a new generation of readers inside of anthropology and out.

    This is a good idea, and not an easy change for the Society to manage. In anthropology in particular open access is a valuable goal, because it can ensure that informants (and their families) can read the research that they help to create.

  • Binge learning

    Sun, 2013-03-10 22:22 -- John Hawks

    From Eli Dourado at The Ümlaut: "‘Binge Learning’ is Online Education’s Killer App".

    Binge viewing is so common that it is now beginning to affect the production of television shows. Increasingly, shows are made for bingeing. They have more intricate plots and recapitulate fewer past plot points. Viewers give the shows their undivided attention, and writers and producers respond with better TV.

    I thought of these facts this past weekend when I tried an online course for the first time. Because I wanted to brush up on my programming skills, I signed up for a Udacity computer science class on Friday. I was drawn in by the fact that there were no deadlines—I could put the class off if I got too busy for it. This concern was somewhat unwarranted, as I had finished half the class by Sunday evening. I realized that I had binged—on a class.

    The concept of "binge learning" seems a useful addition to the conversation about online learning. One issue about MOOCs pointed out by several commentators has been that an "open course" and "open materials" are different things, with different strengths. Having materials totally open means that a student is free to race through them as fast (or take as long) as desired. Open materials allow binge learning.

    An "open course" means that anyone can enroll in it. But the materials may be timed so that they are available only at particular times, and they may be restricted in access only to enrolled or registered students. Many students in an open course may find themselves unable to keep up with the pace of instruction. Others may be willing to work much faster, but the organization of the course may restrain them from binging on the material. It's the difference between watching a television series broadcast week by week and watching an entire season over the weekend on Netflix.

  • White House policy on data access

    Sun, 2013-02-24 23:29 -- John Hawks

    The White House this week announced a new policy on public access to results from federally funded research. The announcement has gotten

    Michael Eisen comments: "No celebrations here: why the White House public access policy sucks".

    The administration fell hook, line and sinker for the ridiculous argument put forth by publishers that the only way for researchers and the public to get the services they provide is to give them monopoly control over the articles for a year – the year when they are of greatest potential use.

    Think about how absurd this is. Publishers, whose role should be to disseminate information as widely as possible, are now the only reason why the public will continue to not have access to research results their tax dollars paid for.

    Why is Eisen so exercised? Here's an excerpt from the White House policy memo describing the policy on publication access:

    In developing their public access plans, agencies shall seek to put in place policies that enhance innovation and competitiveness by maximizing the potential to create new business opportunities and are otherwise consistent with the principles articulated in section 1.

    Agency plans must also describe, to the extent feasible, procedures the agency will take to help prevent the unauthorized mass redistribution of scholarly publications.

    In other words, it's no longer just a matter of copyright agreements with publishers; now the federal agencies themselves must help police PDF sharing among researchers. I wonder where "mass redistribution" will kick in.

    Further, the memo does not set a 12-month access embargo as a maximum; it directs agencies to adopt the 12-month embargo as a guideline. There is a lot not to like in the memo.

    Most of the public attention to the decision has been directed at the effects on scientific publications. I have long been interested in a second area: the public access to data generated by federally funded research.

    The White House Office of Science and Technology Policy last year requested public comment on two questions: open dissemination of federally-funded research and open access to data resulting from federally-funded research. I commented last year in response to the OSTP request ("Public interests in data from federally funded research") about the value of data to scientists and others who are not members of federally funded labs. The present announcement from the White House did not indicate how these comments from last year may have contributed to the decision, but it includes general recommendations on both publication and data access.

    As it stands, the text of the memo essentially keeps in place the data access requirements established under the Bush administration. That is not a bad thing, and indeed the recommendations listed in the memo seem very reasonable. I quote them here at length:

    Each agency’s public access plan shall:

    a) Maximize access, by the general public and without charge, to digitally formatted scientific data created with Federal funds, while:

    i) protecting confidentiality and personal privacy,

    ii) recognizing proprietary interests, business confidential information, and intellectual property rights and avoiding significant negative impact on intellectual property rights, innovation, and U.S. competitiveness, and

    iii) preserving the balance between the relative value of long-term preservation and access and the associated cost and administrative burden;

    b) Ensure that all extramural researchers receiving Federal grants and contracts for scientific research and intramural researchers develop data management plans, as appropriate, describing how they will provide for long-term preservation of, and access to, scientific data in digital formats resulting from federally funded research, or explaining why long-term preservation and access cannot be justified;

    c) Allow the inclusion of appropriate costs for data management and access in proposals for Federal funding for scientific research;

    d) Ensure appropriate evaluation of the merits of submitted data management plans;

    e) Include mechanisms to ensure that intramural and extramural researchers comply with data management plans and policies;

    f) Promote the deposit of data in publicly accessible databases, where appropriate and available;

    g) Encourage cooperation with the private sector to improve data access and compatibility, including through the formation of public-private partnerships with foundations and other research funding organizations;

    h) Develop approaches for identifying and providing appropriate attribution to scientific data sets that are made available under the plan;

    i) In coordination with other agencies and the private sector, support training, education, and workforce development related to scientific data management, analysis, storage, preservation, and stewardship; and

    j) Provide for the assessment of long-term needs for the preservation of scientific data in fields that the agency supports and outline options for developing and sustaining repositories for scientific data in digital formats, taking into account the efforts of public and private sector entities.

    These recommendations are all basically already in the NSF data access policies, meaning that the new White House memo will maintain the status quo at that level.

    The problem is that the current policy is toothless. Continued data access is a very serious problem threatening the integrity of science. Self-archiving and institutional archiving have been sufficient to pass data management portions of grant applications, but have proven to be woefully insufficient to enable access to data. Meanwhile, some fields have intensive data collection but very little or no data entering the public domain as part of digital repositories. The recommendations listed above do nothing to change the current situation.

    Nevertheless there is some room within the recommendations for agency directors to take bolder action on data access. Section (j) perhaps provides the best hope. If federal funding agencies actually assess the long-term needs of each field supported by funding, many (including anthropology) will clearly benefit from the establishment of standard digital repositories.

    I hope that NSF will not sit on its current policy but will instead work to extend access more broadly. At the same time, I wish the White House had given clearer guidance to enable the creation of digital repositories and to require their standard use as a condition of continued funding of research projects.

    Synopsis: 
    A new memo from the Obama administration piques my interest in data access.
  • Do citation indices count in tenure review?

    Tue, 2013-01-08 17:07 -- John Hawks

    Amy Brand comments on journal citation metrics and tenure and promotion, from the viewpoint of a university administrator [1]. The piece is a reaction to those who believe that publishing in open access journals is harmful for the careers of junior scholars:

    In 2010, Nature carried out a survey in which it asked readers about the use of metrics in decisions about new hires and tenure (Abbott et al., 2010). Three-quarters of the 150 readers who replied thought that metrics were being used in hiring decisions. However, provosts and other administrators contacted by Nature painted a different picture: ‘Metrics are not used a great deal,’ said Alex Halliday, head of the mathematical, physical and life sciences division at Oxford University. ‘The most important things are the letters, the interview and the CV, and our opinions of the papers published.’ Claude Canizares, vice president for research and associate provost at the Massachusetts Institute of Technology, had a similar message: ‘We pay very little attention, almost zero, to citation indices and counting numbers of publications’.

    This is a little misleading. By the time a tenure application goes to the provost, it has already been through many layers of review. Letter writers, grant reviewers and departmental colleagues do pay attention to high-profile publications. It is true that calculating the citation index of a journal does not add much information to this process, but a scholar who publishes only in very obscure venues will be dinged for it at these levels of evaluation.

    Fortunately, open access journals are no longer obscure. Additionally, access is becoming part of what it takes for a publication to be perceived as high-profile. Particularly in anthropology, there is a strong argument that providing access to research results is an ethical obligation to our research participants.
