john hawks weblog

paleoanthropology, genetics and evolution

metascience

  • The advisor female students don't want

    Fri, 2012-03-09 17:54 -- John Hawks

    Kate Clancy has thankfully continued her series of posts about sexual harassment and fieldwork, and I want to direct the current post to the attention of everyone in academic anthropology: "Retrograde Reactions: “Lady in the Field” on the Aftermath of Sexual Misconduct". The entire story is chilling, but I wanted to emphasize the following, in which the student's graduate advisor capitulates to a colleague, "F.", who wants to sweep another colleague's harassment under the rug:

    My graduate advisor agreed with me that F.’s reactions were retrograde. He valued the collaboration with F., however, and pointed out that my taking formal action would effectively terminate that collaboration. As a student dependent on my advisor for research funds, supervision, and credentialing, I chose not to pursue formal action.

    Sadly, there are many students who cannot count on their graduate advisors to stick up for them in matters of university policy. Some supervisors of graduate students are cowards who allow themselves to be bullied by prominent researchers who can deny them access to field sites.

    Sexual harassment is uniquely malevolent but it is not an isolated pattern of behavior. Senior researchers create circumstances where harassment is a likely outcome when they bully their subordinates, harangue other scholars at conferences, and use access to field sites or materials as a reward for scientific agreement.

    I warn my talented undergraduates about situations like these and steer them toward graduate programs with adequate protection for students. Moreover, when field situations are unsafe for students, we should not reward them with grant renewals. Graduate advisors must not tolerate harassment of their students, and should know that the community is more powerful than the bullies.

  • A Solutrean publicity blitz

    Sat, 2012-03-03 14:48 -- John Hawks

    So....

    About all the "Solutrean Paleoindian" news this week...

    There is no new evidence, no revelation, no reason why other archaeologists should revisit this issue at this time. The news is free publicity for the release of a book.

    The book, by Dennis Stanford and Bruce Bradley, is titled Across Atlantic Ice: The Origin of America's Clovis Culture. It argues that ancient Europeans, carrying knowledge of the Upper Paleolithic Solutrean toolmaking tradition, voyaged across the icy North Atlantic around the time of the Last Glacial Maximum to establish a new population in the Americas.

    I've been out of town so it took me a while to figure out why all these newspapers were suddenly interested. None of the news outlets that employ knowledgeable science writers have jumped on this, for good reason. There's no news here except the book release. An exception is The Washington Post, which ran a long article featuring Stanford and Bradley's claims ("Radical theory of first Americans places Stone Age Europeans in Delmarva 20,000 years ago").

    At this point, somebody reputable needs to review this and give a serious account of the book's claims, because there's too much hype going around. I went to Amazon to see if there was a Kindle version of the book for me to review. But there isn't a Kindle edition. So I thought, OK, I'll order the hardback. But Amazon doesn't have it in stock.

    In other words, the University of California Press publicity machine has done its job.

    I want to give some links to some other recent books about Paleoindians. I will be reviewing and reading several of these as I go through Stanford and Bradley's book. That won't be until after the AAPA meetings, because the hardback of Across Atlantic Ice will take a long time to get here, so if you want to learn more about the initial inhabitants of the Americas, I suggest looking at one or more of these. All are published after 2000, but the older ones are showing their age. I include them because the authors, including Dillehay and Crawford, are experts with their own views that merit comparison. Slightly older volumes are more likely to be found in libraries, also, and comparing them can be a useful reminder that the evidence about early New World peoples really does continue to change.

    I'm sure there are other books by specialists that I have missed, and I'll be happy to update.

    UPDATE (2012-03-08): A reader writes with another suggestion:

    Another writes to note that the publicity for Across Atlantic Ice has been mostly generated by the Smithsonian, owing to Stanford's position there, rather than University of California Press.

    Synopsis: 
    Hype over claims that American Indians came from Ice Age Europe
  • Chris Henshilwood profile

    Tue, 2012-02-28 18:16 -- John Hawks

    Nature News has an article written by Jeff Tollefson, which profiles archaeologist Chris Henshilwood and his work at Blombos, South Africa: "Human evolution: Cultural roots".

    Most fascinating line, regarding his early exploration at Blombos:

    The Middle Stone Age was not part of his thesis, so Henshilwood covered the site up and moved on.

    That's sadly symptomatic of archaeological funding.

    Henshilwood has made a great career out of the MSA since then, as the article details. Now lots of money is flowing into interdisciplinary research trying to tie African MSA to paleoclimate. The article details some of those developments also.

  • Re-prioritizing faster communication

    Mon, 2012-02-27 00:44 -- John Hawks

    Two experts on social policy from the London School of Economics comment on the importance of blogging and public outreach for academics, in an interview reporting the startup of a new public policy blog.

    One of the recurring themes (from many different contributors) on the Impact of Social Science blog is that a new paradigm of research communications has grown up – one that de-emphasizes the traditional journals route, and re-prioritizes faster, real-time academic communication in which blogs play a critical intermediate role. They link to research reports and articles on the one hand, and they are linked to from Twitter, Facebook and Google+ news-streams and communities. So in research terms blogging is quite simply, one of the most important things that an academic should be doing right now.

  • Text-mining science

    Sat, 2012-02-25 12:48 -- John Hawks

    There are many reasons why we should have an arXiv for human evolution, and this isn't the most important one...but I really wish I could do this with the literature on modern human origins right now:

    They want to break down the full text of the articles into component phrases to see how often a particular word or phrase appears relative to others — a measure of how 'meme-like' a term is. Their goals: to give Arxiv a new tool for identifying original source papers in physics, mathematics and computer science — and to enable historians to spot trends from the 20 years that the repository has existed.

    “How do you find the moment when a given scientific transformation occurred?” asks Jean-Baptiste Michel, co-director of the Cultural Observatory and a postdoctoral researcher in psychology at Harvard. “You can help the reader figure out where in time the most relevant papers were located, which has always been difficult to do.”

    That's from an article by Eric Hand about the "Cultural Observatory" at Harvard, who are going to apply Google's n-gram approach to the physics preprint service ("Researchers aim to chart intellectual trends in Arxiv"). My manuscript on shrinking human brain size will be in there, but I don't imagine it's tightly linked to any trends in physics.
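    For readers who haven't seen the n-gram idea in action, here is a minimal sketch of it, assuming plain-text articles grouped by year. The function names and the tiny corpus are invented for illustration; this is not the Cultural Observatory's actual pipeline, just the basic counting step the article describes: how often a phrase appears relative to all phrases of the same length.

```python
from collections import Counter


def ngrams(text, n):
    """Split text on whitespace and return every run of n consecutive words."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def relative_frequency(docs_by_year, phrase):
    """For each year, the share of all same-length n-grams that match phrase."""
    n = len(phrase.split())
    target = phrase.lower()
    freqs = {}
    for year, docs in sorted(docs_by_year.items()):
        counts = Counter()
        for doc in docs:
            counts.update(ngrams(doc, n))
        total = sum(counts.values())
        freqs[year] = counts[target] / total if total else 0.0
    return freqs


# Hypothetical toy corpus: made-up snippets standing in for full article text.
corpus = {
    1992: ["regional continuity explains the fossils"],
    2010: [
        "complete replacement versus regional continuity",
        "archaic admixture changes the debate",
    ],
}

trend = relative_frequency(corpus, "regional continuity")
for year, share in trend.items():
    print(year, share)
```

    A real analysis would need proper tokenization, punctuation handling and far more text, but the relative-frequency step is the core of it.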

    My impression of the modern human origins problem over the last 20 years is that it unfolded along parallel lines with some long-term stability of citation and linking. Only a small number of papers were consistently cited across the gamut of researchers -- for example, archaeologists writing on modern human origins would consistently cite only a handful of papers from biological anthropologists, geneticists would usually cite only two or three papers consistently from archaeologists. Those papers formed a highly artificial tradition. Most often papers in Science or Nature, they were highly abbreviated forms of arguments, too brief to illustrate why specialists persisted in disagreements about certain issues. So, many geneticists writing about modern human origins never really understood the morphological argument in favor of some regional continuity, and many paleoanthropologists never understood the limits of genetic models supporting complete replacement.

    UPDATE (2012-02-25): A reader writes:

    Are you SURE that the shrinking brains aren't related to trends in physics? Like maybe string theory?

  • Floating to the top of the data

    Sun, 2012-02-12 20:10 -- John Hawks

    The New York Times writes today about "Big Data" and its effects on disparate fields of science and public policy: "The Age of Big Data".

    For my money, this quote should be at the beginning of the article instead of embedded near the end:

    Big Data has its perils, to be sure. With huge data sets and fine-grained measurement, statisticians and computer scientists note, there is increased risk of “false discoveries.” The trouble with seeking a meaningful needle in massive haystacks of data, says Trevor Hastie, a statistics professor at Stanford, is that “many bits of straw look like needles.”

    Big Data also supplies more raw material for statistical shenanigans and biased fact-finding excursions. It offers a high-tech twist on an old trick: I know the facts, now let’s find ’em. That is, says Rebecca Goldin, a mathematician at George Mason University, “one of the most pernicious uses of data.”

    The article begins by hyping the career prospects for graduates who can analyze large datasets. I would emphasize that good analytical skills don't emerge naturally from working with data; they must be learned as part of one's scientific training. The top hazard of working with large datasets is that they can temporarily knock out your BS meter.
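    The "many bits of straw look like needles" problem can be put in numbers. The sketch below is only an illustration, under the simplifying assumptions that every test is independent and that no real effect exists in any of them. The function names are mine, and the Bonferroni threshold is just one standard correction, not something the article proposes. Under those assumptions, running 20 tests at the usual 0.05 level already gives roughly a 64 percent chance of at least one false discovery.

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Expected count of 'significant' results when every null hypothesis is true."""
    return num_tests * alpha


def prob_at_least_one(num_tests, alpha=0.05):
    """Chance of one or more false positives across independent null tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / num_tests


for m in (1, 20, 1000):
    print(
        m,
        expected_false_positives(m),
        round(prob_at_least_one(m), 3),
        bonferroni_threshold(m),
    )
```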

    We are obviously in the realm of big data now in paleoanthropology, as we grapple here to compare genomes that sum into the terabytes. I periodically link to stories about open access in astronomy precisely for this reason: those instruments generate terabytes of data and more every night.

  • A fieldwork tale from beneath the pyramid

    Sun, 2012-02-05 12:11 -- John Hawks

    Kate Clancy shares a reader's story about her experiences as a graduate student doing fieldwork with a team of anthropologists: "From the Field: “Hazed” Tells Her Story of Harassment".

    My professor often joked that only pretty women were allowed to work for him, which led me to wonder if my intellect and skills had ever mattered. He asked very personal questions about my romantic life, often in the presence of the male students. His inappropriate behavior was a model for them, making it not only acceptable, but the norm. My body and my sexuality were openly discussed by my professor and the male students.

    This kind of story is too common, and I think it should be widely read. Many people reading this may imagine that it describes the field as it existed in 1970, but it is 2012 and there are many professional anthropologists who run their field programs like this today. The problem is not limited to cases where male supervisors discriminate against female students -- I know many personal stories of extreme field harassment of female students by female supervisors as well.

    From later in the story, this quote encompasses much of the truth of the matter: "I didn’t realize that many research projects are run like pyramid schemes, with rigid status hierarchies, ruthless competition, the exploitation of students and objectification of women." Tenure protects these people, even though in theory they can be dismissed for harassment, because it gives them much power to make trouble for those who would complain. I will note that our middle schools have grown-ups who do not tolerate this kind of behavior from seventh-grade boys.

  • "Journals seem noticeably less important than 10 years ago."

    Mon, 2012-01-16 16:56 -- John Hawks

    As ScienceOnline2012 gets underway later this week, the New York Times is running an article about open science: "Cracking open the scientific process". The article spends many paragraphs promoting a social networking startup for scientists called ResearchGate, which honestly strikes me as having a not-very-useful approach to openness. For example:

    Dr. Rajiv Gupta, a radiology instructor who supervised Dr. Madisch at Harvard and was one of ResearchGate’s first investors, called it “a great site for serious research and research collaboration,” adding that he hoped it would never be contaminated “with pop culture and chit-chat.”

    I doubt that a walled garden where scientists share their reprints is the wave of the future. The "answering questions" aspect of the site seems similar to the Faculty of 1000 and similar concepts. Such sites aim to make social sharing into a virtue for scientists by credentialing them. On the other hand, if a social network for science can succeed in filtering out politics, that might be worth paying for.

    There are many other things in the article. One thing that shocked me: The open access fee for Nature Communications is really $5000. Holy cow. For $5000 I could pay someone to sit in a coffee shop all day and hand-type the contents of my article into personalized e-mails to everyone who reads it. What the heck is that about?

  • Public interests in data from federally funded research

    Thu, 2012-01-12 20:20 -- John Hawks

    I submitted the following essay in response to the Request for Information on Public Access to Digital Data Resulting from Federally Funded Research from the National Science and Technology Council's Interagency Working Group on Digital Data.

    This RFI is not the same as the current bill before Congress ("Open access op/ed in NY Times"), which would restrict public access to research articles based on federally funded research. Research articles are a very important issue, but I hope that the access to digital data will not be overshadowed by the attention to published results. As a paleoanthropologist, I believe that access to digital data from federally funded research projects is a fundamentally important issue, as I remark below.

    Introduction

    The United States provides grant funding to scientists through many federal programs. This funding advances work of public interest that might not happen without federal assistance.

    The creation of scientific knowledge may serve the public interest directly by enabling useful inventions or supplying actionable information on issues of public importance. A funded project may also serve the public interest indirectly, by (1) finding negative results that prevent wasted effort or public harm; (2) building the scientific infrastructure that enables future discoveries and advances; (3) training new and established scientists in effective research techniques; and (4) enhancing international cooperation and public/private partnerships.

    Congress and the Executive Branch have recognized that access to the published results of scientific research is not sufficient to advance the direct and indirect public interests served by federally funded projects. Facilitating the indirect benefits of research is a major aim of federal agencies' "Broader Impacts" and data access rules. These policies have been a qualified success since their implementation, limited mainly by the exceptions carved out by programs and agencies to avoid requiring certain kinds of data to be reported along with research reports.

    I argue that open public access to digital data should be a requirement for all federally funded scientific research. Digital data can be maintained by federal agencies as a part of the reporting requirement of federal grant funding. Doing so will advance the interest of the public and ensure that today's science generates a continuing heritage of research excellence.

    Data access and transparency

    Transparency is essential to public trust. Scientific conclusions are formed by observation and replication, and for this process to be transparent, all data must be available for independent inspection. The possibility of such inspection should not be limited to qualified researchers, because the very existence of special access requirements blocks transparency of the scientific process.

    Changing technology has shifted the public's expectations about transparency. Digital technology enables most research data to be shared rapidly and at low cost. If data are produced in digital form, and digital data can be shared at low cost, researchers and agencies cannot credibly claim that the difficulty of reproducing and disseminating data is a sufficient reason to restrict access. Where no competing interest argues for restricted access (such as human subjects protections), a lack of access to digital data itself can now be a compelling reason for public distrust.

    Therefore, federally funded researchers should release digital data to the public by default. Federal agencies should facilitate this public reporting by requiring digital data to be supplied as part of final project reporting.

    Data access has a well-established record of success

    The recent history of human genetics demonstrates that open access to data has unforeseen benefits that can spawn innovation, support more effective education, and catalyze new discovery. In genetics, both federal and journal policies require release of data; raw data from federally funded projects are often available as they are generated, long before publication.

    My own laboratory has no federal research funding to date, but is actively engaged in research using data from federally funded projects. Today my laboratory trains undergraduate students in genetics with new data from ongoing federally funded genetic projects such as the 1000 Genomes Project. We use open access data from archaic human genomes to investigate the variation of ancient people and their relationships to living humans. This kind of work would be impractical without clearly established open data access policy.

    The open access to data from the Human Genome Project facilitated the rapid development of microarrays that are now used on a broad scale in human genetics to investigate the genetic correlates of human health and disease. Access to data from these studies has enabled other scientists to independently replicate many genetic associations. More important, meta-analysis of such data has shown that many associations cannot be replicated, while also showing some cases in which nonsignificant results across different samples give rise to a significant finding when pooling those samples. Access to negative results and raw data is necessary, in other words, to establish the facts in subsequent research. This goes beyond access to published research results and requires open access to unpublished digital data.
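    As an illustration of how pooling can turn several nonsignificant results into a significant one, consider the following minimal sketch. It uses a standard fixed-effect, inverse-variance meta-analysis with a normal approximation. The three effect estimates and standard errors are invented for the example and do not come from any real study.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a z statistic under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def fixed_effect_pool(effects, std_errors):
    """Inverse-variance weighted mean of the effects and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical per-study effect estimates and standard errors (made up).
effects = [0.18, 0.22, 0.20]
std_errors = [0.12, 0.13, 0.11]

# Each study on its own: none reaches p < 0.05.
study_p_values = [two_sided_p(e / se) for e, se in zip(effects, std_errors)]

# The same three studies pooled: the standard error shrinks and p drops below 0.05.
pooled, pooled_se = fixed_effect_pool(effects, std_errors)
pooled_p = two_sided_p(pooled / pooled_se)

print([round(p, 3) for p in study_p_values])
print(round(pooled, 3), round(pooled_se, 3), round(pooled_p, 4))
```

    The pooled calculation is only possible when the per-study estimates, including the nonsignificant ones, are available to whoever does the pooling.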

    Intellectual property protections and data access

    Research data are somewhat distinct from the intellectual property issues relating to research publications. Some kinds of data do not meet the standard of originality necessary for copyright protection, such as sequence data, CT or MRI data, or data from measurement instruments. For raw data from instruments, there is no intellectual property reason why a federal agency should not maintain an open archive for the public.

    Much research data is unquestionably subject to copyright protection, such as lab notebooks, written descriptions, photographs, and original reconstructions. Yet there is still a substantial public and scientific interest in inspecting such data. For example, photographic documentation of archaeological sites and specimens is of particular scientific value and is today routinely produced by digital technologies and stored in digital form. Some primary digital records are unique products that cannot be recreated at another time and place: for example, in situ photographs of specimens, photographs and records of sites before excavation, and digital reconstructions. The scientific record would be incomplete without such contributions, and maintaining an archive of such data over the long term is a difficult task for a single investigator, beyond the scope of a grant term.

    In cases where it is impracticable to obtain Creative Commons or other open licenses to such content, a funding agency should at a minimum require that a copy of all such archival information be deposited along with the final project report, together with a limited-use non-commercial license permitting electronic dissemination of these materials to the public as part of the report.

    Metadata and data access

    Many have noted that raw data may be useless in the absence of additional information about how the data were obtained. Such information is known as "metadata". Researchers generate instrumental data using particular instrument settings and recording standards. They gather observational data under particular research protocols. These standards may change quickly as instrumentation, technology, and scientific results themselves demand new practices.

    Some scientists note the problem of incompatible metadata, using it as an argument to delay the establishment of open public access to data. In their view, the public are likely to misunderstand or misuse scientific data where metadata are not clearly indicated. Meta-analyses combining data from multiple research projects are an important secondary use of digital data, and such meta-analyses are impossible when data cannot be reconciled into common observational or instrumental frameworks. Performing original work with data collected in heterogeneous contexts is a research speciality of its own, and is itself sometimes targeted by federal grants.

    However, meta-analysis is only one purpose of data access. Transparency, replicability, and education are central public interests that do not require the reconciliation of data collection methods from multiple studies. They require only clear description of the methods under which data were obtained. At a minimum, final research reports on federally funded projects must describe the standards of data collection with sufficient detail to allow independent replication, including all unpublished results and data.

    Successes of data access in paleoanthropology

    I am an anthropologist, and am most familiar with the scientific data relating to human evolution. These data include genetic observations on living and skeletal samples of humans. They also include fossil and archaeological evidence such as photographs, CT scans, isotopic records, anatomical measurements and descriptions.

    For many years, nearly all genetic data resulting from federally funded research have been made available for public download. Much genetic data generated by non-federally funded research programs, including foreign and domestic institutes, has also been free for public download. These data have resulted in a massive acceleration of research on recent human evolution and human origins. They have also led to unexpected discoveries and a burgeoning contribution of other disciplines to understanding our evolution.

    Data from radiocarbon dating and other isotopic sampling have also been made available to the public. Human occupation sites are among the best sources of evidence about past climates. The investment of federal resources in human evolution research has generated a temporal record that is now essential to studying changes in the faunal and plant compositions of past environments. Free access to records has enabled stronger calibration of radiocarbon dates, the development of a more secure chronology, and a more highly replicable scientific record correlating different regions of the world. Our understanding of such changes is vastly stronger when data are made public.

    Institutions and data access in paleoanthropology

    By contrast, CT scans and photographs pertaining to human origins are typically not made freely accessible to the public. The United States funding agencies are not the only parties with an interest in such data. In particular, museums and institutes that curate specimens often permit data collection under agreements that restrict the dissemination of the resulting data. Such agreements may be equated to "non-disclosure agreements" with respect to scientific data.

    An institution has a legitimate interest in controlling the public use of images and access to curated materials. Nevertheless, the lack of access to digital data results in reduplication of effort, overapplication of destructive sampling and measurement techniques, and unnecessary handling of precious and fragile specimens. Where it is practical, the United States should facilitate agreements with institutions that allow the release of digital data produced by public funding. Where release is not possible, funding should be granted only for those activities that will result in the release of data under a limited-use non-commercial license. Non-disclosure of data from instruments such as CT scanners, electron microscopes, or mass spectrometers is incompatible with scientific replication.

    Scientific careers and data access in paleoanthropology

    The economy of federal funding for scientific production sometimes leads to perverse incentives for high-ranking researchers that prevent public access to research data. Some scientists believe that their own future research will require exclusive access to data. Others want to impede research achievements by their academic rivals, or to maintain prestige and future funding opportunities.

    Scientific data in some areas may constitute "trade secrets" until they are protected by patents. Even in noncommercial research, federally funded scientists sometimes claim exclusive ownership over data that they plan to use in future research. In my own field of paleoanthropology, data secrecy supports a clandestine "quid pro quo" economy among researchers, in which established researchers and institutions allow furtive looks at unpublished data, to support and consolidate their power and influence.

    This is a game that the United States should simply decline to play. When federal research supports scientific results that are not subject to independent replication, it betrays the public interest in science.

    Established collaborations and centers of scientific research will always exert a strong influence upon the future of science, irrespective of federal data access policies. But established players should not use federal funding to construct barriers to open inquiry.

    Conclusion

    Open public access to data is one indication that a research project is following scientific principles. Making digital data available to the public would be good practice for any researcher, irrespective of funding source. Data access mitigates the risk that negative data will be unreported. Data access facilitates broader stewardship of research projects, in particular where collaborations create data that are distributed across many institutions. Data access and reporting standards enable other researchers to fill in for those who cannot complete a scientific project due to health or other personal reasons.

    Federal grant agencies already have successful repositories for many kinds of digital data. Such data are shared with the public at minimal cost relative to the overall budget for federal research grants. Supporting digital data repositories has itself been an important granting aim for several federal agencies and continues to be an active part of scientific infrastructure. Limiting such repositories to the exclusive use of a small cadre of researchers is enormously wasteful of resources, when they can be opened to an interested public for a small incremental cost.

    The public has repeatedly invented surprising uses for digital data that can complement or enhance the scientific record. But much more important, open access to digital data serves the scientific values of transparency and independent replication, essential to maintaining public trust and investment in the research enterprise.

    Synopsis: 
    My response to a federal Request for Information on the topic of digital data access to federally funded research
  • Counting citations and career fitness

    Sun, 2012-01-08 15:53 -- John Hawks

    Philip Ball: "The h-index, or the academic equivalent of the stag's antlers".

    Few topics excite more controversy among scientists. When I spoke about the h-index to the German Physical Society a few years ago, the huge auditorium was packed. Some deplore it; some find it useful. Some welcome it as a defence against the subjective capriciousness of review and tenure boards.

    ...

    No one officially endorses the h-index for evaluation, but scientists confess that they use it all the time as an informal way of, say, assessing applicants for a job. The trouble is that it's precisely for average scientists that the index works rather poorly: small differences in small h-indices don't tell you very much.

    In anthropology, the h-index has almost no utility at the time it matters -- hiring and tenure. Citations have a long tail distribution -- a few papers will usually capture the majority of citations of a scholar's work, with most papers being relatively uncited. The h-index provides a measure that discounts the citations from one or two super-highly-cited papers, in an attempt to quantify more of the shape of the distribution of citations among an individual's works. The number of publications and citations for early-career scholars is just too low for the shape to differ much among scholars who have published the same number of papers. You see, just as an individual's distribution of citations has a long tail, so does the distribution of citations among scholars. Publication count gives a proxy for effort, but whether that effort has translated into important effects is generally not well indicated by citations until later in the career.
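    To make the point concrete, here is a short sketch of the standard h-index calculation: the largest h such that h of a scholar's papers have at least h citations each. The citation counts are invented for illustration. The two hypothetical early-career records differ a good deal in their most-cited paper yet come out with the same index, while the senior record's two blockbuster papers add nothing beyond their rank.

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts per paper (made up for illustration).
senior = [950, 400, 60, 45, 30, 22, 18, 12, 9, 9, 4, 1]
early_a = [12, 5, 3, 1, 0]
early_b = [40, 4, 3, 2, 0]

for name, record in (("senior", senior), ("early_a", early_a), ("early_b", early_b)):
    print(name, len(record), sum(record), h_index(record))
```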

    Metrics are a way to deflect accountability from promotion committees. Stag antlers work, in principle, because they are honest signals of the stag's ability to survive and thrive in the face of a significant handicap. If that's true of later-career scholars with high citation counts, it's probably a sign that the handicaps should be removed for younger academics!
