john hawks weblog

paleoanthropology, genetics and evolution

metascience

  • The costs of publication delays

    Sun, 2012-08-19 21:54 -- John Hawks

    Joe Pickrell has written a valuable post on Genomes Unzipped about the future of publication in genetics: "The first steps towards a modern system of scientific publication". One thought-provoking passage:

    In my experience, this lag time [between submission and publication] is on average about six months, with a non-trivial long tail of papers that take much longer. To put this in context with some back-of-the-envelope calculations, let’s define a unit of time called a Scientific Career (SC), and let 1 SC equal 30 years. If there are 50,000 papers published in biology per year (this number is somewhat random, but probably within an order of magnitude given that about 500k papers are added to PubMed per year), and on average each paper takes 6 months to go through the review process, then each year ~800 Scientific Careers are spent bringing papers from initial submission to formal publication. It would be laughable to argue that 800 SCs of research or value have been added to the papers during this process (let’s be honest–for most of that time the papers are just sitting on someone’s desk waiting to be read). The system of pre-publication peer review thus dramatically retards scientific progress.
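
    The arithmetic behind that estimate is easy to check. Here is a minimal sketch in Python that uses only the rough figures from the quoted passage; the function and variable names are mine, chosen for illustration, and none of the inputs are measured data.

```python
# Back-of-the-envelope check of the "Scientific Career" (SC) estimate quoted above.
# All inputs are the rough figures from the quoted passage, not measured data.
# The names below are illustrative assumptions, not anything from the original post.

YEARS_PER_SC = 30          # 1 SC is defined in the passage as 30 years
PAPERS_PER_YEAR = 50_000   # the passage's rough guess for biology papers per year
REVIEW_LAG_YEARS = 0.5     # average six-month lag from submission to publication


def careers_spent_in_review(papers_per_year, lag_years, years_per_sc=YEARS_PER_SC):
    """Return the SCs of waiting time accumulated by one year's worth of papers."""
    total_paper_years = papers_per_year * lag_years
    return total_paper_years / years_per_sc


if __name__ == "__main__":
    sc = careers_spent_in_review(PAPERS_PER_YEAR, REVIEW_LAG_YEARS)
    print(f"{sc:.0f} Scientific Careers per year")
```

    With these inputs the result is about 833 SCs per year, which the passage rounds to ~800. The estimate scales linearly with both the paper count and the lag, so a field with a longer typical delay would lose proportionally more.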

    Pickrell is arguing only that preprints help to address this unnecessary delay. I agree. In biological anthropology, the lag is typically much longer than six months.

    Of course, some of the delay happens when papers are sitting on reviewers' or editors' desks. This situation could be improved if reviewers were paid or given formal recognition for their efforts. Also, papers are substantially delayed when rejected by a journal because they don't match the journal's focus or desire for news value. Authors are partly to blame for mistargeting these manuscripts. If preprints were routinely posted, the authors would suffer the pain of resubmitting without having their work remain unavailable to everyone else. Of course, this reduces the news value of publication, but we need to eliminate the myth that publication itself is a newsworthy event.

    The post has an excellent, long comment by commenter asdf, featuring this:

    The solution: adopt the culture of open source, where source is assumed to be fragile and bug reports are met with patches. Reject the culture of academia, where “peer reviewed” papers are assumed to be correct, while corrections and retractions carry a career penalty.

    She's describing the myth that getting your work published is the point when it becomes valuable. You can see how that myth hurts the public by conveying a mistaken picture of how science works. It also hurts science by creating incentives for misbehavior by scientists.

    Mostly unrelated: While writing this, I was reminded of my post from last year, "Peer review in Castle Wolfenstein".

  • Entrepreneurship versus scholarship

    Fri, 2012-08-17 12:37 -- John Hawks

    Zen Faulkes: "Does a Ph.D. train you to head a lab?"

    The big one, though, is bookkeeping and budgeting. I didn’t have to worry about tracking money in any significant way as a grad student or post-doc. Spending money at an institution is not like spending your own money. You have layers of people and paperwork that stand between you and purchases. You have obscure “enterprise finance” systems that seem designed to drive a person to substance abuse. I’ve learned that I despise trying to keep track of grant money.

    He discusses many other ways that a Ph.D. course fails to prepare students for research independence. I have an orthogonal thought.

    Running independent research requires entrepreneurship. A Ph.D. formally is training in scholarship. A great scholar may be a poor entrepreneur, and few Ph.D. programs require training that would instill values of entrepreneurship. Essential skills include professional networking, balancing risk by diversification, repeatedly and widely asking for funding, accurately judging the motivations of people who share information, and publicizing and promoting one's own ideas. Some students get excellent informal training in these skills, but many miss them entirely.

    Being both an honest scholar and a successful entrepreneur requires a very special package of talents, and taking on research independence is a difficult transition for even the best students.

  • Neandertal ancestry "Iced"

    Wed, 2012-08-15 15:24 -- John Hawks

    I've been mobbed with e-mails from readers asking about my reaction to the new paper by Anders Eriksson and Andrea Manica in PNAS, titled "Effect of ancient population structure on the degree of polymorphism shared between modern human populations and ancient hominins" [1]. The paper asserts that Neandertal similarity in the genomes of living people outside Africa can be explained only in terms of incomplete lineage sorting from the shared human-Neandertal common ancestral population in Africa. If the paper's assertions were accurate, we could go back to thinking that all the genetic heritage of people today traces back to Africa, although we would still need to abandon the idea that the African population had undergone a small bottleneck.

    I have not been posting as frequently the last month or two because I have been out of the country doing science.

    The new paper's press release has given rise to quite a lot of media attention, much of which unfortunately misrepresents our current knowledge of human and Neandertal genomes. Razib Khan summarized the situation on Monday, in a post titled, "Why you shouldn't publish in PNAS". I agree with his criticism, although I have a perspective coming out soon in PNAS. In fact, I suppose this episode shows why everyone should publish in PNAS, because so many journalists will just parrot press releases instead of asking relevant experts. Ewen Callaway did a great job on this story by putting it into the broader context ("Neandertal sex debate highlights benefits of pre-publication"). You will notice how no other science writers with any Neandertal knowledge picked up this press release...

    Paleoanthropology is a field where data are rare and precious, and we do a lot of arguing about the validity of models. I love arguing about the validity of models (Cliff Notes version: All models are wrong).

    Genomics is not such a field. We have abundant data today to compare with Neandertal genomes. Yet puzzlingly, the idea of Neandertal ancestry has been challenged by several papers that haven't performed any new empirical comparisons at all. I'm struggling to figure this out. We have an unparalleled ability to explore the genomes of humans and Neandertals, and we should believe a computer model with no empirical data?

    I've been assessing the Neandertal similarity of 1000 Genomes Project samples here on my blog (e.g., "Which population in the 1000 Genomes Project samples has the most Neandertal similarity?"). This is ongoing research here in my group, but we've been making it open because it tells us immediately that some hypotheses about Neandertal similarity must be wrong. Modeling is a lot of work. We're trying to avoid putting a lot of investment into modeling that will be easily refuted by the next piece of genomic data. Data are flowing now so rapidly that we can afford to be naive empiricists.

    For example, our comparisons quickly refute the hypothesis that Neandertal similarity comes only from ancient population structure in Africa. That hypothesis predicts much more heterogeneity within Africans in Neandertal similarity than exists today. We've shown that the heterogeneity in Africans is basically the same as within Europeans or Asians, and that the variance among African populations so far is quite small. Those are very simple observations, which are consistent with what Yang and colleagues [2] concluded on the basis of the frequency spectrum of Neandertal alleles in large samples of living people. Even though many Neandertal-shared SNP alleles came from incomplete lineage sorting, the signature of excess Neandertal sharing outside Africa must come mostly from recent introgression. In Ewen Callaway's article about this research, David Reich dismissed the new paper by Eriksson and Manica as "obsolete". I agree. The paper describes a model without carrying out any new empirical comparisons, and so has fallen behind where the science has gone.

    Another example is the proportion of Neandertal ancestry. Initially, the proportion of ancestry from Neandertals in living people was argued to be between 1 and 4 percent [3]. That was a model-based estimate that was the best possible under the assumption that Africans have no Neandertal ancestry. We now have a lot more human comparisons, which would make possible a more precise estimate of the mean. I hesitate to provide a new estimate, because we have shown that some Africans have substantial evidence of Neandertal similarity, which throws the baseline for any estimate into question. How much Neandertal ancestry is present in living people must depend on a more complex model of mixture among later populations. The result will still be small (probably less than 6 percent) but understanding this proportion will help us to evaluate when and where Neandertal genes flowed into our populations.

    Here's a third example. I haven't written about it here yet, but I have been lecturing about it quite widely over the past few months. Earlier this year, the genome of Ötzi the Tyrolean Iceman was reported by Andreas Keller and colleagues [4]. Aaron Sams and I downloaded the data and have been carrying out several different kinds of comparisons. A picture:

    [Figure: Ötzi 1000 Genomes Neandertal comparison]

    I'd like to see the model of African population structure that could explain this result...

    If you'll remember my earlier posts on the 1000 Genomes Project samples, this chart is a histogram of the number of shared Neandertal derived SNP alleles in different samples. The European and Asian samples share substantially more of these alleles than either African sample (here, Luhya and Yoruba are colored differently). If we took as a baseline that Europeans have an average of 3.5 percent Neandertal ancestry, Ötzi would have around 5.5 percent (again, the actual percentage would be highly model-dependent). He has substantially greater sharing with Neandertals than any other recent person we have ever examined.

    As you can imagine, we have carried out just about every comparison we can think of that could explain this result as anything other than greater Neandertal ancestry. Aaron and I will be putting our manuscript on the arXiv as soon as we've both signed off on all the text and figures, hopefully this week. This is simple stuff, and I see no reason not to be open about it -- anybody with the Ötzi data can immediately do the same thing.

    We think that showing and sharing these comparisons will save people a lot of useless effort. Personally, I can't believe that these people spending effort on population models for Neandertals aren't talking to those of us who have already carried out these comparisons and have already presented them in public. I guess we'll find out if secrecy or openness leads to better science.

    Meanwhile, I can share the abstract of the conference paper I'll be presenting in September at the meeting of the European Society of Human Evolution in Bordeaux:

    Evaluating recent evolution, migration and Neandertal ancestry in the Tyrolean Iceman

    Paleogenetic evidence from Neandertals, the Neolithic and other eras has the potential to transform our knowledge of human population dynamics. Previous work has established the level of contribution of Neandertals to living human populations. Here, I consider data from the Tyrolean Iceman. The genome of this Neolithic-era individual shows a substantially higher degree of Neandertal ancestry than living Europeans. This comparison suggests that early Upper Paleolithic Europeans may have mixed with Neandertals to a greater degree than other modern human populations. I also use this genome to evaluate the pattern of selection in post-Neolithic Europeans. In large part, the evidence of selection from living people’s genetic data is confirmed by this specimen, but in some cases selection may be disproved by the Iceman’s genotypes. Neolithic-living human comparisons provide information about migration and diffusion of genes into Europe. I compare these data to the situation within Neandertals, and the transition of Neandertals to Upper Paleolithic populations – three demographic transitions in Europe that generated strong genetic disequilibria in successive populations.


  • Spreading preprints in population biology

    Wed, 2012-08-01 17:47 -- John Hawks

    Ewen Callaway reports on the increasing use of the arXiv preprint server by geneticists and biologists: "Geneticists eye the potential of arXiv". With the near-arrival of the PeerJ system, which promises to seamlessly integrate preprints and pre-publication review with ultimate publication, this is a very timely story. Last week I pointed to the new paper on arXiv by Joseph Pickrell and colleagues, and there have been a few other notable ones recently.

    But Ginsparg says that pre-publication is more likely to stop scientists from being scooped. In many physics fields, publication on arXiv is what counts for claiming priority, and journal reviewers can use the server to check that discoveries are correctly attributed. An authoring history that accompanies all arXiv papers also allows scientists to arbitrate disputes over priority. In the 21 years since arXiv began, Ginsparg has seen astrophysicists, computer scientists and others go from sceptics to devotees. “Once a community adopts arXiv, it never seems to relinquish it,” he says.

    What readers probably don't know is that I have been experimenting quietly with preprints for the last year -- not front-paging them, but putting them up so that they are available and so that I can use the bibliographic system. Several of our in-progress manuscripts are online here on the blog and discoverable by Google, as are preprints of some of my published work. I've been motivated to post preprints of published papers because copyright agreements generally do not allow authors to post final PDF versions, but do allow posting either pre-review or pre-publication manuscripts. The most frequently read preprint here is my 2008 book chapter, "From genes to numbers: effective population sizes in human evolution".

    More to the point, I posted one of my own preprints on arXiv last year, regarding shrinking brains: "Selection for smaller brains in Holocene human evolution". I wanted to know how widely a biology preprint would be read without my promoting it, which would skew the numbers. The results have been interesting. The paper got an initial mention on the physics arXiv blog but little attention otherwise.

    That is, until the last few months. I've had a dozen requests from colleagues to cite the paper (which anyone is welcome to do by using the arXiv number). I also had two great interactions with colleagues who had comments and suggestions on the preprint, which I am now incorporating into a revision. So presubmission review actually does work, when the paper comes to the attention of the right people. But without promoting the preprint, that feedback won't happen for a while.

    Recent events suggest that many population biologists may be ready to go to the arXiv. I think we should do everything we can to encourage this trend.

  • Today's dose of depression on science jobs

    Sat, 2012-07-07 18:02 -- John Hawks

    Brian Vastag reports in the Washington Post on the problem with calls for more Ph.D. scientists: "U.S. pushes for more scientists, but the jobs aren’t there".

    Traditional academic jobs are scarcer than ever. Once a primary career path, only 14 percent of those with a PhD in biology and the life sciences now land a coveted academic position within five years, according to a 2009 NSF survey. That figure has been steadily declining since the 1970s, said Paula Stephan, an economist at Georgia State University who studies the scientific workforce. The reason: The supply of scientists has grown far faster than the number of academic positions.

    The story goes on to note the job losses in pharmaceuticals and other industry categories. Related, from Bob Cringely on the IT industry: "IT class warfare — It’s not just IBM".

    In America right now there is a glut of $80,000-and-above IT workers and a shortage of $40,000-and-below IT workers.

    Remember that $80,000-and-above population comes with a surcharge for benefits that may not equally apply to the $40,000-and-below crowd, especially if those are overseas or in this country temporarily. A good portion of that surcharge relates to costs that increase with age, so older workers are more expensive than younger workers.

    It’s illegal to discriminate based on age but not illegal to discriminate based on cost, yet one is a proxy for the other. So this is not just class warfare, it is generational warfare.

    Academic jobs are subject to different dynamics than corporate jobs, but some related phenomena are at play. Most academic scientists who make it onto the tenure track begin to experience "salary compression" -- the phenomenon in which institutions pay a market rate to new Ph.D. hires, which grows faster than the salaries paid to continuing faculty. This is a sort of perverse intergenerational conflict that arises in part because of tenure. By limiting the ability of mid-career academics to move to a new job, tenure protects universities against having to offer experienced faculty competitive salaries. Young researchers enter a speculation market in which most will fail to find academic jobs, while a few good "tenure prospects" are offered higher and higher salaries by the institutions that can afford them.

  • Lacking knowledge

    Tue, 2012-06-26 12:10 -- John Hawks

    Sandra Blakeslee discusses a new book about the process of science: Ignorance: How It Drives Science, by Stuart Firestein ("To Advance, Search for a Black Cat in a Dark Room").

    Dr. Firestein got the idea for his book by teaching a course on cellular and molecular neuroscience, based on a 1,414-page textbook that, at 7.7 pounds, weighs more than twice as much as a human brain. He eventually realized that his students must think that pretty much everything in neuroscience is known. “This could not be more wrong,” he writes. “I had, by teaching this course diligently, given the students the idea that science is an accumulation of facts.

    “When I sit down with colleagues over a beer at a meeting, we don’t go over facts,” Dr. Firestein writes. “We don’t talk about what’s known. We talk about what we’d like to figure out, about what needs to be done.”

    Lurking here is some insight about the process of deciding who gets to do what (and gets funded for it). Anyone deciding whether to grant money to a project faces a trade-off between admitting that the answer is unknown and accepting that the process has a good likelihood of a successful outcome. Ignorance has a dual role here: the grantor must admit a certain amount of ignorance, while the prospective grantee must define ignorance in an incredibly narrow way (and ideally demonstrate that it's not ignorance after all).

    In some sense, becoming successful at funding your work requires solving an intricate communication problem where the subject is ignorance.

    I'm a little concerned about the idea (mentioned in the review) of an entire semester-long course titled, "Ignorance". Don't the students get enough about ignorance after one or two class sessions?

  • "We don't need a master"

    Sun, 2012-05-27 13:56 -- John Hawks

    The Boston Globe has a story about a new institute, founded by Jon F. Wilkins, that aims to solve some of the administrative problems facing independent scholars: "The Ronin Institute for wayward academics".

    But the issue isn’t just a lack of jobs for would-be academics. To do research, young scholars usually need to find full-time academic jobs. By training more people than it can employ, the current system leaves untapped brainpower languishing.

    In a white paper published this month by the Ewing Marion Kauffman Foundation, Wilkins and coauthor Samuel Arbesman, a senior scholar at the foundation, are suggesting an alternative. Academics, they argue, need not be professors with experiences “steeped within the ivory tower.” They can be “fractional scholars”—a term they coined—pursuing their interests on their own, outside of academia. “Many, many PhDs have the ability to do it,” said Arbesman, who has also written for Ideas. There’s just one issue. “Within the current culture,” he said, “you need some sort of institutional affiliation.”

    As the article explains, the affiliation is necessary for grant-seeking from some funding sources. Obviously its other function -- as a source of credibility -- depends on the scholars who affiliate with the Institute and their work. I think such an institute needs to establish a positive agenda so that others won't perceive it as a mere reaction to the job market.

    One way that other institutes gain credibility is by becoming involved with training students, or by helping students to work with established scholars. (The Santa Fe Institute, which the article mentions, is one that provides opportunities for advanced students to interact with resident scholars, for example.) Could there be "Ronin workshops"?

  • Big data, no access, no replication possible

    Tue, 2012-05-22 15:29 -- John Hawks

    The New York Times has an article by John Markoff today, pointing to several disputes over the standards for data release with scientific papers: "Troves of Personal Data, Forbidden to Researchers".

    These cases mostly relate to data gathered by corporations about their users or customers, which raises privacy concerns that are similar in some ways to those attending biomedical research. For that reason, I don't think that they are a good comparison with the situation in paleoanthropology, but they do overlap to a great extent with the issues in human genetics. In either case, the article has many elements that are useful to think about:

    He added that corporate control of data could give preferential access to an elite group of scientists at the largest corporations. “If this trend continues,” he wrote, “we’ll see a small group of scientists with access to private data repositories enjoy an unfair amount of attention in the community at the expense of equally talented researchers whose only flaw is the lack of right ‘connections’ to private data.”

    Also, I did not realize this:

    The data-sharing policy of the journal Science says, “All data necessary to understand, assess and extend the conclusions of the manuscript must be available to any reader of Science.”

    Several paleoanthropology papers have been published in the last few years without meeting this basic standard.

  • Non-consensual replication

    Thu, 2012-05-17 08:48 -- John Hawks
    [Image: still from the Ghostbusters psi experiment]

    Yes, it is a star!

    Ed Yong has a long article in Nature about the recurrent problems with non-replication in psychology: "Replication studies: Bad copy". The piece begins with the flap over Daryl Bem's work on ESP, in which journals refused to publish non-replications by other researchers. The sad part is that many other areas of psychology follow the same protocol as work on paranormal psychology: Publish highly massaged positive results, don't encourage anyone to replicate.

    One reason for the excess in positive results for psychology is an emphasis on “slightly freak-show-ish” results, says Chris Chambers, an experimental psychologist at Cardiff University, UK. “High-impact journals often regard psychology as a sort of parlour-trick area,” he says. Results need to be exciting, eye-catching, even implausible. Simmons says that the blame lies partly in the review process. “When we review papers, we're often making authors prove that their findings are novel or interesting,” he says. “We're not often making them prove that their findings are true.”

    Instead of actual replication, researchers sometimes pursue "conceptual replication": showing that similar experimental designs also yield positive results:

    But to other psychologists, reliance on conceptual replication is problematic. “You can't replicate a concept,” says Chambers. “It's so subjective. It's anybody's guess as to how similar something needs to be to count as a conceptual replication.” The practice also produces a “logical double-standard”, he says. For example, if a heavy clipboard unconsciously influences people's judgements, that could be taken to conceptually replicate the slow-walking effect. But if the weight of the clipboard had no influence, no one would argue that priming had been conceptually falsified. With its ability to verify but not falsify, conceptual replication allows weak results to support one another. “It is the scientific embodiment of confirmation bias,” says Brian Nosek, a social psychologist from the University of Virginia in Charlottesville. “Psychology would suffer if it wasn't practised but it doesn't replace direct replication. To show that 'A' is true, you don't do 'B'. You do 'A' again.”

    Someone quoted in the article compares this situation to a house of cards. I agree. You are building one assumption upon another. The disturbing part is that the discipline accepts that some researchers just have a "knack" for making a particular experimental design work, and other researchers may have trouble recreating the exact conditions. That very attitude enables fraud, as we have seen repeatedly during the last few years. In science, if no one else can make the experiment work, it didn't happen.

    The entire article is worth reading and wide discussion.

  • This is totally serial

    Thu, 2012-05-10 20:22 -- John Hawks

    Michael B. Eisen: "The solution to the ‘serials crisis’ on campus"

    The solution is obvious: universities must stop outsourcing vital functions to publishers. They need to shift the currency of academic success from the title of the journal in which a scholar’s works are published to the inherent quality of their research. And they need to immediately stop spending money on journal subscriptions, investing instead in the new forms of scholarly communication appropriate for the Internet age.
