On a scandal in replication


Experimental psychology has recently become embroiled in a controversy about whether replication of high-profile findings should be a serious goal of new research. Bioethicist Michelle Meyer and psychologist Christopher Chabris have a weighty essay in Slate about the problem: “Why Psychologists’ Food Fight Matters”.

They focus on complaints raised by some experimenters, who feel “persecuted” or “bullied” by other scientists attempting to replicate their findings. In some cases, young researchers worry that their job prospects or tenure cases may suffer as other scientists unjustifiably discredit their earlier work, without their being given an opportunity to respond in print in a timely way.

My first reaction to this was, “Oh, really? Well, that’s science.” But there are some real cases of boorish behavior. For example,

Once the journal accepted the paper, Donnellan reported, in a much-tweeted blog post, that his team had failed to replicate Schnall’s results. Although the vast majority of the post consisted of sober academic analysis, he titled it “Go Big or Go Home”—a reference to the need for bigger sample sizes to reduce the chances of accidentally finding positive results—and at one point characterized their study as an “epic fail” to replicate the original findings.

Now, I think that reflects worse on the replicator than on the original researcher. But it’s easy to see how it puts a responsible scientist into an impossible position. What we need is a less confrontational attitude toward replication, one that sees replicability as a hallmark of good science and accepts replication studies into journals without needing a “crisis of replication” to make them newsworthy.

Some researchers have legitimate complaints about “failures to replicate” that do not correctly implement the experimental protocol, or that change some key aspect in a way that leads to a different outcome. Meyer and Chabris do a good job of presenting the nuances of such concerns, which basically revolve around competence. Are researchers capable of describing what they’ve done accurately, so that the results can be replicated? Or does every result depend on some “special skill” that the original lab has developed, which somehow cannot be described?

We should remember the cases in experimental psychology where the “special skill” involved falsifying results.

I agree with most of Meyer and Chabris’ suggestions, particularly the concept that replication should not depend upon the original researchers’ cooperation.

Replicators have no obligation to routinely involve original authors because those authors are not the owners of their methods or results. By publishing their results, original authors state that they have sufficient confidence in them that they should be included in the scientific record. That record belongs to everyone. Anyone should be free to run any experiment, regardless of who ran it first, and to publish the results, whatever they are.

In sciences with unique specimens, cell cultures, or other materials, this principle means that the original materials must be available for examination or use by qualified researchers. Where substantial investment or assistance has been provided by the original researchers to enable replication, obviously they should be included as authors. Many journals now provide a standard format for reporting contributions to a paper, and “provided reagents or samples” is one of the standard entries. There are cases where the contribution does not rise to the level of authorship, but at least such assistance should be acknowledged.

Paleoanthropologists often ignore the fundamental principle of replicability. But providing the full ability to replicate paleoanthropological work really is not very hard. Just make sure that qualified researchers can access the specimens. Many of the observations that paleoanthropologists depend upon are simple measurements that could be replicated easily on 3-d models. Heck, in many cases nowadays the original measurements may have been taken on 3-d models.

That means there should never be any question of access to the same models for replication.

Personally, I wonder whether we should make grant funding contingent on first replicating some prior finding. Yes, that would take time that researchers might otherwise devote to new work. But to do effective research most experimentalists already need a pipeline capable of replicating earlier work in the same area of research. And they need to convince a grant panel that they are already prepared to carry out such work. Replication studies are a good way to accomplish both goals, and might sometimes actually lead to rejection of overhyped bad ideas.

Besides, we know that most grant applications that are funded actually describe work that already exists in pilot form. Those researchers may as well be doing something for the good of science. Instead of rehashing their own old results, they could rehash other people’s!