Some dissatisfaction with review articles

John Ioannidis often speaks out on abuses of confidence and statistics in science. He recently gave an interview to Retraction Watch, “We have an epidemic of deeply flawed meta-analyses, says John Ioannidis”, in which he commented on the proliferation of “systematic reviews and meta-analyses” in science.

Retraction Watch: You say that the numbers of systematic reviews and meta-analyses have reached “epidemic proportions,” and that there is currently a “massive production of unnecessary, misleading, and conflicted systematic reviews and meta-analyses.” Indeed, you note the number of each has risen more than 2500% since 1991, often with more than 20 meta-analyses on the same topic. Why the massive increase, and why is it a problem?
John Ioannidis: The increase is a consequence of the higher prestige that systematic reviews and meta-analyses have acquired over the years, since they are (justifiably) considered to represent the highest level of evidence. Many scientists now want to do them, leading journals want to publish them, and sponsors and other conflicted stakeholders want to exploit them to promote their products, beliefs, and agendas. Systematic reviews and meta-analyses that are carefully done and that are done by players who do not have conflicts and pre-determined agendas are not a problem, quite the opposite. The problem is that most of them are not carefully done and/or are done with pre-determined agendas on what to find and report.

This comment resonated with me. This year has seen an explosion of review papers on hominin diversity.

The review papers that have appeared so far this year mostly seem to reflect symposium contributions that were presented in 2014 or 2015. Few of them have significant updates since they entered review sometime in 2015, and so they do not discuss findings of the last couple of years except very superficially. Most do not integrate findings from across different time intervals in human origins.

So, for example, there have been three or four review papers this year summarizing hominin diversity in the Middle Pliocene. This is the time period that has produced evidence of Australopithecus afarensis, as well as a bunch of other species represented by a few fossils each, including A. bahrelghazali, Kenyanthropus platyops, and most recently A. deyiremeda.

Each of the review articles this year reads basically like a position paper by its authors about Australopithecus deyiremeda, either for or against.

This is a difficult area to review right now. The fossil evidence is sparse. Many of the authors consider the “best” evidence of species diversity to be the partial foot skeleton from Burtele, Ethiopia, which has not been assigned to a species. Few scientists have studied all the relevant material.

A real answer to the question of Middle Pliocene hominin diversity should start from a full and free examination of the variation found within the largest fossil samples, not excluding any specimens, and without any prior assumptions about their taxonomic status. We would need to revisit and test ideas from the 1980s about the taxonomy of Hadar fossils, relying upon the larger dataset that has since accumulated from Hadar and other sites. This would require us to seriously examine the idea that the cranial and dental fossils may represent more than a single species, and the postcranial fossils may represent more than a single adaptive pattern.

This kind of open examination would have many beneficial outcomes. Not least, it would provide some clear idea of what kind of evidence should be supplied to support the diagnosis of new species.

I recognize the root of my dissatisfaction in Ioannidis’ comments. The current crop of review articles, which serve the purposes of journals and players with pre-determined agendas, does little to advance the science. It is one thing when such a review appears in a journal dedicated to reviews for non-specialists, like the Annual Review of Anthropology or the Yearbook of Physical Anthropology. But it’s depressing to realize how little actual synthesis we are seeing in other contexts.