What sequences deserve a peer-reviewed publication?

David Roy Smith asks whether sequencing mitochondrial DNA is still worth a scientific paper: “Opinion: Too Many Mitochondrial Genome Papers”.

Few would question the utility of mitochondrial DNA (mtDNA) as a genetic marker. But it is increasingly clear that sequencing mtDNA has become an easy route to peer-reviewed publications; at times, the pursuit of these publications is encumbering journal editors, referees, and the research infrastructure as a whole. Is publishing papers on mitochondrial genomes a relic of the “publish or perish” academic landscape? Should mtDNA sequences go directly into GenBank? Are we still gaining new and significant insights from mitochondrial genome data?

The question here is about “new” mitochondrial genomes obtained for species that have not yet been sampled in that way. Smith contrasts the publication of a mere sequence with “fundamental questions”, although he doesn’t specify which questions those include:

With state-of-the-art methods for generating complete mtDNA sequences there came a deluge of publications describing these sequences. The scientific literature is saturated with mitochondrial genome papers. Some of these papers address fundamental questions, but others, unfortunately, add little in terms of new knowledge, and reflect more on some researchers’ desires to accumulate peer-reviewed papers in place of achieving scientific progress. Many journals have become dumping grounds for mtDNA papers of varying quality. All of this is tying up editors, reviewers, and the authors themselves, and potentially distracting them from more valuable tasks.

Sometime around 1996, human genetics passed the point where a single mtDNA partial sequence reported from a previously unsampled human population was worth a peer-reviewed publication. Sometime around 2000, the same became true of whole mitochondrial genomes. A single mitochondrial genome from any previously unsampled ancient hominin population is still of the greatest scientific interest, and even single mtDNA sequences from Neandertals are worth publishing rather than merely depositing in GenBank. This is because these ancient sequences add substantively to our knowledge of human evolution and variation, while a single sequence from any living human today almost certainly does not, unless we find a living human who has the mtDNA sequence of a Neandertal, for example.

Smith's point is that most of the current surge of mtDNA genome sequencing adds incrementally rather than substantively to our understanding of variation on the tree of life. He further emphasizes that most of the effort has gone to metazoans, and very little to microbial eukaryotes, where fundamental questions remain unanswered.

It seems to me that funders should enable this kind of work to become part of training programs for undergraduates, or even high school students. These genomes should all be published, but in journals that emphasize training, with the understanding that new genomes may effectively serve as replication studies for hypotheses already tested at other points in the phylogenetic tree. With some central coordination, such work can fill in the tree of life, allowing tests of hypotheses that no single lab could carry out on its own.