The cost of plagiarism at NSF

I pass this along from ScienceInsider, really too irritated for clever comment: “NSF Audit of Successful Proposals Finds Numerous Cases of Alleged Plagiarism”.

The National Science Foundation (NSF) is investigating nearly 100 cases of suspected plagiarism drawn from a single year's worth of proposals funded by the agency.
The cases grow out of an internal examination by NSF's Office of Inspector General (IG) of every proposal that NSF funded in fiscal year 2011. James Kroll, head of administrative investigations within the IG's office, tells ScienceInsider that applying plagiarism software to NSF's entire portfolio of some 8000 awards made that year resulted in a "hit rate" of 1% to 1.5%. "My group is now swamped," he says about his staff of six investigators.

So…

Between 1% and 1.5% of NSF's funded proposals, and presumably a similar slice of its budget, are going to obvious plagiarists. Obvious because they can be caught with standard plagiarism filters, which are not richly seeded with scientific papers.

Why not? Because closed access stands in the way of incorporating much of the scientific literature into such databases.
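
To make the filter point concrete, here is a minimal sketch of text-overlap detection using word n-grams ("shingles"). It illustrates the general technique only and is not the software the IG's office used; the function names, the 5-word window, and the `corpus` dictionary are all assumptions made up for this example.

```python
import re


def shingles(text, n=5):
    """Return the set of overlapping n-word sequences found in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(candidate, source, n=5):
    """Fraction of the candidate's shingles that also occur in source."""
    cand = shingles(candidate, n)
    if not cand:
        return 0.0  # candidate is shorter than n words
    return len(cand & shingles(source, n)) / len(cand)


def best_match(candidate, corpus, n=5):
    """Return (doc_id, score) for the best-matching document in corpus.

    corpus is a hypothetical dict mapping document ids to full text.
    Returns (None, 0.0) when nothing in the corpus overlaps.
    """
    best = (None, 0.0)
    for doc_id, text in corpus.items():
        score = containment(candidate, text, n)
        if score > best[1]:
            best = (doc_id, score)
    return best
```

A copied passage scores near 1.0 only if its source is actually in `corpus`. A paper locked behind a paywall and never indexed contributes nothing, so copying from it goes undetected.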

And this doesn’t count grants awarded to applications that propose work that has already been done.

The NSF budget is not evenly distributed among grants, and I suppose the many small grants contain more plagiarism than the few really big ones. Still, we’re talking about $50 million or so.
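
The arithmetic behind these numbers can be sketched in a few lines. The award count and hit rates come from the ScienceInsider excerpt; the budget figure is not in the post and is only a round placeholder, so swap in the real number before trusting any dollar output.

```python
AWARDS_FY2011 = 8000                      # "some 8000 awards" (from the excerpt)
HIT_RATE_LOW, HIT_RATE_HIGH = 0.01, 0.015  # "1% to 1.5%" (from the excerpt)

# ASSUMPTION: round placeholder for the annual budget, not taken from the post.
ASSUMED_BUDGET_DOLLARS = 7_000_000_000


def flagged_awards(total_awards, hit_rate):
    """Number of awards flagged at a given hit rate, rounded to a whole award."""
    return round(total_awards * hit_rate)


def dollars_if_evenly_spread(budget_dollars, hit_rate):
    """Dollars tied to flagged awards if every award were the same size."""
    return budget_dollars * hit_rate


low = flagged_awards(AWARDS_FY2011, HIT_RATE_LOW)
high = flagged_awards(AWARDS_FY2011, HIT_RATE_HIGH)
print(f"Flagged awards: {low} to {high}")

for rate in (HIT_RATE_LOW, HIT_RATE_HIGH):
    dollars = dollars_if_evenly_spread(ASSUMED_BUDGET_DOLLARS, rate)
    print(f"At {rate:.1%}: about ${dollars / 1e6:.0f} million (placeholder budget)")
```

The count works out to 80 to 120 awards, which squares with "nearly 100 cases". The even-spread dollar figure is an upper-ish bound: if flagged grants skew small, as guessed above, the real amount would be lower.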

UPDATE (2013-03-09): A reader writes:

I was just reading your post on plagiarism, and it made me recall something that happened to me years ago when I was a practicing biochemist. My boss received a grant to review, on some work proposed by one of our competitors. He passed off a copy to me to look at (I was a postdoc at the time.) On reading the background section, there was a paragraph that sounded familiar. I did a little looking around on my computer and it turned out the reason the paragraph sounded familiar was that I had written it. But not in a paper - it was in one of our grant proposals. The material didn't concern any proposed experiments - it was just part of a short review of the state of the field, so we never did anything about it. I knew the guy who did this and he was quite capable of writing a decent paragraph himself, so I never could figure out why he borrowed my material. Anyway, it may not be enough to get all the literature in the database - they should have all the other grant proposals in there too.

This is another essential area. Probably the most common form of this is people stealing ideas from other proposals. The texts of unfunded proposals are not available to the public, which may cut down on stealing but also makes it hard to check funded proposals against them. I tend to think that the lower the success rate, the more likely we’ll see substantial cheating of one kind or another.