Lots and lots of bits of copying in scientific literature

A new study indicates that lots of bits of old studies turn up, verbatim, in lots of newer scientific studies. The new study (which I have not checked to see whether it contains uncredited copied text) is:


“Patterns of text reuse in a scientific corpus,” Daniel T. Citron and Paul Ginsparg, Proceedings of the National Academy of Sciences, epub December 8, 2014. The authors, at Cornell University, report:

“We consider the incidence of text ‘reuse’ by researchers via a systematic pairwise comparison of the text content of all articles deposited to arXiv.org from 1991 to 2012. We measure the global frequencies of three classes of text reuse and measure how chronic text reuse is distributed among authors in the dataset. We infer a baseline for accepted practice, perhaps surprisingly permissive compared with other societal contexts, and a clearly delineated set of aberrant authors. We find a negative correlation between the amount of reused text in an article and its influence, as measured by subsequent citations.”

Co-author Ginsparg is the creator of arXiv.

John Bohannon gives further details and comment in Science magazine.

(Thanks to investigator Scott Langill for bringing this to our attention.)