How Many Scientists Fabricate and Falsify Research?

Daniele Fanelli

The image of scientists as objective seekers of truth is periodically jeopardized by the discovery of a major scientific fraud. Recent scandals like Hwang Woo-Suk’s fake stem-cell lines [1] or Jan Hendrik Schön’s duplicated graphs [2] showed how easy it can be for a scientist to publish fabricated data in the most prestigious journals, and how this can cause a waste of financial and human resources and might pose a risk to human health. How frequent are scientific frauds? The question is obviously crucial, yet the answer is a matter of great debate [3], [4].

A popular view propagated by the media [5] and by many scientists (e.g. [6]) sees fraudsters as just a “few bad apples” [7]. This pristine image of science is based on the theory that the scientific community is guided by norms including disinterestedness and organized scepticism, which are incompatible with misconduct [8], [9]. Increasing evidence, however, suggests that known frauds are just the “tip of the iceberg”, and that many cases are never discovered. The debate, therefore, has moved on to defining the forms, causes and frequency of scientific misconduct [4].

What constitutes scientific misconduct? Different definitions are adopted by different institutions, but they all agree that fabrication (invention of data or cases), falsification (wilful distortion of data or results) and plagiarism (copying of ideas, data, or words without attribution) are serious forms of scientific misconduct [7], [10]. Plagiarism is qualitatively different from the other two because it does not distort scientific knowledge, although it has important consequences for the careers of the people involved, and thus for the whole scientific enterprise [11].

There can be little doubt about the fraudulent nature of fabrication, but falsification is a more problematic category. Scientific results can be distorted in several ways, which can often be very subtle and/or elude researchers’ conscious control. Data, for example, can be “cooked” (a process which mathematician Charles Babbage in 1830 defined as “an art of various forms, the object of which is to give to ordinary observations the appearance and character of those of the highest degree of accuracy” [12]); it can be “mined” to find a statistically significant relationship that is then presented as the original target of the study; it can be selectively published only when it supports one’s expectations; it can conceal conflicts of interest, etc. [10], [11], [13], [14], [15]. Depending on factors specific to each case, these misbehaviours lie somewhere on a continuum between scientific fraud, bias, and simple carelessness, so their direct inclusion in the “falsification” category is debatable, although their negative impact on research can be dramatic [11], [14], [16]. Henceforth, these misbehaviours will be indicated as “questionable research practices” (QRP, but for a technical definition of the term see [11]).

