Relatively little has been published about the particulars of fraudulent data, so most statisticians have little or no experience in this field. From what is known, the problems can be divided into two major groups:
• data manipulation to achieve a desired result or increase the statistical significance of the findings and affect the overall scientific conclusions;
• invention of data for non-existent or incomplete cases in clinical studies.
The motives in the first group are to achieve publication, or to produce results confirming a particular theory, rather than financial considerations. The motive in the second is usually financial gain. Most of the well-known cases of fraud in the UK fall into this second category. There are possibly some instances where academics have produced fictitious data in order to publish a paper or to provide material for a thesis, but this has been difficult to prove.
A particular form of data manipulation is to include subjects who are ineligible according to the study protocol. Similarly, eligible subjects may be excluded. These could be regarded as a separate category. Such actions tend to have little effect on the internal validity of the study, but could have an important effect on its external validity (generalisability).
Publication bias is a well-known feature of the scientific literature, in which results that are highly statistically significant are more likely to be published than those showing smaller effects. Hence, some fraud is directed at obtaining statistically significant results. Several publicised cases of this type have occurred in the USA, but they are also known in Europe. This type of fraud tends to occur in academically related research, where career advancement is the ultimate motive. More details of known instances of fraud are given by Lock in Chapter 4 of this volume.