Is the term “questionable research practices” questionable?


Hypothesising after results are known (HARKing), p-hacking one's way to statistical significance, optional stopping in the context of null hypothesis significance testing, selectively reporting dependent variables – these and other practices have frequently been dubbed “questionable research practices” (QRPs). QRPs are often seen as contributing to replication failures in the replication crisis, particularly in psychology and related fields. “QRP” is an umbrella term, mostly characterised by enumerating prima facie problematic practices; this leaves its precise definition unclear and fails to distinguish QRPs clearly from outright fraud. Working toward more clarity in the definition of QRPs, I will examine this distinction. I want to argue that, while it might previously have made sense to downplay the gravity of certain practices by considering them “permissible” QRPs rather than fraud, at least in psychological research, most practices described as QRPs should in fact be placed on the spectrum of fraud or scientific misconduct. There is currently a mismatch between where QRPs belong on this spectrum and where researchers likely think they are, as indicated by the relatively benign descriptor “questionable”. Failing to correct this mismatch would continue to normalise practices that should not be normal, as they can be epistemically pernicious.
I will propose a working definition of QRPs, focused on their impact on the probability of false-positive results in the context of justification, while also exploring their precise relationship with fraudulent research. QRPs are merely questionable because they can be used legitimately when described transparently or in conjunction with other practices. For example, p-hacking, i.e., running multiple analyses before settling on a result, may be permissible when transparently reported and/or accompanied by an appropriate multiple comparisons correction. In the absence of correction or transparency, p-hacking is statistically inappropriate and/or amounts to (intentional or unintentional) deception, and thus leads to unjustified inferences. The use of this QRP may therefore be said to always be questionable (or epistemically ambiguous), but it may or may not amount to fraud or scientific misconduct. To distinguish QRPs from fraud, I propose to focus on the distinction between acting and omitting. Fraudulent research is clearly epistemically nefarious, as it actively creates false evidence. Arguably, QRPs are different because they hide evidence that actually exists. This distinction could be used to judge cases on the spectrum between clear cases of QRP use and clear cases of fraud. Wherever the omission of evidence precludes the reader from making an appropriate inference, that QRP use should be seen as just as epistemically nefarious as fraud. In the example above, those unknowingly evaluating a p-hacked result will have a different impression of the false-positive rate than they would have had if the p-hacking had been transparently reported or a multiple comparisons correction had been carried out. They will therefore be unable to make an appropriate inference, and this QRP use should be situated closer to fraud. I hope that my investigation of QRPs and their possible distinction from fraud will contribute to the discussion surrounding QRPs and fraud in the context of the replication crisis.

Malaga, Spain