Reproducible research

How can research be made reproducible? How can Scientific misconduct be prevented? What are the best practices for a Research workflow?

The ultimate standard for science is replicability, where other researchers can replicate the findings using their own independent instruments and datasets. However, replicating the data collection is often infeasible because of its prohibitive cost. Reproducibility is therefore usually considered the “attainable minimum standard”.1

Ensuring reproducibility (in computational research) means sharing the research code and data, ideally automated, linked, and packaged together with the computing environment. When code is not shared, other researchers must spend considerable effort recreating the computation.3 In practice, sharing data is often not straightforward because of the sensitivity and ownership of the datasets, and it sometimes clashes with the rights of the data owners.
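As a rough illustration of what "packaged with its environment" can mean at the smallest scale, the following Python sketch records the interpreter version, installed package versions, data file checksums, and the random seed in a single manifest. It uses only the standard library. The function names (`sha256_of`, `build_manifest`, `write_manifest`) and the `manifest.json` file name are hypothetical choices for this example, not an established convention, and a manifest like this complements rather than replaces a lock file or container image.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def installed_packages():
    """List installed distributions as 'name==version' strings."""
    pins = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name", "unknown")
            pins.add(f"{name}=={dist.version}")
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
    return sorted(pins)


def build_manifest(data_files, seed):
    """Collect the details someone would need to rerun the analysis."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": installed_packages(),
        # Checksums let others confirm they have the same data,
        # even when the data itself cannot be shared publicly.
        "data": {str(p): sha256_of(p) for p in data_files},
        "seed": seed,
    }


def write_manifest(manifest, out_path="manifest.json"):
    """Write the manifest as pretty-printed JSON (hypothetical file name)."""
    Path(out_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

Calling `write_manifest(build_manifest(["data.csv"], seed=42))` at the end of an analysis script would leave a record next to the results. Publishing the checksums is one way to let others verify a dataset they obtain separately when the data owners do not permit redistribution.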

Note, however, that reproducibility “does not guarantee the quality, correctness, or validity of the published results.”12

One of the earliest efforts toward a reproducible research standard was the ICERM workshop[^ICERM workshop] and the resulting report, Setting the Default to Reproducible.

Richard McElreath’s talk Science as Amateur Software Development surveys the points of failure related to Software engineering, with several nice examples. He also has a paper that investigates the incentives for bad science: The natural selection of bad science.

Examples of harm


People and organizations

Open data and open code

Assel2018statistical reported that statistical code in medical research is in a poor state.





On the responsibility of journals

Replication studies tend to be desk-rejected.