Reproducible research
How can research be made reproducible? How can scientific misconduct be prevented? What are the best practices for a research workflow?
The ultimate standard for science is replicability: other researchers can replicate the findings using their own independent instruments and datasets. However, replicating the data collection is often infeasible because of its prohibitive cost. Reproducibility is therefore usually considered the “attainable minimum standard”.1
Ensuring reproducibility in computational research means sharing the research code and data, ideally automated, linked, and packaged with their environments. When code is not shared, other researchers must spend considerable effort recreating the computation.3 In practice, sharing data is often not straightforward because of the sensitivity and ownership of the datasets, and it sometimes conflicts with the rights of the data owners.
Note, however, that reproducibility “does not guarantee the quality, correctness, or validity of the published results.”12
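As a rough illustration of what "packaged with its environment" can mean in practice, the sketch below writes a small JSON manifest next to an analysis. It records the Python version, the platform, the installed package versions, and a SHA-256 checksum of each input data file. The function names (`sha256_of`, `build_manifest`, `write_manifest`) and the manifest layout are assumptions made for this example, not an established standard, and a manifest like this only documents a run; it does not make the results correct.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_files: list[Path]) -> dict:
    """Describe the interpreter, installed packages, and input data files."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            packages[name] = dist.version
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
        "data": {str(p): sha256_of(p) for p in data_files},
    }


def write_manifest(data_files: list[Path], out_path: Path) -> dict:
    """Write the manifest as JSON to out_path and return it."""
    manifest = build_manifest(data_files)
    out_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```

A hypothetical call such as `write_manifest([Path("data/survey.csv")], Path("manifest.json")))` would be run at the end of an analysis script. Committing the resulting file alongside the code lets a later reader check whether they are running against the same data and package versions, even when the data itself cannot be shared.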
One of the earliest efforts toward a reproducible research standard was the ICERM workshop[^ICERM workshop] and the resulting report, Setting the Default to Reproducible.
Richard McElreath’s talk Science as Amateur Software Development gives an overview of the points of failure related to software engineering, with several nice examples. He also has a paper that investigates the incentives for bad science: The natural selection of bad science.
Examples of harm
Books
People and organizations
Open data and open code
Assel2018statistical reported a poor state of statistical code in medical research.
Tools
Tutorials
- Reproducibility starts at home
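- Hyeshik Chang (장혜식): Reproducible Bioinformatics (Seoul National University Interdisciplinary Program in Bioinformatics, Spring 2019, “IT Fundamentals for Bioinformatics”)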
- Software Sustainability: Better Software Better Science
- Ten simple rules for writing and sharing computational analyses in Jupyter Notebooks
- Cookiecutter Data Science
- https://psyteachr.github.io (for R users)
Talks
Articles
- Why Most Published Research Findings Are False by John Ioannidis
- http://nsaunders.wordpress.com/2011/02/28/nature-on-reproducible-research/
- nature:Computational science: …Error
- http://marciovm.com/i-want-a-github-of-science
- The CRAPL: An academic-strength open source license by Matt Might
- It’s Science, but Not Necessarily Right by Carl Zimmer
- Science: special issue on Data Replication & Reproducibility
- Ten Simple Rules for Reproducible Computational Research
- Nature special: CHALLENGES IN IRREPRODUCIBLE RESEARCH
- Reproducible Research by Sergey Fomel and Jon F
- Reproducible Science by Arturo Casadevall and Ferric C
- Achieving human and machine accessibility of cited data in scholarly publications
- A statistical definition for reproducibility and replicability
- Reproducibility and Replicability in Science
- How GitLab can help in research reproducibility
On the responsibility of journals
Replication studies tend to be desk-rejected
References
- PNAS: Opinion: Reproducible research can still be wrong: Adopting a prevention approach, https://www.pnas.org/content/112/6/1645