==Reproducible research==
===Reproducible research method===
The term ''reproducible research'' refers to the idea that scientific results should be documented in such a way that their derivation is fully transparent. This requires a detailed description of the methods used to obtain the data<ref>{{Cite journal|last=King|first=Gary|date=1995|title=Replication, Replication|journal=PS: Political Science and Politics|volume=28|issue=3|pages=444–452|doi=10.2307/420301|jstor=420301|s2cid=250480339 |issn=1049-0965|url=http://nrs.harvard.edu/urn-3:HUL.InstRepos:4266312}}</ref><ref>{{cite journal|last1=Kühne |first1=Martin |last2=Liehr |first2=Andreas W. |year=2009 |title=Improving the Traditional Information Management in Natural Sciences |doi=10.2481/dsj.8.18 |journal=Data Science Journal |volume=8 |issue=1 |pages=18–27 |url=https://datascience.codata.org/jms/article/download/dsj.8.18/198 |doi-access=free}}</ref> and making the full dataset and the code to calculate the results easily accessible.<ref>{{cite journal|last1=Fomel |first1=Sergey |author-link2=Jon Claerbout |last2=Claerbout |first2=Jon |year=2009 |title=Guest Editors' Introduction: Reproducible Research |journal=Computing in Science and Engineering |volume=11 |issue=1 |pages=5–7 |doi=10.1109/MCSE.2009.14 |bibcode=2009CSE....11a...5F}}</ref><ref name="buckheit1995" /><ref>{{cite journal|title=The Yale Law School Round Table on Data and Core Sharing: "Reproducible Research" |journal=Computing in Science and Engineering |volume=12 |issue=5 |pages=8–12 |doi=10.1109/MCSE.2010.113 |year=2010 |doi-access=}}</ref><ref>{{cite journal|last1=Marwick |first1=Ben |year=2016 |title=Computational reproducibility in archaeological research: Basic principles and a case study of their implementation |journal=Journal of Archaeological Method and Theory |volume=24 |issue=2 |pages=424–450 |doi=10.1007/s10816-015-9272-9 |s2cid=43958561 |url=https://ro.uow.edu.au/smhpapers/4034}}</ref><ref>{{cite journal|last1=Goodman|first1=Steven N.|last2=Fanelli|first2=Daniele|last3=Ioannidis|first3=John P. A.|title=What does research reproducibility mean?|journal=Science Translational Medicine|date=1 June 2016|volume=8|issue=341|pages=341ps12|doi=10.1126/scitranslmed.aaf5027|pmid=27252173|doi-access=free}}</ref><ref>{{Cite journal|last1=Harris J.K|last2=Johnson K.J|last3=Combs T.B|last4=Carothers B.J|last5=Luke D.A|last6=Wang X|date=2019|title=Three Changes Public Health Scientists Can Make to Help Build a Culture of Reproducible Research|journal=Public Health Reports|volume=134|issue=2|pages=109–111|issn=0033-3549|oclc=7991854250|doi=10.1177/0033354918821076|pmid=30657732|pmc=6410469}}</ref> This is an essential part of [[open science]].

To make a research project computationally reproducible, general practice is to keep all data and files clearly separated, labelled, and documented. All operations should be fully documented and automated as far as practicable, avoiding manual intervention where feasible. The workflow should be designed as a sequence of smaller steps, combined so that the intermediate outputs of one step feed directly into the next step as inputs. Version control should be used, as it allows the history of the project to be reviewed easily and changes to be documented and tracked in a transparent manner.

A basic workflow for reproducible research involves data acquisition, data processing and data analysis. Data acquisition consists of obtaining primary data from a primary source, such as surveys, field observations, or experimental research, or of obtaining data from an existing source. Data processing involves processing and reviewing the raw data collected in the first stage; it includes data entry, data manipulation and filtering, and may be done using software. The data should be digitized and prepared for data analysis.
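
The three stages named above can be illustrated with a minimal sketch. It is not taken from any of the cited sources: the function names (<code>acquire</code>, <code>process</code>, <code>analyse</code>, <code>run_pipeline</code>) and the small inline survey table are hypothetical, chosen only to show how each automated step can pass its output directly to the next without manual intervention.

```python
import csv
import io
import statistics

# Hypothetical raw survey data. In a real project this would be a raw
# data file kept under version control and never edited by hand.
RAW_CSV = """respondent,age,score
1,34,7.5
2,,6.0
3,29,8.5
4,41,not recorded
"""


def acquire(text):
    """Data acquisition: read the raw records exactly as collected."""
    return list(csv.DictReader(io.StringIO(text)))


def process(rows):
    """Data processing: convert types and filter out unusable records."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({"age": int(row["age"]), "score": float(row["score"])})
        except ValueError:
            continue  # skip rows with missing or non-numeric values
    return cleaned


def analyse(rows):
    """Data analysis: compute summary statistics from the cleaned data."""
    scores = [row["score"] for row in rows]
    return {"n": len(scores), "mean_score": statistics.mean(scores)}


def run_pipeline(text):
    """Chain the steps so that each output is the next step's input."""
    return analyse(process(acquire(text)))


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

Because every step is an ordinary function with explicit inputs and outputs, rerunning <code>run_pipeline</code> on the same raw data should give the same result, and the filtering rule is recorded in code rather than applied by hand.
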
Data may be analysed with software that interprets or visualises the data to produce the results of the research, such as figures and tables. The use of software and automation enhances the reproducibility of research methods.<ref>{{cite book |last1=Kitzes |first1=Justin |last2=Turek |first2=Daniel |last3=Deniz |first3=Fatma |title=The practice of reproducible research case studies and lessons from the data-intensive sciences |date=2018 |publisher=University of California Press |location=Oakland, California |isbn=9780520294745 |pages=19–30 |jstor=10.1525/j.ctv1wxsc7 |url=http://www.jstor.org/stable/10.1525/j.ctv1wxsc7}}</ref>

There are systems that facilitate such documentation, like the [[R (programming language)|R]] [[Markdown]] language<ref>{{cite journal|last1=Marwick|first1=Ben|last2=Boettiger|first2=Carl|last3=Mullen|first3=Lincoln|title=Packaging data analytical work reproducibly using R (and friends)|journal=The American Statistician|volume=72|date=29 September 2017|pages=80–88|doi=10.1080/00031305.2017.1375986|s2cid=125412832|url=http://ro.uow.edu.au/cgi/viewcontent.cgi?article=6445&context=smhpapers}}</ref> or the [[Jupyter]] notebook.<ref>{{cite conference|title=Jupyter Notebooks – a publishing format for reproducible computational workflows |url=https://eprints.soton.ac.uk/403913/1/STAL9781614996491-0087.pdf |archive-url=https://web.archive.org/web/20180110174609/https://eprints.soton.ac.uk/403913/1/STAL9781614996491-0087.pdf |archive-date=2018-01-10 |url-status=live |book-title=Positioning and Power in Academic Publishing: Players, Agents and Agendas |editor1-last=Loizides |editor1-first=F |editor2-last=Schmidt |editor2-first=B |publisher=IOS Press |last1=Kluyver |first1=Thomas |last2=Ragan-Kelley |first2=Benjamin |last3=Perez |first3=Fernando |last4=Granger |first4=Brian |last5=Bussonnier |first5=Matthias |last6=Frederic |first6=Jonathan |last7=Kelley |first7=Kyle |last8=Hamrick |first8=Jessica |last9=Grout |first9=Jason |last10=Corlay |first10=Sylvain |conference=20th International Conference on Electronic Publishing |pages=87–90 |year=2016|doi=10.3233/978-1-61499-649-1-87}}</ref><ref>{{cite journal |last1=Beg |first1=Marijan |last2=Taka |first2=Juliette |last3=Kluyver |first3=Thomas |last4=Konovalov |first4=Alexander |last5=Ragan-Kelley |first5=Min |last6=Thiery |first6=Nicolas M. |last7=Fangohr |first7=Hans |title=Using Jupyter for Reproducible Scientific Workflows |journal=Computing in Science & Engineering |date=1 March 2021 |volume=23 |issue=2 |pages=36–46 |doi=10.1109/MCSE.2021.3052101|arxiv=2102.09562 |bibcode=2021CSE....23b..36B |s2cid=231979203 }}</ref><ref>{{cite journal |last1=Granger |first1=Brian E. |last2=Perez |first2=Fernando |title=Jupyter: Thinking and Storytelling With Code and Data |journal=Computing in Science & Engineering |date=1 March 2021 |volume=23 |issue=2 |pages=7–14 |doi=10.1109/MCSE.2021.3059263|bibcode=2021CSE....23b...7G |s2cid=232413965 |doi-access=free }}</ref> The [[Open Science Framework]] provides a platform and useful tools to support reproducible research.

===Reproducible research in practice===
Psychology has seen a renewal of internal concerns about irreproducible results (see the entry on [[replicability crisis]] for empirical results on success rates of replications). A 2006 study showed that, of 141 authors of empirical articles published by the American Psychological Association (APA), 103 (73%) did not provide their data over a six-month period.<ref>{{Cite journal|last1=Wicherts |first1=J. M. |last2=Borsboom |first2=D. |last3=Kats |first3=J. |last4=Molenaar |first4=D. |title=The poor availability of psychological research data for reanalysis |doi=10.1037/0003-066X.61.7.726 |journal=American Psychologist |volume=61 |issue=7 |pages=726–728 |year=2006 |pmid=17032082}}</ref> A follow-up study published in 2015 found that 246 of 394 contacted authors of papers in APA journals (62%) did not share their data upon request.<ref>{{Cite journal|last1=Vanpaemel |first1=W. |last2=Vermorgen |first2=M. |last3=Deriemaecker |first3=L. |last4=Storms |first4=G. |title=Are we wasting a good crisis? The availability of psychological research data after the storm |doi=10.1525/collabra.13 |journal=Collabra |volume=1 |issue=1 |pages=1–5 |year=2015 |doi-access=free}}</ref> A 2012 paper suggested that researchers should publish data along with their works, and a dataset was released alongside it as a demonstration.<ref>{{Cite journal|last1=Wicherts |first1=J. M. |last2=Bakker |first2=M. |doi=10.1016/j.intell.2012.01.004 |title=Publish (your data) or (let the data) perish! Why not publish your data too? |journal=Intelligence |volume=40 |issue=2 |pages=73–76 |year=2012}}</ref> In 2017, an article published in ''[[Scientific Data (journal)|Scientific Data]]'' suggested that this may not be sufficient and that the whole analysis context should be disclosed.<ref>{{cite journal|last1=Pasquier|first1=Thomas|last2=Lau|first2=Matthew K.|last3=Trisovic|first3=Ana|last4=Boose|first4=Emery R.|last5=Couturier|first5=Ben|last6=Crosas|first6=Mercè|last7=Ellison|first7=Aaron M.|last8=Gibson|first8=Valerie|last9=Jones|first9=Chris R.|last10=Seltzer|first10=Margo|title=If these data could talk|journal=Scientific Data|date=5 September 2017|volume=4|issue=1 |pages=170114|doi=10.1038/sdata.2017.114|pmid=28872630|pmc=5584398|bibcode=2017NatSD...470114P}}</ref>

In economics, concerns have been raised in relation to the credibility and reliability of published research.
In other sciences, reproducibility is regarded as fundamental and is often a prerequisite for publication; in the economic sciences, however, it is not treated as a top priority. Most peer-reviewed economic journals do not take any substantive measures to ensure that published results are reproducible, although the top economics journals have been moving to adopt mandatory data and code archives.<ref>{{cite journal |last1=McCullough |first1=Bruce |title=Open Access Economics Journals and the Market for Reproducible Economic Research |journal=Economic Analysis and Policy |date=March 2009 |volume=39 |issue=1 |pages=117–126 |doi=10.1016/S0313-5926(09)50047-1|doi-access= }}</ref> There are few or no incentives for researchers to share their data, and authors would have to bear the costs of compiling data into reusable forms. Economic research is often not reproducible because only a portion of journals have adequate disclosure policies for datasets and program code; even where such policies exist, authors frequently do not comply with them, or publishers do not enforce them. A study of 599 articles published in 37 peer-reviewed journals revealed that, while some journals have achieved high compliance rates, a significant portion have complied only partially or not at all. On an article level, the average compliance rate was 47.5%; on a journal level, the average compliance rate was 38%, ranging from 13% to 99%.<ref>{{cite journal |last1=Vlaeminck |first1=Sven |last2=Podkrajac |first2=Felix |title=Journals in Economic Sciences: Paying Lip Service to Reproducible Research? |journal=IASSIST Quarterly |date=2017-12-10 |volume=41 |issue=1–4 |page=16 |doi=10.29173/iq6 |url=https://iassistquarterly.com/index.php/iassist/article/view/6/905|hdl=11108/359 |s2cid=96499437 |hdl-access=free }}</ref>

A 2018 study published in the journal ''[[PLOS ONE]]'' found that 14.4% of a sample of public health statistics researchers had shared their data or code or both.<ref>{{Cite journal|date=2018|title=Use of reproducible research practices in public health: A survey of public health analysts.|journal=PLOS ONE|volume=13|issue=9|pages=e0202447|issn=1932-6203|oclc=7891624396|bibcode=2018PLoSO..1302447H|last1=Harris|first1=Jenine K.|last2=Johnson|first2=Kimberly J.|last3=Carothers|first3=Bobbi J.|last4=Combs|first4=Todd B.|last5=Luke|first5=Douglas A.|last6=Wang|first6=Xiaoyan|doi=10.1371/journal.pone.0202447|pmid=30208041|pmc=6135378|doi-access=free}}</ref>

There have been initiatives to improve reporting, and hence reproducibility, in the medical literature for many years, beginning with the [[Consolidated Standards of Reporting Trials|CONSORT]] initiative, which is now part of a wider initiative, the [[EQUATOR Network]]. This group has recently turned its attention to how better reporting might reduce waste in research,<ref>{{Cite web|title=Research Waste/EQUATOR Conference {{!}} Research Waste |url=http://researchwaste.net/research-wasteequator-conference/ |website=researchwaste.net |url-status=dead |archive-url=https://web.archive.org/web/20161029015313/http://researchwaste.net:80/research-wasteequator-conference/ |archive-date=29 October 2016}}</ref> especially biomedical research.

Reproducible research is key to new discoveries in [[pharmacology]]. A Phase I discovery will be followed by Phase II reproductions as a drug develops towards commercial production. In recent decades Phase II success has fallen from 28% to 18%. A 2011 study found that 65% of medical studies were inconsistent when re-tested, and only 6% were completely reproducible.<ref>{{Cite journal|last1=Prinz |first1=F. |last2=Schlange |first2=T. |last3=Asadullah |first3=K. |doi=10.1038/nrd3439-c1 |title=Believe it or not: How much can we rely on published data on potential drug targets? |journal=Nature Reviews Drug Discovery |volume=10 |issue=9 |page=712 |year=2011 |pmid=21892149 |doi-access=free}}</ref>

Some efforts have been made to increase replicability beyond the social and biomedical sciences. Studies in the humanities tend to rely more on expertise and hermeneutics, which may make replicability more difficult. Nonetheless, there have been calls for more transparency and documentation in the humanities.<ref>{{Cite journal |last1=Van Eyghen |first1=Hans |last2= Van den Brink |first2=Gijsbert |last3= Peels |first3= Rik |title=Brooke on the Merton Thesis: A Direct Replication of John Hedley Brooke's Chapter on Scientific and Religious Reform |journal=Zygon: Journal of Religion and Science |volume=59 |issue=2 |year=2024|url=https://www.zygonjournal.org/article/id/11497/| doi=10.16995/zygon.11497|doi-access=free }}</ref>