===Data provenance===
{{further|Data lineage}}
[[Scientific research]] is generally held to be of good provenance when it is documented in detail sufficient to allow [[reproducibility]].<ref>Altintas, I.; Berkley, C.; Jaeger, E.; Jones, M.; Ludascher, B.; Mock, S. (2004) "Kepler: An extensible system for design and execution of scientific workflows". ''Proceedings of 16th International Conference on Scientific and Statistical Database Management'', pp. 423–424</ref><ref>{{cite journal|last1=Pasquier|first1=Thomas|last2=Lau|first2=Matthew K.|last3=Trisovic|first3=Ana|last4=Boose|first4=Emery R.|last5=Couturier|first5=Ben|last6=Crosas|first6=Mercè|last7=Ellison|first7=Aaron M.|last8=Gibson|first8=Valerie|last9=Jones|first9=Chris R.|last10=Seltzer|first10=Margo|title=If these data could talk|journal=Scientific Data|date=5 September 2017|volume=4|pages=170114|doi=10.1038/sdata.2017.114|pmid=28872630|pmc=5584398|bibcode=2017NatSD...470114P}}</ref> [[Scientific workflow system]]s assist scientists and programmers with tracking their data through all transformations, analyses, and interpretations. Data sets are reliable when the processes used to create them are [[Reproducibility|reproducible]] and analyzable for defects.<ref>Boose, E.; Ellison, A.; Osterweil, L.; Clarke, L.; Podorozhny, R.; Hadley, J.; Wise, A.; Foster, D. (2007) Ensuring reliable datasets for environmental models and forecasts. ''Ecological Informatics'', 2(3):237–247</ref>

Security researchers are interested in data provenance because it can be used to analyze suspicious data and to make large, opaque systems transparent.<ref>{{Cite journal|last1=Bates|first1=Adam|last2=Hassan|first2=Wajih Ul|date=2019|title=Can Data Provenance Put an End to the Data Breach?|journal=IEEE Security & Privacy|volume=17|issue=4|pages=88–93|doi=10.1109/MSEC.2019.2913693|s2cid=195832747|doi-access=free}}</ref>

Current initiatives to manage, share, and reuse ecological data effectively indicate the increasing importance of data provenance. Examples include the [[National Science Foundation]] [[Datanet]] projects [[DataONE]] and Data Conservancy, as well as the [[U.S. Global Change Research Program]].<ref name="Ma, et al.">Ma, X.; Fox, P.; Tilmes, C.; Jacobs, K.; Waple, A. (2014) Capturing and presenting provenance of global change information. ''Nature Climate Change'' 4 (6), 409–413.</ref>

Some international academic consortia have dedicated groups to tackle issues of provenance; in the [[Research Data Alliance]], this is the Research Data Provenance Interest Group.<ref>{{cite web|url=https://rd-alliance.org/groups/research-data-provenance.html|title=Research Data Provenance IG|date=11 September 2013|website=RDA}}</ref>