==Privacy concerns and ethics==
While the term "data mining" itself may have no ethical implications, it is often associated with the mining of information in relation to [[User behavior analytics|user behavior]] (ethical and otherwise).<ref>{{cite journal |author=Seltzer, William |title=The Promise and Pitfalls of Data Mining: Ethical Issues |url=https://ww2.amstat.org/committees/ethics/linksdir/Jsm2005Seltzer.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://ww2.amstat.org/committees/ethics/linksdir/Jsm2005Seltzer.pdf |archive-date=2022-10-09 |url-status=live|publisher = American Statistical Association|journal = ASA Section on Government Statistics|date = 2005 }}</ref>

The ways in which data mining is used can, in some cases and contexts, raise questions regarding [[privacy]], legality, and [[ethics]].<ref>{{cite journal |author=Pitts, Chip |title=The End of Illegal Domestic Spying? Don't Count on It |url=http://www.washingtonspectator.com/articles/20070315surveillance_1.cfm |journal=Washington Spectator |date=15 March 2007 |url-status=dead |archive-url=https://web.archive.org/web/20071128015201/http://www.washingtonspectator.com/articles/20070315surveillance_1.cfm |archive-date=2007-11-28 }}</ref> In particular, data mining government or commercial data sets for [[national security]] or [[law enforcement]] purposes, such as in the [[Total Information Awareness]] Program or in [[ADVISE]], has raised privacy concerns.<ref>{{cite journal |author=Taipale, Kim A. |title=Data Mining and Domestic Security: Connecting the Dots to Make Sense of Data |url=http://www.stlr.org/cite.cgi?volume=5&article=2 |journal=Columbia Science and Technology Law Review |volume=5 |issue=2 |date=15 December 2003 |ssrn=546782 |oclc=45263753 |access-date=21 April 2004 |archive-date=5 November 2014 |archive-url=https://web.archive.org/web/20141105035644/http://www.stlr.org/cite.cgi?volume=5&article=2 |url-status=dead }}</ref><ref>{{cite web|last1=Resig|first1=John|title=A Framework for Mining Instant Messaging Services|url=https://johnresig.com/files/research/SIAMPaper.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://johnresig.com/files/research/SIAMPaper.pdf |archive-date=2022-10-09 |url-status=live|access-date=16 March 2018}}</ref>

Data mining requires data preparation that can uncover information or patterns which compromise [[confidentiality]] and [[Data privacy|privacy]] obligations. A common way for this to occur is through [[aggregate function|data aggregation]]. [[Data aggregation]] involves combining data (possibly from various sources) in a way that facilitates analysis, but that also might make identification of private, individual-level data deducible or otherwise apparent.<ref name="NASCIO">[http://www.nascio.org/publications/documents/NASCIO-dataMining.pdf ''Think Before You Dig: Privacy Implications of Data Mining & Aggregation''] {{webarchive|url=https://web.archive.org/web/20081217063043/http://www.nascio.org/publications/documents/NASCIO-dataMining.pdf |date=2008-12-17 }}, NASCIO Research Brief, September 2004</ref> This is not data mining ''per se'', but a result of the preparation of data before, and for the purposes of, the analysis.
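As a rough illustration of how aggregation can defeat de-identification, the following minimal sketch (using invented, hypothetical records and field names, not drawn from any cited source) links a "de-identified" data set with a public roll that shares quasi-identifiers such as ZIP code, birth date, and sex; a unique match ties an anonymous record back to a named person:

<syntaxhighlight lang="python">
# Hypothetical, invented data for illustration only.
deidentified = [
    {"zip": "13053", "birth_date": "1965-07-22", "sex": "F", "diagnosis": "flu"},
    {"zip": "13068", "birth_date": "1971-02-14", "sex": "M", "diagnosis": "asthma"},
]
public_roll = [
    {"name": "A. Example", "zip": "13053", "birth_date": "1965-07-22", "sex": "F"},
    {"name": "B. Example", "zip": "13068", "birth_date": "1971-02-14", "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

def link(record, roll):
    """Names in the public roll whose quasi-identifiers match the record."""
    key = tuple(record[q] for q in QUASI_IDENTIFIERS)
    return [row["name"] for row in roll
            if tuple(row[q] for q in QUASI_IDENTIFIERS) == key]

for record in deidentified:
    matches = link(record, public_roll)
    if len(matches) == 1:
        # A unique match re-identifies the individual behind the "anonymous" record.
        print(matches[0], "->", record["diagnosis"])
</syntaxhighlight>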
The threat to an individual's privacy comes into play when the data, once compiled, cause the data miner, or anyone who has access to the newly compiled data set, to be able to identify specific individuals, especially when the data were originally anonymous.<ref>{{cite magazine |first=Paul |last=Ohm |title=Don't Build a Database of Ruin |magazine=Harvard Business Review |url=http://blogs.hbr.org/cs/2012/08/dont_build_a_database_of_ruin.html}}</ref> It is recommended{{according to whom|date=August 2019}} to be aware of the following '''before''' data are collected:<ref name="NASCIO" />
* The purpose of the data collection and any (known) data mining projects.
* How the data will be used.
* Who will be able to mine the data and use the data and their derivatives.
* The status of security surrounding access to the data.
* How collected data can be updated.

Data may also be modified so as to ''become'' anonymous, so that individuals may not readily be identified.<ref name="NASCIO" /> However, even "[[Data anonymization|anonymized]]" data sets can potentially contain enough information to allow identification of individuals, as occurred when journalists were able to find several individuals based on a set of search histories that were inadvertently released by AOL.<ref>[http://www.securityfocus.com/brief/277 ''AOL search data identified individuals''] {{Webarchive|url=https://web.archive.org/web/20100106113836/http://www.securityfocus.com/brief/277 |date=2010-01-06 }}, SecurityFocus, August 2006</ref> The inadvertent revelation of [[personally identifiable information]] by the data provider violates Fair Information Practices. This indiscretion can cause financial, emotional, or bodily harm to the affected individual. In one instance of [[privacy violation]], the patrons of Walgreens filed a lawsuit against the company in 2011 for selling prescription information to data mining companies, who in turn provided the data to pharmaceutical companies.<ref>{{Cite journal|title = Big data's impact on privacy, security and consumer welfare|journal = Telecommunications Policy|pages = 1134–1145|volume = 38|issue = 11|doi = 10.1016/j.telpol.2014.10.002|first = Nir|last = Kshetri|year = 2014|url = http://libres.uncg.edu/ir/uncg/f/N_Kshetri_Big_2014.pdf|access-date = 2018-04-20|archive-date = 2018-06-19|archive-url = https://web.archive.org/web/20180619135001/http://libres.uncg.edu/ir/uncg/f/N_Kshetri_Big_2014.pdf|url-status = live}}</ref>
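One common yardstick for whether data have been modified enough to count as anonymous is ''k''-anonymity: every combination of quasi-identifier values in the released data should be shared by at least ''k'' records, so that no record is unique on those fields. The following minimal sketch of such a check uses invented, generalised records and field names chosen purely for illustration:

<syntaxhighlight lang="python">
from collections import Counter

QUASI_IDENTIFIERS = ("zip_prefix", "birth_decade", "sex")

# Hypothetical, generalised records: exact values replaced by coarser ranges.
released = [
    {"zip_prefix": "130", "birth_decade": "1960s", "sex": "F", "diagnosis": "flu"},
    {"zip_prefix": "130", "birth_decade": "1960s", "sex": "F", "diagnosis": "asthma"},
    {"zip_prefix": "130", "birth_decade": "1970s", "sex": "M", "diagnosis": "flu"},
]

def is_k_anonymous(rows, k):
    """True if every quasi-identifier combination appears in at least k rows."""
    groups = Counter(tuple(row[q] for q in QUASI_IDENTIFIERS) for row in rows)
    return all(count >= k for count in groups.values())

print(is_k_anonymous(released, 2))  # False: the generalised male record is still unique
</syntaxhighlight>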
===Situation in Europe===
[[European Union|Europe]] has rather strong privacy laws, and efforts are underway to further strengthen the rights of consumers. However, the [[International Safe Harbor Privacy Principles|U.S.–E.U. Safe Harbor Principles]], developed between 1998 and 2000, effectively exposed European users to privacy exploitation by U.S. companies. As a consequence of [[Edward Snowden]]'s [[global surveillance disclosure]], there was increased discussion of revoking this agreement, in particular because the data would be fully exposed to the [[National Security Agency]], and attempts to reach an agreement with the United States failed; the Safe Harbor framework was invalidated by the [[Court of Justice of the European Union]] in October 2015.<ref>{{cite web |url=https://crsreports.congress.gov/product/pdf/R/R44257/7 |title=U.S.–E.U. Data Privacy: From Safe Harbor to Privacy Shield |last1=Weiss |first1=Martin A. |last2=Archick |first2=Kristin |date=19 May 2016 |agency=Congressional Research Service |location=Washington, D.C. |page=6 |format=PDF |id=R44257 |access-date=9 April 2020 |quote=On October 6, 2015, the [[CJEU]] ... issued a decision that invalidated Safe Harbor (effective immediately), as currently implemented. |archive-date=9 April 2020 |archive-url=https://web.archive.org/web/20200409134413/https://crsreports.congress.gov/product/pdf/R/R44257/7 |url-status=dead }}</ref>

In the United Kingdom in particular, there have been cases of corporations using data mining to target certain groups of customers, forcing them to pay unfairly high prices. These groups tend to be people of lower socio-economic status who are not aware of the ways they can be exploited in digital marketplaces.<ref>{{Cite web |last=Parker |first=George |date=2018-09-30 |title=UK companies targeted for using big data to exploit customers |url=https://www.ft.com/content/5dbd98ca-c491-11e8-bc21-54264d1c4647 |archive-url=https://ghostarchive.org/archive/20221210/https://www.ft.com/content/5dbd98ca-c491-11e8-bc21-54264d1c4647 |archive-date=2022-12-10 |url-access=subscription |access-date=2022-12-04 |website=Financial Times}}</ref>

===Situation in the United States===
In the United States, privacy concerns have been addressed by the [[US Congress]] via the passage of regulatory controls such as the [[Health Insurance Portability and Accountability Act]] (HIPAA). HIPAA requires individuals to give their "informed consent" regarding information they provide and its intended present and future uses. According to an article in ''Biotech Business Week'', "'[i]n practice, HIPAA may not offer any greater protection than the longstanding regulations in the research arena,' says the AAHC. More importantly, the rule's goal of protection through informed consent is approach[ing] a level of incomprehensibility to average individuals."<ref>Biotech Business Week Editors (June 30, 2008); ''BIOMEDICINE; HIPAA Privacy Rule Impedes Biomedical Research'', Biotech Business Week, retrieved 17 November 2009 from LexisNexis Academic</ref> This underscores the necessity for data anonymity in data aggregation and mining practices.

U.S. information privacy legislation such as HIPAA and the [[Family Educational Rights and Privacy Act]] (FERPA) applies only to the specific areas that each such law addresses. The use of data mining by the majority of businesses in the U.S. is not controlled by any legislation.