== Challenges ==
In the commercial analytics software industry, an emphasis has emerged on solving the challenges of analyzing massive, complex data sets, often when such data is in a constant state of change. Such data sets are commonly referred to as [[big data]].<ref name=":2">{{Cite web|title=2.3 Ten common characteristics of big data|url=https://www.bitbybitbook.com/en/1st-ed/observing-behavior/characteristics/|access-date=2022-01-10|website=www.bitbybitbook.com|language=en|archive-date=March 31, 2022|archive-url=https://web.archive.org/web/20220331114208/https://www.bitbybitbook.com/en/1st-ed/observing-behavior/characteristics/|url-status=live}}</ref> Whereas the problems posed by big data were once found only in the scientific community, today big data is a problem for many businesses that operate transactional systems online and, as a result, amass large volumes of data quickly.<ref>{{cite web|last=Naone|first=Erica|title=The New Big Data|url=https://www.technologyreview.com/2011/08/22/192225/the-new-big-data/|access-date=August 22, 2011|publisher=Technology Review, MIT|archive-date=May 20, 2022|archive-url=https://web.archive.org/web/20220520143457/https://www.technologyreview.com/2011/08/22/192225/the-new-big-data/|url-status=live}}</ref><ref name=":2"/>

The analysis of [[unstructured data]] types is another challenge receiving attention in the industry.
Unstructured data differs from [[structured data]] in that its format varies widely and cannot be stored in traditional relational databases without significant data transformation effort.<ref>{{cite book|last1=Inmon|first1=Bill|title=Tapping Into Unstructured Data|last2=Nesavich|first2=Anthony|publisher=Prentice-Hall|year=2007|isbn=978-0-13-236029-6}}</ref> Sources of unstructured data, such as email, the contents of word processor documents, PDFs, [[geospatial data]], etc., are rapidly becoming a relevant source of [[business intelligence]] for businesses, governments and universities.<ref>{{cite web|last=Wise|first=Lyndsay|title=Data Analysis and Unstructured Data|url=http://www.dashboardinsight.com/articles/business-performance-management/data-analysis-and-unstructured-data.aspx|url-status=dead|archive-url=https://web.archive.org/web/20140105045015/http://www.dashboardinsight.com/articles/business-performance-management/data-analysis-and-unstructured-data.aspx|archive-date=January 5, 2014|access-date=February 14, 2011|publisher=Dashboard Insight}}</ref><ref>{{Cite web|title=Tapping the power of unstructured data|url=https://mitsloan.mit.edu/ideas-made-to-matter/tapping-power-unstructured-data|access-date=2022-01-10|website=MIT Sloan|date=February 2021 |language=en|archive-date=January 10, 2022|archive-url=https://web.archive.org/web/20220110151504/https://mitsloan.mit.edu/ideas-made-to-matter/tapping-power-unstructured-data|url-status=live}}</ref> For example, in Britain the discovery that one company was illegally selling fraudulent doctors' notes to help people defraud employers and insurance companies<ref>{{cite news|date=August 26, 2008|title=Fake doctors' sick notes for Sale for £25, NHS fraud squad warns|newspaper=The Telegraph|location=London|url=https://www.telegraph.co.uk/news/uknews/2626120/Fake-doctors-sick-notes-for-sale-on-web-for-25-NHS-fraud-squad-warns.html |archive-url=https://ghostarchive.org/archive/20220112/https://www.telegraph.co.uk/news/uknews/2626120/Fake-doctors-sick-notes-for-sale-on-web-for-25-NHS-fraud-squad-warns.html |archive-date=January 12, 2022 |url-access=subscription |url-status=live|access-date=September 16, 2011}}{{cbignore}}</ref> is an opportunity for insurance firms to increase the vigilance of their unstructured [[data analysis]].<ref>{{cite news|date=May 26, 2011|title=Big Data: The next frontier for innovation, competition and productivity as reported in Building with Big Data|newspaper=The Economist|url=http://www.economist.com/node/18741392|url-status=live|archive-url=https://web.archive.org/web/20110603031738/http://www.economist.com/node/18741392|archive-date=June 3, 2011}}</ref>{{Original research inline|date=January 2022}}

These challenges are the current inspiration for much of the innovation in modern analytics information systems, giving rise to relatively new machine analysis concepts such as [[complex event processing]],<ref>{{Cite journal|last1=Flouris|first1=Ioannis|last2=Giatrakos|first2=Nikos|last3=Deligiannakis|first3=Antonios|last4=Garofalakis|first4=Minos|last5=Kamp|first5=Michael|last6=Mock|first6=Michael|date=2017-05-01|title=Issues in complex event processing: Status and prospects in the Big Data era|url=https://www.sciencedirect.com/science/article/pii/S0164121216300802|journal=Journal of Systems and Software|language=en|volume=127|pages=217–236|doi=10.1016/j.jss.2016.06.011|issn=0164-1212|access-date=January 10, 2022|archive-date=April 14, 2019|archive-url=https://web.archive.org/web/20190414070609/http://www.sciencedirect.com/science/article/pii/S0164121216300802|url-status=live|url-access=subscription}}</ref> full-text search and analysis, and even new ideas in presentation.
One such innovation is the introduction of grid-like architecture in machine analysis, allowing increases in the speed of [[massively parallel]] processing by distributing the workload to many computers all with equal access to the complete data set.<ref>{{Cite journal|last1=Yang|first1=Ning|last2=Liu|first2=Diyou|last3=Feng|first3=Quanlong|last4=Xiong|first4=Quan|last5=Zhang|first5=Lin|last6=Ren|first6=Tianwei|last7=Zhao|first7=Yuanyuan|last8=Zhu|first8=Dehai|last9=Huang|first9=Jianxi|date=2019-06-25|title=Large-Scale Crop Mapping Based on Machine Learning and Parallel Computation with Grids|journal=Remote Sensing|volume=11|issue=12|pages=1500|doi=10.3390/rs11121500|bibcode=2019RemS...11.1500Y |issn=2072-4292|doi-access=free}}</ref>

Analytics is increasingly used in [[education]], particularly at the district and government office levels. However, the complexity of student performance measures presents challenges when educators try to understand and use analytics to discern patterns in student performance, predict graduation likelihood, improve chances of student success, etc.<ref>{{Cite book|last1=Prinsloo|first1=Paul|last2=Slade|first2=Sharon|title=Proceedings of the Seventh International Learning Analytics & Knowledge Conference |chapter=An elephant in the learning analytics room |date=2017-03-13|chapter-url=https://doi.org/10.1145/3027385.3027406|series=LAK '17|location=New York, NY, USA|publisher=Association for Computing Machinery|pages=46–55|doi=10.1145/3027385.3027406|isbn=978-1-4503-4870-6|s2cid=9490514|url=http://oro.open.ac.uk/48944/2/LAK_2017_paper_89.pdf }}</ref> For example, in a study involving districts known for strong data use, 48% of teachers had difficulty posing questions prompted by data, 36% did not comprehend given data, and 52% incorrectly interpreted data.<ref>U.S. Department of Education Office of Planning, Evaluation and Policy Development (2009). ''Implementing data-informed decision making in schools: Teacher access, supports and use.'' United States Department of Education (ERIC Document Reproduction Service No. ED504191)</ref> To combat this, some analytics tools for educators adhere to an [[over-the-counter data]] format (embedding labels, supplemental documentation, and a help system, and making key package/display and content decisions) to improve educators' understanding and use of the analytics being displayed.<ref>Rankin, J. (March 28, 2013). [https://sas.elluminate.com/site/external/recording/playback/link/table/dropin?sid=2008350&suid=D.4DF60C7117D5A77FE3AED546909ED2 How data Systems & reports can either fight or propagate the data analysis error epidemic, and how educator leaders can help.] {{Webarchive|url=https://web.archive.org/web/20190326201414/https://sas.elluminate.com/site/external/recording/playback/link/table/dropin?sid=2008350&suid=D.4DF60C7117D5A77FE3AED546909ED2|date=March 26, 2019}} ''Presentation conducted from Technology Information Center for Administrative Leadership (TICAL) School Leadership Summit.''</ref>

=== Risks ===
Risks for the general population include [[discrimination]] on the basis of characteristics such as gender, skin colour, ethnic origin or political opinions, through mechanisms such as [[price discrimination]] or [[Statistical discrimination (economics)|statistical discrimination]].<ref>{{Cite journal|last1=Favaretto|first1=Maddalena|last2=De Clercq|first2=Eva|last3=Elger|first3=Bernice Simone|date=2019-02-05|title=Big Data and discrimination: perils, promises and solutions. A systematic review|journal=[[Journal of Big Data]]|volume=6|issue=1|pages=12|doi=10.1186/s40537-019-0177-4|s2cid=59603476|issn=2196-1115|doi-access=free}}</ref>