{{Short description|Digitally encoded information}}
{{Prose|date=January 2025}}
{{Use mdy dates|date=October 2019}}
{{about|the more general meaning of the term "data stream"|the UK-specific DSL technology|Datastream}}
{{distinguish|Stream (computing)|Bitstream}}

In [[connection-oriented communication]], a '''data stream''' is the [[transmission (telecommunications)|transmission]] of a sequence of [[digital signal|digitally encoded signals]] to convey [[information]].<ref>{{Cite web |url=http://www.its.bldrdoc.gov/fs-1037/dir-010/_1451.htm |title=Federal Standard 1037C ''data stream'' |access-date=April 4, 2007 |archive-url=https://web.archive.org/web/20070413122107/http://www.its.bldrdoc.gov/fs-1037/dir-010/_1451.htm |archive-date=April 13, 2007 |url-status=live }}</ref> Typically, the transmitted symbols are grouped into a series of [[Network packet|packet]]s.<ref>{{cite web|url=https://www.techopedia.com/definition/6757/data-stream|title=Data Stream|website=techopedia.com|access-date=April 24, 2019|archive-url=https://web.archive.org/web/20190424083723/https://www.techopedia.com/definition/6757/data-stream|archive-date=April 24, 2019|url-status=live}}</ref>

Data streaming has become ubiquitous. Anything transmitted over the [[Internet]] is transmitted as a data stream. Using a [[mobile phone]] to have a conversation transmits the sound as a data stream.

==Formal definition==
Formally, a data stream is any [[ordered pair]] <math> ( s, \Delta ) </math> where:
# <math> s </math> is a [[sequence]] of [[tuple]]s and
# <math>\Delta </math> is a sequence of positive [[Real number|real]] [[time interval]]s.

==Content==
A data stream contains different sets of data, depending on the chosen data format.
* '''Attributes''' – each attribute<ref>{{cite web|url=http://www.businessdictionary.com/definition/attribute.html|title=Attribute|website=businessdictionary.com|access-date=April 24, 2019|archive-url=https://web.archive.org/web/20190424083729/http://www.businessdictionary.com/definition/attribute.html|archive-date=April 24, 2019|url-status=live}}</ref> of the data stream represents a certain type of data, e.g. segment / data point ID, timestamp, [[Geographic data and information|geodata]].
* '''[[Timestamp]]''' attribute helps to identify when an event occurred.
* '''Subject ID''' is an algorithmically encoded ID that has been extracted from a [[magic cookie|cookie]].
* '''[[Raw Data]]''' includes information straight from the data provider that has not been processed by an algorithm or a human.
* '''[[Processed data|Processed Data]]''' is data that has been prepared<ref>{{cite web|url=https://ec.europa.eu/info/law/law-topic/data-protection/reform/what-constitutes-data-processing_en|title=What constitutes data processing?|website=ec.europa.eu|access-date=April 24, 2019|archive-url=https://web.archive.org/web/20190424083732/https://ec.europa.eu/info/law/law-topic/data-protection/reform/what-constitutes-data-processing_en|archive-date=April 24, 2019|url-status=live}}</ref> (modified, validated or cleaned) for use in future actions.

==Usage==
Data streams are used in various areas:
* '''[[Fraud]] detection & scoring''' – raw data is used as source data for an anti-fraud algorithm ([[data analysis techniques for fraud detection]]). For example, timestamps, cookie occurrences or analysis of data points are used within the scoring system to detect fraud or to make sure that a message receiver is not a bot (so-called Non-Human Traffic<ref>{{cite web|url=https://theonlineadvertisingguide.com/glossary/non-human-traffic/|title=Non-Human Traffic [NHT]|website=theonlineadvertisingguide.com|date=June 7, 2017 |access-date=April 24, 2019|archive-url=https://web.archive.org/web/20170813062942/http://theonlineadvertisingguide.com/glossary/non-human-traffic/|archive-date=August 13, 2017|url-status=live}}</ref>).
* '''[[Artificial intelligence]]''' – raw data is used as a training set and a test set when building AI and [[machine learning]] algorithms.
* '''[[Raw data]]''' is used for profiling and personalization, to customize user profiles<ref>{{cite web|url=https://www.selligent.com/blog/inspiration/behavioral-profiling-and-personalization-customer-experience-first|title=BEHAVIORAL PROFILING AND PERSONALIZATION: CUSTOMER EXPERIENCE FIRST|website=selligent.com|date=July 26, 2012 |access-date=April 24, 2019|archive-url=https://web.archive.org/web/20190424083739/https://www.selligent.com/blog/inspiration/behavioral-profiling-and-personalization-customer-experience-first|archive-date=April 24, 2019|url-status=live}}</ref> and divide them into segments, e.g., by gender or location (based on [[data point]]s).
* '''[[Business intelligence]]''' – raw data is a source of information for BI systems, used for enriching user profiles with detailed information such as purchase path or geodata. This information is used for [[business analysis]] and predictive research.
* '''Targeting''' – data processed by data scientists improves online campaigns and is used for reaching the target audience.<ref>{{cite web|url=https://sendpulse.com/support/glossary/targeting|title=What is Targeting – Meaning|website=sendpulse.com|access-date=April 24, 2019|archive-url=https://web.archive.org/web/20190424083726/https://sendpulse.com/support/glossary/targeting|archive-date=April 24, 2019|url-status=live}}</ref>
* '''CRM Enrichment''' – raw data is integrated with a [[customer-relationship management]] system. CRM integration makes it possible to fill the gaps in users' profiles with demographic data, interests or buying intentions.

==Integration==
Core integrations with data streams are:
* Data streams are integrated with systems such as a [[customer data platform]] (CDP), customer relationship management (CRM) or a [[data management platform]] (DMP) to enrich users' profiles with external data. External sources make it possible to expand the knowledge about existing users.<ref>{{cite web|url=http://www.onaudience.com/resources/what-is-data-stream-and-how-to-use-it/|title=What is Data Stream and how to use it|website=OnAudience.com|date=April 17, 2019 |access-date=April 24, 2019|archive-url=https://web.archive.org/web/20190424083725/http://www.onaudience.com/resources/what-is-data-stream-and-how-to-use-it/|archive-date=April 24, 2019|url-status=live}}</ref>
* Data streams are used to enrich business intelligence systems, making analysis more precise and conclusions more accurate.
* In the case of [[content management system]] (CMS) integration, a data stream is used to identify users and personalize their visit, even if it is their first one. Based on data analysis, the actual content of the website is adapted to the user.
* Data streams are integrated with a [[demand side platform]] (DSP) within the programmatic advertising ecosystem. Parties (e.g., advertisers) can exchange users' IDs and combine them with existing profiles.
* Data streams are used to choose relevant user segments (e.g., people interested in the automotive industry) and use them in an online campaign. Segments are enriched with more user characteristics from the data stream and then sent to the DSP.

==Data sources visible==
A data stream shows which device was used on the user side; this is visible in the [[user agent]]:
* '''mobile''' – when a user browses with a mobile browser, which has a narrow screen resolution, or with a mobile app version;
* '''desktop''' – when a user uses a desktop browser or app version.
The following information is shared by the device used:
* Actual URL of the visited website where an event occurred
* User agent
* [[Geolocation]]
* [[Internet Protocol]] (IP) address

==Formats==
A '''[[data point]]''' is a tag that collects information about a certain action performed by a user on a website. Data points exist in two types, the values of which are used to create appropriate audiences:
* 'event' with information about occurrences of a specific event (e.g., a click on a link or the display of an ad)
* 'attribute' with numerical or alphanumerical values.
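
The two data point types can be illustrated with a minimal Python sketch. The class <code>DataPoint</code>, its field names, the helper <code>count_events</code> and the sample IDs are hypothetical and chosen only for illustration; they do not come from any particular data stream provider.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class DataPoint:
    """One hypothetical data point record; all field names are assumptions."""
    point_id: str   # ID of the tag that collected the information
    kind: str       # "event" or "attribute"
    value: Union[int, float, str, None] = None  # set only for attributes

    def __post_init__(self):
        # Reject anything other than the two types described in the text.
        if self.kind not in ("event", "attribute"):
            raise ValueError(f"unknown data point type: {self.kind!r}")
        # An attribute carries a numerical or alphanumerical value.
        if self.kind == "attribute" and self.value is None:
            raise ValueError("an attribute data point needs a value")


def count_events(points, point_id):
    """Count the occurrences of one event-type data point."""
    return sum(1 for p in points if p.kind == "event" and p.point_id == point_id)


points = [
    DataPoint("ad_display", "event"),
    DataPoint("link_click", "event"),
    DataPoint("link_click", "event"),
    DataPoint("age_group", "attribute", "25-34"),
]
print(count_events(points, "link_click"))  # prints 2
```

In this sketch an event records only that something happened, so events are counted, while an attribute is read through its value.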
'''Segment''' is a logical statement built on specific data points using AND, OR or NOT operators.<ref>{{cite web|url=https://uxdesign.cc/how-to-think-segmentation-from-day-1-f714df093ccb|title=The 6 types of user segmentation and what they mean for your product|website=uxdesign.cc|date=June 12, 2018 }}</ref><br/>
'''Hybrid data''' – raw data from both the Data Point and Segment data formats.<ref>{{cite web|url=https://www.ibm.com/analytics/data-management/resources/what-is-hybrid-data-management/|title=What is hybrid data management|website=ibm.com|date=January 2, 2018 |access-date=April 24, 2019|archive-url=https://web.archive.org/web/20190424083730/https://www.ibm.com/analytics/data-management/resources/what-is-hybrid-data-management/|archive-date=April 24, 2019|url-status=live}}</ref><br/>
'''URLs''' – a set of information about a particular [[URL]] that has been visited.

==GDPR==
Information gathered from websites is based on user behavior. Data providers deliver both personal and non-personal information. There are two types of user data available in a data stream:
* '''[[Personally identifiable information]]''' (PII) – information that, directly or in combination with data identification methods, allows a person to be identified. Examples of PII include an insurance ID, email address, phone number, [[IP address]], geolocation and [[biometric data]].<ref>{{cite web|url=https://www.csoonline.com/article/3215864/how-to-protect-personally-identifiable-information-pii-under-gdpr.html|title=What is personally identifiable information (PII)? How to protect it under GDPR|website=csoonline.com|access-date=April 24, 2019|archive-url=https://web.archive.org/web/20190424083727/https://www.csoonline.com/article/3215864/how-to-protect-personally-identifiable-information-pii-under-gdpr.html|archive-date=April 24, 2019|url-status=live}}</ref>
* '''Non-personally identifiable information''' (non-PII) is information that cannot be used to identify a person or to track a location. A cookie or a device ID is an example of non-PII.

==See also==
* [[Streaming algorithm]]

== References ==
{{reflist}}

{{DEFAULTSORT:Data Stream}}
[[Category:Computing terminology]]
[[Category:Big data]]
[[Category:Business analysis]]