OLAP cube
{{Short description|Multidimensional data array organized for rapid analysis}}
[[File:OLAP Cube.svg|thumb|An example of an OLAP cube]]
An '''OLAP cube''' is a [[multi-dimensional array]] of data.<ref>{{cite conference |author1=Gray, Jim |author2=Bosworth, Adam | author3=Layman, Andrew | author4=Pirahesh, Hamid |title=Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals | book-title=Proceedings of the International Conference on Data Engineering (ICDE) | year=1996 | pages=152–159 |doi=10.1109/ICDE.1996.492099 | arxiv=cs/0701155 }}</ref> [[Online analytical processing]] (OLAP)<ref>{{cite web|url=https://support.office.com/en-us/article/overview-of-online-analytical-processing-olap-15d2cdde-f70b-4277-b009-ed732b75fdd6 |title=Overview of Online Analytical Processing (OLAP) |publisher=support.office.com |access-date=2018-09-08}}</ref> is a computer-based technique of analyzing data to look for insights. The term ''cube'' here refers to a multi-dimensional dataset, which is also sometimes called a [[hypercube]] if the number of dimensions is greater than three.

== Terminology ==
A cube can be considered a multi-dimensional generalization of a two- or three-dimensional [[spreadsheet]]. For example, a company might wish to summarize financial data by product, by time-period, and by city to compare actual and budget expenses. Product, time, city and scenario (actual and budget) are the data's dimensions.<ref>{{cite web | url=http://www.postgresql.org/about/news.653 | title=Cybertec releases OLAP cubes for PostgreSQL | publisher=PostgreSQL | date=2006-10-02 | access-date=2008-03-05 | archive-url=https://web.archive.org/web/20130630034832/http://www.postgresql.org/about/news/653/ | archive-date=2013-06-30 | url-status=dead }}</ref>

''Cube'' is a shorthand for ''multidimensional dataset'', given that data can have an arbitrary number of ''[[Dimension (data warehouse)|dimensions]]''.
The term [[hypercube]] is sometimes used, especially for data with more than three dimensions. A cube is not a "cube" in the strict mathematical sense, as its sides are not necessarily all equal, but the term is widely used.

A ''slice'' is a subset of the data, generated by picking a value for one dimension and showing only the data for that value (for instance, only the data at one point in time). Because spreadsheets are only two-dimensional, repeated slicing or other techniques are needed to visualise multidimensional data in them.

Each cell of the cube holds a number that represents some ''measure'' of the business, such as sales, profits, expenses, budget and forecast.

OLAP data is typically stored in a [[star schema]] or [[snowflake schema]] in a [[relational database|relational]] [[data warehouse]] or in a special-purpose data management system. Measures are derived from the records in the [[fact table]] and dimensions are derived from the [[dimension table]]s.

== Hierarchy ==
The elements of a dimension can be organized as a [[hierarchy]],<ref>{{cite web | url=http://www.lorentzcenter.nl/awcourse/oracle/server.920/a96520/glossary.htm#432038 | title=Oracle9i Data Warehousing Guide hierarchy | publisher=Lorentz Center | access-date=2008-03-05 }}</ref> a set of parent-child relationships, typically where a parent member summarizes its children. Parent elements can further be aggregated as the children of another parent.<ref name=OLAPGlossary1995>{{cite web | url=http://www.olapcouncil.org/research/glossaryly.htm | title=OLAP and OLAP Server Definitions | publisher=The OLAP Council | year=1995 | access-date=2008-03-18 }}</ref>

For example, May 2005's parent is Second Quarter 2005, which is in turn the child of Year 2005. Similarly, cities are the children of regions; products roll into product groups, and individual expense items into types of expenditure.
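
The parent-child summarization described in the hierarchy section can be illustrated with a minimal Python sketch. It assumes a time hierarchy stored as a plain child-to-parent mapping and uses summation as the summarization rule; the names <code>PARENT</code>, <code>monthly</code> and <code>roll_up</code> and all figures are hypothetical and are not taken from any OLAP product.

```python
from collections import defaultdict

# Hypothetical time hierarchy: each member maps to its parent member.
PARENT = {
    "Apr 2005": "Q2 2005",
    "May 2005": "Q2 2005",
    "Jun 2005": "Q2 2005",
    "Q2 2005": "2005",
}

# Made-up leaf-level measure values (for example, expenses per month).
monthly = {"Apr 2005": 100, "May 2005": 120, "Jun 2005": 90}


def roll_up(values, parent):
    """Add each member's value to the total of its parent member."""
    totals = defaultdict(int)
    for member, value in values.items():
        totals[parent[member]] += value
    return dict(totals)


quarterly = roll_up(monthly, PARENT)   # months -> quarters
yearly = roll_up(quarterly, PARENT)    # quarters -> years
```

Applying the same function twice mirrors the statement that parent elements can themselves be aggregated as the children of another parent.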
== Operations ==
Conceiving data as a cube with hierarchical dimensions leads to conceptually straightforward operations to facilitate analysis. Aligning the data content with a familiar visualization enhances analyst learning and productivity.<ref name=OLAPGlossary1995/> The user-initiated process of navigating by calling for page displays interactively, through the specification of slices via rotations and drill down/up, is sometimes called "slice and dice". Common operations include slice and dice, drill down, roll up, and pivot.

[[File:OLAP slicing en.png|thumb|OLAP slicing|350x350px]]
''Slice'' is the act of picking a rectangular subset of a cube by choosing a single value for one of its dimensions, creating a new cube with one fewer dimension.<ref name=OLAPGlossary1995/> The picture shows a slicing operation: The sales figures of all sales regions and all product categories of the company in the years 2005 and 2006 are "sliced" out of the data cube.
{{Clear}}
[[File:OLAP dicing en.png|thumb|OLAP dicing|350x350px]]
''Dice'': The dice operation produces a subcube by allowing the analyst to pick specific values of multiple dimensions.<ref>{{cite web | url=http://www.cs.ualberta.ca/~zaiane/courses/cmput690/glossary.html | title=Glossary of Data Mining Terms | publisher=University of Alberta | year=1999 | access-date=2008-03-17 }}</ref> The picture shows a dicing operation: The new cube shows the sales figures of a limited number of product categories, while the time and region dimensions cover the same range as before.
{{Clear}}
[[File:OLAP drill up&down en.png|thumb|alt=OLAP-functionalities|OLAP drill-up and drill-down|350x350px]]
''Drill Down/Up'' allows the user to navigate among levels of data ranging from the most summarized (up) to the most detailed (down).<ref name=OLAPGlossary1995/> The picture shows a drill-down operation: The analyst moves from the summary category "Outdoor protective equipment" to see the sales figures for the individual products.
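
The slice and dice operations described above can be sketched in Python under the assumption that a cube is stored as a dictionary from coordinate tuples to a measure. The dimension names, the helper functions <code>slice_cube</code> and <code>dice_cube</code>, and the sales figures are all made up for illustration; real OLAP servers use their own storage and query interfaces.

```python
# Hypothetical cube: (product, year, region) -> sales. All figures are made up.
DIMS = ("product", "year", "region")
cube = {
    ("tents", 2005, "north"): 10,
    ("tents", 2005, "south"): 20,
    ("tents", 2006, "north"): 30,
    ("boots", 2005, "north"): 40,
    ("boots", 2006, "south"): 50,
    ("stoves", 2006, "north"): 60,
}


def slice_cube(cube, dim, value):
    """Fix one dimension to a single value and drop it from the coordinates."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}


def dice_cube(cube, **allowed):
    """Keep cells whose coordinates lie in the allowed value sets.

    The number of dimensions is unchanged; only their ranges shrink.
    """
    wanted = {DIMS.index(d): set(vals) for d, vals in allowed.items()}
    return {k: v for k, v in cube.items()
            if all(k[i] in vals for i, vals in wanted.items())}


sales_2005 = slice_cube(cube, "year", 2005)
subcube = dice_cube(cube, product={"tents", "boots"}, region={"north"})
```

In this sketch the slice has two-element keys (one fewer dimension), whereas the diced subcube keeps three-element keys.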
{{Clear}}
''Roll-up'': A roll-up involves summarizing the data along a dimension. The summarization rule might be an [[aggregate function]], such as computing totals along a hierarchy or applying a set of formulas such as "profit = sales - expenses".<ref name=OLAPGlossary1995/> General aggregation functions may be costly to compute when rolling up: if they cannot be determined from the cells of the cube, they must be computed from the base data, either by computing them online (slow) or by precomputing them for possible roll-ups (large space). Aggregation functions that can be determined from the cells are known as [[decomposable aggregation function]]s, and allow efficient computation.{{sfn|Zhang|2017|p=1}} For example, it is easy to support <code>COUNT, MAX, MIN,</code> and <code>SUM</code> in OLAP, since these can be computed for each cell of the OLAP cube and then rolled up: an overall sum (or count etc.) is the sum of sub-sums. It is difficult to support <code>MEDIAN</code>, however, as that must be computed for every view separately: the median of a set is not the median of medians of subsets.
{{Clear}}
[[File:OLAP pivoting en.png|thumb|OLAP pivoting|350x350px]]
''[[Pivot table|Pivot]]'' allows an analyst to rotate the cube in space to see its various faces. For example, cities could be arranged vertically and products horizontally while viewing data for a particular quarter. Pivoting could replace products with time periods to see data across time for a single product.<ref name=OLAPGlossary1995/><ref>{{cite web |url=http://www.answers.com/topic/multidimensional-views?cat=technology | title=Computer Encyclopedia: multidimensional views |publisher=Answers.com | access-date=2008-03-05 }}</ref> The picture shows a pivoting operation: The whole cube is rotated, giving another perspective on the data.
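
The difference between decomposable and non-decomposable aggregates noted under roll-up can be checked with a small Python sketch. The data are invented, and each city is assumed to be one cell holding its base records; only the standard-library <code>statistics.median</code> function is used.

```python
from statistics import median

# Made-up base data: individual sale amounts, one cell per city.
cells = {
    "Berlin": [1, 2, 100],
    "Munich": [3, 4, 5, 6, 7],
}

all_values = [v for values in cells.values() for v in values]

# SUM is decomposable: the overall total equals the sum of per-cell sums.
cell_sums = {city: sum(values) for city, values in cells.items()}
total_from_cells = sum(cell_sums.values())
total_from_base = sum(all_values)

# MEDIAN is not: here the median of per-cell medians differs from the
# median computed over the base data.
cell_medians = {city: median(values) for city, values in cells.items()}
median_of_medians = median(cell_medians.values())
true_median = median(all_values)
```

With these figures both totals are 128, while the median of the cell medians is 3.5 and the median of the base data is 4.5, so a rolled-up median would have to be recomputed from the underlying records.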
{{Clear}}

== Mathematical definition ==
{{refimprove section|date=July 2012}}
In [[database theory]], an OLAP cube is<ref name="DataCubeGray1995" >{{cite web | url=http://research.microsoft.com/~gray/DataCube.doc | title=Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals | access-date=2008-11-09 | date=1995-11-18 | last=Gray | first=Jim | author-link=Jim Gray (computer scientist) |author2=Bosworth, Adam |author3=Layman, Andrew |author4= Pirahesh, Hamid | work=Proc. 12th International Conference on Data Engineering | publisher=IEEE | pages=152–159 }}</ref> an abstract representation of a [[projection (relational algebra)|projection]] of an [[RDBMS]] relation. Given a [[relation (database)|relation]] of order ''N'', consider a projection that subtends ''X'', ''Y'', and ''Z'' as the key and ''W'' as the [[Errors and residuals in statistics|residual]] [[Attribute (computing)|attribute]]. Characterizing this as a [[function (mathematics)|function]],

:''f'' : (''X'',''Y'',''Z'') → ''W'',

the attributes ''X'', ''Y'', and ''Z'' correspond to the axes of the cube, while the ''W'' value corresponds to the data element that populates each cell of the cube.

Insofar as two-dimensional output devices cannot readily characterize three dimensions, it is more practical to project "slices" of the data cube (we say ''project'' in the classic vector analytic sense of dimensional reduction, not in the [[SQL]] sense, although the two are conceptually similar),

:''g'' : (''X'',''Y'') → ''W''

which may suppress a primary key, but still have some semantic significance, perhaps a slice of the triadic functional representation for a given ''Z'' value of interest.

The motivation<ref name=DataCubeGray1995/> behind [[OLAP]] displays harks back to the ''cross-tabbed report'' paradigm of 1980s [[DBMS]], and to earlier [[contingency table]]s from 1904.
The result is a spreadsheet-style display, where values of ''X'' populate row $1; values of ''Y'' populate column $A; and values of ''g'' : ( ''X'', ''Y'' ) → ''W'' populate the individual cells at intersections of ''X''-labeled columns and ''Y''-labeled rows, "southeast", so to speak, of $B$2, with $B$2 itself included.

== See also ==
{{columns-list|colwidth=22em|
* [[Business intelligence]]
* [[Comparison of OLAP servers]]
* [[Data cube]]
* [[Data mart]]
* [[Data mining]]
* [[Data Mining Extensions]]
* [[Fast Analysis of Shared Multidimensional Information]]
* [[Multidimensional Expressions]]
* [[XML for Analysis]]
}}

== References ==
{{Reflist|30em}}
{{refbegin}}
* {{cite tech report |first=Chao |last=Zhang |year=2017 |title=Symmetric and Asymmetric Aggregate Function in Massively Parallel Computing |url=https://hal.archives-ouvertes.fr/hal-01533675/ }}
{{refend}}

== External links ==
* {{cite web | url=http://www.daniel-lemire.com/OLAP/ | title=Data Warehousing and OLAP - A Research-Oriented Bibliography | author=Daniel Lemire | date=December 2007 | access-date=2008-03-05 | archive-url=https://archive.today/20130102171555/http://lemire.me/OLAP/ | archive-date=2013-01-02 | url-status=dead }}
* [http://www.w3.org/TR/vocab-data-cube/ The RDF Data Cube Vocabulary]
* [https://docs.microsoft.com/en-us/azure/architecture/data-guide/relational-data/online-analytical-processing Microsoft Azure: Online analytical processing (OLAP)]

{{Data warehouse}}

{{DEFAULTSORT:Olap Cube}}
[[Category:Online analytical processing]]
[[Category:Data warehousing]]

[[fr:Hypercube OLAP]]