Snowflake schema
== Data normalization and storage ==
[[Database normalization|Normalization]] splits up data to avoid redundancy (duplication) by moving commonly repeating groups of data into new tables. Normalization therefore tends to increase the number of tables that must be joined to perform a given query, but it reduces both the space required to hold the data and the number of places where the data must be updated when it changes.

From a storage point of view, dimensional tables are typically small compared to fact tables. This often negates the potential storage-space benefits of the snowflake schema as compared to the star schema.

Example: One million sales transactions in 300 shops in 220 countries would result in 1,000,300 records in a star schema (1,000,000 records in the fact table and 300 records in the dimensional table, where the country is listed explicitly for each shop in that country). A more normalized snowflake schema, with country keys referring to a country table, would consist of the same 1,000,000-record fact table, a 300-record shop table, and a 220-record country table referenced by the shop table, for 1,000,520 records in total. In this case, the star schema, although further denormalized, would reduce the number of records by only a negligible ~0.02% (1,000,300 instead of 1,000,520, a difference of 220 records).

Some database developers compromise by creating an underlying snowflake schema with [[View (database)|views]] built on top of it that perform many of the necessary joins to simulate a star schema. This provides the storage benefits achieved through the normalization of dimensions together with the ease of querying that the star schema provides. The tradeoff is that requiring the server to perform the underlying joins automatically can result in a performance hit when querying, as well as extra joins to tables that may not be necessary to fulfill certain queries.{{citation needed|date=October 2012}}
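
The view-based compromise described above can be illustrated with a minimal sketch using Python's standard <code>sqlite3</code> module. All table, column, and view names (<code>country</code>, <code>shop</code>, <code>sales</code>, <code>shop_dim</code>) and the sample rows are hypothetical and chosen only for illustration; a real warehouse would differ in scale and engine. The sketch stores the shop dimension in normalized (snowflake) form and exposes it through a view that looks like a single star-schema dimension.

```python
import sqlite3


def build_snowflake(conn):
    """Create a tiny snowflake schema plus a star-like view (all names hypothetical)."""
    conn.executescript("""
        CREATE TABLE country (
            country_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE shop (
            shop_id    INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            country_id INTEGER NOT NULL REFERENCES country(country_id)
        );
        CREATE TABLE sales (
            sale_id INTEGER PRIMARY KEY,
            shop_id INTEGER NOT NULL REFERENCES shop(shop_id),
            amount  REAL NOT NULL
        );
        -- The view hides the shop-to-country join, so queries see one
        -- denormalized shop dimension, as they would in a star schema.
        CREATE VIEW shop_dim AS
            SELECT s.shop_id, s.name AS shop_name, c.name AS country_name
            FROM shop AS s
            JOIN country AS c ON c.country_id = s.country_id;
    """)
    conn.executemany("INSERT INTO country VALUES (?, ?)",
                     [(1, "France"), (2, "Japan")])
    conn.executemany("INSERT INTO shop VALUES (?, ?, ?)",
                     [(10, "Paris Nord", 1), (11, "Lyon Centre", 1), (20, "Tokyo Bay", 2)])
    conn.executemany("INSERT INTO sales (shop_id, amount) VALUES (?, ?)",
                     [(10, 5.0), (11, 7.5), (20, 3.0), (20, 4.5)])


def sales_by_country(conn):
    """Query the fact table against the view as if it were a star dimension."""
    return conn.execute("""
        SELECT d.country_name, SUM(f.amount)
        FROM sales AS f
        JOIN shop_dim AS d ON d.shop_id = f.shop_id
        GROUP BY d.country_name
        ORDER BY d.country_name
    """).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_snowflake(conn)
    for country_name, total in sales_by_country(conn):
        print(country_name, total)
```

The query in <code>sales_by_country</code> names only one dimension object, yet the database still joins <code>shop</code> to <code>country</code> each time the view is used. A query that needs only <code>shop_name</code> may pay for that join as well, depending on the query optimizer, which is the performance tradeoff noted above.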