Extract, transform, load
=== Data variations ===
The range of data values or data quality in an operational system may exceed the expectations of designers at the time validation and transformation rules are specified. [[Data profiling]] of a source during data analysis can identify the data conditions that the transformation rule specifications must handle, which leads to amending the validation rules that the ETL process implements explicitly or implicitly.

Data warehouses are typically assembled from a variety of data sources with different formats and purposes. As such, ETL is a key process to bring all the data together in a standard, homogeneous environment.

Design analysis<ref>{{Cite journal|last=Theodorou|first=Vasileios|date=2017|title=Frequent patterns in ETL workflows: An empirical approach|journal=Data & Knowledge Engineering|volume=112|pages=1–16|doi=10.1016/j.datak.2017.08.004|hdl=2117/110172|hdl-access=free}}</ref> should establish the [[scalability]] of an ETL system across the lifetime of its usage, including understanding the volumes of data that must be processed within [[service level agreement]]s. The time available to extract from source systems may change, which may mean that the same amount of data has to be processed in less time.

Some ETL systems have to scale to process terabytes of data to update data warehouses with tens of terabytes of data. Increasing volumes of data may require designs that can scale from daily [[batch processing|batch]] to micro-batches run several times a day, and on to integration with [[message queue]]s or real-time change-data-capture for continuous transformation and update.
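
As an illustration of the data profiling step described at the start of this section, the following is a minimal sketch rather than a reference implementation. It counts how many source values fall outside what a validation rule expects, so that the rule can be amended before the ETL job runs. The function name <code>profile_column</code>, the sample rows, and the chosen bounds are all hypothetical and do not come from any particular ETL tool.

```python
from collections import Counter


def profile_column(rows, column, expected_type=float, min_value=None, max_value=None):
    """Count how values in one column deviate from the designers' expectations.

    Hypothetical helper for illustration only: each value is placed in exactly
    one bucket (missing, wrong_type, below_min, above_max, or ok).
    """
    stats = Counter()
    for row in rows:
        raw = row.get(column)
        if raw is None or raw == "":
            stats["missing"] += 1
            continue
        try:
            value = expected_type(raw)
        except (TypeError, ValueError):
            # The value cannot be converted to the type the rule assumes.
            stats["wrong_type"] += 1
            continue
        if min_value is not None and value < min_value:
            stats["below_min"] += 1
        elif max_value is not None and value > max_value:
            stats["above_max"] += 1
        else:
            stats["ok"] += 1
    return dict(stats)


# Assumed sample extract from an operational system (made-up data).
rows = [
    {"order_id": "1", "quantity": "3"},
    {"order_id": "2", "quantity": ""},
    {"order_id": "3", "quantity": "-5"},
    {"order_id": "4", "quantity": "twelve"},
    {"order_id": "5", "quantity": "12000"},
]

# Assumed validation rule: quantity is an integer between 0 and 1000.
report = profile_column(rows, "quantity", expected_type=int, min_value=0, max_value=1000)
print(report)
```

In this made-up extract only one of the five rows satisfies the assumed rule. The other four each expose a different condition (a missing value, a negative value, a non-numeric value, and an unexpectedly large value) that the transformation rules would need to handle.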