Extract, transform, load
=== Parallel computing ===
{{Unreferenced section|date=September 2024}}
Some ETL software implementations include [[Parallel computing|parallel processing]], which enables several methods for improving overall ETL performance when dealing with large volumes of data.

ETL applications implement three main types of parallelism:
* Data: splitting a single sequential file into smaller data files to provide [[Parallel Random Access Machine|parallel access]]
* [[pipeline (computing)|Pipeline]]: running several components simultaneously on the same [[data stream]], e.g. looking up a value on record 1 at the same time as adding two fields on record 2
* Component: running multiple [[process (computing)|processes]] simultaneously on different data streams in the same job, e.g. sorting one input file while removing duplicates on another file

All three types of parallelism are usually combined in a single job or task.

An additional difficulty is making sure that the data being uploaded is relatively consistent. Because multiple source databases may have different update cycles (some may be updated every few minutes, while others may take days or weeks), an ETL system may be required to hold back certain data until all sources are synchronized. Likewise, where a warehouse may have to be reconciled to the contents of a source system or with the general ledger, establishing synchronization and reconciliation points becomes necessary.
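
The three types of parallelism described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular ETL product works: the record layout, the <code>LOOKUP</code> table, and all function names are hypothetical, and threads stand in for the worker processes or nodes a real tool would use.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue
from threading import Thread

# Hypothetical sample data; the field names are illustrative only.
RECORDS = [{"id": i, "a": i, "b": 2 * i} for i in range(8)]
LOOKUP = {i: f"name-{i}" for i in range(8)}


def add_fields(rec):
    """Transformation step: add two fields of a record."""
    return {**rec, "total": rec["a"] + rec["b"]}


def look_up(rec):
    """Transformation step: look up a value for a record."""
    return {**rec, "name": LOOKUP[rec["id"]]}


# Data parallelism: split one input into chunks and transform them concurrently.
def data_parallel(records, parts=4):
    chunks = [records[i::parts] for i in range(parts)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        results = pool.map(lambda chunk: [add_fields(r) for r in chunk], chunks)
    merged = [rec for chunk in results for rec in chunk]
    return sorted(merged, key=lambda r: r["id"])


# Pipeline parallelism: each stage runs in its own thread on the same stream,
# so stage 2 can process record 1 while stage 1 is already on record 2.
_DONE = object()  # sentinel marking the end of the stream


def _stage(func, inbox, outbox):
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(func(item))


def pipeline_parallel(records):
    q_in, q_mid, q_out = Queue(), Queue(), Queue()
    workers = [
        Thread(target=_stage, args=(look_up, q_in, q_mid)),
        Thread(target=_stage, args=(add_fields, q_mid, q_out)),
    ]
    for worker in workers:
        worker.start()
    for rec in records:
        q_in.put(rec)
    q_in.put(_DONE)
    output = []
    while (item := q_out.get()) is not _DONE:
        output.append(item)
    for worker in workers:
        worker.join()
    return output


# Component parallelism: unrelated components run at the same time on
# different inputs, here sorting one input while de-duplicating another.
def dedupe(values):
    return list(dict.fromkeys(values))  # keeps first occurrence, preserves order


def component_parallel(input_a, input_b):
    with ThreadPoolExecutor(max_workers=2) as pool:
        sorted_a = pool.submit(sorted, input_a)
        unique_b = pool.submit(dedupe, input_b)
        return sorted_a.result(), unique_b.result()
```

In standard CPython, threads mainly show the structure of each approach rather than deliver a CPU speed-up; a production job would more likely rely on separate processes or a distributed engine. A single job could combine all three, for example by running a pipeline like the one above on each data partition.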