Data engineering
== History ==

Around the 1970s and 1980s the term '''information engineering methodology''' (IEM) was created to describe [[database design]] and the use of [[software]] for data analysis and processing.<ref name="hist1">{{cite web |last1=Black |first1=Nathan |title=What is Data Engineering and Why Is It So Important? |url=https://quanthub.com/what-is-data-engineering/ |website=QuantHub |access-date=31 July 2022 |date=15 January 2020}}</ref> These techniques were intended to be used by [[database administrator]]s (DBAs) and by [[systems analyst]]s based upon an understanding of the operational processing needs of organizations for the 1980s. In particular, these techniques were meant to help bridge the gap between strategic business planning and information systems. A key early contributor (often called the "father" of information engineering methodology) was the Australian [[Clive Finkelstein]], who wrote several articles about it between 1976 and 1980, and also co-authored an influential [[Savant Institute]] report on it with James Martin.<ref>"Information engineering," [https://books.google.com/books?id=U2Da-O9RAgIC&pg=PA29 part 3], [https://books.google.com/books?id=aMrnCDJzb9MC&pg=RA1-PA1 part 4], [https://books.google.com/books?id=Ux9iw6tMs6MC&pg=PA32 part 5], [https://books.google.com/books?id=dPLZ7QidjbEC&pg=RA1-PA1 part 6] by Clive Finkelstein. In ''Computerworld, In depths, appendix.'' May 25 – June 15, 1981.</ref><ref>Christopher Allen, Simon Chatwin, Catherine Creary (2003). ''Introduction to Relational Databases and SQL Programming.''</ref><ref>[[Terry Halpin]], [[Tony Morgan (computer scientist)|Tony Morgan]] (2010). ''Information Modeling and Relational Databases.'' p. 343</ref>

Over the next few years, Finkelstein continued work in a more business-driven direction, which was intended to address a rapidly changing business environment; Martin continued work in a more data processing-driven direction. From 1983 to 1987, Charles M. Richter, guided by Clive Finkelstein, played a significant role in revamping IEM as well as helping to design the IEM software product (user data), which helped automate IEM.

In the early 2000s, data and data tooling were generally held by the [[information technology]] (IT) teams in most companies.<ref name="hist2">{{cite web |last1=Dodds |first1=Eric |title=The History of the Data Engineering and the Megatrends |url=https://www.rudderstack.com/blog/the-data-engineering-megatrend-a-brief-history |website=Rudderstack |access-date=31 July 2022}}</ref> Other teams then used data for their work (e.g. reporting), and there was usually little overlap in data skillset between these parts of the business.

In the early 2010s, with the rise of the [[internet]], the massive increase in data volumes, velocity, and variety led to the term [[big data]] to describe the data itself, and data-driven tech companies like [[Facebook]] and [[Airbnb]] started using the phrase '''data engineer'''.<ref name="hist1" /><ref name="hist2" /> Due to the new scale of the data, major firms like [[Google]], Facebook, [[Amazon (company)|Amazon]], [[Apple Inc.|Apple]], [[Microsoft]], and [[Netflix]] started to move away from traditional [[Extract transform load|ETL]] and storage techniques. They started creating '''data engineering''', a type of [[software engineering]] focused on data, and in particular [[data infrastructure|infrastructure]], [[data warehouse|warehousing]], [[Information privacy|data protection]], [[cybersecurity]], [[data mining|mining]], [[data modelling|modelling]], [[data processing|processing]], and [[metadata]] management.<ref name="hist1" /><ref name="hist2" /> This change in approach was particularly focused on [[cloud computing]].<ref name="hist2" /> Data started to be handled and used by many parts of the business, such as [[sales]] and [[marketing]], and not just IT.<ref name="hist2" />