===1970s, relational DBMS===

[[Edgar F. Codd]] worked at IBM in [[San Jose, California]], in an office primarily involved in the development of [[hard disk]] systems.{{r|rdbmsearlyyearsoh20070612}} He was unhappy with the navigational model of the CODASYL approach, notably the lack of a "search" facility. In 1970, he wrote a number of papers that outlined a new approach to database construction that eventually culminated in the groundbreaking ''A Relational Model of Data for Large Shared Data Banks''.{{sfn|Codd|1970}}

The paper described a new system for storing and working with large databases. Instead of records being stored in some sort of [[linked list]] of free-form records as in CODASYL, Codd's idea was to organize the data as a number of "[[Table (database)|tables]]", each table being used for a different type of entity. Each table would contain a fixed number of columns containing the attributes of the entity. One or more columns of each table were designated as a [[primary key]] by which the rows of the table could be uniquely identified. Cross-references between tables always used these primary keys, rather than disk addresses, and queries would join tables based on these key relationships, using a set of operations based on the mathematical system of [[relational calculus]] (the model takes its name from the mathematical concept of a ''relation''). Splitting the data into a set of normalized tables (or ''relations'') aimed to ensure that each "fact" was stored only once, thus simplifying update operations. Virtual tables called ''views'' could present the data in different ways for different users, but views could not be directly updated.

Codd used mathematical terms to define the model: relations, tuples, and domains rather than tables, rows, and columns. The terminology that is now familiar came from early implementations. Codd would later criticize the tendency for practical implementations to depart from the mathematical foundations on which the model was based.
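The table-and-key organization described above can be illustrated with a minimal sketch in Python. The relation names, columns, and sample rows are hypothetical and do not come from Codd's paper, and the join is a naive nested loop written for clarity rather than a model of how any real DBMS executes queries.

```python
# Hypothetical relations, each modelled as a set of fixed-width tuples.
# employee(emp_id, name, dept_id): emp_id is the primary key.
employee = {
    (1, "Ada", 10),
    (2, "Grace", 20),
    (3, "Edgar", 10),
}
# department(dept_id, dept_name): dept_id is the primary key.
department = {
    (10, "Research"),
    (20, "Engineering"),
}


def join_on_key(left, left_col, right, right_col):
    """Naive equi-join: concatenate rows whose key columns hold equal values."""
    return {
        lrow + rrow
        for lrow in left
        for rrow in right
        if lrow[left_col] == rrow[right_col]
    }


def project(relation, columns):
    """Keep only the given column positions; the result is again a set of tuples."""
    return {tuple(row[c] for c in columns) for row in relation}


# Cross-reference employees to departments through the dept_id value,
# not through any storage address.
joined = join_on_key(employee, 2, department, 0)
# Joined rows are (emp_id, name, dept_id, dept_id, dept_name).
names_and_depts = project(joined, [1, 4])
```

Because the department name is stored only once, renaming a department means changing a single tuple in `department`; every employee row that refers to it by key sees the new name on the next join.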
[[File:Relational key SVG.svg|thumb|In the [[relational model]], records are "linked" using virtual keys not stored in the database but defined as needed between the data contained in the records.]]

The use of primary keys (user-oriented identifiers) to represent cross-table relationships, rather than disk addresses, had two primary motivations. From an engineering perspective, it enabled tables to be relocated and resized without expensive database reorganization. But Codd was more interested in the difference in semantics: the use of explicit identifiers made it easier to define update operations with clean mathematical definitions, and it also enabled query operations to be defined in terms of the established discipline of [[first-order predicate calculus]]. Because these operations have clean mathematical properties, it becomes possible to rewrite queries in provably correct ways, which is the basis of query optimization. There is no loss of expressiveness compared with the hierarchic or network models, though the connections between tables are no longer so explicit.

In the hierarchic and network models, records were allowed to have a complex internal structure. For example, the salary history of an employee might be represented as a "repeating group" within the employee record. In the relational model, the process of normalization led to such internal structures being replaced by data held in multiple tables, connected only by logical keys.

For instance, a common use of a database system is to track information about users: their names, login information, and various addresses and phone numbers. In the navigational approach, all of this data would be placed in a single variable-length record. In the relational approach, the data would be ''normalized'' into a user table, an address table and a phone number table (for instance). Records would be created in these optional tables only if the address or phone numbers were actually provided.
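The user, address and phone number example above can be sketched with Python's built-in <code>sqlite3</code> module. SQLite and SQL postdate Codd's paper and are used here only as a convenient stand-in for a relational system; the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        login   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE addresses (
        address_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        street     TEXT NOT NULL
    );
    CREATE TABLE phones (
        phone_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(user_id),
        number   TEXT NOT NULL
    );
""")

# Every user gets a row in users; rows in the optional tables exist
# only when an address or phone number was actually provided.
conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                 [(1, "Ada", "ada"), (2, "Grace", "grace")])
conn.execute("INSERT INTO addresses (user_id, street) VALUES (1, '1 Example Street')")
conn.executemany("INSERT INTO phones (user_id, number) VALUES (?, ?)",
                 [(1, "555-0100"), (1, "555-0101")])

# The tables are connected only by the logical key user_id.
# COUNT skips the NULLs produced by the LEFT JOIN, so a user with no phones yields 0.
rows = conn.execute("""
    SELECT u.name, COUNT(p.phone_id)
    FROM users AS u
    LEFT JOIN phones AS p ON p.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()
```

In this sketch the second user has no rows at all in <code>addresses</code> or <code>phones</code>, in contrast to a single variable-length record that would have to carry empty repeating groups.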
As well as identifying rows/records using logical identifiers rather than disk addresses, Codd changed the way in which applications assembled data from multiple records. Rather than requiring applications to gather data one record at a time by navigating the links, they would use a declarative query language that expressed what data was required, rather than the access path by which it should be found. Finding an efficient access path to the data became the responsibility of the database management system, rather than the application programmer. This process, called query optimization, depended on the fact that queries were expressed in terms of mathematical logic.

Codd's paper inspired teams at various universities to research the subject, including one at [[University of California, Berkeley]]{{r|rdbmsearlyyearsoh20070612}} led by [[Eugene Wong]] and [[Michael Stonebraker]], who started [[INGRES]] using funding that had already been allocated for a geographical database project, with student programmers producing the code. Beginning in 1973, INGRES delivered its first test products, which were generally ready for widespread use in 1979. INGRES was similar to [[IBM System R|System R]] in a number of ways, including the use of a "language" for [[data access]], known as [[QUEL query languages|QUEL]]. Over time, INGRES moved to the emerging SQL standard.

IBM itself did one test implementation of the relational model, [[PRTV]], and a production one, [[IBM Business System 12|Business System 12]], both now discontinued. [[Honeywell]] wrote [[Multics Relational Data Store|MRDS]] for [[Multics]], and there are now two newer implementations: [[Dataphor|Alphora Dataphor]] and Rel. Most other DBMS implementations usually called ''relational'' are actually SQL DBMSs.

In 1970, the University of Michigan began development of the [[MICRO Relational Database Management System|MICRO Information Management System]]{{sfn|Hershey|Easthope|1972}} based on [[David L. Childs|D.L. Childs]]' Set-Theoretic Data model.{{sfn|North|2010}}{{sfn|Childs|1968a}}{{sfn|Childs|1968b}} In 1974, the university hosted a debate between Codd and Bachman, which Bruce Lindsay of IBM later described as "throwing lightning bolts at each other!".<ref name="rdbmsearlyyearsoh20070612">{{Cite interview |interviewer=Burton Grad |title=RDBMS Plenary 1: Early Years |url=https://archive.computerhistory.org/resources/access/text/2013/05/102702562-05-01-acc.pdf |access-date=2025-05-30 |publisher=Computer History Museum |date=2007-06-12}}</ref> MICRO was used to manage very large data sets by the [[US Department of Labor]], the [[U.S. Environmental Protection Agency]], and researchers from the [[University of Alberta]], the [[University of Michigan]], and [[Wayne State University]]. It ran on IBM mainframe computers using the [[Michigan Terminal System]].<ref name=MICROManual1977>{{cite book |author1=M.A. Kahn |author2=D.L. Rumelhart |author3=B.L. Bronson |date=October 1977 |url=https://docs.google.com/viewer?a=v&pid=explorer&chrome=true&srcid=0B4t_NX-QeWDYZGMwOTRmOTItZTg2Zi00YmJkLTg4MTktN2E4MWU0YmZlMjE3 |title=MICRO Information Management System (Version 5.0) Reference Manual |publisher=Institute of Labor and Industrial Relations (ILIR), University of Michigan and Wayne State University}}</ref> The system remained in production until 1998.