Flat-file database
==Overview==
Plain text files usually contain one [[Record (computer science)|record]] per line.<ref>{{Citation | last = Fowler | first = Glenn | year = 1994 | title = cql: Flat-file database query language | periodical = WTEC'94: Proceedings of the USENIX Winter 1994 Technical Conference on USENIX Winter 1994 Technical Conference | url = http://static.usenix.org/publications/library/proceedings/sf94/full_papers/fowler.a }}</ref> Examples of flat files include <code>[[/etc/passwd]]</code> and <code>[[/etc/group]]</code> on [[Unix-like]] operating systems. Another example of a flat file is a name-and-address list with the fields ''Name'', ''Address'' and ''Phone Number''. Flat files are typically either delimiter-separated or fixed-width.

=== Delimiter-separated values ===
In [[delimiter-separated values]] files, the [[field (computer science)|field]]s are separated by a character or string called the [[delimiter]]. Common variants are [[comma-separated values]] (CSV), where the delimiter is a comma; [[tab-separated values]] (TSV), where the delimiter is the tab character; space-separated values; and vertical-bar-separated values, where the delimiter is <code>|</code>.

If the delimiter is allowed inside a field, there needs to be a way to distinguish delimiters from characters or strings that are meant literally. For example, consider the sentence "If I have to, I'll do it myself." To encode it in CSV, there needs to be a way to prevent the comma from splitting the field. Several [[Delimiter#Solutions|strategies to prevent delimiter collision]] exist.

=== Fixed-width formats ===
With fixed-width formats, each field has a fixed length, with extra [[space character|spaces]] added as padding where needed. The fixed lengths can be predefined and known ahead of time (i.e. stated in the format's specification), or parsed from a [[Header (computing)|header]].

With predefined lengths, fields are limited to a maximum length. The need for longer fields may appear sometime after the format is defined.
Possible workarounds include abbreviating phrases, replacing values with links (e.g. a URI pointing to the value), and splitting a file into multiple files.

With delimiter-separated formats, determining the field boundaries requires finding the delimiters, which incurs some [[computational overhead]]. This is not needed for fixed-width formats, where each field starts at a known offset. However, fixed-width formats can lead to unnecessarily large file sizes if fields tend to be shorter than the lengths reserved for them.

=== Declarative notation ===
Delimiters can be used alongside a notation stating the length of each field. For example, <code>5apple|9pineapple</code> specifies the length (5 and 9) of each field. This is called [[String literal#Declarative notation|declarative notation]]. It has low overhead and trivially avoids delimiter collisions, but it is brittle when edited manually, because a length that no longer matches its field makes the record unreadable.
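
The length-prefixed encoding described above can be illustrated with a minimal Python sketch. The function names <code>encode_fields</code> and <code>decode_fields</code> are hypothetical and not taken from any library. The sketch assumes that lengths are written in decimal and counted in characters, and that no field begins with an ASCII digit, since the notation <code>5apple</code> would otherwise not show where the length ends and the field begins.

```python
DIGITS = "0123456789"


def encode_fields(fields, delimiter="|"):
    """Join fields as <length><field>, separated by the delimiter."""
    for field in fields:
        # Assumption of this sketch: a leading digit would blur the
        # boundary between the length prefix and the field content.
        if field and field[0] in DIGITS:
            raise ValueError(f"field must not start with a digit: {field!r}")
    return delimiter.join(f"{len(field)}{field}" for field in fields)


def decode_fields(text, delimiter="|"):
    """Split a length-prefixed record back into its fields."""
    fields = []
    pos = 0
    while pos < len(text):
        # Read the decimal length prefix.
        start = pos
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        if pos == start:
            raise ValueError(f"expected a length prefix at offset {start}")
        length = int(text[start:pos])

        # Take exactly `length` characters; delimiters inside are literal.
        field = text[pos:pos + length]
        if len(field) != length:
            raise ValueError(f"field at offset {pos} is shorter than {length}")
        fields.append(field)
        pos += length

        # Anything after a field must be a delimiter followed by more data.
        if pos < len(text):
            if not text.startswith(delimiter, pos):
                raise ValueError(f"expected delimiter at offset {pos}")
            pos += len(delimiter)
            if pos == len(text):
                raise ValueError("record ends with a dangling delimiter")
    return fields


record = encode_fields(["apple", "pineapple"])
print(record)
print(decode_fields(record))

# A field containing the delimiter survives a round trip unchanged.
print(decode_fields(encode_fields(["a|b", "If I have to, I'll do it myself."])))
```

Because the decoder takes the stated number of characters without inspecting them, a delimiter inside a field needs no escaping. The same property explains the brittleness: in this sketch, changing <code>5apple</code> to <code>5apples</code> by hand without updating the length makes <code>decode_fields</code> raise a <code>ValueError</code>.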