Delimiter-separated values
{{short description|Data storage format}}
{{Infobox file format
| name = Delimiter-separated values
| icon =
| logo =
| screenshot =
| caption =
|_noextcode =
| extension =
| mime =
| type code =
| uniform type = public.delimited-values-text<ref>{{cite web |url=https://developer.apple.com/documentation/uniformtypeidentifiers/uttypedelimitedtext |title=UTTypeDelimitedText |work=Apple Developer Documentation: Uniform Type Identifiers |publisher=[[Apple Inc]]}}</ref>
| magic =
| owner =
| released =
| latest release version =
| latest release date =
| genre =
| container for =
| contained by =
| extended from =
| extended to =
| standard =
| url =
}}
Formats that use '''delimiter-separated values''' (also '''DSV''')<ref name="artofunix">DSV stands for ''Delimiter Separated Values'' {{cite book | last = Raymond | first = Eric | title = The Art of Unix Programming | publisher = Addison-Wesley | location = Boston | year = 2004 | isbn = 0-13-142901-9 | url = http://www.catb.org/~esr/writings/taoup/html/ch05s02.html}}</ref>{{rp|113}} store two-dimensional arrays of data by separating the values in each row with specific [[delimiter]] [[character (computing)|characters]]. Most [[database]] and [[spreadsheet]] programs are able to read or save data in a delimited format. Due to their wide support, DSV files can be used in [[data exchange]] among many applications.

A '''delimited text file''' is a [[text file]] used to store data, in which each line represents a single record (a book, a company, or some other entity) and the fields within each line are separated by the delimiter.<ref>Stephen R. Westman. [https://books.google.com/books?id=20bG82MuVJIC "Creating Database-backed Library Web Pages: Using Open Source Tools"]. 2006. Section "Structured Text Files". p. 15.</ref> Compared to the kind of [[flat file]] that uses spaces to force every field to the same width, a '''delimited file''' has the advantage of allowing field values of any length.<ref>Richard Petersen.
[https://books.google.com/books?id=4hStzByjNvEC "Introductory Command Line Unix for Users"]. 2006. p. 356.</ref>

== Delimited formats ==
Any character may be used to separate the values, but the most common delimiters are the [[comma (punctuation)|comma]], [[Tab stop|tab]], and [[Colon (punctuation)|colon]].<ref name="artofunix" />{{rp|113}}<ref>Under UNIX, the colon is the most common DSV delimiter for values that may contain whitespace. ''Ibid''.</ref> The [[vertical bar]] (also referred to as ''pipe'') and [[space (punctuation)|space]] are also sometimes used.<ref name="artofunix" />{{rp|113}} Column headers are sometimes included as the first line, and each subsequent line is a row of data. The lines are separated by [[newline]]s.

In the following example, the fields in each record are delimited by commas, and the records by newlines:

 "Date","Pupil","Grade"
 "25 May","Bloggs, Fred","C"
 "25 May","Doe, Jane","B"
 "15 July","Bloggs, Fred","A"
 "15 April","Muniz, Alvin ""Hank""","A"

Note the use of the [[double quote]] to enclose each field. This prevents the comma in the actual field value (Bloggs, Fred; Doe, Jane; etc.) from being interpreted as a field separator. It also necessitates a way to "[[escape character|escape]]" the field wrapper itself, in this case the double quote; it is customary to double any double quotes actually contained in a field, as with those surrounding "Hank". In this way, any [[ASCII]] text, including newlines, can be contained in a field.

[[ASCII]] and [[Unicode]] include several [[control character]]s that are intended to be used as delimiters. They are: [[File separator|28 for File Separator]], [[Group separator|29 for Group Separator]], [[Record separator|30 for Record Separator]], and [[Unit separator|31 for Unit Separator]].
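
As a minimal sketch of how the Unit Separator and Record Separator could serve as delimiters, the following Python code joins and splits rows using code points 31 and 30. The names <code>UNIT_SEP</code>, <code>RECORD_SEP</code>, <code>encode</code>, and <code>decode</code> are illustrative and not taken from any standard, and the sketch assumes that no field value contains either control character.

```python
# ASCII control characters intended as delimiters (illustrative names).
UNIT_SEP = "\x1f"    # code point 31: separates fields within a record
RECORD_SEP = "\x1e"  # code point 30: separates records


def encode(rows):
    """Join rows of string fields into a single delimited string.

    Assumes no field contains UNIT_SEP or RECORD_SEP.
    """
    return RECORD_SEP.join(UNIT_SEP.join(fields) for fields in rows)


def decode(text):
    """Split a delimited string back into a list of rows of fields."""
    return [record.split(UNIT_SEP) for record in text.split(RECORD_SEP)]


rows = [
    ["Date", "Pupil", "Grade"],
    ["25 May", "Bloggs, Fred", "C"],
    ["15 April", 'Muniz, Alvin "Hank"', "A"],
]

encoded = encode(rows)
# Commas and double quotes in the data need no quoting or escaping here,
# because neither character is used as a delimiter.
assert decode(encoded) == rows
```

Because the delimiters are non-printing characters that rarely occur in ordinary text, this approach needs no quoting rules, but the resulting data is harder to view or edit in a plain text editor.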
An example of such use is the [[MARC standards#MARC 21|MARC 21]] bibliographic data format.<ref>{{Cite web |date=2007 |title=Character Sets: General Character Set Issues: MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media |url=https://www.loc.gov/marc/specifications/specchargeneral.html |access-date=2024-08-02 |website=[[Library of Congress]]}}</ref> Use of these characters has not achieved widespread adoption; some systems have replaced their control properties with more accepted controls such as [[Newline|CR/LF]] and TAB.{{citation needed|date=February 2019}}

== Uses and applications ==
Due to their widespread use, comma- and tab-delimited text files can be opened by several kinds of applications, including most [[spreadsheet]] programs and [[statistical package]]s, sometimes even without the user designating which delimiter has been used.<ref>{{cite book | last = Knight | first = Andrew | title = Basics of Matlab and beyond | publisher = Chapman & Hall/CRC | location = Boca Raton | year = 2000 | isbn = 0-8493-2039-9 }}</ref><ref>{{cite book | last = Robbins | first = Arnold | title = Classic Shell Scripting | publisher = O'Reilly | location = Sebastopol | year = 2005 | isbn = 0-596-00595-4 }}</ref> Although each of those applications has its own [[database design]] and its own [[file format]] (for example, accdb or xlsx), they can all map the fields in a DSV file to their own [[data model]] and format.{{citation needed|date=November 2016}}

Typically a delimited file format is indicated by a specification. Some specifications provide conventions for avoiding [[delimiter collision]]; others do not. Delimiter collision is a problem that occurs when a character that is intended as part of the data gets interpreted as a delimiter instead. Comma- and space-separated formats often suffer from this problem, since in many contexts those characters are legitimate parts of a data field.
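
As a rough illustration of delimiter collision, the following Python sketch takes one record from the pupil example above, splits it naively on commas, and then parses it with the standard library's <code>csv</code> module, which treats double-quoted fields as single values. The variable names are illustrative only.

```python
import csv
import io

# One record from the pupil example; the name contains a comma.
line = '"25 May","Bloggs, Fred","C"'

# Naive splitting treats the comma inside the name as a field separator,
# so the record appears to have four fields instead of three.
naive = line.split(",")
# naive == ['"25 May"', '"Bloggs', ' Fred"', '"C"']

# A quote-aware parser keeps the quoted comma inside a single field.
parsed = next(csv.reader(io.StringIO(line)))
# parsed == ['25 May', 'Bloggs, Fred', 'C']

# When writing, the csv module's default settings quote only the fields
# that need it and double any embedded double quotes.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["15 April", 'Muniz, Alvin "Hank"', "A"])
written = buffer.getvalue()
# written == '15 April,"Muniz, Alvin ""Hank""",A\n'
```
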
Most such files avoid delimiter collision either by surrounding all data fields in double quotes, or by quoting only those data fields that contain the delimiter character.

One problem with tab-delimited text files is that tabs are difficult to distinguish from spaces; as a result, files are sometimes corrupted when people edit them by hand. Another set of problems occurs because of errors in the file structure, usually during import of a file into a [[database]] (in the example above, such an error may be a pupil's first name missing).

Depending on the data itself, it may be beneficial to use non-standard characters such as the tilde (~) as delimiters. With the rising prevalence of websites and other applications that store snippets of code in databases, simply using a double quote ("), which occurs in every hyperlink and image source tag, is not sufficient to avoid this type of collision. Since colons (:), semicolons (;), pipes (|), and many other characters are also used, it can be quite challenging to find a character that is not being used elsewhere.

==See also==
* [[Comma-separated values]]
* [[Delimiter]]
* [[Tab-separated values]]
* [[Health Level 7#HL7 Version 2|Health Level 7 version 2]]
* [[Data Interchange Format]]

==Notes and references==
{{reflist}}

==Further reading==
*{{cite web |title=IBM DB2 Administration Guide - LOAD, IMPORT, and EXPORT File Formats |publisher=[[IBM]] |url=https://www.columbia.edu/sec/acis/db2/db2d0/db2d053.htm |access-date=2016-12-12 |url-status=live |archive-url=https://web.archive.org/web/20161213014111/https://www.columbia.edu/sec/acis/db2/db2d0/db2d053.htm |archive-date=2016-12-13}} (Has file descriptions of delimited ASCII (.DEL) and non-delimited ASCII (.ASC) files for data transfer.)

[[Category:Delimiter-separated format]]