Comma-separated values
==Basic rules==
Many informal documents exist that describe "CSV" formats. [[IETF]] RFC 4180 (summarized above) defines the format for the "text/csv" [[MIME type]] registered with the [[Internet Assigned Numbers Authority|IANA]].

Rules typical of these and other "CSV" specifications and implementations are as follows:
{{unordered list
| CSV is a [[delimited]] data format that has [[field (computer science)|fields/columns]] separated by the [[Comma (punctuation)|comma]] [[grapheme|character]] and [[row (database)|records/rows]] terminated by newlines.
| A CSV file does not require a specific [[character encoding]], [[byte order]], or line terminator format (some software does not support all line-end variations).
| A record ends at a line terminator. However, line terminators can be embedded as data within fields, so software must recognize quoted line-separators (see below) in order to correctly assemble an entire record from perhaps multiple lines.
| All records should have the same number of fields, in the same order.
| Data within fields is interpreted as a sequence of [[character (computing)|character]]s, not as a sequence of bits or bytes (see RFC 2046, section 4.1). For example, the numeric quantity 65535 may be represented as the 5 ASCII characters "65535" (or perhaps other forms such as "0xFFFF", "000065535.000E+00", etc.), but not as a sequence of 2 bytes intended to be treated as a single binary integer rather than as two characters (e.g. the numbers 11264–11519 have a comma as their high-order byte: <syntaxhighlight lang="perl" inline>ord(',')*256..ord(',')*256+255</syntaxhighlight>). If this "plain text" convention is not followed, then the CSV file no longer contains sufficient information to be interpreted correctly, is unlikely to survive transmission across differing computer architectures, and does not conform to the ''text/csv'' MIME type.
| Adjacent fields must be separated by a single comma.
However, "CSV" formats vary greatly in this choice of separator character. In particular, in [[Locale (computer software)|locale]]s where the comma is used as a decimal separator, a semicolon, [[tab key|tab character]], or other character is used instead.
<pre>1997,Ford,E350</pre>
| Any field ''may'' be ''quoted'' (that is, enclosed within double-quote characters), while some fields ''must'' be quoted, as specified in the following rules and examples:
<pre>"1997","Ford","E350"</pre>
| Fields with embedded commas or double-quote characters must be quoted.
<pre>1997,Ford,E350,"Super, luxurious truck"</pre>
| Each embedded double-quote character must be represented by a pair of double-quote characters.
<pre>1997,Ford,E350,"Super, ""luxurious"" truck"</pre>
| Fields with embedded line breaks must be quoted (however, many CSV implementations do not support embedded line breaks).
<pre>
1997,Ford,E350,"Go get one now
they are going fast"
</pre>
| In some CSV implementations{{which|date=September 2017}}, leading and trailing spaces and tabs are trimmed (ignored). Such trimming is forbidden by RFC 4180, which states "Spaces are considered part of a field and should not be ignored."
<pre>
1997, Ford, E350
not same as
1997,Ford,E350
</pre>
| According to RFC 4180, spaces outside quotes in a field are not allowed{{failed verification|date=January 2024}}; however, the RFC also says that "Spaces are considered part of a field and should not be ignored." and "Implementers should 'be conservative in what you do, be liberal in what you accept from others' (RFC 793, section 2.10) when processing CSV files."
<!-- DO NOT CORRECT THIS INTENTIONAL ERROR IN THE EXAMPLE! -->
<pre>1997, "Ford" ,E350</pre>
<!-- DO NOT CORRECT THIS INTENTIONAL ERROR IN THE EXAMPLE! -->
| In CSV implementations that do trim leading or trailing spaces, fields in which such spaces are meaningful data must be quoted.
<pre>1997,Ford,E350," Super luxurious truck "</pre>
| Double quote processing need only apply if the field starts with a double quote. Note, however, that double quotes are not allowed in unquoted fields according to RFC 4180.<!-- rule for non-escaped text: %x20-21 / %x23-2B / %x2D-7E (double quotes are %x22, which is explicitly omitted. -->
<pre>
Los Angeles,34°03′N,118°15′W
New York City,40°42′46″N,74°00′21″W
Paris,48°51′24″N,2°21′03″E
</pre>
| The first record may be a "header", which contains column names in each of the fields (there is no reliable way to tell whether a file does this or not; however, it is uncommon to use characters other than letters, digits, and underscores in such column names).
<pre>
Year,Make,Model
1997,Ford,E350
2000,Mercury,Cougar
</pre>
}}
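The quoting rules listed above can be illustrated with a minimal sketch that uses Python's standard <code>csv</code> module as one possible implementation. It is not a reference implementation of RFC 4180. The <code>Description</code> column name and the variable names are assumptions made for this illustration. The field values are taken from the examples above.

```python
import csv
import io

# Header row plus three records. "Description" is a hypothetical
# column name added for this sketch; the values come from the examples.
rows = [
    ["Year", "Make", "Model", "Description"],
    ["1997", "Ford", "E350", 'Super, "luxurious" truck'],         # comma and quotes
    ["1997", "Ford", "E350", "Go get one now\nthey are going fast"],  # line break
    ["1997", "Ford", "E350", " Super luxurious truck "],          # outer spaces
]

# newline="" stops StringIO from translating line endings, so the
# csv module alone decides how records are terminated.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
writer.writerows(rows)
encoded = buffer.getvalue()

# Reading the text back: the reader reassembles the record whose
# quoted field spans two physical lines and undoubles the "" pairs.
decoded = list(csv.reader(io.StringIO(encoded, newline="")))

# Splitting on commas without honoring quotes breaks the quoted field apart.
naive = '1997,Ford,E350,"Super, luxurious truck"'.split(",")

print(encoded)
print(decoded == rows)
print(len(naive))
```

In this sketch the writer quotes only the fields that need it (<code>QUOTE_MINIMAL</code>), so the field with leading and trailing spaces is written unquoted and survives only because this particular reader does not trim spaces. For a consumer that does trim, passing <code>quoting=csv.QUOTE_ALL</code> to the writer would be the safer choice.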