=== Unicode ===
{{redir|Paragraph separator|the symbol also known as a "paragraph sign"|Pilcrow}}
The [[Unicode]] standard defines a number of characters that conforming applications should recognize as line terminators:<ref>{{cite web |last1=Heninger |first1=Andy |title=UAX #14: Unicode Line Breaking Algorithm |url=https://www.unicode.org/reports/tr14/tr14-32.html |publisher=The Unicode Consortium |date=2013-09-20}}</ref>
{|
| {{mono| {{ctrl|LF}}}}: || Line Feed, {{mono|U+000A}}
|-
| {{mono| {{ctrl|VT}}}}: || [[Vertical Tab]], {{mono|U+000B}}
|-
| {{mono| {{ctrl|FF}}}}: || [[Form Feed]], {{mono|U+000C}}
|-
| {{mono| {{ctrl|CR}}}}: || [[Carriage return|Carriage Return]], {{mono|U+000D}}
|-
| {{mono| CR}}+{{mono|LF}}: || {{mono|CR}} ({{mono|U+000D}}) followed by {{mono|LF}} ({{mono|U+000A}})
|-
| {{mono| {{ctrl|NEL}}}}: || Next Line, {{mono|U+0085}}
|-
| {{mono| {{ctrl|LS}}}}: || Line Separator, {{mono|U+2028}}
|-
| {{mono| {{ctrl|PS}}}}: || Paragraph Separator, {{mono|U+2029}}
|}

This may seem overly complicated compared to an approach such as converting all line terminators to a single character (e.g. {{mono|LF}}). However, Unicode is designed to preserve all information when a text file is converted from any existing encoding to Unicode and back ([[Round-trip format conversion|round-trip integrity]]), so it needs to make the same distinctions between line breaks that other encodings make. For instance, [[EBCDIC]] has {{mono|{{ctrl|NL}}}}, {{mono|{{ctrl|CR}}}}, and {{mono|{{ctrl|LF}}}} characters, so all three also have to exist in Unicode.

Most newline characters and sequences are in [[ASCII]]'s [[C0 controls]] (i.e. have Unicode code points up to {{mono|0x1F}}). The three newline characters outside of this range ({{mono|NEL}}, {{mono|LS}} and {{mono|PS}}) are often not recognized as newlines by software.
For example:
*[[JSON]] recognizes {{mono|CR}} and {{mono|LF}} as whitespace, but not any other newline characters.<ref>{{cite IETF |title=The JavaScript Object Notation (JSON) Data Interchange Format |section=2 |sectionname=JSON Grammar |rfc=7159 |date=March 2014|last1=Bray |first1=Tim}}</ref> C0 controls cannot appear unescaped within strings, but any other line break characters can.<ref>{{cite IETF |title=The JavaScript Object Notation (JSON) Data Interchange Format |section=7 |sectionname=Strings |rfc=7159 |date=March 2014|last1=Bray |first1=Tim}}</ref>
*[[ECMAScript]] only recognizes {{mono|CR}}, {{mono|LF}}, {{mono|LS}} and {{mono|PS}} as line terminators.<ref name="ES 2019">{{cite web |title=ECMAScript 2019 Language Specification |date=June 2019 |publisher=ECMA International |at=[https://www.ecma-international.org/ecma-262/10.0/#sec-line-terminators 11.3 Line Terminators] |url=https://www.ecma-international.org/ecma-262/10.0/}}</ref> Historically, unescaped line terminators were not permitted in string literals,<ref>{{cite web |title=ECMAScript 2018 Language Specification |date=June 2018 |publisher=ECMA International |at=[https://www.ecma-international.org/ecma-262/9.0/#sec-line-terminators 11.3 Line Terminators] |url=https://www.ecma-international.org/ecma-262/9.0/}}</ref> but this was changed in {{Pslink|ECMAScript|ES2019}} to allow unescaped {{mono|LS}} and {{mono|PS}} in strings<ref name="ES 2019"/> for compatibility with JSON.<ref>{{cite web |url=https://github.com/tc39/proposal-json-superset |title=Subsume JSON (a.k.a. JSON ⊂ ECMAScript) |date=22 May 2018 |website=GitHub }}</ref>
*[[YAML]] 1.1 recognized all three as line breaks; YAML 1.2 no longer recognizes them as line breaks in order to be compatible with [[JSON]].<ref>{{cite web |work=YAML Ain't Markup Language revision 1.2.2 |date=2021-10-01 |title=5.4. Line Break Characters |url=https://yaml.org/spec/1.2/spec.html#id2774608}}</ref>
*[[Windows Notepad]], the default [[text editor]] of [[Microsoft Windows]], does not treat any of {{mono|NEL}}, {{mono|LS}}, or {{mono|PS}} as line breaks.
*[[gedit]], the default [[text editor]] of the [[GNOME]] [[desktop environment]], treats {{mono|LS}} and {{mono|PS}} as line breaks, but not {{mono|NEL}}.

Unicode also includes some [[glyph]]s intended to present a user-visible character to the reader of the document; these are not themselves recognized as newlines:
* {{unichar|23CE}}
* {{unichar|240A}}
* {{unichar|240D}}
* {{unichar|2424}}
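
As an illustration of the distinctions above, the following Python sketch splits text at exactly the eight terminators listed in the table and then probes how one JSON parser treats {{mono|LF}}, {{mono|NEL}} and {{mono|LS}}. The names <code>UNICODE_LINE_TERMINATORS</code>, <code>split_unicode_lines</code> and <code>json_accepts</code> are hypothetical helpers introduced for this example, and the JSON results reflect the behaviour of Python's standard <code>json</code> module only; other parsers may differ.

```python
import json
import re

# The eight line terminators from the table above (illustrative constant).
# CR+LF is listed first so the pair is consumed as a single terminator.
UNICODE_LINE_TERMINATORS = re.compile("\r\n|[\n\v\f\r\x85\u2028\u2029]")


def split_unicode_lines(text):
    """Split text at every terminator in the table (hypothetical helper)."""
    return UNICODE_LINE_TERMINATORS.split(text)


def json_accepts(raw):
    """Return True if Python's json module parses raw without error."""
    try:
        json.loads(raw)
        return True
    except json.JSONDecodeError:
        return False


# One sample containing each terminator exactly once.
sample = "a\r\nb\x85c\u2028d\u2029e\vf\fg\rh\ni"
lines = split_unicode_lines(sample)

# LF between tokens counts as JSON whitespace; NEL and LS do not.
lf_as_whitespace = json_accepts("[1,\n2]")
nel_as_whitespace = json_accepts("[1,\x852]")
ls_as_whitespace = json_accepts("[1,\u20282]")

# Inside a string, a raw C0 control such as LF is rejected,
# while an unescaped LS is accepted.
lf_in_string = json_accepts('"a\nb"')
ls_in_string = json_accepts('"a\u2028b"')
```

Here <code>lines</code> holds the nine single-letter segments <code>a</code> through <code>i</code>. Python's built-in <code>str.splitlines()</code> is not used because it also splits at the information separators {{mono|U+001C}} to {{mono|U+001E}}, which are not in the table above.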