Null character
{{short description|Control character with value 0}}
{{Other uses|Null symbol (disambiguation){{!}}Null symbol}}

The '''null character''' is a [[control character]] with the value [[zero]].<ref>{{cite IETF |rfc=20 |title=ASCII format for Network Interchange |section=5.2 |quote=NUL (Null): The all-zeros character which may serve to accomplish time fill and media fill. |publisher=[[Internet Engineering Task Force|IETF]] }}</ref><ref>{{cite web |url = http://kikaku.itscj.ipsj.or.jp/ISO-IR/001.pdf |title = The set of control characters of the ISO 646 |publisher = Secretariat ISO/TC 97/SC 2 |date = 1975-12-01 |page = 4.4 |quote = Position: 0/0, Name: Null, Abbreviation: Nul |url-status = dead |archive-url = https://web.archive.org/web/20140512221203/http://kikaku.itscj.ipsj.or.jp/ISO-IR/001.pdf |archive-date = 2014-05-12 }}</ref><ref>{{cite web |url=https://www.fileformat.info/info/unicode/char/0000/index.htm |title=Unicode Character 'NULL' (U+0000) |access-date=2018-10-20 }}</ref><ref>{{cite web |url=https://www.unicode.org/charts/PDF/U0000.pdf |title=C0 Controls and Basic Latin |date=2018 |publisher=Unicode Consortium |access-date=2018-10-20}}</ref><ref>"A byte with all bits set to 0, called the ''null character'', shall exist in the basic execution character set; it is used to terminate a character string literal." {{endash}} ANSI/ISO 9899:1990 (the ANSI C standard), section 5.2.1</ref> Many [[character set]]s include a [[code point]] for a null character {{endash}} including [[Unicode]] ([[Universal Coded Character Set]]), [[ASCII]] ([[ISO/IEC 646]]), the [[Baudot code|Baudot]] and [[ITA2]] codes, the [[C0 and C1 control codes|C0 control codes]], and [[EBCDIC]].

In modern character sets, the null character has a code point value of zero, which is generally encoded as a single code unit with a zero value. For instance, in [[UTF-8]], it is a single zero byte.
However, in [[UTF-8#Modified UTF-8|Modified UTF-8]] the null character is encoded as two bytes <!-- An overlong encoding of U+0000 -->: {{tt|0xC0, 0x80}}. This leaves the byte with the value of zero unused by any character, so it can serve as a string terminator.

Originally, its meaning was like [[NOP (code)|NOP]] {{endash}} when sent to a [[computer printer|printer]] or a [[computer terminal|terminal]], it had no effect (although some terminals incorrectly displayed it as [[space (punctuation)|space]]). When electromechanical [[teleprinter]]s were used as computer output devices, one or more null characters were sent at the end of each printed line to allow time for the mechanism to return to the first printing position on the next line.{{Citation needed|date=April 2010}} On [[punched tape]], the character is represented with no holes at all, so a new unpunched tape is initially filled with null characters, and text could often be inserted into a reserved run of null characters by punching the new characters into the tape over the nulls.
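
The two encodings of the null character described above can be sketched in Python. This is a minimal illustration rather than a full Modified UTF-8 codec: the helper names <code>encode_nul_overlong</code> and <code>decode_nul_overlong</code> are hypothetical, and only the treatment of U+0000 is covered (full Modified UTF-8 also differs from standard UTF-8 for characters outside the Basic Multilingual Plane, which this sketch ignores).

```python
def encode_nul_overlong(text: str) -> bytes:
    """Encode text as UTF-8, but write U+0000 as the two bytes 0xC0 0x80.

    Hypothetical helper: it covers only the null-character rule.
    """
    # In standard UTF-8 the byte 0x00 appears only as the encoding of U+0000,
    # so a byte-level replacement cannot corrupt any other character.
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def decode_nul_overlong(data: bytes) -> str:
    """Reverse encode_nul_overlong."""
    # The byte 0xC0 never occurs in valid standard UTF-8, so the pair
    # 0xC0 0x80 can be mapped back to 0x00 before ordinary decoding.
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")


standard = "a\0b".encode("utf-8")       # b"a\x00b": contains a zero byte
modified = encode_nul_overlong("a\0b")  # b"a\xc0\x80b": no zero byte
```

Because <code>modified</code> contains no zero byte, a zero byte can be appended to it as a terminator without ambiguity, which is the property the paragraph above describes.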
A [[null-terminated string]] is a data structure, commonly used in the [[C (programming language)|C programming language]], its many derivative languages, and other programming contexts, that uses a null character to indicate the end of a [[string (computer science)|string]].<ref>"A ''string'' is a contiguous sequence of characters terminated by and including the first null character" {{endash}} ANSI/ISO 9899:1990 (the ANSI C standard), section 7.1.1</ref><ref>{{citation |title=Working Draft, Standard for Programming Language C++ |type=ISO 14882 standard working draft |date=28 February 2011 |publisher=[[International Organization for Standardization|ISO]]/[[International Electrotechnical Commission|IEC]] |page=427 |url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf |access-date=27 February 2013 |id=N3242=11-0012 |quote=A ''null-terminated byte string'', or NTBS, is a character sequence whose highest-addressed element with defined content has the value zero (the ''terminating null character''); no other element in the sequence has the value zero.}}</ref> This design allows a string to be any length at the cost of only one extra character of memory. The common competing design for a string stores the length of the string as an [[integer]] [[data type]], but this limits the size of the string to the range of the integer (for example, 255 for a byte).

For [[byte]] storage, the null character can be called a '''null byte'''.

== Representation ==
Since the null character is not a [[printable character]], representing it requires special notation in [[source code]].

In a [[string literal]], the null character is often represented as the [[escape sequence]] <code>\0</code> (for example, <code>"abc\0def"</code>). Similar notation is often used for a character literal (i.e. <code>'\0'</code>) although that is often equivalent to the numeric literal for zero (<code>0</code>).<ref name="KandR38">Kernighan and Ritchie, ''C'', p. 38: "The character constant '\0' represents the character with value zero, the null character. '\0' is often written instead of 0 to emphasize the character nature of some expression, but the numeric value is just 0."</ref> In many languages ([[Escape sequences in C|such as C]], which introduced this notation), this is not a separate escape sequence, but an octal escape sequence with a single [[octal]] digit 0; as a consequence, <code>\0</code> must not be followed by any of the digits <code>0</code> through <code>7</code>; otherwise it is interpreted as the start of a longer octal escape sequence.<ref>In [[YAML]] this combination is a [http://www.yaml.org/spec/1.2/spec.html#id2776092 separate escape sequence].</ref> Other escape sequences found in various languages are <code>\000</code>, <code>\x00</code>, <code>\z</code>, and <code>\u0000</code>.

A null character can be placed in a [[URL]] with the [[Percent encoding|percent code]] <code>%00</code>.

The ability to represent a null character does not always mean the resulting string will be correctly interpreted, as many programs will consider the null to be the end of the string. Thus, the ability to type it (in the case of [[unchecked user input]]) creates a [[vulnerability (computing)|vulnerability]] known as '''null byte injection''' and can lead to security exploits.<ref>[http://projects.webappsec.org/Null-Byte-Injection Null Byte Injection] WASC Threat Classification Null Byte Attack section.</ref>

In [[software documentation]], the null character is often represented with the text '''NUL''' (or '''NULL''', although that may mean the [[null pointer]]). In [[Unicode]], there is a character for this: {{unichar|2400}}. In [[caret notation]] the null character is <code>^@</code>.
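
The octal-escape pitfall and the truncation behind null byte injection can both be illustrated with a short Python sketch. Python's string literals follow the same octal-escape rule as C for <code>\0</code>. The helper <code>as_c_string</code> and the file name are hypothetical, and the standard <code>ctypes</code> module is used here only to mimic how code working with null-terminated strings would read the bytes.

```python
import ctypes

# "\0" is an octal escape, so octal digits that follow are absorbed into it.
embedded = "abc\0def"  # 7 characters; the 4th is the null character
absorbed = "\012"      # 1 character: octal 12, i.e. a line feed
separate = "\x0012"    # 3 characters: NUL, "1", "2" (\x takes exactly two hex digits)


def as_c_string(data: bytes) -> bytes:
    """Return the bytes a consumer of null-terminated strings would see.

    Hypothetical helper for illustration only.
    """
    # create_string_buffer copies the bytes into a char array; reading
    # .value stops at the first zero byte, as C string functions do.
    return ctypes.create_string_buffer(data).value


# Hypothetical attacker-supplied file name containing an embedded null byte.
name = b"secret.php\0.jpg"
passes_suffix_check = name.endswith(b".jpg")  # True: the check sees all bytes
seen_by_c_code = as_c_string(name)            # b"secret.php": cut at the null
```

The mismatch between <code>passes_suffix_check</code> and <code>seen_by_c_code</code> is the essence of the attack: a validation step that sees the whole byte sequence approves it, while a later layer that stops at the null acts on a different, shorter string.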
On some keyboards, one can enter a null character by holding down {{keypress|Ctrl}} and pressing {{keypress|@}} (on US layouts just {{keypress|Ctrl|2}} will often work, there being no need for {{keypress|Shift}} to get the @ sign).

== References ==
{{Reflist|30em}}

== External links ==
* [http://projects.webappsec.org/Null-Byte-Injection Null Byte Injection] WASC Threat Classification Null Byte Attack section
* [http://www.hackthis.co.uk/articles/common-php-attacks-poison-null-byte Poison Null Byte Introduction] Introduction to Null Byte Attack
* [[Apple Inc.|Apple]] [http://projects.webappsec.org/Null-Byte-Injection null byte injection] [[QR code]] [https://github.com/cloudsecurityalliance/gsd-database/issues/2322 vulnerability]

{{nulls}}

{{DEFAULTSORT:Null Character}}
[[Category:Control characters]]
[[Category:Computer security exploits]]