The null character is a control character with the value zero.<ref>"A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string literal.", ANSI/ISO 9899:1990 (the ANSI C standard), section 5.2.1</ref> Many character sets include a code point for a null character, including Unicode (Universal Coded Character Set), ASCII (ISO/IEC 646), Baudot, ITA2 codes, the C0 control codes, and EBCDIC. In modern character sets, the null character has a code point value of zero, which is generally encoded as a single code unit with a zero value. For instance, in UTF-8 it is a single zero byte. In Modified UTF-8, however, the null character is encoded as two bytes: 0xC0, 0x80. This leaves the zero byte, which then does not encode any character, free to be used as a string terminator.
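
The difference between the two encodings of the null character can be sketched in C. The function name `encode_bmp` and its `modified` flag are hypothetical, chosen only for this illustration. The sketch covers code points up to U+FFFF and does not reject surrogate code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point in the range U+0000..U+FFFF into `out`, which must
 * have room for 3 bytes. Returns the number of bytes written.
 * If `modified` is nonzero, U+0000 is written as the two-byte sequence
 * 0xC0 0x80 (Modified UTF-8) instead of a single zero byte (UTF-8).
 * Illustrative only: surrogate code points are not rejected. */
size_t encode_bmp(uint32_t cp, int modified, unsigned char out[3])
{
    assert(cp <= 0xFFFF);

    if (cp == 0 && modified) {
        /* Two-byte form applied to the value zero. */
        out[0] = 0xC0;
        out[1] = 0x80;
        return 2;
    }
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}
```

With `modified` set, no encoded character produces a zero byte, so a buffer of such output can itself be terminated by a zero byte.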

Originally, its meaning was like NOP: when sent to a printer or a terminal, it had no effect (although some terminals incorrectly displayed it as a space). When electromechanical teleprinters were used as computer output devices, one or more null characters were sent at the end of each printed line to allow time for the mechanism to return to the first printing position on the next line.[citation needed] On punched tape, the character is represented with no holes at all, so a new unpunched tape is initially filled with null characters, and text could often be inserted into a reserved space of null characters by punching the new characters into the tape over the nulls.

A null-terminated string is a commonly used data structure in the C programming language, its many derivative languages, and other programming contexts; it uses a null character to indicate the end of a string.<ref>"A string is a contiguous sequence of characters terminated by and including the first null character", ANSI/ISO 9899:1990 (the ANSI C standard), section 7.1.1</ref> This design allows a string to be any length at the cost of only one extra character of memory. The common competing design stores the length of the string as an integer, but this limits the size of the string to the range of that integer (for example, 255 for a byte).
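
The trade-off between the two designs can be sketched in C. The type `struct byte_counted` and the functions `nt_length` and `bc_from_cstr` are hypothetical names introduced for this illustration, not part of any standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical length-prefixed string: a one-byte length, then the data.
 * The one-byte length caps the string at 255 characters. */
struct byte_counted {
    uint8_t len;
    char data[255];
};

/* Length of a null-terminated string: scan until the first null character.
 * This does the same job as the standard strlen. */
size_t nt_length(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Convert a null-terminated string to the length-prefixed form.
 * Returns 0 on success, or -1 if the string does not fit in 255 bytes. */
int bc_from_cstr(struct byte_counted *dst, const char *src)
{
    size_t n = nt_length(src);
    if (n > UINT8_MAX)
        return -1;
    dst->len = (uint8_t)n;
    memcpy(dst->data, src, n);
    return 0;
}
```

Here `sizeof "hello"` is 6: five characters plus the single terminating null. A 300-character string is valid in the null-terminated form but is rejected by `bc_from_cstr`.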

For byte storage, the null character can be called a null byte.

Representation

Since the null character is not a printable character, representing it requires special notation in source code.

In a string literal, the null character is often represented as the escape sequence \0 (for example, "abc\0def"). Similar notation is often used for a character literal (i.e. '\0') although that is often equivalent to the numeric literal for zero (0).<ref name="KandR38">Kernighan and Ritchie, C, p. 38: "The character constant '\0' represents the character with value zero, the null character. '\0' is often written instead of 0 to emphasize the character nature of some expression, but the numeric value is just 0."</ref> In many languages (such as C, which introduced this notation), this is not a separate escape sequence, but an octal escape sequence with a single octal digit 0; as a consequence, \0 must not be followed by any of the digits 0 through 7; otherwise it is interpreted as the start of a longer octal escape sequence.<ref>In YAML this combination is a separate escape sequence.</ref> Other escape sequences that are found in use in various languages are \000, \x00, \z, or \u0000.
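
The octal-escape pitfall can be checked directly in C. In this sketch the array names and the function `check_escapes` are hypothetical, and the comment on `pitfall` assumes an ASCII-based execution character set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* An embedded null: the array holds 7 characters plus the terminator,
 * but string functions stop at the first null. */
static const char embedded[] = "abc\0def";

/* Pitfall: "\012" is ONE octal escape (value 10, a line feed in ASCII),
 * not a null followed by the digits 1 and 2. */
static const char pitfall[] = "\012";

/* Splitting the literal keeps the null separate from the digits, because
 * escape sequences are converted before adjacent literals are joined. */
static const char split[] = "\0" "12";

/* The three-digit form "\000" also works: an octal escape takes at most
 * three digits, so the following digits are ordinary characters. */
static const char padded[] = "\00012";

void check_escapes(void)
{
    assert(sizeof embedded == 8 && strlen(embedded) == 3);
    assert(sizeof pitfall == 2 && pitfall[0] == 10);
    assert(sizeof split == 4 && split[0] == '\0' && split[1] == '1');
    assert(sizeof padded == 4 && memcmp(split, padded, 4) == 0);
    assert('\0' == 0); /* the character constant is just the value zero */
}
```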

A null character can be placed in a URL with the percent code %00.

The ability to represent a null character does not always mean the resulting string will be correctly interpreted, as many programs will consider the null to be the end of the string. Thus, the ability to type it (in case of unchecked user input) creates a vulnerability known as null byte injection and can lead to security exploits.<ref>Null Byte Injection WASC Threat Classification Null Byte Attack section.</ref>
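
A minimal C sketch of how this mismatch arises, assuming a request parameter that is percent-decoded into a buffer with an explicit length. The functions `percent_decode` and `has_embedded_null` are hypothetical helpers written for this illustration, not the API of any real library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if `c` is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decode percent-escapes from `in` into `out`, which must have room for
 * strlen(in) + 1 bytes. Returns the number of decoded bytes, which may
 * include zero bytes. Malformed escapes are copied through unchanged. */
size_t percent_decode(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    size_t n = 0;
    while (*in != '\0') {
        int hi, lo;
        if (in[0] == '%'
            && (hi = hex_value((unsigned char)in[1])) >= 0
            && (lo = hex_value((unsigned char)in[2])) >= 0) {
            out[n++] = (char)(hi * 16 + lo);
            in += 3;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}

/* Returns 1 if the first `len` bytes of `buf` contain a null byte, which
 * null-terminated APIs would treat as the end of the string. */
int has_embedded_null(const char *buf, size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}
```

For the input `secret.txt%00.jpg`, the decoded buffer is 15 bytes long and ends in `.jpg`, so a length-aware suffix check accepts it, while `strlen` on the same buffer reports 10 and a null-terminated API sees only `secret.txt`. Rejecting any input for which `has_embedded_null` returns 1 is one possible mitigation.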

In software documentation, the null character is often represented with the text NUL (or NULL, although that may mean the null pointer). In Unicode, there is a character for this: U+2400 ␀ SYMBOL FOR NULL.

In caret notation, the null character is ^@. On some keyboards, one can enter a null character by holding down Ctrl and pressing @ (on US layouts just Ctrl+2 will often work, there being no need for Shift to get the @ sign).
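
The reason the null character pairs with @ can be shown with a short C sketch, assuming ASCII: in caret notation, the symbol after the caret is the control code with bit 0x40 flipped. The function name `caret_to_code` is hypothetical.

```c
#include <assert.h>

/* Map the symbol that follows the caret to its control code, assuming
 * ASCII. Valid symbols are '@' through '_' (codes 0..31) and '?' (DEL). */
int caret_to_code(char symbol)
{
    assert((symbol >= '@' && symbol <= '_') || symbol == '?');
    return symbol ^ 0x40;
}
```

Since @ has the ASCII value 0x40, `caret_to_code('@')` is 0, the null character; likewise `caret_to_code('[')` is 27, the escape character.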
