==In ASCII==
[[File:US ASCII Control Character Symbols.png|thumb|right|Early symbols assigned to the 32 control characters, space and delete characters. (ISO 2047, MIL-STD-188-100, 1972)]]
{{Main|C0 and C1 control codes#C0 controls}}

ASCII defines 33 control characters, and the ECMA-48 standard adds 32 more. So many were defined because early terminals had very primitive mechanical or electrical controls that made any kind of state-remembering [[API]] quite expensive to implement, so a separate code for each function looked like a requirement. It quickly became possible and inexpensive to interpret sequences of codes to perform a function, and device makers found a way to send hundreds of device instructions. Specifically, they used ASCII code 27<sub>10</sub> (escape), followed by a series of characters called a "control sequence" or "escape sequence". The mechanism was invented by [[Bob Bemer]], the father of ASCII. For example, the sequence of code 27<sub>10</sub>, followed by the printable characters <nowiki>"[2;10H"</nowiki>, would cause a [[Digital Equipment Corporation]] [[VT100]] terminal to move its [[cursor (user interface)|cursor]] to the 10th cell of the 2nd line of the screen. Several standards exist for these sequences, notably [[ANSI X3.64]], but the number of non-standard variations is large.

All entries in the [[ASCII]] table below code 32<sub>10</sub> (technically the [[C0 and C1 control codes#C0 controls|C0]] control code set) are control characters, including [[newline|CR and LF]] used to separate lines of text.
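The VT100 cursor example above can be sketched in a few lines of Python. This is only an illustration: the helper name <code>cursor_position</code> is hypothetical, and the code only builds the byte sequence described in the text; whether a given terminal honors it depends on that terminal.

```python
ESC = "\x1b"  # ASCII code 27 (decimal), the escape character


def cursor_position(row: int, col: int) -> str:
    """Build the sequence ESC [ row ; col H described in the text.

    Hypothetical helper for illustration; row and col are 1-based.
    """
    return f"{ESC}[{row};{col}H"


seq = cursor_position(2, 10)

# Show the raw characters rather than sending them to the terminal.
print(repr(seq))
print([ord(ch) for ch in seq])
```

Writing <code>seq</code> to a terminal that implements this sequence would move the cursor to the 10th cell of the 2nd line; printing its <code>repr</code> instead keeps the example safe to run anywhere.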
The code 127<sub>10</sub> ([[Delete character|DEL]]) is also a control character.<ref name="rfc20">{{Cite IETF |date=1969-10-01 |rfc=20 |title=ASCII format for network interchange |access-date=2023-04-05}}</ref><ref>{{cite book |url=https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub1-2-1977.pdf |archive-url=https://ghostarchive.org/archive/20221009/https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub1-2-1977.pdf |archive-date=2022-10-09 |url-status=live |title=American National Standard Code for Information Interchange {{!}} ANSI X3.4-1977 |section=5.2 Control Characters |publisher=National Institute for Standards |date=1977}}</ref>

[[Extended ASCII]] sets defined by [[ISO 8859]] added the codes 128<sub>10</sub> through 159<sub>10</sub> as control characters. This was primarily done so that if the high bit was stripped, it would not change a printing character to a C0 control code. This second set is called the [[C0 and C1 control codes#C1 controls|C1]] set.

These 65 control codes were carried over to [[Unicode]]. Unicode added more characters that could be considered controls, but it makes a distinction between these "Formatting characters" (such as the [[zero-width non-joiner]]) and the 65 control characters.

The [[Extended Binary Coded Decimal Interchange Code]] (EBCDIC) character set contains 65 control codes, including all of the ASCII control codes plus additional codes which are mostly used to control IBM peripherals.

{| class="wikitable floatright"
|+ ASCII control codes.<ref>MS-DOS QBasic v1.1 Documentation. Microsoft 1987-1991.</ref>
|-
!!! scope=col | 0x00 !! scope=col | 0x10
|-
! scope=row | 0x00
| [[null character|NUL]] || [[Data link escape character|DLE]]
|-
! scope=row | 0x01
| [[Start-of-Header|SOH]] || [[Device Control 1|DC1]]
|-
! scope=row | 0x02
| [[Start-of-Text|STX]] || [[Device Control 2|DC2]]
|-
! scope=row | 0x03
| [[End-of-text character|ETX]] || [[Device Control 3|DC3]]
|-
! scope=row | 0x04
| [[End-of-transmission character|EOT]] || [[Device Control 4|DC4]]
|-
! scope=row | 0x05
| [[Enquiry character|ENQ]] || [[NAK]]
|-
! scope=row | 0x06
| [[Acknowledge character|ACK]] || [[Synchronous idle character|SYN]]
|-
! scope=row | 0x07
| [[Bell character|BEL]] || [[End Transmission Block character|ETB]]
|-
! scope=row | 0x08
| [[backspace|BS]] || [[cancel character|CAN]]
|-
! scope=row | 0x09
| [[tab character|HT]] || [[End of Medium|EM]]
|-
! scope=row | 0x0A
| [[Line Feed|LF]] || [[Substitute character|SUB]]
|-
! scope=row | 0x0B
| [[Vertical Tab|VT]] || [[escape character|ESC]]
|-
! scope=row | 0x0C
| [[Form Feed|FF]] || [[File separator|FS]]
|-
! scope=row | 0x0D
| [[carriage return|CR]] || [[Group separator|GS]]
|-
! scope=row | 0x0E
| [[Shift Out and Shift In characters|SO]] || [[Record separator|RS]]
|-
! scope=row | 0x0F
| [[Shift Out and Shift In characters|SI]] || [[Unit separator|US]]
|-
! scope=row | 0x7F
| [[Delete character|DEL]]
|}

The control characters in ASCII still in common use include:
* 0x00 ([[null character|null]], {{tt|NUL}}, {{tt|\0}}, {{tt|^@}}), originally intended to be an ignored character, but now used by many [[programming language]]s including [[C programming language|C]] to [[Null-terminated string|mark the end of a string]].
* 0x04 ([[end-of-transmission character|EOT]], {{tt|N/A}}, {{tt|N/A}}, {{tt|^D}}), used as an end-of-file character on some terminals<ref>{{cite book | title = Component Description: IBM 2780 Data Transmission Terminal | id = GA27-3005-3 | section = EOT (End of Transmission) | section-url = http://bitsavers.org/pdf/ibm/2780/GA27-3005-3-2780_Data_Terminal_Description_Aug71.pdf#page=31 | page = 31 | quote = The EOT character terminates the current transmission and returns all terminals in the data-link to control mode. When sent by the transmitting terminal, it indicates that the terminal has nothing more to transmit and is relinquishing the communications line.
The receiving terminal can send an EOT character instead of a normal DLE 0, DLE 1, or NAK response. The EOT character in this case is an abort signal that terminates the transmission. When sent in response to a polling operation, the EOT character indicates that the polled terminal has no data to transmit or is unable to continue transmission. An EOT character is recognized (except in Six-Bit Transcode) only when immediately preceded by a SYN pattern (SYN SYN EOT PAD), or when immediately preceded by a DLE and followed by a character of which the first four bits must be all "1" bits (PAD character) DLE EOT PAD. | series = Systems Reference Library | url = http://bitsavers.org/pdf/ibm/2780/GA27-3005-3-2780_Data_Terminal_Description_Aug71.pdf | access-date = May 21, 2025 }}</ref> and for terminal input on [[Unix-like]] systems.
* 0x07 ([[bell character|bell]], {{tt|BEL}}, {{tt|\a}}, {{tt|^G}}), which may cause the device to emit a warning such as a bell or beep sound or the screen flashing.
* 0x08 ([[backspace]], {{tt|BS}}, {{tt|\b}}, {{tt|^H}}), which may overprint the previous character.
* 0x09 ([[Tab key|horizontal tab]], {{tt|HT}}, {{tt|\t}}, {{tt|^I}}), which moves the printing position right to the next tab stop.
* 0x0A ([[line feed]], {{tt|LF}}, {{tt|\n}}, {{tt|^J}}), which moves the print head down one line, or to the left edge and down. Used as the end-of-line marker in Unix-like systems.
* 0x0B ([[Tab key|vertical tab]], {{tt|VT}}, {{tt|\v}}, {{tt|^K}}), vertical tabulation.
* 0x0C ([[Page break|form feed]], {{tt|FF}}, {{tt|\f}}, {{tt|^L}}), which causes a printer to eject paper to the top of the next page, or a video terminal to clear the screen.
* 0x0D ([[carriage return]], {{tt|CR}}, {{tt|\r}}, {{tt|^M}}), which moves the printing position to the start of the line, allowing overprinting. Used as the end-of-line marker in [[Classic Mac OS]], [[OS-9]], [[FLEX (operating system)|FLEX]] (and variants). A {{tt|CR+LF}} pair is used by [[CP/M]]-80 and its derivatives including [[DOS]] and [[Windows]], and by [[Application Layer]] [[communications protocol|protocols]] such as [[FTP]], [[SMTP]], and [[HTTP]].
* 0x1A ([[Control-Z]], {{tt|SUB}}, {{tt|^Z}}), which acts as an end-of-file marker for Windows text-mode file I/O.
* 0x1B ([[escape character|escape]], {{tt|ESC}}, {{tt|\e}} ([[GCC (software)|GCC]] only), {{tt|^[}}), which introduces an [[escape sequence]].

Control characters may be described as doing something when the user inputs them, such as code 3 ([[End-of-Text character]], ETX, {{tt|^C}}) to interrupt the running process, or code 4 ([[End-of-Transmission character]], EOT, {{tt|^D}}), used to end text input on Unix or to exit a [[Unix shell]]. These input uses usually have little to do with what the same characters do when they appear in text being output.
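The caret names in the list above ({{tt|^@}}, {{tt|^D}}, {{tt|^[}} and so on) follow a simple rule: the character after the caret is the control code with bit 6 flipped. The following Python sketch illustrates that rule and also counts the code points that Unicode places in its control category. The function name <code>caret</code> is hypothetical, and the {{tt|^?}} name for DEL is a common convention rather than something stated in the list.

```python
import unicodedata


def caret(code: int) -> str:
    """Return caret notation for an ASCII control code (illustrative helper)."""
    if not (code < 0x20 or code == 0x7F):
        raise ValueError("not an ASCII control code")
    # XOR with 0x40 maps 0x00..0x1F to '@'..'_' and 0x7F to '?'.
    return "^" + chr(code ^ 0x40)


for code in (0x00, 0x03, 0x04, 0x1A, 0x1B, 0x7F):
    print(f"0x{code:02X} -> {caret(code)}")

# Every code point whose Unicode general category is "Cc" (control).
controls = [cp for cp in range(0x110000)
            if unicodedata.category(chr(cp)) == "Cc"]
print(len(controls))
```

Under these assumptions the scan finds the C0 set, DEL, and the C1 set, which matches the 65 control codes the text says were carried over to Unicode.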