Control character
==The design purpose==
{{Unreferenced section|date=February 2012}}
The control characters were designed to fall into a few groups: printing and display control, data structuring, transmission control, and miscellaneous.

===Printing and display control===
Printing control characters were first used to control the physical mechanism of printers, the earliest output device. An early example of this idea was the use of [[Baudot code#Details|Figures (FIGS)]] and [[Baudot code#Details|Letters (LTRS)]] in [[Baudot code]] to shift between two code pages. A later, but still early, example was the [[Out-of-band data|out-of-band]] [[ASA carriage control characters]]. Later, control characters were integrated into the stream of data to be printed.

The carriage return character (CR), when sent to such a device, causes it to move the printing position to the edge of the paper at which writing begins (it may, or may not, also move the printing position to the next line). The line feed character (LF/NL) causes the device to move the printing position to the next line. Depending on the device and its configuration, it may (or may not) also move the printing position to the start of the next line (which would be the leftmost position for [[left-to-right]] scripts, such as the alphabets used for Western languages, and the rightmost position for [[right-to-left]] scripts such as the Hebrew and Arabic alphabets).

The vertical and horizontal tab characters (VT and HT/TAB) cause the output device to move the printing position to the next tab stop in the direction of reading. The form feed character (FF/NP) starts a new sheet of paper, and may or may not move to the start of the first line.

The backspace character (BS) moves the printing position one character space backwards. On printers, including [[Computer terminal#Hard-copy terminals|hard-copy terminals]], this is most often used so the printer can overprint characters to make other, not normally available, characters.
On [[Computer terminal#VDUs|video terminals]] and other electronic output devices, there are often software (or hardware) configuration choices that allow a destructive backspace (e.g., a BS, SP, BS sequence), which erases, or a non-destructive one, which does not.

The shift in and shift out characters (SI and SO) selected alternate character sets, fonts, underlining, or other printing modes. Escape sequences were often used to do the same thing.

With the advent of [[computer terminal]]s that did not physically print on paper and so offered more flexibility regarding screen placement, erasure, and so forth, printing control codes were adapted. Form feeds, for example, usually cleared the screen, there being no new paper page to move to. More complex escape sequences were developed to take advantage of the flexibility of the new terminals, and indeed of newer printers. The concept of a control character had always been somewhat limiting, and was extremely so when used with new, much more flexible, hardware. Control sequences (sometimes implemented as escape sequences) could match the new flexibility and power and became the standard method. However, there were, and remain, a large variety of standard sequences to choose from.

===Data structuring===
The separators (File, Group, Record, and Unit: FS, GS, RS and US) were made to structure data, usually on a tape, in order to simulate [[punched card]]s. End of medium (EM) warns that the tape (or other recording medium) is ending. While many systems use CR/LF and TAB for structuring data, it is possible to encounter the separator control characters in data that needs to be structured. The separator control characters are not overloaded; there is no general use of them except to separate data into structured groupings. Their numeric values (28 to 31) immediately precede that of the space character (32), which can be considered a member of the group, as a word separator.
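A minimal Python sketch of how the unit and record separators could structure a flat stream is shown here. The two-level layout (fields joined by US, records joined by RS) and the helper names <code>encode_records</code> and <code>decode_records</code> are assumptions made for illustration, not a standardized format.

```python
# ASCII information separators, from the highest to the lowest level.
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"


def encode_records(records):
    """Join fields with US and records with RS (hypothetical flat layout)."""
    return RS.join(US.join(fields) for fields in records)


def decode_records(data):
    """Split the flat string back into a list of field lists."""
    return [record.split(US) for record in data.split(RS)]


# Commas, tabs and newlines inside fields need no quoting, because the
# separators have no other use.
records = [["Ada", "London, UK"], ["Alan", "line one\nline two"]]
assert decode_records(encode_records(records)) == records

print([ord(c) for c in (FS, GS, RS, US, " ")])  # → [28, 29, 30, 31, 32]
```

The sketch breaks down if a field itself contains a separator character, so real data would need to be checked for that first.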
For example, the RS separator is used by {{IETF RFC|7464}} (JSON Text Sequences) to encode a sequence of JSON elements. Each sequence item starts with an RS character and ends with a line feed. This allows open-ended JSON sequences to be serialized. It is one of the [[JSON streaming]] protocols.

===Transmission control===
The transmission control characters were intended to structure a data stream, and to manage re-transmission or graceful failure, as needed, in the face of transmission errors.

The start of heading (SOH) character was to mark a non-data section of a data stream: the part of a stream containing addresses and other housekeeping data. The start of text character (STX) marked the end of the header, and the start of the textual part of a stream. The end of text character (ETX) marked the end of the data of a message. A widely used convention is to make the two characters preceding ETX a checksum or [[Cyclic redundancy check|CRC]] for error-detection purposes. The end of transmission block character (ETB) was used to indicate the end of a block of data, where data was divided into such blocks for transmission purposes.

The escape character ([[escape character|ESC]]) was intended to "quote" the next character: if that character was another control character, it would be printed instead of performing its control function. It is almost never used for this purpose today. Various printable characters are used as visible "[[escape character]]s", depending on context.

The substitute character ([[substitute character|SUB]]) was intended to request a translation of the next character from a printable character to another value, usually by setting bit 5 to zero. This is handy because some media (such as sheets of paper produced by typewriters) can transmit only printable characters.
However, on MS-DOS systems with files opened in text mode, "end of text" or "end of file" is marked by this [[Ctrl-Z]] character, instead of the [[Ctrl-C]] or [[Ctrl-D]], which are common on other operating systems.

The cancel character ([[cancel character|CAN]]) signaled that the previous element should be discarded. The negative acknowledge character ([[NAK]]) is a definite flag, usually indicating that there was a problem with reception and, often, that the current element should be sent again. The acknowledge character ([[acknowledge character|ACK]]) is normally used as a flag to indicate that no problem was detected with the current element.

When a transmission medium is half duplex (that is, it can transmit in only one direction at a time), there is usually a master station that can transmit at any time, and one or more slave stations that transmit when they have permission. The enquire character ([[Enquiry character|ENQ]]) is generally used by a master station to ask a slave station to send its next message. A slave station indicates that it has completed its transmission by sending the end of transmission character ([[end-of-transmission character|EOT]]).

The device control codes (DC1 to DC4) were originally generic, to be implemented as necessary by each device. However, a universal need in data transmission is to request the sender to stop transmitting when a receiver is temporarily unable to accept any more data. [[Digital Equipment Corporation]] invented a convention which used code 19 (the device control 3 character ([[Software flow control|DC3]]), also known as control-S, or [[XOFF]]) to "S"top transmission, and code 17 (the device control 1 character ([[Software flow control|DC1]]), a.k.a. control-Q, or [[XON]]) to start transmission. It has become so widely used that many people do not realize it is not part of official ASCII. This technique, however implemented, avoids additional wires in the data cable devoted only to transmission management, which saves money.
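The XON/XOFF convention can be sketched as a small Python simulation. The <code>Receiver</code> class, its high and low water marks, and the <code>transfer</code> loop are hypothetical names and parameters chosen for illustration; real devices and drivers differ in when they emit DC3 and DC1.

```python
XON, XOFF = 0x11, 0x13  # DC1 (17, control-Q) and DC3 (19, control-S)


class Receiver:
    """Hypothetical receiver with a small buffer and high/low water marks."""

    def __init__(self, high=4, low=1):
        self.buffer = []
        self.high, self.low = high, low
        self.paused = False  # True once XOFF has been sent

    def accept(self, byte):
        """Store one byte; return XOFF if the sender should stop, else None."""
        self.buffer.append(byte)
        if not self.paused and len(self.buffer) >= self.high:
            self.paused = True
            return XOFF
        return None

    def consume(self):
        """Process one buffered byte; return XON once there is room again."""
        byte = self.buffer.pop(0)
        reply = None
        if self.paused and len(self.buffer) <= self.low:
            self.paused = False
            reply = XON
        return byte, reply


def transfer(data, receiver):
    """Send data byte by byte, obeying XOFF/XON from the receiver."""
    pending, delivered, signals = list(data), [], []
    sending = True
    while pending or receiver.buffer:
        if sending and pending:
            if receiver.accept(pending.pop(0)) == XOFF:
                sending = False
                signals.append(XOFF)
        else:
            # Sender is paused or finished: let the receiver drain a byte.
            byte, reply = receiver.consume()
            delivered.append(byte)
            if reply == XON:
                sending = True
                signals.append(XON)
    return signals, bytes(delivered)


signals, received = transfer(b"ABCDEFGH", Receiver(high=4, low=1))
assert received == b"ABCDEFGH"
print(signals)  # → [19, 17, 19, 17]
```

Both control codes travel in the same channel as the data, which is why no extra wires are needed, and also why the payload itself must not contain bytes 17 or 19 unless they are escaped in some way.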
A sensible protocol for the use of such flow control signals is needed, however, to avoid potential deadlock conditions.

The data link escape character ([[C0 and C1 control codes|DLE]]) was intended to be a signal to the other end of a data link that the following character is a control character such as STX or ETX. For example, a packet may be structured in the following way: ([[C0 and C1 control codes|DLE]]) <STX> <PAYLOAD> ([[C0 and C1 control codes|DLE]]) <ETX>.

===Miscellaneous codes===
Code 7 ([[bell character|BEL]]) is intended to cause an audible signal in the receiving terminal.<ref>{{cite IETF |rfc=20 |title=ASCII format for Network Interchange |date=October 1969 |access-date=2013-11-03}} An old RFC, which explains the structure and meaning of the control characters in chapters 4.1 and 5.2</ref>

Many of the ASCII control characters were designed for devices of the time that are not often seen today. For example, code 22, "synchronous idle" ([[C0 and C1 control codes|SYN]]), was originally sent by synchronous modems (which have to send data constantly) when there was no actual data to send. (Modern systems typically use a start bit to announce the beginning of a transmitted word; this is a feature of ''asynchronous'' communication. ''Synchronous'' communication links were more often seen with mainframes, where they were typically run over corporate leased lines to connect a mainframe to another mainframe or perhaps a minicomputer.)

Code 0 (ASCII code name [[null character|NUL]]) is a special case. On paper tape, it corresponds to a position with no holes punched. It is convenient to treat this as a ''fill'' character with no meaning otherwise. Since the position of a NUL character has no holes punched, it can be replaced with any other character at a later time, so it was typically used to reserve space, either for correcting errors or for inserting information that would be available at a later time or in another place.
In computing, it is often used for padding in [[Record-oriented filesystem|fixed length records]]; to [[Null-terminated string|mark the end of a string]]; and formerly to [[Output padding|give printing devices enough time to execute a control function]].

Code 127 ([[Delete character|DEL]], a.k.a. "rubout") is likewise a special case. Its 7-bit code is ''all-bits-on'' in binary, which essentially erased a character cell on a [[paper tape]] when overpunched. Paper tape was a common storage medium when ASCII was developed, with a computing history dating back to WWII code breaking equipment at [[Biuro Szyfrów]]. Paper tape became obsolete in the 1970s, so this aspect of ASCII rarely saw any use after that. Some systems (such as the original Apple computers) converted it to a backspace. But because its code is in the range occupied by other printable characters, and because it had no official assigned glyph, many computer equipment vendors used it as an additional printable character (often an all-black [[box character]] useful for erasing text by overprinting with ink).

Non-erasable [[programmable ROM]]s are typically implemented as arrays of fusible elements, each representing a [[bit]], which can only be switched one way, usually from one to zero. In such PROMs, the DEL and NUL characters can be used in the same way that they were used on punched tape: one to reserve meaningless fill bytes that can be written later, and the other to convert written bytes to meaningless fill bytes. For PROMs that switch one to zero, the roles of NUL and DEL are reversed. DEL also works only with 7-bit characters, which are rarely used today; for 8-bit content, the character code 255, commonly defined as a nonbreaking space character, can be used instead of DEL.

Many [[file system]]s do not allow control characters in [[filename]]s, as they may have reserved functions.
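The fill and erase roles of NUL and DEL on one-way media, described above for paper tape and PROMs, can be illustrated with bitwise operations. This is a minimal sketch: the function names <code>overpunch</code> and <code>burn</code> are hypothetical, and the model assumes 7-bit cells in which holes can only be added (tape) or bits can only be cleared (PROM).

```python
NUL, DEL = 0x00, 0x7F


def overpunch(existing, new):
    """Paper tape model: punching can only add holes (bitwise OR)."""
    return existing | new


def burn(existing, new):
    """Model of a PROM whose bits can only go from one to zero (bitwise AND)."""
    return existing & new


# A NUL cell has no holes, so any 7-bit character can be punched into it later.
assert overpunch(NUL, ord("A")) == ord("A")
# Punching all seven holes turns any 7-bit character into DEL ("rubout").
assert all(overpunch(c, DEL) == DEL for c in range(128))

# In the one-to-zero PROM model the roles swap: an all-ones cell (DEL for
# 7-bit data) can still take any character, and writing NUL erases it.
assert burn(DEL, ord("A")) == ord("A")
assert all(burn(c, NUL) == NUL for c in range(128))
```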