Digraphs and trigraphs (programming)
{{Short description|Two or three characters, treated as one}}
{{Other uses|Digraph (disambiguation){{!}}Digraph|Trigraph (disambiguation){{!}}Trigraph}}
{{More citations needed|date=September 2008}}
{{Use dmy dates|date=May 2019|cs1-dates=y}}

In [[computer programming]], '''digraphs and trigraphs''' are sequences of two and three [[character (computing)|character]]s, respectively, that appear in [[source code]] and, according to a [[programming language]]'s specification, should be treated as if they were single characters.

Various reasons exist for using digraphs and trigraphs: keyboards may not have keys to cover the entire [[character set]] of the language, input of special characters may be difficult, [[text editor]]s may reserve some characters for special use and so on. Trigraphs might also be used for some [[EBCDIC]] [[code page]]s that lack characters such as <code>{</code> and <code>}</code>.

== History ==
The basic character set of the [[C programming language]] is a subset of the [[ASCII]] character set that includes nine characters which lie outside the [[ISO 646]] invariant character set. This can pose a problem for writing [[source code]] when the [[encoding]] (and possibly [[computer keyboard|keyboard]]) being used does not support any of these nine characters. The [[ANSI C]] committee invented trigraphs as a way of entering source code using keyboards that support any version of the ISO 646 character set.<ref>{{Cite book |url=https://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf |title=Rationale for International Standard—Programming Languages—C |version=Revision 5.10 |pages=20–21 }}</ref> With the widespread adoption of [[ASCII]] and [[Unicode]]/[[UTF-8]], trigraph use is limited today, and trigraph support has been removed from C as of C23.<ref>{{cite web|url=https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2940.pdf|title=Removing trigraphs??!}}</ref>

== {{anchor|TRIGRAPH.EXE}}Implementations ==
Trigraphs are not commonly encountered outside [[compiler]] [[test suite]]s.<ref name="C"/> Some compilers support an option to turn recognition of trigraphs off, or disable trigraphs by default and require an option to turn them on. Some can issue warnings when they encounter trigraphs in source files. [[Borland]] supplied a separate program, the trigraph preprocessor (<code>TRIGRAPH.EXE</code>), to be used only when trigraph processing is desired (the rationale was to maximise speed of compilation).

== Language support ==
Different systems define different sets of digraphs and trigraphs, as described below.

=== ALGOL ===
Early versions of [[ALGOL]] predated the standardized ASCII and EBCDIC character sets, and were typically implemented using a manufacturer-specific [[six-bit character code]]. A number of ALGOL operations either lacked [[codepoint]]s in the available character set or were not supported by peripherals, leading to a number of substitutions including <code>:=</code> for <code>←</code> (assignment) and <code>>=</code> for <code>≥</code> (greater than or equal).

=== Pascal ===
The [[Pascal (programming language)|Pascal]] programming language supports digraphs <code>(.</code>, <code>.)</code>, <code>(*</code> and <code>*)</code> for <code>[</code>, <code>]</code>, <code>{</code> and <code>}</code> respectively. Unlike all other cases mentioned here, <code>(*</code> and <code>*)</code> were and still are in wide use. However, many compilers treat them as a different type of commenting block rather than as actual digraphs, that is, a comment started with <code>(*</code> cannot be closed with <code>}</code> and vice versa.

=== J ===
The [[J programming language]] is a descendant of [[APL (programming language)|APL]] but uses the ASCII character set rather than [[APL symbol]]s.
Because the printable range of ASCII is smaller than APL's specialized set of symbols, <code>.</code> (dot) and <code>:</code> (colon) characters are used to inflect ASCII symbols, effectively interpreting unigraphs, digraphs or rarely trigraphs as standalone "symbols".<ref name="Hui_2015"/> Unlike the use of digraphs and trigraphs in C and [[C++]], there are no single-character equivalents to these in J.

=== C ===
{{See also|C alternative tokens}}
{|class="wikitable floatright" style="margin-left: 1.5em;"
|-
! Trigraph !! Equivalent
|-
| <code>??=</code> || <code>#</code>
|-
| <code>??/</code> || <code>\</code>
|-
| <code>??'</code> || <code>^</code>
|-
| <code>??(</code> || <code><nowiki>[</nowiki></code>
|-
| <code>??)</code> || <code><nowiki>]</nowiki></code>
|-
| <code>??!</code> || <code><nowiki>|</nowiki></code>
|-
| <code>??<</code> || <code>{</code>
|-
| <code>??></code> || <code>}</code>
|-
| <code>??-</code> || <code>~</code>
|}
The [[C preprocessor]] (used for C and with slight differences in [[C++]]; see [[#C++|below]]) replaces all occurrences of the nine trigraph sequences in this table by their single-character equivalents before any other processing (until [[C23 (C standard revision)|C23]]<ref>{{cite web|url=https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2940.pdf|title=Removing trigraphs??!}}</ref>).<ref name="BSI_2003_C"/><ref name="Rationale_2003_C"/>

A programmer may want to place two question marks together yet not have the compiler treat them as introducing a trigraph. The C grammar does not permit two consecutive <code>?</code> tokens, so the only places in a C file where two question marks in a row may be used are in multi-character constants, [[string literal]]s, and comments.
This is particularly a problem for the [[classic Mac OS]], where the constant <code>'????'</code> may be used as a file type or creator.<ref>{{Cite web |title=File Basics |url=https://whitefiles.org/dta/pgs/f01.htm |access-date=2024-05-08 |website=whitefiles.org}}</ref> To safely place two consecutive question marks within a string literal, the programmer can use string concatenation <code>"...?""?..."</code> or an [[escape sequence]] <code>"...?\?..."</code>.

<code>???</code> is not itself a trigraph sequence, but when followed by a character such as <code>-</code> it will be interpreted as <code>?</code> + <code>??-</code>, as in the example below which has 16 <code>?</code>s before the <code>/</code>.

The <code>??/</code> trigraph can be used to introduce an escaped newline for line splicing; this must be taken into account for correct and efficient handling of trigraphs within the preprocessor. It can also cause surprises, particularly within comments. For example:
{{sxhl|2=c|1=<nowiki/>
// Will the next line be executed????????????????/
a++;
}}
which is a single logical comment line (used in C++ and [[C99]]), and
{{sxhl|2=c|1=<nowiki/>
/??/
* A comment *??/
/
}}
which is a correctly formed block comment. The concept can be used to check for trigraphs as in the following C99 example, where only one return statement will be executed.
{{sxhl|2=c|1=<nowiki/>
int trigraphsavailable() // returns 0 or 1; language standard C99 or later
{
    // are trigraphs available??/
    return 0;
    return 1;
}
}}

{|class="wikitable floatright" style="margin-left: 1.5em;"
|+ Alternative digraphs introduced in the C standard in 1994
|-
! Digraph !! Equivalent
|-
| <code><:</code> || <code>[</code>
|-
| <code>:></code> || <code>]</code>
|-
| <code><%</code> || <code>{</code>
|-
| <code>%></code> || <code>}</code>
|-
| <code>%:</code> || <code>#</code>
|}
In 1994, a normative amendment to the C standard, [[ANSI C#C95|C95]],<ref>{{Cite ISO standard|csnumber=23909|title=ISO/IEC 9899:1990/Amd 1:1995 - Programming languages — C — Amendment 1: C Integrity|date=March 1995|access-date=30 May 2024}}</ref><ref>{{Cite web| url=http://www.lysator.liu.se/c/na1.html | title=A brief description of Normative Addendum 1 | author=Clive D.W. Feather | date=2010-09-12}}</ref> included in C99, supplied digraphs as more readable alternatives to five of the trigraphs.

Unlike trigraphs, digraphs are handled during [[Tokenization (lexical analysis)|tokenization]], and any digraph must always represent a full token by itself, or compose the token <code>%:%:</code> replacing the preprocessor concatenation token <code>##</code>. If a digraph sequence occurs inside another token, for example a quoted string, or a character constant, it will not be replaced.

=== C++ ===
{{See also|C alternative tokens}}
[[C++]] (through [[C++14]], see [[#C++REMOVAL|below]]) behaves like C, including the C99 additions.<ref name="Stroustrup_1994_DEC"/>

As a note, <code>%:%:</code> is treated as a single token, rather than two occurrences of <code>%:</code>. In the sequence <code><::</code> if the subsequent character is neither <code>:</code> nor <code>></code>, the <code><</code> is treated as a preprocessing token by itself and not as the first character of the alternative token <code><:</code>. This is done so certain uses of templates are not broken by the substitution.
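
The digraph tokenization rules shared by C and C++ can be illustrated with a minimal sketch. It assumes a compiler that accepts the 1994 digraphs (C95 or later); the names <code>DIGRAPH_COUNT</code>, <code>digraph_sum</code>, <code>digraph_in_string</code> and <code>digraph_self_check</code> are invented for this illustration and come from no standard or library. The first function spells its brackets and braces with digraphs, while the second shows that the same two characters inside a string literal are left as written.

```c
#include <assert.h>
#include <string.h>

/* "%:" is the digraph spelling of "#", so this line is an ordinary #define.
   DIGRAPH_COUNT is a made-up name used only in this sketch. */
%:define DIGRAPH_COUNT 3

/* Every bracket and brace in this function is written as a digraph:
   <: :> stand for [ ] and <% %> stand for { }. */
int digraph_sum(void)
<%
    int values<:DIGRAPH_COUNT:> = <% 1, 2, 3 %>;
    return values<:0:> + values<:1:> + values<:2:>;
%>

/* Inside a string literal, "<:" is part of a larger token and is not replaced. */
const char *digraph_in_string(void)
{
    return "<:";
}

/* Hypothetical self-check that ties the two observations together. */
void digraph_self_check(void)
{
    assert(digraph_sum() == 6);
    assert(strcmp(digraph_in_string(), "<:") == 0);
    assert(strlen(digraph_in_string()) == 2);
}
```

Calling <code>digraph_self_check()</code> from a test harness should pass silently if the compiler treats the digraphs as tokens; a compiler running in a strict pre-1994 mode may instead reject the digraph spellings.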
The C++ Standard makes this comment with regard to the term "digraph":<ref name="OpenSTD"/>
{{Quote|text=The term "digraph" (token consisting of two characters) is not perfectly descriptive, since one of the alternative preprocessing-tokens is <code>%:%:</code> and of course several primary tokens contain two characters. Nonetheless, those alternative tokens that aren't lexical keywords are colloquially known as "digraphs".}}

{{Anchor|C++REMOVAL}}Trigraphs were proposed for deprecation in [[C++0x]], which was released as [[C++11]].<ref name="N2837"/> This was opposed by [[IBM]], speaking on behalf of itself and other users of C++,<ref name="N2910"/> and as a result trigraphs were retained in C++11. Trigraphs were then proposed again for removal (not only deprecation) in [[C++17]].<ref name="N3981"/> This passed a committee vote, and trigraphs (but not the additional tokens) were removed in C++17 despite the opposition from IBM.<ref name="N4210"/> Existing code that uses trigraphs can be supported by translating from the source files (parsing trigraphs) to the basic source character set that does not include trigraphs.<ref name="N3981"/>

=== {{anchor|TIO}}RPL ===
[[Hewlett-Packard]] calculators supporting the [[Reverse Polish LISP|RPL]] language and input method provide support for a large number of trigraphs (also called ''TIO codes'') to reliably transcribe non-seven-bit ASCII characters of the [[RPL character set|calculators' extended character set]]<ref name="HP82240B_1989"/><ref name="HP48G_UG"/><ref name="HP50G_AUR"/> on foreign platforms, and to ease keyboard input without using the {{Mono|CHARS}} application.<ref name="HP-TIO"/><ref name="Heinz_2005"/><ref name="HP48G_UG"/><ref name="HP50G_AUR"/> The first character of all TIO codes is a <code>\</code>, followed by two other ASCII characters vaguely resembling the glyph to be substituted.<ref name="HP-TIO"/><ref name="Heinz_2005"/><ref name="HP48G_UG"/><ref name="HP50G_AUR"/><ref name="Finseth_2012"/> All other characters can be entered using the special <code>\nnn</code> TIO code syntax with nnn being a three-digit [[decimal number]] (with [[leading zero]]s if necessary) of the corresponding [[code point]] (thereby formally representing a ''[[tetragraph]]'').<ref name="HP-TIO"/><ref name="HP48G_UG"/><ref name="HP50G_AUR"/>

== Application support ==

=== Vim ===
The [[Vim (text editor)|Vim]] text editor supports digraphs for actual entry of text characters, following {{IETF_RFC|1345}}. The entry of digraphs is [[Key binding|bound]] to {{keypress|Ctrl|K}} by default.<ref name="vim"/> The list of all possible digraphs in [[Vim (text editor)|Vim]] can be displayed by typing {{kbd|:dig}}.

=== GNU Screen ===
[[GNU Screen]] has a digraph command, bound to {{keypress|Ctrl|A}} {{keypress|Ctrl|V}}<!-- ^A ^V --> by default.<ref name="Screen"/>

=== Lotus ===
[[Lotus 1-2-3]] for [[DOS]] uses {{keypress|Alt|F1}} as [[compose key]] to allow easier input of many special characters of the [[Lotus International Character Set]] (LICS)<ref name="HP_1991_95LXUG"/> and [[Lotus Multi-Byte Character Set]] (LMBCS).

== See also ==
{{Portal|Computer programming}}
* [[Compose key]]
* [[List of XML and HTML character entity references]]
* [[Escape sequence]]
* [[Escape sequences in C]]
* [[C alternative tokens]]

== References ==
{{Reflist|refs=
<ref name="C">{{Cite book |title=The New C Standard: An Economic and Cultural Commentary |author-first=Derek M.
|author-last=Jones |chapter=Sentence 117}}</ref> <ref name="Hui_2015">{{cite web |author-last=Hui |author-first=Roger |title=Vocabulary |url=http://www.jsoftware.com/help/dictionary/vocabul.htm |website=jsoftware.com |access-date=2015-04-16 |archive-url=https://web.archive.org/web/20190402203251/http://www.jsoftware.com/help/dictionary/vocabul.htm |archive-date=2019-04-02}}</ref> <ref name="BSI_2003_C">{{cite book |author=British Standards Institute |author-link=British Standards Institute |title=The C Standard - Incorporating TC1 - BS ISO/IEC 9899:1999 |publisher=[[John Wiley & Sons]] |date=2003 |isbn=0-470-84573-2}}</ref> <ref name="Rationale_2003_C">{{cite web |title=Rationale for International Standard - Programming Languages - C |version=5.10 |date=April 2003 |url=http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf |access-date=2010-10-17 |url-status=live |archive-url=https://web.archive.org/web/20160606072228/http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf |archive-date=2016-06-06}}</ref> <ref name="Stroustrup_1994_DEC">{{cite book |title=Design and Evolution of C++ |author-first=Bjarne |author-last=Stroustrup |author-link=Bjarne Stroustrup |publisher=[[Addison-Wesley Publishing Company]]<!-- [[AT&T Bell Labs]] --> |edition=1 |date=1994-03-29 |isbn=0-201-54330-3}}</ref> <ref name="OpenSTD">{{cite web |editor-first=Stefanus |editor-last=Du Toit |title=Working Draft, Standard for Programming Language C++ |date=2012-01-16 |id=N3337 |url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf |access-date=2019-05-08 |url-status=live |archive-url=https://web.archive.org/web/20190508044758/http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf |archive-date=2019-05-08}}</ref> <ref name="N2837">{{cite web |url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2837.pdf |title=C++0X, CD 1, National Body Comments |id=SC22/WG21 N2837 comment UK 11 |date=2009-01-30 |access-date=2019-05-12 |url-status=live 
|archive-url=https://web.archive.org/web/20170801160118/http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2837.pdf |archive-date=2017-08-01}}</ref> <ref name="N2910">{{cite web |url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2910.pdf |title=Comment on Proposed Trigraph Deprecation |author-first1=Michael |author-last1=Wong |author-first2=Hubert |author-last2=Tong |author-first3=Robert |author-last3=Klarer |author-first4=Ian |author-last4=McIntosh |author-first5=Raymond |author-last5=Mak |author-first6=Christopher |author-last6=Cambly |author-first7=Alain |author-last7=LaBonté |id=N2910 |date=2009-06-19 |access-date=2019-05-12 |url-status=live |archive-url=https://web.archive.org/web/20170801162124/http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2910.pdf |archive-date=2017-08-01}}</ref> <ref name="N3981">{{cite web |url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3981.html |title=Removing trigraphs??! |id=N3981 |author-first=Richard |author-last=Smith |date=2014-05-06 |access-date=2019-05-12 |url-status=live |archive-url=https://web.archive.org/web/20180709123422/http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n3981.html |archive-date=2018-07-09}}</ref> <ref name="N4210">{{cite web |url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4210.pdf |title=IBM comment on preparing for a Trigraph-adverse future in C++17 |id=IBM paper N4210 |date=2014-10-10 |author-first1=Michael |author-last1=Wong |author-first2=Hubert |author-last2=Tong |author-first3=Rajan |author-last3=Bhakta |author-first4=Derek |author-last4=Inglis |access-date=2019-05-12 |url-status=live |archive-url=https://web.archive.org/web/20180911053619/http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4210.pdf |archive-date=2018-09-11}}</ref> <ref name="HP82240B_1989">{{cite book |title=HP 82240B Infrared Printer |publisher=[[Hewlett-Packard]] |date=August 1989 |edition=1 |id=HP reorder number 82240-90014 |location=Corvallis, OR, USA }}</ref> 
<ref name="HP-TIO">{{cite web |url=http://holyjoe.org/hp/tiotable.htm |title=HP RPL TIO Table |work=holyjoe.org |access-date=2015-01-23 |url-status=live |archive-url=https://web.archive.org/web/20160523164117/http://holyjoe.org/hp/tiotable.htm |archive-date=2016-05-23}}</ref> <ref name="Heinz_2005">{{cite web |author-first=Michael W. |author-last=Heinz, Sr. |title=HP-ASCII and Trigraphs |date=2005 |url=http://hpconnect.sourceforge.net/trigraphs.html |access-date=2016-08-02 |url-status=live |archive-url=https://web.archive.org/web/20160802011132/http://hpconnect.sourceforge.net/trigraphs.html |archive-date=2016-08-02}}</ref> <ref name="HP48G_UG">{{cite book |title=HP 48G Series – User's Guide (UG) |publisher=[[Hewlett-Packard]] |edition=8 |date=December 1994 |id=HP 00048-90126, (00048-90104) |orig-year=1993<!-- edition 1 (1993-05) --> |pages=2-5, 27-16 |url=http://www.hpcalc.org/details.php?id=3937 |access-date=2015-09-06 |url-status=live |archive-url=https://web.archive.org/web/20160806145719/http://www.hpcalc.org/details.php?id=3937 |archive-date=2016-08-06}} [http://www.hpcalc.org/hp48/docs/misc/hp48gug.zip<!-- https://web.archive.org/web/20160528071119/http://www.hpcalc.org/hp48/docs/misc/hp48gug.zip -->]</ref> <ref name="HP50G_AUR">{{cite book |title=HP 50g / 49g+ / 48gII graphing calculator advanced user's reference manual (AUR) |publisher=[[Hewlett-Packard]] |edition=2 |date=2009-07-14 |orig-year=2005<!-- first published: Edition 1 (2005–09) --> |id=HP F2228-90010 |pages=J-1, J-2 |url=http://www.hpcalc.org/details.php?id=7141 |access-date=2015-10-10 |url-status=live |archive-url=https://web.archive.org/web/20180708015449/https://www.hpcalc.org/details/7141 |archive-date=2018-07-08}} [https://web.archive.org/web/20190430072646/http://holyjoe.net/hp/HP_50g_AUR_v2_English_searchable.pdf<!-- http://holyjoe.net/hp/HP_50g_AUR_v2_English_searchable.pdf --> Searchable PDF]</ref> <ref name="Finseth_2012">{{cite web |title=chars |author-first=Craig A. 
|author-last=Finseth |date=2012-02-25 |url=https://www.finseth.com/hpdata/chars.php |access-date=2017-12-21 |url-status=live |archive-url=https://web.archive.org/web/20171221075534/https://www.finseth.com/hpdata/chars.php |archive-date=2017-12-21}}</ref>
<ref name="vim">{{cite web |url=http://vimdoc.sourceforge.net/htmldoc/digraph.html#digraphs-default |title=Vim documentation: *digraphs-default* |author=<!--Staff writer(s); no by-line.--> |date=2011-01-15 |access-date=2019-05-12 |url-status=live |archive-url=https://web.archive.org/web/20181220141938/http://vimdoc.sourceforge.net/htmldoc/digraph.html |archive-date=2018-12-20}}</ref>
<ref name="HP_1991_95LXUG">{{cite book |title=HP 95LX User's Guide |publisher=[[Hewlett-Packard Company]], Corvallis Division |location=Corvallis, OR, USA |edition=2 |chapter=Appendix F |date=June 1991 |orig-year=March 1991 |id=F0001-90003 |url=http://www.retroisle.com/others/hp95lx/OriginalDocs/95LX_UsersGuide_F1000-90001_826pages_Jun91.pdf |access-date=2016-11-27 |url-status=live |archive-url=https://web.archive.org/web/20161128202642/http://www.retroisle.com/others/hp95lx/OriginalDocs/95LX_UsersGuide_F1000-90001_826pages_Jun91.pdf |archive-date=2016-11-28}}</ref>
<ref name="Screen">{{cite web |title=Digraph - Screen User's Manual |url=https://www.gnu.org/software/screen/manual/html_node/Digraph.html |access-date=2019-05-12 |url-status=live |archive-url=https://web.archive.org/web/20181231060134/https://www.gnu.org/software/screen/manual/html_node/Digraph.html |archive-date=2018-12-31}}</ref>
}}

== External links ==
* {{IETF_RFC|1345}}

[[Category:C (programming language)]]
[[Category:Character encoding]]
[[Category:Input/output]]