String literal
== Delimiter collision ==
{{main|Delimiter collision}}

When using quoting, if one wishes to represent the delimiter itself in a string literal, one runs into the problem of ''[[delimiter collision]].'' For example, if the delimiter is a double quote, one cannot simply represent a double quote itself by the literal <code>"""</code> as the second quote is interpreted as the end of the string literal, not as the value of the string, and similarly one cannot write <code>"This is "in quotes", but invalid."</code> as the middle quoted portion is instead interpreted as outside of quotes. There are various solutions, the most general-purpose of which is using escape sequences, such as <code>"\""</code> or <code>"This is \"in quotes\" and properly escaped."</code>, but there are many other solutions.

Paired quotes, such as braces in Tcl, allow nested strings, such as <code>{foo {bar} zork}</code> but do not otherwise solve the problem of delimiter collision, since an unbalanced closing delimiter cannot simply be included, as in <code>{}}</code>.

===Doubling up===
A number of languages, including [[Pascal (programming language)|Pascal]], [[BASIC]], [[DIGITAL Command Language|DCL]], [[Smalltalk]], [[SQL]], [[J (programming language)|J]], and [[Fortran]], avoid delimiter collision by ''doubling up'' on the quotation marks that are intended to be part of the string literal itself:

<syntaxhighlight lang="pascal">
'This Pascal string''contains two apostrophes'''
</syntaxhighlight>

<syntaxhighlight lang="qbasic">
"I said, ""Can you hear me?"""
</syntaxhighlight>

===Dual quoting===
Some languages, such as [[Fortran]], [[Modula-2]], [[JavaScript]], [[Python (programming language)|Python]], and [[PHP]], allow more than one quoting delimiter; in the case of two possible delimiters, this is known as '''dual quoting'''. Typically, this consists of allowing the programmer to use either single quotations or double quotations interchangeably; each literal must use one or the other.
<syntaxhighlight lang="python">
"This is John's apple."
'I said, "Can you hear me?"'
</syntaxhighlight>

This does not allow having a single literal with both delimiters in it, however. This can be worked around by using several literals and using [[string concatenation]]:

<syntaxhighlight lang="python">
'I said, "This is ' + "John's" + ' apple."'
</syntaxhighlight>

Python has [[string literal concatenation]]: consecutive string literals are concatenated even without an operator, so this can be reduced to:

<syntaxhighlight lang="python">
'I said, "This is '"John's"' apple."'
</syntaxhighlight>

===Delimiter quoting===
[[C++11]] introduced so-called ''raw string literals''. They consist, essentially, of
:<code>R" ''end-of-string-id'' ( ''content'' ) ''end-of-string-id'' "</code>,
that is, after <code>R"</code> the programmer can enter up to 16 characters except whitespace characters, parentheses, or backslash, which form the ''end-of-string-id'' (its purpose is to be repeated to signal the end of the string, ''eos id'' for short), then an opening parenthesis (to denote the end of the eos id) is required. Then follows the actual content of the literal: any sequence of characters may be used (except that it may not contain a closing parenthesis followed by the eos id followed by a quote), and finally, to terminate the string, a closing parenthesis, the eos id, and a quote are required.<br/>
The simplest case of such a literal is with empty content and empty eos id: <code>R"()"</code>.<br/>
The eos id may itself contain quotes: {{code|2=cpp|1=R""(I asked, "Can you hear me?")""}} is a valid literal (the eos id is <code>"</code> here).<br/>
Escape sequences do not work in raw string literals.

[[D (programming language)|D]] supports a few quoting delimiters, with such strings starting with <code>q"</code> plus an opening delimiter and ending with the respective closing delimiter and <code>"</code>.
Available delimiter pairs are <code>()</code>, <code><></code>, <code>{}</code>, and <code>[]</code>; an unpaired non-identifier delimiter is its own closing delimiter. The paired delimiters nest, so that {{code|2=d|1=q"(A pair "()" of parens in quotes)"}} is a valid literal; an example with the non-nesting <code>/</code> character is {{code|2=d|1=q"/I asked, "Can you hear me?"/"}}.<br/>
Similar to C++11, D allows here-document-style literals with end-of-string ids:
:<code>q" ''end-of-string-id'' newline ''content'' newline ''end-of-string-id'' "</code>
In D, the ''end-of-string-id'' must be an identifier (alphanumeric characters).

In some programming languages, such as [[Bourne shell|sh]] and [[Perl]], there are different delimiters that are treated differently, such as doing string interpolation or not, and thus care must be taken when choosing which delimiter to use; see [[#Different kinds of strings|different kinds of strings]], below.

===Multiple quoting===
A further extension is the use of ''multiple quoting'', which allows the author to choose which characters should specify the bounds of a string literal.

For example, in [[Perl]]:

<syntaxhighlight lang="perl">
qq^I said, "Can you hear me?"^
qq@I said, "Can you hear me?"@
qq§I said, "Can you hear me?"§
</syntaxhighlight>

all produce the desired result. Although this notation is more flexible, few languages support it; other than Perl, [[Ruby (programming language)|Ruby]] (influenced by Perl) and [[C++11]] also support these. A variant of multiple quoting is the use of [[here document]]-style strings.

Lua (as of 5.1) provides a limited form of multiple quoting, particularly to allow nesting of long comments or embedded strings. Normally one uses <code>[[</code> and <code>]]</code> to delimit literal strings (initial newline stripped, otherwise raw), but the opening brackets can include any number of equal signs, and only closing brackets with the same number of signs close the string.
For example:

<syntaxhighlight lang="lua">
local ls = [=[
This notation can be used for Windows paths:
local path = [[C:\Windows\Fonts]]
]=]
</syntaxhighlight>

Multiple quoting is particularly useful with [[regular expression]]s that contain usual delimiters such as quotes, as this avoids needing to escape them. An early example is [[sed]], where in the substitution command <code>s/'''regex'''/'''replacement'''/</code> the default slash <code>/</code> delimiters can be replaced by another character, as in <code>s,'''regex''','''replacement''',</code>.

===Constructor functions===
Another option, which is rarely used in modern languages, is to use a function to construct a string, rather than representing it via a literal. This is generally not used in modern languages because the computation is done at run time, rather than at parse time.

For example, early forms of [[BASIC]] did not include escape sequences or any other workarounds listed here, and thus one instead was required to use the <code>CHR$</code> function, which returns a string containing the character corresponding to its argument. In [[ASCII]] the quotation mark has the value 34, so to represent a string with quotes on an ASCII system one would write

<syntaxhighlight lang="qbasic">
"I said, " + CHR$(34) + "Can you hear me?" + CHR$(34)
</syntaxhighlight>

In C, a similar facility is available via <code>[[sprintf]]</code> and the <code>%c</code> "character" format specifier, though in the presence of other workarounds this is generally not used:

<syntaxhighlight lang="c">
char buffer[32];
/* 34 is the ASCII code of the double quote; snprintf bounds the write to the buffer size */
snprintf(buffer, sizeof buffer, "This is %cin quotes.%c", 34, 34);
</syntaxhighlight>

These constructor functions can also be used to represent nonprinting characters, though escape sequences are generally used instead. A similar technique can be used in C++ with the <code>std::string</code> stringification operator.
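As a rough illustration, the following Python sketch builds one string containing both quote characters in four of the ways described in this section (an escape sequence, dual quoting with explicit concatenation, adjacent-literal concatenation, and the constructor function <code>chr</code>) and checks that the results agree. The variable names are chosen here for illustration only.

```python
# Target text: I said, "This is John's apple."
# All variable names below are illustrative, not taken from any source.

# Escape sequence: the backslash marks the apostrophe as content.
escaped = 'I said, "This is John\'s apple."'

# Dual quoting with explicit concatenation: each piece uses the
# delimiter that does not occur inside it.
concatenated = 'I said, "This is ' + "John's" + ' apple."'

# Adjacent string literals are joined without an operator.
adjacent = 'I said, "This is ' "John's" ' apple."'

# Constructor function: chr(34) is the double quote and chr(39) the
# apostrophe, assuming an ASCII-compatible character set.
constructed = (
    "I said, " + chr(34) + "This is John" + chr(39) + "s apple." + chr(34)
)

variants = [escaped, concatenated, adjacent, constructed]
assert all(v == escaped for v in variants)
print(escaped)
```

Unlike the literal forms, the <code>chr</code> version is evaluated when the expression runs, which matches the run-time cost noted above for constructor functions.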