
== Syntax ==

=== Bracketed delimiters ===
Most modern programming languages use [[delimiter#Bracket delimiters|bracket delimiters]] (also '''balanced delimiters''')
to specify string literals. [[Quotation mark|Double quotations]] are the most common quoting delimiters used:

  "Hi There!"

An [[empty string]] is literally written by a pair of quotes with no character at all in between:

  ""

Some languages allow or require single quotation marks instead of double quotation marks. The string must begin and end with the same kind of quotation mark, and the kind of mark used may or may not give slightly different semantics:

  'Hi There!'

These quotation marks are ''unpaired'' (the same character is used as an opener and a closer), a holdover from the [[typewriter]] technology that preceded the earliest computer input and output devices.

In terms of [[regular expression]]s, a basic quoted string literal is given as:
 "[^"]*"
This means that a string literal is written as: ''a quote, followed by zero, one, or more non-quote characters, followed by a quote''. In practice this is often complicated by escaping, other delimiters, and excluding newlines.
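
The pattern above can be tried directly. The following is a minimal sketch in Python using the standard <code>re</code> module; the helper name <code>find_string_literals</code> is hypothetical and chosen only for illustration. It also shows how an escaped quote defeats the basic pattern, which is one of the complications just mentioned.

```python
import re

# The pattern from the text: a quote, any run of non-quote characters, a quote.
BASIC_STRING = re.compile(r'"[^"]*"')


def find_string_literals(source):
    """Return every basic double-quoted literal found in source (hypothetical helper)."""
    return BASIC_STRING.findall(source)


# Two literals, one of them empty.
print(find_string_literals('x = "Hi There!"; y = ""'))

# A backslash-escaped quote is not understood by the basic pattern,
# so the match ends early at the escaped quote.
print(find_string_literals(r'z = "say \"hi\""'))
```

A pattern that handles escapes has to treat a backslash followed by any character as a unit, and the details differ between languages.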

==== Paired delimiters ====
A number of languages provide paired delimiters, where the opening and closing delimiters are different. These often allow nested strings, so delimiters can be embedded as long as they are paired, but embedding an unpaired closing delimiter still results in delimiter collision. Examples include [[PostScript]], which uses parentheses, as in <code>(The quick (brown fox))</code>, and [[m4 (computer language)|m4]], which uses the [[backtick]] (`) as the starting delimiter and the [[apostrophe]] (') as the ending delimiter. [[Tcl]] allows both quotes (for interpolated strings) and braces (for raw strings), as in <code>"The quick brown fox"</code> or <code>{The quick {brown fox}}</code>. This derives from the single quotations in Unix shells and the use of braces in [[C (programming language)|C]] for compound statements, since blocks of code in Tcl are syntactically the same thing as string literals. That the delimiters are paired is essential for making this feasible.
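
Reading a nested, paired-delimiter literal amounts to counting nesting depth. The following Python sketch illustrates the idea only; the function name <code>read_paired_literal</code> and its parameters are hypothetical, and it deliberately ignores escape sequences, which real PostScript and Tcl scanners must also handle.

```python
def read_paired_literal(text, start=0, opener="(", closer=")"):
    """Read a paired-delimiter literal that begins at text[start].

    Returns (content, index_after_closer). Nested pairs stay in the content.
    Escape sequences are not handled in this sketch.
    """
    if text[start] != opener:
        raise ValueError("literal must begin with the opening delimiter")
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == opener:
            depth += 1
        elif ch == closer:
            depth -= 1
            if depth == 0:
                # The closer that balances the first opener ends the literal.
                return text[start + 1:i], i + 1
    raise ValueError("unbalanced delimiters: literal is never closed")


content, end = read_paired_literal("(The quick (brown fox)) rest")
print(content)
```

An unpaired closing delimiter inside the string, as in <code>(a ) b)</code>, ends the literal early. This is the delimiter collision described above.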

The [[Unicode]] character set includes paired (separate opening and closing) versions of both single and double quotations:
  “Hi There!”
  ‘Hi There!’
  „Hi There!“
  «Hi There!»

These, however, are rarely used, as many programming languages do not recognize them (one exception is the paired double quotations, which can be used in [[Visual Basic .NET]]). Unpaired marks are preferred for compatibility, as they are easier to type on a wide range of keyboards, and so even in languages where paired marks are permitted, many projects forbid their use in source code.
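
As one illustration, Python is among the languages that do not accept the paired Unicode marks as string delimiters. The sketch below uses the built-in <code>compile</code> function to check whether a piece of source text is syntactically valid; the helper name <code>is_valid_python</code> is hypothetical, and the curly quotes are written as <code>\u201c</code> and <code>\u201d</code> escapes so that they cannot be confused with straight quotes.

```python
def is_valid_python(source):
    """Return True if source compiles as Python code, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Straight (unpaired) double quotes delimit a string literal.
print(is_valid_python('s = "Hi There!"'))

# Paired curly quotes are not delimiters in Python, so this does not compile.
print(is_valid_python('s = \u201cHi There!\u201d'))
```

In CPython 3 the first call returns <code>True</code> and the second returns <code>False</code>.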

=== Whitespace delimiters ===
String literals may be terminated by a newline.

One example is [[MediaWiki]] template parameters.
<syntaxhighlight lang="wikitext">
{{Navbox
|name=Nulls
|title=[[wikt:Null|Nulls]] in [[computing]]
}}
</syntaxhighlight>

There might be special syntax for multi-line strings.

In [[YAML]], string literals may be specified by the relative positioning of [[Whitespace character|whitespace]] and
indentation.
<syntaxhighlight lang="yaml">
    - title: An example multi-line string in YAML
      body : |
        This is a multi-line string.
        "special" metacharacters may
        appear here. The extent of this string is
        represented by indentation.
</syntaxhighlight>

=== No delimiters ===

Some programming languages, such as Perl and PHP, allow string literals without any delimiters in some contexts. In the following Perl program, for example, <code>red</code>, <code>green</code>, and <code>blue</code> are string literals, but are unquoted:

  <syntaxhighlight lang="perl">%map = (red => 0x00f, blue => 0x0f0, green => 0xf00);</syntaxhighlight> 

When <code>use strict</code> is not in effect, Perl treats non-reserved sequences of alphanumeric characters as string literals in most contexts. For example, the following two lines of Perl are equivalent:
<syntaxhighlight lang="perl">
$y = "x";
$y = x;
</syntaxhighlight>

=== Declarative notation ===

In the original [[FORTRAN]] programming language (for example), string literals were written in so-called [[Hollerith constant|''Hollerith'' notation]], where a decimal count of the number of characters was followed by the letter H, and then the characters of the string:

  <syntaxhighlight lang="fortran">35HAn example Hollerith string literal</syntaxhighlight>

This declarative notation style is contrasted with bracketed [[delimiter]] quoting, because it does
not require the use of balanced "bracketed" characters on either side of the string.

'''Advantages:'''
* eliminates text searching (for the delimiter character) and therefore requires significantly less [[Computational overhead|overhead]]
* avoids the problem of [[delimiter collision]]
* enables the inclusion of [[metacharacter]]s that might otherwise be mistaken as commands
* can be used for quite effective data compression of plain text strings{{citation needed|reason=doesn't look like compression to me|date=March 2011}}

'''Drawbacks:'''
* this type of notation is error-prone when entered manually by [[programmer]]s
* special care is needed with multi-byte encodings, where the number of characters and the number of bytes differ
This is, however, not a drawback when the prefix is generated by an algorithm, as is most likely the case.{{citation needed|reason=humans don't generally write Fortran code, or what? we're talking source code formats, after all...|date=February 2012}}
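
The count-prefixed notation can be sketched in a few lines of Python. The functions <code>encode_hollerith</code> and <code>decode_hollerith</code> below are hypothetical names used for illustration and are not part of any Fortran implementation. The decoder never searches for a closing delimiter: it reads the count and then takes exactly that many characters, so quotes and other metacharacters pass through unchanged.

```python
DIGITS = "0123456789"


def encode_hollerith(text):
    """Prefix text with its character count and the letter H."""
    return f"{len(text)}H{text}"


def decode_hollerith(source, start=0):
    """Read one count-prefixed literal from source, beginning at start.

    Returns (text, index_after_literal).
    """
    i = start
    while i < len(source) and source[i] in DIGITS:
        i += 1
    if i == start or i >= len(source) or source[i] != "H":
        raise ValueError("expected <count>H at the start of the literal")
    count = int(source[start:i])
    body_start = i + 1
    body_end = body_start + count
    if body_end > len(source):
        raise ValueError("count runs past the end of the source")
    # No scan for a terminator: the count alone fixes the extent.
    return source[body_start:body_end], body_end


print(encode_hollerith("An example Hollerith string literal"))
print(decode_hollerith('5Hab"cd'))
```

The sketch counts characters. With a multi-byte encoding the byte count can differ: in Python, <code>len("naïve")</code> is 5, while <code>len("naïve".encode("utf-8"))</code> is 6, so a format has to state which unit the prefix counts.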

=== Constructor functions ===

C++ has two styles of string, one inherited from C (delimited by <code>"</code>), and the safer <code>std::string</code> in the C++ Standard Library. The <code>std::string</code> class is frequently used in the same way a string literal would be used in other languages, and is often preferred to C-style strings for its greater flexibility and safety. But it comes with a performance penalty for string literals, as <code>std::string</code> usually allocates memory dynamically, and must copy the C-style string literal to it at run time.

Before C++14, there was no literal for C++ strings (C++14 allows <code>"this is a C++ string"s</code> with the <code>s</code> at the end of the literal), so the normal constructor syntax was used, for example:
* {{code|2=cpp|1=std::string str = "initializer syntax";}}
* {{code|2=cpp|1=std::string str("converting constructor syntax");}}
* {{code|2=cpp|1=std::string str = std::string("explicit constructor syntax");}}

all of which have the same interpretation. Since C++11, brace initialization is also available, and since C++14, the <code>s</code> literal suffix:
* {{code|2=cpp|1=std::string str{"uniform initializer syntax"};}}
* {{code|2=cpp|1=auto str = "constexpr literal syntax"s;}}