Numeric character reference
==Restrictions==
The Universal Character Set defined by ISO 10646 is the "document character set" of SGML and HTML 4, so by default any character in such a document, and any character ''referenced'' in such a document, must be in the UCS.

While the syntax of SGML does not prohibit references to invalid or unassigned code points, such as <code>&#xFFFF;</code>, SGML-derived markup languages such as HTML and XML can, and often do, restrict numeric character references to only those code points that are assigned to characters.

Restrictions may also apply for other reasons. For example, in HTML 4, <code>&#12;</code>, which is a reference to the non-printing "form feed" control character, is allowed because the form feed character itself is allowed. But in XML 1.0, the form feed character cannot be used, not even by reference.<ref>{{cite web |title=HTML 5.2: 8. The HTML syntax |url=https://www.w3.org/TR/2017/WD-html52-20170228/syntax.html |website=www.w3.org}}</ref>{{Citation needed|date=May 2013}}

As another example, <code>&#128;</code>, which is a reference to another control character, is not allowed to be used or referenced in either HTML or XML. When it appears in HTML, however, web browsers usually do not flag it as an error, and for compatibility reasons some of them interpret it as a reference to the character represented by code value 128 in the [[Windows-1252]] encoding. That character, "€", has to be represented as <code>&#8364;</code> in standards-compliant HTML.

As a further example, prior to the publication of XML 1.0 Second Edition on October 6, 2000, XML 1.0 was based on an older version of ISO 10646 and prohibited using characters above U+FFFD, except in character data, thus making a reference like <code>&#65536;</code> (U+10000) illegal. In XML 1.1 and newer editions of XML 1.0, such a reference is allowed, because the available character repertoire was explicitly extended.

Markup languages also place restrictions on where character references can occur.
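
Some of the restrictions described above can be illustrated with a short Python sketch. It is not a validator. The helper name <code>allowed_in_xml10</code> is hypothetical, and its ranges are an assumption transcribed from the <code>Char</code> production of current editions of XML 1.0. The last lines use the standard library's <code>html.unescape</code>, which follows the browser-compatible behaviour of mapping <code>&#128;</code> through Windows-1252.

```python
import html


def allowed_in_xml10(code_point: int) -> bool:
    """Hypothetical helper: is this code point matched by the XML 1.0
    Char production (ranges assumed from current editions of the spec)?"""
    return (
        code_point in (0x9, 0xA, 0xD)           # tab, line feed, carriage return
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF    # supplementary planes
    )


# &#12; (form feed) cannot be referenced in XML 1.0.
print(allowed_in_xml10(12))        # False

# &#xFFFF; is syntactically a reference but not an allowed character.
print(allowed_in_xml10(0xFFFF))    # False

# &#65536; (U+10000) is allowed by the current Char production.
print(allowed_in_xml10(65536))     # True

# Browser-style decoding treats &#128; as Windows-1252 byte 0x80,
# which is the euro sign, the same character as &#8364;.
print(html.unescape("&#128;") == html.unescape("&#8364;"))  # True
```

The check works on code points only. It says nothing about where in a document a reference may appear, which markup languages restrict separately.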