Wide character
==Programming specifics==

===C/C++===
The [[C Standard Library|C]] and [[C++ Standard Library|C++]] standard libraries include [[C string handling|a number of facilities]] for dealing with wide characters and strings composed of them. Wide characters are represented by the data type <code>wchar_t</code>, which the original [[C90 (C version)|C90]] standard defined as:
:"an integral type whose range of values can represent distinct codes for all members of the largest extended character set specified among the supported locales" (ISO 9899:1990 §4.1.5)

Both C and [[C++]] introduced the fixed-size character types <code>char16_t</code> and <code>char32_t</code> in the 2011 revisions of their respective standards to provide unambiguous representation of 16-bit and 32-bit [[Unicode]] transformation formats, leaving <code>wchar_t</code> implementation-defined. Version 4.0 of the [[Unicode]] standard (ISO/IEC 10646:2003) says that:
:"The width of <code>wchar_t</code> is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use <code>wchar_t</code> for storing Unicode text. The <code>wchar_t</code> type is intended for storing compiler-defined wide characters, which may be [[Unicode]] characters in some compilers."<ref>{{Cite book|url=https://www.worldcat.org/oclc/52257637|title=The Unicode standard|date=2003|publisher=Addison-Wesley|others=Aliprand, Joan., Unicode Consortium.|isbn=0-321-18578-1|edition=Version 4.0|location=Boston|pages=109|chapter=5.2 ANSI/ISO C wchar_t|oclc=52257637}}</ref>

===Python===
According to the documentation for [[Python (programming language)|Python]] 2.7, the language sometimes uses <code>wchar_t</code> as the basis for its character type <code>Py_UNICODE</code>, depending on whether <code>wchar_t</code> is "compatible with the chosen Python Unicode build variant" on that system.<ref>{{cite web |url=https://docs.python.org/2.7/c-api/unicode.html |title=Unicode Objects and Codecs - Python 2.7 documentation |website=docs.python.org |access-date=2009-12-19}}</ref> This distinction has been deprecated since Python 3.3, which introduced flexibly sized UCS1/2/4 storage for strings and formally aliased {{code|Py_UNICODE}} to <code>wchar_t</code>.<ref>{{cite web |url=https://docs.python.org/3.10/c-api/unicode.htm|title=Unicode Objects and Codecs - Python 3.10.10 documentation |website=docs.python.org |access-date=2023-02-18}}</ref> Since Python 3.12, the use of <code>wchar_t</code> (that is, the <code>Py_UNICODE</code> [[typedef]]) for Python strings (<code>wstr</code> in the implementation) has been dropped; as before, a "[[UTF-8]] representation is created on demand and cached in the Unicode object."<ref>{{Cite web |title=Unicode Objects and Codecs |url=https://docs.python.org/3.12/c-api/unicode.html |access-date=2023-09-09 |website=Python documentation}}</ref>
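
The two points above, that the width of <code>wchar_t</code> varies by platform and that Python 3.3 and later store strings with 1, 2 or 4 bytes per code point, can be observed from Python itself. The following is a minimal sketch rather than an authoritative measurement: it assumes the CPython implementation, the helper name <code>bytes_per_code_point</code> is hypothetical, and the sizes reported by <code>sys.getsizeof</code> are an implementation detail that other Python implementations need not share.

```python
import ctypes
import sys

# Size in bytes of the platform's wchar_t, as exposed by ctypes.
# Commonly 2 on Windows and 4 on Linux and macOS.
WCHAR_T_BYTES = ctypes.sizeof(ctypes.c_wchar)


def bytes_per_code_point(ch: str, n: int = 1000) -> int:
    """Estimate storage per code point for a string made only of `ch`.

    Hypothetical helper: it compares the sizes of two strings of
    different lengths so that the fixed object header cancels out.
    Relies on CPython's sys.getsizeof; other implementations may differ.
    """
    short = ch * n
    long = ch * (2 * n)
    return (sys.getsizeof(long) - sys.getsizeof(short)) // n


if __name__ == "__main__":
    print("wchar_t bytes:", WCHAR_T_BYTES)
    samples = [("ASCII", "a"), ("BMP", "\u20ac"), ("astral", "\U0001F600")]
    for label, ch in samples:
        print(label, bytes_per_code_point(ch))
```

On CPython 3.3 or later, the per-code-point size depends on the widest character in the string and not on the platform's <code>wchar_t</code>, which is why the two values are reported separately.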