
== Use ==
The main use of UTF-32 is in internal APIs where the data is single code points or [[Glyph|glyphs]], rather than strings of characters. For instance, in modern text rendering, a common{{citation needed|date=January 2023}} last step is to build a list of structures, each containing [[Coordinate system|coordinates (x, y)]], attributes, and a single UTF-32 code point identifying the glyph to draw. Because code points need only 21 bits, non-Unicode information is often stored in the "unused" 11 bits of each 32-bit word.{{Citation needed|date=June 2017}}
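
The packing described above can be sketched as follows. This is a minimal illustration, not taken from any particular rendering library: the <code>GlyphRecord</code> structure, the helper functions, and the meaning of the flag bits are all hypothetical. It relies only on the fact that Unicode code points end at U+10FFFF and therefore fit in the low 21 bits of a 32-bit word.

<syntaxhighlight lang="c++">
#include <cassert>
#include <cstdint>

// Unicode code points end at U+10FFFF, so they fit in the low 21 bits.
constexpr std::uint32_t kCodePointBits = 21;
constexpr std::uint32_t kCodePointMask = (1u << kCodePointBits) - 1; // 0x1FFFFF
constexpr std::uint32_t kMaxCodePoint = 0x10FFFF;
constexpr std::uint32_t kMaxFlags = (1u << 11) - 1; // 0x7FF, the 11 spare bits

// Hypothetical record for one glyph to draw (names are illustrative only).
struct GlyphRecord {
    float x;
    float y;
    std::uint32_t word; // code point in the low 21 bits, flags in the top 11
};

// Combine a code point and application-defined flags into one 32-bit word.
std::uint32_t pack(char32_t code_point, std::uint32_t flags) {
    assert(code_point <= kMaxCodePoint);
    assert(flags <= kMaxFlags);
    return (flags << kCodePointBits) | static_cast<std::uint32_t>(code_point);
}

// Recover the code point by masking off the flag bits.
char32_t code_point_of(std::uint32_t word) {
    return static_cast<char32_t>(word & kCodePointMask);
}

// Recover the flags by shifting the code point bits away.
std::uint32_t flags_of(std::uint32_t word) {
    return word >> kCodePointBits;
}
</syntaxhighlight>

A word with non-zero flags is no longer a valid UTF-32 code unit, so such packed values are only suitable for internal use and would need to be masked before being passed to code that expects UTF-32.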

Use of UTF-32 strings on Windows (where {{mono|[[wchar_t]]}} is 16 bits) is almost non-existent. On Unix systems, applications occasionally use UTF-32 strings internally because {{mono|wchar_t}} is defined as 32 bits, but this is rare.

UTF-32 is forbidden as an HTML character encoding.<ref>{{cite web|access-date=2024-11-11 |language=en |title=HTML Standard |url=https://html.spec.whatwg.org/multipage/parsing.html#character-encodings |website=html.spec.whatwg.org}}<!-- auto-translated from French by Module:CS1 translator --></ref><ref>{{cite web|access-date=2024-11-11 |language=fr |title=Choisir et appliquer un encodage de caractères |url=https://www.w3.org/International/questions/qa-choosing-encodings.fr.html#avoid:~:text=particuli%C3%A8rement%20celle%20d%E2%80%99-,UTF-32,-. |website=www.w3.org}}<!-- auto-translated from French by Module:CS1 translator --></ref>

=== Programming languages ===
[[Python (programming language)|Python]] versions up to 3.2 can be compiled to use UTF-32 for Unicode strings instead of [[UTF-16]]. From version 3.3 onward, a Unicode string is stored in UTF-32 if it contains at least one non-[[Basic Multilingual Plane|BMP]] character; otherwise, leading zero bytes are optimized away "depending on the [code point] with the largest Unicode ordinal (1, 2, or 4 bytes)", so that all code points in the string have the same size.<ref>{{cite web|last1=Löwis|first1=Martin|title=PEP 393 -- Flexible String Representation|url=https://legacy.python.org/dev/peps/pep-0393/|website=python.org|publisher=Python|access-date=26 October 2014}}</ref> <!-- In previous versions of Python, "\U0001F51F" (UTF-32) was equivalent to "\ud83d\udd1f" (UCS-2).
However, this is true in languages like JavaScript. This also means that a non-BMP character is not equivalent to its surrogate pair (example: <code>"\U0001F51F" != "\ud83d\udd1f"</code>) unlike most programming languages. -->
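
The width selection quoted above can be illustrated with a short sketch. It is written in C++ for consistency with the other examples in this section and is not Python's actual implementation; the function names are hypothetical. It assumes the thresholds given in PEP 393: one byte per code point if every code point is at most U+00FF, two bytes if every code point is in the BMP, and four bytes otherwise.

<syntaxhighlight lang="c++">
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: bytes per code point that a flexible representation
// would choose for `text`, based on its largest code point (1, 2, or 4).
std::size_t bytes_per_code_point(const std::u32string& text) {
    char32_t largest = 0;
    for (char32_t c : text) {
        assert(c <= 0x10FFFF); // not a valid code point otherwise
        largest = std::max(largest, c);
    }
    if (largest <= 0xFF) return 1;   // every code point fits in one byte
    if (largest <= 0xFFFF) return 2; // every code point is in the BMP
    return 4;                        // at least one non-BMP code point: UTF-32
}

// Total bytes used for the code points alone (ignoring any object header).
std::size_t payload_bytes(const std::u32string& text) {
    return text.size() * bytes_per_code_point(text);
}
</syntaxhighlight>

Under these assumptions, a single non-BMP character makes every code point in the string occupy four bytes, which is the case in which the storage is UTF-32.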

The [[Seed7]]<ref>{{Cite web|url=https://seed7.sourceforge.net/faq.htm|title=Seed7 FAQ|website=seed7.sourceforge.net}}</ref> and [[Lasso (programming language)|Lasso]]{{Citation needed|date=June 2017}} programming languages encode all strings with UTF-32, in the belief that direct indexing is important. In contrast, the [[Julia (programming language)|Julia]] programming language moved away from built-in UTF-32 support with its 1.0 release, simplifying the language to have only UTF-8 strings, following the "UTF-8 Everywhere Manifesto".<ref>{{Cite web|url=http://utf8everywhere.org/|title=UTF-8 Everywhere|website=utf8everywhere.org}}</ref> All other encodings are considered legacy and were moved out of the standard library to a package.<ref>{{Citation|title=JuliaStrings/LegacyStrings.jl: Legacy Unicode string types|date=2019-05-17|url=https://github.com/JuliaStrings/LegacyStrings.jl|publisher=JuliaStrings|access-date=2019-10-15}}</ref>

[[C++11]] has two data types that use UTF-32. The built-in <code>char32_t</code> type stores one UTF-32 code unit, which is a single code point. The standard library <code>std::u32string</code> type stores a string of UTF-32-encoded characters. A UTF-32-encoded character or string literal is marked with the prefix <code>U</code>.<ref>{{Cite web |url=https://cplusplus.com/reference/string/u32string/ |access-date=2024-11-12 |website=cplusplus.com|title = u32string}}</ref><ref>{{Cite web |title=String literal - cppreference.com |url=https://en.cppreference.com/w/cpp/language/string_literal |access-date=2024-11-14 |website=en.cppreference.com}}</ref>

<syntaxhighlight lang="c++">
#include <string>
char32_t UTF32_character = U'🔟'; // also written as U'\U0001F51F'
std::u32string UTF32_string = U"UTF-32-encoded string"; // the literal itself has type `const char32_t[]`
</syntaxhighlight>

[[C Sharp (programming language)|C#]] has a <code>UTF32Encoding</code> class, which represents Unicode characters as UTF-32-encoded bytes rather than as a string.<ref>{{Cite web |last=dotnet-bot |title=UTF32Encoding Class (System.Text) |url=https://learn.microsoft.com/en-us/dotnet/api/system.text.utf32encoding?view=net-8.0 |access-date=2024-11-27 |website=learn.microsoft.com |language=en-us}}</ref>
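
What it means to represent UTF-32 text as bytes can be sketched in C++, for consistency with the example above. This is not the .NET API: the helper <code>to_utf32le_bytes</code> is hypothetical, and the choice of little-endian byte order without a [[byte order mark]] is an assumption made for illustration.

<syntaxhighlight lang="c++">
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: serialize each code point as four bytes,
// least significant byte first (UTF-32LE), with no byte order mark.
std::vector<std::uint8_t> to_utf32le_bytes(const std::u32string& text) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size() * 4);
    for (char32_t c : text) {
        assert(c <= 0x10FFFF); // not a valid code point otherwise
        for (int shift = 0; shift < 32; shift += 8) {
            bytes.push_back(static_cast<std::uint8_t>((c >> shift) & 0xFF));
        }
    }
    return bytes;
}
</syntaxhighlight>

With this sketch, the single code point U+1F51F becomes the four bytes <code>1F F5 01 00</code>.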