UTF-7
==Motivation==
[[MIME]], the modern standard for e-mail formats, forbids encoding of [[header (computing)|headers]] using byte values above the ASCII range. Although MIME allows encoding the message body in various [[character set]]s (broader than ASCII), the underlying transmission infrastructure ([[Simple Mail Transfer Protocol|SMTP]], the main e-mail transfer standard) is still not guaranteed to be [[8-bit clean]]. Therefore, a non-trivial content transfer encoding has to be applied in case of doubt. Unfortunately, [[Base64]] has the disadvantage of making even [[ASCII]] characters unreadable in non-MIME clients. On the other hand, UTF-8 combined with [[quoted-printable]] produces a very size-inefficient format, requiring 6–9 bytes for each non-ASCII character from the [[Basic Multilingual Plane|BMP]] and 12 bytes for each character outside the BMP.

Provided certain rules are followed during encoding, UTF-7 can be sent in e-mail without an underlying MIME [[MIME#Content-Transfer-Encoding|transfer encoding]], but it must still be explicitly identified as the text character set. In addition, if used within e-mail headers such as "Subject:", UTF-7 must be contained in MIME [[MIME#Encoded-Word|encoded word]]s identifying the character set. Since encoded words force use of either [[quoted-printable]] or [[Base64]], UTF-7 was designed not to use the = sign as an escape character, which avoids double escaping when it is combined with quoted-printable (or its variant, the RFC 2047/1522 "Q"-encoding of headers).

UTF-7 is generally not used as a native representation within applications, as it is very awkward to process.
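
The byte counts quoted above for UTF-8 combined with quoted-printable can be checked with a short Python sketch. It relies only on the standard-library <code>quopri</code> module and Python's built-in <code>utf-7</code> codec; the helper names <code>qp_utf8_len</code> and <code>utf7_len</code> and the three sample characters are illustrative choices, not part of any standard. The UTF-7 figure for a single character includes the shift characters that delimit the encoded run, so it is an upper bound on the per-character cost within longer non-ASCII text.

```python
import quopri


def qp_utf8_len(text: str) -> int:
    """Bytes needed for `text` as UTF-8 wrapped in quoted-printable."""
    return len(quopri.encodestring(text.encode("utf-8")))


def utf7_len(text: str) -> int:
    """Bytes needed for `text` using Python's built-in UTF-7 codec."""
    return len(text.encode("utf-7"))


# Illustrative samples: a 2-byte and a 3-byte UTF-8 character from the BMP,
# and one character outside the BMP (4 bytes in UTF-8).
samples = {
    "U+00E9": "\u00e9",
    "U+20AC": "\u20ac",
    "U+1F600": "\U0001F600",
}

for label, ch in samples.items():
    print(label, "UTF-8 + quoted-printable:", qp_utf8_len(ch), "UTF-7:", utf7_len(ch))
```

Each UTF-8 byte above the ASCII range becomes a three-byte <code>=XX</code> escape, which is where the 6, 9 and 12 byte figures come from.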
Despite its size advantage over the combination of UTF-8 with either quoted-printable or Base64, the now-defunct [[Internet Mail Consortium]] recommended against its use.<ref>{{Cite web|url=https://www.imc.org/imcr-010.html|title=Using International Characters in Internet Mail |work=Internet Mail Consortium |date=1 August 1998 |archive-url=https://web.archive.org/web/20150907234243/https://www.imc.org/imcr-010.html |archive-date=2015-09-07}}</ref>

[[8BITMIME]] has also been introduced, which reduces the need to encode message bodies in a 7-bit format.

A modified form of UTF-7 (sometimes dubbed 'mUTF-7'<ref>{{cite web | url = https://doc.dovecot.org/configuration_manual/mail_location/ | title = Configuration Manual | at = Sec. "Mail Location Settings" | author = <!--Not stated--> | date = 8 February 2023 | website = Dovecot Documentation | access-date = 2023-02-28 | quote = Store mailbox names on disk using UTF-8 instead of modified UTF-7 (mUTF-7). }}</ref>) was used in the [[Internet Message Access Protocol|Internet Message Access Protocol (IMAP)]] e-mail retrieval protocol, version 4 rev 1, for "international" mailbox names.{{Ref RFC|3501|section=5.1.3 "Mailbox International Naming Convention" |quote=In modified UTF-7, printable [[US-ASCII]] characters, except for "&", represent themselves…. The character "&" (0x26) is represented by the two-octet sequence "&-". All other characters… are represented in modified BASE64…. }} The following version, IMAP version 4 rev 2, uses UTF-8 instead.{{Ref RFC|9051|section=5.1. "Mailbox Naming" |quote=In IMAP4rev2, mailbox names are encoded in Net-Unicode (this differs from IMAP4rev1).}}
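
The modified UTF-7 rules quoted above can be illustrated with a minimal encoder sketch in Python. The function name <code>encode_mutf7</code> is hypothetical, and the sketch assumes the remaining details of RFC 3501 section 5.1.3 that the quotation elides: runs of other characters are converted to big-endian UTF-16, Base64-encoded with "," in place of "/" and without "=" padding, and bracketed by "&" and "-". It is not a validated IMAP implementation and does not cover decoding.

```python
import base64


def encode_mutf7(name: str) -> str:
    """Encode a mailbox name in IMAP4rev1 modified UTF-7 (illustrative sketch)."""
    out = []
    pending = []  # characters waiting to be emitted as one modified-BASE64 run

    def flush() -> None:
        # Emit the pending run as "&" + modified BASE64 of its UTF-16BE bytes + "-".
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")      # "&" is represented by the two-octet sequence "&-"
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)        # printable US-ASCII represents itself
        else:
            pending.append(ch)    # everything else joins the current BASE64 run
    flush()
    return "".join(out)


print(encode_mutf7("~peter/mail/台北/日本語"))  # ~peter/mail/&U,BTFw-/&ZeVnLIqe-
print(encode_mutf7("A&B"))                      # A&-B
```

Collecting consecutive non-ASCII characters into a single run before encoding keeps each run as short as possible, since the "&" and "-" delimiters are paid once per run and not once per character.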