Formatted text
{{Short description |Digital text which has styling information beyond minimal semantic elements}}
{{Use American English |date=February 2024}}
{{Use mdy dates |date=February 2024}}
{{Distinguish|Typesetting{{!}}Text formatting}}
{{Redirect|Rich text|text/richtext|Enriched text|the file format|Rich Text Format}}
{{More citations needed |date=February 2024}}

In [[computing]], '''formatted text''', '''styled text''', or '''rich text''', as opposed to [[plain text]], is [[E-text|digital text]] which has styling information beyond the minimum of semantic elements: colors, styles ([[boldface]], [[Italic type|italic]]), [[Point (typography)|sizes]], and special features in [[HTML]] (such as [[hyperlink]]s).

==Beginnings of formatted text==
Formatted text has its genesis in the pre-computer use of underscoring to mark passages for emphasis in [[typewritten]] [[manuscript (publishing)|manuscripts]]. In the first interactive systems of early computer technology, underlining was not possible, and users made up for this lack (and the lack of formatting in ASCII) by using certain symbols as substitutes. Emphasis, for example, could be achieved in ASCII in a number of ways:<ref name="ham95" /><ref name="mul15" />
* Capitalization: {{mono|I am NOT making this up.}}
* Surrounding with underscores: {{mono|I am _not_ making this up.}}
* Surrounding with asterisks: {{mono|I am *not* making this up.}}
* Spacing: {{mono|I am n o t making this up.}}
Surrounding by underscores was also used for book titles: {{mono|Look it up in _The_C_Programming_Language_.}}

==Markup languages==
{{Main article|Markup language}}
Formatting can be marked by tags distinguished from the body text by special characters, such as angle brackets in [[HTML]]. For example, this text:
:The dog is classified as ''Canis familiaris'' in taxonomy.
is marked up in [[HTML]] thus:
<syntaxhighlight lang="html">
<p>The dog is classified as <i>Canis familiaris</i> in taxonomy.</p>
</syntaxhighlight>
The italicized text is enclosed by an opening and a closing italics tag. In [[LaTeX]], the text would be marked up like this:
<syntaxhighlight lang="latex">
The dog is classified as \textit{Canis familiaris} in taxonomy.
</syntaxhighlight>
Most markup languages can be edited with any [[text editor]], needing no special [[software]]. Many markup languages can also be edited with specialized software designed to automate some functions or present the output as [[WYSIWYG]].

==Formatted document files==
Since the introduction of [[MacWrite]], an early [[WYSIWYG]] word processor, in which the user applies formatting visually rather than by inserting textual markup, word processors have tended to save to [[binary files]]. Opening such files with a [[text editor]] reveals binary data embedded in them, either around the formatted text (e.g. in [[WordPerfect]]) or separate from it, at the beginning or end of the file (e.g. in [[Microsoft Word]]).

Formatted text documents in binary files have, however, two disadvantages: unclear formatting scope and secrecy. Whereas the extent of formatting is explicitly marked in markup languages, [[WYSIWYG]] formatting is modal: pressing the boldface button, for example, stays in effect until it is cancelled. This can lead to formatting mistakes and maintenance troubles.{{clarify|date=April 2025|reason=Why would this lead to mistakes or maintenance troubles?}}{{cn|date=April 2025}} As for secrecy, formatted text document file formats tend to be proprietary and undocumented, which makes it difficult for third parties to write compatible software and can force unnecessary upgrades when versions change.

[[WordStar]] was a popular word processor that did not use binary files with hidden characters.

[[OpenOffice.org]] Writer saves files in an [[XML]] format.
However, the resultant file is binary, since the XML is compressed into a [[ZIP (file format)|ZIP]] archive.

[[PDF]] is another formatted text file format that is usually binary (using compression for the text, and storing graphics and fonts in binary). It is generally an end-user format, written from an application such as [[Microsoft Word]] or [[OpenOffice.org]] Writer, and not intended to be edited by the user once produced.

==See also==
* [[Character encoding]]
* [[Online rich-text editor]]
* [[Prepress]]
* [[Word processor]]

<!-- moved section from [[Text encoding]] -->
<!-- All this is pretty much bogus, I think. I agree. It contains naïve information. Text encoding ≠ Text formatting.
as a [[sequence]] of [[code]]s (from a [[character encoding]]) for the purpose of [[computer storage]] or electronic [[communication]] of that text. While character encodings like [[ASCII]] represent individual [[character (computing)|characters]] of a [[language]], a text encoding has to represent much larger things like [[article]]s and [[book]]s, and must represent not only the characters they contain but the [[structure]] and [[organization]] of the text, and perhaps [[information]] about the text or its [[appearance]]. Common examples are [[HTML]] and [[RTF]] which represent texts in [[natural language]]s, and [[XML]], which can represent many kinds of text not necessarily intended to be human-readable (the contents of a [[database]], for example).

In general there are two basic forms of text encoding that are widely used. One is to use a [[markup language]] which adds markers to the text itself. Markup has the advantage of being easy to represent, but has the disadvantage of being hard to view without an "aware" [[reader application]].
For instance, if an HTML document is opened in a [[text editor]], it is largely readable, but the text is cluttered with codes, and even more so in the case of a table, and there are character references for special characters which may make parts unreadable, at least to those unfamiliar with the format. Another method is to use "[[pointer]]s" into the text, which is left in the original format. This has the advantage of allowing the [[content]] to be easily readable in any [[editor]], although you lose the "[[styling]]". On the downside, editing such a [[document]] in a non-aware application typically leaves the pointers pointing to the wrong [[data]]. Today the majority of text encoding systems appear to use markup, although whether by choice or simply because "everyone else does" is open to question. Though character encodings like [[ASCII]] and [[Unicode]] are not, strictly speaking, text encodings in their own right, they may serve as very simple text encodings if one wishes only to preserve the English [[content]] of a document and not necessarily its [[formatting]]. By far the most common text encoding now in use is what might informally be called "Plain ASCII", which involves simply encoding a text as a [[stream]] of ASCII characters. The specifics of how this is done vary greatly: for example, the end of a [[text line]] might be encoded as ASCII code 10 ("[[line feed]]" or "new line") as is common practice on [[Unix]] machines, or as ASCII code 13 ("[[carriage return]]") as is common on [[Apple computer|Apple]] machines, or as both (the sequence <13, 10> is used to end lines on [[DOS]] based machines and many others, while the rather rare sequence <10, 13> was used by some [[Acorn Computers Ltd|Acorn]] machines). Some texts also use this line-end sequence inside [[paragraph]]s (with a blank line between paragraphs) while some do not. Also, various texts in this form interpret code 9 ("tab") and other [[control character]]s differently. 
None of these methods specify how to identify text structure like [[heading]]s and [[table]]s, or special text forms like [[italics]]. Text in this format is basically readable by any [[computer]] though some work might be needed to accommodate local variations, and all information besides the actual [[word]]s of the text will be lost. -->

== References ==
{{reflist |refs=
<ref name="ham95">{{cite journal |title=RFC1855: Netiquette Guidelines |first=Sally |last=Hambridge |date=October 1995 |website=IETF Datatracker, Internet Engineering Task Force |url=https://datatracker.ietf.org/doc/html/rfc1855 |access-date=2024-02-04 }}</ref>
<ref name="mul15">{{cite web |title=Structured Text |date=2015-07-26 |first=Ed |last=Mullen |website=edmullen.net |url=https://edmullen.net/mozilla/moz_stext.php |access-date=2024-02-04 }}</ref>
}}

[[Category:Computer file formats]]
[[Category:Publishing]]