Word divider
{{Distinguish|Word mark (computer hardware)}}
{{short description|Glyph that separates written words}}
{{No footnotes|date=September 2022}}
{{Infobox punctuation mark |variant1= |caption1=space |variant2=· <!-- Interpunct --> |caption2=Latin interpunct |variant3=፡ <!-- Geez --> |caption3=Geʽez double point}}
In [[punctuation]], a '''word divider''' is a form of [[glyph]] which separates written [[Word|words]]. In languages which use the [[Latin alphabet|Latin]], [[Cyrillic script|Cyrillic]], and [[Arabic alphabet]]s, as well as other scripts of Europe and West Asia, the word divider is a blank [[Space (punctuation)|space]], or ''whitespace''. This convention is spreading, along with other aspects of European punctuation, to Asia and Africa, where words are usually written without word separation.<ref>{{harv|Saenger|2000}}</ref>{{better source needed|date=June 2020}}

In [[character encoding]], [[text segmentation#Word segmentation|word segmentation]] depends on which characters are defined as word dividers.

==History==
In [[Egyptian hieroglyphs|Ancient Egyptian]], [[determinative]]s may have been used as much to demarcate word boundaries as to disambiguate the semantics of words.<ref>"Determinatives are a most significant aid to legibility, being readily identifiable word dividers." (Ritner 1996:77)</ref> Rarely in [[Assyrian cuneiform]], but commonly in the later cuneiform [[Ugaritic alphabet]], a vertical stroke 𒑰 was used to separate words. In [[Old Persian cuneiform]], a diagonally sloping wedge 𐏐 was used.<ref>{{cite book |title=Assyrian Cuneiform |year=1901 |publisher=AMS Press |location=New York |page=42 |last=King |first=Leonard William}}</ref>

As the alphabet spread throughout the ancient world, words were often run together without division, and this practice remains or remained until recently in much of South and Southeast Asia.
However, not infrequently in inscriptions a vertical line, and in manuscripts a single (·), double (:), or triple (⁝) [[interpunct]] (dot) was used to divide words. This practice was found in [[Phoenician alphabet|Phoenician]], [[Aramaic alphabet|Aramaic]], [[Hebrew alphabet|Hebrew]], [[Greek alphabet|Greek]], and [[Latin alphabet|Latin]], and continues today with [[Ge'ez alphabet|Ethiopic]], though there whitespace is gaining ground.

===Scriptio continua===
The early [[alphabet]]ic writing systems, such as the [[Phoenician alphabet]], had only signs for [[consonant]]s (although some signs for consonants could also stand for a [[vowel]], so-called ''[[matres lectionis]]''). Without some form of visible word dividers, parsing a text into its separate words would have been a puzzle. With the introduction of letters representing vowels in the [[Greek alphabet]], the need for inter-word separation lessened. The earliest Greek inscriptions used interpuncts, as was common in the writing systems which preceded them, but soon the practice of ''[[scriptio continua]]'', continuous writing in which all words ran together without separation, became common.

== Types ==
===None===
Alphabetic writing without inter-word separation, known as ''[[scriptio continua]]'', was used in Ancient Egyptian. It appeared in Post-classical Latin after several centuries of the use of the interpunct.

Traditionally, ''scriptio continua'' was used for the [[Brahmic scripts|Indic alphabets]] of South and Southeast Asia and [[hangul]] of Korea, but spacing is now used with hangul and increasingly with the Indic alphabets.

Today [[Written Chinese|Chinese]] and [[Japanese writing|Japanese]] are the most widely used scripts consistently written without punctuation to separate words, though other scripts such as [[Thai script|Thai]] and [[Lao script|Lao]] also follow this writing convention.
In Classical Chinese, a word and a [[Chinese character|character]] were almost the same thing, so that word dividers would have been superfluous. Although [[Standard Mandarin|Modern Mandarin]] has numerous polysyllabic words, and each syllable is written with a distinct character, the conceptual link between character and word or at least [[morpheme]] remains strong, and no need is felt for word separation apart from what characters already provide. This link is also found in the [[Vietnamese language]]; however, in the [[Vietnamese alphabet]], virtually all syllables are separated by spaces, whether or not they form word boundaries.

[[File:Sample Tuladha Jejeg.png|center|thumb|x50px|An example of [[Javanese script]] written in [[scriptio continua]]: the first article of the Universal Declaration of Human Rights.]]

===Space===
Space is the most common word divider, especially in [[Latin script]].

[[Image:Traditional spacing examples from the 1911 Chicago Manual of Style.png|center|thumb|Traditional spacing examples from the 1911 ''Chicago Manual of Style''<ref>{{cite book |title=Manual of Style: A Compilation of Typographical Rules Governing the Publications of The University of Chicago, with Specimens of Types Used at the University Press |edition=Third |author=University of Chicago Press |year=1911 |publisher=University of Chicago |location=Chicago |page=[https://archive.org/details/manualstyleacom00presgoog/page/n115 101] |url=https://archive.org/details/manualstyleacom00presgoog|quote=this line is spaced. }}</ref>]]
{{-}}

===Vertical lines===
Ancient inscribed and cuneiform scripts such as [[Anatolian hieroglyphs]] frequently used short vertical lines to separate words, as did [[Linear B]]. In manuscripts, vertical lines were more commonly used for larger breaks, equivalent to the Latin comma and period. This continues with many Indic scripts today (the [[danda]]).
===Interpunct, multiple dots, and hypodiastole===
{| align=right style="border:1px solid #ccc; padding:.3em; margin:1em"
|<span lang="la" style="font-family: times, serif; font-size: 90%;">{{smallcaps|arma·virvmqve·cano·troiae·qvi·primvs·ab·oris<br>italiam·fato·profvgvs·laviniaqve·venit<br>litora·mvltvm·ille·et·terris·iactatvs·et·alto<br>vi·svpervm·saevae·memorem·ivnonis·ob·iram }}</span><br>
|-
|<span style="font-size: 90%;">The Latin interpunct</span>
|}
[[Image:Ethiopic genesis (ch. 29, v. 11-16), 15th century (The S.S. Teacher's Edition-The Holy Bible - Plate XII, 1).jpg|thumb|The Ethiopic double interpunct]]
As noted above, the single and double interpunct were used in manuscripts (on paper) throughout the ancient world. For example, Ethiopic inscriptions used a vertical line, whereas manuscripts used double dots (፡) resembling a colon. The latter practice continues today, though the space is making inroads.

Classical Latin used the interpunct in both paper manuscripts and stone inscriptions.<ref>(Wingo 1972:16)</ref>

[[Greek orthography#Punctuation|Ancient Greek orthography]] used between two and five dots as word separators, as well as the [[hypodiastole]].
{{-}}

===Different letter forms===
In the modern [[Hebrew alphabet|Hebrew]] and [[Arabic alphabet]]s, some letters have distinct forms at the ends and/or beginnings of words. This demarcation is used in addition to spacing.

===Vertical arrangement===
[[File:Urdu couplet.svg|thumb|Nastaʿlīq used for Urdu (written right-to-left)]]
The [[Nastaʿlīq script|Nastaʿlīq]] form of [[Islamic calligraphy]] uses vertical arrangement to separate words. The beginning of each word is written higher than the end of the preceding word, so that a line of text takes on a [[sawtooth wave|sawtooth]] appearance. Nastaʿlīq spread from Persia and today is used for [[Persian language|Persian]], [[Uyghur language|Uyghur]], [[Pashto language|Pashto]], and [[Urdu]].
===Pause===
In [[finger spelling]] and in [[Morse code]], words are separated by a pause.

==Unicode==
For use with computers, these marks have [[codepoint]]s in [[Unicode]]:
* {{unichar|00B7|html=|nlink=}}
* {{unichar|2E31|nlink=}}
* {{unichar|1361|nlink=Geʽez script#Punctuation}}
* {{unichar|10FB}}
* {{unichar|205D}}
* {{unichar|205E}}<ref name=opoudjis>[http://www.tlg.uci.edu/~opoudjis/unicode/punctuation.html#papyrological ''Punctuation'' § 5. Papyrological Punctuation] {{webarchive |url=https://web.archive.org/web/20141120120157/http://www.tlg.uci.edu/~opoudjis/unicode/punctuation.html#papyrological |date=November 20, 2014 }}</ref>
* {{unichar|2E12|nlink=}}
* {{Unichar|2E19}}

In [[Linear B]] script:
* {{unichar|10100}}
* {{unichar|10101}}

==See also==
* [[Whitespace (computer science)|Whitespace]]
* [[Sentence spacing]]
* [[Speech segmentation]]
* [[Zero-width non-joiner]]
* [[Zero-width space]]
* [[Substitute blank]]
* [[Underscore]]

==References==
<references/>

==Further reading==
* {{cite book | editor1-last = Daniels | editor1-first = Peter T. | editor2-last = Bright | editor2-first = William | title = The World's Writing Systems | publisher = [[Oxford University Press]] | year = 1996 | ref = Daniels1996 }}
* {{cite book | last = Knight | first = Stan | chapter = The Roman Alphabet | editor1-last = Daniels | editor1-first = Peter T. | editor2-last = Bright | editor2-first = William | title = The World's Writing Systems | publisher = [[Oxford University Press]] | year = 1996 | ref = Knight1996 }}
* {{cite book | last = Ritner | first = Robert | chapter = Egyptian Writing | editor1-last = Daniels | editor1-first = Peter T. | editor2-last = Bright | editor2-first = William | title = The World's Writing Systems | publisher = [[Oxford University Press]] | year = 1996 | ref = Ritner1996 }}
* {{cite book | last = Saenger | first = Paul | title = Space Between Words: The Origins of Silent Reading | publisher = [[Stanford University Press]] | year = 2000 | isbn = 0-8047-4016-X }}
* {{cite book | last = Wingo | first = E. Otha | title = Latin Punctuation in the Classical Age | url = https://archive.org/details/latinpunctuation0000wing | url-access = registration | publisher = Mouton | year = 1972 | page= [https://archive.org/details/latinpunctuation0000wing/page/16 16] | ref = Wingo1972 }}

{{navbox punctuation}}<!-- these symbols are not in the navbox, fix? -->

[[Category:Punctuation]]
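
As an illustration of the statement that word segmentation depends on which characters are defined as word dividers, the following is a minimal sketch in Python. It assumes that the ordinary space (U+0020) and every code point listed in the Unicode section are to be treated as dividers. The names <code>DIVIDERS</code> and <code>split_words</code> are hypothetical and come from no library. The sketch is not a general segmentation algorithm: for example, U+00B7 also occurs inside words in some modern orthographies, such as Catalan, so a real application would choose its divider set per script.

```python
import re

# Assumption: the divider set is the space plus the code points listed in
# the article's Unicode section. Adjust this string for a given script.
DIVIDERS = (
    "\u0020"      # space
    "\u00B7"      # middle dot (Latin interpunct)
    "\u2E31"      # word separator middle dot
    "\u1361"      # Ethiopic wordspace
    "\u10FB"
    "\u205D"
    "\u205E"
    "\u2E12"      # hypodiastole
    "\u2E19"
    "\U00010100"  # Linear B / Aegean word separators
    "\U00010101"
)

# One or more consecutive dividers count as a single word boundary.
_BOUNDARY = re.compile("[" + re.escape(DIVIDERS) + "]+")


def split_words(text):
    """Split text on any run of the divider characters defined above."""
    return [word for word in _BOUNDARY.split(text) if word]


if __name__ == "__main__":
    print(split_words("arma·virvmqve·cano"))
    print(split_words("ሰላም፡ለዓለም"))
```

Removing a character from <code>DIVIDERS</code> changes the result for the same input, which is the dependence described in the lead section.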