Grapheme
{{Short description|Smallest functional written unit}}
{{distinguish|Graphene|Graphane|Graphyne}}
{{more citations needed|date=April 2020}}
{{Use dmy dates|date=August 2021}}
[[File:A-small glyphs.svg|thumb|Various [[glyph]]s representing instances of the lower case letter {{gpm|[[a]]}}, considered to be [[allograph]]s of the same grapheme]]
{{Orthography notation}}
{{Reading}}
In [[linguistics]], a '''grapheme''' is the smallest functional unit of a [[writing system]].<ref>Coulmas, F. (1996), ''The Blackwell Encyclopedia of Writing Systems''. Oxford: Blackwell, p. 174</ref> The word ''grapheme'' is derived from [[Ancient Greek]] {{tlit|grc|gráphō}} ('write') and the suffix ''-eme'', by analogy with ''[[phoneme]]'' and other [[emic unit]]s. The study of graphemes is called ''[[graphemics]]''. The concept of graphemes is abstract and similar to the notion in [[computing]] of a [[Character (computing)|character]]. (A specific geometric shape that represents any particular grapheme in a given [[typeface]] is called a [[glyph]].)<!-- Examples go in the body for such a short article. This opening section should be no more than a concise summary of the body content, per [[WP:LEAD]]. -->

==Conceptualization==
There are two main opposing concepts of the grapheme.<ref>Kohrt, M. (1986), The term 'grapheme' in the history and theory of linguistics. In G. Augst (Ed.), ''New trends in graphemics and orthography''. Berlin: De Gruyter, pp. 80–96. {{doi|10.1515/9783110867329.80}}</ref> In the so-called ''referential concept'', graphemes are interpreted as the smallest units of writing that correspond to sounds (more accurately, [[phoneme]]s). Under this concept, the ''sh'' in the written English word ''shake'' would be a grapheme because it represents the phoneme [[Voiceless postalveolar fricative|/ʃ/]]. This referential concept is linked to the ''dependency hypothesis'', which claims that writing merely depicts speech.
By contrast, the ''analogical concept'' defines graphemes analogously to phonemes, i.e. via written [[minimal pair]]s such as ''shake'' vs. ''snake''. In this example, ''h'' and ''n'' are graphemes because they distinguish two words. This analogical concept is associated with the ''autonomy hypothesis'', which holds that writing is a system in its own right and should be studied independently from speech. Both concepts have weaknesses.<ref>Lockwood, D. G. (2001), Phoneme and grapheme: How parallel can they be? ''LACUS Forum'' 27, 307–316.</ref>

Some models adhere to both concepts simultaneously by including two individual units,<ref>Rezec, O. (2013), Ein differenzierteres Strukturmodell des deutschen Schriftsystems. ''Linguistische Berichte'' 234, pp. 227–254.</ref> which are given names such as ''graphemic grapheme'' for the grapheme according to the analogical concept (''h'' in ''shake''), and ''phonological-fit grapheme'' for the grapheme according to the referential concept (''sh'' in ''shake'').<ref>Herrick, E. M. (1994), Of course a structural graphemics is possible! ''LACUS Forum'' 21, pp. 413–424.</ref>

In newer concepts, in which the grapheme is interpreted [[semiotics|semiotically]] as a dyadic [[linguistic sign]],<ref>Fedorova, L. (2013), The development of graphic representation in abugida writing: The akshara’s grammar. ''Lingua Posnaniensis'' 55:2, pp. 49–66. {{doi|10.2478/linpo-2013-0013}}</ref> it is defined as a minimal unit of writing that is lexically distinctive and also corresponds to a linguistic unit ([[phoneme]], [[syllable]], or [[morpheme]]).<ref>Meletis, D. (2019), The grapheme as a universal basic unit of writing. ''Writing Systems Research''. {{doi|10.1080/17586801.2019.1697412}}</ref>

==Notation==
{{Further|International Phonetic Alphabet#Brackets and transcription delimiters}}
Graphemes are often notated within [[angle bracket]]s, e.g. {{angbr|a}}.<ref name="Cambridge">The Cambridge Encyclopedia of Language, second edition, Cambridge University Press, 1997, p. 196</ref> This is analogous to the slash notation {{IPA|/a/}} used for [[phoneme]]s. Analogous to the [[square bracket]] notation {{IPA|[a]}} used for [[phonetic transcription|phone]]s, [[glyph]]s are sometimes denoted with vertical lines, e.g. {{gph|ɑ}}.<ref>{{Cite book |last=Meletis |first=Dimitrios |title=Writing Systems and Their Use: An Overview of Grapholinguistics |last2=Dürscheid |first2=Christa |publisher=De Gruyter Mouton |year=2022 |isbn=978-3-110-75777-4 |page=64}}</ref>

==Glyphs==
{{main|Glyph|Allograph}}
In the same way that the [[surface form]]s of [[phoneme]]s are speech sounds or [[phone (phonetics)|phones]] (and different phones representing the same phoneme are called [[allophone]]s), the surface forms of graphemes are [[glyph]]s (sometimes ''graphs''), namely concrete written representations of symbols (and different glyphs representing the same grapheme are called [[allograph]]s).

Thus, a grapheme can be regarded as an [[abstraction]] of a collection of glyphs that are all functionally equivalent.

For example, in written English (or other languages using the [[Latin alphabet]]), there are two different physical representations of the [[lowercase]] Latin letter "a": "<big>a</big>" and "<big>ɑ</big>". Since substituting one for the other cannot change the meaning of a word, they are considered to be allographs of the same grapheme, which can be written {{angbr|a}}. Similarly, the grapheme corresponding to "Arabic numeral zero" has a unique semantic identity and Unicode value {{code|U+0030}} but exhibits variation in the form of [[slashed zero]]. Italic and bold face forms are also allographic, as is the variation seen in [[serif]] (as in [[Times New Roman]]) versus [[sans-serif]] (as in [[Helvetica]]) forms.
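The relationship between a grapheme and its glyphs can be loosely compared with the way Unicode separates characters from their rendering in a font. The following Python sketch, which uses only the standard {{code|unicodedata}} module, looks up the digit zero mentioned above: whether the zero is drawn slashed or unslashed is a matter for the font, and the code point stays the same. The comparison is only approximate. When the two shapes of {{angbr|a}} are typed as separate characters they have different code points, because the single-storey form is also encoded on its own for phonetic use. The helper name {{code|describe}} is an illustrative assumption and does not come from any cited source.

```python
import unicodedata


def describe(char: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


# The digit zero has one code point, however a font chooses to draw it.
print(describe("0"))       # U+0030 DIGIT ZERO

# Typed as separate characters, the two shapes of lowercase "a" are
# distinct code points, although they are allographs in written English.
print(describe("a"))       # U+0061 LATIN SMALL LETTER A
print(describe("\u0251"))  # U+0251 LATIN SMALL LETTER ALPHA
```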
There is some disagreement as to whether capital and lower case letters are allographs or distinct graphemes. Capitals generally occur in certain triggering contexts that do not change the meaning of a word, such as in a proper name, at the beginning of a sentence, or in an all-caps newspaper headline. In other contexts, capitalization can determine meaning: compare, for example, [[Polish language|Polish]] and [[Shoe polish|polish]]; the former is a language, the latter is for shining shoes.

Some linguists consider [[digraph (orthography)|digraphs]] like the {{angbr|sh}} in ''ship'' to be distinct graphemes, but these are generally analyzed as sequences of graphemes. Non-stylistic [[Typographic ligature|ligatures]], however, such as {{angbr|æ}}, are distinct graphemes, as are various letters with distinctive [[diacritic]]s, such as {{angbr|ç}}.

Identical glyphs may not always represent the same grapheme. For example, the three letters {{angbr|A}}, {{angbr|А}} and {{angbr|Α}} appear identical but each has a different meaning: in order, they are the Latin letter [[A]], the Cyrillic letter [[A (Cyrillic)|Azǔ/Азъ]] and the Greek letter [[Alpha]]. Each has its own [[code point]] in Unicode: {{unichar|0041|Latin capital letter A}}, {{unichar|0410|Cyrillic capital letter A}} and {{unichar|0391|Greek capital letter alpha}}.

==Types of grapheme==
{{more citations needed|section|date=December 2022}}
The principal types of graphemes are [[logogram]]s (more accurately termed morphograms<ref>Joyce, T. (2011), The significance of the morphographic principle for the classification of writing systems, ''Written Language and Literacy'' 14:1, pp. 58–81. {{doi|10.1075/wll.14.1.04joy}}</ref>), which represent words or [[morpheme]]s (for example [[Chinese characters]], the [[ampersand]] "&" representing the word ''and'', [[Arabic numerals]]); [[syllabary|syllabic]] characters, representing [[syllable]]s (as in Japanese [[kana]]); and [[alphabet]]ic letters, corresponding roughly to [[phoneme]]s (see next section). For a full discussion of the different types, see {{section link|Writing system|Functional classification}}.

There are additional graphemic components used in writing, such as [[punctuation mark]]s, [[mathematical symbol]]s, [[word divider]]s such as the space, and other [[:Category:Typographical symbols|typographic symbols]]. Ancient [[logogram|logographic scripts]] often used silent [[determinative]]s to disambiguate the meaning of a neighboring (non-silent) word.

==Relationship with phonemes==
{{main|Phonemic orthography}}
As mentioned in the previous section, in languages that use [[alphabet]]ic writing systems, many of the graphemes stand in principle for the [[phoneme]]s (significant sounds) of the language. In practice, however, the [[orthography|orthographies]] of such languages entail at least a certain amount of deviation from the ideal of exact grapheme–phoneme correspondence. A phoneme may be represented by a [[multigraph (orthography)|multigraph]] (a sequence of more than one grapheme), as when the [[digraph (orthography)|digraph]] ''sh'' represents a single sound in English. Conversely, a single grapheme may sometimes represent more than one phoneme, as with the Russian letter [[я]] or the Spanish ''c''. Some graphemes may not represent any sound at all (like the ''b'' in English ''debt'' or the ''h'' in Spanish, which is generally silent outside the digraph ''ch''), and often the rules of correspondence between graphemes and phonemes become complex or irregular, particularly as a result of historical [[sound change]]s that are not necessarily reflected in spelling.
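As a rough illustration of how multigraphs complicate the mapping from letters to sounds, the following Python sketch splits a written word into units, preferring the longest listed multigraph at each position. The function name {{code|segment}} and the small inventory {{code|TOY_MULTIGRAPHS}} are assumptions made for this example, not an established analysis of English orthography.

```python
def segment(word: str, multigraphs: set[str]) -> list[str]:
    """Split a word into units, preferring the longest listed multigraph."""
    longest = max((len(m) for m in multigraphs), default=1)
    units = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, falling back to a single letter.
        for size in range(longest, 0, -1):
            chunk = word[i:i + size]
            if size == 1 or chunk in multigraphs:
                units.append(chunk)
                i += len(chunk)
                break
    return units


# Hypothetical toy inventory, chosen only for illustration.
TOY_MULTIGRAPHS = {"sh", "ch", "th"}

print(segment("shake", TOY_MULTIGRAPHS))  # ['sh', 'a', 'k', 'e']
print(segment("debt", TOY_MULTIGRAPHS))   # ['d', 'e', 'b', 't']
```

A purely orthographic pass of this kind cannot tell when a letter sequence spans a morpheme boundary, as in ''mishap'', and it leaves silent letters such as the ''b'' in ''debt'' as ordinary units, because it does not model pronunciation.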
"Shallow" orthographies such as those of standard [[Spanish language|Spanish]] and [[Finnish language|Finnish]] have relatively regular (though not always one-to-one) correspondence between graphemes and phonemes, while those of French and English have much less regular correspondence, and are known as [[orthographic depth|deep orthographies]].

Multigraphs representing a single phoneme are normally treated as combinations of separate letters, not as graphemes in their own right. However, in some languages a multigraph may be treated as a single unit for the purposes of [[collation]]; for example, in a [[Czech language|Czech]] dictionary, the section for words that start with {{angbr|ch}} comes after that for {{angbr|h}}.<ref>{{cite web|last=Zeman|first=Dan|title=Czech Alphabet, Code Page, Keyboard, and Sorting Order|url=http://old-site.clsp.jhu.edu/ws98/projects/nlp/doc/czech_env/czech-info.html|access-date=31 March 2012|publisher=Old-site.clsp.jhu.edu|archive-url=https://web.archive.org/web/20120415123555/http://old-site.clsp.jhu.edu/ws98/projects/nlp/doc/czech_env/czech-info.html|archive-date=15 April 2012|url-status=dead}}</ref> For more examples, see {{section link|Alphabetical order|Language-specific conventions}}.

==See also==
* {{Annotated link |Character (computing)}}
* {{Annotated link |Grapheme–color synesthesia}}
* {{Annotated link |Sign (semiotics)}}

==References==
{{Commons category}}
{{Reflist}}

{{lexicology}}
{{writing systems}}
{{List of writing systems}}
{{Authority control}}

[[Category:Graphemes| ]]
[[Category:Learning to read]]
[[Category:Typography]]
[[Category:Linguistics terminology]]