Moby Project
== Pronunciator ==
The '''Moby Pronunciator II''' contains 177,267 entries with corresponding pronunciations. Most of the entries describe a single word, but approximately 79,000<ref>Obtained by running the UNIX command ''grep '.*[-_].* .*' mobypron.unc | wc -l'' after converting the line endings and correcting some encoding errors.</ref> contain hyphenated or multiple-word phrases, names, or [[lexemes]]. The Project Gutenberg distribution also contains a copy of the [[cmudict]] v0.3.

The file contains lines of the format ''word[/part-of-speech] pronunciation''. Each line ends with the ASCII [[carriage return]] character (CR, '\r', 0x0D, 13 in decimal).

The ''word'' field can include apostrophes (e.g. ''isn't''), hyphens (e.g. ''able-bodied''), and multiple words separated by underscores (e.g. ''{{not a typo|monkey_wrench}}''). Non-English words are generally rendered, as stated in the documentation, without accents or other diacritical marks. However, in 36 entries (e.g. ''{{not a typo|São_Miguel}}''), some non-ASCII accented characters remain, represented using [[Mac OS Roman]] encoding.

The part-of-speech field is used to disambiguate 770 of the words that have differing pronunciations depending on their part of speech. For example, for the words spelled ''close'', the verb has the pronunciation {{IPAc-en|ˈ|k|l|oʊ|z}}, whereas the adjective is {{IPAc-en|ˈ|k|l|oʊ|s}}. The parts of speech have been assigned the following codes:

{| class="wikitable"
|-
! Part-of-speech
! Code
|-
| [[Noun]]
| n
|-
| [[Verb]]
| v
|-
| [[Adjective]]
| aj
|-
| [[Adverb]]
| av
|-
| [[Interjection]]
| interj
|}

The pronunciation follows the ''word'' field and the optional part-of-speech code. Several special symbols are present:

{| class="wikitable"
|-
! Symbol
! Meaning
|-
| _
| Used to separate words
|-
| '
| [[Primary stress]] on the following syllable
|-
| ,
| [[Secondary stress]] on the following syllable
|}

The rest of the symbols are used to represent [[International Phonetic Alphabet|IPA]] characters.
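
The line format described above can be read with a few lines of code. The following is a minimal sketch in Python, not part of the Moby distribution: the function names <code>parse_entry</code> and <code>parse_file</code> are hypothetical, and the sample lines are assembled from the symbol tables in this section for illustration rather than copied from the file. It assumes that the first space on a line separates the ''word'' field from the pronunciation, which follows from multi-word entries using underscores instead of spaces.

```python
# Part-of-speech codes as listed in the table above.
POS_CODES = {
    "n": "noun",
    "v": "verb",
    "aj": "adjective",
    "av": "adverb",
    "interj": "interjection",
}


def parse_entry(line):
    """Split one 'word[/part-of-speech] pronunciation' line into fields."""
    # Assumption: the word field never contains a space, so the first
    # space marks the start of the pronunciation.
    head, _, pronunciation = line.partition(" ")
    # The optional part-of-speech code follows a slash in the word field.
    word, _, pos = head.partition("/")
    return {
        "word": word.replace("_", " "),  # underscores join multi-word entries
        "pos": POS_CODES.get(pos, pos) if pos else None,
        "pronunciation": pronunciation,
    }


def parse_file(raw_bytes):
    """Decode the raw file contents and parse every non-empty line."""
    # Lines end with a bare carriage return; the few non-ASCII characters
    # are encoded in Mac OS Roman.
    text = raw_bytes.decode("mac_roman")
    return [parse_entry(line) for line in text.split("\r") if line]


# Illustrative input only: pronunciations built from the symbol tables.
sample = "close/v 'kl/oU/z\rclose/aj 'kl/oU/s\r".encode("mac_roman")
for entry in parse_file(sample):
    print(entry)
```

Decoding as Mac OS Roman is harmless for the entries that are pure ASCII, since the two encodings agree on that range.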

The pronunciations are generally consistent with a [[General American]] dialect of English that exhibits the [[father-bother merger]], the [[hurry-furry merger]] and the [[lot-cloth split]], but does not exhibit the [[cot-caught merger]] or the [[wine-whine merger]].

Each phoneme is represented by a sequence of one or more characters. Some of the sequences are delimited with a slash character "/", as shown in the following table, but note that the sequence for {{IPAc-en|ɔɪ}} is delimited by ''two'' slash characters at either end:

{| class="wikitable"
|-
! Symbol
! [[Help:IPA/English|IPA]]
|-
| /&/
| æ
|-
| /-/
| ə
|-
| /@/
| ʌ, ə
|-
| /[@]/r
| ɜr, ər
|-
| /A/
| ɑ, ɑː
|-
| /aI/
| aɪ
|-
| /AU/
| aʊ
|-
| b
| b
|-
| d
| d
|-
| /D/
| ð
|-
| /dZ/
| dʒ
|-
| /E/
| ɛ
|-
| /eI/
| eɪ
|-
| f
| f
|-
| g
| ɡ
|-
| h
| h
|-
| hw
| hw
|-
| /i/
| iː
|-
| /I/
| ɪ
|-
| /j/
| j
|-
| /ju/
| juː
|-
| k
| k
|-
| l
| l
|-
| m
| m
|-
| n
| n
|-
| /N/
| ŋ
|-
| /O/
| ɔ, ɔː
|-
| //Oi//
| ɔɪ
|-
| /oU/
| oʊ
|-
| p
| p
|-
| r
| r
|-
| s
| s
|-
| /S/
| ʃ
|-
| t
| t
|-
| /T/
| θ
|-
| /tS/
| tʃ
|-
| /u/
| uː
|-
| /U/
| ʊ
|-
| v
| v
|-
| w
| w
|-
| z
| z
|-
| /Z/
| ʒ
|}

To this collection are added a number of extra sequences representing phonemes found in several other languages. These are used to encode the non-English words, phrases and names that are included in the database. The following table contains these extra phonemes, but note that it is not clear to what extent some of them are the result of encoding errors.

{| class="wikitable"
|-
! Symbol
! [[Help:IPA/English|IPA]]
|-
| A
| a
|-
| e
| e, ɛ
|-
| i
| i, ɪ
|-
| N
| [[Nasalisation]] of preceding vowel
|-
| o
| o
|-
| O
| [intent not clear]
|-
| R
| ʁ
|-
| S
| s
|-
| u
| u
|-
| V
| v, β, ʋ
|-
| W
| w
|-
| /x/
| x
|-
| /y/
| ø
|-
| Y
| y
|-
| /z/
| ts
|-
| Z
| z
|}
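
As an illustration of how the English symbol table can be applied, the following Python sketch converts a pronunciation string to IPA by greedy longest-match lookup. It is an assumption-laden example rather than an official decoder: the names <code>MOBY_TO_IPA</code> and <code>moby_to_ipa</code> are hypothetical, only the first IPA value is used where the table lists two, the extra non-English sequences are omitted, and unrecognised characters are passed through unchanged.

```python
# English symbol table from this section; where the table gives two IPA
# values, this sketch arbitrarily keeps the first one.
MOBY_TO_IPA = {
    "/&/": "æ", "/-/": "ə", "/@/": "ʌ", "/[@]/r": "ɜr", "/A/": "ɑ",
    "/aI/": "aɪ", "/AU/": "aʊ", "b": "b", "d": "d", "/D/": "ð",
    "/dZ/": "dʒ", "/E/": "ɛ", "/eI/": "eɪ", "f": "f", "g": "ɡ",
    "h": "h", "hw": "hw", "/i/": "iː", "/I/": "ɪ", "/j/": "j",
    "/ju/": "juː", "k": "k", "l": "l", "m": "m", "n": "n",
    "/N/": "ŋ", "/O/": "ɔ", "//Oi//": "ɔɪ", "/oU/": "oʊ", "p": "p",
    "r": "r", "s": "s", "/S/": "ʃ", "t": "t", "/T/": "θ",
    "/tS/": "tʃ", "/u/": "uː", "/U/": "ʊ", "v": "v", "w": "w",
    "z": "z", "/Z/": "ʒ",
    # Special symbols: word separator, primary and secondary stress.
    "_": " ", "'": "ˈ", ",": "ˌ",
}

# Try longer sequences first so that "//Oi//" and "hw" are not split up.
_SYMBOLS = sorted(MOBY_TO_IPA, key=len, reverse=True)


def moby_to_ipa(pronunciation):
    """Translate a Moby pronunciation string to IPA, symbol by symbol."""
    out = []
    i = 0
    while i < len(pronunciation):
        for symbol in _SYMBOLS:
            if pronunciation.startswith(symbol, i):
                out.append(MOBY_TO_IPA[symbol])
                i += len(symbol)
                break
        else:
            # Not in the English table: keep the character as it is.
            out.append(pronunciation[i])
            i += 1
    return "".join(out)


# Input strings are built from the table for illustration.
print(moby_to_ipa("'kl/oU/z"))  # → ˈkloʊz
print(moby_to_ipa("'kl/oU/s"))  # → ˈkloʊs
```

The lookup is case-sensitive because the table distinguishes pairs such as /u/ and /U/. A fuller decoder would also need to decide how to treat the ambiguous and non-English sequences, which this sketch leaves untouched.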