
Romanian alphabet

From Wikipedia, the free encyclopedia

This article contains IPA phonetic symbols. Without proper rendering support, you may see question marks, boxes, or other symbols instead of Unicode characters.

The Romanian alphabet is a modification of the Latin alphabet and consists of 31 letters:[1]

Letter Name
A, a a
Ă, ă ă
Â, â â din a
B, b be
C, c ce
D, d de
E, e e
F, f fe / ef
G, g ghe / ge
H, h ha / haş
I, i i
Î, î î din i
J, j je
K, k ka
L, l le / el
M, m me / em
N, n ne / en
O, o o
P, p pe
Q, q kü / chiu
R, r re / er
S, s se / es
Ș, ș șe
T, t te
Ț, ț țe
U, u u
V, v ve
W, w dublu ve
X, x ics
Y, y igrec / i grec
Z, z ze / zet

The letters Q (read kü or chiu), W (dublu ve), and Y (igrec or i grec) were officially introduced into the Romanian alphabet in 1982, although they had been used earlier. They occur only in foreign words, such as quasar, watt, and yacht. The letters K and X are rarely used.[citation needed] The letter K is relatively older and is used mainly in units of measurement with the prefix "kilo". It is not used in native Romanian words and is still perceived as foreign because it appears only in borrowings, many of them still neologisms.

In cases where the word is a direct borrowing having diacritical marks not present in the above alphabet, official spelling tends to favor their use (München, Angoulême etc., as opposed to the use of Istanbul over İstanbul).


Letters and their pronunciation

See also: Romanian phonology

Romanian spelling is mostly phonetic. The table below gives the correspondence between letters and sounds. Some of the letters have several possible readings, even if allophones are not taken into account. When vowels /i/, /u/, /e/, and /o/ are changed into their corresponding semivowels, this is not marked in writing. Letters K, Q, W, and Y appear only in foreign borrowings; the pronunciation of W and Y depends on the origin of the word they appear in.

Letter Phoneme Approximate pronunciation
A a /a/ a in "father"
Ă ă (a with breve) /ə/ a in "above"
 â (a with circumflex) /ɨ/ like e in roses in some English dialects, see the phonetic description
B b /b/ b in "ball"
C c /k/ c in "cat"
/ʧ/ ch in "chimpanzee" — if c appears before letters e or i
D d /d/ d in "door"
E e /e/ e in "merry"
/e̯/ (semivocalic /e/)
/je/ ye in "yes" — in a few old words with initial e: este, el etc.[2]
F f /f/ f in "flag"
G g /ɡ/ g in "goat"
/ʤ/ g in "general" (if g appears before letters e or i)
H h /h/ h in "house"
I i /i/ i in "machine"
/j/ y in "yes"
/ʲ/ (palatalization)
Î î (i with circumflex) /ɨ/ like e in roses in some English dialects, see the phonetic description
J j /ʒ/ s in "treasure"
K k /k/ k in "like"
L l /l/ l in "lamp"
M m /m/ m in "mouth"
N n /n/ n in "north"
O o /o/ o in "floor"
/o̯/ (semivocalic /o/)
P p /p/ p in "post"
Q q /k/ k in "kettle"
R r /r/ alveolar trill or tap
S s /s/ s in "song"
Ș ș (s with comma) /ʃ/ s in "sugar"
T t /t/ t in "tip"
Ț ț (t with comma) /ʦ/ zz in "pizza"
U u /u/ u in "group"
/w/ w in "cow"
V v /v/ v in "vision"
W w /v/ v in "vision"
/w/ w in "west"
X x /ks/ x in "six"
/ɡz/ x in "example"
Y y /j/ y in "yes"
/i/ i in "machine"
Z z /z/ z in "zipper"

Special letters

Pre- (top) and post-1993 (bottom) street signs in Bucharest, showing the two different spellings of the same name

Romanian does not use accents. If diacritics are understood as signs added to letters to alter their pronunciation or to distinguish between words, the Romanian alphabet has none. There are, however, five special letters in the Romanian alphabet (associated with four different sounds), formed by modifying other Latin letters; strictly speaking they are not diacritics, but they are generally referred to as such.

The letter â is used exclusively in the middle of words; its majuscule version appears only in all-capitals inscriptions.

Writing letters ș and ț with a cedilla instead of a comma is considered incorrect by the Romanian Academy. Actual Romanian writings, including books created to teach children to write, treat the comma and cedilla as a variation in font. See Unicode and HTML below.

Î versus Â

The letters î and â are phonetically and functionally identical. The reason for using both of them is historical, denoting the language's Latin origin, although statistically only a few of the words written today with â actually derive from Latin words having an a in the corresponding position. In 1953, during the communist regime, the Romanian Academy eliminated the letter â, replacing it with î everywhere, including, until 1964, in the name of the country, which was spelled Romînia. The first stipulation coincided with the official designation of the country as a People's Republic, which meant that its full title was Republica Populară Romînă, whereas the Socialist Republic proclaimed in 1965 is associated with the spelling Republica Socialistă România.

After the fall of the Ceauşescu regime, the Romanian Academy decided to reintroduce â from 1993 onward, in accordance with the 1904 spelling reform, thus cancelling the effects of the 1953 spelling reform. The choice between î and â is thus based on a simple rule: the letter is always spelled as â, except at the beginning and the end of words, where î is used instead. Exceptions include proper nouns, where the usage of the letters is frozen, whichever it may be, and compound words, whose components are each separately subjected to the rule above, not the resulting word itself (e.g. ne + îndemânatic → neîndemânatic, not *neândemânatic). Quite a number of people and institutions (including major newspapers such as Evenimentul Zilei, Cotidianul, and, since April 2008, Jurnalul Naţional)[citation needed] prefer the 1964 norms. Generally, usage of either the 1964 or 1993 norms is regarded as correct in most situations.
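The positional rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `apply_1993_rule` and `respell_compound` are hypothetical, input is assumed to be lowercase text using precomposed characters, proper nouns with frozen spellings are not handled, and the caller is assumed to supply the components of a compound word already split.

```python
def apply_1993_rule(word: str) -> str:
    """Respell one non-compound, lowercase word: â inside, î at the edges."""
    # Both letters stand for the same sound, so normalise to î first,
    # then decide the spelling purely by position in the word.
    chars = list(word.replace("â", "î"))
    last = len(chars) - 1
    for i, ch in enumerate(chars):
        if ch == "î" and 0 < i < last:
            chars[i] = "â"
    return "".join(chars)


def respell_compound(parts: list[str]) -> str:
    """Apply the rule to each component separately, then join them."""
    return "".join(apply_1993_rule(part) for part in parts)


print(apply_1993_rule("romînia"))
print(respell_compound(["ne", "îndemînatic"]))
```

Splitting compounds is left to the caller because the rule itself cannot tell where one component ends and the next begins; that requires a dictionary.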

Comma-below (ș and ț) versus cedilla (ş and ţ)

Although the Romanian Academy standard mandates the comma-below variants for the sounds /ʃ/ and /ʦ/, the cedilla variants are more widespread in practice. Many printed texts, including books intended to teach children how to write, still use "s with cedilla" and "t with cedilla". This state of affairs is due to an initial lack of glyph standardization, compounded by the lack of computer font support for the comma-below variants (see the Unicode section for details).

Obsolete diacritics

An old manhole cover in Bucharest, reading "Bucharest - sewerage" in etymological spelling: Bucuresci - Canalisare instead of Bucureşti - Canalizare

Before the spelling reform of 1904, there were several additional letters with diacritical marks.

  • Vowels:
    • ĭ (i with breve) served to illustrate the final, "whispered" sound of the palatalized consonant, in words such as Bucureşti (/bu.ku'reʃtʲ/), lupi (/lupʲ/ - "wolves"), and greci (/greʧʲ/ - "Greeks"): Bucureşcĭ (the proper spelling at the time used c instead of t, see -eşti), lupĭ, grecĭ. This distinction is no longer considered necessary.
    • ŭ (u with breve) was used only at the end of a word. Unvoiced, it served to indicate that the previous consonant was not palatalized, or that the vowel i was fully voiced. Once frequent, it survives today only in author Mateiu Caragiale's name - originally spelled Mateiŭ (it is not specified whether the pronunciation should adopt a version that he himself probably never used, while in many editions he is still credited as Matei).
    • ĕ (e with breve). This letter has since been replaced with ă. The existence of two letters for one sound, the schwa, had an etymological purpose, showing from which vowel ("a" or "e") it originally derived. For example împĕrat - "emperor" (<Imperator), vĕd - "I see" (<vedo), umĕr - "shoulder" (<humerus), păsĕri - "birds" (<cf. passer).
  • Consonants
    • d̦ / D̦ (Latin small/capital letter d with comma below) was used to indicate the sound that corresponds today to the Romanian letter z. It denoted that the word it belonged to was assumed to be derived from Latin and that its corresponding Latin letter was d. Examples of words containing this letter are: d̦i (day in English), assumed[3] to be derived from the Latin word dies; Dumned̦eu (God in English), assumed[4] to be derived from the Latin phrase Domine Deus; and d̦ână (fairy in English), supposed[5] to be derived from the Latin word Diana. In today's Romanian this letter is no longer present and the Latin letter z is used in its stead.[citation needed]

Use of these letters was not fully adopted even before 1904, as some publications (e.g. Timpul and Universul) chose a simpler, easier-to-read approach[neutrality disputed] that resembled today's Romanian spelling.[citation needed]

Other diacritics

As in other languages, the acute accent is sometimes used in Romanian to indicate the stressed vowel in polysyllabic words. This use is regular in dictionary headwords, and it is also occasionally found in carefully edited texts to distinguish otherwise identical words, such as cópii (copies) and copíi (children), although this disambiguating use is seldom seen outside dictionaries.

Digital typography

ISO 8859

The character encoding standard ISO 8859 initially defined a single code page for all of Central and Eastern Europe: ISO 8859-2. This code page includes only "s" and "t" with cedillas. The South-Eastern European ISO 8859-16 includes "s" and "t" with comma below in the same positions that "s" and "t" with cedilla occupy in ISO 8859-2. Unfortunately, ISO 8859-16 became a standard only after Unicode had become widespread, so it was largely ignored by software vendors.
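The shared positions can be checked with Python's built-in codecs. The sketch below decodes the same four bytes under both code pages; the byte values 0xAA, 0xBA, 0xDE and 0xFE are not given in the text above and are taken here from the published code charts, so treat them as an assumption to verify against the standards.

```python
# The four byte values assumed to hold the s/t letters in both code pages.
RAW = bytes([0xAA, 0xBA, 0xDE, 0xFE])

as_latin2 = RAW.decode("iso8859_2")    # ISO 8859-2: cedilla variants
as_latin10 = RAW.decode("iso8859_16")  # ISO 8859-16: comma-below variants

# Print each byte next to the code point it maps to under each code page.
for byte, old, new in zip(RAW, as_latin2, as_latin10):
    print(f"0x{byte:02X}: U+{ord(old):04X} {old} | U+{ord(new):04X} {new}")
```

One consequence is that a byte stream alone cannot tell a reader which variant was intended; the declared encoding decides it.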

Unicode and HTML

The circumflex and breve accented Romanian letters have been part of the Unicode standard since its inception, as have the cedilla variants of s and t. Ș and Ț (comma-below variants) were added in Unicode version 3.0[6], but widespread adoption was hampered for some years by the lack of fonts providing the new glyphs. In May 2007, five months after Romania (and Bulgaria) joined the EU, Microsoft released updated fonts that include all official glyphs of the Romanian (and Bulgarian) alphabets[7]. This font update targeted Windows XP SP2, Windows Server 2003, and Windows Vista. The subset of Unicode most widely supported on Microsoft Windows systems, Windows Glyph List 4, still does not include the comma-below variants of S and T.

Phoneme With comma (official) With cedilla
Character Unicode position (hex) HTML entity Character Unicode position (hex) HTML entity
/ʃ/ Ș 0218 &#x218; or &#536; Ş 015E &#x15E; or &#350;
ș 0219 &#x219; or &#537; ş 015F &#x15F; or &#351;
/ʦ/ Ț 021A &#x21A; or &#538; Ţ 0162 &#x162; or &#354;
ț 021B &#x21B; or &#539; ţ 0163 &#x163; or &#355;
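Using the code points from the table above, legacy text written with the cedilla variants can be remapped to the official comma-below letters. This is a minimal sketch; the name `to_comma_below` is hypothetical, and it assumes the input is known to be Romanian, since other languages (Turkish, for example) use s with cedilla legitimately.

```python
# Map each cedilla code point to its comma-below counterpart.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # S with cedilla -> S with comma below
    "\u015F": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021A",  # T with cedilla -> T with comma below
    "\u0163": "\u021B",  # t with cedilla -> t with comma below
})


def to_comma_below(text: str) -> str:
    """Return text with cedilla s/t replaced by the comma-below letters."""
    return text.translate(CEDILLA_TO_COMMA)


print(to_comma_below("\u015fan\u0163"))
```

The reverse mapping may be needed when the target font or code page lacks the comma-below glyphs.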

Vowels with diacritics are coded as follows:

Phoneme Character Unicode position (hex) HTML entity
/ə/ Ă 0102 &#x102; or &#258;
ă 0103 &#x103; or &#259;
/ɨ/ Â 00C2 &Acirc; or &#xC2; or &#194;
â 00E2 &acirc; or &#xE2; or &#226;
Î 00CE &Icirc; or &#xCE; or &#206;
î 00EE &icirc; or &#xEE; or &#238;
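The numeric entities in both tables follow directly from the code point: the hexadecimal form and the decimal form are two spellings of the same number. The sketch below derives them and decodes them again with the standard `html` module; the helper name `numeric_refs` is hypothetical.

```python
import html


def numeric_refs(ch: str) -> tuple[str, str]:
    """Return the hexadecimal and decimal character references for ch."""
    code_point = ord(ch)
    return f"&#x{code_point:X};", f"&#{code_point};"


# U+0218, capital S with comma below
print(numeric_refs("\u0218"))

# Named and numeric references decode to the same characters.
print(html.unescape("&Acirc; &#xC2; &#194;"))
```

Only the circumflex letters have named entities in the table; the breve and comma-below letters are shown there with numeric references only.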

Adobe/Linotype/Vista de-facto standard

Inconsistent cedilla glyphs in Adobe Caslon (left). The correct Romanian rendering (right) can be obtained by activating the OpenType GSUB/latn/ROM/locl feature, which remaps the s with cedilla glyph to comma-below. The rendering on the right is visually indistinguishable from the rendering produced by comma-below code points for this font.

Adobe decided[8] that the Unicode glyphs "t with cedilla" U+0162/3 are not used in any language. Adobe has therefore substituted the glyphs with "t with comma below" (U+021A/B) in all the fonts they ship. The unfortunate consequence of this decision is that Romanian documents using the (unofficial) Unicode points U+015E/F and U+0162/3 (for ș and ț) are rendered in Adobe fonts in a visually inconsistent way using "s with cedilla", but "t with comma" (see figure). Linotype fonts that support Romanian glyphs mostly follow this convention[9].

The fonts introduced by Microsoft in Windows Vista also implement this de-facto Adobe standard. Few Microsoft fonts provide a consistent look when cedilla variants are used; notable ones are Tahoma, Verdana, Trebuchet MS, Microsoft Sans Serif and Segoe UI.

The free DejaVu and Linux Libertine fonts provide proper and consistent glyphs in both variants. Red Hat's Liberation fonts only support the comma below variants starting with version 1.04, scheduled for inclusion in Fedora 10.

OpenType ROM/locl feature

Some OpenType fonts from Adobe and all C-series Vista fonts implement the optional OpenType feature GSUB/latn/ROM/locl[10]. This feature forces "s with cedilla" to be rendered using the same glyph as "s with comma below". When this second (but optional) remapping takes place, Romanian Unicode text is rendered with comma-below glyphs regardless of code point variants.

Unfortunately, most Microsoft pre-Vista OpenType fonts (Arial etc.) do not implement the ROM/locl feature, even after the European Union Expansion Font Update[7], so old documents will look inconsistent, as in the left side of the above figure. A select few fonts, e.g. Verdana and Trebuchet MS, not only have a consistent look for cedilla variants (after the EU update), but also remap both cedilla s and t to comma-below variants when ROM/locl is activated. The free DejaVu and Linux Libertine fonts do not yet offer this feature in their current releases, but development versions do.

Pango supports the locl tag since version 1.17. XeTeX supports locl since version 0.995. As of July 2008, very few Windows applications support the locl feature tag. From the Adobe CS3 suite, only InDesign has support for it.[11]

The status of Romanian support in the free fonts that ship with Fedora is maintained here.

Combining characters

Unicode also allows diacritical marks to be represented as standalone combining characters. For Romanian characters this method is practically unsupported in commercial fonts. A few free fonts like Charis SIL, DejaVu or Linux Libertine support this method, but the typographical quality varies, thus it is preferable to use the single code points instead.
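The relationship between the single code points and the combining sequences can be shown with Unicode normalization. This is a small sketch using Python's standard `unicodedata` module; the helper names `decompose` and `compose` are hypothetical.

```python
import unicodedata


def decompose(text: str) -> str:
    """Split precomposed letters into base letter + combining mark (NFD)."""
    return unicodedata.normalize("NFD", text)


def compose(text: str) -> str:
    """Recombine base letter + combining mark into single code points (NFC)."""
    return unicodedata.normalize("NFC", text)


# Show the code points behind each precomposed letter.
for letter in "\u0103\u00e2\u00ee\u0219\u021b":
    parts = " ".join(f"U+{ord(c):04X}" for c in decompose(letter))
    print(f"U+{ord(letter):04X} -> {parts}")
```

Running text through `compose` before display is one way to follow the advice above and end up with the single code points.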

(La)TeX

LaTeX allows typesetting in Romanian using the cedilla Ş and Ţ using the Cork encoding. The comma-below variants are not completely supported in the standard 8-bit TeX font encodings. The lack of a standard LICR (LaTeX Internal Character Representations) for comma-below Ș and Ț is part of the problem. The latin10 input method attempts to remedy the problem by defining the \textcommabelow LICR accent. This is unfortunately not supported by the utf8 input method. The latin10 package composes the comma-below glyphs by superimposing a comma and the letters S and T. This method is suitable only for printing. In PDF documents produced this way searching or copying text does not work properly. The Polish QX encoding has some support for comma-below glyphs, which are improperly mapped to cedilla LICRs, but also lacks A breve (Ă), which must always be composite, thus unsearchable.

In the Latin Modern Type 1 fonts the T with comma below is found under the AGL name /Tcommaaccent. This contradicts Adobe's decision discussed above, which puts a T with comma-below at /Tcedilla. In consequence, no fixed mapping can work across all Type 1 fonts; each font must come with its own mapping. Unfortunately, TeX output drivers, like dvips, dvipdfm or pdfTeX's internal PDF driver, access the glyphs by AGL name. Since all of the output drivers mentioned are unaware of this peculiarity, the problem is essentially intractable across all fonts. One therefore needs to use fonts that include a mapping which is not bypassed by TeX. This is the case with the newer TeX engine XeTeX, which can use Unicode OpenType fonts and does not bypass the font's Unicode map.

Keyboard layout

Romanian letters on the keyboard of an Apple MacBook Pro

The current Romanian National Standard SR 13392:2004 establishes two layouts for Romanian keyboards: a “primary”[12] one and a “secondary”[13] one.

Romanian SR 13392:2004 ("primary") keyboard layout

The “primary” layout is intended for more traditional users that learned long ago how to type with older, Microsoft-style implementations of the Romanian keyboard. The “secondary” layout is mainly used by programmers and it does not contradict the physical arrangement of keys on a US-style keyboard. The “secondary” arrangement is used as the default one by the majority of GNU/Linux distributions.

There are four Romanian-specific characters that are incorrectly implemented in all Microsoft Windows versions before Vista:

  • “S with comma below” (U+0218) – incorrectly implemented as “S with cedilla” (U+015E)
  • “s with comma below” (U+0219) – incorrectly implemented as “s with cedilla” (U+015F)
  • “T with comma below” (U+021A) – incorrectly implemented as “T with cedilla” (U+0162)
  • “t with comma below” (U+021B) – incorrectly implemented as “t with cedilla” (U+0163)

Since Romanian hardware keyboards are not widely available, Cristian Secară has created a driver that allows the Romanian characters to be generated with a US-style keyboard, in all Windows versions previous to Vista. It uses the right AltGr key modifier to generate the characters.[14]

An alternative, more ergonomic (though non-standard) keyboard layout, with a user choice between cedillas and commas, is proposed and implemented for the Microsoft Windows operating system by the Ergo Romanian project. They suggest altering keys on the standard QWERTY layout which are less frequent in Romanian, namely q, w, y, k, x, to produce Romanian characters ă, ș, ț, î, â, respectively.[15]

Phonetic alphabet

There is a Romanian equivalent to the English-language NATO phonetic alphabet. Most code words are people's first names, with the exception of K, J, Q, W, Y, and Z. Letters with diacritics (Ă, Â, Î, Ș, Ț) are generally transmitted without diacritics (A, A, I, S, T).

Letter Word IPA (unofficial) Letter Word IPA (unofficial)
A Ana /'a.na/ N Nicolae /ni.ko'la.e/
B Barbu /'bar.bu/ O Olga /'ol.ɡa/
C Constantin /kon.stan'tin/ P Petre /'pe.tre/
D Dumitru /du'mi.tru/ Q Q /kju/
E Elena /e'le.na/ R Radu /'ra.du/
F Florea /'flo.re̯a/ S Sandu /'san.du/
G Gheorghe /'ɡe̯or.ɡe/ T Tudor /'tu.dor/
H Haralambie /ha.ra'lam.bi.e/ U Udrea /'u.dre̯a/
I Ion /i'on/ V Vasile /va'si.le/
J Jiu /ʒiw/ W dublu V /du.blu've/
K kilogram /ki.lo'ɡram/ X Xenia /'kse.ni.a/
L Lazăr /'la.zər/ Y I grec /'i.ɡrek/
M Maria /ma'ri.a/ Z zahăr /'za.hər/
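As an illustration of how the table is used, the sketch below spells a word letter by letter, dropping diacritics first as described above. The code words are copied from the table; the names `CODE_WORDS`, `STRIP_DIACRITICS` and `spell` are hypothetical, and characters outside the alphabet are simply skipped.

```python
# Code words copied from the table above.
CODE_WORDS = {
    "A": "Ana", "B": "Barbu", "C": "Constantin", "D": "Dumitru",
    "E": "Elena", "F": "Florea", "G": "Gheorghe", "H": "Haralambie",
    "I": "Ion", "J": "Jiu", "K": "kilogram", "L": "Laz\u0103r",
    "M": "Maria", "N": "Nicolae", "O": "Olga", "P": "Petre", "Q": "Q",
    "R": "Radu", "S": "Sandu", "T": "Tudor", "U": "Udrea", "V": "Vasile",
    "W": "dublu V", "X": "Xenia", "Y": "I grec", "Z": "zah\u0103r",
}

# Ă, Â, Î, Ș, Ț (plus the cedilla variants of S and T) are sent as A, A, I, S, T.
STRIP_DIACRITICS = str.maketrans(
    "\u0102\u00C2\u00CE\u0218\u021A\u015E\u0162", "AAISTST"
)


def spell(word: str) -> list[str]:
    """Return the code word for each letter of word, ignoring other characters."""
    plain = word.upper().translate(STRIP_DIACRITICS)
    return [CODE_WORDS[ch] for ch in plain if ch in CODE_WORDS]


print(spell("\u021bar\u0103"))
```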

References

Notes

  1. ^ (Romanian) Dicţionarul explicativ al limbii române, 1998: Z is the thirty-first letter of the Romanian alphabet
  2. ^ (Romanian) Several Romanian dictionaries specify the pronunciation [je] for word-initial letter e in some personal pronouns: el, ei, etc. and in some forms of the verb a fi (to be): este, eram, etc.
  3. ^ Definition of Romanian word zi at dexonline.ro
  4. ^ Definition of Romanian word Dumnezeu at dexonline.ro
  5. ^ Definition of Romanian word zână at dexonline.ro
  6. ^ Unicode 3.0 standard, p.162
  7. ^ a b European Union Expansion Font Update
  8. ^ comments of Canadian type designer John Hudson
  9. ^ Linotype's font finder allows users to test font rendering with their own sample texts. Tested with the sample text "Țâșnit în şanţ". http://www.linotype.com/featuresearch?cf[]=adobece&cf[]=euro&cf[]=latinext
  10. ^ locl glyph localization feature tag explained.
  11. ^ http://store.adobe.com/type/browser/pdfs/OTGuide.pdf p. 15
  12. ^ Primary keyboard layout
  13. ^ Secondary keyboard layout
  14. ^ Cristian Secară. "RO Keyboard" (in Romanian). Retrieved on 6 January 2009.
  15. ^ Anasoftware. "Ergo Romanian". Retrieved on 6 January 2009.

Bibliography

  • (Romanian) Mioara Avram, Ortografie pentru toţi, Editura Litera Internaţional, 2002
  • The Unicode Consortium (2000). The Unicode Standard, Version 3.0. Boston: Addison-Wesley. ISBN 0201616335. 
