Vietnamese (tiếng Việt, or less commonly Việt ngữ), formerly known under French colonization as Annamese (see Annam), is the national and official language of Vietnam. It is the mother tongue of the Vietnamese people (người Việt or người Kinh), who constitute 86% of Vietnam's population, and of about three million overseas Vietnamese, most of whom live in the United States. It is also spoken as a second language by many ethnic minorities of Vietnam. It is part of the Austroasiatic language family, of which it has the most speakers by a significant margin (several times larger than the other Austroasiatic languages put together).
Much of its vocabulary has been borrowed from Chinese, especially words that denote abstract ideas, in much the same way that European languages borrow from Latin and Greek. It was formerly written using the Chinese writing system, albeit in a modified form and with vernacular pronunciation. The Vietnamese writing system in use today is an adapted version of the Latin alphabet, with additional diacritics for tones and certain letters.
Geographic distribution
As the national language of the majority ethnic group, Vietnamese is spoken throughout Vietnam by the
Vietnamese people, as well as by ethnic minorities. It is also spoken in overseas Vietnamese communities, most notably in the United States, where it has more than one million speakers and is the seventh most-spoken language (it is 3rd in
Texas, 4th in
Arkansas and
Louisiana, and 5th in
California). In
Australia, it is the sixth most-spoken language.
According to the
Ethnologue, Vietnamese is also spoken by substantial numbers of people in
Cambodia,
Canada,
China,
Côte d'Ivoire,
Czech Republic,
Finland,
France,
Germany,
Laos,
Martinique, the
Netherlands,
New Caledonia,
Norway, the
Philippines,
Senegal,
Thailand, the
United Kingdom, and
Vanuatu.
Genealogical classification
Vietnamese was identified more than 150 years ago as part of the
Mon-Khmer branch of the
Austroasiatic language family (a family that also includes
Khmer, spoken in
Cambodia, as well as various tribal and
regional languages, such as the
Munda and
Khasi languages spoken in eastern
India, and others in southern
China). Later,
Mường was found to be more closely related to Vietnamese than other Mon-Khmer languages, and a Việt-Mường sub-grouping was established. As data on more Mon-Khmer languages were acquired, other minority languages (such as Thavưng, Chứt languages, Hung, etc.) were found to share Việt-Mường characteristics, and the
Việt-Mường term was renamed to
Vietic. The older term
Việt-Mường now refers to a lower sub-grouping (within an eastern Vietic branch) consisting of Vietnamese dialects, Mường dialects, and
Nguồn (of
Quảng Bình Province).
Language policy
While spoken by the Vietnamese people for millennia, written Vietnamese did not become the official administrative language of Vietnam until the 20th century. For most of its history, the entity now known as Vietnam used written classical Chinese for governing purposes, whereas written Vietnamese in the form of
Chữ nôm was used for poetry and literature. It was also used for administrative purposes during the brief
Ho and
Tay Son Dynasties. During French colonialism, French superseded Chinese in administration. It was not until independence from France that Vietnamese was used officially. It is the language of instruction in schools and universities and is the language for official business.
History
It seems likely that in the distant past, Vietnamese shared more characteristics common to other languages in the Austroasiatic family, such as an inflectional
morphology and a richer set of
consonant clusters, which have subsequently disappeared from the language. However, Vietnamese appears to have been heavily influenced by its location in the
Southeast Asian sprachbund, with the result that it has acquired or converged toward characteristics such as isolating morphology and
tonogenesis. These characteristics, which may or may not have been part of
proto-Austroasiatic, nonetheless have become part of many of the phylogenetically unrelated languages of Southeast Asia; for example,
Thai (one of the
Kradai languages),
Tsat (a member of the
Malayo-Polynesian group within Austronesian), and Vietnamese each developed
tones as a phonemic feature, although their respective ancestral languages were not originally tonal.
Today, Vietnamese shows the influence of both Chinese and French, the latter a result of French colonial rule.
The ancestor of the Vietnamese language was originally based in the area of the Red River in what is now northern Vietnam. During the subsequent expansion of the Vietnamese language and people into what is now central and southern Vietnam (through conquest of the ancient nation of Champa and of the Khmer people of the Mekong Delta, in the vicinity of present-day Ho Chi Minh City (Saigon)), characteristic tonal variations emerged.
Vietnamese was linguistically influenced primarily by Chinese, which came to predominate politically in the 2nd century B.C.
With the rise of Chinese political dominance came radical importation of Chinese vocabulary and grammatical influence. As
Chinese was, for a prolonged period, the only medium of literature and government, as well as the primary written language of the ruling class in Vietnam, much of the Vietnamese lexicon in all realms consists of
Hán Việt (
Sino-Vietnamese) words. In fact, as the vernacular language of Vietnam gradually grew in prestige toward the beginning of the second millennium, the Vietnamese language was written using
Chinese characters (using both the original Chinese characters, called
Hán tự, as well as a system of newly created and modified characters called
Chữ nôm) adapted to write Vietnamese, in a similar pattern as used in Japan (
kanji), Korea (
hanja), and other countries in the
Sinosphere. The
Nôm writing reached its zenith in the 18th century when many Vietnamese writers and poets composed their works in
Chữ Nôm, most notably
Nguyễn Du and
Hồ Xuân Hương (dubbed "the Queen of Nôm poetry").
As contact with the West grew, the
Quốc Ngữ system of Romanized writing was developed in the 17th century by
Portuguese and other
Europeans involved in
proselytizing and trade in Vietnam. When
France invaded Vietnam in the late 19th century,
French gradually replaced Chinese as the official language in education and government. Vietnamese adopted many French terms, such as
đầm (dame, from
madame),
ga (train station, from
gare),
sơ mi (shirt, from
chemise), and
búp bê (doll, from
poupée). In addition, many Sino-Vietnamese terms were devised for Western ideas imported through the French. However, the Romanized script did not come to predominate until the beginning of the 20th century, when education became widespread and a simpler writing system was found more expedient for teaching and communication with the general population.
Vocabulary
As a result of a thousand years of Chinese occupation, much of the Vietnamese
lexicon relating to science and politics is derived from Chinese. As much as 60%-70% of the vocabulary has Chinese roots, although many compound words are
Sino-Vietnamese, composed of native Vietnamese words combined with Chinese borrowings. A word can usually be identified as native Vietnamese rather than a Chinese borrowing if it can be reduplicated or if its meaning does not change when the tone is shifted. As a result of French colonization, Vietnamese also has words borrowed from the French language, for example cà phê (from French café). Many new words are still being added to the lexicon; these are either borrowed from English, for example TV (usually written as tivi), or newly coined in Vietnamese (the coined term for television is truyền hình). Some of these coinages are calques, translated literally into Vietnamese (for example, 'software' is rendered as phần mềm, which literally means "soft part").
Sounds
Vowels
Like other Southeast Asian languages, Vietnamese has a comparatively large number of
vowels. Below is a
vowel diagram of Hanoi Vietnamese.
Front, central, and low vowels (
i,
ê,
e,
ư,
â,
ơ,
ă,
a) are
unrounded, whereas the back vowels (
u,
ô,
o) are rounded. The vowels
â and ă are pronounced very short, much shorter than the other vowels. Thus, ơ and â are basically pronounced the same except that ơ is long while â is short; the same applies to the low vowels, long a and short ă.
In addition to single vowels (or
monophthongs), Vietnamese has
diphthongs and
triphthongs. The diphthongs consist of a main vowel component followed by a shorter semivowel offglide to a high front position, a high back position, or a central position.
The centering diphthongs are formed with only the three high vowels (
i,
ư,
u) as the main vowel. They are generally spelled as
ia,
ưa,
ua when they end a word and are spelled
iê,
ươ,
uô, respectively, when they are followed by a consonant. There are also restrictions on the high offglides: the high front offglide cannot occur after a front vowel (
i,
ê,
e) nucleus and the high back offglide cannot occur after a back vowel (
u,
ô,
o) nucleus.
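The spelling rule and the offglide restrictions described above can be sketched as two small lookup functions. This is a minimal illustration in Python, not a full model of Vietnamese orthography: the function names and the "front"/"back" labels are hypothetical, and only the cases stated in the text are covered.

```python
import unicodedata


def _nfc(text: str) -> str:
    """Normalize to composed form so that letters like ư compare reliably."""
    return unicodedata.normalize("NFC", text)


# Centering diphthongs keyed by their high-vowel nucleus:
# (spelling at the end of a word, spelling before a consonant).
CENTERING_SPELLINGS = {
    _nfc("i"): (_nfc("ia"), _nfc("iê")),
    _nfc("ư"): (_nfc("ưa"), _nfc("ươ")),
    _nfc("u"): (_nfc("ua"), _nfc("uô")),
}

FRONT_NUCLEI = {_nfc(v) for v in ("i", "ê", "e")}
BACK_NUCLEI = {_nfc(v) for v in ("u", "ô", "o")}


def spell_centering_diphthong(nucleus: str, before_consonant: bool) -> str:
    """Pick the spelling of a centering diphthong from its position in the word."""
    word_final, pre_consonant = CENTERING_SPELLINGS[_nfc(nucleus)]
    return pre_consonant if before_consonant else word_final


def offglide_allowed(nucleus: str, offglide: str) -> bool:
    """Apply the stated restriction: no front glide after a front nucleus,
    no back glide after a back nucleus."""
    if offglide == "front":
        return _nfc(nucleus) not in FRONT_NUCLEI
    if offglide == "back":
        return _nfc(nucleus) not in BACK_NUCLEI
    raise ValueError("offglide must be 'front' or 'back'")
```

For example, `spell_centering_diphthong("u", before_consonant=True)` selects the spelling uô, while `offglide_allowed("e", "front")` is false because e is a front vowel.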
The correspondence between the orthography and pronunciation is complicated. For example, the high front offglide is usually written as i; however, it may also be represented with y. In addition, in the diphthongs ay and ai, the letters y and i also indicate the pronunciation of the main vowel: ay = ă + front offglide, ai = a + front offglide. Thus, tay "hand" has the short vowel ă while tai "ear" has the long vowel a. Similarly, u and o indicate different pronunciations of the main vowel: au = ă + back offglide, ao = a + back offglide. Thus, thau "brass" has the short vowel ă while thao "raw silk" has the long vowel a.
The four triphthongs are formed by adding front and back offglides to the centering diphthongs. Similarly to the restrictions involving diphthongs, a triphthong with front nucleus cannot have a front offglide (after the centering glide) and a triphthong with a back nucleus cannot have a back offglide.
With regard to the front and back offglides, many phonological descriptions analyze these as consonant glides. Thus, a word such as đâu "where" would be phonemicized as ending in a consonant glide rather than in a vowel offglide.
Tones

Pitch contours and duration of the six Northern Vietnamese tones as uttered by a male speaker (not from Hanoi).
Fundamental frequency is plotted over time. From Nguyễn & Edmondson (1998).
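Vietnamese vowels are all pronounced with an inherent tone. Tones differ in length (duration), pitch contour, pitch height, and phonation.
Tone is indicated by diacritics written above or below the vowel (most of the tone diacritics appear above the vowel; however, the nặng tone dot diacritic goes below the vowel). The six tones in the northern varieties (including Hanoi) are:
- ngang (no diacritic)
- huyền (grave accent)
- sắc (acute accent)
- hỏi (hook above)
- ngã (tilde)
- nặng (dot below)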
Other dialects of Vietnamese have fewer tones (typically only five). See the
language variation section for a brief survey of tonal differences among dialects.
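Because each tone is written with its own diacritic, the tone of a written syllable can be read off mechanically. The sketch below is one way to do this in Python, assuming the text is Unicode: it decomposes the syllable (NFD normalization) and looks for one of the five combining tone marks. The function name is hypothetical, and the tone names used as return values are the conventional Vietnamese ones.

```python
import unicodedata

# Combining marks used for the five marked tones; a syllable with none
# of them carries the unmarked (ngang) tone.
TONE_MARKS = {
    "\u0300": "huyền",  # combining grave accent
    "\u0301": "sắc",    # combining acute accent
    "\u0309": "hỏi",    # combining hook above
    "\u0303": "ngã",    # combining tilde
    "\u0323": "nặng",   # combining dot below
}


def tone_of(syllable: str) -> str:
    """Return the tone name of one written Vietnamese syllable."""
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"
```

Marks that change vowel quality (the circumflex, breve, and horn) are deliberately ignored, so a syllable such as Việt is classified by its dot below alone.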
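In Vietnamese poetry, tones are classed into two groups: bằng ("level"), comprising ngang and huyền, and trắc ("oblique"), comprising sắc, hỏi, ngã, and nặng.
Words with tones belonging to a particular tone group must occur in certain positions within the poetic verse.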
Consonants
The consonants that occur in Vietnamese are listed below in the Vietnamese
orthography with the phonetic pronunciation to the right.
Some consonant sounds are written with only one letter (like "p"), other consonant sounds are written with a two-letter
digraph (like "ph"), and others are written with more than one letter or digraph (the velar stop is written variously as "c", "k", or "q").
Not all dialects of Vietnamese have the same consonant in a given word (although all dialects use the same spelling in the written language). See the
language variation section for further elaboration.
Syllable-final orthographic ch and nh in Hanoi Vietnamese have been analyzed in different ways. One analysis treats final ch and nh as phonemes contrasting with syllable-final t, c and n, ng, and identifies final ch with the syllable-initial ch. The other analysis treats final ch and nh as predictable allophonic variants of the velar phonemes (written c and ng) that occur after the upper front vowels i and ê. (See Vietnamese phonology: Analysis of final ch, nh for further details.)
Language variation
There are various mutually intelligible regional varieties (or dialects), the main four being Northern, North-central, Central, and Southern.

Vietnamese has traditionally been divided into three dialect regions: North, Central, and South. However, Michel Ferlus and Nguyễn Tài Cẩn offer evidence for considering a North-Central region separate from Central. The term
Haut-Annam refers to dialects spoken from northern Nghệ An Province to southern (former) Thừa Thiên Province that preserve archaic features (like consonant clusters and undiphthongized vowels) that have been lost in other modern dialects.
These dialect regions differ mostly in their sound systems (see below), but also in vocabulary (including basic vocabulary, non-basic vocabulary, and grammatical words) and grammar. The North-central and Central regional varieties, which have a significant amount of vocabulary differences, are generally less
mutually intelligible to Northern and Southern speakers. There is less internal variation within the Southern region than the other regions due to its relatively late settlement by Vietnamese speakers (in around the end of the 15th century). The North-central region is particularly conservative. Along the coastal areas, regional variation has been neutralized to a certain extent while more mountainous regions preserve more variation. As for
sociolinguistic attitudes, the North-central varieties are often felt to be "peculiar" or "difficult to understand" by speakers of other dialects.
Large movements of people between North and South, beginning in the mid-20th century and continuing to this day, have resulted in a significant number of Southern residents speaking with a Northern accent or dialect and, to a lesser extent, Northern residents speaking with a Southern accent or dialect. Following the Geneva Accords of 1954, which called for the "temporary" division of the country, almost a million Northern speakers (mainly from Hanoi and the surrounding Red River Delta areas) moved South (mainly to Saigon, now Ho Chi Minh City, and the surrounding areas). About a third of that number made the move in the reverse direction.
Following the reunification of Vietnam in 1975-76, Northern and North-Central speakers from the densely populated Red River Delta and the traditionally poorer provinces of Nghe An, Ha Tinh and Quang Binh have continued to move South to look for better economic opportunities. Additionally, government and military personnel are posted to various locations throughout the country, often away from their home regions. More recently, the growth of the free market system has resulted in business people and tourists traveling to distant parts of Vietnam. These movements have resulted in some small blending of the dialects but, more significantly, have made the Northern dialect more easily understood in the South and vice versa. Most Southerners, when singing modern or popular Vietnamese songs, do so in the Northern accent. This is true in Vietnam as well as in the overseas Vietnamese communities.
The
syllable-initial
ch and
tr digraphs are pronounced distinctly in North-central, Central, and Southern varieties, but are merged in Northern varieties (i.e. they are both pronounced the same way). The North-central varieties preserve three distinct pronunciations for
d,
gi, and
r whereas the North has a three-way merger and the Central and South have a merger of
d and
gi while keeping
r distinct. At the end of syllables, palatals
ch and
nh have merged with alveolars
t and
n, which, in turn, have also partially merged with velars
c and
ng in Central and Southern varieties.
In addition to the regional variation described above, there is also a merger of
l and
n in certain rural varieties:
Variation between
l and
n can be found even in mainstream Vietnamese in certain words. For example, the numeral "five" appears as
năm by itself and in compound numerals like
năm mươi "fifty" but appears as
lăm in
mười lăm "fifteen". (See
Vietnamese syntax: Cardinal numerals.) In some northern varieties, this numeral appears with an initial
nh instead of
l:
hai mươi nhăm "twenty-five" vs. mainstream
hai mươi lăm.
The consonant clusters that were originally present in Middle Vietnamese (of the 17th century) have been lost in almost all modern Vietnamese varieties (but retained in other closely related
Vietic languages). However, some speech communities have preserved some of these archaic clusters: "sky" is
blời with a cluster in Hảo Nho (Yên Mô prefecture,
Ninh Binh Province) but
trời in Southern Vietnamese and
giời in Hanoi Vietnamese (each with a single initial consonant).
Generally, the Northern varieties have six tones while those in other regions have five tones. The
hỏi and
ngã tones are distinct in North and some North-central varieties (although often with different
pitch contours) but have merged in Central, Southern, and some North-central varieties (also with different pitch contours). Some North-central varieties (such as
Hà Tĩnh Vietnamese) have a merger of the
ngã and
nặng tones while keeping the
hỏi tone distinct. Still other North-central varieties have a three-way merger of
hỏi,
ngã, and
nặng resulting in a four-tone system. In addition, there are several phonetic differences (mostly in pitch contour and
phonation type) in the tones among dialects.
The table above shows the pitch contour of each tone using
Chao tone number notation (where 1 = lowest pitch, 5 = highest pitch);
glottalization (
creaky,
stiff,
harsh) is indicated with the <> symbol;
breathy voice with <>;
glottal stop with <>; sub-dialectal variants are separated with commas. (See also the
tone section above.)
Grammar
Vietnamese, like many languages in Southeast Asia, is an
analytic (or isolating) language. Vietnamese does not use
morphological marking of
case,
gender,
number or
tense (and, as a result, has no
finite/
nonfinite distinction). Also like other languages in the region, Vietnamese syntax conforms to
Subject Verb Object word order, is
head-initial (displaying modified-
modifier ordering), and has a noun
classifier system. Additionally, it is
pro-drop,
wh-in-situ, and allows
verb serialization.
Some Vietnamese sentences with English word
glosses and translations are provided below.
Writing system
Currently, the written language uses the
Vietnamese alphabet (
quốc ngữ or "national script", literally "national language"), based on the
Latin alphabet. Originally a
Romanization of Vietnamese, it was codified in the 17th century by a French
Jesuit missionary named
Alexandre de Rhodes (1591–1660), based on works of earlier
Portuguese missionaries (Gaspar do Amaral and António Barbosa). The use of the script was gradually extended from its initial domain in Christian writing to become more popular among the general public.
Under French colonial rule, the script became official and was required for all public documents in 1910 by a decree of the French Résident Supérieur of the protectorate of Tonkin. By the middle of the 20th century, virtually all writing was done in
quốc ngữ.
Changes in the script were made by French scholars and administrators and by conferences held after independence during 1954–1974. The script now reflects a so-called
Middle Vietnamese dialect that has vowels and final consonants most similar to northern dialects and initial consonants most similar to southern dialects (Nguyễn 1996). This Middle Vietnamese is presumably close to the Hanoi variety as spoken sometime after 1600 but before the present.
Before French rule, the first two Vietnamese writing systems were based on Chinese script:
- the standard Chinese character set, known as chữ nho, used to write classical Chinese
- a complicated variant form known as chữ nôm (southern/vernacular characters, 字喃) with characters not found in the Chinese character set; this system was better adapted to the unique phonetic aspects of Vietnamese, which differed from Chinese
The authentic Chinese writing,
chữ nho, was in more common usage, whereas
chữ nôm was used by members of the educated elite (one needs to be able to read
chữ nho in order to read
chữ nôm). Both scripts have fallen out of common usage in modern Vietnam, and today almost no citizens are able to read chữ nôm.
Chữ nho was still in use on early
North Vietnamese and late
French Indochinese banknotes issued after WWII but fell out of official use shortly thereafter.
Computer support
The
Unicode character set contains all Vietnamese characters and the Vietnamese currency symbol. On systems that do not support Unicode, many 8-bit Vietnamese
code pages are available such as
VISCII or
CP1258. Where
ASCII must be used, Vietnamese letters are often typed using the
VIQR convention, though this is largely unnecessary nowadays, given the increasing ubiquity of Unicode. There are many software tools that help type Vietnamese text on US keyboards, on both Windows and Macintosh.
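To illustrate how an ASCII convention such as VIQR relates to Unicode text, the sketch below converts a VIQR-style string into composed Unicode. It assumes the commonly described VIQR mnemonics, which the text above does not spell out: `(` for the breve, `^` for the circumflex, `+` for the horn, `'`, `` ` ``, `?`, `~`, and `.` for the five tone marks, `dd` for đ, and a backslash to escape a literal character. The function name is hypothetical, and this is a simplified reading of the convention rather than a complete implementation.

```python
import unicodedata

# Vowel-quality modifiers: ASCII symbol -> (letters it may follow, combining mark).
QUALITY = {
    "(": ("aA", "\u0306"),      # breve: ă
    "^": ("aeoAEO", "\u0302"),  # circumflex: â, ê, ô
    "+": ("ouOU", "\u031b"),    # horn: ơ, ư
}
# Tone modifiers: ASCII symbol -> combining mark.
TONES = {"'": "\u0301", "`": "\u0300", "?": "\u0309", "~": "\u0303", ".": "\u0323"}
VOWELS = "aeiouyAEIOUY"
QUALITY_MARKS = "\u0306\u0302\u031b"


def viqr_to_unicode(text: str) -> str:
    """Convert a VIQR-style ASCII string to composed (NFC) Unicode."""
    text = text.replace("dd", "đ").replace("DD", "Đ").replace("Dd", "Đ")
    out = []
    escaped = False
    for ch in text:
        prev = out[-1] if out else ""
        if escaped:
            out.append(ch)  # character after a backslash is kept literally
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch in QUALITY and prev and prev in QUALITY[ch][0]:
            out.append(QUALITY[ch][1])
        elif ch in TONES and prev and (prev in VOWELS or prev in QUALITY_MARKS):
            out.append(TONES[ch])
        else:
            out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))
```

Under these assumptions, `viqr_to_unicode("tie^'ng Vie^.t")` yields tiếng Việt. Punctuation that directly follows a vowel is ambiguous in this scheme (a sentence-final period would be read as the dot-below tone), which is why the escape character is needed and is one reason Unicode input is simpler.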
Pragmatics and ethnography of communication
Word play
A
language game known as
nói lái is used by Vietnamese speakers and is often considered clever.
Nói lái involves switching the tones in a pair of words and also the order of the two words or the first consonant and
rime of each word; the resulting
nói lái pair preserves the original sequence of tones. Some examples:
The resulting transformed phrase often has a different meaning but sometimes may just be a nonsensical word pair.
Nói lái can be used to obscure the original meaning and thus soften the discussion of a socially sensitive issue, as with
dấm đài and
hoảng chưa (above) or, when implied (and not overtly spoken), to deliver a hidden subtextual message, as with
bồi tây. Naturally,
nói lái can be used for a humorous effect.
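One variant of nói lái described above, exchanging the rimes of two words while each position keeps its original tone, can be sketched in Python as follows. This is a simplified illustration under stated assumptions: input is lowercase Unicode, the onset is taken to be everything before the first vowel letter (so spellings such as qu and gi are not treated specially), and the tone-placement rule is a rough approximation. All function names are hypothetical. The pair bầy tôi is used as an assumed input because its rime swap produces bồi tây, a form mentioned in the text.

```python
import unicodedata

TONE_MARKS = "\u0300\u0301\u0303\u0309\u0323"
QUALITY_VOWELS = unicodedata.normalize("NFC", "ăâêôơư")
VOWELS = "aeiouy" + QUALITY_VOWELS


def split_syllable(syllable):
    """Return (onset, toneless rime, tone mark) for one lowercase syllable."""
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = "".join(ch for ch in decomposed if ch in TONE_MARKS)
    toneless = unicodedata.normalize(
        "NFC", "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    )
    first_vowel = next(
        (i for i, ch in enumerate(toneless) if ch in VOWELS), len(toneless)
    )
    return toneless[:first_vowel], toneless[first_vowel:], tone


def attach_tone(rime, tone):
    """Put a tone mark on a rime using a rough placement heuristic."""
    positions = [i for i, ch in enumerate(rime) if ch in VOWELS]
    if not tone or not positions:
        return rime
    marked = [i for i in positions if rime[i] in QUALITY_VOWELS]
    if marked:
        target = marked[-1]      # prefer a vowel that already has a quality mark
    elif len(positions) >= 2 and rime[-1] in VOWELS:
        target = positions[-2]   # open rime with several vowels
    else:
        target = positions[-1]   # single vowel, or rime closed by a consonant
    return unicodedata.normalize(
        "NFC", rime[: target + 1] + tone + rime[target + 1 :]
    )


def noi_lai_swap_rimes(first, second):
    """Swap the rimes of two syllables; onsets and tones stay in position."""
    onset1, rime1, tone1 = split_syllable(first)
    onset2, rime2, tone2 = split_syllable(second)
    return onset1 + attach_tone(rime2, tone1), onset2 + attach_tone(rime1, tone2)
```

With these assumptions, `noi_lai_swap_rimes("bầy", "tôi")` gives the pair bồi, tây: the rimes ây and ôi change places while the tone sequence of the original pair is preserved. The other variants mentioned in the text (swapping tones or word order) are not covered by this sketch.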
Another word game somewhat reminiscent of
pig latin is played by children. Here a nonsense syllable (chosen by the child) is prefixed onto a target word's syllables, then their initial consonants and rimes are switched with the tone of the original word remaining on the new switched rime.