The
Latin alphabet, also called the
Roman alphabet, is the most widely used
alphabetic
writing system in the world today. It evolved from the western variety of the
Greek alphabet called the
Cumaean alphabet, which was borrowed and modified by the
Etruscans who
ruled early Rome; that alphabet was in turn adapted and further modified by the
ancient Romans to write the
Latin language.
During the
Middle Ages, it was adapted to the
Romance languages, the direct descendants of Latin, as well as to the
Celtic,
Germanic,
Baltic, and some
Slavic languages, and finally to most of the
languages of Europe.
With the
age of colonialism and
Christian proselytism, the Latin alphabet was spread overseas, and applied to
Indigenous American,
Indigenous Australian,
Austronesian,
East Asian, and
African languages. More recently, western
linguists have also tended to prefer the Latin alphabet or the
International Phonetic Alphabet (itself largely based on the Latin alphabet) when transcribing or creating written standards for non-European languages, such as the
African reference alphabet.
In modern usage, the term
Latin alphabet is used for any direct derivation of the alphabet first used to write Latin. These variants may discard letters from the classical Roman script (like the
Rotokas alphabet) or add new characters to it, as in the
Danish and Norwegian alphabet. Letter shapes have changed over the centuries, including the creation of entirely new
lower case characters.
History
Origins
It is generally believed that the
Romans adopted the
Cumae alphabet, a variant of the
Greek alphabet, in the
7th century BC from
Cumae, a
Greek colony in
Southern Italy. (
Gaius Julius Hyginus in
Fab. 277 mentions the legend that it was
Carmenta, the
Cimmerian Sibyl, who altered fifteen letters of the Greek alphabet to become the Latin alphabet, which her son
Evander introduced into Latium, supposedly 60 years before the
Trojan War, but there is no historically sound basis to this tale.) The Ancient Greek alphabet was in turn based upon the Phoenician alphabet. From the Cumae alphabet, the
Etruscan alphabet was derived and the Romans eventually adopted 21 of the original 26 Etruscan letters:
The letter
C was the western form of the Greek
gamma, but it was used for the sounds /ɡ/ and /k/ alike, possibly under the influence of
Etruscan, which lacked any voiced
plosives. Later, probably during the 3rd century BC, the letter
Z — unneeded to write Latin proper — was replaced with the new letter
G, a
C modified with a small horizontal stroke, which took its place in the alphabet. From then on,
G represented the
voiced plosive /ɡ/, while
C was generally reserved for the voiceless plosive /k/. The letter
K was used only rarely, in a small number of words such as
Kalendae, often interchangeably with
C.
After the Roman conquest of
Greece in the
first century BC, Latin adopted the Greek letters
Y and
Z (or rather readopted, in the latter case) to write
Greek loanwords, placing them at the end of the alphabet. An attempt by the emperor
Claudius to introduce three
additional letters did not last. Thus it was that during the
classical Latin period the Latin alphabet contained 23 letters:
The Latin names of some of these letters are disputed. In general, however, the Romans did not use the traditional (
Semitic-derived) names as in Greek: the names of the
plosives were formed by adding /eː/ to their sound (except for
K and
Q, which needed different vowels to be distinguished from
C) and the names of the
continuants consisted either of the bare sound, or the sound preceded by /e/. The letter
Y when introduced was probably called
hy as in Greek, the name
upsilon not being in use yet, but this was changed to
i Graeca (Greek letter 'i') as Latin speakers had difficulty distinguishing its foreign sound /y/ from /i/.
Z was given its Greek name,
zeta. For the Latin sounds represented by the various letters see
Latin spelling and pronunciation; for the names of the letters in English see
English alphabet. The modern language that has been most conservative in preserving the ancient Roman names of the letters is
German.
Old Roman cursive script, also called
majuscule cursive and capitalis cursive, was the everyday form of handwriting used for writing letters, by merchants writing business accounts, by schoolchildren learning the Latin alphabet, and even by
emperors issuing commands. A more formal style of writing was based on
Roman square capitals, but cursive was used for quicker, informal writing. It was most commonly used from about the 1st century BC to the 3rd century, but it probably existed earlier than that. It led to
Uncial, a
majuscule script commonly used from the 3rd to 8th centuries AD by Latin and Greek scribes.
New Roman cursive script, also known as
minuscule cursive, was in use from the 3rd century to the 7th century, and uses letter forms that are more recognizable to modern eyes;
a,
b,
d, and
e had taken a more familiar shape, and the other letters were proportionate to each other. This script evolved into the medieval scripts known as
Merovingian and
Carolingian minuscule.
Medieval and later developments


It was not until the
Middle Ages that the letter
W (originally a
ligature of
V and
V) was added to the Latin alphabet, to represent sounds from the
Germanic languages which did not exist in medieval Latin, and only after the
Renaissance did the convention of treating
I and
U as
vowels, and
J and
V as
consonants, become established. Prior to that, J and U had been merely
glyph variants of I and V, respectively.
With the fragmentation of political power, the
style of writing changed and varied greatly throughout the Middle Ages, and even after the invention of the
printing press. Early deviations from the classical forms were the
uncial script, a development of the
Old Roman cursive, and various so-called minuscule scripts that developed from
New Roman cursive, of which the
Carolingian minuscule was the most influential, introducing the
lower case forms of the letters, as well as other writing conventions that have since become standard.
The languages that use the Latin alphabet today generally use
capital letters to begin paragraphs and sentences, and for
proper nouns. The rules for
capitalization have changed over time, and different languages have varied in their rules for capitalization.
Old English, for example, was rarely written with even proper nouns capitalized, whereas
Modern English of the 18th century frequently had all nouns capitalized, in the same way that Modern
German is written today, e.g. "Alle Schwestern der alten Stadt hatten die Vögel gesehen".
Spread of the Latin alphabet
The Latin alphabet spread, along with the
Latin language, from the
Italian Peninsula to the lands surrounding the
Mediterranean Sea with the expansion of the
Roman Empire. The eastern half of the Empire, including
Greece,
Asia Minor, the
Levant, and
Egypt, continued to use
Greek as a
lingua franca, but Latin was widely spoken in the western half, and as the western
Romance languages evolved out of Latin, they continued to use and adapt the Latin alphabet.
With the spread of
Western Christianity during the
Middle Ages, the alphabet was gradually adopted by the peoples of
northern Europe who spoke
Celtic languages (displacing the
Ogham alphabet) or
Germanic languages (displacing earlier
Runic alphabets),
Baltic languages, as well as by the speakers of several
Finno-Ugric languages, most notably
Hungarian,
Finnish and
Estonian. The alphabet also came into use for writing the
West Slavic languages and several
South Slavic languages, as the people who spoke them adopted
Roman Catholicism. The speakers of
East Slavic languages generally adopted the
Cyrillic alphabet along with
Orthodox Christianity, although the
Latin alphabet was actively used in
Belarus in the late Middle Ages and there has always been a significant
Roman Catholic minority in the country. The
Serbian language uses both alphabets, with Latin being the predominant alphabet in the province of Vojvodina.
As late as 1492, the Latin alphabet was limited primarily to the languages spoken in
Western,
Northern, and
Central Europe. The
Orthodox Christian Slavs of
Eastern and
Southeastern Europe mostly used the
Cyrillic alphabet, and the Greek alphabet was in use by Greek-speakers around the eastern Mediterranean. The
Arabic alphabet was widespread within Islam, both among
Arabs and non-Arab nations like the
Iranians,
Indonesians,
Malays, and
Turkic peoples. Most of the rest of Asia used a variety of
Brahmic alphabets or the
Chinese script.

Latin alphabet world distribution. The dark green areas show the countries where this alphabet is the sole main script. The light green areas show the countries where the alphabet co-exists with other scripts. The Latin alphabet is sometimes used extensively even in areas coloured grey, owing to unofficial second languages (e.g. French in Algeria or English in Egypt) and Latin transliterations of the official language (practised to some degree in most countries with a non-Latin alphabet).
Over the past 500 years, the Latin alphabet has spread around the world, to
the Americas,
Oceania, and parts of
Asia,
Africa, and the Pacific with European colonization, along with the
Spanish,
Portuguese,
English,
French,
Swedish and
Dutch languages. The Latin alphabet is also used for many
Austronesian languages, including
Tagalog and the other
languages of the Philippines, and the official
Malaysian and
Indonesian languages, replacing earlier Arabic and indigenous Brahmic alphabets. Some glyph forms from the Latin alphabet served as the basis for the forms of the symbols in the
Cherokee syllabary developed by
Sequoyah; however, the sounds of the final
syllabary were completely different.
L. L. Zamenhof used the Latin alphabet as the basis for the alphabet of
Esperanto. The Latin alphabet was likewise chosen for the
Ido language because of its international predominance among the world's population.
In the late nineteenth century, the
Romanians adopted the Latin alphabet, primarily because
Romanian is a Romance language. The Romanians were predominantly Orthodox Christians, and their Church had promoted the Cyrillic alphabet prior to that.
Under French rule and Portuguese missionary influence, the Latin alphabet was adapted for writing the
Vietnamese language, which had previously used
Chinese-like characters.
In 1928, as part of
Kemal Atatürk's reforms,
Turkey adopted the Latin alphabet for the
Turkish language, replacing the Arabic alphabet. Most of the
Turkic-speaking peoples of the former
USSR, including
Tatars,
Bashkirs,
Azeri,
Kazakh,
Kyrgyz and others, used the Latin-based
Uniform Turkic alphabet in the 1930s, but in the 1940s all those alphabets were replaced by Cyrillic. After the collapse of the
Soviet Union in 1991, several of the newly independent Turkic-speaking republics, namely
Azerbaijan,
Uzbekistan, and
Turkmenistan, as well as Romanian-speaking
Moldova, have officially adopted the Latin alphabet for
Azeri,
Uzbek,
Turkmen, and
Romanian respectively.
Kyrgyzstan,
Tajikistan, and the breakaway region of
Transnistria kept the Cyrillic alphabet, chiefly due to their close ties with Russia. During the 1930s and 1940s, the majority of
Kurds throughout the
Kurdistan region replaced their use of the Arabic alphabet for writing in the
Kurdish language by adopting two forms of the Latin alphabet.
Although today the only official
Kurdish government located in
Iraq uses the Arabic alphabet for public documents, the Latin alphabet remains widely used throughout the region by the majority of
Kurdish-speakers.
Extensions
In the course of its use, the Latin alphabet was adapted for use in new languages, sometimes representing
phonemes not found in languages that were already written with the Roman characters. To represent these new sounds, extensions were therefore created, be it by adding
diacritics to existing
letters, by joining multiple letters together to make
ligatures, by creating completely new forms, or by assigning a special function to pairs or triplets of letters. These new forms are given a place in the alphabet by defining an
alphabetical order or collation sequence, which can vary with the particular language.
Ligatures
A
ligature is a fusion of two or more ordinary letters into a new
glyph or character. Examples are
Æ/æ (from
AE, called "ash"),
Œ/œ (from
OE, sometimes called "oethel"), the
abbreviation & (from
Latin et "and"), and the
German symbol
ß ("sharp
S" or
Eszett, from
ſz or
ſs, the
archaic medial form of s, followed by a
z or
s).
Wholly new letters
Examples are the
Runic letters
wynn (Ƿ/ƿ) and
thorn (
Þ/þ), and the
Irish letter
eth (
Ð/ð), which were added to the alphabet of
Old English. Another Irish letter, the
insular g, developed into
yogh (Ȝ/ȝ), used in
Middle English. Wynn was later replaced with the new letter
w, eth and thorn with
th, and yogh with
gh. Although the four are no longer part of the English or Irish alphabets, eth and thorn are still used in the modern
Icelandic alphabet.
The
Azerbaijani alphabet has adopted the
letter schwa from the
International Phonetic Alphabet, using it to represent the sound /æ/. Some West, Central and
Southern African languages use a few additional letters which have a similar sound value to their equivalents in the IPA. For example,
Adangme uses the letters Ɛ/ɛ and Ɔ/ɔ, and
Ga uses Ɛ/ɛ,
Ŋ/ŋ and Ɔ/ɔ.
Hausa uses Ɓ/ɓ and Ɗ/ɗ for
implosives, and Ƙ/ƙ for an
ejective.
Africanists have standardized these into the
African reference alphabet.
Digraphs and trigraphs
A digraph is a pair of letters used to write one sound or a combination of sounds that does not correspond to the written letters in sequence. Examples are
ch,
rh,
sh in English, or the
Dutch ij (note that
ij is capitalized as
IJ or the ligature
Ĳ, and sometimes as the single letter
Y even though that is a different letter, but never as
Ij, and that it often takes the appearance of a ligature
ĳ very similar to the letter
ÿ in
handwriting). A trigraph is made up of three letters, like the
German sch, the
Breton c’h or the
Milanese oeu. In the
orthographies of some languages, digraphs and trigraphs are regarded as independent letters of the alphabet in their own right. The capitalization of digraphs and trigraphs is language-dependent: either only the first letter is capitalized, or all component letters are capitalized together, even in title-case words whose remaining letters are left in lowercase.
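The two capitalization conventions can be illustrated with a minimal Python sketch. The helper name `dutch_titlecase` is hypothetical, and the rule it encodes (capitalize both letters of an initial ij) is only the Dutch convention described above, not a general algorithm for every language.

```python
def dutch_titlecase(word: str) -> str:
    """Capitalize a Dutch word, treating an initial 'ij' as one unit."""
    lower = word.lower()
    if lower.startswith("ij"):
        # Both component letters of the digraph are capitalized together.
        return "IJ" + lower[2:]
    # Otherwise only the first letter is capitalized.
    return lower[:1].upper() + lower[1:]


print(dutch_titlecase("ijsselmeer"))  # IJsselmeer
print(dutch_titlecase("amsterdam"))   # Amsterdam
print("ijsselmeer".capitalize())      # Ijsselmeer (generic rule, wrong for Dutch)
```

A language that capitalizes only the first letter of a digraph needs no special handling, since the generic `str.capitalize` already does that.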
Diacritics
A diacritic, in some cases also called an accent, is a small symbol which can appear above or below a letter, or in some other position, such as the
umlaut sign used in the German characters
Ä,
Ö,
Ü. Its main function is to change the phonetic value of the letter to which it is added, but it may also modify the pronunciation of a whole syllable or word, or distinguish between
homographs. As with letters, the value of diacritics is language-dependent.
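In Unicode text, a letter with a diacritic can usually be decomposed into its base letter followed by one or more combining marks. The sketch below uses Python's standard `unicodedata` module to show this. The function name `split_diacritics` is an illustrative assumption, and the mark names are Unicode's own character names: Unicode calls the two dots "diaeresis" even where they function as an umlaut.

```python
import unicodedata


def split_diacritics(char: str) -> tuple:
    """Return (base letter, names of the combining marks) for one character."""
    # NFD normalization splits a precomposed letter into base + marks.
    decomposed = unicodedata.normalize("NFD", char)
    base = decomposed[0]
    marks = [
        unicodedata.name(c)
        for c in decomposed[1:]
        if unicodedata.combining(c)  # non-zero for combining marks
    ]
    return base, marks


for letter in "ÄÖÜ":
    print(letter, split_diacritics(letter))
# Ä ('A', ['COMBINING DIAERESIS'])
# Ö ('O', ['COMBINING DIAERESIS'])
# Ü ('U', ['COMBINING DIAERESIS'])
```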
Collation
Modified letters such as the symbols
Å,
Ä, and
Ö may be regarded as new individual letters in themselves, and assigned a specific place in the alphabet for
collation purposes, separate from that of the letter on which they are based, as is done in
Swedish. In other cases, such as with
Ä,
Ö,
Ü in German, this is not done, letter-diacritic combinations being identified with their base letter. The same applies to digraphs and trigraphs. Different diacritics may be treated differently in collation within a single language. For example, in
Spanish the character
Ñ is considered a letter in its own right, and sorted between
N and
O in dictionaries, but the accented vowels
Á,
É,
Í,
Ó,
Ú are not separated from the unaccented vowels
A,
E,
I,
O,
U.
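The Spanish rule described above can be sketched as a sort key in Python. This is a simplified illustration, not a full collation algorithm: the names `SPANISH_ALPHABET`, `fold_accents` and `spanish_sort_key` are hypothetical, and production software would normally rely on a locale-aware collation library instead.

```python
import unicodedata

# Ñ has its own position between N and O; accented vowels have none.
SPANISH_ALPHABET = "abcdefghijklmnñopqrstuvwxyz"
RANK = {letter: i for i, letter in enumerate(SPANISH_ALPHABET)}


def fold_accents(text: str) -> str:
    """Lowercase and strip combining marks, but keep ñ as a distinct letter."""
    out = []
    for ch in unicodedata.normalize("NFC", text.lower()):
        if ch == "ñ":
            out.append(ch)
        else:
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out)


def spanish_sort_key(word: str) -> list:
    """Map each letter to its rank; unknown characters sort last."""
    return [RANK.get(ch, len(SPANISH_ALPHABET)) for ch in fold_accents(word)]


words = ["oso", "ñandú", "nube", "árbol", "azul", "nota"]
print(sorted(words, key=spanish_sort_key))
# ['árbol', 'azul', 'nota', 'nube', 'ñandú', 'oso']
```

A plain code-point sort of the same list would place "árbol" and "ñandú" after "oso", because á and ñ have higher code points than every unaccented letter.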
Romanization
Words from languages natively written with other
scripts, such as
Arabic or
Chinese, are usually
transliterated or
transcribed when embedded in Latin text or in
multilingual international communication, a process termed Romanization.
Whilst the Romanization of such languages is used mostly at unofficial levels, it has been especially prominent in computer messaging where only the limited 7-bit
ASCII code is available on older systems. However, with the introduction of
Unicode, Romanization is now becoming less necessary. Note that keyboards used to enter such text may still restrict users to Romanized text, as only ASCII or Latin-alphabet characters may be available.
The English alphabet
As used in modern
English, the Latin alphabet consists of the following
characters
In addition, the
ligatures Æ of
A with
E (e.g. "
encyclopædia"), and
Œ of
O with
E (e.g. "
cœlacanth") may be used, optionally, in words derived from Latin or Greek, and the
diaeresis mark is sometimes placed for example on the letters
o and
e (e.g. "coöperate" or "preëxisting") to indicate the pronunciation of
oo or
ee as two distinct vowels, rather than a long one. Hyphenation may also be used, to avoid having to type accented characters: "co-operate" or "pre-existing". Outside of professional papers on specific subjects that traditionally use ligatures in
loanwords, however, ligatures and diaereses are seldom used in modern English. Note, however, that some
fonts for typesetting English contain commonly used ligatures, such as for tt, fi, fl, ffi, and ffl. These are not part of the
language, per se, but rather typographic convention.
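Unicode reflects the same distinction between typographic and orthographic ligatures. A few typographic ligatures (such as fi and ffi) exist as compatibility characters that normalize back to their component letters, whereas Æ is encoded as a letter in its own right. The following is a small sketch using Python's standard `unicodedata` module; the function name is an illustrative assumption, and NFKC normalization also rewrites other compatibility characters, so it is not a ligature-only tool.

```python
import unicodedata


def expand_typographic_ligatures(text: str) -> str:
    """Replace compatibility ligatures with the letters they stand for."""
    # NFKC applies Unicode compatibility mappings, which include the
    # presentation-form ligatures U+FB00..U+FB06.
    return unicodedata.normalize("NFKC", text)


print(expand_typographic_ligatures("\ufb01nd"))      # find
print(expand_typographic_ligatures("o\ufb03ce"))     # office
print(expand_typographic_ligatures("encyclopædia"))  # encyclopædia (unchanged)
```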
Latin alphabet and international standards
By the 1960s it became apparent to the computer and
telecommunications industries in the
First World that a non-proprietary method of encoding characters was needed. The
International Organization for Standardization (ISO) encapsulated the Latin alphabet in its
ISO/IEC 646 standard. To achieve widespread acceptance, this encapsulation was based on popular usage. As the United States held a preeminent position in both industries during the 1960s, the standard was based on the already published
American Standard Code for Information Interchange, better known as
ASCII, which included in the
character set the 26 x 2 letters of the
English alphabet. Later standards issued by the ISO, for example
ISO/IEC 10646 (
Unicode Latin), have continued to define the 26 x 2 letters of the English alphabet as the basic Latin alphabet with extensions to handle other letters in other languages.
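The 26 x 2 letters and their code positions can be inspected with Python's standard `string` module, as in the short sketch below. The helper `is_basic_latin_letter` is a hypothetical name introduced only for illustration.

```python
import string

upper = string.ascii_uppercase  # 'A'..'Z'
lower = string.ascii_lowercase  # 'a'..'z'
print(len(upper) + len(lower))  # 52, i.e. 26 x 2


def is_basic_latin_letter(ch: str) -> bool:
    """True only for the 52 unaccented letters shared by ASCII and Unicode."""
    return len(ch) == 1 and ch in string.ascii_letters


# The letters keep the same code positions in ASCII and in Unicode.
print(hex(ord("A")), hex(ord("Z")))  # 0x41 0x5a
print(hex(ord("a")), hex(ord("z")))  # 0x61 0x7a

# Letters added by extensions fall outside the 7-bit range.
print(is_basic_latin_letter("ñ"), ord("ñ") > 0x7F)  # False True
```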