In linguistics, prosody (from Greek προσῳδία, prosōidía) is the rhythm, stress, and intonation of speech. Prosody may reflect various features of the speaker or the utterance: the emotional state of a speaker; whether an utterance is a statement, a question, or a command; whether the speaker is being ironic or sarcastic; emphasis, contrast, and focus; or other elements of language that may not be encoded by grammar or choice of vocabulary.
Acoustic attributes of prosody
In terms of acoustics, the prosodics of oral languages involve variation in syllable length, loudness, pitch, and the formant frequencies of speech sounds. In cued speech and sign languages, prosody involves the rhythm, length, and tension of gestures, along with mouthing and facial expressions. Prosody is absent in writing, which is one reason e-mail, for example, may be misunderstood. Orthographic conventions to mark or substitute for prosody include punctuation (commas, exclamation marks, question marks, scare quotes, and ellipses), typographic styling for emphasis (italic, bold, and underlined text), and emoticons.
The details of a language's prosody depend upon its phonology. For instance, in a language with phonemic vowel length, that length must be kept distinct from prosodic syllable length. Similarly, prosodic pitch must not obscure tone in a tone language if the result is to be intelligible. Although tone languages such as Mandarin have prosodic pitch variations in the course of a sentence, such variations are long and smooth contours, on which the short and sharp lexical tones are superimposed. If pitch can be compared to ocean waves, the swells are the prosody, and the wind-blown ripples on their surface are the lexical tones. The same holds for stress in English: the word dessert has greater stress on the second syllable, compared to desert, which has greater stress on the first, but this distinction is not obscured when the entire word is stressed by a child demanding "Give me dessert!" Vowels in many languages are likewise pronounced differently (typically less centrally) in a careful rhythm or when a word is emphasized, but not so much as to overlap with the formant structure of a different vowel. Both lexical and prosodic information are encoded in rhythm, loudness, pitch, and vowel formants.
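The superposition of short lexical tones on a long phrase-level contour can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not a model taken from the literature: the function name `pitch_track`, the 200 Hz reference pitch, and all semitone values are hypothetical and chosen for illustration.

```python
def pitch_track(phrase_contour_st, tone_offsets_st, base_hz=200.0):
    """Combine a slow phrase-level contour with per-syllable tone offsets.

    Both inputs are in semitones relative to base_hz (a hypothetical
    reference pitch); the result is one pitch value in Hz per syllable.
    """
    return [
        base_hz * 2 ** ((phrase + tone) / 12)
        for phrase, tone in zip(phrase_contour_st, tone_offsets_st)
    ]


# Hypothetical four-syllable utterance: the "swell" falls by one semitone
# per syllable, while the "ripples" alternate between high and low tones.
swell = [3, 2, 1, 0]
ripples = [2, -2, 2, -2]
track = pitch_track(swell, ripples)
```

Because the two layers are added in semitones, the high/low tone contrast in this sketch keeps the same size (four semitones) wherever it falls on the swell, which is one way of picturing how prosodic pitch can vary without obscuring lexical tone.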
The prosodic domain
Prosodic features are suprasegmental. They are not confined to any one segment, but occur at some higher level of an utterance. These higher-level prosodic units are the actual phonetic "spurts", or chunks of speech. They may correspond to grammatical units such as phrases and clauses, but they need not, and this loose fit suggests insights into how the brain processes speech.
Prosodic units are marked by phonetic cues, such as a coherent pitch contour, or the gradual decline in pitch and lengthening of vowels over the duration of the unit, until the pitch and speed are reset to begin the next unit. Breathing, both inhalation and exhalation, seems to occur only at these boundaries where the prosody resets.
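The pitch-reset cue described above can be sketched as a simple segmentation rule. This is a deliberately simplified illustration, not an established boundary-detection method: the function name `split_at_pitch_resets`, the 15% threshold, and the sample pitch values are all assumptions, and real prosodic boundaries involve several cues at once (vowel lengthening, pauses, breathing).

```python
def split_at_pitch_resets(pitch_hz, reset_ratio=1.15):
    """Group per-syllable pitch values into units, starting a new unit
    whenever pitch jumps up by at least reset_ratio over the previous
    syllable (a hypothetical threshold standing in for a pitch reset)."""
    units = []
    current = []
    for value in pitch_hz:
        if current and value >= current[-1] * reset_ratio:
            units.append(current)
            current = []
        current.append(value)
    if current:
        units.append(current)
    return units


# Made-up pitch values (Hz): a gradual decline, a reset, then another decline.
units = split_at_pitch_resets([220, 210, 200, 190, 230, 215, 205])
```

With these invented values the rule finds one boundary, between the syllable at 190 Hz and the syllable at 230 Hz, so the utterance is split into a four-syllable unit followed by a three-syllable unit.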
"Prosodic structure" is important in language contact and lexical borrowing. Linguist Ghil'ad Zuckermann demonstrates that in "Israeli" (his term for
Modern Hebrew), the XiXéX verb-template is much more productive than the XaXáX verb-template because, in morphemic adaptations of non-Hebrew stems, the XiXéX verb-template is more likely to retain the prosodic structure of the stem (e.g., the consonant clusters and the location of the vowels) in all conjugations throughout the tenses.
For example, the Israeli verb le-transfér "to transfer (people)" is fitted into the XiXéX verb-template. In the past (3rd person, masculine, singular) one says trinsfér, in the present metransfér, and in the future yetransfér. The consonant clusters of the stem transfer are kept throughout. Consider instead fitting the stem transfer into the XaXáX verb-template, which was in fact the most productive one in Classical Hebrew. The normal pattern can be seen in garám–gorém–yigróm "cause" (past, present, future). The hypothetical past form would be *transfár "transferred (people)" and the present *tronsfér. Both maintain the consonant clusters and the location of the vowels of transfer, the specific quality of the vowels (e.g. whether they are a or i) being less important. However, the future form, *yitrnsfór, is impossible because, among other things, it lacks a vowel between the r and the n and thus violates the prosodic structure of the stem transfer.
According to Zuckermann, this is exactly why the stem click "select by pressing one of the buttons on the computer mouse" was fitted into the hiXXíX verb-template, resulting in hiklík, rather than into the XiXéX (*kilék) or XaXáX (*kalák) verb-templates. The form hiklík is the only one preserving the [kl] cluster.
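The template-fitting argument for transfer and click can be checked mechanically. The sketch below is an illustration under several assumptions, not Zuckermann's own formalization: the function names are hypothetical, the tense-by-tense templates (for example meXaXéX and yiXXóX) are inferred from the forms quoted above, the assignment of consonant groups to slots is supplied by hand, and click is respelled phonemically as `klik` so that the [kl] cluster is visible.

```python
import re

VOWELS = "aeiouáéíóú"


def consonant_clusters(word):
    """Return the maximal runs of consonants in a romanized word."""
    return re.findall(f"[^{VOWELS}]+", word)


def fill_template(template, groups):
    """Replace each 'X' slot in the template with the next consonant group."""
    remaining = iter(groups)
    return "".join(next(remaining) if ch == "X" else ch for ch in template)


def preserves_clusters(stem, form):
    """True if the form ends with exactly the stem's consonant clusters.

    Consonants contributed by a prefix (m-, y-, h-) may precede them.
    """
    stem_clusters = consonant_clusters(stem)
    return consonant_clusters(form)[-len(stem_clusters):] == stem_clusters


# Past, present, and future templates, inferred from the quoted forms.
XIXEX = ["XiXéX", "meXaXéX", "yeXaXéX"]
XAXAX = ["XaXáX", "XoXéX", "yiXXóX"]

for template in XIXEX + XAXAX:
    form = fill_template(template, ["tr", "nsf", "r"])
    print(form, preserves_clusters("transfer", form))

# "klik" is an assumed phonemic respelling of click.
for template in ["hiXXíX", "XiXéX", "XaXáX"]:
    form = fill_template(template, ["k", "l", "k"])
    print(form, preserves_clusters("klik", form))
```

Under these assumptions, every XiXéX form of transfer passes the check, the XaXáX paradigm fails only at the future form yitrnsfór (where tr and nsf fuse into one cluster), and hiklík is the only candidate for click that passes, since kilék and kalák split [kl] with a vowel.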
One important conclusion Zuckermann draws is that prosodic considerations supersede semantic ones. For example, although hiXXíX is historically the causative verb-template, it is employed, on purely phonological grounds, in the intransitive hishvíts "show off" (from Yiddish shvits) and in the ambitransitive (in fact, usually intransitive) hiklík "click" (cf. English click).
Prosody and emotion
Emotional prosody is the expression of feelings using prosodic elements of speech. It was recognized by Charles Darwin in The Descent of Man as predating the evolution of human language: "Even monkeys express strong feelings in different tones – anger and impatience by low, – fear and pain by high notes."
Native speakers listening to actors reading emotionally neutral text while projecting emotions correctly recognized happiness 62% of the time, anger 95%, surprise 91%, sadness 81%, and neutral tone 76%. When a database of this speech was processed by computer, segmental features allowed better than 90% recognition of happiness and anger, while suprasegmental prosodic features allowed only 44%–49% recognition. The reverse was true for surprise, which was recognized only 69% of the time by segmental features and 96% of the time by suprasegmental prosody. In typical conversation (with no actors involved), the recognition of emotion may be quite low, on the order of 50%, which would hamper the complex interpersonal function of speech proposed by some authors.
Brain location of prosody
An aprosodia is an acquired or developmental impairment in comprehending or generating the emotion conveyed in spoken language.
Producing these nonverbal elements requires intact motor areas of the face, mouth, tongue, and throat. These motor areas are associated with Brodmann areas 44 and 45 (Broca's area) of the left frontal lobe. Damage to areas 44/45 produces motor aprosodia, in which the nonverbal elements of speech (facial expression, tone, rhythm of voice) are disturbed.
Understanding these nonverbal elements requires an intact and properly functioning Brodmann area 22 in the right hemisphere (the counterpart of Wernicke's area). Right-hemispheric area 22 aids in the interpretation of prosody, and damage to it causes sensory aprosodia, in which the patient is unable to comprehend changes in voice and body language.