Mandarin sibilants

Mandarin has many more sibilant sounds than English does.  I have always found it hard to keep them apart, both in listening and in speaking.

The following table shows for each sibilant sound first the official romanised (pinyin) spelling and then, in [square brackets], a transcription in the alphabet of the International Phonetic Association.

Dental                       s [s]     z [ts]     c [tsʰ]
Post-alveolar                sh [ʂ]    zh [tʂ]    ch [tʂʰ]
Post-alveolar approximant    r [ʐ]
Alveolo-palatal              x [ɕ]     j [tɕ]     q [tɕʰ]

Three of the four rows in this table each contain a series of three sounds. In each case, the first sound is a fricative, the second sound is an unaspirated affricate (a sequence of a stop followed by the fricative from that series) and the third sound is an aspirated version of the affricate.
The third row contains only one sound: an approximant.
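
The layout of the table can also be written down as a small lookup structure. The following is only a minimal Python sketch, not part of any library: the names SIBILANTS, initial_of and describe are invented for illustration, and the label for r follows the heading of its row in the table.

```python
# Illustrative sketch: the table above as a lookup from pinyin initial
# to (series, manner, IPA). All names here are hypothetical.
SIBILANTS = {
    "s":  ("dental", "fricative", "s"),
    "z":  ("dental", "unaspirated affricate", "ts"),
    "c":  ("dental", "aspirated affricate", "tsʰ"),
    "sh": ("post-alveolar", "fricative", "ʂ"),
    "zh": ("post-alveolar", "unaspirated affricate", "tʂ"),
    "ch": ("post-alveolar", "aspirated affricate", "tʂʰ"),
    "r":  ("post-alveolar", "approximant", "ʐ"),
    "x":  ("alveolo-palatal", "fricative", "ɕ"),
    "j":  ("alveolo-palatal", "unaspirated affricate", "tɕ"),
    "q":  ("alveolo-palatal", "aspirated affricate", "tɕʰ"),
}


def initial_of(syllable):
    """Return the sibilant initial of a pinyin syllable, or None.

    Two-letter initials (zh, ch, sh) are tried before one-letter ones,
    so that "zhang4" is read as zh + ang4 rather than z + hang4.
    """
    for length in (2, 1):
        candidate = syllable[:length].lower()
        if candidate in SIBILANTS:
            return candidate
    return None


def describe(syllable):
    """Return (series, manner, IPA) for the syllable's initial, or None."""
    initial = initial_of(syllable)
    return SIBILANTS[initial] if initial else None
```

Calling describe("qi1"), for instance, looks up the initial q and returns its series, manner and IPA transcription; a syllable without a sibilant initial returns None.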

Here are explanations of terms in the table:

  • The vocal cords vibrate while the speaker is producing the post-alveolar approximant (it is voiced). All other consonants in the above table are unvoiced. Note that Mandarin distinguishes between aspirated and unaspirated consonants, whereas English (like many other European languages) distinguishes between voiced and unvoiced consonants. English unvoiced consonants tend to be aspirated and English voiced consonants tend to be unaspirated.
  • The dental sounds are pronounced with the tip of the tongue close to the teeth. Rough English equivalents are: s (as in sat); ds (as in cards); and ts (as in cats). Examples of words containing these sounds: sai4 (to compete); zai4 (again); cai4 (vegetable). (In the examples, the numeral at the end of each syllable designates the tone. For a discussion of tone in Mandarin, please see A Mandarin tongue twister – Language Miscellany.)
  • The post-alveolar sounds are pronounced with the tongue behind the alveolar ridge (the ridge above and behind the top teeth). Rough English equivalents are: sh (as in sheep); j (as in jeep); and ch (as in cheap). Examples of words containing these sounds: shang4 (top, on, above); zhang4 (to rise, of water or prices); chang4 (to sing).
    Lin (2007) states that the post-alveolar consonants have traditionally been described as retroflex—that is, with the tip of the tongue curled up and backwards. But Lin says that more recent descriptions show that these sounds are produced with the upper side of the tongue (not its underside) approaching the back of the alveolar ridge.
  • The post-alveolar approximant is something like a cross between the southern British sounds [r] (as in red) and [ʒ] (as in measure), but with lips not rounded and with the tip of the tongue raised towards the back of the alveolar ridge. Examples of words containing this sound: rang4 (to let, to yield); ren2 (person).
  • The alveolo-palatal sounds are pronounced with the tongue raised towards the roof of the mouth (hard palate). These sounds occur only before the vowels [i] or [y] or the glides [j] or [ɥ]. Thus, some phonologists analyse them not as separate phonemes but as allophones (positional variants) of another series, with those allophones occurring only before those vowels or glides. Some analyse them as allophones of the post-alveolar series. Others analyse them as allophones of the dental series. Yet others analyse them as allophones of a velar series not shown in the table ([x], [k], [kʰ]). Examples of words containing the alveolo-palatal sounds: xi1 (west); ji1 (chicken); qi1 (seven).
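
The restriction on where the alveolo-palatal sounds occur can be checked mechanically on pinyin spellings. The sketch below is only an illustration: the function name obeys_restriction is invented, and the check relies on the pinyin spelling convention that after j, q and x the letter u stands for the vowel [y] (ü), so the letter following these initials is i, u or ü.

```python
# Illustrative sketch: the names below are hypothetical.
ALVEOLO_PALATALS = ("x", "j", "q")
# Pinyin letters that can spell [i], [y] or the glides [j], [ɥ]
# directly after x, j or q.
HIGH_FRONT_SPELLINGS = ("i", "u", "ü")


def obeys_restriction(syllable):
    """Check that x, j or q is followed by a high front vowel or glide.

    Syllables with any other initial are not covered by the restriction,
    so they pass trivially.
    """
    s = syllable.lower()
    if not s.startswith(ALVEOLO_PALATALS):
        return True
    return len(s) > 1 and s[1] in HIGH_FRONT_SPELLINGS
```

All the alveolo-palatal examples in this article (xi1, ji1, qi1, jian4) pass the check, whereas a made-up spelling such as "qa1" does not.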

Hard to distinguish

Like other English speakers, I find it hard to distinguish the three alveolo-palatal sounds from the three sounds in the post-alveolar series. Ladefoged and Maddieson (1996) provide some comments on the distinction between post-alveolar [ʂ] and alveolo-palatal [ɕ]:

  • post-alveolar [ʂ]: The constriction between the tongue and the post-alveolar region is made with the upper surface of the tip of the tongue. The location and width of the constriction are comparable to those in English [ʃ] (as in shoe). The front of the tongue is fairly flat (or even slightly hollowed), rather than being slightly raised towards the hard palate, as it is in English [ʃ]. No part of the tongue is touching the lower teeth (unlike in [s]).
  • alveolo-palatal [ɕ]: There are some similarities to English [ʃ], but both the blade and the body of the tongue are higher in the mouth, forming a comparatively long, flat constriction.

I also have similar difficulties in distinguishing different series of sibilants in Polish, but that’s a subject for another day.

Historical development

Although Mandarin has a great variety of sibilants, these sounds arose, by a complex succession of changes, from a collection of Old Chinese sounds that was much less diverse. Dong (2021) summarises current thinking on how the sibilants arose:

  • In Old Chinese (before about 220 CE), there was a single dental (or alveolar) series. The series comprised a fricative, a voiceless unaspirated affricate, a voiceless aspirated affricate and a voiced affricate: [s]; [ts]; [tsʰ]; [dz].
  • From Old Chinese to Middle Chinese (from about 220 CE to about 1279 CE), sequences of dental (or alveolar) stops (and affricates) followed by [r] developed into retroflex stops (produced with the tongue curled upwards): [ʈ], [ʈʰ], [ɖ], [ɳ], and into retroflex affricates and a fricative: [tʂ], [tʂʰ], [dʐ], [ʂ].
  • Also from Old Chinese to Middle Chinese, alveolo-palatal sounds developed in two cases: (1) from sequences of an alveolar followed by [-j], for example *tjaŋ (chapter) became tɕaŋ; (2) from sequences of a velar stop (such as [k]) followed by [-j], for example *skjig (support) became tɕjě (forms marked with an asterisk * are reconstructed). In Late Middle Chinese, this alveolo-palatal series then merged with the Early Middle Chinese retroflex series into a single retroflex series.
  • Also during Middle Chinese, the retroflex stops merged into the retroflex affricates, for example [ʈ] became [tʂ].
  • From Middle Chinese to Old Mandarin (around 1279 CE to around 1368 CE), voiced consonants became unvoiced.
  • From Old Mandarin to Modern Standard Chinese:
    • during the Ming dynasty (1368-1644), velar stops and fricatives changed to corresponding alveolo-palatals before high front vowels (such as [i] and [y]). For example, the initial [k] in Old Mandarin [kiɛm4] (sword) palatalised during the Ming dynasty, ultimately becoming jian4 [tɕiɛn4].
    • then, during the Qing dynasty (1644-1911), dental affricates and fricatives also changed to corresponding alveolo-palatals before those high front vowels. For example, the initial [ts] in [tsiɛm4] (arrow) also palatalised, ultimately leading to jian4 [tɕiɛn4], which is now pronounced in the same way as the word for sword.
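
The two palatalisation stages can be sketched as ordered rewrite rules on the initial consonant. This is a toy model in Python rather than a reconstruction: the names MING_VELARS, QING_DENTALS, palatalise and develop are invented, tones are left out, and only the initial is rewritten, so the change in the final consonant seen in the forms above is not modelled.

```python
# Toy model of the two palatalisation stages; all names are hypothetical.
HIGH_FRONT_VOWELS = ("i", "y")

# Each rule maps an older initial to its alveolo-palatal counterpart.
# Aspirated initials are listed first so "tsʰ" is not matched as "ts".
MING_VELARS = [("kʰ", "tɕʰ"), ("k", "tɕ"), ("x", "ɕ")]
QING_DENTALS = [("tsʰ", "tɕʰ"), ("ts", "tɕ"), ("s", "ɕ")]


def palatalise(form, rules):
    """Apply one stage: rewrite the initial if a high front vowel follows."""
    for old, new in rules:
        rest = form[len(old):]
        if form.startswith(old) and rest.startswith(HIGH_FRONT_VOWELS):
            return new + rest
    return form


def develop(form):
    """Run the Ming stage and then the Qing stage on one form."""
    return palatalise(palatalise(form, MING_VELARS), QING_DENTALS)


sword = develop("kiɛm")   # velar initial, affected by the Ming stage
arrow = develop("tsiɛm")  # dental initial, affected by the Qing stage
```

In this model both words end up with the same initial, which mirrors the merger described above, while a form with no high front vowel after the initial (for example a made-up "kan") passes through unchanged.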


The Sounds of Chinese, Yen-Hwei Lin (2007)

The Sounds of the World’s Languages, Peter Ladefoged and Ian Maddieson (1996)

A History of the Chinese Language, Hongyuan Dong (2nd edition, 2021)

