Emotional tones of voice affect the acoustics and perception of Mandarin tones
PLOS ONE
RESEARCH ARTICLE
Hui-Shan Chang1,2,3, Chao-Yang Lee2,4, Xianhui Wang4, Shuenn-Tsong Young5, Cheng-Hsuan Li3, Woei-Chyn Chu1*
1 Department of Biomedical Engineering, National Yang Ming Chiao Tung University, Taipei City, Taiwan,
2 Department of Audiology and Speech-Language Pathology, Asia University, Taichung City, Taiwan,
3 Graduate Institute of Educational Information and Measurement, National Taichung University of
Education, Taichung City, Taiwan, 4 Division of Communication Sciences and Disorders, Ohio University,
Athens, Ohio, United States of America, 5 Institute of Geriatric Welfare Technology and Science, MacKay
Medical College, New Taipei City, Taiwan
*
Abstract
OPEN ACCESS
Citation: Chang H-S, Lee C-Y, Wang X, Young S-T,
Li C-H, Chu W-C (2023) Emotional tones of voice
affect the acoustics and perception of Mandarin
tones. PLoS ONE 18(4): e0283635. https://doi.org/
10.1371/journal.pone.0283635
Editor: Yiu-Kei Tsang, Hong Kong Baptist
University, HONG KONG
Received: July 23, 2021
Accepted: March 14, 2023
Published: April 5, 2023
Peer Review History: PLOS recognizes the
benefits of transparency in the peer review
process; therefore, we enable the publication of
all of the content of peer review and author
responses alongside final, published articles. The
editorial history of this article is available here:
https://doi.org/10.1371/journal.pone.0283635
Copyright: © 2023 Chang et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which
permits unrestricted use, distribution, and
reproduction in any medium, provided the original
author and source are credited.
Data Availability Statement: All relevant data are
within the manuscript and its Supporting
Information files.
Funding: This study was supported by grants
MOST 109-2218-E-010-003 and MOST 110-2622-
Lexical tones and emotions are conveyed by a similar set of acoustic parameters; therefore,
listeners of tonal languages face the challenge of processing lexical tones and emotions in
the acoustic signal concurrently. This study examined how emotions affect the acoustics
and perception of Mandarin tones. In Experiment 1, Mandarin tones were produced by professional actors with angry, fearful, happy, sad, and neutral tones of voice. Acoustic analyses
of mean F0, F0 range, mean amplitude, and duration were conducted on syllables excised
from a carrier phrase. The results showed that emotions affect Mandarin tone acoustics to
different degrees depending on specific Mandarin tones and specific emotions. In Experiment 2, selected syllables from Experiment 1 were presented in isolation or in context. Listeners were asked to identify the Mandarin tones and emotions of the syllables. The results
showed that emotions affect Mandarin tone identification to a greater extent than Mandarin
tones affect emotion recognition. Both Mandarin tones and emotions were identified more
accurately in syllables presented with the carrier phrase, but the carrier phrase affected
Mandarin tone identification and emotion recognition to different degrees. These findings
suggest that lexical tones and emotions interact in complex but systematic ways.
Introduction
Speech conveys more than the linguistic message intended by a speaker. It provides information about the speaker such as physical characteristics, regional accent, and emotional state.
Because multiple sources of information often converge on the same acoustic parameters,
two fundamental questions arise: how the different sources of information contribute to speech
acoustics, and how listeners disentangle these sources of information during speech perception. In this study, we investigated the relationship between emotional tones of voice (emotions hereafter) and lexical tones by examining how four common emotions shape the
acoustic characteristics of Mandarin tones, and how the emotions affect the perception of
Mandarin tones.
B-A49-001 from the Ministry of Science and
Technology, Taiwan ROC. The funders had no role
in study design, data collection and analysis,
decision to publish, or preparation of the
manuscript.
Competing interests: The authors have declared
that no competing interests exist.
Emotional and lexical tones
Emotional tone is defined as the vocal expression of emotion, which conveys a speaker’s
affective states, motivational states, or intended emotions [1–5]. The primary acoustic correlates of vocal emotions include fundamental frequency (F0), mean amplitude, and duration [1,
2, 6]. Previous research showed that F0 is the primary acoustic correlate of emotions [7–10],
whereas amplitude and duration serve as secondary cues [11, 12]. Importantly, F0 and amplitude are highly correlated with each other [8].
Lexical tones are used to distinguish words in tonal languages. In Mandarin, segmentally
identical words can be distinguished on the basis of F0 height or contour. For example, the syllable /ba/ means “eight”, “uproot”, “grip”, or “father” with Tone 1 (a high-flat tone), Tone 2
(mid-rising), Tone 3 (mid-falling-rising), or Tone 4 (high-falling), respectively. The primary
acoustic correlate of lexical tones is F0 [13]. Amplitude and duration also vary systematically
among Mandarin tones [13–15], and both contribute to Mandarin tone perception as secondary cues [16–19]. However, F0 remains the most powerful cue for the perception of Mandarin
tones [19–21].
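As a rough illustration of how F0 height and contour separate the four tones, the sketch below turns each tone's conventional Chao tone numerals (55, 35, 214, 51) into a stylized F0 contour and summarizes it with mean F0 and F0 range. The Chao numerals are a textbook stylization rather than measurements from this study, and the 100 to 200 Hz speaker range, the frame count, and the function names are hypothetical choices made only for this example.

```python
# Chao tone numerals (1 = lowest pitch level, 5 = highest).
# Textbook stylizations, NOT values measured in this study.
CHAO = {1: [5, 5], 2: [3, 5], 3: [2, 1, 4], 4: [5, 1]}


def stylized_f0(tone, n_frames=21, floor_hz=100.0, ceil_hz=200.0):
    """Linearly interpolate a tone's Chao levels into an F0 contour in Hz.

    floor_hz and ceil_hz are an assumed speaker range (hypothetical).
    """
    levels = CHAO[tone]
    contour = []
    for i in range(n_frames):
        # Position along the sequence of Chao levels, from 0 to len(levels) - 1.
        pos = i / (n_frames - 1) * (len(levels) - 1)
        lo = min(int(pos), len(levels) - 2)
        frac = pos - lo
        level = levels[lo] + frac * (levels[lo + 1] - levels[lo])
        # Map pitch level 1..5 onto the assumed Hz range.
        contour.append(floor_hz + (level - 1) / 4 * (ceil_hz - floor_hz))
    return contour


def f0_summary(contour):
    """Return mean F0 and F0 range (max minus min) of a contour in Hz."""
    mean_f0 = sum(contour) / len(contour)
    return {"mean_f0": mean_f0, "f0_range": max(contour) - min(contour)}


for tone in (1, 2, 3, 4):
    s = f0_summary(stylized_f0(tone))
    print(f"Tone {tone}: mean F0 {s['mean_f0']:.1f} Hz, F0 range {s['f0_range']:.1f} Hz")
```

Under these assumptions the level tone (Tone 1) has no F0 range while the falling tone (Tone 4) spans the whole assumed range, which is the sense in which height and contour together distinguish the tones. Real productions would of course deviate from these idealized shapes.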
Since the acoustic characteristics most relevant for lexical tones coincide with those for
emotions, the convergence raises the question of how emotions affect the acoustics and perception of lexical tones. A tonal language like Mandarin offers a unique opportunity to examine this question.
Theories of emotion
Two main approaches to the analysis of emotion are the dimensional theory of emotion and the
theory of basic emotions [22]. The approaches differ in whether emotions
are described along independent dimensions [23] or as discrete entities [24]. In the dimensional approach, Russell (1980) [23] proposed a circumplex model of emotion, in which
emotions are arranged around a circle defined by two orthogonal dimensions:
valence and arousal [25–28]. The position of each emotion in this space reflects its
degree of valence and arousal [27, 29]. The valence dimension is associated with a person’s subjective feeling, ranging from displeasure to pleasure. The arousal dimension is associated with the energy of a person’s subjective feeling, ranging from sleep to excitement [28].
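The two-dimensional description can be made concrete with a minimal sketch that places an emotion by a (valence, arousal) pair and reports its quadrant and its angle around the circle. The sign-only placements used here for the four emotions examined in this study are illustrative assumptions reflecting a common reading of the dimensional model, not coordinates reported in the cited work, and the names `PLACEMENT`, `quadrant`, and `angle_deg` are hypothetical.

```python
import math

# Sign-only (valence, arousal) placements; illustrative assumptions,
# NOT coordinates taken from the cited literature.
PLACEMENT = {
    "happy": (1.0, 1.0),
    "angry": (-1.0, 1.0),
    "fear": (-1.0, 1.0),
    "sad": (-1.0, -1.0),
}


def quadrant(valence, arousal):
    """Name the quadrant of the valence-arousal plane for a point."""
    v = "positive" if valence >= 0 else "negative"
    a = "high" if arousal >= 0 else "low"
    return f"{v} valence, {a} arousal"


def angle_deg(valence, arousal):
    """Angle around the circle, counterclockwise from the positive-valence axis."""
    return math.degrees(math.atan2(arousal, valence)) % 360.0


for emotion, (v, a) in PLACEMENT.items():
    print(f"{emotion}: {quadrant(v, a)}, {angle_deg(v, a):.0f} degrees")
```

With sign-only placements, two emotions can fall in the same quadrant, so finer distinctions would need graded coordinates that this sketch does not attempt to supply.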
The theory of basic emotions suggests that human emotions are composed of a limited
number of basic emotions [30]. Each basic emotion has its proprietary neural circuits which
are structural (...truncated)