Categorical Vowel Perception Enhances the Effectiveness and Generalization of Auditory Feedback in Human-Machine-Interfaces
Stepp CE (2013) Categorical Vowel Perception Enhances the Effectiveness and Generalization of Auditory Feedback in
Human-Machine-Interfaces. PLoS ONE 8(3): e59860. doi:10.1371/journal.pone.0059860
Eric Larson
Howard P. Terry
Margaux M. Canevari
Cara E. Stepp
Bernd Sokolowski, University of South Florida, United States of America
1 Institute for Learning and Brain Sciences, University of Washington, Seattle, Washington, United States of America, 2 Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts, United States of America, 3 Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
Human-machine interface (HMI) designs offer the possibility of improving quality of life for patient populations as well as augmenting normal user function. Despite its pragmatic benefits, auditory feedback remains underutilized for HMI control, in part due to observed limitations in effectiveness. The goal of this study was to determine the extent to which categorical speech perception could be used to improve an auditory HMI. Using surface electromyography, 24 healthy speakers of American English participated in 4 sessions to learn to control an HMI using auditory feedback (provided via vowel synthesis). Participants trained on 3 targets in sessions 1-3 and were tested on 3 novel targets in session 4. An "established categories with text cues" group of eight participants was trained and tested on auditory targets corresponding to standard American English vowels using auditory and text target cues. An "established categories without text cues" group of eight participants was trained and tested on the same targets using only auditory cuing of target vowel identity. A "new categories" group of eight participants was trained and tested on targets that corresponded to vowel-like sounds not part of American English. Analyses of user performance revealed significant effects of session and group (established categories groups and the new categories group), and a trend for an interaction between session and group. Results suggest that auditory feedback can be effectively used for HMI operation when paired with established categorical (native vowel) targets with an unambiguous cue.
Funding: This study was funded by National Institutes of Health (NIH) National Institute on Deafness and Other Communication Disorders (NIDCD) training
grants T32DC000018 and 1F32DC012456 (EDL), and an Undergraduate Teaching and Scholarship Grant from Boston University. The funders had no role in study
design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
Human-machine interfaces (HMIs) are designed to translate
volitionally produced physiological signals into commands or
control signals to augment or restore normal user function. For
example, a common goal of HMI designs is to improve
communication or mobility for patients with spinal cord injury,
stroke or amyotrophic lateral sclerosis (ALS), in whom typical
motor function has been greatly reduced or eliminated. HMI
designs typically utilize biosignals such as electroencephalography
(EEG) or surface electromyography (sEMG), in which scalp
potentials or muscle activity, respectively, are translated into control signals. In
many HMI designs, users are required to imagine specific motor
movements [1] or fixate on a target in a visual scene to evoke P300
[2] or steady-state visual responses [3]. Although effective, these
types of HMI designs rely on visual feedback or sustained visual
attention for operation, thereby limiting normal user visual
function. In an attempt to overcome this, multiple auditory-based
HMI designs have exploited the cortical potentials evoked by
presented auditory stimuli with increasing success [4–7].
Unfortunately, recent studies on HMI designs utilizing the
auditory modality have found that auditory feedback is inferior to
visual feedback, both in terms of resulting participant performance
[8] and required participant training time [9]. Nonetheless, it has
been shown that audio-visual training of spectrally-complex
auditory categories (via implicit association with visual stimulus
categories) could be used to obtain accurate participant
categorization of novel tokens from the trained auditory categorical
distributions [10], suggesting that audio-visual feedback can be
used to train auditory categorization.
Utilizing auditory categorical perception is complicated by the
fact that participants can experience greater difficulty forming
multi-dimensional auditory categorical judgments than
unidimensional judgments [11]. However, auditory categorization of vowel
sounds requires multi-dimensional categorization in the formant
one–formant two (F1–F2) plane, and listeners are able to quickly
categorize these sounds for speech perception. Critically, listeners
tend to perform effective discrimination only among those vowel
categories that are perceptually relevant during early language
acquisition [12]. Moreover, these learned vowel categories exhibit
a so-called perceptual magnet effect, whereby similar repeated
stimuli become both more easily categorized and less readily
discriminable [13]. For example, listeners perceive novel vowel
categories (those not utilized in their native language) in terms of
the perceptual categories of their native language [14], even when
explicitly trained to learn the new perceptual categories [15], and
listeners only distinguish between sounds belonging to one
particular category when explicitly trained to do so [16]. It is
not surprising, then, that a recent study utilizing motor imagery
and implanted cortical electrodes [17] showed high performance
by mapping two-dimensional control to two-dimensional vowel
(formant) space.
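
To make the idea of mapping two-dimensional control onto the F1–F2 plane concrete, the following is a minimal Python sketch, not the method used in this study or in [17]. It assumes a control signal already normalized to [0, 1] on each axis, a linear mapping onto assumed formant ranges, and a nearest-prototype rule in log-frequency as a crude stand-in for categorical perception. The prototype values are approximate textbook formants for adult male speakers, and all names (`VOWEL_PROTOTYPES`, `control_to_formants`, `nearest_vowel`) are hypothetical.

```python
import math

# Hypothetical F1/F2 prototypes in Hz (approximate textbook values for adult
# male speakers of American English; NOT taken from this study).
VOWEL_PROTOTYPES = {
    "i": (270.0, 2290.0),   # as in "heed"
    "ae": (660.0, 1720.0),  # as in "had"
    "a": (730.0, 1090.0),   # as in "hod"
    "u": (300.0, 870.0),    # as in "who'd"
}

# Assumed formant ranges (Hz) spanned by the two control dimensions.
F1_RANGE = (200.0, 900.0)
F2_RANGE = (700.0, 2500.0)


def control_to_formants(x, y):
    """Linearly map normalized control values in [0, 1] to (F1, F2) in Hz."""
    # Clamp out-of-range control values so the output stays in the vowel space.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    f1 = F1_RANGE[0] + x * (F1_RANGE[1] - F1_RANGE[0])
    f2 = F2_RANGE[0] + y * (F2_RANGE[1] - F2_RANGE[0])
    return f1, f2


def nearest_vowel(f1, f2):
    """Return the label of the prototype closest to (f1, f2).

    Distance is Euclidean in log-frequency, an assumption made here so that
    F1 and F2 differences are compared as ratios rather than raw Hz.
    """
    def dist(proto):
        p1, p2 = proto
        return math.hypot(math.log(f1 / p1), math.log(f2 / p2))

    return min(VOWEL_PROTOTYPES, key=lambda label: dist(VOWEL_PROTOTYPES[label]))
```

For example, `nearest_vowel(*control_to_formants(0.1, 0.9))` places the control point at a low F1 and high F2 and labels it with the /i/ prototype. A real system would synthesize the vowel continuously from (F1, F2) rather than report a discrete label; the classifier here only illustrates how a listener's established categories partition the plane.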
Here we aim to determine the specific effects of underlying
native two-dimensional vowel categorization abilities of listeners
on HMI control in order to inform future development of effective
auditory HMIs. To obtain a high signal-to-noise ratio (SNR)
control signal to test the effectiveness of our auditory
vowel-production feedback, we utilize sEMG, as it provides signals
several orders of magnitude larger in amplitude than EEG. The
sEMG-based system utilized here provides real-time feedback to
participants while requiring only a USB-based soundcard and
sEMG amplifier connected to a standard laptop running custom
C++ software (see Methods), but other control signals (e.g., EEG)
could be substituted in principle. After training participants to
produce specific vowel sounds based on continuous auditory and
visual feedback, we found that participants readily transferred
ability to control the HMI using auditory feedback alone.
Criti (...truncated)