A Supramodal Neural Network for Speech and Gesture Semantics: An fMRI Study
Citation: Straube B, Green A, Weis S, Kircher T (
Benjamin Straube
Antonia Green
Susanne Weis
Tilo Kircher
Emmanuel Andreas Stamatakis, University Of Cambridge, United Kingdom
1 Department of Psychiatry and Psychotherapy, Philipps-University Marburg, Marburg, Germany, 2 Department of Psychiatry and Psychotherapy, RWTH Aachen University, Aachen, Germany, 3 Department of Neurology, RWTH Aachen University, Aachen, Germany, 4 Department of Psychology, Durham University, Durham, United Kingdom
In a natural setting, speech is often accompanied by gestures. Like language, speech-accompanying iconic gestures to some extent convey semantic information. However, whether comprehension of the information contained in the auditory and visual modalities depends on the same or different brain networks is largely unknown. In this fMRI study, we aimed at identifying the cortical areas engaged in supramodal processing of semantic information. BOLD changes were recorded in 18 healthy right-handed male subjects watching video clips showing an actor who either performed speech (S, acoustic) or gestures (G, visual) in more (+) or less (−) meaningful varieties. In the experimental conditions, familiar speech or isolated iconic gestures were presented; during the visual control condition the volunteers watched meaningless gestures (G−), while during the acoustic control condition a foreign language was presented (S−). The conjunction of the visual and acoustic semantic processing revealed activations extending from the left inferior frontal gyrus to the precentral gyrus, and included bilateral posterior temporal regions. We conclude that proclaiming this frontotemporal network the brain's core language system is to take too narrow a view. Our results rather indicate that these regions constitute a supramodal semantic processing network.
Funding: This research was supported by a grant from the IZKF Aachen (Interdisciplinary Centre for Clinical Research within the faculty of Medicine at RWTH
Aachen University; VV N68-e) and by the DFG (Deutsche Forschungsgemeinschaft; IRTG 1328 and Ki 588/6-1). BS is supported by the German Federal Ministry of
Education and Research (BMBF; project no. 01GV0615). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of
the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
These authors contributed equally to this work.
Comprehension of natural language is a complex capacity,
depending on several cognitive and neural systems. Over recent
years, knowledge of the brain processes underlying single word and
sentence processing has grown through the examination of phonological,
semantic and syntactic/sentence processing networks. But speech
is not the only communicative source: features such as tone of
voice, facial expression, body posture, and gestures also transmit
meaning that has to be decoded. Whether such meaning derived
from speech and gesture is (at least partly) represented in a
common neural network is an important question to better
understand the neural organization of semantics and especially its
flexible utilization for communication. Therefore, this study
investigates whether there is a brain network common to the
processing of both speech and gesture semantics.
There is consensus that brain regions crucial for the processing
of spoken or written language are the left inferior frontal gyrus
(LIFG), the left temporal cortex, and their homologues in the right
hemisphere [1–3]. Retrieval of semantic information, the
processing of semantic relations between words and the processing of
syntax in sentences have been related to the LIFG (especially BA
44/45 and 47) [1,4,5]. The left temporal cortex is more strongly
involved in sentential semantic processing than in syntactic
processing. Especially posterior aspects of the middle temporal
gyrus (MTG) and the inferior temporal gyrus (ITG) have been
linked to the interpretation of meaning on a sentence level [6],
detection of semantic anomalies [7], and maintenance of
conceptual information [8,9], with the right-hemispheric
homologue areas also being involved [10]. These findings are
independent of the input modality, i.e. whether the language is
presented auditorily (spoken) or visually (written) [11,12].
From behavioral studies it is known that gestures indeed do
convey meaning. Several studies using event-related potentials
were able to show that gestures induce electrophysiological
correlates of semantic processing [13–17]. Except for pantomimes
(i.e. acting out a whole sequence of information) and emblems
(highly conventionalized symbols such as the thumbs-up gesture), all
kinds of gestures are produced together with speech. However,
without accompanying speech the meaning of most gestures is not
fixed [18,19]. Concerning the neural correlates of gesture
processing without sentence context, several studies have
contrasted the viewing of meaningful complex gestures, such as
emblems, to that of meaningless gestures. Interestingly, the regions
commonly observed are the LIFG including Broca's area (BA 44,
45, 47), as well as the left middle temporal gyrus (MTG; BA 21;
[20–22]). This activity was interpreted as the mapping of symbolic
gestures and spoken words onto common, corresponding
conceptual representations.
Further support for the idea that gesture semantics might be
processed in the same network as spoken language comes from
studies on sign language processing. Sign languages (SL) can
convey the same information contained in speech, but have
visuospatial properties similar to the properties of coverbal
gestures. Comparable to the results from spoken language
processing, neuroimaging studies on SL comprehension indicate
a crucial role for the left superior temporal gyrus/sulcus and the
LIFG (e.g., [23,24]).
Lastly, there is a growing number of studies examining the
processing of gestures in the context of speech, highlighting the
importance of inferior frontal, posterior temporal and inferior
parietal regions (e.g., [25–33]). Based upon the studies available it
seems justified to conclude that semantic processing of gestures
and semantic processing of speech activate an overlapping neural
network involving inferior frontal and posterior temporal regions.
The neural basis of gesture-speech interactions is investigated by
an increasing number of functional magnetic resonance imaging
(fMRI) studies [25–39]. These studies predominantly focussed on
the processing of iconic coverbal gestures, suggesting that the left
posterior temporal cortex is especially relevant for the integration
of iconic gestures and the corresponding sentence context
[25,28,31,32]. However, left inferior frontal and parietal brain
activations were reported for mismatches between unrelated
concrete speech and iconic gesture information [25,29]. Although
these studies focussed on th (...truncated)