Separability and Commonality of Auditory and Visual Bistable Perception
Hirohito M. Kondo [3], Norimichi Kitagawa [3], Miho S. Kitamura [1, 3], Ai Koizumi [0, 3], Michio Nomura [3, 4, 7], Makio Kashino [3, 5, 6]
[0] Department of Psychology, The University of Tokyo, Tokyo 113-0033, Japan
[1] Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo 153-8904, Japan
[3] NTT Communication Science Laboratories, NTT Corporation, Atsugi, Kanagawa 243-0198, Japan
[4] Current address: Department of Cognitive Psychology in Education, Graduate School of Education, Kyoto University, Kyoto 606-8501, Japan
[5] Department of Information Processing, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama, Kanagawa 226-8503, Japan
[6] ERATO Shimojo Implicit Brain Function Project, Japan Science and Technology Agency, Atsugi, Kanagawa 243-0198, Japan
[7] Division of Human Sciences, Graduate School of Integrated Arts and Sciences, Hiroshima University, Higashi-Hiroshima, Hiroshima 739-8521, Japan
The Author 2011. Published by Oxford University Press. All rights reserved.
It is unclear what neural processes induce individual differences in perceptual organization in different modalities. To examine this issue, the present study used different forms of bistable perception: auditory streaming, verbal transformations, visual plaids, and reversible figures. We performed factor analyses on the number of perceptual switches in the tasks. A 3-factor model provided a better fit to the data than the other possible models. These factors, namely the "auditory," "shape," and "motion" factors, were separable but correlated with each other. We compared the number of perceptual switches among genotype groups to identify the effects of neurotransmitter functions on the factors. We focused on polymorphisms of catechol-O-methyltransferase (COMT) Val158Met and serotonin 2A receptor (HTR2A) -1438G/A genes, which are involved in the modulation of dopamine and serotonin, respectively. The number of perceptual switches in auditory streaming and verbal transformations differed among COMT genotype groups, whereas that in reversible figures differed among HTR2A genotype groups. The results indicate that the auditory and shape factors reflect the functions of the dopamine and serotonin systems, respectively. Our findings suggest that the formation and selection of percepts involve neural processes in cortical and subcortical areas.
Introduction
We perceive the world as stable, although sensory inputs are
often ambiguous due to spatial and temporal occluders. This
raises an important question regarding how stable percepts are
formed in the brain. Bistable perception phenomena provide us
with clues enabling us to investigate that issue because
constant physical stimulation leads to spontaneous switching
between different stable percepts. Although in the past, the
formation and selection of percepts have been investigated with
binocular rivalry, reversible figures, and visual plaids in the visual
domain (Kleinschmidt et al. 1998; Tong et al. 1998;
Castelo-Branco et al. 2002; Haynes et al. 2005; Wunderlich et al. 2005),
more recently, they have been studied with auditory streaming
and verbal transformations in the auditory domain (Gutschalk
et al. 2005; Kondo and Kashino 2007, 2009; Schadwinkel and
Gutschalk 2011). However, individual variation in perceptual
switching has been overlooked because differences between
observers are averaged out to focus on stimulus-specific commonalities.
An early study found a wide range of individual differences in
the rate of binocular rivalry (Pettigrew and Miller 1998), where
monocular images are presented to different eyes. The authors
also pointed out that the rate of binocular rivalry is slow in
patients with bipolar disorder, which is strongly heritable. This
suggests that genetic factors influence the formation and
selection of visual percepts. A large-sample twin heritability study
has demonstrated that approximately 50% of the variance in
binocular rivalry rate is accounted for by additive genetic factors
(Miller et al. 2010). In addition, a recent twin study confirmed
that genetic factors affect the switching rate of the reversible
figure as well as that of binocular rivalry (Shannon et al. 2011).
These findings indicate that there is a substantial genetic
contribution to bistable perception, particularly in the visual
domain. However, it is unclear what neural processes are involved
in individual differences in bistable perception and whether
auditory bistability is functionally linked with visual bistability.
The present study elucidated the above issues using different
forms of ambiguous stimuli. Perceptual switches in auditory
streaming and verbal transformations are caused by prolonged
listening to a sound sequence consisting of tone triplets (van
Noorden 1975; Bregman 1990) and of a word (Warren and Gregory
1958; Warren 1961), respectively. Perceptual switches in visual plaids and
reversible figures are produced by observing moving gratings
(Wallach 1935; Adelson and Movshon 1982; Hupé and Rubin
2003) and static figures (Long and Toppino 2004), respectively. There is still
some controversy as to whether spontaneous perceptual
switching is modulated by distributed processes within the
sensory cortices (Pressnitzer and Hupé 2006) or a central
oscillator within the subcortical areas (Pettigrew and Miller
1998). The present study employed 2 different approaches to
clarify the relationship between auditory and visual bistability.
First, we performed factor analyses of the number of
perceptual switches in the tasks and compared the fit indices
of 1-factor and multifactor models. A factor analysis estimates
the degree to which the variances of the observed variables can
be explained by a small number of latent variables called
factors. Thus, the analysis allows us to specify the underlying
structure among observed and latent variables: The observed
variables are modeled as linear combinations of the common
factors and error terms. If perceptual switching in different
modalities is governed by a single rhythm generator, the 1-factor
model should provide a better fit to the data. Conversely, if
different forms of bistable perception are implemented in
several brain modules, the multifactor model should fit the data.
However, a factor analysis provides a heuristic interpretation of
the results and cannot identify the functional linkage between
the factors and neural processes.
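The logic of comparing 1-factor and multifactor models can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the loadings, the six hypothetical tasks, the simulated scores, the use of uncorrelated factors, and the choice of iterated principal-axis factoring with the root mean square residual (RMSR) as a fit measure. It is not the estimation procedure or the fit indices used in the present study. Note also that RMSR does not penalize model complexity, so it can only show that a 1-factor model fails to reproduce correlations generated by two factors.

```python
import numpy as np

# Hypothetical loadings (illustration only): tasks 0-2 load on one factor,
# tasks 3-5 on another. Factors are assumed uncorrelated with unit variance.
TRUE_LOADINGS = np.array([
    [0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
    [0.0, 0.6], [0.0, 0.6], [0.0, 0.6],
])


def implied_correlation(loadings):
    """Correlation matrix implied by x = loadings @ f + e (standardized x)."""
    common = loadings @ loadings.T
    uniqueness = 1.0 - np.diag(common)
    return common + np.diag(uniqueness)


def simulate_scores(loadings, n_subjects, rng):
    """Draw scores as linear combinations of common factors plus error terms."""
    n_tasks, n_factors = loadings.shape
    factors = rng.standard_normal((n_subjects, n_factors))
    error_sd = np.sqrt(1.0 - np.sum(loadings**2, axis=1))
    errors = rng.standard_normal((n_subjects, n_tasks)) * error_sd
    return factors @ loadings.T + errors


def principal_axis_loadings(corr, n_factors, n_iter=100):
    """Iterated principal-axis factoring (unrotated loadings)."""
    # Start from squared multiple correlations as communality estimates.
    communality = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))
    for _ in range(n_iter):
        reduced = corr.copy()
        np.fill_diagonal(reduced, communality)
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:n_factors]
        loadings = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
        communality = np.sum(loadings**2, axis=1)
    return loadings


def rmsr(corr, loadings):
    """Root mean square of the off-diagonal residual correlations."""
    residual = corr - loadings @ loadings.T
    off_diag = residual[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.sqrt(np.mean(off_diag**2)))


pop_corr = implied_correlation(TRUE_LOADINGS)
rng = np.random.default_rng(0)
sample_corr = np.corrcoef(simulate_scores(TRUE_LOADINGS, 2000, rng), rowvar=False)

for k in (1, 2):
    print(f"{k}-factor model: "
          f"population RMSR = {rmsr(pop_corr, principal_axis_loadings(pop_corr, k)):.3f}, "
          f"sample RMSR = {rmsr(sample_corr, principal_axis_loadings(sample_corr, k)):.3f}")
```

With data generated by two factors, the 1-factor solution leaves sizable residual correlations among the tasks it cannot account for, whereas the 2-factor solution reproduces the correlation matrix closely. In practice, model comparison would rely on fit indices that also take the number of estimated parameters into account.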
Second, we used a genotype group comparison to examine
which neurotransmitter functions are associated with the
factors. Previous studies have argued that the timing of
perceptual switching is modulated by the autonomic nervous
system via noradrenaline (Einhäuser et al. 2008) and by drugs
affecting the functions of the serotonin receptors (Carter et al.
2005, 2007; Nagamine et al. 2008). Thus, the dopamine and
serotonin systems may be involved in the underlying neural
processes of bistable perception. We focused on the functional
p (...truncated)