Neural Mechanisms of Object-Based Attention
Cerebral Cortex April 2015;25:1080–1092
doi:10.1093/cercor/bht303
Advance Access publication November 11, 2013
Elias H. Cohen and Frank Tong
Psychology Department and Vanderbilt Vision Research Center, Vanderbilt University, Nashville, TN 37240, USA
Address correspondence to Dr Elias H. Cohen. Email:
Keywords: fMRI, fusiform face area, human visual cortex, multivariate
pattern analysis, parahippocampal place area, visual attention
Introduction
According to prominent theories of object-based attention, the
attentional system is predisposed to select entire visual objects
during top-down enhancement (Duncan 1984; Kahneman
et al. 1992; Baylis and Driver 1993; Blaser et al. 2000; Driver
et al. 2001; Scholl 2001). The ability to enhance the visual representation of entire objects, even in the presence of spatially
overlapping distractors, may be particularly useful for distinguishing objects in cluttered real-world scenes (Peelen et al.
2009; Cohen et al. 2011). For example, consider a predator attempting to identify its prey hiding in a thicket of ferns. In
such situations, object-based attention could be used to selectively enhance the relevant portions of the image belonging to
the partially hidden animal, and to suppress information from
competing objects, such as the leafy branches that lie before or
around the attended object.
Most neural investigations of object-based attention have
relied on simple stimuli, such as intersecting lines, simple shapes,
or overlapping sets of moving dots, which can be readily segmented and perceptually organized based on their spatiotemporal
continuity. These studies suggest that top-down feedback to early
visual areas is important for the attentional selection of simple
objects or perceptual groups (Roelfsema et al. 1998; Valdes-Sosa et al. 1998; Blaser et al. 2000; Muller and Kleinschmidt
2003; Schoenfeld et al. 2003; Fallah et al. 2007; Ciaramitaro
et al. 2011; Hou and Liu 2012).
© The Author 2013. Published by Oxford University Press. All rights reserved.
However, real-world stimuli such as people, vehicles, or
buildings are far more complex in their featural and spatial
characteristics. Correspondingly, a more sophisticated mechanism appears necessary to explain how top-down attention can
enhance the representation of a complex object when it
appears in the presence of a competing overlapping distractor.
In this case, object-based selection would need to be informed
by high-level knowledge regarding the detailed visual structure
of the attended object; otherwise, there would be little basis for
distinguishing the features of one object from those of another
under conditions of spatial overlap (see Fig. 1a). Only a few
studies have investigated this more challenging form of object-based attentional selection, focusing on the modulatory effects
of attention in high-level object areas and the activation of frontoparietal control networks during this top-down selection
process (O’Craven et al. 1999; Serences et al. 2004; Furey et al.
2006). However, recent work by Al-Aidroos et al. (2012) has
provided evidence to suggest that feedback to early visual
areas may also contribute to the attentional selection of
complex objects. They found that the functional connectivity
between category-selective object areas and early visual areas
was reliably modulated, depending on whether participants
were attending to faces or scenes presented under conditions
of spatial overlap. These findings suggest a possible role for
early visual areas in the attentional selection of complex
objects; however, it is unclear what types of visual signals
might be enhanced in these early areas to mediate this selection process.
The goal of our study was to determine whether object-based attention might rely on pattern-specific feedback to
early visual areas to selectively enhance the set of low-level features corresponding to the attended object. Although early
visual areas are primarily tuned to local features and insensitive to complex object properties, we hypothesized that attending to 1 of 2 overlapping objects may depend on selectively
enhancing the visual representations of the local features corresponding to the attended object. This hypothesis leads to the
following predictions. First, when covert attention is directed
toward 1 of 2 overlapping objects, activity patterns in early
visual areas should be biased toward the pattern that would
result if the attended stimulus were presented in isolation.
Such a prediction can be viewed as an extension of the biased
competition model (Desimone and Duncan 1995). Second, if
feedback to early visual areas contributes to the attentional selection of object-relevant signals, then the strength of this
pattern-specific attentional bias signal in early visual areas
should be predictive of the strength of attentional modulation
found in high-level object areas. Such functional coupling
would imply that early-stage attentional filtering can determine
the quality of object-selective information that ultimately
reaches higher level visual areas. Finally, we predicted that attentional modulation in early visual areas should be reliant
upon high-level object knowledge, such that relevant features
Abstract
What neural mechanisms underlie the ability to attend to a complex
object in the presence of competing overlapping stimuli? We evaluated whether object-based attention might involve pattern-specific
feedback to early visual areas to selectively enhance the set of low-level features corresponding to the attended object. Using fMRI and
multivariate pattern analysis, we found that activity patterns in early
visual areas (V1–V4) are strongly biased in favor of the attended
object. Activity patterns evoked by single faces and single houses
reliably predicted which of the 2 overlapping stimulus types was
being attended with high accuracy (80–90% correct). Superior
knowledge of upright objects led to improved attentional selection in
early areas. Across individual blocks, the strength of the attentional
bias signal in early visual areas was highly predictive of the modulations found in high-level object areas, implying that pattern-specific
attentional filtering at early sites can determine the quality of object-specific signals that reach higher level visual areas. Through computational modeling, we show how feedback of an average template to
V1-like units can improve discrimination of exemplars belonging to
the attended category. Our findings provide a mechanistic account of
how feedback to early visual areas can contribute to the attentional
selection of complex objects.
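
The logic of the decoding analysis summarized above can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the analysis pipeline used in the study: the voxel patterns are synthetic Gaussian data, the response to an attended blend is assumed to be a weighted mixture of the two single-stimulus patterns (in the spirit of biased competition), and the classifier is a simple nearest-centroid linear readout chosen for brevity. All function names and parameters (`simulate_patterns`, `fit_centroid_readout`, `bias`, `n_voxels`, and so on) are hypothetical.

```python
import numpy as np


def simulate_patterns(bias, n_voxels=200, n_blocks=40, noise=1.0, seed=0):
    """Generate synthetic voxel patterns (assumed data, not real fMRI)."""
    rng = np.random.default_rng(seed)
    face_template = rng.normal(size=n_voxels)
    house_template = rng.normal(size=n_voxels)

    def noisy_blocks(mean_pattern):
        # One row per block: the mean pattern plus independent voxel noise.
        return mean_pattern + noise * rng.normal(size=(n_blocks, n_voxels))

    single_face = noisy_blocks(face_template)
    single_house = noisy_blocks(house_template)
    # Assumption: attention weights the attended object's pattern by `bias`
    # and the unattended object's pattern by (1 - bias).
    attend_face = noisy_blocks(bias * face_template + (1 - bias) * house_template)
    attend_house = noisy_blocks(bias * house_template + (1 - bias) * face_template)
    return single_face, single_house, attend_face, attend_house


def fit_centroid_readout(class_a, class_b):
    """Linear readout whose boundary lies midway between the class means."""
    mean_a, mean_b = class_a.mean(axis=0), class_b.mean(axis=0)
    weights = mean_a - mean_b
    offset = weights @ (mean_a + mean_b) / 2.0
    return weights, offset


def attention_decoding_accuracy(bias, seed=0):
    """Train on single-stimulus blocks, test on attended-blend blocks."""
    single_face, single_house, attend_face, attend_house = simulate_patterns(
        bias, seed=seed
    )
    weights, offset = fit_centroid_readout(single_face, single_house)
    # Positive scores are read out as "face", negative scores as "house".
    face_scores = attend_face @ weights - offset
    house_scores = attend_house @ weights - offset
    n_correct = np.sum(face_scores > 0) + np.sum(house_scores < 0)
    return n_correct / (len(face_scores) + len(house_scores))


biased_accuracy = attention_decoding_accuracy(bias=0.65)
unbiased_accuracy = attention_decoding_accuracy(bias=0.5)
print(f"bias 0.65: {biased_accuracy:.2f}, bias 0.50: {unbiased_accuracy:.2f}")
```

In this toy setting, a readout trained only on single faces and single houses can classify the attended object in the blend only to the extent that `bias` departs from 0.5. With no attentional bias, the two blend conditions have the same mean pattern and decoding should fall to around chance.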
Materials and Methods
Participants
A total of 10 healthy observers, aged 23–32, participated in one or
more of the following experiments, with 6 observers in Experiment 1
(observers 1, 2, 3, 4, 5, 6), 5 observers in Experiment 2 (1, 2, 4, 5, 7), 5
observers in Experiment 3 (1, 3, 7, 8, 9), and 5 obse (...truncated)