Learning to grasp and extract affordances: the Integrated Learning of Grasps and Affordances (ILGA) model
Biol Cybern
James Bonaiuto · Michael A. Arbib

USC Brain Project, University of Southern California, Los Angeles, CA 90089-2520, USA
Neuroscience Program, University of Southern California, Los Angeles, CA 90089-2520, USA
Sobell Department of Motor Neuroscience and Movement Disorders, University College London, London WC1N 3BG, UK
Computer Science Department, University of Southern California, Los Angeles, CA 90089-2520, USA
The activity of certain parietal neurons has been interpreted as encoding affordances (directly perceivable opportunities) for grasping. Separate computational models have been developed for infant grasp learning and affordance learning, but no single model has yet combined these processes in a neurobiologically plausible way. We present the Integrated Learning of Grasps and Affordances (ILGA) model that simultaneously learns grasp affordances from visual object features and motor parameters for planning grasps using trial-and-error reinforcement learning. As in the Infant Learning to Grasp Model, we model a stage of infant development prior to the onset of sophisticated visual processing of hand-object relations, but we assume that certain premotor neurons activate neural populations in primary motor cortex that synergistically control different combinations of fingers. The ILGA model is able to extract affordance representations from visual object features, learn motor parameters for generating stable grasps, and generalize its learned representations to novel objects.
Neural network model; Grasping; Infant development; Affordances
1 Introduction
The notion of affordances as directly perceivable opportunities for action (Gibson 1966) was used to interpret the activity of certain parietal neurons as encoding affordances for grasping in the FARS model of parieto-frontal interactions in grasping (Fagg and Arbib 1998). However, the FARS model "hard-wires" these affordances, whereas our concern is with the development of these affordances and the grasps they afford. While computational models of infant grasp learning (Oztop et al. 2004) and affordance learning (Oztop et al. 2006) have been developed that work in a staged fashion, no existing model learns affordance extraction and grasp motor programs simultaneously. This model follows from a suggestion of Arbib et al. (2009) and implements a dual learning system that simultaneously learns both grasp affordances and motor parameters for planning grasps using trial-and-error reinforcement learning. As in the Infant Learning to Grasp Model (ILGM, Oztop et al. 2004), we model a stage of infant development prior to the onset of sophisticated visual processing of hand–object relations, but as in the FARS model (Fagg and Arbib 1998), we assume that certain premotor neurons activate neural populations in primary motor cortex that synergistically control different combinations of fingers. The issue is to understand how different visual patterns can activate the appropriate subset of these neurons. Specifically, the task of ILGA is to learn (i) "affordances," representations of object features that indicate where an object can be grasped, and (ii) motor parameters that can be used to successfully grasp objects based on these representations.
Newborn infants aim their arm movements toward fixated objects (von Hofsten 1982). These early arm movements have been related to the development of object-directed reaching (Bhat et al. 2005), leading to grasping (Bhat and Galloway 2006), the development of which continues throughout childhood (Kuhtz-Buschbeck et al. 1998). Previous relevant models of infant motor development include Berthier's (1996), Berthier et al.'s (2005), and Caligiore et al.'s (2014) models of learning to reach, and the ILGM. The thread shared by these models is reinforcement-based learning of intrinsically motivated goal-directed actions based on exploratory movements, or motor babbling: movements are generated erratically in response to a target, and the mechanisms generating the movements are modified via positive reinforcement (Cangelosi and Schlesinger 2013).
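The motor-babbling scheme just described, in which stochastically generated movements are reinforced when they succeed, can be sketched in a few lines. The following toy example is our own illustration, not code from ILGM or ILGA; the discrete grasp parameters, the stability test, and the multiplicative reward factor are all assumptions made for the sketch:

```python
import random

random.seed(0)  # fixed seed for a reproducible run

# Hypothetical discrete grasp parameters: approach angle and wrist rotation.
ANGLES = [0, 45, 90, 135]
ROTATIONS = [0, 90]

# Preference weights over parameter combinations; babbling starts uniform.
weights = {(a, r): 1.0 for a in ANGLES for r in ROTATIONS}

def sample_grasp():
    """Sample a grasp in proportion to the current preference weights."""
    combos = list(weights)
    pick = random.uniform(0, sum(weights.values()))
    acc = 0.0
    for combo in combos:
        acc += weights[combo]
        if pick <= acc:
            return combo
    return combos[-1]

def simulated_stability(angle, rotation):
    """Stand-in environment: in this toy world, only one parameter
    combination yields a stable grasp."""
    return angle == 90 and rotation == 0

for trial in range(500):
    a, r = sample_grasp()
    if simulated_stability(a, r):
        weights[(a, r)] *= 1.1  # positive reinforcement only; no punishment

best = max(weights, key=weights.get)
```

Because only successful trials change the weights, the sampling distribution gradually concentrates on parameter combinations that have produced stable grasps, mirroring the purely positive reinforcement of reflex-triggered grasping in ILGM.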
Grasping in development seems to increasingly involve visual information in preprogramming the grasp (Lasky 1977; Lockman et al. 1984; von Hofsten and Ronnqvist 1988; Clifton et al. 1993; Newell et al. 1993; Witherington 2005). In ILGM, the affordance extraction module represented only the presence, position, or orientation of an object. All fingers were extended to a maximal aperture in the initial "preshape" portion of the grasp. Initially, the enclosure was triggered by the palmar reflex upon object contact. However, each time reflex grasping produced a stable grasp, that grasp was reinforced, and over time a repertoire developed of situations in which a stable grasp could be (...truncated)