Learning to grasp and extract affordances: the Integrated Learning of Grasps and Affordances (ILGA) model

Biological Cybernetics, Nov 2015

The activity of certain parietal neurons has been interpreted as encoding affordances (directly perceivable opportunities) for grasping. Separate computational models have been developed for infant grasp learning and affordance learning, but no single model has yet combined these processes in a neurobiologically plausible way. We present the Integrated Learning of Grasps and Affordances (ILGA) model that simultaneously learns grasp affordances from visual object features and motor parameters for planning grasps using trial-and-error reinforcement learning. As in the Infant Learning to Grasp Model, we model a stage of infant development prior to the onset of sophisticated visual processing of hand–object relations, but we assume that certain premotor neurons activate neural populations in primary motor cortex that synergistically control different combinations of fingers. The ILGA model is able to extract affordance representations from visual object features, learn motor parameters for generating stable grasps, and generalize its learned representations to novel objects.


James Bonaiuto, Michael A. Arbib

Affiliations: USC Brain Project, University of Southern California, Los Angeles, CA 90089-2520, USA; Neuroscience Program, University of Southern California, Los Angeles, CA 90089-2520, USA; Sobell Department of Motor Neuroscience and Movement Disorders, University College London, London WC1N 3BG, UK; Computer Science Department, University of Southern California, Los Angeles, CA 90089-2520, USA

Keywords: Neural network model; Grasping; Infant development; Affordances

1 Introduction

The notion of affordances as directly perceivable opportunities for action (Gibson 1966) was used to interpret the activity of certain parietal neurons as encoding affordances for grasping in the FARS model of parieto-frontal interactions in grasping (Fagg and Arbib 1998). However, the FARS model “hard-wires” these affordances, whereas our concern is with the development of these affordances and the grasps they afford. Computational models of infant grasp learning (Oztop et al. 2004) and affordance learning (Oztop et al. 2006) have been developed that work in a staged fashion, but no existing model learns affordance extraction and grasp motor programs simultaneously. This model follows from a suggestion of Arbib et al. (2009) and implements a dual learning system that simultaneously learns both grasp affordances and motor parameters for planning grasps using trial-and-error reinforcement learning. As in the Infant Learning to Grasp Model (ILGM; Oztop et al. 2004), we model a stage of infant development prior to the onset of sophisticated visual processing of hand–object relations, but as in the FARS model (Fagg and Arbib 1998), we assume that certain premotor neurons activate neural populations in primary motor cortex that synergistically control different combinations of fingers. The issue is to understand how different visual patterns can activate the appropriate subset of these neurons. Specifically, the task of ILGA is to learn (i) “affordances,” representations of object features that indicate where an object can be grasped, and (ii) motor parameters that can be used to successfully grasp objects based on these representations.
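To make the dual learning idea concrete, here is a minimal sketch (in Python with NumPy) of the kind of loop this description implies: an affordance layer maps visual object features to a grasp code, a motor layer maps that code to grasp parameters, noisy exploration plays the role of motor babbling, and both layers are adjusted only when a trial yields a stable grasp. The layer sizes, names, update rules, and toy stability test are illustrative assumptions, not the published ILGA implementation.

```python
# Minimal sketch of a dual affordance/motor learning loop. This is NOT the
# published ILGA implementation; dimensions, update rules, and the toy
# "stability" test are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8      # coarse visual object features (e.g., size/orientation code)
N_AFFORDANCE = 6    # affordance units (hypothetical parietal-like layer)
N_MOTOR = 4         # grasp parameters (e.g., approach axis, aperture, synergy mix)

# Hidden "physics": which grasp parameters actually suit a given object.
TRUE_MAP = rng.normal(scale=0.5, size=(N_MOTOR, N_FEATURES))

W_aff = rng.normal(scale=0.1, size=(N_AFFORDANCE, N_FEATURES))  # features -> affordance
W_mot = rng.normal(scale=0.1, size=(N_MOTOR, N_AFFORDANCE))     # affordance -> motor

def plan_grasp(features, noise=0.3):
    """Propagate object features to a noisy grasp plan (noise = motor babbling)."""
    affordance = np.tanh(W_aff @ features)
    grasp = np.tanh(W_mot @ affordance) + rng.normal(scale=noise, size=N_MOTOR)
    return affordance, grasp

def grasp_is_stable(features, grasp):
    """Toy stand-in for the simulated hand-object interaction."""
    target = np.tanh(TRUE_MAP @ features)
    return np.linalg.norm(grasp - target) < 0.5

LEARNING_RATE = 0.05

for trial in range(10000):
    features = rng.uniform(-1.0, 1.0, size=N_FEATURES)   # present a random object
    affordance, grasp = plan_grasp(features)
    if grasp_is_stable(features, grasp):
        # Positive reinforcement only: pull the deterministic motor output toward
        # the noisy plan that just succeeded, and strengthen the feature->affordance
        # mapping that produced it (Hebbian step with mild decay to keep it bounded).
        W_mot += LEARNING_RATE * np.outer(grasp - np.tanh(W_mot @ affordance), affordance)
        W_aff += LEARNING_RATE * (np.outer(affordance, features) - 0.1 * W_aff)
```

The essential point of such a loop is that success alone shapes both the feature-to-affordance mapping and the affordance-to-motor mapping, so the two representations are acquired jointly rather than in separate stages.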
Newborn infants aim their arm movements toward fixated objects (von Hofsten 1982). These early arm movements have been related to the development of object-directed reaching (Bhat et al. 2005), leading to grasping (Bhat and Galloway 2006), the development of which continues throughout childhood (Kuhtz-Buschbeck et al. 1998). Previous relevant models of infant motor development include the models of learning to reach of Berthier (1996), Berthier et al. (2005), and Caligiore et al. (2014), as well as the ILGM. The thread shared by these models is reinforcement-based learning of intrinsically motivated goal-directed actions based on exploratory movements, or motor babbling: movements are generated erratically in response to a target, and the mechanisms generating the movements are modified via positive reinforcement (Cangelosi and Schlesinger 2013). Grasping in development seems to increasingly involve visual information in preprogramming the grasp (Lasky 1977; Lockman et al. 1984; von Hofsten and Ronnqvist 1988; Clifton et al. 1993; Newell et al. 1993; Witherington 2005).

In ILGM, the affordance extraction module represented only the presence, position, or orientation of an object. All fingers were extended to a maximal aperture in the initial “preshape” portion of the grasp. Initially, the enclosure was triggered by the palmar reflex upon object contact. However, each time reflex grasping produced a stable grasp, that grasp was reinforced, and over time a repertoire developed of situations in which a stable grasp could be (...truncated)
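The reinforcement of reflex-triggered grasps described above can likewise be sketched as a simple babbling loop: grasp-plan parameters are sampled from a learned preference table, the enclosure is assumed to be triggered on contact, and only plans that end in a stable grasp are made more likely. The discretization, the softmax policy, and the toy success test are illustrative assumptions, not the ILGM implementation.

```python
# Minimal sketch of an ILGM-style babbling loop: grasp-plan parameters are
# sampled from a learned preference table, the "palmar reflex" closes the hand
# on contact, and parameters that yield a stable grasp become more likely to be
# chosen again. All details here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)

N_APPROACH = 8      # discretized approach directions around the object
N_WRIST = 6         # discretized wrist orientations

# Preference table over (approach, wrist) grasp plans, turned into a policy
# by a softmax; starts flat, i.e. uniform babbling.
prefs = np.zeros((N_APPROACH, N_WRIST))

def sample_plan():
    probs = np.exp(prefs - prefs.max())
    probs /= probs.sum()
    idx = rng.choice(probs.size, p=probs.ravel())
    return np.unravel_index(idx, prefs.shape)

def reflex_grasp_is_stable(approach, wrist):
    """Toy stand-in for the hand-object simulation: only some combinations of
    approach direction and wrist orientation give a stable enclosure."""
    return (approach in (2, 3)) and (wrist in (1, 4))

for trial in range(5000):
    approach, wrist = sample_plan()           # motor babbling draws a plan
    if reflex_grasp_is_stable(approach, wrist):
        prefs[approach, wrist] += 0.5         # positive reinforcement only

# After learning, the policy concentrates on the combinations that afforded
# stable grasps, analogous to the developing grasp repertoire described above.
```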


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007%2Fs00422-015-0666-2.pdf

James Bonaiuto, Michael A. Arbib (2015). Learning to grasp and extract affordances: the Integrated Learning of Grasps and Affordances (ILGA) model. Biological Cybernetics 109(6): 639–669. DOI: 10.1007/s00422-015-0666-2