Embodied Gesture Processing: Motor-Based Integration of Perception and Action in Social Artificial Agents

Cognitive Computation, Nov 2010

A close coupling of perception and action processes is assumed to play an important role in basic capabilities of social interaction, such as guiding attention and observation of others’ behavior, coordinating the form and functions of behavior, or grounding the understanding of others’ behavior in one’s own experiences. In the attempt to endow artificial embodied agents with similar abilities, we present a probabilistic model for the integration of perception and generation of hand-arm gestures via a hierarchy of shared motor representations, allowing for combined bottom-up and top-down processing. Results from human-agent interactions are reported demonstrating the model’s performance in learning, observation, imitation, and generation of gestures.

Amir Sadeghipour, Stefan Kopp
Sociable Agents Group, Cognitive Interaction Technology (CITEC), Bielefeld University, P.O. Box 100 131, 33501 Bielefeld, Germany

In social interactions, one is continuously confronted with an intricate complexity of verbal and nonverbal behavior, including hand-arm gestures, body movements, and facial expressions. All of these behaviors can be indicative of the other's referential, communicative, or social intentions [1]. In this paper, we focus on hand-arm gestures. Interlocutors in social interaction incessantly and concurrently produce and perceive a variety of gestures. The generation of a hand-arm gesture consists, coarsely, of two steps: first, finding the proper gesture for an intention that is to be realized under current context constraints; second, performing the gesture using one's motor repertoire. Similarly, the recipient perceives and analyzes the other's movement at both the motor and the intention level. Accumulating evidence suggests that these two processes are not separate, but that recognizing and understanding a gesture is grounded in the perceiver's own motor repertoire [2, 3].
In other words, a hand movement is understood, at least partially, by evoking the motor system of the observer. This is evidenced by so-called motor resonances, which show that the motor and action (premotor) systems become activated during both performance and observation of bodily behavior [4–6]. One hypothesis is that these neural resonances reflect the involvement of the motor system in deriving predictions and evaluating hypotheses about the incoming observations. This integration of perception and action enables imitating or mimicking the observed behavior, either overtly or covertly, and thus forms an embodied basis for understanding other embodied agents [7], and for communication and intersubjectivity of intentional agents more generally (cf. simulation theory [8]). Hence, perception-action links (and the resulting resonances) are assumed to be effective at various levels of a hierarchical perceptual-motor system, from kinematic features to motor commands to goals and intentions [9], with these levels interacting bidirectionally, both bottom-up and top-down [10]. Further, a close perception-action integration can be assumed to support two important ingredients of social interaction. First, fast and often subconscious inter-personal coordinations (e.g., alignment, mimicry, interactional synchrony) that lead to rapport [11] and social resonance [12] between interactants. Second, social learning of behavior by means of imitation, which helps to acquire and interactively establish behavior through connected perceiving, processing, and reproducing of its pertinent features. All of these effects may also apply, at least to a certain extent, to the interaction between humans and embodied agents, be it physical robots or virtual characters (see [12] for a detailed discussion).

Fig. 1 Overall model for cognitive processes of embodied perception and generation, integrated in a shared motor knowledge
For example, brain imaging studies [13, 14] showed that artificial agents with sufficiently natural appearance and movements can evoke motor resonances in human observers. Against this background, we aim for interactive embodied systems that are ultimately able to engage in social interactions in a human-like manner, based on cognitively plausible mechanisms. A central ingredient is a computational model for integrated perception and generation of hand-arm gestures. This model has to fulfill a number of requirements: (1) perceiving and generating behavior in a fast, robust, and incremental manner; (2) concurrent and mutually interacting perception and generation; (3) concurrent processing at different levels of motor abstraction, from movement trajectories to intentions; (4) incremental construction of hierarchical knowledge structures through learning from observation and imitation. In this paper, we present a cognitive computational model that has been devised and developed to meet these requirements for the domain of hand-arm gestures. Focusing on the motor aspect of gestures, it should also serve as a basis for future modeling of higher cognitive levels of social intentions. In the section "Shared Motor Knowledge Model", we introduce the shared motor knowledge structures that serve as a basis for integrating perception and action, both of which operate upon these structures by means of forward/inverse models. In "A Probabilistic Model of Motor Resonances", we present a probabilistic approach to simulating fast, incremental, and concurrent resonances and their exploitation of these structures in both perceiving and generating behavior. The section "Perception-Action Integration" details how the integration of perception and action is achieved in this model and how this helps to model and cope with characteristics of nonverbal human social interaction.
Results of applying this model to real-world data (marker-free gesture tracking) from a human-agent interaction scenario are reported in "Results". In the final section, we discuss our work in comparison to related work.

Shared Motor Knowledge Model

In previous work [15], we presented a cognitive model for hierarchical representations of motor knowledge for hand-arm gestures, and we proposed how these structures can be utilized for probabilistic embodied behavior perception. Here, we present an extended version of this model that serves as a unified basis for both perception and generation of hand-arm movements (wrist position trajectories, to be specific) as they occur in natural gesturing by human users in interaction with a humanoid virtual agent. Overall, the model consists of three main modules (see Fig. 1): shared motor knowledge, perception, and generation. This model allows for parallel gesture generation and perception processes grounded in shared motor knowledge. Further, the hierarchical model enables bottom-up processing (mainly for perceptual tasks) interacting bidirectionally with top-down processing (for action production as well as attention and perception guidance). In the remainder of this section, we d (...truncated)
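
As a rough illustration of how bottom-up evidence from observed wrist positions might be combined incrementally with top-down expectations, the following Python sketch maintains a belief over a few gesture hypotheses and updates it with each new observation. It is not the authors' implementation. The prototype trajectories, the isotropic Gaussian observation model, the `sigma` value, and all function names are assumptions introduced here for illustration only; the stored prototypes merely stand in for the predictions a forward model would provide.

```python
import math

# Hypothetical gesture prototypes: short 2-D wrist-position trajectories.
# They play the role of forward-model predictions in this toy example.
PROTOTYPES = {
    "wave":   [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0), (1.0, 3.0)],
    "circle": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "point":  [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)],
}


def likelihood(observed, predicted, sigma=0.5):
    """Unnormalized isotropic Gaussian likelihood of an observed position."""
    d2 = (observed[0] - predicted[0]) ** 2 + (observed[1] - predicted[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def update(belief, observed, t):
    """One Bayes step: reweight each hypothesis by its fit at time step t."""
    posterior = {
        g: p * likelihood(observed, PROTOTYPES[g][t]) for g, p in belief.items()
    }
    total = sum(posterior.values())
    if total == 0.0:
        # No hypothesis explains the observation; keep the previous belief.
        return dict(belief)
    return {g: p / total for g, p in posterior.items()}


def observe(trajectory, prior=None):
    """Process a trajectory incrementally and return the belief after each step.

    `prior` models a top-down expectation; without it, the belief starts uniform.
    """
    if prior is None:
        belief = {g: 1.0 / len(PROTOTYPES) for g in PROTOTYPES}
    else:
        belief = dict(prior)
    history = []
    for t, position in enumerate(trajectory):
        belief = update(belief, position, t)
        history.append(belief)
    return history


if __name__ == "__main__":
    noisy_wave = [(0.1, 0.0), (0.9, 1.1), (0.1, 1.9)]
    for step, belief in enumerate(observe(noisy_wave)):
        print(step, {g: round(p, 3) for g, p in belief.items()})
```

Because the belief is renormalized after every observation, a hypothesis can be read off before the gesture is complete, which is the sense of "incremental" used in this sketch. Passing a non-uniform `prior` biases the result toward an expected gesture, a simple stand-in for top-down guidance.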


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007%2Fs12559-010-9082-z.pdf
Article home page: http://link.springer.com/article/10.1007/s12559-010-9082-z

Amir Sadeghipour, Stefan Kopp. Embodied Gesture Processing: Motor-Based Integration of Perception and Action in Social Artificial Agents, Cognitive Computation, 2010, pp. 419-435, Volume 3, Issue 3, DOI: 10.1007/s12559-010-9082-z