Learning Dextrous Manipulation Skills for Multifingered Robot Hands Using the Evolution Strategy
Editors: Henry Hexmoor and Maja Mataric
Department of Computer Science, University of Rochester, Rochester, NY 14627
Centro de Investigacion en Computacion, I.P.N., Mexico City, Mexico 07738
We present a method for autonomous learning of dextrous manipulation skills with multifingered robot hands. We use heuristics derived from observations made on human hands to reduce the degrees of freedom of the task and make learning tractable. Our approach consists of learning and storing a few basic manipulation primitives for a few prototypical objects and then using an associative memory to obtain the required parameters for new objects and/or manipulations. The parameter space of the robot is searched using a modified version of the evolution strategy, which is robust to the noise normally present in real-world complex robotic tasks. Given the difficulty of modeling and simulating accurately the interactions of multiple fingers and an object, and to ensure that the learned skills are applicable in the real world, our system does not rely on simulation; all the experimentation is performed by a physical robot, in this case the 16-degree-of-freedom Utah/MIT hand. Experimental results show that accurate dextrous manipulation skills can be learned by the robot in a short period of time. We also show the application of the learned primitives to perform an assembly task and how the primitives generalize to objects that are different from those used during the learning phase.
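The evolution strategy named in the abstract can be illustrated with a minimal (1+1)-ES using the classic 1/5th-success step-size rule. This is a textbook sketch, not the noise-robust modification developed in the paper, and the sphere objective below is a stand-in for the robot's actual error measure.

```python
import random

def one_plus_one_es(f, x, sigma=0.5, iters=200):
    """Minimal (1+1)-evolution strategy minimizing f.

    Each generation mutates the parent with Gaussian noise and keeps
    the offspring only if it improves; sigma is adapted with the
    1/5th-success rule (widen on success, narrow on failure).
    """
    fx = f(x)
    for _ in range(iters):
        child = [xi + random.gauss(0.0, sigma) for xi in x]
        fc = f(child)
        if fc <= fx:            # success: keep offspring, widen search
            x, fx = child, fc
            sigma *= 1.22
        else:                   # failure: narrow search
            sigma *= 0.82
    return x, fx

# Stand-in objective: a sphere function (not the hand's error measure)
sphere = lambda v: sum(c * c for c in v)
best, val = one_plus_one_es(sphere, [2.0, -3.0, 1.5])
```

Because offspring are accepted only when they improve the objective, the returned value can never be worse than the starting point; the paper's version additionally copes with the stochastic evaluations produced by a physical robot.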
1. Introduction
A dextrous manipulator is a robotic system composed of two or more cooperating serial
manipulators. Dextrous manipulators have potential applications in areas such as prosthetics
and space and deep sea exploration, where a single robotic system will be required to perform
a variety of tasks and thus versatility rather than precision is the main requirement. However,
programming these robots to solve complex tasks in the real world has remained an elusive
goal. The complexity and unpredictability of the interactions of multiple effectors with
objects is an important reason for this difficulty.
In complex and hard-to-model situations, such as dextrous manipulation, it would be
desirable if the behaviors or skills exhibited by the robot could be learned autonomously by
means of the robot's interaction with the world instead of being programmed by hand.
However, machine learning of robotic tasks using dextrous manipulators is extremely difficult,
mainly due to the high dimensionality of the parameter space of these robots. Conventional
approaches to this problem face the well-known curse of dimensionality (Bellman, 1957),
which essentially states that the number of samples required to learn a task grows
exponentially with the number of parameters of the task. Another problem is that autonomous
experimentation with real robots is expensive both in terms of time and equipment wear.
For these reasons, most applications of machine learning to robotics have dealt with simple
robots, and have concentrated on simple tasks with a few discrete states and actions.
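Bellman's observation can be made concrete with a quick count: exhaustively sampling a task with d parameters, each discretized into k levels, requires k**d trials. The discretization level below is an illustrative assumption.

```python
# Grid-sampling cost for a task with d parameters, each discretized
# into k levels: the number of required trials is k**d.
def grid_trials(k, d):
    return k ** d

# Even a coarse discretization explodes with dimensionality:
print(grid_trials(10, 2))   # 2-DOF robot, 10 levels each -> 100 trials
print(grid_trials(10, 16))  # 16-DOF Utah/MIT hand -> 10**16 trials
```

At one trial per second, the 16-dimensional grid would take on the order of hundreds of millions of years, which is why naive sampling is hopeless for a hand like the Utah/MIT's.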
A commonly used approach to make robot learning feasible despite the high
dimensionality of the sensory and motor spaces is to run the learning algorithms using simulated
environments. However, in situations involving complex robots and environments, it is difficult
or impossible to gather enough knowledge to build a realistic simulation (Mataric, 1994).
Moreover, some physical events such as sliding and collisions are difficult to simulate even
when there is complete knowledge. For these reasons, we believe that for the learned skills
to be applicable by the physical robot in its environment, much of the learning and
experimentation has to be carried out by the physical robot itself. Given the high cost of real-world
experimentation, for the learning algorithms to be successfully applied, it is crucial that they
converge within a reasonable number of trials.
Observations made on human hands offer some clues about how to deal with the problem
of the high dimensionality of the parameter space of dextrous manipulators. Arbib et al.
(1983) introduced the concept of virtual fingers as a model for task representation at higher
levels in the human central nervous system. In this model, a virtual finger is composed of
one or more real fingers working together to solve a problem in a task. The use of virtual
fingers limits the degrees of freedom to those needed for the task, rather than to the
full set of physical degrees of freedom that the hand, human or robotic, possesses. Iberall (1987) has shown
how the hand can be used as essentially three different grippers by changing the mapping
of virtual fingers to real fingers. A generalization of virtual fingers, called a virtual tool
(Nelson et al., 1995; Fuentes & Nelson, 1996b), has been proposed as a way of dealing with
the redundant degrees of freedom of complex robotic systems.
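The virtual-finger idea can be sketched as a simple mapping: a task is expressed over a few virtual fingers, and each virtual finger fans its command out to the one or more real fingers that compose it. The finger names and the uniform fan-out rule below are illustrative assumptions, not the Utah/MIT hand's actual control interface.

```python
# Illustrative virtual-finger mapping: a command issued per virtual
# finger is broadcast to the real fingers that compose it, so the task
# is parameterized by 2 values instead of one per physical finger.
VIRTUAL_FINGERS = {
    "vf_thumb":   ["thumb"],
    "vf_opposed": ["index", "middle", "ring"],  # act as one unit
}

def expand(virtual_commands):
    """Map {virtual finger: command} to {real finger: command}."""
    real = {}
    for vf, cmd in virtual_commands.items():
        for finger in VIRTUAL_FINGERS[vf]:
            real[finger] = cmd
    return real

# Two task parameters drive all four real fingers:
cmds = expand({"vf_thumb": 0.3, "vf_opposed": 0.7})
```

Changing the dictionary that maps virtual to real fingers reconfigures the hand into a different effective gripper, which is the reconfiguration Iberall describes.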
An object translation by a human hand using a precision grasp, that is, (...truncated)