Rapid Concept Learning for Mobile Robots
SRIDHAR MAHADEVAN, GEORGIOS THEOCHAROUS
Department of Computer Science, Michigan State University, East Lansing, MI 48824
NIKFAR KHALEELI
Wind River Systems, Alameda, CA 94501
Editors: Henry Hexmoor and Maja Mataric
Concept learning in robotics is an extremely challenging problem: sensory data is often high-dimensional, and noisy due to specularities and other irregularities. In this paper, we investigate two general strategies to speed up learning, based on spatial decomposition of the sensory representation and simultaneous learning of multiple classes using a shared structure. We study two concept learning scenarios. The first is a hallway navigation problem, where the robot has to induce features such as opening or wall. The second is a recycling task, where the robot has to learn to recognize objects such as a trash can. We use a common underlying function approximator in both studies, in the form of a feedforward neural network with several hundred input units and multiple output units. Despite the many degrees of freedom afforded by such an approximator, we show that the two strategies provide sufficient bias to achieve rapid learning. We provide detailed experimental studies on an actual mobile robot called PAVLOV to illustrate the effectiveness of this approach.
Programming mobile robots to successfully operate in unstructured environments, including
offices and homes, is tedious and difficult. Easing this programming burden seems necessary
to realize many of the possible applications of mobile robot technology (Engelberger, 1989).
One promising avenue towards smarter and easier-to-program robots is to equip them with
the ability to learn new concepts and behaviors. In particular, robots that have the capability
of learning concepts could be programmed or instructed more readily than their non-learning
counterparts. For example, a robot that could be trained to recognize landmarks, such as
doors and intersections, would enable a more flexible navigation system. Similarly, a
recycling robot, which could be trained to find objects such as trash cans or soda cans,
could be adapted to new circumstances much more easily than non-learning robots (for
example, new objects or containers could be easily accommodated by additional training).
Robot learning is currently an active area of research (e.g., see Connell & Mahadevan,
1993; Dorigo, 1996; Franklin, Mitchell & Thrun, 1996; Mahadevan, 1994). Many different
approaches to this problem are being investigated, ranging from supervised learning of
concepts and behaviors (Pomerleau, 1990), to learning behaviors from scalar feedback
(Mahadevan & Connell, 1992). While a detailed comparison of the different approaches
to robot learning is beyond the scope of this paper (see Mahadevan, 1996), it is arguable
that in the short term, robots are going to be dependent on human trainers for much of their
learning. Specifically, a pragmatic approach to robot learning is one where a human designer
provides the basic ingredients of the solution (e.g., the overall control architecture), with
the missing components being filled in by additional training. Also, approaches involving
considerable trial-and-error, such as reinforcement learning (Sutton & Barto, 1998), are
difficult to use in many circumstances, because they require long training times, or because
they expose the robot to dangerous situations. For these reasons, we adopt the framework
of supervised learning, where a human trainer provides the robot with labeled examples of
the desired concept.
Supervised concept learning from labeled examples is probably the best-studied
form of learning (Mitchell, 1997). Among the most successful approaches are decision trees
(Quinlan, 1986) and neural networks (McClelland & Rumelhart, 1986). Concept learning
in robotics is an extremely challenging problem, for several reasons. Sensory data is often
very high-dimensional (e.g., even a coarsely subsampled image can contain millions of
pixels) and noisy due to specularities and other irregularities, and data collection typically
requires the robot to move to different parts of its environment. Under these conditions, it
seems clear that some form of a priori knowledge or bias is necessary for robots to be able
to successfully learn interesting concepts.
In this paper, we investigate two general approaches to bias sensory concept learning for
mobile robots. The first is based on spatial decomposition of the sensor representation.
The idea here is to partition a high-dimensional sensor representation, such as a local
occupancy grid or a visual image, into multiple quadrants, and learn independently from
each quadrant. The second form of bias investigated here is to learn multiple concepts using
a shared representation. We investigate the effectiveness of these two approaches on two
realistic tasks, navigation and recycling. Both these tasks are studied on a real robot called
PAVLOV (see Figure 1). In both problems, we use a standardized function approximator,
in the form of a feedforward neural net, to represent concepts, although we believe the bias
strategies studied here would be applicable to other approximators (e.g., decision trees or
instance-based methods).
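The two forms of bias can be illustrated with a minimal sketch. It is not the implementation used on PAVLOV: the function `split_into_quadrants`, the class `SharedNet`, the grid size, the weight range, and the absence of bias terms are all assumptions made for illustration. The sketch splits a 2-D sensor array into four quadrants, and passes a quadrant through a feedforward net whose single hidden layer is shared by several output units, one per concept.

```python
import math
import random


def split_into_quadrants(grid):
    """Split a 2-D grid (list of equal-length rows) into four quadrants.

    Returns [top_left, top_right, bottom_left, bottom_right], each
    flattened row by row into a 1-D list usable as a network input.
    """
    rows, cols = len(grid), len(grid[0])
    r_mid, c_mid = rows // 2, cols // 2
    blocks = [
        (0, r_mid, 0, c_mid),        # top left
        (0, r_mid, c_mid, cols),     # top right
        (r_mid, rows, 0, c_mid),     # bottom left
        (r_mid, rows, c_mid, cols),  # bottom right
    ]
    return [
        [grid[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        for (r0, r1, c0, c1) in blocks
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SharedNet:
    """Feedforward net: one hidden layer shared by all output units.

    Each output unit stands for one concept (hypothetically "door",
    "opening", "wall"). Weights are random here; training is omitted.
    """

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                      for _ in range(n_out)]

    def forward(self, x):
        # The hidden activations are computed once and reused by every output.
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w_hidden]
        return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w_out]


# Illustrative 4x4 "occupancy grid" (values are made up).
grid = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
quadrants = split_into_quadrants(grid)
net = SharedNet(n_in=4, n_hidden=3, n_out=3)
scores = [net.forward(q) for q in quadrants]  # one score vector per quadrant
```

Each quadrant yields a much smaller input vector than the full grid, and the shared hidden layer means that features useful for one concept are available to the others. How the per-quadrant outputs are combined into a final label is left open in this sketch.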
In the navigation task, PAVLOV is required to traverse across an entire floor of the
engineering building (see Figure 10). The navigational system uses a hybrid two-layered
architecture, combining a probabilistic planning and execution layer with a reactive
behavior-based layer. The planning layer requires the robot to map sensory values into high-level
features, such as doors and openings. These observations are used in state estimation to
localize the robot, and are critical to successful navigation despite noisy sensing and actions.
We study how PAVLOV can be trained to recognize these features from local occupancy
grid data. We also show that spatial decomposition and multiple category learning provide
a relatively rapid training phase.
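To make the role of feature observations in state estimation concrete, the following sketch shows a generic discrete Bayes update of a belief over robot locations. It is an assumed textbook formulation rather than the estimator used on PAVLOV; the function `update_belief`, the state names, and all probabilities are hypothetical.

```python
def update_belief(belief, likelihood):
    """Bayes update of a discrete belief after one observation.

    belief:     dict mapping state -> prior probability
    likelihood: dict mapping state -> P(observation | state)
    Returns the normalized posterior as a new dict.
    """
    unnormalized = {s: belief[s] * likelihood.get(s, 0.0) for s in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # Observation is impossible under the model; keep the prior.
        return dict(belief)
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical example: the robot is equally unsure between two places,
# then its learned classifier reports an "opening" on the left.
prior = {"corridor": 0.5, "junction": 0.5}
p_opening_given_state = {"corridor": 0.1, "junction": 0.9}
posterior = update_belief(prior, p_opening_given_state)
```

Under these made-up numbers the belief shifts strongly toward the junction, which is why the reliability of the learned feature detectors matters for localization.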
In the recycling task, PAVLOV is required to find items of trash (e.g., soda cans and
other litter) and deposit them in a specified trash receptacle. The trash receptacles are color
coded, to make recognition easier. Here, we study how PAVLOV can be trained to recognize
and find trash receptacles from color images. The data is very high-dimensional, but once
again, spatial decomposition and multi-category learning are able to sufficiently constrain
the hypothesis space to yield fast learning.
The rest of the paper is organized as follows. We begin in Section 2 by describing the
two robotics tasks where we investigated sensory concept learning. Section 3 describes the
two general forms of bias, decomposition and sharing, used to make the concept learning
problem tractable. Section 4 describes the experimental results obtained on a real robot
platform. Section 5 discusses the limitations of our approach and proposes some directions
for further work. Section 6 discusses some related work. Final (...truncated)