Rapid Concept Learning for Mobile Robots

https://link.springer.com/content/pdf/10.1023%2FA%3A1007432422702.pdf

SRIDHAR MAHADEVAN
GEORGIOS THEOCHAROUS
Department of Computer Science, Michigan State University, East Lansing, MI 48824

NIKFAR KHALEELI
Wind River Systems, Alameda, CA 94501

Editors: Henry Hexmoor and Maja Mataric

Concept learning in robotics is an extremely challenging problem: sensory data is often high-dimensional, and noisy due to specularities and other irregularities. In this paper, we investigate two general strategies to speed up learning, based on spatial decomposition of the sensory representation, and simultaneous learning of multiple classes using a shared structure. We study two concept learning scenarios. The first is a hallway navigation problem, where the robot has to induce features such as opening or wall. The second is recycling, where the robot has to learn to recognize objects, such as a trash can. We use a common underlying function approximator in both studies in the form of a feedforward neural network, with several hundred input units and multiple output units. Despite the many degrees of freedom afforded by such an approximator, we show that the two strategies provide sufficient bias to achieve rapid learning. We provide detailed experimental studies on an actual mobile robot called PAVLOV to illustrate the effectiveness of this approach.

Programming mobile robots to successfully operate in unstructured environments, including offices and homes, is tedious and difficult. Easing this programming burden seems necessary to realize many of the possible applications of mobile robot technology (Engleberger, 1989). One promising avenue towards smarter and easier-to-program robots is to equip them with the ability to learn new concepts and behaviors. In particular, robots that have the capability of learning concepts could be programmed or instructed more readily than their non-learning counterparts. For example, a robot that could be trained to recognize landmarks, such as doors and intersections, would enable a more flexible navigation system.
Similarly, a recycling robot that could be trained to find objects such as trash cans or soda cans could be adapted to new circumstances much more easily than non-learning robots (for example, new objects or containers could be easily accommodated by additional training).

Robot learning is currently an active area of research (e.g., see Connell & Mahadevan, 1993; Dorigo, 1996; Franklin, Mitchell & Thrun, 1996; Mahadevan, 1994). Many different approaches to this problem are being investigated, ranging from supervised learning of concepts and behaviors (Pomerleau, 1990), to learning behaviors from scalar feedback (Mahadevan & Connell, 1992). While a detailed comparison of the different approaches to robot learning is beyond the scope of this paper (see Mahadevan, 1996), it is arguable that in the short term, robots are going to be dependent on human trainers for much of their learning. Specifically, a pragmatic approach to robot learning is one where a human designer provides the basic ingredients of the solution (e.g., the overall control architecture), with the missing components being filled in by additional training. Also, approaches involving considerable trial and error, such as reinforcement learning (Sutton & Barto, 1998), are difficult to use in many circumstances, because they require long training times, or because they expose the robot to dangerous situations. For these reasons, we adopt the framework of supervised learning, where a human trainer provides the robot with labeled examples of the desired concept.

Supervised concept learning from labeled examples is probably the best-studied form of learning (Mitchell, 1997). Among the most successful approaches are decision trees (Quinlan, 1986) and neural networks (McClelland & Rumelhart, 1986). Concept learning in robotics is an extremely challenging problem, for several reasons.
Sensory data is often very high-dimensional (e.g., even a coarsely subsampled image can contain millions of pixels) and noisy due to specularities and other irregularities, and data collection typically requires the robot to move to different parts of its environment. Under these conditions, it seems clear that some form of a priori knowledge or bias is necessary for robots to be able to successfully learn interesting concepts.

In this paper, we investigate two general approaches to bias sensory concept learning for mobile robots. The first is based on spatial decomposition of the sensor representation. The idea here is to partition a high-dimensional sensor representation, such as a local occupancy grid or a visual image, into multiple quadrants, and learn independently from each quadrant. The second form of bias investigated here is to learn multiple concepts using a shared representation. We investigate the effectiveness of these two approaches on two realistic tasks, navigation and recycling. Both these tasks are studied on a real robot called PAVLOV (see Figure 1). In both problems, we use a standardized function approximator, in the form of a feedforward neural net, to represent concepts, although we believe the bias strategies studied here would be applicable to other approximators (e.g., decision trees or instance-based methods).

In the navigation task, PAVLOV is required to traverse an entire floor of the engineering building (see Figure 10). The navigational system uses a hybrid two-layered architecture, combining a probabilistic planning and execution layer with a reactive behavior-based layer. The planning layer requires the robot to map sensory values into high-level features, such as doors and openings. These observations are used in state estimation to localize the robot, and are critical to successful navigation despite noisy sensing and actions. We study how PAVLOV can be trained to recognize these features from local occupancy grid data.
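The two forms of bias described above can be illustrated with a minimal sketch. This is not the implementation used on PAVLOV: the grid size, the four-way split, the hidden-layer size, the class names, and the names `split_into_quadrants` and `SharedMultiOutputNet` are all assumptions chosen for illustration, and only the forward pass is shown (no training).

```python
import math
import random


def split_into_quadrants(grid):
    """Split a square 2-D grid (a list of rows) into four flattened quadrants.

    The four-way split is an illustrative assumption, not the paper's layout.
    """
    half = len(grid) // 2
    blocks = {
        "top_left": [row[:half] for row in grid[:half]],
        "top_right": [row[half:] for row in grid[:half]],
        "bottom_left": [row[:half] for row in grid[half:]],
        "bottom_right": [row[half:] for row in grid[half:]],
    }
    # Flatten each quadrant into one input vector for its own learner.
    return {name: [cell for row in block for cell in row]
            for name, block in blocks.items()}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SharedMultiOutputNet:
    """Feedforward net whose single hidden layer is shared by all output units.

    Each output unit scores one concept, so all concepts reuse the same
    hidden features. Weights are random here because training is omitted.
    """

    def __init__(self, n_inputs, n_hidden, class_names, seed=0):
        rng = random.Random(seed)
        self.class_names = list(class_names)
        self.w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                      for _ in self.class_names]

    def forward(self, inputs):
        hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
                  for ws in self.w_hidden]
        outputs = [sigmoid(sum(w * h for w, h in zip(ws, hidden)))
                   for ws in self.w_out]
        return dict(zip(self.class_names, outputs))


# Hypothetical 8x8 occupancy grid (1 = occupied, 0 = free).
demo_grid = [[(r + c) % 2 for c in range(8)] for r in range(8)]
demo_parts = split_into_quadrants(demo_grid)

# One independent network per quadrant; class names are assumed examples.
demo_nets = {name: SharedMultiOutputNet(16, 4, ["door", "opening", "wall"], seed=i)
             for i, name in enumerate(demo_parts)}
demo_scores = {name: demo_nets[name].forward(vec)
               for name, vec in demo_parts.items()}
```

In this sketch each quadrant's network sees only a quarter of the inputs, which shrinks the number of weights each learner must fit, while the shared hidden layer lets every concept draw on the same learned features.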
We also show that spatial decomposition and multiple category learning provide a relatively rapid training phase.

In the recycling task, PAVLOV is required to find items of trash (e.g., soda cans and other litter) and deposit them in a specified trash receptacle. The trash receptacles are color-coded, to make recognition easier. Here, we study how PAVLOV can be trained to recognize and find trash receptacles from color images. The data is very high-dimensional, but once again, spatial decomposition and multi-category learning are able to sufficiently constrain the hypothesis space to yield fast learning.

The rest of the paper is organized as follows. We begin in Section 2 by describing the two robotics tasks where we investigated sensory concept learning. Section 3 describes the two general forms of bias, decomposition and sharing, used to make the concept learning problem tractable. Section 4 describes the experimental results obtained on a real robot platform. Section 5 discusses the limitations of our approach and proposes some directions for further work. Section 6 discusses some related work. Final (...truncated)