Research on Learning from Demonstration of Mobile Robot with Autonomous Navigation

MATEC Web of Conferences, Jan 2017

A new method of batch learning from demonstration is presented to solve the problem of autonomous navigation for mobile robots. A learning-from-demonstration model is given according to the actual situation of the robot, and neural networks are used to realize the robot's learning. Because a single artificial neural network suffers from the curse of dimensionality, we design a learning-from-demonstration model in which multiple neural networks coexist and are switched dynamically. The simulation results demonstrate that autonomous navigation of the mobile robot is realized.

https://www.matec-conferences.org/articles/matecconf/pdf/2017/18/matecconf_ic4m2017_02019.pdf


LI Yan-fei, ZHANG Wen-zhi

1 Introduction

A mobile robot should be able to perceive changes in its surrounding environment and adjust its path and behavioral strategy accordingly [1]. In the military field, mobile robot technology has been applied to a variety of advanced unmanned early-warning aircraft and demining robots. In the civil field, domestic, entertainment, medical, and other types of mobile robots are increasingly coming into public view. In short, mobile robots have broad room for development and wide application prospects.

Navigation, however, is a problem that every mobile robot must solve: it determines the set of actions that takes the robot from the initial point to the target point while avoiding collisions with obstacles [2,3]. Existing algorithms include the grid method, the potential field method, and the fuzzy control method. These algorithms must be designed by professionals according to the robot's surrounding environment, and changes in the environment affect the robot's navigation and obstacle avoidance. Experts may even need to rewrite the control program, which is costly in labor and material resources [4,5].
To address these shortcomings of existing navigation algorithms for mobile robots, a navigation controller based on batch demonstration learning is proposed. According to the framework of demonstration learning and the actual situation of the mobile robot, a mobile robot model based on demonstration learning is designed, and a neural network learning algorithm is used to compensate for the nonlinear term between the environment state and the action in the model. Using the control method proposed in this paper, a two-wheeled mobile robot is simulated following an arbitrary path in an obstacle-free environment in order to realize autonomous navigation.

2 Demonstration learning model of the mobile robot

2.1 Framework of batch learning from demonstration

In batch learning, all demonstrator sample data are collected before learning begins, and the learning update itself often uses the mathematical properties of the strategy evaluation value M. The batch learning process is shown in Fig. 1. First, a large amount of human state-space data is collected from the demonstrator's strategy [6]. The human state space is then mapped into a common task space by the human task-space operator. Through theoretical analysis of the strategy evaluation value M, the effective data in the common state space are selected and passed to the update operator U. Finally, the robot control strategy is derived through the robot task-space operator.

2.2 Overall model of the mobile robot based on batch demonstration learning

The essential problem for a mobile robot based on batch demonstration learning is to learn the nonlinear mapping between environment states and actions. Combining the framework of batch demonstration learning with the actual situation of the mobile robot, the overall model of the mobile robot based on batch demonstration learning is designed, as shown in Figure 2.
[Figure 2: overall model of the mobile robot based on batch demonstration learning. Blocks: motion demonstration; state and motion data collectors; batch demonstration learner (M, U); navigation planner; mobile robot.]

In the overall model of the mobile robot, the navigation planner is the most critical module for self-navigation. Its role is to realize the nonlinear mapping between the robot's environment state and the action to be executed. There is no fixed pattern for this many-to-one mapping, so it is almost impossible to find an explicit formula for it. Artificial neural networks (ANNs) are widely used to build nonlinear models and are especially suitable for applications in which the relationship between input and output is not well defined, so it is feasible to apply them to the navigation planner in Figure 2.

Considering the complexity of the navigation planning function, if a single neural network is used to implement the whole navigation planner, the artificial neural network will be too large, the large ne... In view of this, the navigation planner is divided into several smaller planners, and a classifier is added that selects the appropriate sub-planner to control the robot's motion according to the state of the robot. Following this idea, the navigation planner can be given the overall structure shown in Figure 3.

[Figure 3: overall structure of the navigation planner. A neural network model made up of several planners.]

In Figure 3, each planner is implemented with a small-scale neural network, and together they form a multi-neural-network structure. The networks can share the same structure and be trained on different data sets, although different structures can also be used. Each neural network on its own is imperfect, since it can only generalize the mapping between environment state and motion for some types of situation. A model switching unit placed in front of the multi-neural network therefore switches dynamically between the neural network models while the robot is running.
Dynamic switching lets each neural network perform at its best, so that the navigation planner as a whole functions properly. To determine the number of planners in Figure 3 and the function of each, it is necessary to analyze the state information obtained by the sensors on the actual robot, the characteristics of the navigation target points, and so on.

The robot used to test the learning effect is a two-wheeled robot equipped with three distance sensors with an angular spacing of 20° between them, located at the front, front left, and front right of the robot. According to whether each of the three sensors detects an obstacle, the state of the robot can be divided into eight states: NNN, NNE, NEN, ENN, NEE, ENE, EEN, and EEE, where E indicates that an obstacle has been detected (Existing) and N indicates that no obstacle has been detected (Nothing).

In accordance with the overall design idea, each small planner is implemented by an artificial neural network, so the entire navigation planner consists of eight neural networks and a classification switching unit. The obstacle distance values detected by the three sensors on the mobile robot and the steering angle of the mobile robot can be used as the input signals of the eight neural networks. The navigation planner shown in Figure 3 can thus be refined into the detailed structure shown in Figure 4. The model switching unit dynamically triggers one of the eight neural networks according to the output values of the three sensors, and the selected network outputs the control parameter that steers the robot.
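The classification and switching step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the detection range `DETECT_RANGE`, the (left, front, right) ordering of the three letters in a state label, and the placeholder planner rule are all hypothetical. In the paper, each planner is a trained neural network rather than a hand-written rule.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

# Assumed range within which a sensor counts an obstacle as detected (hypothetical value).
DETECT_RANGE = 1.0

# The eight obstacle states: NNN, NNE, NEN, NEE, ENN, ENE, EEN, EEE.
ALL_STATES = ["".join(p) for p in product("NE", repeat=3)]

# A planner maps (sensor distances, angle to target) to a steering command.
Planner = Callable[[Sequence[float], float], float]


def classify_state(distances: Sequence[float],
                   detect_range: float = DETECT_RANGE) -> str:
    """Turn three distance readings into a label such as 'NEN'.

    'E' (Existing) means an obstacle is within range, 'N' (Nothing) means none.
    The (left, front, right) letter order is an assumption of this sketch.
    """
    if len(distances) != 3:
        raise ValueError("expected exactly three sensor readings")
    return "".join("E" if d < detect_range else "N" for d in distances)


def make_placeholder_planner(state: str) -> Planner:
    """Stand-in for the trained network of one state (hypothetical rule)."""
    def planner(distances: Sequence[float], target_angle: float) -> float:
        # Head for the target when the front is clear, otherwise turn by a fixed amount.
        return target_angle if state[1] == "N" else 45.0
    return planner


class NavigationPlanner:
    """Model switching unit plus one sub-planner per obstacle state."""

    def __init__(self, planners: Dict[str, Planner]):
        missing = set(ALL_STATES) - set(planners)
        if missing:
            raise ValueError(f"no planner for states: {sorted(missing)}")
        self.planners = planners

    def steer(self, distances: Sequence[float],
              target_angle: float) -> Tuple[str, float]:
        # Only the planner that matches the current state is evaluated.
        state = classify_state(distances)
        return state, self.planners[state](distances, target_angle)


planner = NavigationPlanner({s: make_placeholder_planner(s) for s in ALL_STATES})
print(planner.steer([2.0, 0.4, 2.0], target_angle=10.0))
```

Keeping the classifier separate from the planners means a single sub-planner can be retrained or replaced without touching the others, which is the property the multi-network design relies on.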
[Figure 4: detailed structure of the navigation planner. A model switching unit in front of the eight neural networks.]

For the mobile robot to carry out demonstration learning, the weights of the internal nodes of each neural network in Figure 4 must be updated, after the demonstration is finished, using the data extracted by the state and action data collectors in Figure 2. The BP (back-propagation) algorithm, a neural network algorithm widely used in the field of intelligent control, can be used for this purpose [7]. When a complex input-output mapping is difficult to express as a mathematical function, the BP algorithm can store and generalize the mapping, and it controls the output precision of the network through training with the steepest descent learning rule [8, 9].

3 Simulation experiment

3.1 Demonstration learning process

In the MATLAB simulation, the sensors of the virtual mobile robot are distributed as shown in Figure 5, and the positions of the mobile robot, the obstacles, and the target sphere can all be placed arbitrarily. The learning flow is shown below.

[Flowchart: learning process. Begin; manually select an obstacle state i; the remote-controlled robot adjusts its initial position; check that the state of the robot is in accordance with the selected obstacle state; start the path demonstration; record the robot data during the path demonstration until the robot exits this state; train the neural network corresponding to the obstacle state; test the neural network; check that the robot behaves similarly to the demonstrator; End.]

3.2 Analysis of learning result

As the human demonstrator operates the robot by remote control, each of the eight obstacle states produces a large amount of demonstration data. Using these data, the corresponding neural network is trained repeatedly, and eight neural network models that conform to the demonstrator's behavior are finally obtained. Figure 6 shows the 2D neural network control graph for the state without obstacles and the 3D neural network control graphs for the single-obstacle and double-obstacle states.
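The repeated training of one network per obstacle state can be illustrated with a small example. The sketch below groups recorded demonstration samples by state label and fits a one-hidden-layer network to each group with plain steepest-descent back-propagation. The network size, learning rate, epoch count, input normalization, and the synthetic "demonstrator" rules are all assumptions made for illustration; the paper does not specify them, and its own experiments were run in MATLAB.

```python
import math
import random
from collections import defaultdict


class TinyMLP:
    """One-hidden-layer network (tanh hidden units, linear output) trained by BP."""

    def __init__(self, n_in, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def train_step(self, x, y, lr):
        """One steepest-descent update on the squared error 0.5 * (out - y)**2."""
        h = self._hidden(x)
        err = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2 - y
        for j, hj in enumerate(h):
            # Error propagated back to hidden unit j (uses the old output weight).
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err


def mean_error(net, samples):
    return sum(0.5 * (net.predict(x) - y) ** 2 for x, y in samples) / len(samples)


def train_per_state(demos, epochs=200, lr=0.05):
    """demos: list of (state_label, inputs, steering). Returns {state: TinyMLP}."""
    by_state = defaultdict(list)
    for state, x, y in demos:
        by_state[state].append((x, y))
    nets = {}
    for state, samples in by_state.items():
        net = TinyMLP(n_in=len(samples[0][0]))
        for _ in range(epochs):
            for x, y in samples:
                net.train_step(x, y, lr)
        nets[state] = net
    return nets


# Synthetic demonstration data (hypothetical demonstrator behavior).
rng = random.Random(1)
demos = []
for _ in range(50):
    a = rng.uniform(-1.0, 1.0)   # normalized angle to the target
    d = rng.uniform(0.1, 1.0)    # normalized distance to a front obstacle
    demos.append(("NNN", [a], 0.8 * a))                        # turn toward the target
    demos.append(("NEN", [d, a], 0.8 * a + 0.5 * (1.0 - d)))   # turn harder when close

nets = train_per_state(demos)
for state, net in nets.items():
    samples = [(x, y) for s, x, y in demos if s == state]
    print(state, "mean squared-error loss:", mean_error(net, samples))
```

Because each network sees only the samples of its own state, its input dimension can differ between states (one input for NNN, two for NEN in this sketch), which keeps every network small.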
[Figure 6 panels: a) NNN state; b) ENN state; c) NEN state; d) NNE state; e) EEN state; f) ENE state.]

3.3 Simulation and experimental platform testing

To test the performance of the self-navigation controller, the obstacles in the 3D virtual scene of the simulation platform are rearranged and two complicated test environments are designed. The simulation results are shown in Figure 7. After the performance of the demonstration learning control model has been tested, the subsystem is built on the two-wheeled mobile robot platform, an obstacle environment is arranged, and the demonstration experiment is completed on the experimental platform. As shown in Figure 8, the robot successfully avoids several obstacles and reaches the predetermined target point. The results on the experimental platform show that the self-navigation control strategy of the mobile robot based on batch demonstration learning has good navigation performance.

In view of the characteristics of demonstration learning for mobile robots, artificial neural networks are proposed to realize the robot's learning of the demonstrator's actions. Because using a single artificial neural network for complex navigation demonstration learning would make the network too complex or hinder its convergence, a structure of multiple neural networks is used. Each neural network is relatively simple and learns the actions for only one state of the mobile robot. While moving, the robot switches to the appropriate neural network whenever its working state changes, so that only one neural network is active at any time. Experiments show that a neural network with this structure converges faster. The learning model is simulated on the simulation platform, and the test results are good. Finally, the test results on the real robot show that the self-navigation control method based on batch demonstration learning is feasible.

References

1. C. Mericli, Manuela Veloso, H. Levent Akin. [A]. IEEE-RAS, (2010).
2. H. Bener Suay, Sonia Chernova. A comparison of two algorithms for robot learning from demonstration [C]. Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics. IEEE Press, (2011).
3. C. Sun, Wei He, Weiliang Ge, Cheng Chang. Adaptive Neural Network Control of Biped Robots [C]. Man and Cybernetics: Systems. IEEE Press, (2016).
4. H. Niu, Niu Wang, Nan Li. The adaptive control based on BP neural network identification for two-wheeled robot [C]. World congress on intelligent control and automation. IEEE Press, (2016).
5. Santiago Morante, Juan G. Victores, Carlos Balaguer. Automatic demonstration and feature selection for robot learning [C]. Humanoid Robot. IEEE-RAS, (2015).
6. Sonia Chernova, Manuela Veloso. Multi-thresholded approach to demonstration selection for interactive robot learning [C]. Proceedings of the 3rd ACM/IEEE International Conference on Human-Robot Interaction. IEEE Press, (2008).
7. S. Jia, Quan Qiu, Junmin Li, You Li, Yue Cong. BP neural network based localization for a front-wheel drive and differential steering mobile robot [C]. Information and Automation. IEEE, (2015).
8. Maria Koskinopoulou, Stylianos Piperakis, Panos Trahanias. Learning from Demonstration facilitates Human-Robot Collaborative task execution [C]. Human-Robot Interaction. ACM/IEEE, (2016).
9. Robotics and Autonomous Systems, (2012).



Yan-fei Li, Wen-zhi Zhang. Research on Learning from Demonstration of Mobile Robot with Autonomous Navigation, MATEC Web of Conferences, 2017, DOI: 10.1051/matecconf/201710402019