Computer vision based obstacle detection and target tracking for autonomous vehicles

MATEC Web of Conferences 336, 07004 (2021) CSCNS2020
https://doi.org/10.1051/matecconf/202133607004

Ruoyu Fang¹ and Cheng Cai¹,*
¹ School of Electronic Information Engineering, Shanghai Dianji University, Shanghai, China

Keywords: Target tracking, obstacle detection, deep learning, PID

Abstract. Obstacle detection and target tracking are two major issues for intelligent autonomous vehicles. This paper proposes a new scheme for target tracking and real-time obstacle detection based on computer vision. A ResNet-18 deep neural network is used for obstacle detection and a Yolo-v3 deep neural network is used for real-time target tracking. The two trained models can be deployed on an autonomous vehicle equipped with an NVIDIA Jetson Nano motherboard. Using its camera, the vehicle avoids obstacles and follows the tracked target. Adjusting the steering and motion of the vehicle with the PID algorithm while it moves helps the vehicle achieve stable and precise tracking.

1 Introduction

Target tracking technology is now applied in various fields, usually with the device following the movement of a person. Today, autonomous obstacle avoidance and target tracking are mainly based on laser ranging radar. This paper proposes a method based on computer vision to achieve obstacle avoidance and target tracking. We use an autonomous vehicle equipped with a drive board and a power board, and program the NVIDIA Jetson Nano motherboard to drive the vehicle. The neural network framework runs on the Jetson Nano in Python. We train a ResNet-18 binary classifier to identify obstacles, train the Yolo-v3 model on the COCO data set, and set the tracking label.
After setting the tracking label, we use the PID algorithm to adjust the rotation and driving of the autonomous vehicle so that it can track the target accurately.

* Corresponding author:
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).

2 Implementation

2.1 Deploy device

The Jetson Nano is installed on an autonomous vehicle equipped with two motors and a battery. A wireless network card is installed so that we can control the movement of the vehicle remotely from a computer. We can then program the vehicle's movement and deploy the neural network framework.

2.2 Avoid obstacles

Traditional autonomous obstacle avoidance is realized with laser ranging radar. This paper instead uses a ResNet-18 binary classification model. Its skip connections reduce the information lost during layer-by-layer feature extraction and substantially mitigate vanishing and exploding gradients when training deep networks. The structure of each residual block is shown in Fig. 1. The residual unit can be expressed as:

$$x_{i+1} = f\left(F(x_i, W_i) + h(x_i)\right) \qquad (1)$$

where $x_i$, $x_{i+1}$ and $W_i$ are the input, output and weight matrix of the i-th residual unit, respectively; $f(\cdot)$ is the ReLU nonlinear activation function; $h(\cdot)$ is the direct mapping part; and $F(\cdot)$ is the residual mapping part.

Fig. 1. Residual block and ResNet-18 model; panels (a), (b).

This experiment uses the Raspberry Pi CSI camera. We initialize the camera on the workstation to collect the required data set.
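As an illustration of Eq. (1), the following minimal sketch evaluates one residual unit on a plain vector. It is not the paper's implementation: the helper names (`relu`, `matvec`, `residual_unit`) are hypothetical, the residual mapping F is assumed to be two dense layers with weights `W1` and `W2` (together standing in for W_i) rather than the convolutions used in ResNet-18, and the direct mapping h is assumed to be the identity.

```python
def relu(v):
    """Element-wise ReLU, the activation f in Eq. (1)."""
    return [max(0.0, x) for x in v]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def residual_unit(x, W1, W2):
    """Compute x_{i+1} = f(F(x_i, W_i) + h(x_i)).

    Assumptions for illustration only: F is two dense layers with a
    ReLU in between, and h is the identity shortcut.
    """
    residual = matvec(W2, relu(matvec(W1, x)))  # F(x_i, W_i)
    shortcut = x                                # h(x_i) = x_i
    return relu([r + s for r, s in zip(residual, shortcut)])


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(residual_unit([1.0, -2.0], identity, identity))
```

With all-zero weights the unit reduces to `relu(x)`, which shows how the shortcut lets the input pass through even when the residual branch contributes nothing.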
The data set is divided into two categories: Free, meaning there are no obstacles, and Block, meaning the car must avoid an obstacle. We collect 200 pictures for each category. We use cartons and cabinets as obstacles, each of a different size and height.

Fig. 2. Block dataset and Free dataset; panels (a), (b).

We use transfer learning: starting from the weights of a pre-trained model, a small data set is enough to achieve high accuracy. This paper uses the ResNet-18 network for transfer learning. The collected data sets are imported into the model for training, and the trained model is used to predict obstacles.

Fig. 3. Best accuracy.

The trained model classifies each real-time image collected by the camera, and a threshold is set on the probability of encountering an obstacle. If the predicted probability of an obstacle exceeds the threshold, the autonomous vehicle rotates to avoid it. In this way the car avoids obstacles autonomously.

2.3 Target tracking

We need a target detection algorithm to detect the object we want to track, so that the autonomous vehicle can then follow it. We choose the Yolo-v3 model for target detection. Yolo-v3 is a classic one-stage deep learning detection model. It is fast and accurate, which meets the requirements for the detection algorithm. Yolo-v3 uses Darknet-53 as its backbone network and uses non-maximum suppression to remove redundant anchor boxes. This paper uses the COCO data set to train the model.

Fig. 4. Yolo-v3 structure.

We set people as the tracking target. Through the camera, we can see the target detected by the autonomous vehicle, and the frame around the target object becomes green.
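The non-maximum suppression step mentioned for Yolo-v3 can be sketched as below. This is a generic greedy NMS written for illustration, not the code used in the paper: the function names, the `(x1, y1, x2, y2)` box format and the IoU threshold of 0.5 are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap
    an already kept box by more than iou_threshold, and repeat.
    Returns the indices of the kept boxes, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```

In a multi-class detector this filtering is normally applied per class; for the single "person" label tracked here, one pass over the person detections would be enough under these assumptions.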
The autonomous vehicle turns towards the target and moves to it. In Fig.6, the target is detected accurately after the target tag is set for tracking, and it can still be detected even under partial occlusion.

Fig. 5. Target detected; panels (a)-(d).

2.4 PID algorithm

We calculate the central coordinates of each detected target and select the target closest to the centre of the image. We then try to bring this target as close as possible to the centre of the field of view and guide the autonomous vehicle towards it. To keep the target in the centre of the field of view, we first set the autonomous vehicle to adjust by a fixed angle. With this approach, however, the robot cannot track the target accurately: the rotation step is too large, which can easily cause the target to leave the vehicle's field of view. The robot swings back and forth and cannot move smoothly to the target. We therefore introduce the PID algorithm to make the robot track the object more smoothly. The PID algorithm is proportional, integral, and deriva (...truncated)
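The target selection and the PID idea introduced in Section 2.4 can be sketched as follows. This is a textbook discrete PID controller under stated assumptions, not the authors' controller: the class and function names, the `(x1, y1, x2, y2)` box format, the normalised horizontal error and the gain values in the example are all hypothetical.

```python
class PID:
    """Textbook discrete PID: u = kp*e + ki*sum(e*dt) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the control output for the current error (dt must be > 0)."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def closest_to_centre(boxes, width, height):
    """Index of the box whose centre is nearest the image centre."""
    cx, cy = width / 2.0, height / 2.0

    def dist2(b):
        bx, by = (b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0
        return (bx - cx) ** 2 + (by - cy) ** 2

    return min(range(len(boxes)), key=lambda i: dist2(boxes[i]))


def horizontal_error(box, width):
    """Horizontal offset of the box centre from the image centre,
    scaled to [-1, 1]; positive means the target is to the right."""
    return ((box[0] + box[2]) / 2.0 - width / 2.0) / (width / 2.0)


if __name__ == "__main__":
    boxes = [(0, 0, 20, 20), (200, 0, 300, 100)]
    target = boxes[closest_to_centre(boxes, 300, 200)]
    pid = PID(kp=0.8, ki=0.0, kd=0.1)  # illustrative gains only
    steering = pid.update(horizontal_error(target, 300), dt=0.1)
    print(steering)
```

Unlike the fixed-angle adjustment, the steering command here shrinks as the target approaches the image centre, which is the property that damps the back-and-forth swinging described above.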