MATEC Web of Conferences 336, 06002 (2021) CSCNS2020
https://doi.org/10.1051/matecconf/202133606002

Research on pedestrian detection algorithm in driverless urban traffic environment

Xinchao Liu1,*, Ying Yan1, and Haiyun Gan1
1 Tianjin University of Technology and Education, Tianjin 300222, China

Keywords: Unmanned, TidyYOLOv4, Pedestrian detection, Model pruning

Abstract. Pedestrian detection in urban traffic environments is an important area of driverless vehicle research. Because traffic flow is highly variable, a target detection algorithm often cannot extract complete feature information, which poses a great challenge for driverless pedestrian detection. The target detection algorithm YOLOv4 performs very well in general object detection, but it is less reliable at identifying partially occluded pedestrians. In this paper, Spatial Pyramid Pooling was added in front of the third YOLO detection head of YOLOv4 to improve the extraction of deep network features. Then, on the basis of the optimized network, a pruning strategy was adopted to simplify the detection algorithm; the result is called TidyYOLOv4. TidyYOLOv4 and YOLOv4 (with the network input image size set to 864×864) were compared on the self-made human head data set: total BFLOPS decreased by 95.04% and inference time decreased by 82.82%. These experimental results show that the optimized TidyYOLOv4 algorithm is more suitable for driverless pedestrian detection in urban traffic environments.

1 Introduction

With the progress of artificial intelligence, driverless vehicles have become one of the main research and development directions. Unmanned driving fuses several detection technologies, among which visual detection is one of the most important. Pedestrian detection on urban roads is the basic visual perception task for driverless cars in various traffic scenes.

* Corresponding author:
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).

If a driverless vehicle fails to detect a pedestrian on the road accurately, it may endanger the pedestrian's life and safety, so ensuring the accuracy of pedestrian detection is very important. With the progress of deep learning algorithms, road pedestrian detection has improved considerably, but it still falls short in practical applications. There are two main problems:

(1) Deep neural network vision algorithms need strong computing power and a large amount of memory. Currently, their detection performance is mainly tested and verified on servers, and they are difficult to store and run on the on-board chip.
(2) Complex traffic flow can prevent the target detection algorithm from extracting complete feature information (for example, when part of the pedestrian's body is blocked by other vehicles or traffic signs), so the target has to be identified from the partial information that is acquired.

To solve the problem that deep learning target detection algorithms cannot be deployed on the on-board chip of an unmanned vehicle, a pruning algorithm is developed to reduce the size of the detection model and its computational cost, so that the target detection algorithm can be reasonably deployed on such a chip.
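
As a rough illustration of why pruning reduces model size and computation, the sketch below ranks the output channels of a single convolution layer by an importance score and keeps only the strongest ones. The text does not state the pruning criterion at this point, so the use of the absolute batch-normalization scale factor as the score, the `prune_ratio` parameter, the layer dimensions and all function names are assumptions made for illustration, not the authors' method.

```python
def select_channels(scores, prune_ratio):
    """Return the sorted indices of the channels to keep.

    scores: one importance score per output channel (assumed here to be
            the batch-normalization scale factor of that channel).
    prune_ratio: fraction of channels to remove, in [0, 1).
    """
    n_keep = max(1, round(len(scores) * (1.0 - prune_ratio)))
    # Rank channels from most to least important by absolute score.
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    return sorted(ranked[:n_keep])


def conv_macs(out_channels, in_channels, kernel, out_h, out_w):
    """Multiply-accumulate operations of one square-kernel convolution layer."""
    return out_channels * in_channels * kernel * kernel * out_h * out_w


def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Hypothetical layer: 8 output channels, 4 input channels, 3x3 kernel,
# 27x27 output feature map. The scores are made-up values.
scores = [0.9, 0.01, 0.5, 0.02, 0.7, 0.03, 0.6, 0.04]
keep = select_channels(scores, prune_ratio=0.5)

macs_before = conv_macs(len(scores), 4, 3, 27, 27)
macs_after = conv_macs(len(keep), 4, 3, 27, 27)
reduction = percent_reduction(macs_before, macs_after)
```

In a full network, removing an output channel of one layer also removes the matching input channel of the next layer, so the savings compound across layers. The same `percent_reduction` arithmetic applies to any before/after comparison of operation counts or inference time.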
To improve the detection of occluded pedestrians, a data set containing occluded pedestrians is needed to validate the performance of the improved algorithm. Because the features of the legs, hands and body are highly uncertain, the head, which is highly recognizable, was chosen as the annotation object; besides having distinctive features, the head is relatively unlikely to be occluded on the road. Such annotation does not exist in the common open source pedestrian data sets, so we built a head annotation data set in which the human body is partially obscured. The experimental results show that TidyYOLOv4, the algorithm optimized on this data set, is more suitable than YOLOv4[1] for the detection of pedestrians by driverless vehicles in urban traffic environments.

2 Related work

Machine vision tasks are mainly divided into two categories: (1) classifying the element information in the image; (2) locating the object information in the image. Target detection is the fusion of classification and localization. The initial target detection algorithms mainly extracted target information from the image through a sliding window and then analyzed the position and class of the target, but the results were not satisfactory. The advent of R-CNN aroused the interest of a large number of researchers, and target detection became one of the hot research areas in the field of vision. Many excellent target detection algorithms have since been developed, such as R-CNN[2], Fast R-CNN[3], R-FCN[4], SSD[5], YOLO[6], YOLOv2[7], YOLOv3[8], YOLOv4[1], etc. These deep target detection algorithms are mainly divided into two categories according to their network architectures. One is the two-stage target detector represented by R-CNN and Fast R-CNN, which is composed of three major modules: the region proposal module, the backbone network and the detection head.
First, the region proposal module of the two-stage target detector generates proposals for regions of interest, and the detection head classifies the content of these proposals. Finally, position regression is carried out to locate the target object accurately. The two-stage target detector achieves excellent detection accuracy through region proposals, but it consumes a large amount of computing power and memory, which makes real-time detection slow. In the other category, the single-stage target detector, represented by the YOLO series and SSD, places k prior boxes at each position of the feature map so that they densely cover the image, and it uses no branch network like the region proposal module. Therefore, the single-stage detector is faster than the two-stage detector at inference. Among single-stage target detectors, the YOLOv4 algorithm offers excellent detection speed and advanced detection accuracy. Therefore, in this study, YOLOv4 was selected as the base model for pruning. Combined with the pruning strategy, a more efficient target detection model, TidyYOLOv4, was obtained to improve the real-time detection of pedestrians by driverless vehicles in urban traffic.

3 (...truncated)