Research on pedestrian detection algorithm in driverless urban traffic environment
MATEC Web of Conferences 336, 06002 (2021)
CSCNS2020
https://doi.org/10.1051/matecconf/202133606002
Xinchao Liu1,*, Ying Yan1, and Haiyun Gan1
1 Tianjin University of Technology and Education, Tianjin 300222, China
Keywords: Unmanned, TidyYOLOv4, Pedestrian detection, Model pruning.
Abstract. Pedestrian detection in urban traffic environments is an
important field of driverless vehicle research. Because traffic flow is
highly variable, a target detection algorithm often cannot extract complete
feature information, which poses a great challenge to driverless pedestrian
detection. The target detection algorithm YOLOv4 performs very well in
object detection, but it is less reliable at identifying semi-blocked
(partially occluded) pedestrians. In this paper, Spatial Pyramid Pooling was
added in front of the third YOLO detection head module of YOLOv4 to optimize
the extraction of deep network features. Then, on the basis of the optimized
network, a pruning strategy was adopted to simplify the target detection
algorithm; the result is called TidyYOLOv4. TidyYOLOv4 and YOLOv4 (network
input image size set to 864×864) were compared on a self-made human head
data set. Relative to YOLOv4, the total BFLOPS of TidyYOLOv4 decreased by
95.04% and its inference time decreased by 82.82%. These experimental
results show that the optimized TidyYOLOv4 algorithm is more suitable for
driverless pedestrian detection in urban traffic environments.
1 Introduction
With the progress of artificial intelligence, driverless vehicles have become one of the
main directions of research and development. Unmanned driving fuses several detection
technologies, among which visual detection is one of the most important. Pedestrian
detection on urban roads is a basic task of visual perception for driverless cars in various
traffic scenes: if a driverless vehicle fails to detect a pedestrian in the road accurately, it
may endanger the pedestrian's life and safety. It is therefore very important to ensure the
accuracy of pedestrian detection. Advances in deep learning algorithms have improved road
pedestrian detection, but practical application still requires further improvement. There are
two main problems: (1) Deep neural network vision algorithms need substantial computing
power and memory. Currently, their detection performance is mainly tested and verified on
servers, and they are difficult to store and run on an on-board chip. (2) Complex traffic flow
can prevent the target detection algorithm from extracting complete feature information (for
example, part of a pedestrian's body is blocked by other vehicles or traffic signs), so the
algorithm must rely on the partial information it acquires to determine the characteristics of
the target.
* Corresponding author:
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons
Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).
To address the problem that deep learning target detection algorithms cannot be deployed
on the on-board chips of unmanned vehicles, a pruning algorithm is developed to reduce the
size of the target detection model and its computational cost, so that the model can
reasonably be deployed on such chips. To improve the detection of occluded pedestrians, a
data set containing occluded pedestrians is used to validate the performance of the improved
algorithm. Because the legs, hands and body of a pedestrian are highly uncertain in
appearance, the head is chosen as the annotated object: it is highly recognizable, has
distinctive features, and is relatively unlikely to be occluded on the road. Such annotation
does not exist in the common open-source pedestrian data sets, so we built a head-annotation
data set in which the human body is partially occluded. The experimental results on this
data set show that the optimized algorithm TidyYOLOv4 is more suitable than YOLOv4[1]
for pedestrian detection by driverless vehicles in urban traffic environments.
2 Related work
Machine vision tasks are mainly divided into two categories: (1) classifying the element
information in an image, and (2) locating the object information in an image. Target
detection combines classification and localization. Early target detection algorithms mainly
extracted target information from the image through a sliding window and then localized
and classified the target, but the results were not satisfactory. The advent of R-CNN then
aroused the interest of a large number of researchers, and target detection became one of the
hot research areas in the field of vision. Many more capable target detection algorithms have
since been developed, such as R-CNN[2], Fast R-CNN[3], R-FCN[4], SSD[5], YOLO[6],
YOLOv2[7], YOLOv3[8] and YOLOv4[1].
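The sliding-window idea mentioned above can be sketched as follows. This is only a minimal illustration, not the procedure of any cited detector: the function name `sliding_windows`, the 96×96 window and the stride of 48 pixels are hypothetical choices, and only the 864×864 image size is taken from this paper.

```python
from typing import Iterator, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def sliding_windows(image_w: int, image_h: int,
                    win_w: int, win_h: int, stride: int) -> Iterator[Box]:
    """Yield every win_w x win_h window that fits entirely inside the image."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    for y in range(0, image_h - win_h + 1, stride):
        for x in range(0, image_w - win_w + 1, stride):
            yield (x, y, x + win_w, y + win_h)


# Hypothetical settings: one window size at one scale on an 864x864 image.
windows = list(sliding_windows(864, 864, win_w=96, win_h=96, stride=48))
print(len(windows), windows[0], windows[-1])
```

In a sliding-window detector each of these windows would be passed to a classifier, and the scan would be repeated for several window sizes, so the number of classifier evaluations grows quickly with image size.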
These deep target detection algorithms are mainly divided into two categories according
to their network architectures. The first is the two-stage target detector, represented by
R-CNN and Fast R-CNN, which is composed of three major modules: a region proposal
module, a backbone network and a detection head. The region proposal module first
generates proposals for regions of interest, and the detection head classifies these
proposals. Finally, position regression is carried out to locate the target object accurately.
The two-stage target detector achieves excellent detection accuracy through region
proposals, but it consumes a large amount of computing power and memory, which makes
real-time target detection slow. The second category is the single-stage target detector,
represented by the YOLO series and SSD. It places k prior boxes at each position of the
feature map, so that the prior boxes densely cover the image, and it uses no branch network
similar to the region proposal module. The single-stage detector is therefore faster at
inference than the two-stage detector. Among single-stage target detectors, the YOLOv4
target detection algorithm offers excellent detection speed and advanced detection
accuracy. In this study, YOLOv4 was therefore selected as the base model for pruning.
Combined with a pruning strategy, a more efficient target detection model, TidyYOLOv4,
was obtained to improve the real-time detection of pedestrians by driverless vehicles in
urban traffic.
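The dense placement of k prior boxes on a feature map can be sketched as follows. This is a simplified illustration under stated assumptions, not the YOLOv4 or SSD implementation: the function name `prior_boxes`, the downsampling stride of 32 and the three box sizes are hypothetical; only the 864×864 input size comes from this paper.

```python
from typing import List, Sequence, Tuple

# A prior box is (center_x, center_y, width, height) in input-image pixels.
Box = Tuple[float, float, float, float]


def prior_boxes(grid_w: int, grid_h: int, stride: int,
                sizes: Sequence[Tuple[float, float]]) -> List[Box]:
    """Place len(sizes) prior boxes at the center of every feature-map cell."""
    boxes: List[Box] = []
    for row in range(grid_h):
        for col in range(grid_w):
            # Map the cell center back to input-image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for w, h in sizes:
                boxes.append((cx, cy, w, h))
    return boxes


# Assumed values: an 864x864 input downsampled by 32 gives a 27x27 grid,
# with k = 3 made-up box sizes per cell.
sizes = [(32.0, 64.0), (64.0, 128.0), (128.0, 256.0)]
boxes = prior_boxes(grid_w=27, grid_h=27, stride=32, sizes=sizes)
print(len(boxes), boxes[0], boxes[-1])
```

Because every box is generated directly from the grid, no separate region proposal pass is needed; the detection head only has to predict, for each prior box, a class score and an offset that refines the box.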
3 (...truncated)