Computer vision based obstacle detection and target tracking for autonomous vehicles
MATEC Web of Conferences 336, 07004 (2021)
CSCNS2020
https://doi.org/10.1051/matecconf/202133607004
Ruoyu Fang¹ and Cheng Cai¹,*
¹ School of Electronic Information Engineering, Shanghai Dianji University, Shanghai, China
Keywords: Target tracking, obstacle detection, deep learning, PID
Abstract. Obstacle detection and target tracking are two major issues for
intelligent autonomous vehicles. This paper proposes a new scheme that
achieves target tracking and real-time obstacle detection based on
computer vision. A ResNet-18 deep neural network is used for obstacle
detection, and a Yolo-v3 deep neural network is employed for real-time
target tracking. These two trained models can be deployed on an
autonomous vehicle equipped with an NVIDIA Jetson Nano motherboard.
Using its camera, the autonomous vehicle moves to avoid obstacles and
follow the tracked target. Adjusting the steering and movement of the
autonomous vehicle with the PID algorithm while it moves helps the
proposed vehicle achieve stable and precise tracking.
1 Introduction
Target tracking technology is currently applied in various fields, usually with the
device following the movement of a person. Today, autonomous obstacle avoidance and
target tracking are mainly based on laser ranging radar. This paper proposes a method
based on computer vision to achieve obstacle avoidance and target tracking.
We use an autonomous vehicle equipped with a drive board and a power board, and program
the NVIDIA Jetson Nano motherboard to drive the vehicle. The neural network framework
runs on the Jetson Nano in Python.
We train a ResNet-18 binary classification model to identify obstacles, and we train the
Yolo-v3 model on the COCO data set and set the tracking label. With the tracking label
set, we use the PID algorithm to adjust the rotation and driving of the autonomous
vehicle so that it can accurately track the target.
* Corresponding author:
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons
Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).
2 Implementation
2.1 Device deployment
The Jetson Nano is installed on an autonomous vehicle equipped with two motors and a battery.
A wireless network card is installed so that we can remotely control the movement of the
autonomous vehicle from a computer. We can then program the movement of the autonomous
vehicle and deploy the neural network framework.
2.2 Obstacle avoidance
Traditional autonomous obstacle avoidance is realized with laser ranging radar. This
paper instead uses a ResNet-18 binary classification model. Its skip connections reduce
the information loss in the layer-by-layer feature extraction of a deep learning model
and substantially mitigate the vanishing and exploding gradient problems that arise when
training deep networks. The structure of each residual block is shown in Fig. 1. The
residual unit can be expressed as:
$$x_{i+1} = f\big(F(x_i, W_i) + h(x_i)\big) \tag{1}$$
In the formula, $x_i$, $x_{i+1}$ and $W_i$ represent the input, output and weight matrix of the i-th
residual unit, respectively; $f(\cdot)$ is the ReLU nonlinear activation function; $h(\cdot)$ is the direct
mapping part; $F(\cdot)$ is the residual mapping part.
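As a rough illustration of Eq. (1), the following pure-Python sketch evaluates one residual unit on a small vector. It assumes, for brevity, that the residual mapping F is a single linear map and that h is the identity shortcut; a real ResNet-18 block stacks two convolutions with batch normalisation, so this is not the network used in the paper. The names `residual_unit`, `matvec` and `relu` are hypothetical.

```python
def relu(v):
    """Element-wise ReLU, the activation f in Eq. (1)."""
    return [max(0.0, x) for x in v]


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


def residual_unit(x, W, h=lambda v: v):
    """Compute x_{i+1} = f(F(x_i, W_i) + h(x_i)).

    Assumption: F is simplified to one linear map W @ x.
    h defaults to the identity shortcut.
    """
    residual = matvec(W, x)  # F(x_i, W_i)
    shortcut = h(x)          # h(x_i)
    return relu([r + s for r, s in zip(residual, shortcut)])


x = [1.0, -2.0]
# With zero weights the residual branch vanishes and the input passes
# through the shortcut unchanged (before the activation).
out_zero = residual_unit(x, [[0.0, 0.0], [0.0, 0.0]])
out = residual_unit(x, [[0.5, 0.0], [0.0, 0.5]])
```

The zero-weight case shows why the shortcut helps: even when the residual branch contributes nothing, the input still reaches the next unit.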
Fig. 1. Residual block and ResNet-18 model (panels (a) and (b)).
This experiment uses a Raspberry Pi CSI camera. We initialize the camera on the
workstation to collect the required data set. The data set is divided into two
categories: Free, which means there are no obstacles, and Block, which means that the
car needs to avoid an obstacle. We collect 200 pictures for each category. We use
cartons and cabinets of different sizes and heights as obstacles.
Fig. 2. Block dataset and Free dataset (panels (a) and (b)).
We use transfer learning: starting from the weights of a pre-trained model, a small data
set is enough to achieve high accuracy. This paper uses the ResNet-18 network for
transfer learning. The collected data set is fed into the model for training, and the
trained model is used to predict obstacles.
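The core idea of transfer learning, keeping a pre-trained feature extractor fixed and training only a new classification head on a small data set, can be sketched in pure Python. This is a toy under stated assumptions, not the paper's training code: `frozen_features` is a hypothetical stand-in for the pre-trained ResNet-18 backbone, the head is a plain logistic regression, and the four tiny "images" are made up for illustration.

```python
import math


def frozen_features(image):
    """Stand-in for a pre-trained backbone; it is never updated.

    Hypothetical features: mean brightness and brightness range of a
    tiny grey-scale image given as a flat list of values in [0, 1].
    """
    return [sum(image) / len(image), max(image) - min(image)]


def predict(head, image):
    """Probability of the Block class from the trainable head."""
    weights, bias = head
    z = sum(w * f for w, f in zip(weights, frozen_features(image))) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_head(images, labels, epochs=300, lr=0.5):
    """Fit only the head (logistic regression) by stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for image, label in zip(images, labels):
            # Gradient of the log-loss with respect to the pre-activation z.
            error = predict((weights, bias), image) - label
            feats = frozen_features(image)
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


# Made-up toy data: label 0 = Free (flat image), label 1 = Block (high contrast).
images = [[0.5, 0.5, 0.5, 0.5], [0.4, 0.5, 0.4, 0.5],
          [0.1, 0.9, 0.1, 0.9], [0.0, 1.0, 0.2, 0.8]]
labels = [0, 0, 1, 1]
head = train_head(images, labels)
predictions = [int(predict(head, img) > 0.5) for img in images]
```

Because only the two head weights and the bias are learned, very few labelled examples are needed, which is the property the paper relies on with its 200 pictures per category.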
Fig. 3. Best accuracy.
The model predicts on each real-time image collected by the camera, and a threshold is
set for the obstacle probability. If the predicted probability of an obstacle exceeds
the threshold, the autonomous vehicle rotates to avoid the obstacle. In this way the
car avoids obstacles autonomously.
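The threshold rule described above might look like the following minimal sketch. It assumes the classifier returns two raw scores (logits) ordered as (Block, Free) and that the threshold is 0.5; the ordering, the threshold value and the action names are illustrative assumptions, not values reported in the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def choose_action(logits, threshold=0.5):
    """Return the action and the Block probability.

    Assumption: logits are ordered (Block, Free).
    """
    p_block = softmax(logits)[0]
    action = "turn" if p_block > threshold else "forward"
    return action, p_block


action_a, p_a = choose_action([2.0, 0.0])    # obstacle likely
action_b, p_b = choose_action([-1.0, 1.0])   # path likely free
action_c, _ = choose_action([2.0, 0.0], threshold=0.9)  # stricter threshold
```

Raising the threshold makes the vehicle less eager to turn, at the cost of reacting later to real obstacles.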
2.3 Target tracking
We need a target detection algorithm to detect the object that we want to track, so
that the autonomous vehicle can then follow it. We choose the Yolo-v3 model for target
detection. Yolo-v3 is a classic one-stage deep learning detection model. It is fast and
accurate, which meets the requirements of this task. Yolo-v3 uses Darknet-53 as its
backbone network and uses non-maximum suppression to remove redundant bounding boxes.
This paper uses the COCO data set to train the model.
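Non-maximum suppression, mentioned above, can be sketched as a greedy loop: keep the highest-scoring box, discard every remaining box that overlaps it too much, and repeat. The sketch below assumes boxes in (x1, y1, x2, y2) pixel form and an IoU threshold of 0.5, and it ignores class labels for brevity (detectors usually apply the step per class). The function names and the sample boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (box, score) pairs.

    Returns the kept detections, highest score first.
    """
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop every box that overlaps the kept box too strongly.
        remaining = [d for d in remaining
                     if iou(best[0], d[0]) < iou_threshold]
    return kept


detections = [
    ((10, 10, 110, 210), 0.90),
    ((15, 12, 115, 205), 0.75),   # near-duplicate of the first box
    ((300, 40, 380, 200), 0.60),  # separate object
]
kept = nms(detections)
```

In this example the second box is suppressed because it overlaps the first almost entirely, while the third box survives.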
Fig. 4. YOLOV3 structure.
We set people as the tracking target. Through the camera, we can see the target detected
by the autonomous vehicle, and the frame around the target object turns green. The
autonomous vehicle turns toward the target and moves to it.
As Fig. 5 shows, the target is detected accurately after the tracking label is set, and
it can still be detected under partial occlusion.
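Setting a tracking label amounts to filtering the detector output by class. A minimal sketch follows, assuming each detection is a dictionary with `label`, `confidence` and `box` fields and that a confidence floor of 0.5 is applied; the field names, the floor and the sample detections are illustrative assumptions ("person" is one of the COCO classes).

```python
def select_targets(detections, target_label="person", min_confidence=0.5):
    """Keep only detections of the tracked class above a confidence floor."""
    return [d for d in detections
            if d["label"] == target_label and d["confidence"] >= min_confidence]


# Made-up detector output for one frame.
detections = [
    {"label": "person", "confidence": 0.92, "box": (200, 80, 320, 400)},
    {"label": "chair", "confidence": 0.88, "box": (400, 200, 520, 420)},
    {"label": "person", "confidence": 0.30, "box": (20, 90, 90, 300)},
]
targets = select_targets(detections)
```

Only the first detection remains: the chair has the wrong label and the second person falls below the confidence floor.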
Fig. 5. Target detected (panels (a) to (d)).
2.4 PID algorithm
By calculating the centre coordinates of each detected target, we select the target
closest to the centre of the image, try to bring it to the centre of the field of view,
and then guide the autonomous vehicle toward it. To keep the target in the centre of
the field of view, we first set the autonomous vehicle to turn by a fixed angle at each
adjustment. With this approach, however, the vehicle cannot track the target accurately, and the rotation range is too large,
which can easily lead to the loss of the target from the autonomous vehicle's field of
view. The vehicle swings back and forth and cannot move smoothly toward the target. We
therefore introduce the PID algorithm to make the vehicle track the target more smoothly.
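The following sketch illustrates how a PID controller could replace the fixed-angle adjustment. It uses the standard PID law (output = Kp·e + Ki·∫e dt + Kd·de/dt) with the normalised horizontal offset of the target from the image centre as the error e. The gains, the 640-pixel image width, the sample boxes and the toy vehicle model in `simulate` are all assumptions chosen for illustration, not values from the paper.

```python
class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        # No derivative on the first call, since there is no previous error.
        derivative = (0.0 if self.prev_error is None
                      else (error - self.prev_error) / dt)
        self.prev_error = error
        return (self.kp * error + self.ki * self.integral
                + self.kd * derivative)


def horizontal_error(box, image_width):
    """Offset of the box centre from the image centre, scaled to [-1, 1]."""
    cx = (box[0] + box[2]) / 2.0
    half = image_width / 2.0
    return (cx - half) / half


def closest_to_centre(boxes, image_width):
    """Pick the box whose centre is nearest the vertical centre line."""
    return min(boxes, key=lambda b: abs(horizontal_error(b, image_width)))


def simulate(pid, initial_error, steps=200, dt=0.1):
    """Toy plant (assumption): turning at rate `command` shrinks the
    offset at the same rate. Returns the error after every step."""
    error, history = initial_error, [initial_error]
    for _ in range(steps):
        command = pid.update(error, dt)
        error -= command * dt
        history.append(error)
    return history


boxes = [(480, 0, 640, 100), (300, 100, 380, 300)]  # (x1, y1, x2, y2)
target = closest_to_centre(boxes, image_width=640)
far_error = horizontal_error(boxes[0], image_width=640)
history = simulate(PID(kp=2.0, ki=0.5, kd=0.1), initial_error=far_error)
```

Unlike a fixed turning angle, the command here shrinks as the target approaches the centre, which is what suppresses the back-and-forth swinging described above. On a real vehicle the gains would need tuning.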
The PID algorithm is proportional, integral, and deriva (...truncated)