A Multisource Heterogeneous Data Fusion Method for Pedestrian Tracking
Hindawi Publishing Corporation
Mathematical Problems in Engineering
Volume 2015, Article ID 150541, 10 pages
http://dx.doi.org/10.1155/2015/150541
Research Article
Zhenlian Shi, Yanfeng Sun, Linxin Xiong, Yongli Hu, and Baocai Yin
Beijing Key Laboratory of Multimedia and Intelligent Software Technology, College of Metropolitan Transportation,
Beijing University of Technology, Beijing 100124, China
Correspondence should be addressed to Yanfeng Sun;
Received 25 September 2014; Accepted 25 February 2015
Academic Editor: Shueei M. Lin
Copyright © 2015 Zhenlian Shi et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Traditional visual pedestrian tracking methods perform poorly when faced with problems such as occlusion, illumination changes,
and complex backgrounds. In principle, collecting more sensing information should resolve these issues. However, it is extremely
challenging to properly fuse different sensing information to achieve accurate tracking results. In this study, we develop a pedestrian
tracking method for fusing multisource heterogeneous sensing information, including video, RGB-D sequences, and inertial sensor
data. In our method, an RGB-D sequence is used to position the target locally by fusing the texture and depth features. The local
position is then used to eliminate the cumulative error resulting from the inertial sensor positioning. A camera calibration process
is used to map the inertial sensor position onto the video image plane, where the visual tracking position and the mapped position
are fused using a similarity feature to obtain accurate tracking results. Experiments using real scenarios show that the developed
method outperforms the existing tracking method, which uses only a single sensing dataset, and is robust to target occlusion,
illumination changes, and interference from similar textures or complex backgrounds.
1. Introduction
Visual pedestrian tracking is an important research subject in computer vision. Many object tracking methods for
visual pedestrian tracking have been developed recently.
Typical methods exploit time series information through
estimators such as the Kalman filter [1] and the particle filter [2].
Recently, sparse representation and compressed
sensing theories have also been introduced to represent
target features in visual tracking [3, 4]. However, traditional
visual tracking methods perform poorly in the presence
of challenges such as target occlusion, objects with similar
appearances, background changes, and illumination changes.
To solve these problems, researchers are increasingly fusing
various sensors for target tracking [5, 6]. These methods
can be generally divided into two types. The first type is
the passive method in which the target does not participate
in the tracking process. For example, Ros and Mekhnacha
developed a Bayesian occupancy filter for human tracking
by fusing different types of sensing information [7]. Kang
et al. developed a robust body tracking method by fusing
visual and thermal images [8]. The second type is the active
tracking method in which the target participates in the
tracking process via wearable or carry-on sensors, such as
MEMS inertial sensors, which can be used to locate the target
from acceleration and gyroscope sensing information. The
passive and active pedestrian tracking methods have their
respective advantages for different application scenarios. The
passive method is often used in public field surveillance
where the target is unaware of being tracked. The active
method can be used for self-navigation. However, some
applications demand robust tracking performance, such as
elder care, child protection, and criminal surveillance, for
which neither the passive nor the active tracking methods
produce satisfactory performance. For these cases, using
wearable sensors combined with passive visual sensors is
an ideal tracking solution. In this study, we developed a
pedestrian tracking method that integrates the passive and
active tracking methods. We used heterogeneous sensors,
including Android smartphones, Kinect cameras, and video
cameras, to implement pedestrian tracking and positioning
over a large range (see Figure 1). Different sensing data were
fused in the tracking procedure to obtain more robust results.
Figure 1: Multisensor pedestrian tracking scenario.
Figure 2: Multisource heterogeneous data fusion for pedestrian
tracking. Blocks shown: temporal and spatial aligning; sensor
positioning; depth positioning; sensor positioning with depth
correction; compressive visual tracking; coordinate projection in
sensor positioning; sensor positioning and visual tracking fusion.
In principle, using different sensors for tracking should
improve tracking efficiency and accuracy. However, realizing
these improvements raises several challenges, such as
aligning the different sensing data spatially and temporally,
developing data and feature representations for the different
types of data, and choosing a proper fusion method for
heterogeneous data. In this study, these issues were explored
using video, depth, and inertial sensing data for pedestrian
tracking. Figure 2 shows how multisource heterogeneous
sensing data were aligned, processed, represented, and fused
at different levels using different methods. The two primary
contributions of this study are as follows: (1) RGB-D data that
were captured by the depth sensor were used to eliminate
the cumulative error from the inertial sensor positioning,
which is a critical issue in using inertial sensors for long-term
tracking; and (2) the resulting inertial sensor positions were
fused with visual tracking results to handle target occlusion,
illumination changes, and other difficult problems in traditional visual tracking. We tested the developed method on a
multisource sensing platform that we constructed, using data
captured from real scenarios. The experimental
results showed that the developed method exhibited good
positioning and tracking performance.
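To illustrate why an occasional absolute position fix matters for contribution (1), the following minimal sketch integrates step lengths and headings by dead reckoning and optionally resets the estimate when a fix is available. It is not the correction scheme developed in this paper: the function name `dead_reckon`, the `fixes` dictionary standing in for depth-based positions, and the constant 0.05 m step-length bias are all hypothetical choices made for the example.

```python
import math


def dead_reckon(start, steps, fixes=None):
    """Integrate (step_length_m, heading_rad) pairs from a start position.

    `fixes` is a hypothetical dict mapping a step index to an absolute
    (x, y) position, standing in for a depth-camera position fix. When a
    fix exists for a step, the estimate is reset to it, which discards
    the error accumulated up to that step.
    """
    fixes = fixes or {}
    x, y = start
    track = []
    for i, (length, heading) in enumerate(steps):
        # Advance the position by one step along the current heading.
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        # Assumed correction rule: a hard reset to the external fix.
        if i in fixes:
            x, y = fixes[i]
        track.append((x, y))
    return track


# Assumed scenario: the true step length is 0.70 m, but the estimate is
# biased to 0.75 m, so every step adds 0.05 m of error along x.
steps = [(0.75, 0.0)] * 10
raw = dead_reckon((0.0, 0.0), steps)
# A single fix after the fifth step (true position 5 * 0.70 = 3.5 m).
corrected = dead_reckon((0.0, 0.0), steps, fixes={4: (3.5, 0.0)})

print("final x without fix:", raw[-1][0])
print("final x with one fix:", corrected[-1][0])
```

In this toy setting the true final position is 7.0 m. The uncorrected track ends 0.5 m away from it, and the single fix halves that error, because the drift grows with the number of steps taken since the last fix.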
The rest of the paper is structured as follows. In Section 2,
the inertial sensor positioning with depth correction is
presented. The fusion of inertial sensor positioning and
visual tracking is described in Section 3. The experimental
results are presented in Section 4. The paper is concluded in
Section 5.
2. Inertial Sensor Positioning with
Depth Correction
2.1. Target Positioning with Inertial Sensor. The inertial sensor
consisted of an acceleration sensor and a gyroscopic sensor.
The acceleration sensor was used to obtain a sequence of
changes in the three acceleration components. We determined the relationship between the step length and its duration from a set of sequence samples. We used the gyroscopic
s (...truncated)