A Multisource Heterogeneous Data Fusion Method for Pedestrian Tracking

Mathematical Problems in Engineering, Mar 2015

Traditional visual pedestrian tracking methods perform poorly when faced with problems such as occlusion, illumination changes, and complex backgrounds. In principle, collecting more sensing information should resolve these issues. However, it is extremely challenging to fuse different kinds of sensing information properly and obtain accurate tracking results. In this study, we develop a pedestrian tracking method that fuses multisource heterogeneous sensing information, including video, RGB-D sequences, and inertial sensor data. In our method, an RGB-D sequence is used to position the target locally by fusing texture and depth features. The local position is then used to eliminate the cumulative error of the inertial sensor positioning. A camera calibration process maps the inertial sensor position onto the video image plane, where the visual tracking position and the mapped position are fused using a similarity feature to obtain accurate tracking results. Experiments in real scenarios show that the developed method outperforms existing tracking methods that use only a single sensing dataset, and that it is robust to target occlusion, illumination changes, and interference from similar textures or complex backgrounds.



Hindawi Publishing Corporation
Mathematical Problems in Engineering
Volume 2015, Article ID 150541, 10 pages
http://dx.doi.org/10.1155/2015/150541

Research Article

Zhenlian Shi, Yanfeng Sun, Linxin Xiong, Yongli Hu, and Baocai Yin

Beijing Key Laboratory of Multimedia and Intelligent Software Technology, College of Metropolitan Transportation, Beijing University of Technology, Beijing 100124, China

Correspondence should be addressed to Yanfeng Sun.

Received 25 September 2014; Accepted 25 February 2015

Academic Editor: Shueei M. Lin

Copyright © 2015 Zhenlian Shi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Visual pedestrian tracking is an important research subject in computer vision, and many object tracking methods have been developed for it. Typical methods exploit time-series information through estimators such as the Kalman filter [1] and the particle filter [2]. More recently, sparse representation and compressed sensing theories have been introduced to represent target features in visual tracking [3, 4]. However, traditional visual tracking methods perform poorly in the presence of challenges such as target occlusion, objects with similar appearances, background changes, and illumination changes.

To solve these problems, researchers are increasingly fusing data from various sensors for target tracking [5, 6]. These methods can generally be divided into two types. The first type is the passive method, in which the target does not participate in the tracking process. For example, Ros and Mekhnacha developed a Bayesian occupancy filter for human tracking by fusing different types of sensing information [7]. Kang et al. developed a robust body tracking method by fusing visual and thermal images [8]. The second type is the active method, in which the target participates in the tracking process via wearable or carry-on sensors, such as MEMS inertial sensors, which can locate the target from acceleration and gyroscope readings.

The passive and active pedestrian tracking methods each have advantages in different application scenarios. The passive method is often used for surveillance of public areas, where the target is unaware of being tracked. The active method can be used for self-navigation.
However, some applications, such as elder care, child protection, and criminal surveillance, demand robust tracking performance that neither the passive nor the active method delivers on its own. For these cases, combining wearable sensors with passive visual sensors is an ideal tracking solution. In this study, we developed a pedestrian tracking method that integrates the passive and active tracking methods. We used heterogeneous sensors, including Android smartphones, Kinect cameras, and video cameras, to implement pedestrian tracking and positioning over a large range (see Figure 1). Different sensing data were fused in the tracking procedure to obtain more robust results.

Figure 1: Multisensor pedestrian tracking scenario.

In principle, using different sensors for tracking should improve tracking efficiency and accuracy. However, many challenges must be met to realize these improvements, such as the spatial and temporal alignment of different sensing data, the development of data and feature representations for different types of data, and the choice of a proper fusion method for heterogeneous data. In this study, these issues were explored using video, depth, and inertial sensing data for pedestrian tracking. Figure 2 shows how multisource heterogeneous sensing data were aligned, processed, represented, and fused at different levels using different methods.

Figure 2: Multisource heterogeneous data fusion for pedestrian tracking. (Block labels: temporal spatial aligning; sensor positioning; depth positioning; sensor positioning with depth correction; compressive visual tracking; coordinate projection in sensor positioning; sensor positioning and visual tracking fusion.)
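The temporal alignment challenge noted above can be illustrated with a minimal sketch. This excerpt does not describe how the authors align their streams, so the nearest-timestamp matching below is an assumption, as are the function name `align_nearest`, the `max_gap` tolerance, and the use of integer millisecond timestamps. It pairs each sample of a reference stream (for example, video frames) with the closest sample of another stream (for example, inertial readings) and rejects pairs that are too far apart in time.

```python
from bisect import bisect_left


def align_nearest(ref_times, other_times, max_gap):
    """Match each reference timestamp to the nearest timestamp in another stream.

    Both lists must be sorted in ascending order and use the same time unit.
    Returns, for every reference timestamp, the index of the closest sample in
    other_times, or None when the closest sample is more than max_gap away.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    matches = []
    for t in ref_times:
        i = bisect_left(other_times, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_times)]
        if not candidates:
            matches.append(None)
            continue
        best = min(candidates, key=lambda j: abs(other_times[j] - t))
        matches.append(best if abs(other_times[best] - t) <= max_gap else None)
    return matches


# Example: video frames every 40 ms matched against irregular inertial samples.
video_ms = [0, 40, 80, 120]
inertial_ms = [0, 20, 50, 70, 300]
pairs = align_nearest(video_ms, inertial_ms, max_gap=20)
```

A frame that receives `None` has no inertial sample close enough in time, so a fusion stage would need to fall back on a single source or interpolate for that frame.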
The two primary contributions of this study are as follows: (1) RGB-D data captured by the depth sensor were used to eliminate the cumulative error of the inertial sensor positioning, which is a critical issue when inertial sensors are used for long-term tracking; and (2) the resulting inertial sensor positions were fused with visual tracking results to handle target occlusion, illumination changes, and other difficult problems in traditional visual tracking. The developed method was tested by constructing a multisource sensing platform and verified using data captured in real scenarios. The experimental results showed that the developed method exhibited good positioning and tracking performance.

The rest of the paper is structured as follows. In Section 2, the inertial sensor positioning with depth correction is presented. The fusion of inertial sensor positioning and visual tracking is described in Section 3. The experimental results are presented in Section 4. The paper is concluded in Section 5.

2. Inertial Sensor Positioning with Depth Correction

2.1. Target Positioning with Inertial Sensor.

The inertial sensor consisted of an acceleration sensor and a gyroscopic sensor. The acceleration sensor was used to obtain a sequence of changes in the three acceleration components. We determined the relationship between the step length and its duration from a set of sequence samples. We used the gyroscopic s (...truncated)
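To make the idea of step-based inertial positioning with depth correction concrete, the following is a minimal dead-reckoning sketch under stated assumptions. The excerpt is truncated before the details, so none of this should be read as the authors' formulation: the linear step-length model and its coefficients `a` and `b` are hypothetical placeholders, the heading is assumed to come from accumulated gyroscope yaw changes, and the correction is modelled as a simple reset to a depth-derived position whenever one is available.

```python
import math


def step_length(duration_s, a=1.0, b=-0.5):
    """Hypothetical linear model: step length in metres from step duration.

    The coefficients are illustrative placeholders, not fitted values.
    """
    return a + b * duration_s


def dead_reckon(start, heading_rad, steps, depth_fixes=None, length_fn=step_length):
    """Accumulate a 2-D track from per-step (duration_s, yaw_change_rad) pairs.

    depth_fixes maps a step index to an (x, y) position measured independently
    (assumed here to come from the RGB-D sensor). When a fix exists for a step,
    the estimate is reset to it, discarding the drift accumulated so far.
    """
    depth_fixes = depth_fixes or {}
    x, y = start
    track = [(x, y)]
    for i, (duration_s, yaw_change_rad) in enumerate(steps):
        heading_rad += yaw_change_rad      # assumed: heading from gyroscope yaw
        d = length_fn(duration_s)          # step length from step duration
        x += d * math.cos(heading_rad)
        y += d * math.sin(heading_rad)
        if i in depth_fixes:               # depth correction: hard reset
            x, y = depth_fixes[i]
        track.append((x, y))
    return track


# Three straight steps of 1 m each, with a depth fix after the second step.
steps = [(0.5, 0.0)] * 3
raw = dead_reckon((0.0, 0.0), 0.0, steps, length_fn=lambda d: 1.0)
corrected = dead_reckon((0.0, 0.0), 0.0, steps,
                        depth_fixes={1: (2.5, 0.5)}, length_fn=lambda d: 1.0)
```

A hard reset is the simplest possible correction. A weighted blend of the inertial estimate and the depth fix would be a natural alternative when the depth measurement is itself noisy.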


This is a preview of a remote PDF: http://downloads.hindawi.com/journals/mpe/2015/150541.pdf
Article home page: https://www.hindawi.com/journals/mpe/2015/150541/

Zhenlian Shi, Yanfeng Sun, Linxin Xiong, Yongli Hu, Baocai Yin. A Multisource Heterogeneous Data Fusion Method for Pedestrian Tracking, Mathematical Problems in Engineering, vol. 2015, Article ID 150541, 2015. DOI: 10.1155/2015/150541