Human Depth Sensors-Based Activity Recognition Using Spatiotemporal Features and Hidden Markov Model for Smart Environments (pdf)

http://downloads.hindawi.com/journals/jcnc/2016/8087545.pdf

Hindawi Publishing Corporation
Journal of Computer Networks and Communications
Volume 2016, Article ID 8087545, 11 pages
http://dx.doi.org/10.1155/2016/8087545

Research Article

Human Depth Sensors-Based Activity Recognition Using Spatiotemporal Features and Hidden Markov Model for Smart Environments

Ahmad Jalal,1 Shaharyar Kamal,2 and Daijin Kim1
1 Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea
2 KyungHee University, Suwon, Republic of Korea

Correspondence should be addressed to Ahmad Jalal
Received 30 June 2016; Accepted 15 September 2016
Academic Editor: Liangtian Wan

Copyright © 2016 Ahmad Jalal et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Advances in depth imaging technologies have made human activity recognition (HAR) reliable without attaching optical markers or other motion sensors to human body parts. This study presents a depth imaging-based HAR system to monitor and recognize human activities. We propose a spatiotemporal features approach to detect, track, and recognize human silhouettes using a sequence of RGB-D images. Under the proposed HAR framework, the procedure includes detecting human depth silhouettes from the raw depth image sequence, removing background noise, and tracking human silhouettes using frame differentiation constraints of human motion information. Spatiotemporal features are then extracted from these depth silhouettes based on depth sequential history, motion identification, optical flow, and joints information. These features are processed by principal component analysis (PCA) for dimension reduction and better feature representation. Finally, these optimal features are used to train a hidden Markov model (HMM), which then recognizes the activity.
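The final recognition step named above can be illustrated with a small sketch. It is not the authors' implementation: it assumes that feature vectors have already been quantized into discrete symbols, that one HMM is kept per activity, and that the activity whose model gives the highest forward-algorithm likelihood is chosen. The activity labels and all model parameters below are hypothetical toy values.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM via the scaled forward algorithm."""
    n_states = len(start)
    # Forward variable for the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid numerical underflow on long sequences.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def recognize(obs, models):
    """Pick the activity whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical toy models: (start probabilities, transition matrix, emission matrix).
MODELS = {
    "activity_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.7, 0.3]]),
    "activity_b": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.3, 0.7]]),
}

print(recognize([0, 0, 1, 0, 0], MODELS))  # prints activity_a
```

In practice the model parameters would be estimated from training sequences (for example with Baum-Welch) rather than set by hand as they are here.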
In the experiments, we evaluate the proposed approach on three challenging depth video datasets: IM-DailyDepthActivity, MSRAction3D, and MSRDailyActivity3D. All experimental results show the superiority of the proposed approach over the state-of-the-art methods.

1. Introduction

Human tracking and activity recognition are defined as recognizing different activities by applying activity feature extraction and pattern recognition techniques to specific input data from innovative sensors (i.e., motion sensors and video cameras) [1–5]. In recent years, advances in these sensors have boosted the production of novel techniques for pervasive human tracking, observing human motion, detecting uncertain events [6–8], silhouette tracking, and emotion recognition in real-world environments [9–11]. The term most commonly used to cover all these topics is human tracking and activity recognition [12–14]. In motion sensor-based activity recognition, recognition relies on classifying sensory data from one or more sensor devices. In [15], Casale et al. presented a complete review of the state-of-the-art activity classification methods using data from one or more accelerometers. In that work, the classification approach is based on RF features and classifies five daily routine activities from a Bluetooth accelerometer placed on the chest, using a 319-dimensional feature vector. In [16], fast Fourier transform (FFT) features and a decision tree classifier are proposed to detect physical activity using biaxial accelerometers attached to different parts of the human body. However, these motion sensor-based approaches are not practical for recognition because users find it uncomfortable to wear electronic sensors in their daily life. Also, combining multiple sensors to improve recognition performance causes a high computational load.
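As a rough illustration of the frequency-domain features that accelerometer-based approaches such as [16] rely on, the sketch below computes the magnitude spectrum of one window of samples and reports its dominant frequency. It is an assumption-laden example, not the method of [16]: a direct discrete Fourier transform stands in for the FFT (same spectrum, slower), and the function names, window length, and sampling rate are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude spectrum of a real signal for bins 0..N/2, using a direct DFT."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        magnitudes.append(abs(coeff))
    return magnitudes


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC bin in the window."""
    magnitudes = dft_magnitudes(samples)
    # Skip bin 0 so a constant offset (e.g. gravity) does not dominate.
    peak_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return peak_bin * sample_rate_hz / len(samples)


# Hypothetical window: 2 s of a 2 Hz oscillation sampled at 32 Hz.
window = [math.sin(2 * math.pi * 2.0 * i / 32.0) for i in range(64)]
print(dominant_frequency(window, 32.0))  # prints 2.0
```

Features of this kind, computed per axis and per window, would then be passed to a classifier such as a decision tree.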
Thus, video-based human tracking and activity recognition is proposed, where the depth features are extracted from an RGB-D video camera. Depth silhouettes have made proactive contributions and are the most popular representation for human tracking and activity recognition, from which useful human shape features are extracted. Depth silhouettes raise open research issues and are used in practical applications including life-care systems, surveillance systems, security systems, face verification, patient monitoring systems, and human gait recognition systems.

Figure 1: System architecture of the proposed human activity recognition system. (Blocks shown: depth image sequence; preprocessing; background denoising techniques; segmentation and tracking; feature extraction; depth shape features; joints points features; feature vectors; training and recognized activity; recognizer engine; maximum likelihood; clustering techniques based on K-mean method.)

In [17], several algorithms are developed for feature extraction from the silhouette data of the tracked human subject using depth images as the pixel source. The extracted parameters include the ratio of height to weight of the tracked human subject. Motion characteristics and distance parameters are also used as features for activity recognition. In [14], a novel life-logging approach with translation- and scaling-invariant features is designed, in which 2D maps are computed through the Radon transform and further processed into 1D feature profiles through the R transform. These features are further reduced by PCA and symbolized by the Linde, Buzo, and Gray (LBG) clustering technique to train and recognize different activities. In [18], a discriminative representation is proposed in the form of structure-motion kinematics features, including the structure similarity and head-floor distance, based on skeleton joint point information.
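The symbolization step mentioned for [14] above, where reduced feature vectors are mapped to discrete symbols before HMM training, can be sketched as follows. This is a simplified stand-in rather than the LBG algorithm itself: it uses plain k-means with a deterministic initialization (the first k vectors), and the function names and toy data are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, vector):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)), key=lambda i: squared_distance(codebook[i], vector))


def train_codebook(vectors, k, iterations=20):
    """Plain k-means; LBG would grow the codebook by splitting instead."""
    # Assumption: initialise with the first k training vectors for determinism.
    codebook = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the previous codeword if a cluster is empty
                dim = len(members[0])
                codebook[i] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return codebook


def symbolize(codebook, sequence):
    """Replace each feature vector in a sequence by its codeword index."""
    return [nearest(codebook, v) for v in sequence]


# Hypothetical 2D feature vectors forming two well-separated groups.
training = [[0.0, 0.0], [10.0, 10.0], [0.2, 0.1], [9.8, 10.1], [0.1, 0.3], [10.2, 9.9]]
codebook = train_codebook(training, k=2)
print(symbolize(codebook, [[0.0, 0.1], [10.0, 10.0], [0.3, 0.2]]))  # prints [0, 1, 0]
```

The resulting symbol sequences are the kind of discrete observations a discrete HMM can be trained on.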
These trajectory projection-based kinematic schemes are learned by an SVM classifier to recognize activities using the depth maps. In [19], an activity recognition system is designed to provide continuous monitoring and recording of daily life activities. The system takes depth silhouettes as input to produce a skeleton model and its body point information. This information is computed as a set of magnitude and direction angle features, which are then used for training and testing via hidden Markov models (HMMs). These state-of-the-art methods [14, 17–19] demonstrated improved recognition accuracy using depth silhouettes. However, it is still difficult to find the best features from limited information such as joint points, especially during occlusions, which degrades recognition accuracy. Therefore, a methodology is needed that combines full-body silhouettes and joints information to improve activity recognition performance. In this paper, we propose a novel method to recognize activities using a sequence of depth images. During preprocessing steps, we extracted human depth silhouet (...truncated)