Improved Behavior Monitoring and Classification Using Cues Parameters Extraction from Camera Array Images
Regular Issue
Ahmad Jalal, Shaharyar Kamal*
Department of Computer Science and Engineering, Air University, Islamabad (Pakistan)
Received 26 February 2018 | Accepted 6 July 2018 | Published 20 July 2018
Abstract
Behavior monitoring and classification is a mechanism used to automatically identify or verify individuals
through human detection, tracking and behavior recognition from video sequences captured by a depth
camera. In this paper, we design a system that precisely classifies the nature of 3D body postures obtained
by Kinect using an advanced recognizer. We propose novel features that are suitable for depth data. These
features are robust to noise, invariant to translation and scaling, and capable of monitoring fast movements of
human body parts. Lastly, an advanced hidden Markov model is used to recognize different activities. In
extensive experiments, our system consistently outperforms existing methods on three depth-based behavior
datasets, i.e., IM-DailyDepthActivity, MSRDailyActivity3D and MSRAction3D, in both posture classification
and behavior recognition. Moreover, our system handles rotation of the subject's body parts, self-occlusion and
missing body parts, which helps it track complex activities and improves the recognition rate. Because depth
cameras are easily accessible, low-cost and simple to deploy, the proposed system can be applied to various
consumer applications including patient-monitoring systems, automatic video surveillance, smart homes/offices
and 3D games.
Keywords
Activity Recognition, Body Posture Recognition System, Pattern Clustering, Smart Cities.
I. Introduction
IDENTIFICATION, monitoring, classification and recognition of
humans from behavior images is essential because it effectively
conveys a subject’s situation, identity, emotion, gait and gestures [1-4].
Still, human identification and monitoring is far from perfect under
various conditions such as position changes, illumination, orientation,
noise variations and dark areas [5-8]. In spite of the research
efforts and significant results of the past decade, the recognition accuracy
of human behavior remains a challenge because of self-occlusion
of human body parts, variation in body size and appearance, unclear
or hidden body parts behind objects and fast human movements in
indoor scenes [9, 10]. In addition, several researchers have mainly
focused on recognizing activities from videos captured by conventional
cameras, which are less effective due to complex backgrounds, light
sensitivity and motion ambiguities (i.e., color and texture variability)
[11-13]. Thus, by providing access to high-quality imaging and 3D motion, the
development of low-cost and easy-to-process depth cameras such as
Microsoft Kinect or Bumblebee has initiated a new era for a variety
of image recognition tasks including human behavior recognition
(BR) [14-16]. Depth images provide several opportunities to enhance
BR, such as additional body joint information, spatial continuity,
insensitivity to lighting conditions and control of overlapping
between different human body parts.
* Corresponding author.
E-mail address:
DOI: 10.9781/ijimai.2018.07.003
A large number of BR methods have been designed, and researchers
have carried out many comparative studies over depth videos [16-18]
to identify the best algorithms
for recognition. These methods mainly interact with depth data using
two different approaches: skeleton joints features and depth silhouette
features. For example, Oreifej and Liu [19] proposed a new descriptor
for behavior recognition using a histogram capturing the distribution
of the surface normal orientation in the 4D space of time, depth, and
spatial coordinates. To build the histogram, they created 4D projectors,
which quantize the 4D space and represent the possible directions
for the 4D normal. In [17], Yang et al. described an effective method
that projects depth maps onto three orthogonal planes and accumulates
global activities through entire video sequences to generate the Depth
Motion Maps (DMM). Histograms of Oriented Gradients (HOG) are
then computed to enhance the activity recognition results. In [20],
the authors proposed a behavior recognition system that uses motion
features, namely magnitude and directional angular features computed from
body joint information between consecutive frames, to recognize daily
routine human activities. In [21], the authors designed mid-level features
from Kinect skeletons by considering the orientations of human body
limbs, each connecting two skeleton joints, and encoding each orientation
into different states. They employed frequent pattern mining to pick
the most frequent feature values and relevant states of parts across
several consecutive frames, and to recognize different activities/actions.
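As an illustration of the joint-based motion features described for [20], the following minimal Python sketch computes a displacement magnitude and two direction angles for each joint between two consecutive frames. The function name, the (x, y, z) tuple layout and the choice of azimuth and elevation as the directional angles are assumptions made here for illustration; they are not taken from [20].

```python
import math


def joint_motion_features(prev_joints, curr_joints):
    """Per-joint motion magnitude and direction angles between two frames.

    Each argument is a list of (x, y, z) joint positions in the same order.
    Returns one (magnitude, azimuth, elevation) tuple per joint, with both
    angles in radians. The angle definitions are an illustrative assumption.
    """
    if len(prev_joints) != len(curr_joints):
        raise ValueError("both frames must contain the same joints")
    features = []
    for (x0, y0, z0), (x1, y1, z1) in zip(prev_joints, curr_joints):
        dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
        magnitude = math.sqrt(dx * dx + dy * dy + dz * dz)
        azimuth = math.atan2(dy, dx)  # direction within the x-y plane
        elevation = math.atan2(dz, math.hypot(dx, dy))  # direction along z
        features.append((magnitude, azimuth, elevation))
    return features


# Hypothetical two-joint skeleton observed in two consecutive frames.
frame_a = [(0.0, 0.0, 2.0), (0.5, 1.0, 2.0)]
frame_b = [(3.0, 4.0, 2.0), (0.5, 1.0, 3.0)]
features = joint_motion_features(frame_a, frame_b)
```

Here the first joint moves only within the x-y plane and the second only along the depth axis, so their elevation angles are 0 and pi/2 respectively.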
Although such methods perform well, each has drawbacks.
Methods that rely only on skeleton data become unreliable
for postures with self-occlusion. Methods that depend on
depth silhouette information suffer from low recognition accuracy,
especially with hidden or missing body parts, fast-moving human
silhouettes and a large distance between the subject and the source
(i.e., the depth camera). Therefore, we develop novel features along with
an advanced HMM to overcome the above-mentioned problems and
improve recognition accuracy.
International Journal of Interactive Multimedia and Artificial Intelligence, Vol. 5, Nº 5
In this paper, we propose a novel behavior recognition framework
based on cues parameters, which achieves improved accuracy over
existing algorithms. At the start of the BR framework, we handle the
noisy input postures and unclear background data by designing a set
of reliability measurements to extract true silhouettes and tracked joint
values. These reliable data are examined to extract the human silhouette by
considering spatial/temporal continuity, constraints on human motion
information and frame differencing. The data are further processed
into a feature representation built from cues parameters, including
angular direction, spatiotemporal velocity and invariant features,
which provide compact and sufficient feature values for better BR
performance. Finally, all feature values are mapped into codewords, and
each behavior is recognized via an advanced hidden Markov model (HMM).
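The codeword mapping and HMM-based recognition step can be sketched as follows. This is a minimal illustration that assumes a plain discrete HMM scored with the scaled forward algorithm, not the advanced HMM of this paper, whose details are not given at this point. The codebook, the two behavior models and all function names are hypothetical, and training of the codebook and of the HMM parameters is not shown.

```python
import math


def nearest_codeword(vector, codebook):
    """Index of the codebook entry closest to `vector` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((v - c) ** 2 for v, c in zip(vector, codebook[k])),
    )


def log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: log P(symbols | discrete HMM)."""
    if not symbols:
        raise ValueError("symbol sequence must not be empty")
    n = len(start)
    alpha = [start[i] * emit[i][symbols[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(symbols)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbols[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob


def classify(feature_vectors, codebook, models):
    """Quantize features to codewords and return the best-scoring behavior."""
    symbols = [nearest_codeword(v, codebook) for v in feature_vectors]
    return max(models, key=lambda name: log_likelihood(symbols, *models[name]))


# Hypothetical two-codeword codebook and two-state behavior models,
# each given as (start, transition, emission) probabilities.
codebook = [(0.0, 0.0), (1.0, 1.0)]
transitions = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "still": ([0.5, 0.5], transitions, [[0.9, 0.1], [0.8, 0.2]]),
    "moving": ([0.5, 0.5], transitions, [[0.1, 0.9], [0.2, 0.8]]),
}
sequence = [(0.9, 1.1), (1.2, 0.8), (0.1, 0.0), (1.0, 1.0)]
label = classify(sequence, codebook, models)
```

One model per behavior is scored against the same codeword sequence, and the behavior whose model gives the highest log-likelihood is selected; in this toy example the sequence is assigned to "moving".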
We evaluate our method according to the standard experimental
protocols defined for three challenging depth behavior datasets: IM-DailyDepthActivity, MSRDailyActivity3D and MSRAction3D. Our
experimental results show that the proposed method achieves
better recognition accuracy than the state-of-the-art methods. Since our
system is well-organized, affordable and easily installable, it
is the preferable solutio (...truncated)