Online Detection of Abnormal Events in Video Streams
Hindawi Publishing Corporation
Journal of Electrical and Computer Engineering
Volume 2013, Article ID 837275, 12 pages
http://dx.doi.org/10.1155/2013/837275
Research Article
Tian Wang,1 Jie Chen,2 and Hichem Snoussi1
1 Institut Charles Delaunay, LM2S-UMR STMR 6279 CNRS, University of Technology of Troyes, 10004 Troyes, France
2 Observatoire de la Côte d’Azur, UMR 7293 CNRS, University of Nice Sophia-Antipolis, 06108 Nice, France
Correspondence should be addressed to Tian Wang;
Received 19 September 2013; Accepted 12 November 2013
Academic Editor: Yi Zhou
Copyright © 2013 Tian Wang et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
We propose an algorithm for detecting abnormal events, a challenging but important problem in
video surveillance. The algorithm consists of an image descriptor and an online nonlinear classification method. We introduce the
covariance matrix of the optical flow and image intensity as a descriptor encoding motion information. The nonlinear online
support vector machine (SVM) first learns a limited set of training frames to provide a basic reference model and then updates the
model and detects abnormal events in the current frame. Finally, we apply the method to a benchmark
video surveillance dataset to demonstrate the effectiveness of the proposed technique.
1. Introduction
Visual surveillance is one of the major research areas in
computer vision. In crowd image analysis, one of the
scientific challenges is the detection of abnormal events. For
instance, Figure 1(a) illustrates a normal scene in which
people are walking. In Figure 1(b), all the people suddenly
start running in different directions. This dataset imitates panic-driven scenes.
Trajectory analysis of objects was described in [1–3]. The
moving object was labeled by a blob in consecutive frames,
and a trajectory was then produced. Deviation from the
learnt trajectories was defined as an abnormal event. Tracking-based
approaches are suitable for sparse scenes with few
objects, but the target might be lost due to occlusion.
In [4, 5], abnormality detection approaches using features that encode the motion, texture, and size of objects were
introduced. Local image regions in a video were analyzed by
a background subtraction method; a dynamic
Bayesian network (DBN) was then constructed to model normal
and abnormal behavior, and finally a likelihood ratio test
was applied to detect abnormal behaviors. In [6], a space-time Markov random field (MRF) model for detecting
abnormal activities in a video was proposed, in which a mixture of
probabilistic principal component analyzers (MPPCA) was
adopted to model local optical flow. These predictions rely
on probabilistic techniques that assume an accurate
model exists. In many situations, however, a robust
and tractable model cannot be obtained, so model-free methods
need to be studied.
Spatiotemporal motion features described in the context
of bags of video words have also been adopted to detect abnormal
events. In [7], the authors presented an algorithm which
monitored optical flow at a set of fixed spatial positions
and constructed a histogram of optical flow. The likelihood
of the behavior in a newly arriving frame with respect to the
probability distribution of the statistically learned behavior
was computed. If the likelihood fell below a preset threshold,
the behavior was considered abnormal. In [8], irregular
behavior in images or videos was detected by an inference
process in a probabilistic graphical model. In [9, 10], the
video pixels were densely sampled to form the feature. These
methods are based on partial information of images, such
as small blocks in a frame, without fully exploiting the global
information of the feature. In [11–13], spatiotemporal features
modeled motion regions of the frame as background, and
anomalies were detected by subtracting the new sample from the
background template. These works are similar to change
detection methods applied when the background is not stable.
In this paper, the proposed algorithm is composed of two
parts. Firstly, a covariance feature descriptor is constructed
over the whole video frame, and then a nonlinear one-class
support vector machine algorithm is applied in an online
fashion to detect abnormal events. The features
are extracted from the optical flow, which represents the
movement information. Experiments on a real surveillance
video dataset show that our online abnormal detection techniques obtain satisfactory performance. The rest of the
paper is organized as follows. In Section 2, the covariance matrix
descriptor of motion features is introduced. In Section 3, the
online one-class SVM classification method is presented. In
Section 4, two abnormal detection strategies based on the online
nonlinear one-class SVM are proposed. In Section 5, we
present results on real-world video scenes. Finally, Section 6
concludes the paper.
2. Covariance Descriptor of Frame Behavior
The optical flow is a feature which represents the direction
and the amplitude of a movement. It can provide important
information about the spatial arrangement of the objects and
the rate of change of this arrangement [14]. We adopt the Horn-Schunck (HS) optical flow computation method in our work.
The optical flow of the gray scale image is formulated as the
minimizer of the following global energy functional:
$$E = \iint \left[ \left( I_x u + I_y v + I_t \right)^2 + \alpha^2 \left( \|\nabla u\|^2 + \|\nabla v\|^2 \right) \right] dx\,dy, \qquad (1)$$
where 𝐼 is the intensity of the image, 𝐼𝑥 , 𝐼𝑦 , and 𝐼𝑡 are the
derivatives of the image intensity value along the 𝑥, 𝑦, and
time 𝑡 dimensions, 𝑢 and 𝑣 are the components of the optical
flow in the horizontal and vertical directions, and 𝛼 represents
the weight of the regularization term.
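To make the minimization of (1) concrete, the following is a minimal sketch of the classical iterative Horn-Schunck update, written in plain Python for small frames. It is an illustration under stated assumptions and not the implementation used in this work: the function name horn_schunck, the parameters alpha and n_iter, the 4-neighbour averaging, the simple finite-difference derivatives, and the replicated borders are all choices made here for brevity.

```python
def horn_schunck(frame1, frame2, alpha=1.0, n_iter=100):
    """Estimate dense optical flow (u, v) between two grayscale frames.

    frame1, frame2: equally sized lists of rows of floats (hypothetical input format).
    alpha: weight of the regularization term in Eq. (1); must be > 0.
    n_iter: number of Jacobi-style iterations (assumed value, not from the paper).
    """
    h, w = len(frame1), len(frame1[0])

    def at(img, y, x):
        # Replicate border pixels so every neighbour lookup is valid.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # Derivative estimates: central differences in space, forward difference in time.
    ix = [[(at(frame1, y, x + 1) - at(frame1, y, x - 1)) / 2.0
           for x in range(w)] for y in range(h)]
    iy = [[(at(frame1, y + 1, x) - at(frame1, y - 1, x)) / 2.0
           for x in range(w)] for y in range(h)]
    it = [[frame2[y][x] - frame1[y][x] for x in range(w)] for y in range(h)]

    u = [[0.0] * w for _ in range(h)]
    v = [[0.0] * w for _ in range(h)]
    for _ in range(n_iter):
        u_new = [[0.0] * w for _ in range(h)]
        v_new = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                # Local flow averages (simplified 4-neighbour mean).
                u_bar = (at(u, y - 1, x) + at(u, y + 1, x)
                         + at(u, y, x - 1) + at(u, y, x + 1)) / 4.0
                v_bar = (at(v, y - 1, x) + at(v, y + 1, x)
                         + at(v, y, x - 1) + at(v, y, x + 1)) / 4.0
                # Brightness-constancy residual evaluated at the averaged flow.
                resid = ix[y][x] * u_bar + iy[y][x] * v_bar + it[y][x]
                denom = alpha ** 2 + ix[y][x] ** 2 + iy[y][x] ** 2
                u_new[y][x] = u_bar - ix[y][x] * resid / denom
                v_new[y][x] = v_bar - iy[y][x] * resid / denom
        u, v = u_new, v_new
    return u, v


# Toy example: a horizontal intensity ramp shifted one pixel to the right.
frame1 = [[float(x) for x in range(8)] for _ in range(8)]
frame2 = [[float(x - 1) for x in range(8)] for _ in range(8)]
u, v = horn_schunck(frame1, frame2)
```

In this toy example the estimated horizontal component is positive and the vertical component stays at zero, as expected for a purely horizontal shift. A larger alpha gives a smoother flow field at the cost of fidelity to the brightness-constancy term.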
We introduce the covariance matrix encoding the optical
flow and intensity of each frame as the descriptor to represent
the movement. The covariance feature descriptor was originally
proposed by Tuzel et al. [15] for pattern matching in a target
tracking problem. The descriptor is defined as
$$F(x, y, i) = \phi_i(I, x, y), \qquad (2)$$
where 𝐼 is the color information of an image (which can be
gray, RGB, HSV, HLS, etc.), 𝜙𝑖 is a mapping relating the image
with the 𝑖th feature from the image, 𝐹 is the 𝑊 × 𝐻 × 𝑑
dimensional feature extracted from image 𝐼, 𝑊 and 𝐻 are the
image width and image height, and 𝑑 is the number of chosen
features. For each frame, the feature can be represented as a
𝑑 × 𝑑 covariance matrix:
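$$\mathbf{C} = \frac{1}{n-1} \sum_{k=1}^{n} \left( \mathbf{z}_k - \boldsymbol{\mu} \right) \left( \mathbf{z}_k - \boldsymbol{\mu} \right)^{\top}, \qquad (3)$$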
where 𝑛 is the number of the pixels sampled in the frame,
z𝑘 is the feature vector of pixel 𝑘, 𝜇 is the mean of all the
selected points, and C is the covariance matrix of the feature
vector 𝐹. (...truncated)
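The covariance matrix in (3) can be sketched in a few lines of plain Python. This is an illustrative example only: the helper names frame_features and covariance_descriptor are hypothetical, and the per-pixel feature vector [intensity, u, v] is an assumed choice made here for illustration, since other feature combinations are possible.

```python
def frame_features(intensity, u, v):
    """Stack per-pixel features into a list of vectors z_k.

    Assumed feature choice for illustration: z_k = [I, u, v] at each pixel.
    All three inputs are equally sized lists of rows.
    """
    return [[intensity[y][x], u[y][x], v[y][x]]
            for y in range(len(intensity))
            for x in range(len(intensity[0]))]


def covariance_descriptor(features):
    """Return the d x d sample covariance matrix C of Eq. (3).

    features: list of n feature vectors z_k, each of length d (n >= 2).
    """
    n = len(features)
    if n < 2:
        raise ValueError("at least two feature vectors are required")
    d = len(features[0])
    # Mean feature vector mu over all sampled pixels.
    mu = [sum(z[i] for z in features) / n for i in range(d)]
    # Accumulate the outer products (z_k - mu)(z_k - mu)^T.
    c = [[0.0] * d for _ in range(d)]
    for z in features:
        diff = [z[i] - mu[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                c[i][j] += diff[i] * diff[j]
    # Normalize by n - 1, as in Eq. (3).
    return [[value / (n - 1) for value in row] for row in c]


# Small worked example with d = 2 features and n = 3 samples.
features = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
C = covariance_descriptor(features)
print(C)  # [[4.0, 8.0], [8.0, 16.0]]
```

Whatever the frame size, the descriptor has the fixed size 𝑑 × 𝑑, so frames of different resolutions map to matrices of the same dimension.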