Gait Recognition Using Image Self-Similarity
EURASIP Journal on Applied Signal Processing 2004:4, 572–585
© 2004 Hindawi Publishing Corporation
Chiraz BenAbdelkader
Identix Corporation, One Exchange Place, Jersey City, NJ 07302, USA
Email:
Ross G. Cutler
Microsoft Research, One Microsoft Way, Redmond, WA 98052-6399, USA
Email:
Larry S. Davis
Department of Computer Science, University of Maryland, College Park, MD 20742, USA
Email:
Received 30 October 2002; Revised 18 May 2003
Gait is one of the few biometrics that can be measured at a distance, and is hence useful for passive surveillance as well as biometric
applications. Gait recognition research is still in its infancy, however, and we have yet to solve the fundamental issue of finding
gait features which at once have sufficient discrimination power and can be extracted robustly and accurately from low-resolution
video. This paper describes a novel gait recognition technique based on the image self-similarity of a walking person. We contend
that the similarity plot encodes a projection of gait dynamics. It is also correspondence-free, robust to segmentation noise, and
works well with low-resolution video. The method is tested on multiple data sets of varying sizes and degrees of difficulty. Performance is best for fronto-parallel viewpoints, whereby a recognition rate of 98% is achieved for a data set of 6 people, and 70% for
a data set of 54 people.
Keywords and phrases: gait recognition, human identification at a distance, human movement analysis, behavioral biometrics,
pattern recognition.
1. INTRODUCTION
1.1. Motivation
Gait is a relatively new and emergent behavioral biometric
[1, 2] that pertains to the use of an individual’s walking style
(or “the way he walks”) to determine identity. Gait recognition is the term typically used in the computer vision community to refer to the automatic extraction of visual cues that
characterize the motion of a walking person in video, and their
use for identification purposes. Gait is a particularly attractive modality for passive surveillance since, unlike most
biometrics, it can be measured at a distance, hence not requiring interaction with or cooperation of the subject. However, gait features exhibit a high degree of intraperson variability, being dependent on various physiological, psychological, and external factors such as footwear, clothing, surface
of walking, mood, illness, fatigue, and so forth. The question
then arises as to whether there is sufficient gait variability between people to discriminate them even in the presence
of large variation within each individual.
There is indeed strong evidence originating from psychophysical experiments [3, 4, 5] and gait analysis research
(a well-advanced multidisciplinary field that spans kinesiology, physiotherapy, orthopedic surgery, ergonomics, etc.)
[6, 7, 8, 9, 10] that gait dynamics contain a signature that is
characteristic of, and possibly unique to, each individual.
From a biomechanics standpoint, human gait consists of
synchronized, integrated movements of hundreds of muscles and joints of the body. These movements follow the
same basic bipedal pattern for all humans, and yet vary
from one individual to another in certain details (such as
their relative timing and magnitudes) as a function of their
entire musculo-skeletal structure, that is, body mass, limb
lengths, bone structure, and so forth. Because this structure is difficult to replicate, gait is believed to be unique to
each individual and can be completely characterized by a
few hundred kinematic parameters, namely, the angular velocities and accelerations at certain joints and body landmarks [6, 7]. Achieving such a complete characterization automatically from low-resolution video remains an open research problem in computer vision. The difficulty is that
feature detection and tracking are error prone due to self-occlusions, insufficient texture, and so forth. This is why
computer-aided motion analysis systems still rely on special
wearable instruments, such as LED markers, and walking
surfaces [9].
Luckily, we may not need to recover 3D kinematics for
gait recognition after all. In Johansson’s early psychophysical
experiments [3], human subjects were able to recognize the
type of movement solely by observing light bulbs attached
to a few joints of the moving person. The experiments were
filmed in total darkness so that only the bulbs, a.k.a. moving
light displays (MLDs), are visible. Similar experiments later
suggested that the identity of a familiar person (“a friend”)
[5], as well as the gender of the person [4], may be recognizable from their MLDs. While it is widely agreed that these experiments provide evidence about motion perception in humans, there is no consensus on how the human visual system
actually interprets these MLD-type stimuli. Two main theories
exist: the first maintains that people recover the 3D structure of the moving object (person) and subsequently use
it for recognition; the second theory states that motion information is directly used for recognition, without structure
recovery in the interim [11]. This seems to suggest that the
raw spatiotemporal (XYT) patterns generated by the person’s
motion in an MLD video encode information that is sufficient to recognize their movement.
In this paper, we describe a novel gait recognition
technique that derives classification features directly from
these XYT patterns. Specifically, it computes the image self-similarity plot (SSP), defined as the correlation of all pairs of
images in the sequence. Normalized feature vectors are extracted from the SSP and used for recognition. Related work
has demonstrated the effective use of SSPs in recognizing different types of biological periodic motions, such as those of
humans and dogs, and has applied the technique to human detection in video [12]. We use them here to classify the movement patterns of different people. We contend that the SSP
encodes a projection of planar gait dynamics and hence a
2D signature of gait. Whether it contains sufficient discriminant power for accurate recognition is what we set out to determine.
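To make the SSP definition above concrete, the following is a minimal sketch of computing a similarity value for every pair of frames in a sequence. It is an illustration under stated assumptions, not the implementation used in this paper: the similarity measure is taken to be the normalized (Pearson) correlation of pixel intensities, frames are assumed to be already segmented, aligned, and scaled to a common size, and the names normalized_correlation and self_similarity_plot are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation between two equal-length flattened images.

    Assumption: this is one possible reading of "correlation"; the
    measure actually used by the method may differ.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A constant image has zero variance; return 0.0 rather than divide by zero.
    return num / den if den > 0 else 0.0


def self_similarity_plot(frames):
    """Return the N x N matrix S with S[i][j] = correlation(frame i, frame j).

    Each frame is a list of pixel rows; all frames must have the same size.
    """
    flat = [[p for row in frame for p in row] for frame in frames]
    n = len(flat)
    ssp = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            c = normalized_correlation(flat[i], flat[j])
            ssp[i][j] = ssp[j][i] = c  # the plot is symmetric
    return ssp


# Toy periodic "sequence": two 2x2 poses that alternate with period 2.
frame_a = [[0, 1], [1, 0]]
frame_b = [[1, 0], [0, 1]]
frames = [frame_a, frame_b, frame_a, frame_b]
ssp = self_similarity_plot(frames)
```

In this toy sequence, frames one period apart are identical, so entries such as ssp[0][2] are maximal while entries between opposite poses, such as ssp[0][1], are minimal. This is the kind of periodic structure that a walking sequence would be expected to produce in the plot.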
As in any pattern recognition problem, gait recognition methods
typically consist of two stages: a feature extraction stage that
derives motion information from the image sequence and organizes it into some compact form (or representation), and
a recognition stage that applies some standard pattern classification technique to the obtained motion patterns, such as
K-nearest neighbor (KNN), support vector machines (SVM),
and hidden Markov models (HMM). In our view, the crux of
the gait recognition problem lies in perfecting the first stage.
The challenge is in finding motion patterns that are sufficiently discriminant despite the wide range of natural variability of gait, and that can be extracted reliably and consistently from video. The method of this paper is designed
with these two requirements in (...truncated)