A New Feature Selection Method for Hyperspectral Image Classification Based on Simulated Annealing Genetic Algorithm and Choquet Fuzzy Integral
Hindawi Publishing Corporation
Mathematical Problems in Engineering
Volume 2013, Article ID 537268, 13 pages
http://dx.doi.org/10.1155/2013/537268
Research Article
Hongmin Gao, Lizhong Xu, Chenming Li, Aiye Shi, Fengchen Huang, and Zhenli Ma
College of Computer and Information Engineering, Hohai University, Nanjing 211100, China
Correspondence should be addressed to Lizhong Xu.
Received 1 June 2013; Revised 14 September 2013; Accepted 15 September 2013
Academic Editor: Gianluca Ranzi
Copyright © 2013 Hongmin Gao et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Hyperspectral remote sensing technology is a rapidly developing new integrated technology that is widely used in numerous areas.
Rich spectral information from hyperspectral images can aid in the classification and recognition of the ground objects. However,
the high dimensions of hyperspectral images cause redundancy in information. Hence, the high dimensions of hyperspectral data
must be reduced. This paper proposes a hybrid feature selection strategy based on the simulated annealing genetic algorithm
(SAGA) and the Choquet fuzzy integral (CFI). The band selection method builds on subspace decomposition and
combines the simulated annealing algorithm with the genetic algorithm to choose different cross-over and mutation probabilities,
as well as the individuals to be mutated. The selected bands are then further refined by CFI. Experimental results show that the proposed
method can achieve higher classification accuracy than traditional methods.
1. Introduction
Hyperspectral remote sensors provide measurements of the Earth’s surface with very high spectral resolution, usually resulting in hundreds of channels. Compared with multispectral sensors, this high spectral resolution makes hyperspectral remote sensors very powerful in applications requiring
the identification of subtle differences in ground covers (e.g.,
material quantification and target detection). On the other
hand, the large-dimensional data spaces generated by these
sensors introduce challenging methodological problems. In
the context of supervised classification, the most important
methodological issue raised by these sensors is the so-called
curse of dimensionality (also known as the Hughes effect)
that occurs when the numbers of features and of available
training samples are unbalanced [1].
Meanwhile, hyperspectral remote sensing images have
nonlinear properties. These nonlinear properties originate
from multiple scattering between photons and ground targets, from within-pixel spectral mixing, and from scene heterogeneity. In addition, given that the pixel size in most remote
sensing systems is sufficiently large to include different types
of land cover, classification errors arise and make the classification results unreliable. In this case, traditional classifiers
may fail completely.
In remote sensing literature, numerous methods have
been developed to solve the hyperspectral data classification
problem. A successful approach to hyperspectral data classification is based on the support vector machine (SVM).
SVM separates two classes by identifying the optimal
separating hyperplane, that is, the one that maximizes the margin between
the separating hyperplane and the closest training samples.
The training samples that lie on the margin are referred
to as support vectors and are used to define the decision
surface. The properties of SVM for both full-dimensional
and reduced-dimensional data have been investigated, while
multi-class SVM strategies have been considered in [2].
Hyperspectral image classification using different kernel-based approaches has been analyzed and compared in [3], and
SVM has been found to be more useful than the other kernel-based methods. SVM classification performance is
compared with well-known neural approaches in [4],
which showed that SVM provides simplicity, robustness,
and increased classification accuracy compared with neural
networks. In addition, some improved SVM methods have
been successfully used in hyperspectral image classification. A method called contextual SVM using
Hilbert space embedding showed significant improvement
over other methods on several hyperspectral images in [5]. A
semisupervised method for addressing a domain adaptation
problem based on multiple-kernel SVMs in the classification
of hyperspectral data was presented in [6]. Thus, SVM is
well suited to hyperspectral image classification. However,
these SVM-based approaches do not sufficiently address dimension reduction.
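To make the maximum-margin criterion described above concrete, the following minimal sketch computes the margin of a given hyperplane and the training samples that attain it. It does not train an SVM. The toy two-feature samples, the two candidate hyperplanes, and the function names are assumptions made for illustration only and do not come from the cited works.

```python
import math

def signed_distance(w, b, x, y):
    """Distance of sample x (label y in {-1, +1}) to the hyperplane w.x + b = 0.

    The value is positive when x lies on the side that matches its label.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin_and_support(w, b, samples, labels, tol=1e-9):
    """Return the margin (smallest signed distance) and the samples attaining it."""
    dists = [signed_distance(w, b, x, y) for x, y in zip(samples, labels)]
    margin = min(dists)
    support = [x for x, d in zip(samples, dists) if abs(d - margin) <= tol]
    return margin, support

# Hypothetical toy data: two features per sample, two classes.
samples = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 1.0)]
labels = [-1, -1, 1, 1]

# Candidate A: the hyperplane x1 = 2. Candidate B: the hyperplane x1 = 1.5.
margin_a, support_a = margin_and_support((1.0, 0.0), -2.0, samples, labels)
margin_b, support_b = margin_and_support((1.0, 0.0), -1.5, samples, labels)

print(margin_a, support_a)
print(margin_b, support_b)
```

Both candidates separate the two classes, but candidate A leaves the larger margin, so a maximum-margin classifier would prefer it. The samples returned for candidate A play the role of the support vectors.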
Commonly used dimension reduction methods fall into
two categories, namely, feature selection and feature extraction. Every band of hyperspectral data has its own
corresponding image. The feature extraction approach maps
a high-dimensional feature space to a low-dimensional space
via linear or nonlinear transformation, so the original
physical interpretation of the image cannot be retained.
Thus, feature extraction approaches are unsuitable for the
dimension reduction of hyperspectral images. Given that the
spectral distance between adjacent bands in the hyperspectral
data is only 10 nm and the correlation between them
is extremely high [7], considerable redundancy is observed,
which should be largely reduced by feature selection or
band selection methods to improve classification efficiency
and accuracy. A semisupervised feature-selection technique
for hyperspectral image classification was developed in [8].
A method for unsupervised band selection by transforming
the hyperspectral data into complex networks was presented
in [9]. In this paper, a new dimension reduction method is
proposed that combines the simulated annealing genetic
algorithm (SAGA) with the Choquet fuzzy integral (CFI).
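The redundancy between adjacent bands noted above can be illustrated with a minimal sketch that measures the Pearson correlation between bands and keeps a band only when it is not strongly correlated with the last band kept. This greedy filter is a swapped-in illustration of band redundancy, not the SAGA and CFI method of this paper. The toy pixel values, the 0.95 threshold, and the function names are assumptions.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def drop_redundant_bands(bands, threshold=0.95):
    """Return indices of bands kept by a greedy adjacent-correlation filter.

    A band is kept only if the absolute correlation with the most recently
    kept band is below the threshold (an assumed value, not from the paper).
    """
    kept = [0]
    for i in range(1, len(bands)):
        if abs(pearson(bands[kept[-1]], bands[i])) < threshold:
            kept.append(i)
    return kept

# Hypothetical data: each band is a list of values for the same five pixels.
bands = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.1, 2.9, 4.2, 5.0],  # nearly a copy of band 0
    [5.0, 1.0, 4.0, 2.0, 3.0],  # carries different information
    [5.1, 1.2, 3.9, 2.1, 3.0],  # nearly a copy of band 2
]

kept = drop_redundant_bands(bands)
print(kept)
```

On this toy input the filter keeps bands 0 and 2 and discards their near-duplicates. A filter of this kind ignores class labels entirely, which is one reason supervised search strategies such as GA-based selection are of interest.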
SAGA, a new genetic algorithm (GA) based on a population
and a temperature ladder, was recently proposed
to sample from a distribution defined on a space
of finite binary sequences. The feature selection strategy of
hyperspectral images based on GA and SVM was proposed
in [10, 11]. GA-based feature selection and local Fisher’s
discriminant analysis-based feature projection are combined
for effective dimensionality reduction in [12]. The SAGA
method, in contrast, works by simulating a parallel population of samples
with different temperatures. The population is updated via
selection, mutation, cross-over, and exchange operations
that are highly similar to those of GA. SAGA has the learning
capability of GA, as well as the fast-mixing capability of
parallel tempering (simulated tempering). In most cases,
only classification accuracy is used as the fitness function,
but internal relations between bands and classes (...truncated)