Robustly building keypoint mappings with global information on multispectral images
Li et al. EURASIP Journal on Advances in Signal Processing (2015) 2015:53
DOI 10.1186/s13634-015-0240-z
RESEARCH
Open Access
Yong Li* , Hongbin Jin, Wei Qiao, Jing Jing and Hang Yu
Abstract
This paper proposes an approach to robustly build keypoint mappings on multispectral images. The distinctiveness
and repeatability of descriptors often decrease significantly on multispectral images, which makes the resulting keypoint
mappings unreliable. To compensate for this decrease, global information over entire images is introduced in this work
to evaluate keypoint mappings. Initial keypoint mappings are established using descriptors. A pair of keypoint mappings
determines a similarity transformation T, which is then evaluated with the global information, defined as the similarity
metric between the reference image and the test image transformed by T. An iterative process considers the pairs of
keypoint mappings and searches for the best matching reference keypoint for every test keypoint. Experimental results
show that the proposed approach can provide more reliable keypoint mappings than SIFT, ORB, FREAK, and ISS on
multispectral images.
Keywords: Multispectral imaging; Keypoint mappings; Global information
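As an illustration of the statement that a pair of keypoint mappings determines a similarity transformation T, the following minimal sketch recovers scale, rotation, and translation from two point correspondences. It is not taken from the paper: the function names and the complex-number formulation are assumptions made for this example, and points are assumed to be (x, y) tuples.

```python
import cmath

def similarity_from_two_mappings(p1, q1, p2, q2):
    """Hypothetical helper: similarity T with T(p1) = q1 and T(p2) = q2.

    p1, p2 are test keypoints and q1, q2 their mapped reference keypoints,
    each an (x, y) tuple. Returns (scale, angle_in_radians, tx, ty).
    """
    zp1, zp2 = complex(*p1), complex(*p2)
    zq1, zq2 = complex(*q1), complex(*q2)
    if zp1 == zp2:
        raise ValueError("the two test keypoints must be distinct")
    # A 2-D similarity is z -> a*z + b with complex a = s*exp(i*theta).
    a = (zq2 - zq1) / (zp2 - zp1)
    b = zq1 - a * zp1
    return abs(a), cmath.phase(a), b.real, b.imag

def apply_similarity(params, p):
    """Apply (scale, angle, tx, ty) to an (x, y) point."""
    s, theta, tx, ty = params
    z = s * cmath.exp(1j * theta) * complex(*p) + complex(tx, ty)
    return z.real, z.imag
```

Two mappings give four scalar equations, which matches the four unknowns of a similarity transformation (scale, angle, and two translation components), so no optimization is needed for this step.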
1 Introduction
Multispectral imaging has been widely used in applications such as natural disaster monitoring
and battlefield surveillance. Fusing images taken
in different spectral bands can often provide more information about objects of interest and scenes than an image from a single spectrum. Satisfactory fusion usually requires image
registration as a building block, and the registration
performance has a great effect on the fusion quality.
1.1 Related work
Registering multispectral images has been a challenging
problem because of the lack of an explicit or implicit relationship
between the values of corresponding pixels. In the literature,
registration methods fall into two categories: methods based on image features and methods based
on image intensity [1]. Intensity-based methods include mutual information [2], MIND [3], and maximum likelihood (ML) [4]. Let $I_r(x, y)$ and $I_t(x, y)$ denote
the reference and test images, and let $I_t^T(x, y)$ denote the test image transformed by $T$. Intensity-based methods typically construct an objective (registration) function $f(I_r(x, y), I_t^T(x, y))$ of the transformation parameter
$T$ between the images. The task of aligning $I_r(x, y)$
and $I_t(x, y)$ then amounts to searching for the $T$ at which
$f(I_r(x, y), I_t^T(x, y))$ attains its extremum.
*Correspondence:
School of Electronic Engineering, Beijing University of Posts and
Telecommunications, Xitucheng Road, 100876, Beijing, China
The problem with intensity-based methods is that the
optimization may fail to find the ground-truth
transformation parameters [5]. To improve the convergence of an optimization algorithm, the misalignment
is often assumed to be small, e.g., several pixels. This
assumption is equivalent to requiring an initial estimate
$\tilde{T}$ of the ground truth that falls within the
basin of convergence of $f(I_r(x, y), I_t^T(x, y))$, so that the
optimization algorithm can reach the global extremum.
When the misalignment is relatively large, an optimization algorithm can easily become trapped in a local extremum,
resulting in an unsuccessful registration.
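To make the notion of a registration function concrete, the sketch below uses mutual information as f and searches integer translations exhaustively. It is an illustrative example rather than any of the cited methods: the function names, the histogram bin count, the circular shift via `np.roll`, and the restriction of T to small translations are all assumptions of this example.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Mutual information (in nats) between two equally sized grayscale images."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()                 # joint intensity distribution
    px = pxy.sum(axis=1, keepdims=True)       # marginal of image a
    py = pxy.sum(axis=0, keepdims=True)       # marginal of image b
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def best_translation(ref, test, max_shift=5):
    """Hypothetical helper: the integer shift (dy, dx) of `test` that maximizes
    mutual information with `ref`, found by exhaustive search.

    Shifts are circular (np.roll), a simplification made for this sketch.
    """
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(test, (dy, dx), axis=(0, 1))
            score = mutual_information(ref, shifted)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Exhaustive search avoids local extrema but is only practical for a small, discretized parameter range; once rotation and scale are added, the search space grows quickly, which is why intensity-based methods usually rely on a local optimizer and a good initial estimate.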
© 2015 Li et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided
the original work is properly credited.
Another category of intensity-based methods is Fourier
methods. The translation between two images in the spatial domain
corresponds to the location of the peak of the inverse Fourier transform of their normalized cross-power spectrum, i.e., the product of one image's Fourier transform and the complex conjugate of the other's.
Tzimiropoulos et al. [6] propose an FFT-based, scale-invariant approach
to aligning images in which the log-polar
Fourier transform is used to estimate the scaling and rotation. Pan
et al. [7] propose the multilayer fractional Fourier transform
(MLFFT) to improve the accuracy of registering images
with respect to both rotation and scaling. The difficulty with Fourier methods is that
translation, rotation, and scaling generally cannot be handled
simultaneously.
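The translation case can be sketched with standard phase correlation. This is a generic textbook formulation, not the method of [6] or [7]; the function name and the assumption of a purely circular integer shift are choices made for this example.

```python
import numpy as np

def phase_correlation_shift(ref, test):
    """Hypothetical helper: integer shift (dy, dx) such that
    np.roll(test, (dy, dx), axis=(0, 1)) approximately equals ref.

    Assumes the two images differ only by a circular translation.
    """
    f_ref = np.fft.fft2(ref)
    f_test = np.fft.fft2(test)
    cross = f_ref * np.conj(f_test)               # cross-power spectrum
    cross /= np.maximum(np.abs(cross), 1e-12)     # keep only the phase
    corr = np.fft.ifft2(cross).real               # ideally a single sharp peak
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices beyond half the image size correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))
```

The sketch handles translation only. Recovering rotation and scaling requires an additional step such as resampling the spectrum magnitude on a log-polar grid, which is the source of the difficulty noted above.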
Other intensity-based techniques include region-based
confidence weighted M-estimators [8] that deal with
image sets with arbitrarily shaped local illumination variations caused by changes and movement of light sources.
Zosso et al. [9] propose geodesic active fields that couple the registration term and regularization term. The
energy of the deformation field is measured with the
Polyakov energy weighted by a suitable image distance.
Xing and Qiu [10] propose using nonparametric
local smoothing to determine the underlying transformation, which does not require assuming that the mapping
transformation has a particular parametric form. Liu
et al. [11] propose mean local phase angle (MLPA) and
frequency spread phase congruency (FSPC) using local
frequency information to emphasize the common structural information while suppressing the sensor-dependent
information.
Feature-based registration methods first build feature
mappings and then compute the transformation parameters without resorting to any optimization technique.
A variety of image features have been proposed, among which
keypoints and their descriptors are commonly used. Lowe [12] proposed the
scale-invariant feature transform (SIFT), which detects keypoints and computes descriptors that are invariant to scale and rotation.
A main orientation is assigned to each keypoint, and the
local gradient pattern relative to the main orientation is computed as its descriptor. Bay et al. [13] proposed Speeded-Up Robust Features (SURF). SURF has
the same repeatability and distinctiveness as SIFT but is
computed faster by employing integral images.
Alahi et al. [14] proposed Fast Retina Keypoint (FREAK).
FREAK is a cascade of binary strings computed by comparing image intensities over a retinal sampling pattern.
Ambai and Yoshida [15] proposed compact and real-time
descriptors (CARD). CARD can be computed rapidly by
utilizing lookup tables to extract histograms of oriented
gradients.
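The idea of a binary descriptor built from intensity comparisons can be sketched as follows. This is a simplified illustration and not FREAK itself: the sampling pattern here is random rather than retinal, there is no smoothing or orientation normalization, and all function names and parameters are assumptions of this example.

```python
import numpy as np

def make_pattern(n_pairs=128, radius=8, seed=0):
    """Hypothetical sampling pattern: random offset pairs (dy1, dx1, dy2, dx2)."""
    rng = np.random.default_rng(seed)
    return rng.integers(-radius, radius + 1, size=(n_pairs, 4))

def binary_descriptor(image, keypoint, pattern):
    """Bit i is 1 when the first sample of pair i is brighter than the second.

    `keypoint` is (row, col) and must lie at least `radius` pixels from the border.
    """
    y, x = keypoint
    first = image[y + pattern[:, 0], x + pattern[:, 1]]
    second = image[y + pattern[:, 2], x + pattern[:, 3]]
    return (first > second).astype(np.uint8)

def hamming(d1, d2):
    """Number of differing bits between two binary descriptors."""
    return int(np.count_nonzero(d1 != d2))
```

Because each bit depends only on the ordering of two intensities, a monotonically increasing brightness change leaves the descriptor unchanged, whereas a contrast inversion flips every bit whose two samples differ.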
SIFT, SURF, FREAK, and CARD are suitable for
monomodal images. To utilize descriptors for building keypoint mappings on multispectral images, the partial intensity invariant feature descriptor (PIIFD) was
proposed, which adapts the gradient pattern to gradient reversal and region reversal [16]. Saleem and Sablatnig
[17] proposed using normalized gradients for computing descriptors to achieve robustness against intensity changes between multispectral images. Wang et al.
[18] proposed a modified SIFT feature extraction algorithm with a shape-context descriptor (MSSCD). MSSCD
compute (...truncated)