CO-REGISTRATION BETWEEN MULTISOURCE REMOTE-SENSING IMAGES
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B3, 2012
XXII ISPRS Congress, 25 August – 01 September 2012, Melbourne, Australia
Joz Wu a,*, Chi Chang b, Hsien-Yu Tsai b, Ming-Che Liu b
a CSRSR, National Central University, Jhongli, Taiwan –
b Dept. of Civil Engineering, National Central University, Jhongli, Taiwan –
Commission III, WG III/5
KEY WORDS: Registration, Least-Squares Matching, SIFT, TPS, RANSAC
ABSTRACT:
Image registration is essential for geospatial information systems analysis, which usually involves integrating multitemporal and
multispectral datasets from remote optical and radar sensors. An algorithm that deals with feature extraction, keypoint matching,
outlier detection and image warping is tested in this study. The methods currently available in the literature rely on
techniques such as the scale-invariant feature transform, between-edge cost minimization, normalized cross correlation, least-squares image matching, random sample consensus, iterated data snooping and thin-plate splines. Their basics are highlighted and
encoded into a computer program. The test images are excerpts from digital files created by the multispectral SPOT-5 and Formosat-2 sensors, and by the panchromatic IKONOS and QuickBird sensors. Suburban areas, housing rooftops, the countryside and hilly
plantations are studied. The co-registered images are displayed with block subimages in a criss-cross pattern. Besides the imagery,
the registration accuracy is expressed by the root mean square error. Toward the end, this paper also includes a few opinions on
issues that are believed to hinder a correct correspondence between diverse images.
accepted in disciplines like computer vision, photogrammetry
and remote sensing. In the framework of image pyramiding,
convolution with blurring Gaussian kernels is carried out first.
The images are consecutively differenced to yield a stack of
scale-space images containing potential high-pass feature points.
1. INTRODUCTION
In general, image matching methods are either feature-based or
area-based. Linear features can be extracted from a portion of an
image where the gray-level gradient varies (Tupin and Roux,
2003). On the other hand, the coefficient of correlation between
two image windows may serve as an index for gauging the
degree of similarity (Wolf and Dewitt, 2000). A hybrid method
allowing for both feature- and area-based matching techniques
is considered more versatile than either method operating alone.
Admittedly, a hybrid strategy leads to a more complex
algorithm with heavier computation. Often, this is a
worthwhile trade-off because of the increased reliability of point
determination.
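As an illustration of the correlation coefficient used as a similarity index between two image windows, a minimal sketch is given below. The function name `ncc`, the toy 3-by-3 windows and the handling of flat windows are assumptions made for this example only; they are not taken from the program used in this study.

```python
import math


def ncc(window_a, window_b):
    """Correlation coefficient between two equal-sized gray-level windows."""
    a = [p for row in window_a for p in row]
    b = [p for row in window_b for p in row]
    if len(a) != len(b) or not a:
        raise ValueError("windows must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # Flat window: the coefficient is undefined; returning 0 is an
        # assumption of this sketch, not a rule from the paper.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Toy windows (hypothetical data): a linear brightness change keeps the
# coefficient at +1, whereas a gray-level inversion drives it to -1.
target = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[2 * p + 5 for p in row] for row in target]
inverted = [[100 - p for p in row] for row in target]
print(ncc(target, brighter), ncc(target, inverted))
```

The coefficient is invariant to a linear change of gray levels, which is why it tolerates brightness and contrast differences but not geometric distortion between the windows.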
Symbol $D$ is used to represent the resulting images as
$D(\mathbf{x}) = D + \left(\frac{\partial D}{\partial \mathbf{x}}\right)^T \mathbf{x} + \frac{1}{2}\,\mathbf{x}^T \frac{\partial^2 D}{\partial \mathbf{x}^2}\,\mathbf{x}$, based on truncated
Taylor's expansion with the capital superscript $T$ standing for
transposition, and with $\mathbf{x}^T = (x, y, \sigma)$, a vector having the line
$x$ (pixel), sample $y$ (pixel) and scale $\sigma$ coordinates. For an
extreme keypoint, the differentiated equation of $D$ with respect
This paper aims to devise an algorithm that stresses not
only the matching accuracy and robustness between images, but
also the scale- and rotation-invariance between them. Many
generic methods for feature extraction and image registration
exist (Dare and Dowman, 2001; Mikolajczyk and Schmid,
2005). In particular, Lowe (2004) published the scale-invariant
feature transform methodology, which allows us to generate
a large number of descriptor-based keypoints. Gross errors in
the coordinates of the image keypoints have to be detected and
removed on a probabilistic basis (Schwarz and Kok, 1993;
Vennebusch et al., 2009). The filtered feature points then
possess good approximate coordinates. They may serve as
initial values for the subsequent high-precision least-squares
image matching.
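to $\mathbf{x}$ is $\frac{\partial^2 D}{\partial \mathbf{x}^2}\,\mathbf{x} + \frac{\partial D}{\partial \mathbf{x}} = 0$. Consequently, one obtains
$\hat{\mathbf{x}} = -\left(\frac{\partial^2 D}{\partial \mathbf{x}^2}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$. After back-substitution, the image is
identified as
$D(\hat{\mathbf{x}}) = D + \frac{1}{2} \left(\frac{\partial D}{\partial \mathbf{x}}\right)^T \hat{\mathbf{x}}$ (1)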
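The keypoint refinement leading to Equation (1) can be sketched numerically as follows. This is a minimal illustration, assuming that the gradient and the Hessian of D at the sampled keypoint have already been estimated (for instance by finite differences); the function names `solve_linear` and `refine_keypoint` and the sample numbers are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular Hessian")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        partial = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - partial) / aug[row][row]
    return x


def refine_keypoint(d_value, gradient, hessian):
    """Return the offset x_hat = -H^{-1} g and the value D(x_hat) of Equation (1)."""
    offset = solve_linear(hessian, [-g for g in gradient])
    d_refined = d_value + 0.5 * sum(g * o for g, o in zip(gradient, offset))
    return offset, d_refined


# Hypothetical derivatives at a sampled extremum, ordered as (x, y, sigma).
gradient = [0.2, -0.4, 0.1]
hessian = [[-2.0, 0.0, 0.0], [0.0, -4.0, 0.0], [0.0, 0.0, -1.0]]
offset, d_refined = refine_keypoint(1.0, gradient, hessian)
print(offset, d_refined)
```

The offset is expressed relative to the sampled keypoint, so the refined position is the sample location plus `offset` in line, sample and scale.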
The magnitude and orientation of gray-level gradients in the
image closest to a keypoint result in an orientation histogram
that accounts for a relative rotation between image windows,
within a plus or minus 5-degree tolerance. Based on the aligned
image at a keypoint, a SIFT user sets up a vector of 128
descriptive elements. The search for the corresponding point to
form a pair relies on minimizing the Euclidean distance
between two descriptive vectors.
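The pairing of keypoints by minimum Euclidean distance between descriptive vectors may be sketched as below. The function name `match_descriptors` is hypothetical, and the toy vectors have only 4 elements for readability, whereas a SIFT descriptor has 128; any additional acceptance test on the matches is outside the scope of this sketch.

```python
import math


def match_descriptors(desc_a, desc_b):
    """Pair each descriptor of image A with its nearest descriptor of image B.

    Returns a list of (index_a, index_b, distance) tuples.
    """
    matches = []
    for i, da in enumerate(desc_a):
        best_j, best_dist = min(
            ((j, math.dist(da, db)) for j, db in enumerate(desc_b)),
            key=lambda item: item[1],
        )
        matches.append((i, best_j, best_dist))
    return matches


# Toy descriptors (hypothetical values, 4 elements instead of 128).
desc_a = [[0, 0, 1, 1], [5, 5, 0, 0]]
desc_b = [[5, 4, 0, 0], [0, 0, 1, 2], [9, 9, 9, 9]]
matches = match_descriptors(desc_a, desc_b)
print(matches)
```

An exhaustive search of this kind grows with the product of the two keypoint counts, so it is suitable only for small sets.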
2.2 Area-based Matching
2. METHODOLOGY
2.1 Feature Points by SIFT
The SIFT (Scale-Invariant Feature Transform) algorithm by
Lowe (2004) is well known for its insensitivity to
imaging scale and orientation changes, and to scene
illumination differences, thereby allowing it to be widely
Generally speaking, the method of LSM (Least-Squares
Matching) outperforms that of normalized cross correlation
because the former incorporates affine parameters into the
* Corresponding author.
blunder is removed one at a time so that the algorithm is termed
IDS (Iterated Data Snooping).
matching model. For the target $g(x, y)$ and search $q(l, s)$
images, the line and sample coordinates (pixel) are expressed as
$l = a_0 + a_1 x + a_2 y$ and $s = b_0 + b_1 x + b_2 y$, with the $a_0$, $a_1$,
As calculation of the covariance matrix for a large number of
data residuals can grow burdensome, the RANSAC algorithm is
usually conducted first. For the remaining data points, IDS
could be invoked to ensure that indeed they are regular samples
of an experiment at hand. Because of the difference in theory,
RANSAC and IDS are expected to be complementary.
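A minimal sketch of the RANSAC step for gross-error screening is given below, assuming an affine relation between the two sets of keypoint coordinates. The function names, the 1-pixel threshold, the number of trials and the synthetic tie points are assumptions made for illustration; the IDS test on the remaining points is not included.

```python
import random


def det3(m):
    """Determinant of a 3-by-3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_affine(pairs):
    """Exact affine fit from 3 pairs ((x, y), (l, s)); None if the points are collinear."""
    design = [[1.0, x, y] for (x, y), _ in pairs]
    d = det3(design)
    if abs(d) < 1e-9:
        return None
    params = []
    for k in (0, 1):  # k = 0: line coefficients, k = 1: sample coefficients
        rhs = [target[k] for _, target in pairs]
        coeffs = []
        for col in range(3):  # Cramer's rule, one coefficient per column
            replaced = [row[:] for row in design]
            for r in range(3):
                replaced[r][col] = rhs[r]
            coeffs.append(det3(replaced) / d)
        params.append(coeffs)
    return params


def residual(params, pair):
    """Distance (pixel) between the predicted and the matched position."""
    (x, y), (l, s) = pair
    (a0, a1, a2), (b0, b1, b2) = params
    return ((a0 + a1 * x + a2 * y - l) ** 2 + (b0 + b1 * x + b2 * y - s) ** 2) ** 0.5


def ransac_affine(pairs, threshold=1.0, trials=200, seed=0):
    """Return the indices of the largest consensus set found."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(trials):
        params = fit_affine(rng.sample(pairs, 3))
        if params is None:
            continue
        inliers = [i for i, p in enumerate(pairs) if residual(params, p) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Synthetic tie points (hypothetical): nine points under one affine
# transformation, two of which are corrupted by gross errors.
points = [(x, y) for x in (0, 10, 20) for y in (0, 10, 20)]
pairs = [((x, y), (2 + x + 0.1 * y, -1 - 0.1 * x + y)) for x, y in points]
pairs[3] = (pairs[3][0], (pairs[3][1][0] + 25.0, pairs[3][1][1]))
pairs[7] = (pairs[7][0], (pairs[7][1][0], pairs[7][1][1] - 40.0))
inliers = ransac_affine(pairs)
print(inliers)
```

No covariance matrix of residuals is needed here, which is the practical reason for running this step before a statistical test on the surviving points.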
$a_2$, $b_0$, $b_1$ and $b_2$ symbols expressing affinity.
Differencing pixel values may lead to a gray-level function as
$v_i = h_0 + h_1 q_i(a_0 + a_1 x + a_2 y,\ b_0 + b_1 x + b_2 y) - g_i(x, y) \approx 0$. Index
$i$ varies for $n$ pixels in a window. Symbol $v_i$ denotes a zero-mean residual error having the Gaussian distribution, or
2.4 Thin-plate Spline Interpolation
$N(0, \sigma_i^2)$ with the $\sigma_i$ symbol meaning the standard deviation;
TPS (Thin-Plate Splines) stands for a flexible function in that it
emulates the minimized bending energy of a metal plate on
multiple tie-point constraints. A trend surface stems from a
global, affine transformation between two overlapping images.
h0 and h1 linearly modify pixel values.
Linear expansion at approximate unknowns results in a system
of error equations, defined as $\mathbf{v} = \mathbf{A}\mathbf{x} - \mathbf{l}$ with covariance matrix $\sigma_0^2 \mathbf{Q}$. By
referring to Mikhail (1976), one obtains the least-squares
solution of unknown p (...truncated)