ROBUST FEATURE MATCHING IN TERRESTRIAL IMAGE SEQUENCES



The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-3, 2018
ISPRS TC III Mid-term Symposium “Developments, Technologies and Applications in Remote Sensing”, 7–10 May, Beijing, China

ROBUST FEATURE MATCHING IN TERRESTRIAL IMAGE SEQUENCES

A. Abbas 1, S. Ghuffar 2 ∗

1, 2 Geospatial Research and Education Lab (GREL), Dept. of Space Science, Institute of Space Technology, Islamabad, Pakistan (ahsan, sajid.ghuffar)@grel.ist.edu.pk

Commission III, Urban Sensing and Mobility

KEY WORDS: Feature Detection, Feature Matching, SIFT, SURF, RANSAC, 3D Reconstruction

ABSTRACT:

Over the last decade, feature detection, description and matching techniques have been widely exploited in photogrammetric and computer vision applications, including 3D reconstruction of scenes, image stitching for panorama creation, image classification and object recognition. However, terrestrial imagery of urban scenes poses particular difficulties: duplicate and near-identical structures (e.g. repeated windows and doors) cause problems in the feature matching phase and can ultimately lead to failure, especially in camera pose and scene structure estimation. In this paper, we address ambiguous feature matching in urban environments caused by repeating patterns.

1. INTRODUCTION

Many photogrammetric and computer vision applications rely on more than one image of the same scene or object. To relate images to one another, corresponding points of the same scene (3D features) need to be matched across those images. In recent years, image feature detectors and descriptors have become the most widely used techniques for such applications, which include 3D scene reconstruction, panoramic mosaicking/stitching, image classification, object recognition and robot localization; all of these depend on the presence of stable and representative features in image space.
Thus, image feature detection and extraction are important steps for these applications (Hassaballah et al., 2016). A number of feature detection and description algorithms are now available that provide regions of interest, edges or corners (Remondino, n.d.); the most common are Speeded Up Robust Features (SURF) (Bay et al., 2006), Scale Invariant Feature Transform (SIFT) (Lowe, 2004), Features from Accelerated Segment Test (FAST) (Rosten and Drummond, 2005) and Binary Robust Invariant Scalable Keypoints (BRISK) (Leutenegger et al., 2011). The ideal characteristics of features for matching, as reported by Haralick and Shapiro (1992), are: invariance (independence from geometric and radiometric distortions), stability (robustness against image noise), distinctness (clearly distinguishable from the background) and uniqueness (distinguishable from other points).

Feature detection and matching can be split into three steps.
1) Detection: Find the keypoints in each image.
2) Description: Ideally, the description of the local appearance around each feature point should be invariant to scale, rotation, noise, changes in illumination and affine transformations. A distinctive feature descriptor is calculated from the neighborhood region around every keypoint, so that we normally end up with one descriptor vector per keypoint.
3) Matching: To identify similar features, descriptors are compared across the images. Successfully matched features yield pairs (x_i, y_i) ↔ (x'_i, y'_i), where (x_i, y_i) is a feature in the first image and (x'_i, y'_i) is the matched feature in the other image.

∗ Corresponding author

However, terrestrial imagery of urban scenes contains many repeated feature patterns and nearly identical or duplicate structures with similar texture patterns, which cause problems in feature matching and subsequently lead to failure of the application (e.g. sparse 3D scene reconstruction).
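As an illustration of the matching step, and of why repeated structures are troublesome, the following minimal Python sketch performs nearest-neighbour descriptor matching with a distance-ratio test of the kind proposed by Lowe (2004). It is not the implementation used in this paper: the function name match_descriptors, the toy three-dimensional descriptors and the threshold of 0.8 are assumptions chosen for illustration only (real SIFT descriptors have 128 dimensions).

```python
import math


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with a distance-ratio test (illustrative sketch).

    desc_a, desc_b: lists of equal-length descriptor vectors (lists of floats).
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Euclidean distance from descriptor i to every descriptor of image B.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs a best and a second-best candidate
        (d1, j1), (d2, _) = dists[0], dists[1]
        # Accept only if the best candidate is clearly closer than the runner-up.
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches


# Hypothetical toy data: feature 0 is unique (say, a door), while feature 1
# (a window) has two nearly identical counterparts in the second image.
desc_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
desc_b = [[0.98, 0.05, 0.0], [0.0, 0.98, 0.0], [0.0, 1.0, 0.022]]

print(match_descriptors(desc_a, desc_b))             # -> [(0, 0)]
print(match_descriptors(desc_a, desc_b, ratio=1.0))  # -> [(0, 0), (1, 1)]
```

With the ratio test effectively disabled (ratio=1.0), the repeated window is matched to one of its two look-alikes, which is the kind of ambiguous correspondence described above. With the test enabled, the ambiguous feature is discarded rather than mismatched, at the cost of losing a potentially correct correspondence.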
Removing these incorrect matches is a necessary step, especially for urban scenes, where accurate recovery of camera pose and scene structure is required. Typical feature matching strategies produce a high number of outliers. Moreover, because of the inherent scene geometry and camera motion, the ambiguous matches are parallel to the epipolar lines, so robust estimators like RANSAC (used to reject incorrect matches) sometimes arrive at a wrong solution for the correspondences and camera poses. In this paper, we investigate and discuss the issues related to ambiguous feature matching using the SIFT (Vedaldi and Fulkerson, 2008) and SURF (MATLAB-based implementation) algorithms in urban environments, where repeating patterns ultimately lead to false camera pose estimation for scene reconstruction. We also provide advice and suggestions for dealing with these known issues. SIFT and SURF descriptors were chosen because of their good performance and their wide use in many applications.

This contribution has been peer-reviewed. https://doi.org/10.5194/isprs-archives-XLII-3-3-2018 | © Authors 2018. CC BY 4.0 License.

2. RELATED WORK

Symmetry and repetition are very common in the architecture of urban scenes. Building facades contain a hierarchy of symmetries and repetitions, for example windows and doors, which appear repeatedly along the horizontal direction. Changchang Wu et al. (Wu et al., 2010) presented a technique to find repeated features on a frontal architectural plane with precise recovery of the boundary selection used for finding the repetition. Their method works well for repetition in the horizontal direction and for low repetition counts. Kyle Wilson et al.
(Wilson and Snavely, 2013) also presented a new approach for urban scenes that contain repeated features, based on the local visibility graph. Their model leads to a highly scalable, fast and simple technique for disambiguating repeated elements without relying solely on geometric reasoning. They demonstrated their method on large datasets drawn from internet photo collections and compared it with other geometry-based disambiguation techniques. Richard Roberts et al. (Roberts et al., 2011) examined the geometric ambiguities caused by duplicate and repeated structures when different instances are matched on the basis of visual similarity. They proposed an algorithm that recovers the true data association (the problem of determining correspondences, either between whole images or between feature points) even if a large number of false pairwise matches exist. Similarly, Nianjuan Jiang et al. (Jiang et al., 2012) also worked on repetitive scene structure, which causes problems in the epipolar geometry (EG) due to (...truncated)