AUTOMATIC DETECTION AND RECOGNITION OF MAN-MADE OBJECTS IN HIGH RESOLUTION REMOTE SENSING IMAGES USING HIERARCHICAL SEMANTIC GRAPH MODEL (pdf)

Article PDF cannot be displayed. You can download it here:

https://www.int-arch-photogramm-remote-sens-spatial-inf-sci.net/XL-1-W1/333/2013/isprsarchives-XL-1-W1-333-2013.pdf

International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-1/W1, ISPRS Hannover Workshop 2013, 21 – 24 May 2013, Hannover, Germany

AUTOMATIC DETECTION AND RECOGNITION OF MAN-MADE OBJECTS IN HIGH RESOLUTION REMOTE SENSING IMAGES USING HIERARCHICAL SEMANTIC GRAPH MODEL

X. Sun a,b,c *, A. Thiele a, S. Hinz a, K. Fu b,c

a Institute of Photogrammetry and Remote Sensing (IPF), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
b Institute of Electronics, Chinese Academy of Sciences, Beijing, China
c Key Laboratory of Spatial Information Processing and Application System Technology, Chinese Academy of Sciences, Beijing, China

Email: (antje.thiele, stefan.hinz)@kit.edu

* Corresponding author

KEY WORDS: Object detection, Object recognition, High resolution remote sensing images, Semantic graph model

ABSTRACT:

In this paper, we propose a hierarchical semantic graph model to detect and recognize man-made objects in high resolution remote sensing images automatically. Following the idea of part-based methods, our model builds a hierarchical possibility framework to explore both the appearance information and the semantic relationships between objects and background. This multi-level structure promises a more comprehensive understanding of natural scenes. After training local classifiers to calculate part properties, we use belief propagation to transmit messages quantitatively, which makes better use of the spatial constraints present in images. In addition, discriminative learning and generative learning are interleaved in the inference procedure, to reduce the training error and improve recognition efficiency. The experimental results demonstrate that this method is able to detect man-made objects in complicated surroundings with satisfactory precision and robustness.

1. INTRODUCTION

With the development of remote sensing technology, a large number of high-resolution remote sensing images have become available, providing detailed geo-spatial information. Interpreting the various types of man-made objects in these images has become a key problem in remote sensing image analysis.

Many approaches have been proposed for object detection and recognition, using textural features, wavelet filters, and so on. Since most man-made objects have complex structures and are surrounded by disturbing background, these low-level methods cannot detect objects as accurately as expected.

Besides holistic approaches, some part-based models have been introduced, following the theory that man-made objects can be treated as a composition of features or sub-objects arranged according to certain spatial rules. Initially, those works used simple primitives such as structured lines or curves to describe parts, and defined the relationships by the numbers or ratios between adjacent ones. Such descriptors are too simple to capture the useful information in images. Later, Webber et al. (2000) represented objects as constellations of rigid parts, and recognized objects with a joint probability density function on the shape of the rigid parts by similarity matching. Fergus et al. (2003) and Opelt et al. (2004) proposed category models composed of more flexible parts, and estimated the parameters of the parts using the expectation-maximization algorithm. Leibe et al. (2004) introduced an implicit shape model which organizes different contour fragments to extract objects from cluttered scenes. Vijayanarasimhan & Grauman (2008) also presented an unsupervised learning method to analyze objects by calculating the relationships between their parts.

However, the parts in those methods are mostly pre-defined, which means it is difficult for them to reflect the variances between different appearances and sizes accurately. Kannan et al. (2007) thus proposed a ‘jigsaw’ model, in which the shapes and sizes of parts are learned from the repeated structures in a set of training images. By learning such irregularly shaped pieces, both the shape and the scale of parts can be discovered without supervision. Ni et al. (2009) made further improvements by constructing a generative model to capture the appearance and geometric structure of whole scenes. These models suffer from errors in scenes with complicated content because they rely only on single-level processing. Furthermore, their descriptions do not make full use of the spatial relations present in images, particularly in images with heavy background clutter.

In this paper, we propose a specific hierarchical semantic graph model. Unlike traditional part-based approaches, this model can yield a more comprehensive understanding of images. It not only builds the semantic constraints between objects and background at the high level, but also reinforces the geometrical relations between different components at the low level. Our model also uses belief propagation to make better use of the spatial information present in scenes: local classifiers are trained to calculate part properties, and messages transmit their semantic relationships quantitatively. In addition, discriminative learning and generative learning are interleaved in the inference procedure, to improve training and recognition efficiency. The experiments on our dataset demonstrate that the model can detect and recognize man-made objects in high resolution remote sensing images with satisfactory precision and robustness.

In the following, section 2 explains the hierarchical semantic model. Section 3 introduces the procedure of message propagation, and section 4 illustrates the flow of hybrid inference. Section 5 and section 6 give the experimental results and conclusion.

2. HIERARCHICAL SEMANTIC GRAPH MODEL

Though remote sensing images (...truncated)

Figure 1. Multi-segmentation results: (a) Level 1, (b) Level 2, (c) Level 3

Figure 2. Hierarchical semantic graph model (diagram labels: I1, In, M1, Mn, G)

The node B in M is associated with an offset vector $l_i = (l_{ix}, l_{iy}, l_{iz})$ to describe its spatial information, where $l_{ix}$ and $l_{iy}$ are the offset values of the node coordinates, and $l_{iz}$ is the offset value of the node layer. Then, we can build a mapping function between segments in the training image and nodes in the semantic graph as:

$$ l_i = (t_i - r_i) \bmod G \qquad (2) $$

where
t_i = original vector of segments in I
r_i = semantic vector of nodes in G
G = dimension of graph G

The offset vector can be calculated as follows:

$$ \begin{cases} l_{ix} = t_{ix} - r_{ix} \\ l_{iy} = t_{iy} - r_{iy} \\ l_{iz} = t_{iz} - r_{iz} \end{cases} \qquad (3) $$

where
t_ix, t_iy, t_iz = center coordinates and layer of t_i
r_ix, r_iy, r_iz = center coordinates and layer of r_i

It is easy to deduce that if two adjacent segments have the same offset values in an image, they should also be adjacent in the mapping graph. We design the following criterion to evaluate this consistent relationship:
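
Returning to the offset mapping of Eqs. (2) and (3), a small Python sketch may make the arithmetic concrete. It is an illustration under stated assumptions, not the authors' implementation: segment and node positions are taken to be integer triples (x, y, layer), the graph dimension G is assumed to be given per axis as (width, height, number of layers), and the function name, type alias, and sample values are hypothetical.

```python
from typing import Tuple

# Assumed layout for both segments and nodes: (x, y, layer).
Vec3 = Tuple[int, int, int]


def offset_vector(t: Vec3, r: Vec3, graph_dims: Vec3) -> Vec3:
    """Component-wise offset l = (t - r) mod G, as in Eq. (2).

    t          -- center coordinates and layer of a segment in the image
    r          -- center coordinates and layer of the mapped graph node
    graph_dims -- assumed per-axis size of the semantic graph
    """
    # Python's % returns a non-negative result for a positive modulus,
    # so offsets wrap around the graph instead of going negative.
    return tuple((ti - ri) % gi for ti, ri, gi in zip(t, r, graph_dims))


# Hypothetical graph size: 8 x 8 nodes over 3 layers.
GRAPH_DIMS = (8, 8, 3)

# Two horizontally adjacent segments mapped to two adjacent nodes.
seg_a, node_a = (10, 4, 1), (2, 1, 0)
seg_b, node_b = (11, 4, 1), (3, 1, 0)

offset_a = offset_vector(seg_a, node_a, GRAPH_DIMS)
offset_b = offset_vector(seg_b, node_b, GRAPH_DIMS)

# Adjacent segments that land on adjacent nodes share the same offset.
print(offset_a, offset_b, offset_a == offset_b)
```

In this toy case both segments obtain the same offset, which matches the observation in the text that adjacent segments with equal offsets should map to adjacent nodes in the graph.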