Building Identification in Urban Areas using VHR Satellite Imagery

Evans Belly; Imdad Rizvi; M. M. Kadam

doi:10.35248/2469-4134.21.10.291

Research Article - (2021) Volume 10, Issue 6

Evans Belly*, Imdad Rizvi and M. M. Kadam

Department of Electronics and Telecommunication Engineering, University of Mumbai, Maharashtra, India

*Correspondence: Evans Belly, Department of Electronics and Telecommunication Engineering, University of Mumbai, India, Email:

DOI: 10.35248/2469-4134.21.10.291

Abstract

Satellite imagery is an emerging technology that is used extensively in applications such as the detection and extraction of man-made structures, the monitoring of sensitive areas, and the production of maps. The approach presented here detects buildings automatically from very high resolution (VHR) optical satellite images. Shadow and building regions are investigated first, with the main focus on building extraction. Once all landscapes have been collected, a trimming process eliminates the landscapes that arise from non-building objects. Finally, a labelling method extracts the building regions; this labelling method may be altered for more efficient building extraction. The images used for the analysis come from sensors with a spatial resolution of 1 meter or finer (VHR). The overhead of intermediate processing is eliminated without compromising the quality of the output, which reduces both the number of processing steps and the time they consume.

Introduction

More than 50% of the population now resides in urban and sub-urban environments. Manual monitoring of land coverage is difficult and rarely feasible, because it tends to produce inaccurate values and therefore unreliable data. Satellite imagery offers a way to obtain an acceptable database. Because buildings are the main feature of human concern in such an environment, their reliable and accurate extraction from satellite images is an important task, and the breadth of applications makes this an active research field. This work therefore concentrates on automatic building extraction from VHR images.

Automatic detection of buildings in very high spatial resolution remotely sensed imagery is an important and critical problem because the detection and extraction results can be used in applications such as structure change detection, urbanization monitoring, and digital map production. The task also offers an excellent domain for studying the general problems of scene segmentation, 3D inference, and shape description under highly challenging conditions. Very high resolution satellite images provide valuable information to researchers; within this information, urban-area boundaries and building locations play crucial roles. For a human expert, extracting this information manually is tedious and time consuming, so automated techniques are one possible solution. The most important input data source for object extraction is very high resolution (VHR) satellite imagery [1]. Since more than 50% of the world population lives in urban and sub-urban environments [2], reliable and accurate detection of building objects from satellite images is an essential task and a very active research field. Sensors that provide VHR satellite images include QuickBird, GeoEye-1, GeoEye-2, WorldView-1, and WorldView-2, whose spatial resolution is 1 meter or finer. Human settlement analysis for slum and unorganized settlement monitoring can be assisted by automatically extracted building information, because slum areas can generally be characterized by a high density of short and small buildings in irregular spatial arrangements [3].

The recent work by Akçay and Aksoy investigates shadow evidence to focus on building regions [3]. In accordance with this concept, the directional spatial relationship between buildings and their shadows is modelled using prior knowledge of the illumination direction. For this purpose, a new fuzzy landscape generation approach is proposed, designed specifically to model the directional relationship between buildings and their shadows. Once all landscapes are collected, a trimming process eliminates the landscapes that arise from non-building objects such as roads, sewers, garden walls, and bridges.

Literature survey

Many studies have addressed building detection, extraction and reconstruction. Other man-made structures have also been considered for maintaining and updating geographic information system (GIS) databases, and a number of surveys and methodologies exist. A 1999 survey [4] reviewed the state of the art in automatic object extraction from aerial imagery. It included approaches for object extraction from satellite images that influenced extraction from aerial imagery, and it covered only models and strategies, using well-defined criteria; algorithms and underlying technologies were not reviewed. The survey was organized around assessment, complexity criteria for the assessment of images, models and strategies, the characterization of models, the characterization of strategies, and the classification of models and strategies. As a survey, it served as an information source for further analysis. A building and road detection technique based on existing geodata [5] was also developed; it focused on the aspects of knowledge that can be used for extraction, such as types of knowledge, problems in using existing knowledge, knowledge representation and management, current and possible use of knowledge, and upgrading and augmenting of knowledge [6]. Approaches were also developed for building extraction and updating from high resolution satellite imagery (Figure 1) [7].

Figure 1: Architectures and components of image analysis systems for object extraction.

The developed approaches include two main stages:

• Detecting the building patches and

• Delineating the building boundaries.

The building patches were detected from high resolution satellite imagery using the Support Vector Machines (SVM) classification, which is utilized for both the building extraction and updating approaches. In the building extraction part, the previously detected building patches were delineated using the Hough transform and boundary tracing based techniques.

The extraction and description of cultural (man-made) features and objects, such as buildings and transportation networks, has also been a research topic. Textural features such as density, the shape of the structures, and image quality were analysed. The methodology consisted of the following steps: detect lines and corners, label corners based on shadows, trace object boundaries, and verify hypotheses.

Because the shadow of an object plays a vital role in building extraction, shadows have also been studied. One computational technique used the relationship between shadows and man-made structures to aid the automatic extraction of man-made structures from aerial imagery. Four methods were described that performed the prediction of structure shape, grouping of related structures, verification of individual structures, and structure height estimation. In each method the relationship between structures and their cast shadows was exploited in a unique fashion [8].

As fuzzy logic emerged as a widely accepted area of interest, an object-based approach for urban land cover classification from high resolution multispectral image data was presented that builds upon a pixel-based fuzzy classification approach [9]. This combined pixel/object approach was demonstrated using pan-sharpened multispectral IKONOS imagery from dense urban areas.

The fuzzy pixel-based classifier used both spectral and spatial information to discriminate between the spectrally similar Road and Building urban land cover classes.

The images were then segmented, and the non-building, non-road surfaces were eliminated accordingly. Using these techniques, the object-based classifier identified Buildings, Impervious Surface, and Roads in dense urban areas with 76%, 81%, and 99% classification accuracies, respectively [5].

As the resolution of satellite sensors improved, a need arose for better tools for computer-aided interpretation. A system was therefore designed for the detection and recognition of man-made objects in high resolution optical remote sensing images. Detection consisted of finding a small rectangular area in the image that contains an object; recognition was the attribution of a class label [10]. A supervised learning approach based on support vector machines was used: the system learned a generic model for each class of objects from a geometric characterization of the examples in the database (SPOT 5 THR images, 2.5 m resolution). A large number of geometric image features were used, which allowed several classes of objects with different geometric properties to be characterized [3]. The results showed that several classes of objects can be discriminated with classification rates higher than 80%.

Method

The system consists of multiple stages. The stages are first separated and designed individually, and are then combined to obtain the final output.

Image preparation and Pre-processing: The selected image is pan-sharpened and then pre-processed. Pre-processing includes grayscale conversion and enhancement, so that the image meets the requirements of the later stages, and it also includes thresholding at several levels.
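The grayscale conversion and multi-level thresholding steps can be sketched as follows. This is a minimal illustration, not the implementation used for the results: the function names are hypothetical, the luma weights are the common BT.601 values, and the thresholds are assumed to be equally spaced between the minimum and maximum gray level because the text does not say how the levels are chosen.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an (H, W, 3) RGB array to a float grayscale image."""
    rgb = np.asarray(rgb, dtype=np.float64)
    weights = np.array([0.299, 0.587, 0.114])  # BT.601 luma weights (assumed)
    return rgb @ weights

def multilevel_threshold(gray, n_levels):
    """Quantize a grayscale image into n_levels classes, labelled 0..n_levels-1.

    Assumption: the n_levels - 1 thresholds are equally spaced over the
    gray-level range of the image.
    """
    gray = np.asarray(gray, dtype=np.float64)
    thresholds = np.linspace(gray.min(), gray.max(), n_levels + 1)[1:-1]
    return np.digitize(gray, thresholds)

if __name__ == "__main__":
    gray = np.array([[0, 100, 200, 255]])
    for n in (2, 3, 4):
        print(n, multilevel_threshold(gray, n).tolist())
```

A data-driven rule such as multi-level Otsu could replace the equally spaced thresholds without changing the rest of the sketch.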

Vegetation extraction and shadow detection: For vegetation extraction NDVI (Normalized Difference Vegetation Index) is the widely accepted metric. By applying an appropriate threshold we compute a binary vegetation mask.

$$\mathrm{NDVI} = \frac{NIR - R}{NIR + R} \tag{1}$$
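Equation (1) and the thresholding step can be sketched as below, assuming `nir` and `red` hold the normalized near-infrared and red bands as NumPy arrays. The function names, the small `eps` guard against division by zero, and the threshold value of 0.3 are illustrative assumptions; the text only states that an appropriate threshold is applied.

```python
import numpy as np

def ndvi(nir, red, eps=1e-12):
    """Normalized Difference Vegetation Index, Equation (1)."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    # eps avoids 0/0 on pixels where both bands are zero.
    return (nir - red) / (nir + red + eps)

def vegetation_mask(nir, red, threshold=0.3):
    """Binary vegetation mask; the threshold value is a hypothetical choice."""
    return ndvi(nir, red) > threshold

if __name__ == "__main__":
    nir = np.array([[0.8, 0.2]])
    red = np.array([[0.2, 0.2]])
    print(ndvi(nir, red))
    print(vegetation_mask(nir, red))
```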

NIR and R represent the normalized near-infrared and red image bands.

For automatic shadow detection, multispectral false color shadow detection [11] is a convenient technique for two reasons: (i) it takes advantage of the near-infrared (NIR) band, and (ii) it is fully independent of user-defined and data-dependent thresholds.

The ratio map is

$$R_S = \frac{S - I}{S + I} \tag{2}$$

where $S$ is the normalized saturation and $I$ is the normalized intensity.
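A minimal sketch of the ratio map in Equation (2) follows. Several details are assumptions that the text does not spell out: the false color image is taken to be a (NIR, R, G) composite scaled to [0, 1], saturation and intensity use the standard HSI definitions, and the map is binarized with Otsu's method so that no user threshold is needed. All function names are hypothetical.

```python
import numpy as np

def shadow_ratio_map(false_color, eps=1e-12):
    """Ratio map (S - I) / (S + I) of an (H, W, 3) false color image in [0, 1]."""
    img = np.asarray(false_color, dtype=np.float64)
    intensity = img.mean(axis=-1)
    # HSI saturation: 1 - min(channel) / intensity, taken as 0 for black pixels.
    saturation = np.where(intensity > 0,
                          1.0 - img.min(axis=-1) / np.maximum(intensity, eps),
                          0.0)
    return (saturation - intensity) / (saturation + intensity + eps)

def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance of a histogram."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist)                 # samples in bins 0..k
    w1 = w0[-1] - w0                     # samples in bins k+1..end
    s0 = np.cumsum(hist * centers)
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2
    # Upper edge of the best bin separates the two classes.
    return edges[np.argmax(between) + 1]

def shadow_mask(false_color):
    """Pixels with a high ratio (saturated and dark) are marked as shadow."""
    ratio = shadow_ratio_map(false_color)
    return ratio >= otsu_threshold(ratio)
```

Dark, saturated pixels give a ratio close to 1 and bright pixels give a negative ratio, which is why a single automatic threshold on the map can separate them.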

Shadow detection and removal: The spatial arrangement between buildings and their shadows is modelled with a morphological fuzzy relation, and morphological information is used to determine the exact relationship. Given a reference (shadow) object and a direction specified by an angle α, the landscape around the reference object along that direction is defined as a fuzzy set of membership values in image space. The membership values lie in the range 0 to 1.
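The idea of a directional fuzzy landscape can be illustrated with a brute-force sketch. It assumes the basic angle-based form of the morphological definition, in which the membership of a pixel is max(0, 1 - 2θ/π) and θ is the smallest angle between the chosen direction and the vector from any reference pixel to that pixel. The function name, the angle convention, and the use of this basic form (rather than any refined structuring element) are assumptions made for illustration.

```python
import numpy as np

def fuzzy_landscape(reference, alpha_deg):
    """Directional fuzzy landscape around a binary reference object.

    alpha_deg is measured counter-clockwise from the +column axis; image
    rows grow downward, so 90 degrees points towards row 0.
    """
    reference = np.asarray(reference, dtype=bool)
    h, w = reference.shape
    alpha = np.deg2rad(alpha_deg)
    d_row, d_col = -np.sin(alpha), np.cos(alpha)   # unit direction vector
    rows, cols = np.mgrid[0:h, 0:w]
    landscape = np.zeros((h, w), dtype=np.float64)
    for b_row, b_col in zip(*np.nonzero(reference)):
        v_row, v_col = rows - b_row, cols - b_col
        norm = np.hypot(v_row, v_col)
        cos_theta = (v_row * d_row + v_col * d_col) / np.where(norm > 0, norm, 1.0)
        theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
        member = np.maximum(0.0, 1.0 - 2.0 * theta / np.pi)
        member[b_row, b_col] = 1.0     # the reference pixel itself
        # Keep the best membership over all reference pixels.
        landscape = np.maximum(landscape, member)
    return landscape

if __name__ == "__main__":
    ref = np.zeros((5, 5), dtype=bool)
    ref[2, 2] = True
    print(np.round(fuzzy_landscape(ref, 0.0), 2))
```

The loop costs one full-image pass per reference pixel, which is acceptable for a small shadow object but would need a morphological (dilation-based) formulation for whole scenes.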

In an urban area, a building detection task must eliminate the landscapes generated by shadows that non-building objects cast. To separate the landscapes of buildings from those of non-building objects, the height of each object relative to the terrain is assessed through its shadow. A minimum shadow length, Lmin, is computed and compared with the length measured from the perimeter pixels of each shadow object. If the measured length does not reach Lmin, the shadow is assumed to be cast by a non-building object, and the generated fuzzy landscape is rejected (Figure 2) [12].

Figure 2: Building detection approach.
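The minimum shadow length test can be sketched as below, under stated assumptions. On flat terrain an object of height h casts a shadow of length h / tan(sun elevation), so a minimum building height gives a minimum shadow length in pixels once divided by the ground sampling distance. The sketch reads "does not satisfy Lmin" as "is shorter than Lmin" and measures the shadow extent by projecting its pixels onto the illumination direction; the function names, the minimum height, and this measurement rule are hypothetical.

```python
import math
import numpy as np

def min_shadow_length_px(min_height_m, sun_elevation_deg, gsd_m):
    """Shadow length (in pixels) cast by an object of the minimum height."""
    return min_height_m / math.tan(math.radians(sun_elevation_deg)) / gsd_m

def shadow_length_px(shadow, direction_deg):
    """Extent of a binary shadow object along the illumination direction.

    direction_deg is measured counter-clockwise from the +column axis.
    """
    rows, cols = np.nonzero(np.asarray(shadow, dtype=bool))
    if rows.size == 0:
        return 0.0
    a = math.radians(direction_deg)
    # Project pixel coordinates onto the direction (rows grow downward).
    proj = cols * math.cos(a) - rows * math.sin(a)
    return float(proj.max() - proj.min() + 1.0)

def keep_shadow(shadow, direction_deg, min_height_m, sun_elevation_deg, gsd_m):
    """True if the shadow is long enough to come from a building."""
    l_min = min_shadow_length_px(min_height_m, sun_elevation_deg, gsd_m)
    return shadow_length_px(shadow, direction_deg) >= l_min
```

For example, with a hypothetical minimum height of 3 m, a sun elevation of 45 degrees and 0.5 m pixels, shadows shorter than about 6 pixels along the illumination direction would be rejected.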

Building Detection: Finally, the building and non-building regions are separated. Classical image segmentation tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) information, e.g. Intelligent Scissors; GrabCut combines both kinds of information through graph-cut optimization.

The filtered image is passed through the GrabCut [13] labelling step, which uses black-and-white (BW) and RGB label images, to extract the building structures. The resulting blobs are numbered and then outlined to reconstruct the building structures (Figure 3).

Figure 3: Color to HSI normalization.
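GrabCut needs an initial labelling of certain and probable foreground and background pixels. The sketch below shows one way such a seed mask might be assembled; it is an assumption, not the labelling used for the results. It treats shadow and vegetation pixels as certain background and seeds the foreground from a fuzzy landscape, and the function name and the two membership cut-offs (`sure`, `likely`) are hypothetical.

```python
import numpy as np

# Same integer codes that OpenCV uses for GrabCut masks:
# background, foreground, probable background, probable foreground.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

def grabcut_seed_mask(landscape, shadow, vegetation, sure=0.9, likely=0.4):
    """Build a GrabCut seed mask from a fuzzy landscape and two binary masks."""
    landscape = np.asarray(landscape, dtype=np.float64)
    mask = np.full(landscape.shape, GC_PR_BGD, dtype=np.uint8)
    mask[landscape >= likely] = GC_PR_FGD   # probably building
    mask[landscape >= sure] = GC_FGD        # very likely building
    # Shadows and vegetation cannot be buildings, so they override the above.
    not_building = np.asarray(shadow, dtype=bool) | np.asarray(vegetation, dtype=bool)
    mask[not_building] = GC_BGD
    return mask
```

With OpenCV available, a mask of this kind could be passed to `cv2.grabCut` using the `cv2.GC_INIT_WITH_MASK` mode; the sketch stops at the mask so that it has no dependency beyond NumPy.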

Results and Discussion

Step 1: Pre-processing

The pre-processing stages are shown in Figures 4-9. The input image, which contains the building structures (Figure 4), is converted to grayscale (Figure 5), and the grayscale image is the input to the enhancement stage. Enhancement uses morphological dilation and erosion; Figure 6 shows the enhanced image. The enhanced image then undergoes thresholding at several levels (Figures 7-9) to provide an appropriate input for the vegetation extraction and shadow detection stage.

Figure 4: Input Image.

Figure 5: Input Grey Image.

Figure 6: Enhanced Image.

Figure 7: Threshold, n=2.

Figure 8: Threshold, n=3.

Figure 9: Threshold, n=4.

NDVI is used for vegetation extraction. The filtered image (Figure 10) is obtained from the output of the pre-processing stage and the original input image, which undergoes NDVI processing. It is then smoothed (Figure 11) to suppress induced noise and spurious connected components. The histogram (Figure 12) shows the background and foreground of the original image, which is thresholded at a gray level of 70.

Figure 10: Filtered Image.

Figure 11: Smoothened Image.

Shadow detection and extraction play a vital role in obtaining an efficient and accurate output. The shadow is treated like the unwanted noise encountered in speech or image processing, which must be dealt with. At this stage the image is filtered and thinned (Figure 13) to eliminate noise, and edge detection follows (Figure 14).

The edge image then passes through a dual adaptive threshold (Figure 15) to detect and eliminate abnormal regions. Next, a "hole fill" removes any background pixels inside the blobs (Figure 16). The resulting image is dilated (Figure 17), and regions that are minute and of little use are discarded (Figure 18), so that only regions containing building structures remain.
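The hole fill, dilation, and small-region removal steps can be sketched with NumPy alone. The connectivity (4-connected), the 3x3 dilation window, and the `min_area` value are assumptions chosen for illustration, and the function names are hypothetical; libraries such as `scipy.ndimage` provide faster equivalents.

```python
from collections import deque
import numpy as np

def label_regions(mask):
    """4-connected component labelling; returns (labels, number of regions)."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=np.int32)
    count = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        queue = deque([seed])
        while queue:                       # breadth-first flood fill
            r, c = queue.popleft()
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if (0 <= rr < mask.shape[0] and 0 <= cc < mask.shape[1]
                        and mask[rr, cc] and not labels[rr, cc]):
                    labels[rr, cc] = count
                    queue.append((rr, cc))
    return labels, count

def fill_holes(mask):
    """Set background regions that do not touch the image border to True."""
    mask = np.asarray(mask, dtype=bool)
    bg, _ = label_regions(~mask)
    border = np.concatenate([bg[0], bg[-1], bg[:, 0], bg[:, -1]])
    outside = np.isin(bg, border[border > 0])
    return ~outside

def dilate(mask):
    """Binary dilation with a 3x3 square window."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1)
    out = np.zeros_like(mask)
    for dr in range(3):
        for dc in range(3):
            out |= padded[dr:dr + mask.shape[0], dc:dc + mask.shape[1]]
    return out

def remove_small_regions(mask, min_area):
    """Drop connected regions with fewer than min_area pixels."""
    labels, count = label_regions(mask)
    keep = np.bincount(labels.ravel(), minlength=count + 1) >= min_area
    keep[0] = False                        # label 0 is background
    return keep[labels]

def clean_blobs(mask, min_area):
    """Hole fill, then dilation, then small-region removal."""
    return remove_small_regions(dilate(fill_holes(mask)), min_area)
```

The order of operations follows the text: holes are filled first, the blobs are dilated, and only then are the small regions discarded, so a blob is judged by its area after dilation.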

The final stage involves partitioning and post-processing. GrabCut partitioning uses the foreground and background (Figure 12) together with the BW labelled image (Figure 19) and the pseudo-coloured labelled image (Figure 20) to segment the landscape. The resulting image is outlined (Figure 21) to mark the structure boundaries, and the outline is finally mapped onto the original grayscale image to display the building regions (Figure 22).
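The outlining and overlay steps can be sketched as follows. The sketch assumes that the outline is the set of region pixels with at least one 4-connected neighbour outside the region, and that it is drawn in a single colour on a copy of the grayscale image; the function names and the red default colour are hypothetical.

```python
import numpy as np

def outline(mask):
    """Boundary pixels of a binary mask (region minus its 4-connected interior)."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    padded = np.pad(mask, 1)
    # A pixel is interior if it and its four neighbours all belong to the region.
    interior = (mask
                & padded[0:h, 1:w + 1] & padded[2:h + 2, 1:w + 1]
                & padded[1:h + 1, 0:w] & padded[1:h + 1, 2:w + 2])
    return mask & ~interior

def overlay_outline(gray, mask, colour=(255, 0, 0)):
    """Draw the outline of mask on an RGB copy of a grayscale image."""
    gray = np.asarray(gray, dtype=np.uint8)
    rgb = np.stack([gray, gray, gray], axis=-1)
    rgb[outline(mask)] = colour
    return rgb
```

Pixels on the image border count as boundary pixels because the padding is treated as background.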

Figure 12: Histogram: Background and Foreground.

Figure 13: Thinned Morphology Image.

Figure 14: Edged Image.

Figure 15: Dual Adaptive Threshold Image.

Figure 16: Filled Holes.

Figure 17: Dilated Image.

Figure 18: Removing Small Regions.

Figure 19: BW Labelled Image.

Figure 20: Pseudo Coloured Labelled Image.

Figure 21: Outlined Image.

Figure 22: Building Extracted Image.

With this approach, the automatic extraction of buildings is possible with:

• Higher accuracy.

• Less processing time.

The performance of the approach is affected mainly by shadow generation, which is a drawback of the method. Non-building regions also reduce the quality of the output. The problem of detecting buildings retains many complexities that require substantial future research. One direction is to develop and integrate road and/or bridge detection with the current methodology to eliminate superfluous land area that does not fall under the building category; in this way, most of the road segments that are erroneously labeled can be identified and eliminated. Another is to improve boundary detection by means of a generalization process, which would raise the accuracy of the building outlines. Finally, the detected building regions could be reconstructed to generate a 3-D representation of the detected buildings.

REFERENCES

1. Ok AO, Senaras C, Yuksel B. Automated detection of arbitrarily shaped buildings in complex environments from monocular VHR optical satellite imagery. IEEE Trans Geosci Remote Sens. 2013; 51(3).
2. Fischer A, Kolbe TH, Lang F, Cremers AB, Förstner W, Plümer L, et al. Extracting buildings from aerial images using hierarchical aggregation in 2D and 3D. Comput Vis Image Understand. 1998; 72(2): 185-203.
3. Akçay HG, Aksoy S. Building detection using directional spatial constraints. Proc IEEE IGARSS. 2010.
4. Mayer H. Automatic object extraction from aerial imagery: A survey focusing on buildings. Comput Vis Image Understand. 1999; 74(2): 138-149.
5. Rizvi IA, Mohan BK. Object-based image analysis of high-resolution satellite images using modified cloud basis function neural network and probabilistic relaxation labeling process. IEEE Trans Geosci Remote Sens. 2011; 49(12).
6. Baltsavias E. Object extraction and revision by image analysis using existing geodata and knowledge: Current status and steps towards operational systems. ISPRS J Photogramm Remote Sens. 2004; 58(3): 129-151.
7. San DK. Approaches for automatic urban building extraction and updating from high resolution satellite imagery. Ph.D. Thesis, Middle East Tech. Univ., Ankara, Turkey. 2009.
8. Irvin RB, McKeown DM Jr. Methods for exploiting the relationship between buildings and their shadows in aerial imagery. IEEE Trans Syst Man Cybern. 1989; 19(6): 1564-1575.
9. Huertas A, Nevatia R. Detecting buildings in aerial images. Comput Vis Graph Image Process. 1988; 41(2): 131-152.
10. Liu JG. Smoothing filter-based intensity modulation: A spectral preserve image fusion technique for improving spatial details. Int J Remote Sens. 2000; 21(18): 3461-3472.
11. Teke M, Baseski E, Ok AO, Yüksel B, Senaras C. Multispectral false color shadow detection. Photogrammetric Image Anal. 2011; 6952: 109-119.
12. Bloch I. Fuzzy relative position between objects in image processing: A morphological approach. IEEE Trans Pattern Anal Mach Intell. 1999; 21(7): 657-664.
13. Rother C, Kolmogorov V, Blake A. GrabCut: Interactive foreground extraction using iterated graph cuts. ACM Trans Graph. 2004; 23(3): 309-314.