Automatic building segmentation is a critical task in the semantic segmentation of remote sensing images. The success of deep neural networks has led to advances in using fully convolutional networks (FCNs) to extract buildings from high-resolution images. However, the downsampling in these networks inevitably causes a loss of detail in the segmentation results. To address this problem, some methods refine the FCN output with probabilistic graphical models such as the fully connected conditional random field (CRF). Nevertheless, many methods based on the fully connected CRF are too time-consuming to suit building segmentation in some situations. In this paper, we propose DT-CRF, a novel, time-efficient, end-to-end CRF model built on the domain transform algorithm. To accelerate message passing in mean-field approximate inference, the proposed model takes edge maps as the joint image for DT-CRF and uses the domain transform algorithm, instead of a Gaussian kernel function, to compute the pairwise potential. We also design a multi-task network that generates masks and edges simultaneously, which lets DT-CRF readily optimize the segmentation results using information from the model. Evaluation on remote sensing image datasets verifies the time and space efficiency of the proposed DT-CRF and demonstrates a distinct improvement.
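To make the idea of edge-guided filtering concrete, the following is a minimal 1-D sketch of a recursive domain transform filter, assuming the commonly used formulation in which the distance between neighbouring samples grows with edge strength. It is not the DT-CRF implementation from the abstract; the function name `domain_transform_1d` and the parameters `sigma_s` and `sigma_r` are illustrative assumptions.

```python
import numpy as np

def domain_transform_1d(signal, edges, sigma_s=10.0, sigma_r=0.5):
    """Edge-aware smoothing of a 1-D signal (illustrative sketch, not DT-CRF itself)."""
    x = np.asarray(signal, dtype=np.float64)
    g = np.asarray(edges, dtype=np.float64)
    # d[i]: transformed distance between samples i-1 and i; larger across strong edges.
    d = 1.0 + (sigma_s / sigma_r) * g
    # w[i]: how much sample i borrows from sample i-1; close to 0 across a strong edge.
    w = np.exp(-np.sqrt(2.0) * d / sigma_s)
    y = x.copy()
    # Left-to-right recursive pass.
    for i in range(1, len(y)):
        y[i] = (1.0 - w[i]) * y[i] + w[i] * y[i - 1]
    # Right-to-left recursive pass, reusing the same per-gap weights.
    for i in range(len(y) - 2, -1, -1):
        y[i] = (1.0 - w[i + 1]) * y[i] + w[i + 1] * y[i + 1]
    return y

# Noisy building probabilities with a boundary between index 3 and index 4.
probs = np.array([0.1, 0.2, 0.1, 0.2, 0.9, 0.8, 0.9, 0.8])
edge_map = np.zeros(8)
edge_map[4] = 1.0  # strong edge response at the boundary
smoothed = domain_transform_1d(probs, edge_map)
```

Each pass costs time linear in the number of samples, and the edge response suppresses mixing across the boundary, so the two plateaus are smoothed separately.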
In remote sensing, the automatic and accurate extraction of buildings from images has been an active and challenging topic in recent years. With the rapid development of sensor and computer hardware technologies, it has become easier to acquire very-high-resolution remote sensing images and to extract buildings from them with popular deep learning models such as Fully Convolutional Networks (FCN). However, current FCN-based models tend to produce blurred building boundaries and perform poorly on small buildings. Therefore, in this paper, we propose the Gaussian Dilate Convolution, a cascade of a trainable Gaussian filter and a dilated convolution with proper hyperparameter initialization. We also design a hierarchical dense feature fusion structure that follows the dense connection pattern. Finally, we embed the Gaussian Dilate Convolution into the hierarchical dense fusion structure and name the result the Dense Hierarchical Spatial Gaussian Pool (Dense-HSGP). More specifically, the Gaussian Dilate Convolution retains the advantages of the original dilated convolution while preserving much more context information, and the hierarchical dense connection structure of Dense-HSGP provides richer receptive fields and greater feature reuse within the model. We conduct experiments on the widely used Inria Aerial Image Labeling Dataset to verify the effectiveness of the proposed model. The experimental results show that the proposed model achieves 96.45% average accuracy and 77.17% IoU, which are distinct improvements over several recent state-of-the-art building extraction models.
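The cascade described above can be illustrated with a small NumPy sketch that applies a Gaussian blur and then a dilated convolution. This is only an assumption-laden illustration: the Gaussian here is fixed rather than trainable, the blur size `2 * dilation - 1` is a guess chosen to cover the gaps between dilated taps, and the helper names (`gaussian_kernel`, `conv2d_same`, `gaussian_dilate_conv`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian kernel of odd side length `size`."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def conv2d_same(image, kernel, dilation=1):
    """Zero-padded 'same' cross-correlation with an optionally dilated odd-sized kernel."""
    kh, kw = kernel.shape
    ph = dilation * (kh - 1) // 2
    pw = dilation * (kw - 1) // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            r, c = i * dilation, j * dilation
            out += kernel[i, j] * padded[r:r + h, c:c + w]
    return out

def gaussian_dilate_conv(image, weights, dilation=2, sigma=1.0):
    """Blur first, then apply the dilated kernel (fixed Gaussian; an assumption)."""
    blurred = conv2d_same(image, gaussian_kernel(2 * dilation - 1, sigma))
    return conv2d_same(blurred, weights, dilation)

# A single bright pixel that falls into a gap of the dilated sampling grid.
image = np.zeros((9, 9))
image[4, 5] = 1.0
weights = np.ones((3, 3))
plain = conv2d_same(image, weights, dilation=2)
cascaded = gaussian_dilate_conv(image, weights, dilation=2)
```

At the centre pixel `(4, 4)`, the plain dilated convolution samples only offsets of -2, 0, and 2, so it misses the neighbouring bright pixel entirely, while the cascaded version picks it up through the blur. This is one way to read the claim that the cascade preserves more context information.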