Automated semantic labeling of complex urban scenes in remotely sensed 2D and 3D data is one of the most challenging steps in producing realistic 3D scene models and maps. Recent large-scale public benchmark data sets and challenges for semantic labeling with 2D imagery have been instrumental in identifying state-of-the-art methods and enabling new research. 3D data from lidar and multi-view stereo have also been shown to provide valuable additional information that improves semantic labeling accuracy. In this work, we describe the development of a new large-scale data set combining public lidar and multi-view satellite imagery with pixel-level truth for ground labels and instance-level truth for building labels. We demonstrate the use of this data set to evaluate methods for ground and building labeling tasks to establish performance expectations and identify areas for improvement. We also discuss initial steps toward further leveraging this data set to enable machine learning for more complex semantic and instance segmentation and 3D reconstruction tasks. All software developed to produce this public data set and to enable metric scoring is also released as open source code.
One challenging problem in many remote sensing applications is identifying building footprints in 2D and/or 3D imagery. Existing solutions to this problem use a variety of sensing modalities as input. Recent public challenges have yielded high-quality building footprint detection algorithms using high-resolution 2D and 3D imaging modalities as input. However, the performance of many of these algorithms typically degrades as the fidelity and post spacing of the input imagery are reduced. Other challenges use lower resolution 2D satellite imagery alone. The United States Special Operations Command (USSOCOM) sponsored a public prize challenge aimed at identifying building footprints using 2D RGB orthorectified imagery and coincident 3D Digital Surface Models (DSMs) created from commercial satellite imagery. The top 6 winning solutions have been made publicly available as open source software. This paper summarizes the public challenge and provides results and data analysis. In addition, we provide lessons learned and hope to encourage additional research by publicly releasing the benchmark dataset to the community.
KEYWORDS: Field programmable gate arrays, Cameras, Mirrors, LIDAR, Sensors, Video, 3D image processing, Video acceleration, Data processing, Image processing
Photon-counting Geiger-mode lidar detector arrays provide a promising approach for producing three-dimensional (3D) video at full motion video (FMV) data rates, resolution, and image size from long ranges. However, the coincidence processing required to filter raw photon counts is computationally expensive, generally requiring significant size, weight, and power (SWaP) as well as time. In this paper, we describe a laboratory test-bed developed to assess the feasibility of low-SWaP, real-time processing for 3D FMV based on Geiger-mode lidar. First, we examine a design based on field programmable gate arrays (FPGAs) and demonstrate proof-of-concept results. Then we examine a design based on a first-of-its-kind embedded graphics processing unit (GPU) and compare its performance with that of the FPGA design. Results indicate feasibility of real-time Geiger-mode lidar processing for 3D FMV and also suggest utility for real-time onboard processing for mapping lidar systems.
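As a rough illustration of what coincidence processing involves, the sketch below bins the range detections recorded for one pixel over many laser pulses and keeps only the range bins where several detections coincide, on the idea that noise counts are scattered in range while returns from a real surface cluster. This is a generic sketch and not the FPGA or GPU design described in the abstract; the function name `coincidence_filter`, the bin width, and the count threshold are all assumptions chosen for illustration.

```python
import numpy as np

def coincidence_filter(ranges_m, bin_width_m=0.5, min_count=3):
    """Return centers of range bins holding at least min_count detections.

    ranges_m: 1-D sequence of detected ranges (meters) for a single pixel,
    accumulated over many pulses. All parameter values are illustrative.
    """
    ranges_m = np.asarray(ranges_m, dtype=float)
    if ranges_m.size == 0:
        return np.array([])
    # Assign each detection to an integer range bin.
    bin_index = np.floor(ranges_m / bin_width_m).astype(int)
    # Count detections per occupied bin.
    bins, counts = np.unique(bin_index, return_counts=True)
    # Keep bins where enough detections coincide; report bin centers.
    return (bins[counts >= min_count] + 0.5) * bin_width_m

# Four detections cluster near 100 m; the other three are isolated noise.
detections = [100.1, 100.2, 100.3, 100.4, 57.0, 143.7, 12.3]
surface_ranges = coincidence_filter(detections)
```

In a full system this per-pixel step would run over every pixel of the detector array for every output frame, which is where the computational cost mentioned above comes from.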
KEYWORDS: LIDAR, Clouds, Global Positioning System, Sensors, Error analysis, Data processing, Transform theory, Signal processing, Defense and security, Applied physics
Accurate geo-location (geo-coding) of imagery taken at long range is a major challenge.
Whereas GPS can supply a very accurate sensor position, the hardware for the required precision pointing
can be very costly. Roth et al. (2005) showed that because of the accuracy of lidar range data, a
tri-lateration method (called Multi-Look Lidar or Multi-Look Geo-Coding) can achieve very accurate geo-coding
at very long ranges and very low cost by using data-driven processing. This paper presents
extensive flight-testing results using commercial airborne lidar. Because the tri-lateration method produces
a large number of control points, the resulting accuracy of the geo-coded lidar data is somewhat better than
that predicted for a single control point, owing to control-point averaging.
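The geometric idea behind tri-lateration can be sketched as follows: if the same ground point is ranged from several known sensor positions, its coordinates follow from the ranges alone, with no need for precise pointing angles. The code below is a textbook linearized least-squares solution and is not the Multi-Look Lidar processing chain from the abstract; the function name `trilaterate` and all coordinates are hypothetical, and the sketch assumes at least four sensor positions that do not lie in a single plane.

```python
import numpy as np

def trilaterate(sensor_positions, ranges):
    """Estimate a 3-D point from ranges measured at known sensor positions.

    Subtracting the first range equation |x - p_0|^2 = r_0^2 from each of
    the others removes the quadratic term in x and leaves a linear system:
        2 (p_i - p_0) . x = |p_i|^2 - |p_0|^2 - r_i^2 + r_0^2
    """
    p = np.asarray(sensor_positions, dtype=float)  # shape (n, 3)
    r = np.asarray(ranges, dtype=float)            # shape (n,)
    a = 2.0 * (p[1:] - p[0])
    b = (np.sum(p[1:] ** 2, axis=1) - np.sum(p[0] ** 2)
         - r[1:] ** 2 + r[0] ** 2)
    # Least squares handles the overdetermined case (more than 4 looks).
    solution, *_ = np.linalg.lstsq(a, b, rcond=None)
    return solution

# Hypothetical sensor positions (meters, local frame) and a ground point.
positions = np.array([
    [0.0, 0.0, 1000.0],
    [500.0, 0.0, 1200.0],
    [0.0, 500.0, 900.0],
    [500.0, 500.0, 1300.0],
    [250.0, -300.0, 1500.0],
])
ground_point = np.array([100.0, 200.0, 0.0])
measured = np.linalg.norm(positions - ground_point, axis=1)
estimate = trilaterate(positions, measured)
```

With noisy ranges, each additional look adds a row to the system, so the least-squares estimate averages over the measurements.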
The construction of 3D models from light detection and ranging (LIDAR) data requires reliable and accurate alignment
of multiple overlapping scans. While established manual and automated 3D alignment methods generally perform well,
aligning scans of complex scenes from arbitrary perspectives with small amounts of overlap remains challenging. The
projection information available with scanned LIDAR data is generally underutilized and may be better exploited to
simplify the alignment process, avoiding manually specified algorithm parameters and improving reliability. In this
work, we present projective methods for manual and automated 3D alignment and introduce a projective measure of
surface interpenetration to quantify alignment error. Performance is demonstrated with a combination of indoor and
outdoor scan sets, including cluttered forest scenes, and compared to results obtained using an established commercial
product.
KEYWORDS: Synthetic aperture radar, System on a chip, Sensors, Data modeling, Automatic target recognition, Databases, Image classification, Detection and tracking algorithms, Camouflage, Radar
Classification of targets in high-resolution synthetic aperture radar imagery is a challenging problem in practice, due to extended operating conditions such as obscuration, articulation, varied configurations and a host of camouflage, concealment and deception tactics. Due to radar cross-section variability, the ability to discriminate between targets also varies greatly with target aspect. Potential space-borne and air-borne sensor systems may eventually be exploited to provide products to the warfighter at tactically relevant timelines. With such potential systems in place, multiple views of a given target area may be available to support targeting. In this paper, we examine the aspect dependence of SAR target classification and develop a Bayesian classification approach that exploits multiple incoherent views of a target. We further examine several practical issues in the design of such a classifier and consider sensitivities and their implications for sensor planning. Experimental results indicating the benefits of aspect diversity for improving performance under extended operating conditions are shown using publicly released 1-foot SAR data from DARPA's MSTAR program.
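One common way to combine multiple views in a Bayesian classifier is to treat the views as conditionally independent given the target class, so that the posterior is proportional to the prior times the product of the per-view likelihoods. The sketch below shows that fusion rule in the log domain. The independence assumption, the function name `fuse_views`, and the example numbers are illustrative assumptions; the abstract does not specify the paper's actual classifier at this level of detail.

```python
import numpy as np

def fuse_views(log_likelihoods, log_prior=None):
    """Fuse per-view class log-likelihoods into one posterior.

    log_likelihoods: array of shape (n_views, n_classes), where entry
    [k, c] is log p(view_k | class_c). Views are assumed conditionally
    independent given the class (an assumption of this sketch).
    """
    ll = np.asarray(log_likelihoods, dtype=float)
    n_classes = ll.shape[1]
    if log_prior is None:
        # Default to a uniform prior over classes.
        log_prior = np.full(n_classes, -np.log(n_classes))
    # Independence turns the product of likelihoods into a sum of logs.
    log_post = np.asarray(log_prior, dtype=float) + ll.sum(axis=0)
    # Subtract the maximum before exponentiating for numerical stability.
    log_post = log_post - log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

# Two views of the same target, two candidate classes (made-up values).
view_likelihoods = np.log([[0.6, 0.4],
                           [0.7, 0.3]])
posterior = fuse_views(view_likelihoods)
```

Each view alone favors the first class only mildly, while the fused posterior favors it more strongly, which is the kind of gain aspect diversity is expected to provide.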