Image-to-image correspondence is important in numerous remote sensing applications ranging from image mosaicking to 3D reconstruction. While many local features used for these methods aim for robustness to changes in viewpoint and illumination, recent studies have suggested that traditional feature extractors may lack stability in multi-temporal applications. We have discovered that this is especially true in multi-modal sensor contexts, such as establishing correspondence between high-resolution UAS images and broad-area overhead imagery (e.g., satellite images). This paper explores the performance of various local feature extraction methods as they pertain to image-to-image correspondence in scenes captured at different times with different sensors. Experiments here specifically evaluate co-registration between low-altitude, nadir UAV frames and imagery collected from satellite sources. Due to challenges in the localization of imagery with significantly different resolutions, spatial extents, and spectral characteristics, two further studies are presented beyond the baseline evaluation. First, images undergo histogram matching to better understand how the discussed algorithms’ performance changes as image characteristics become more or less similar. Second, experiments are performed in which key-point feature matches are refined with information taken from segmentation maps inferred by pre-trained segmentation models. These methods are evaluated in regions where satellite and UAV images have been collected at different times, with spatial correspondences hand-labeled.
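As a concrete illustration of the histogram-matching step, the following is a minimal sketch using scikit-image; the file names are placeholders, and the use of match_histograms is one reasonable way to implement the step rather than the paper's exact code.

```python
# Minimal sketch of the histogram-matching step, assuming the UAV frame and
# satellite chip are already available as RGB arrays; file names are hypothetical.
import numpy as np
from skimage import io, exposure

uav = io.imread("uav_frame.png")        # hypothetical high-resolution UAV frame
sat = io.imread("satellite_chip.png")   # hypothetical satellite reference chip

# Match the UAV frame's per-channel histograms to the satellite reference so
# both images share similar radiometric characteristics before feature matching.
uav_matched = exposure.match_histograms(uav, sat, channel_axis=-1)

io.imsave("uav_frame_matched.png", uav_matched.astype(np.uint8))
```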
Nanoenergetic materials offer high-density energy storage that may be reacted to produce heat and release gaseous products. However, the fundamental reaction mechanisms of isolated nanoenergetic fuel particles (typically aluminum, Al) remain poorly understood. In this study, the structure-property relationships of photothermally heated Al nanoparticles are explored using an optical microscope setup and laser-based photothermal heating. Our research explores optical imaging and computer vision techniques to measure distinctive features from images captured before and after directed energy excitation of nanoenergetic particles. These features are used to describe the reactions in pursuit of an automated nanoenergetic material reaction characterization model. Specifically, optical imagery of nano-aluminum particle clusters is captured before and after the reaction is initiated via laser irradiation. Through image preprocessing and registration, we remove untargeted nanoparticle clusters and align the images. We then classify particle reactions into one of three classes (spallation, sintering, or a combination of both) through an examination of various features derived from the preprocessed imagery. These techniques serve as tools to aid researchers in quantitatively measuring reaction properties, such as loss of mass, and accelerating the search for optimized reaction parameters.
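One way the registration and differencing steps could look in practice is sketched below using scikit-image's phase correlation; the file names, threshold, and the choice of a purely translational registration are illustrative assumptions, not the study's actual pipeline.

```python
# Rough sketch: align before/after micrographs and difference them, assuming
# grayscale images of the same size; names and threshold are hypothetical.
import numpy as np
from skimage import io
from skimage.registration import phase_cross_correlation
from scipy.ndimage import shift as nd_shift

before = io.imread("before_irradiation.png", as_gray=True)
after = io.imread("after_irradiation.png", as_gray=True)

# Estimate the translational offset between the two frames and align them.
offset, _, _ = phase_cross_correlation(before, after)
after_aligned = nd_shift(after, shift=offset)

# A per-pixel difference highlights material that spalled or sintered;
# thresholding it gives a crude proxy for loss of mass.
diff = np.abs(before - after_aligned)
changed_fraction = (diff > 0.1).sum() / diff.size
print(f"Fraction of pixels changed: {changed_fraction:.3f}")
```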
The ability to accurately monitor the position of an object or group of objects in a video sequence can be of high value when making decisions with respect to security, maneuverability, ecology, and infrastructure. Efforts from our previous work [1, 2] provided evidence that, in overhead video, a system’s ability to perform this tracking can be greatly enhanced through effective and segmented calculation of frame-to-frame spatial correspondences. We now continue the investigations begun by this previous work (wherein frames and their detections are co-mapped) through a case study in which hand-crafted and machine learning (ML) approaches for co-registration of video frames are reviewed and compared. First, the merits and shortcomings of hand-crafted algorithms, ML models, and hybrid approaches are discussed for key-point detection and key-point description. Modifications to feature matching and homography estimation are discussed as well, and following this, a more recently published class of co-registration involving “detector-less” correspondence is outlined. These approaches are applied and evaluated on four overhead image sequences with ground truth corresponding to object detection locations only. Because of this lack of ground control points for evaluation, co-mapped centroids of stationary objects are used to generate an accuracy metric for the various mapping approaches. Further, given the value of mapping and tracking in real-time contexts, this notion of accuracy is compared to computation time so that trade-offs in temporal and spatial performance can be better understood.
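A hand-crafted baseline of the kind compared here (key-point detection and description, brute-force matching, and RANSAC homography estimation) can be sketched with OpenCV as follows; the parameter values and file names are assumptions rather than the evaluated configuration.

```python
# Illustrative hand-crafted co-registration baseline: ORB key points,
# brute-force Hamming matching, and RANSAC homography estimation.
import cv2
import numpy as np

frame_a = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
frame_b = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)
kp_a, des_a = orb.detectAndCompute(frame_a, None)
kp_b, des_b = orb.detectAndCompute(frame_b, None)

# Cross-checked brute-force matching discards ambiguous correspondences.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)

src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# The RANSAC homography maps frame A coordinates into frame B, which is what
# allows detections (and their centroids) to be co-mapped across frames.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
print("Inlier ratio:", inliers.mean())
```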
Unmanned aerial systems (UAS) equipped with visual sensors can be quickly deployed to map novel regions, a useful ability in GPS-denied regions, search and rescue operations, disaster response, and defense. Assisted by such a UAS, a ground vehicle could safely navigate a given region, aware of the potential hazards seen from airborne sensors. Here, we propose a pipeline for identifying and mapping maneuverable regions and objects pertinent to safe navigation (cars, barriers, etc.) in sequential imagery captured from UAS sensors. First, we use a semantic labeling deep neural network to identify roads, an object detection neural network to detect hazards of known classes, and a model that uses linear features to detect potential road hazards within labeled road pixels. This visual evidence regarding maneuverability is collected across temporally sequential images and is spatially fused into a single map via visual feature correspondence. Fusion of road evidence is done on a per-pixel basis, while clustering techniques are used to find objects given a set of co-mapped detections. We show the use of this pipeline for the quick and automated creation of maps that contain useful information with regard to safe navigation of a region captured by UAS sensors. These techniques serve as part of the development of a model for safe, efficient navigation of GPS-denied and rapidly changing regions through the use of UAS-enabled mapping.
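The clustering of co-mapped detections into discrete objects could, for example, be done with DBSCAN on detection centroids expressed in map coordinates; the coordinates and eps value below are purely illustrative and not taken from the pipeline's actual settings.

```python
# Illustrative clustering of co-mapped detection centroids into objects.
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical (x, y) map coordinates of detections accumulated over frames.
centroids = np.array([
    [102.1, 340.5], [101.8, 341.0], [102.4, 339.9],   # likely one vehicle
    [410.0, 125.2], [409.6, 126.1],                    # a second object
])

clusters = DBSCAN(eps=5.0, min_samples=2).fit(centroids)
for label in set(clusters.labels_):
    if label == -1:
        continue  # noise points not assigned to any object
    members = centroids[clusters.labels_ == label]
    print(f"Object {label}: centroid {members.mean(axis=0)}")
```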
Advancements in remote sensing capabilities have led to unprecedented quantity and quality of data across a number of sensing modalities. It is now possible to outfit nearly any mobile platform not only with high-resolution cameras, but also with inexpensive infrared and LIDAR sensors. For the specific goal of providing a comprehensive assessment of vehicle maneuverability, we address the problem of co-registering multiple sensor phenomenologies, such as visual, infrared, and LIDAR imagery collected from vehicle-mounted sensors. We show that data fusion across these sensors provides invaluable information for hazard detection, localization, and classification. In addition, the co-registered measurements make enhanced heterogeneous-data machine learning methods feasible. This approach is verified on a dataset collected by the U.S. Army ERDC.
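As one hedged illustration of co-registering sensor phenomenologies, LIDAR returns can be projected into a camera frame given known extrinsics and intrinsics; the calibration values below are placeholders and do not reflect the ERDC sensor setup.

```python
# Sketch of projecting LIDAR points into a camera image for co-registration,
# assuming a pinhole camera model; all calibration values are placeholders.
import numpy as np

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])            # assumed camera intrinsics
R = np.eye(3)                               # assumed LIDAR-to-camera rotation
t = np.array([0.1, 0.0, 0.0])               # assumed LIDAR-to-camera translation (m)

points_lidar = np.random.rand(100, 3) * 20  # placeholder LIDAR returns (m)

cam = (R @ points_lidar.T).T + t            # transform into the camera frame
in_front = cam[:, 2] > 0                    # keep points in front of the camera
uv = (K @ cam[in_front].T).T
uv = uv[:, :2] / uv[:, 2:3]                 # perspective divide -> pixel coords
print(uv[:5])
```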
Within computer vision, deep neural networks (DNNs) have gained tremendous popularity in recent years due to their ability to extract and classify visual features. As this technology has become more widespread, some of the shortcomings of DNNs have become apparent. Recently, researchers have found that DNNs prefer to learn texture from visual signals rather than shape, and have observed that more shape extraction correlates to better DNN performance. DNNs have been applied to a vast number of problems, with excellent results in many of them, including object detection. The combination of DNN feature extractors with regression and region proposal techniques, such as the Faster Region-Proposal Convolutional Neural Network (Faster R-CNN) and the Single Shot Detector (SSD), has yielded promising results. Meanwhile, from the field of digital image processing, grayscale morphology extracts shape information using morphological operations. The Differential Morphological Profile (DMP) performs morphological openings and closings with varied structuring element sizes and computes the absolute difference between the resulting steps. The DMP provides a mechanism to improve shape extraction within DNNs and increase model robustness. To that end, a DMP-based neural network, DMPNet, has been created to assist DNNs with extracting shape information by adding layers that perform the DMP prior to the first convolutional layer. We use the DMPNet as a feature extractor for Faster R-CNN and SSD, and apply it to maneuverability hazard detection in unmanned aerial system (UAS) imagery. The benefits of this approach include better explainability, lower training times, and models more tuned to shape information.
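The DMP computation described above can be sketched as a sequence of grayscale openings and closings with structuring elements of increasing size, with absolute differences between successive steps stacked into a profile. The radii and disk-shaped structuring elements below are assumptions, and the sketch mirrors the general DMP concept rather than the DMPNet layers themselves.

```python
# Illustrative Differential Morphological Profile (DMP) on one grayscale band.
import numpy as np
from skimage import io
from skimage.morphology import disk, opening, closing

img = io.imread("uas_tile.png", as_gray=True)  # hypothetical image tile
radii = [2, 4, 8, 16]                          # assumed structuring element sizes

profiles = []
prev_open, prev_close = img, img
for r in radii:
    se = disk(r)
    cur_open = opening(img, se)
    cur_close = closing(img, se)
    # Absolute difference between successive opening/closing steps.
    profiles.append(np.abs(prev_open - cur_open))
    profiles.append(np.abs(prev_close - cur_close))
    prev_open, prev_close = cur_open, cur_close

dmp = np.stack(profiles, axis=-1)  # stacked DMP channels for downstream use
print(dmp.shape)
```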
Semantic segmentation, the task of assigning a class label to each pixel within a given image, has applications in a wide variety of domains, ranging from medicine to self-driving vehicles. One successful deep neural network model developed for semantic segmentation tasks is the U-Net architecture, a "U"-shaped neural network initially applied to segmentation of cell membranes in biomedical images. Additional variants of the U-Net have been developed within the research literature that incorporate new features such as residual layers and attention mechanisms. In this research, we evaluate various U-Net-based architectures on the task of segmenting road and non-road regions in low-altitude UAS visible-spectrum imagery. We show that these models can successfully extract the roads, detail a variety of performance metrics of the respective networks' segmentations, and show examples of successes and pending challenges using U.S. Army ERDC imagery collected from a variety of flight routes and altitudes in a complex environment.
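To make the encoder-decoder-with-skip-connection structure concrete, below is a minimal two-level U-Net-style network in PyTorch for binary road/non-road segmentation; it is an illustrative toy, far shallower than the architectures actually evaluated.

```python
# Toy U-Net-style model: one downsampling level, one skip connection,
# and a single-logit head for road vs. non-road per pixel.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = conv_block(3, 16)
        self.down = nn.MaxPool2d(2)
        self.mid = conv_block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = conv_block(32, 16)     # 32 = upsampled 16 + skip 16
        self.head = nn.Conv2d(16, 1, 1)   # one logit per pixel

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        u = self.up(m)
        return self.head(self.dec(torch.cat([u, e], dim=1)))

logits = TinyUNet()(torch.randn(1, 3, 256, 256))
print(logits.shape)  # torch.Size([1, 1, 256, 256])
```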
Object detection and localization is an important problem in computer vision and remote sensing. While several techniques have been presented and used in recent years, the You Only Look Once (YOLO) and derivative architectures have gained popularity due to their ability to perform real-time object localization and achieve remarkable detection scores in ground-based applications. Here, we present methods and results for performing maneuverability hazard detection and localization in low-altitude unmanned aerial systems (UAS) imagery. Imagery is captured over a variety of flight routes and altitudes, and then analyzed with modern deep learning techniques to discover objects such as civilian and military vehicles, barriers, and related hindrances to navigating cluttered semi-urban environments. We present our findings for the deep learning architectures under a variety of training and validation parameters that include pre-trained weights from benchmark public datasets, as well as training with a custom, mission-relevant dataset provided by U.S. Army ERDC.
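A hedged sketch of how a YOLO-family detector might be fine-tuned and run on UAS frames is shown below using the ultralytics package; the weight file, dataset YAML, and hyperparameters are placeholders, not the configuration used in this work.

```python
# Illustrative fine-tuning and inference with a YOLO-family detector;
# "uas_hazards.yaml" and "uas_frame.jpg" are hypothetical file names.
from ultralytics import YOLO

# Start from COCO-pretrained weights, then fine-tune on a mission-relevant set.
model = YOLO("yolov8n.pt")
model.train(data="uas_hazards.yaml", epochs=50, imgsz=640)

# Run inference on a held-out UAS frame and inspect the detected hazards.
results = model("uas_frame.jpg")
for box in results[0].boxes:
    print(int(box.cls), float(box.conf), box.xyxy.tolist())
```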