The massive shift in temperatures in the Arctic region has intensified the ice-albedo feedback: as ice and snow melt, the darker exposed surfaces absorb more solar energy. This continuous regional warming drives further melting of glaciers and loss of sea ice. Arctic melt ponds are important indicators of Arctic climate change. High-resolution aerial photographs are invaluable for identifying different sea ice features and are a valuable source for validating, tuning, and improving climate models. Because of the complex shapes and unpredictable boundaries of melt ponds, manually analyzing these remote sensing data is extremely tedious, taxing, and time-consuming, which motivates automating the technique. Deep learning is a powerful tool for semantic segmentation, and one of the most popular deep learning architectures for feature cascading and effective pixel classification is the UNet architecture. We introduce an automatic and robust technique to predict the bounding boxes for melt ponds using a Multiclass Recurrent Residual UNet (R2UNet) with UNet as the base model. R2UNet consists of two key components in each layer of the architecture: a residual connection and a recurrent block. The residual learning approach prevents vanishing gradients in deep networks by introducing shortcut connections, and the recurrent block, which provides a feedback connection in a loop, allows the outputs of a layer to be influenced by subsequent inputs to the same layer. The algorithm is evaluated on the Healy-Oden Trans Arctic Expedition (HO-TRAX) dataset, which contains melt ponds imaged during helicopter photography flights between 5 August and 30 September 2005. The testing and evaluation results show that R2UNet provides superior performance compared to UNet, Residual UNet (Res-UNet), and Recurrent UNet (R-UNet).
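As a conceptual illustration of the two components named above, the recurrent residual unit can be sketched in NumPy with dense matrix multiplies standing in for convolutions. This is a minimal sketch under our own assumptions (function names, shapes, and the number of recurrent steps are illustrative), not the authors' implementation:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def recurrent_block(x, W_in, W_rec, steps=2):
    """Recurrent block: the layer's own output is fed back, so later
    time steps refine the same layer's response (feedback loop)."""
    h = relu(x @ W_in)
    for _ in range(steps):
        h = relu(x @ W_in + h @ W_rec)  # feedback connection
    return h

def recurrent_residual_unit(x, W_in, W_rec, steps=2):
    """R2 unit: recurrent block plus an identity shortcut, so the
    gradient can bypass the block (mitigating vanishing gradients)."""
    return x + recurrent_block(x, W_in, W_rec, steps)
```

The identity shortcut means that even with zero block weights the unit passes its input through unchanged, which is what keeps gradients flowing in deep stacks of such units.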
Automatic ship detection against complex backgrounds, during both day and night, in infrared images is an important task. Additionally, we want the capability to detect ships at various scales, orientations, and shapes. In this paper, we propose the use of neural network technology for this purpose. The algorithm used for this task is the Deep Neural Machine (DNM), which consists of three parts: a backbone, a neck, and a head. Together, these components extract features, create prediction layers from different scales of the backbone, and produce object predictions at different scales. The experimental results show that our algorithm is robust and efficient in detecting ships against complex backgrounds.
This Conference Presentation, “Learning classical image registration features using a deep learning architecture,” was recorded at SPIE Photonics West held in San Francisco, California, United States.
This Conference Presentation, “Deep neural machine for multimodal information fusion,” was recorded at SPIE Photonics West held in San Francisco, California, United States.
Multi-object tracking in wide-area motion imagery (WAMI) has attracted great interest in the field of image processing, leading to numerous real-world applications. Among them, aircraft and unmanned aerial vehicles (UAVs) equipped with real-time, robust visual trackers for long-term aerial maneuvering are currently attracting attention and have remarkably broadened the scope of object tracking applications. In this paper, we present a novel attention-based feature fusion strategy that effectively combines the template and search-region features. Our results demonstrate the efficacy of the proposed system on the CLIF and UNICORN datasets.
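One common way to fuse template and search-region features with attention is cross-attention, where each search-region feature attends over the template features. The sketch below is our own minimal NumPy illustration of that general idea under assumed feature shapes, not the paper's specific fusion strategy:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(search, template):
    """Cross-attention fusion: each search-region feature (row) attends
    over the template features; the attended template context is added
    back to the search features.  search: (Ns, d), template: (Nt, d)."""
    scale = np.sqrt(search.shape[-1])
    attn = softmax(search @ template.T / scale)  # (Ns, Nt) weights
    return search + attn @ template              # fused features
```

With a single template feature the softmax weight is 1, so fusion reduces to adding that template vector to every search feature, which makes the mechanism easy to verify.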
Automated monitoring of low resolution, deep-space objects in wide field of view (WFOV) imaging systems can benefit from the improved performance of deep learning object detectors. The PANDORA sensor array, located in Maui at the Air Force Maui Optical and Supercomputing Site, is an exemplar of a scalable imaging architecture that can detect dim deep-space objects while maintaining a WFOV. The PANDORA system captures 20°×120° images of the night sky oriented along the GEO belt at a rate of two frames per minute. Prior work has established a baseline performance for the detection of Geosynchronous Earth Orbit (GEO) satellite objects using classical, feature-based detectors. This work extends GEO object detection and tracking methodologies by implementing a spatio-temporal deep learning architecture (GEO-SPANN), further improving the state of the art in GEO satellite object detection and tracking. GEO-SPANN consists of a learned spatial detector coupled with a tracking algorithm to detect and re-identify space objects in temporal sequences. We present the detection and tracking results of GEO-SPANN on an annotated PANDORA dataset, reporting an overall maximum F1 point of 0.814, corresponding to 0.766 precision and 0.868 recall. GEO-SPANN advances strategies for autonomous detection and tracking of GEO satellites, enabling the PANDORA sensor system to be leveraged for satellite orbit catalog maintenance and anomaly detection.
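As a quick consistency check, the reported maximum F1 point of 0.814 follows from the stated precision (0.766) and recall (0.868) via the harmonic-mean definition of F1:

```python
def f1_score(precision, recall):
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Reported GEO-SPANN operating point: precision 0.766, recall 0.868.
print(round(f1_score(0.766, 0.868), 3))  # → 0.814
```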
Aerial object detection is one of the most important applications in computer vision. We propose a deep learning strategy for the detection and classification of objects on pipeline rights-of-way by analyzing aerial images captured by aircraft or drones. Because sufficient aerial datasets for accurately training deep learning systems are limited, an efficient methodology for object data augmentation of the training dataset is necessary to achieve robust performance in various environmental conditions. Another limitation is the computing hardware that can be installed on the aircraft, especially a drone, so a balance between the effectiveness and efficiency of the object detector must be considered. We propose an efficient weighted IOU NMS (intersection-over-union non-maximum suppression) method to speed up post-processing and satisfy the onboard processing requirement. Weighted IOU NMS uses the confidence scores of all proposed bounding boxes to regenerate a mean box in parallel. It processes the bounding box scores at the same instant, without removing boxes or decreasing their scores. We perform both quantitative and qualitative evaluations of our network architecture on multiple aerial datasets. The experimental results show that our proposed framework achieves better accuracy than state-of-the-art methods for aerial object detection in various environmental conditions.
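The weighted-IoU-NMS idea, merging each cluster of overlapping proposals into a single confidence-weighted mean box rather than discarding or down-weighting boxes, can be sketched as follows. This is our own NumPy illustration under assumed conventions (corner-format boxes [x1, y1, x2, y2] and an illustrative IoU threshold), not the authors' implementation:

```python
import numpy as np

def iou(box, boxes):
    """IoU of one box [x1, y1, x2, y2] against an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def weighted_iou_nms(boxes, scores, iou_thr=0.5):
    """For each cluster of overlapping proposals, emit one box whose
    coordinates are the confidence-weighted mean of the cluster, so no
    proposal's evidence is thrown away."""
    order = np.argsort(scores)[::-1]          # highest confidence first
    used = np.zeros(len(boxes), dtype=bool)
    out_boxes, out_scores = [], []
    for i in order:
        if used[i]:
            continue
        cluster = (iou(boxes[i], boxes) >= iou_thr) & ~used
        used |= cluster
        w = scores[cluster][:, None]          # confidence weights
        out_boxes.append((boxes[cluster] * w).sum(0) / w.sum())
        out_scores.append(scores[cluster].max())
    return np.array(out_boxes), np.array(out_scores)
```

Because every overlapping proposal contributes to the merged box in proportion to its confidence, the output tends to be more stable than hard NMS, which keeps only the single top-scoring box per cluster.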
Much research has been done on implementing deep learning architectures for detection and recognition tasks. Current work on auto-encoders and generative adversarial networks suggests the ability to recreate scenes based on previously trained data, and it is reasonable to assume that a network able to recreate information can also differentiate it. We propose a convolutional auto-encoder both for recreating information about the scene and for detecting vehicles within it. In essence, the auto-encoder creates a low-dimensional representation of the data projected into a latent space, which can also be used for classification. The convolutional neural network builds on the concept of receptive fields created by the network, which are part of the detection process. The proposed architecture includes a discriminator network connected to the latent space and trained for the detection of vehicles. Work in multi-task learning shows that it is advantageous to learn multiple representations of the data from different tasks to help improve task performance. To test and evaluate the network, we use standard aerial vehicle datasets such as Vehicle Detection in Aerial Imagery (VEDAI) and Columbus Large Image Format (CLIF). We observe that the neural network is able to create features representative of the data and to classify the imagery into vehicle and non-vehicle regions.
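The architecture described above, an encoder/decoder pair with a discriminator attached to the shared latent space and trained with a joint multi-task loss, can be sketched structurally as follows. This is a deliberately tiny stand-in (linear layers instead of convolutions, assumed shapes and names), not the authors' network:

```python
import numpy as np

def forward(X, W_enc, W_dec, w_cls):
    """Shared latent code feeds two branches: reconstruction
    (auto-encoder) and vehicle classification (discriminator)."""
    Z = X @ W_enc                        # shared latent representation
    X_hat = Z @ W_dec                    # reconstruction branch
    p = 1 / (1 + np.exp(-(Z @ w_cls)))   # vehicle-probability branch
    return Z, X_hat, p

def joint_loss(X, y, X_hat, p, alpha=1.0):
    """Multi-task objective: reconstruction error plus (weighted)
    binary cross-entropy for the latent-space discriminator."""
    recon = ((X_hat - X) ** 2).mean()                 # auto-encoder term
    eps = 1e-9
    bce = -(y * np.log(p + eps)
            + (1 - y) * np.log(1 - p + eps)).mean()   # discriminator term
    return recon + alpha * bce
```

Training would descend the gradient of this joint loss so that the encoder weights are shaped by both tasks at once, which is the multi-task advantage the abstract refers to.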
KEYWORDS: 3D modeling, Cameras, RGB color model, 3D surface sensing, Robotics, Environmental sensing, Motion models, 3D visualizations, Reconstruction algorithms, Sensors, 3D metrology
This paper presents a new methodology for 3D change detection that can support effective robot sensing and navigation in a reconstructed indoor environment. We register RGB-D images acquired with an untracked camera into a globally consistent and accurate point-cloud model. The system robustly estimates the camera position across multiple RGB video frames using both photometric error and feature-based methods, and it applies the iterative closest point (ICP) algorithm to establish geometric constraints between the point clouds as they become aligned. For change detection, a bag-of-words (DBoW) model matches the current frame against previous key frames based on RGB images with Oriented FAST and Rotated BRIEF (ORB) features. The key-frame translation is then combined with ICP to align the current point cloud with the reconstructed 3D scene and localize the robot. Meanwhile, the camera position and orientation are used to aid robot navigation. After preprocessing the data, we create an OctoMap model to measure scene changes. The experimental evaluations show that the robot's location and orientation are accurately determined, with promising change detection results indicating all object changes with a very limited false alarm rate.
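The geometric-alignment step can be illustrated with a minimal point-to-point ICP in NumPy: alternately match each source point to its nearest destination point, then re-estimate the rigid transform in closed form (the Kabsch/Procrustes solution). This is a generic sketch of the ICP idea, not the paper's implementation:

```python
import numpy as np

def best_rigid_transform(A, B):
    """Least-squares rotation R and translation t with B ≈ A @ R.T + t,
    given point-to-point correspondences (Kabsch/Procrustes)."""
    ca, cb = A.mean(0), B.mean(0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:     # fix an improper (reflection) solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cb - R @ ca

def icp(src, dst, iters=20):
    """Iteratively match each source point to its nearest destination
    point (brute force) and re-estimate the cumulative rigid transform."""
    cur = src.copy()
    for _ in range(iters):
        d = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        matched = dst[d.argmin(1)]           # nearest-neighbour matches
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
    return best_rigid_transform(src, cur)    # cumulative R, t
```

Real systems replace the brute-force nearest-neighbour search with a k-d tree and add outlier rejection, but the alternating match/align loop is the same.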
KEYWORDS: RGB color model, 3D modeling, Robotics, 3D surface sensing, Detection and tracking algorithms, Environmental sensing, Clouds, Video, Free space, Image processing, Sensors, 3D image processing, Robot vision, Data modeling
3D scene change detection is a challenging problem in robotic sensing and navigation, with several unpredictable aspects. We propose a change detection method that can support various applications under varying environmental conditions. Point cloud models are acquired from an RGB-D sensor, which provides the required color and depth information, and change detection is performed on the robot-view point cloud model. A bilateral filter smooths surfaces and fills holes in the depth image while preserving edge details. Registration of the point cloud model is implemented with the Random Sample Consensus (RANSAC) algorithm, using surface normals in a preliminary stage to estimate the ground and walls. After preprocessing the data, we create a point voxel model that labels each voxel as surface or free space, and a color model that assigns each occupied voxel the mean color of the points it contains. Preliminary changes are detected by an XOR subtraction on the point voxel models. Next, the eight neighbors of each center voxel are examined; if they are neither all 'changed' nor all 'unchanged' voxels, a histogram over location and the hue color channel is estimated. The experimental evaluations show promising change detection results, indicating all changing objects with a very limited false alarm rate.
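The XOR step on the point voxel models amounts to a symmetric difference of occupancy sets: a voxel occupied in exactly one of the two models is a candidate change. A minimal sketch, with an assumed voxel size and our own helper names (not the authors' code):

```python
import numpy as np

def voxelize(points, voxel_size=0.05):
    """Quantize (N, 3) points into the set of occupied voxel indices."""
    return set(map(tuple, np.floor(points / voxel_size).astype(int)))

def xor_change_voxels(reference_pts, current_pts, voxel_size=0.05):
    """Voxels occupied in exactly one of the two models are candidate
    changes: the XOR (symmetric difference) of the occupancy sets."""
    ref = voxelize(reference_pts, voxel_size)
    cur = voxelize(current_pts, voxel_size)
    return ref ^ cur  # Python set XOR = symmetric difference
```

The neighborhood test described in the abstract would then run only on these candidate voxels, pruning isolated false alarms before the color-histogram check.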