This Conference Presentation, "Towards a deep-learning aided point cloud labeling suite," was recorded at SPIE Photonics West in San Francisco, California, United States.
With the emergence of advanced 2D and 3D sensors such as high-resolution visible cameras and less expensive lidar sensors, there is a need to fuse information extracted from multiple sensor modalities for accurate object detection, recognition, and tracking. To train a system with data captured by multiple sensors, the regions of interest in the data must be accurately aligned. A necessary step in this process is a fine, pixel-level registration between the modalities. We propose a robust multimodal data registration strategy for automatically registering visible and lidar data captured by sensors embedded in aerial vehicles. The coarse registration of the data is performed using metadata, such as timestamps, GPS, and IMU information, provided by the data acquisition systems. The challenge is that these modalities contain very different sets of information and cannot be aligned using classical methods. Our proposed fine registration mechanism employs deep-learning methodologies for feature extraction in each modality. For our experiments, we use a 3D geopositioned aerial lidar dataset along with coarsely registered visible data, and extract SIFT-like features from both data streams. These SIFT-like features are generated by appropriately trained deep-learning algorithms.
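The abstract does not give implementation details, so the following is only a minimal sketch of the coarse-then-fine idea it describes: a metadata-driven projection of geopositioned lidar points into the image, followed by a feature-based refinement. All function names, the ratio-test matching, and the affine refinement step are illustrative assumptions, not the authors' method; the learned SIFT-like descriptors are assumed to be supplied by separately trained networks.

```python
# Sketch only: coarse metadata-based registration followed by a fine,
# feature-based correction. Names and steps are assumptions for illustration.
import numpy as np

def project_lidar(points_xyz, K, R, t):
    """Coarse step: project geopositioned lidar points into the image plane
    using camera intrinsics K and a GPS/IMU-derived pose (R, t)."""
    cam = R @ points_xyz.T + t[:, None]      # world frame -> camera frame
    uv = K @ cam                             # camera frame -> homogeneous pixels
    return (uv[:2] / uv[2]).T                # perspective divide -> (N, 2) pixels

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with a Lowe-style ratio test on the
    (assumed) learned SIFT-like descriptors from the two modalities."""
    d = np.linalg.norm(desc_a[:, None] - desc_b[None], axis=-1)  # (N, M) distances
    nn = np.argsort(d, axis=1)[:, :2]                            # two closest per row
    best = d[np.arange(len(d)), nn[:, 0]]
    second = d[np.arange(len(d)), nn[:, 1]]
    keep = best < ratio * second
    return np.stack([np.nonzero(keep)[0], nn[keep, 0]], axis=1)  # index pairs

def refine_affine(src_uv, dst_uv):
    """Fine step: least-squares 2D affine correction between coarsely
    projected lidar keypoints and matched visible-image keypoints."""
    A = np.hstack([src_uv, np.ones((len(src_uv), 1))])
    X, *_ = np.linalg.lstsq(A, dst_uv, rcond=None)
    return X.T                               # 2x3 affine correction matrix
```

In this sketch the metadata removes the bulk of the misalignment, and the learned descriptors only have to account for the residual, pixel-level offset; a homography or non-rigid model could replace the affine step depending on the scene geometry.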
Point cloud completion aims to infer the missing regions of a point cloud given an incomplete point cloud. Like image inpainting in the 2D domain, point cloud completion offers a way to recreate an entire point cloud given only a subset of the information. However, current applications study only synthetic datasets with artificial point removal, such as the Completion3D dataset. Although these datasets are valuable, they pose an artificial problem that does not carry over to real-world data. This paper draws a parallel between point cloud completion and occlusion reduction in aerial lidar scenes. We propose a crucial change to the hierarchical sampling: using self-organizing maps to propose new points that represent the scene at a reduced resolution. These new points are weighted combinations of the original set using spatial and feature information, and proposing new points in this way is more expressive than simply sampling existing points. We demonstrate this sampling technique by replacing the farthest point sampling in the Skip-attention Network with Hierarchical Folding (SA-Net) and show a significant improvement in the overall results using the Chamfer distance as our metric. We also show that this sampling method can be used in any technique that uses farthest point sampling.
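To make the contrast with farthest point sampling concrete, below is a heavily simplified sketch (no topological neighborhood, spatial coordinates only, no learned features) of a self-organizing-map style downsampling, where each output point drifts toward the inputs it wins and so becomes a weighted combination of original points rather than a copy of one, plus the symmetric Chamfer distance used as the evaluation metric. The function names and hyperparameters are illustrative, not taken from the paper.

```python
# Sketch only: SOM-style point proposal in place of farthest point sampling,
# and the symmetric Chamfer distance. Simplified for illustration.
import numpy as np

def som_downsample(points, n_nodes=128, iters=50, lr=0.5):
    """Propose n_nodes new points as weighted combinations of the input cloud:
    each node is pulled toward the points it wins, so outputs need not
    coincide with any original point (unlike farthest point sampling)."""
    rng = np.random.default_rng(0)
    nodes = points[rng.choice(len(points), n_nodes, replace=False)].copy()
    for it in range(iters):
        step = lr * (1.0 - it / iters)                      # decaying learning rate
        d = np.linalg.norm(points[:, None] - nodes[None], axis=-1)
        winners = d.argmin(axis=1)                          # best-matching node per point
        for k in range(n_nodes):
            mask = winners == k
            if mask.any():                                  # pull node toward its points
                nodes[k] += step * (points[mask].mean(axis=0) - nodes[k])
    return nodes

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    d = np.linalg.norm(a[:, None] - b[None], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

A full SOM would also update neighboring nodes of each winner and could weight the update by feature similarity as well as spatial distance, which is closer to the spatial-plus-feature combination the abstract describes.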