One of the challenges in building machine learning (ML) models from satellite imagery is assembling adequately labeled data sets for training. In the past, this problem has been addressed by adapting computer vision approaches to GIS data, with significant recent contributions to the field. These approaches fall short, however, when applied to Sentinel-2 multi-spectral satellite imagery. Previously, researchers used transfer learning from models trained on ImageNet and restricted the 13 Sentinel-2 channels to the 3 RGB channels in order to use existing training sets, but this severely limits the data available for complex image classification, object detection, and image segmentation tasks. There is currently no publicly available, richly labeled training resource comparable to ImageNet for Sentinel-2 satellite imagery that covers the entire globe. To address this deficit, we present Distil and demonstrate a specific method, built on our system, for training models with all available Sentinel-2 channels. Our approach using the Distil system is to (a) pre-train models on unlabeled data sets and (b) adapt them to specific downstream tasks using a small number of annotations solicited from a user. We discuss the Distil system, an application of the system in the remote sensing domain, and a case study identifying likely locust breeding grounds in Africa from unlabeled 13-channel satellite imagery.