During end-to-end learning, application level performance metrics, in combination with large training sets, are used to optimize deep neural network pipelines for the task at hand. There are two main places where application level performance metrics are typically introduced: in energy functions that are minimized during inference and in loss functions that are minimized during training. Minimizing energy functions and minimizing loss functions are both hard problems in the general case and an application specific trade-off must be made between how much effort is spent in inference versus training. In this paper we explore this trade-off in the context of image segmentation. Specifically, we use a novel, computationally efficient, family of networks to investigate the trade-off between two traditional extremes. At one extreme are inference networks that minimize a correlation clustering energy function. At the other extreme are learning networks that minimize a Rand Error loss function.
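As a rough illustration of the Rand Error mentioned above, the sketch below computes the fraction of element pairs on which two segmentations disagree (grouped together in one, separated in the other). The function name `rand_error` and the flat label-list representation are assumptions made for this example; the abstract does not specify how the loss is computed inside the networks.

```python
from itertools import combinations


def rand_error(seg_a, seg_b):
    """Fraction of element pairs on which two segmentations disagree.

    seg_a and seg_b are equal-length sequences of segment labels, one
    label per pixel (a hypothetical flat representation).
    """
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must label the same elements")
    n = len(seg_a)
    if n < 2:
        return 0.0
    disagreements = 0
    for i, j in combinations(range(n), 2):
        same_in_a = seg_a[i] == seg_a[j]
        same_in_b = seg_b[i] == seg_b[j]
        # A pair counts as an error when one segmentation merges the
        # two elements and the other keeps them apart.
        if same_in_a != same_in_b:
            disagreements += 1
    return disagreements / (n * (n - 1) / 2)
```

Because only pairwise co-membership is compared, the measure does not depend on the label values themselves: relabeling the segments of either input leaves the error unchanged.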
Proc. SPIE. 9399, Image Processing: Algorithms and Systems XIII
KEYWORDS: Signal attenuation, Digital filtering, Control systems, Computer programming, Signal processing, Nonlinear optics, Machine learning, Nonlinear filtering, Optimization (mathematics), Binary data
Ordered Hypothesis Machines (OHM) are large margin classifiers that belong to the class of Generalized Stack Filters which were originally developed for non-linear signal processing. In previous work we showed how OHM classifiers are equivalent to a variation of Nearest Neighbor classifiers, with the advantage that training involves minimizing a loss function which includes a regularization parameter that controls class complexity. In this paper we report a new connection between OHM training and the Linear Assignment problem, a combinatorial optimization problem that can be solved efficiently with (amongst others) the Hungarian algorithm. Specifically, for balanced classes, and particular choices of parameters, OHM training is the dual of the Assignment problem. The duality sheds new light on the OHM training problem, opens the door to new training methods and suggests several new directions for research.
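For readers unfamiliar with the Linear Assignment problem, a minimal sketch follows. It solves a small instance by exhaustive search over permutations, which is not the Hungarian algorithm and is only practical for tiny matrices. The cost matrix and the name `solve_assignment` are illustrative assumptions; the sketch shows the assignment problem itself and does not reproduce the OHM duality described above.

```python
from itertools import permutations


def solve_assignment(cost):
    """Return (assignment, total_cost) minimizing sum(cost[i][assignment[i]]).

    cost is a square list-of-lists. Brute force over all n! permutations,
    so this is only a didactic stand-in for an efficient solver such as
    the Hungarian algorithm.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Hypothetical 3x3 cost matrix: row i is assigned to column assignment[i].
example_cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = solve_assignment(example_cost)
```

For this matrix the minimum total cost is 5, obtained by assigning row 0 to column 1, row 1 to column 0 and row 2 to column 2.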
In material science and bio-medical domains the quantity and quality of microscopy images are rapidly increasing and there
is a great need to automatically detect, delineate and quantify particles, grains, cells, neurons and other functional "objects"
within these images. These are challenging problems for image processing because of the variability in object appearance
that inevitably arises in real world image acquisition and analysis. One of the most promising (and practical) ways to
address these challenges is interactive image segmentation. These algorithms are designed to incorporate input from a
human operator to tailor the segmentation method to the image at hand. Interactive image segmentation is now a key tool
in a wide range of applications in microscopy and elsewhere. Historically, interactive image segmentation algorithms have
tailored segmentation on an image-by-image basis, and information derived from operator input is not transferred between
images. But recently there has been increasing interest in using machine learning in segmentation to provide interactive tools
that accumulate and learn from the operator input over longer periods of time. These new learning algorithms reduce the
need for operator input over time, and can potentially provide a more dynamic balance between customization and
automation for different applications. This paper reviews the state of the art in this area, provides a unified view of these
algorithms, and compares the segmentation performance of various design choices.
A key step in many image quantification solutions is feature pooling, where subsets of lower-level features are combined
so that higher-level, more invariant predictions can be made. The pooling region, which defines the subsets, often has a
fixed spatial size and geometry, but data-adaptive pooling regions have also been used. In this paper we investigate
pooling strategies for the data-adaptive case and suggest a new framework for pooling that uses multiple sub-regions
instead of a single region. We show that this framework can help represent the shape of the pooling region and also
produce useful pairwise features for adjacent pooling regions. We demonstrate the utility of the framework in a number
of classification tasks relevant to image quantification in digital microscopy.
Visual analytics and interactive machine learning both try to leverage the complementary strengths of humans and machines to solve complex data exploitation tasks. These fields overlap most significantly when training is involved: the visualization or machine learning tool improves over time by exploiting observations of the human-computer interaction. This paper focuses on one aspect of the human-computer interaction that we call user-driven sampling strategies. Unlike relevance feedback and active learning sampling strategies, where the computer selects which data to label at each iteration, we investigate situations where the user selects which data is to be labeled at each iteration. User-driven sampling strategies can emerge in many visual analytics applications but they have not been fully developed in machine learning. User-driven sampling strategies suggest new theoretical and practical research questions for both visualization science and machine learning. In this paper we identify and quantify the potential benefits of these strategies in a practical image analysis application. We find user-driven sampling strategies can sometimes provide significant performance gains by steering tools towards local minima that have lower error than tools trained with all of the data. In preliminary experiments we find these performance gains are particularly pronounced when the user is experienced with the tool and application domain.
The task of turning raw imagery into semantically meaningful maps and overlays is a key area of remote sensing
activity. Image analysts, in applications ranging from environmental monitoring to intelligence, use imagery to generate and update maps of terrain, vegetation, road networks, buildings and other relevant features. Often these tasks can be cast as a pixel labeling problem, and several interactive pixel labeling tools have been developed. These tools exploit training data, which is generated by analysts using simple and intuitive paint-program annotation tools, in order to tailor the labeling algorithm for the particular dataset and task. In other cases, the task is best cast as a pixel segmentation problem. Interactive pixel segmentation tools have also been developed, but these tools typically do not learn from training data like the pixel labeling tools do. In this paper we investigate tools for interactive pixel segmentation that also learn from user input. The input has the form of segment merging (or grouping). Merging examples are 1) easily obtained from analysts using vector annotation tools, and 2) more challenging to exploit than traditional labels. We outline the key issues in developing these interactive merging tools, and describe their application to remote sensing.
Geospatial information systems provide a unique frame of reference to bring together a large and diverse set of data from
a variety of sources. However, automating this process remains a challenge since: 1) data (particularly from sensors) is
error prone and ambiguous, 2) analysis and visualization tools typically expect clean (or exact) data, and 3) it is difficult
to describe how different data types and modalities relate to each other. In this paper we describe a data integration
approach that can help address some of these challenges. Specifically we propose a lightweight ontology for an
Information Space Model (ISM). The ISM is designed to support functionality that lies between data catalogues and
domain ontologies. Similar to data catalogues, the ISM provides metadata for data discovery across multiple,
heterogeneous (often legacy) data sources e.g. map servers, satellite images, social networks, geospatial blogs. Similar
to domain ontologies, the ISM describes the functional relationship between these systems with respect to entities
relevant to an application e.g. venues, actors and activities. We suggest a minimal set of ISM objects, and attributes for
describing data sources and sensors relevant to data integration. We present a number of statistical relational learning
techniques to represent and leverage the combination of deterministic and probabilistic dependencies found within the
ISM. We demonstrate how the ISM provides a flexible language for data integration where unknown or ambiguous
relationships can be mitigated.
Morphological and microstructural features visible in microscopy images of nuclear materials can give information
about the processing history of a nuclear material. Extraction of these attributes currently requires a subject matter expert
in both microscopy and nuclear material production processes, and is a time consuming, and at least partially manual
task, often involving multiple software applications. One of the primary goals of computer vision is to find ways to
extract and encode domain knowledge associated with imagery so that parts of this process can be automated. In this
paper we describe a user-in-the-loop approach to the problem which attempts to both improve the efficiency of domain
experts during image quantification as well as capture their domain knowledge over time. This is accomplished through
a sophisticated user-monitoring system that accumulates user-computer interactions as users exploit their imagery. We
provide a detailed discussion of the interactive feature extraction and segmentation tools we have developed and
describe our initial results in exploiting the recorded user-computer interactions to improve user productivity over time.
Stack Filters define a large class of discrete nonlinear filters first introduced in image and signal processing for noise
removal. In recent years we have suggested their application to classification problems, and investigated their
relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous
domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate
their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in
which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. We use
the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type
classifier for low false alarm rate applications. We report results on both synthetic data and real-world image
To move from data to information in almost all science and defense applications requires a human-in-the-loop to validate
information products, resolve inconsistencies, and account for incomplete and potentially deceptive sources of
information. This is a key motivation for visual analytics which aims to develop techniques that complement and
empower human users. By contrast, the vast majority of algorithms developed in machine learning aim to replace human
users in data exploitation. In this paper we describe a recently introduced machine learning problem, called rare category
detection, which may be a better match to visual analytic environments. We describe a new design criterion for this
problem, and present comparisons to existing techniques with both synthetic and real-world datasets. We conclude by describing an application in broad-area search of remote sensing imagery.
Ship detection from satellite imagery has great utility in various communities. Knowing where
ships are and their types provides useful intelligence information. However, detecting and recognizing ships is a
difficult problem. Existing techniques suffer from too many false-alarms. We describe approaches we have taken
in trying to build ship detection algorithms that have reduced false alarms. Our approach uses a version of the
grayscale morphological Hit-or-Miss transform. While this is well known and used in its standard form, we use a
version in which we use a rank-order selection for the dilation and erosion parts of the transform, instead of the
standard maximum and minimum operators. This provides some slack in the fitting that the algorithm employs
and provides a method for tuning the algorithm's performance for particular detection problems. We describe
our algorithms, show the effect of the rank-order parameter on the algorithm's performance and illustrate the
use of this approach for real ship detection problems with panchromatic satellite imagery.
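The rank-order idea above can be sketched in one dimension. The code assumes one common formulation of the grayscale Hit-or-Miss transform (erosion by a foreground template minus dilation by a background template) with flat structuring elements; the abstract does not give the exact formulation used. The names `rank_filter`, `soft_hit_or_miss` and the `slack` parameter are hypothetical.

```python
def rank_filter(signal, offsets, rank):
    """Rank-th smallest value (0 = minimum) in the window given by offsets.

    Indices falling outside the signal are clamped to the nearest edge.
    """
    n = len(signal)
    out = []
    for i in range(n):
        window = sorted(signal[min(max(i + d, 0), n - 1)] for d in offsets)
        out.append(window[rank])
    return out


def soft_hit_or_miss(signal, fg_offsets, bg_offsets, slack=0):
    """Rank-order hit-or-miss response for a 1D signal.

    slack = 0 reproduces the standard min/max operators; a larger slack
    lets that many samples in each template violate the fit.
    """
    if not 0 <= slack < min(len(fg_offsets), len(bg_offsets)):
        raise ValueError("slack must be smaller than both template sizes")
    # Erosion-like step: slack-th smallest value under the foreground template.
    eroded = rank_filter(signal, fg_offsets, slack)
    # Dilation-like step: slack-th largest value under the background template.
    dilated = rank_filter(signal, bg_offsets, len(bg_offsets) - 1 - slack)
    return [e - d for e, d in zip(eroded, dilated)]
```

With a foreground template `[-1, 0, 1]` and a background template `[-3, 3]`, a clean bright target `[0, 0, 5, 5, 5, 0, 0]` gives a response of 5 at its center. If the center sample is corrupted to 1, the standard transform (`slack=0`) drops to 1, while `slack=1` still returns 5, which is the tolerance the rank-order parameter is meant to provide.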
As wide-area persistent imaging systems become cost effective, increasingly large areas of the earth can be imaged at
relatively high frame rates. Efficient exploitation of the large geo-spatial-temporal datasets produced by these systems
poses significant technical challenges for image and video analysis and for data mining. Significant progress in image
stabilization, moving object detection and tracking is allowing automated systems to generate hundreds to thousands of
vehicle tracks from raw data, with little human intervention. However, tracking performance at this scale is unreliable,
and average track length is much smaller than the average vehicle route. These are limiting factors for applications that
depend heavily on track identity, i.e. tracking vehicles from their points of origin to their final destination. In this paper,
we propose and evaluate a framework for wide-area motion imagery (WAMI) exploitation that minimizes the
dependence on track identity. In its current form, this framework takes noisy, incomplete moving object detection tracks
as input, and produces a small set of activities (e.g. multi-vehicle meetings) as output. The framework can be used to
focus and direct human users and additional computation, and suggests a path towards high-level content extraction by
learning from the human-in-the-loop.
Moving object detection is of significant interest in temporal image analysis since it is a first step in many object
identification and tracking applications. A key component in almost all moving object detection algorithms is a pixel-level
classifier, where each pixel is predicted to be either part of a moving object or part of the background. In this paper
we investigate a change detection approach to the pixel-level classification problem and evaluate its impact on moving
object detection. The change detection approach that we investigate was previously applied to multi- and hyper-spectral
datasets, where images were typically taken several days, or months apart. In this paper, we apply the approach to low
frame rate (1-2 frames per second) video datasets.
We describe the development of a simulation framework for anomalous change detection that considers both the
spatial and spectral aspects of the imagery. A purely spectral framework has previously been introduced, but
the extension to spatio-spectral requires attention to a variety of new issues, and requires more careful modeling
of the anomalous changes. Using this extended framework, we evaluate the utility of spatial image processing
operators to enhance change detection sensitivity in (simulated) remote sensing imagery.
We present a method for detecting a large number of moving targets, such as cars and people, in geographically
referenced video. The problem is difficult, due to the large and variable number of targets which enter and leave the field
of view, and due to imperfect geo-projection and registration. In our method, we assume feature extraction produces a
collection of candidate locations (points in 2D space) for each frame. Some of these locations are real objects, but many
are false alarms. Typical feature extraction might be frame differencing, or target recognition. For each candidate
location, and at each time step, our algorithm outputs a velocity estimate and confidence which can be thresholded to
detect objects with constant velocity. In this paper we derive the algorithm, investigate the free parameters, and compare
its performance to a multi-target tracking algorithm.
In this paper, we present an algorithm for determining a velocity probability distribution prior from low frame
rate aerial video of an urban area, and show how this may be used to aid in the multiple target tracking problem,
as well as to provide a foundation for the automated classification of urban transportation infrastructure. The
algorithm used to develop the prior is based on using a generic interest point detector to find automobile
candidate locations, followed by a series of filters based on scale and motion to reduce the number of false
alarms. The remaining locations are then associated between frame pairs using a simple matching algorithm,
and the corresponding tracks are then used to build up velocity histograms in the areas that are moved through
between the track endpoints. The algorithm is tested on a dataset taken over urban Tucson, AZ. The results
demonstrate that the velocity probability distribution prior can be used to infer a variety of information about
road lane directions, speed limits, etc., as well as providing a means of describing environmental knowledge
about traffic rules that can be used in tracking.
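The final accumulation step described above can be sketched as follows. The track format (start point, end point, elapsed time), the grid cell size, and the name `accumulate_velocities` are all assumptions made for illustration; the interest point detection, filtering and frame-to-frame matching stages are not shown. The sketch collects per-cell velocity samples, from which a histogram could then be built.

```python
from collections import defaultdict


def accumulate_velocities(tracks, cell_size=10.0, samples=5):
    """Collect velocity samples in every grid cell a track passes through.

    tracks: iterable of ((x0, y0), (x1, y1), dt) tuples, one per matched
    frame pair (a hypothetical format). Cells crossed between the two
    endpoints are found by sampling points along the straight segment.
    """
    cells = defaultdict(list)
    for (x0, y0), (x1, y1), dt in tracks:
        vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
        visited = set()
        for s in range(samples + 1):
            t = s / samples
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            visited.add((int(x // cell_size), int(y // cell_size)))
        # Each crossed cell receives the track velocity exactly once.
        for cell in visited:
            cells[cell].append((vx, vy))
    return dict(cells)
```

Sampling along the segment is a simplification: a short segment with too few samples could skip a cell, so a line-rasterization routine would be a more careful choice in practice.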
In many tracking applications, adapting the target appearance model over time can improve performance. This approach
is most popular in high frame rate video applications where latent variables, related to the object's appearance (e.g.,
orientation and pose), vary slowly from one frame to the next. In these cases the appearance model and the tracking
system are tightly integrated, and latent variables are often included as part of the tracking system's dynamic model. In
this paper we describe our efforts to track cars in low frame rate data (1 frame / second), acquired from a highly unstable
airborne platform. Due to the low frame rate, and poor image quality, the appearance of a particular vehicle varies
greatly from one frame to the next. This leads us to a different problem: how can we build the best appearance model
from all instances of a vehicle we have seen so far? The best appearance model should maximize the future performance
of the tracking system, and maximize the chances of reacquiring the vehicle once it leaves the field of view. We propose
an online feature selection approach to this problem and investigate the performance and computational trade-offs with a
For accurate and robust analysis of remotely-sensed imagery it is
necessary to combine the information from both spectral and spatial domains in a meaningful manner. The two domains are intimately linked: objects in a scene are defined in terms of both their composition and their spatial arrangement, and cannot accurately be described by information from either of these two domains on their own.
To date there have been relatively few methods for combining spectral
and spatial information concurrently. Most techniques involve separate processing for extracting spatial and spectral information. In this paper we will describe several extensions to traditional morphological operators that can treat spectral and spatial domains concurrently and can be used to extract relationships between these domains in a meaningful way. This includes the investigation and development of suitable vector-ordering metrics and machine-learning-based techniques for optimizing the various parameters of the morphological operators, such as morphological operator, structuring element and vector ordering metric. We demonstrate their application to a range of multi- and hyper-spectral image analysis problems.
We present ZEUS, an algorithm for extracting features from images and time series signals. ZEUS is designed to solve a variety of machine learning problems including time series forecasting, signal classification, image and pixel classification of multispectral and panchromatic imagery. An evolutionary approach is used to extract features from a near-infinite space of possible combinations of nonlinear operators. Each problem type (i.e. signal or image, regression or classification, multiclass or binary) has its own set of primitive operators. We employ fairly generic operators, but note that the choice of which operators to use provides an opportunity to consult with a domain expert. Each feature is produced from a composition of some subset of these primitive operators. The fitness for an evolved set of features is given by the performance of a back-end classifier (or regressor) on training data. We demonstrate our multimodal approach to feature extraction on a variety of problems in remote sensing. The performance of this algorithm will be compared to standard approaches, and the relative benefit of various aspects of the algorithm will be investigated.
An increasing number and variety of platforms are now capable of
collecting remote sensing data over a particular scene. For many
applications, the information available from any individual sensor may
be incomplete, inconsistent or imprecise. However, other sources may
provide complementary and/or additional data. Thus, for an application
such as image feature extraction or classification, it may be that
fusing the multiple data sources can lead to more consistent and
Unfortunately, with the increased complexity of the fused data, the
search space of feature-extraction or classification algorithms also
greatly increases. With a single data source, the determination of a
suitable algorithm may be a significant challenge for an image
analyst. With the fused data, the search for suitable algorithms can
go far beyond the capabilities of a human in a realistic time frame,
and becomes the realm of machine learning, where the computational
power of modern computers can be harnessed to the task at hand.
We describe experiments in which we investigate the ability of a suite
of automated feature extraction tools developed at Los Alamos National
Laboratory to make use of multiple data sources for various feature
extraction tasks. We compare and contrast this software's capabilities
on 1) individual data sets from different data sources 2) fused data
sets from multiple data sources and 3) fusion of results from multiple
individual data sources.
We introduce an algorithm for classifying time series data. Since our initial application is for lightning data, we call the algorithm Zeus. Zeus is a hybrid algorithm that employs evolutionary computation for feature extraction, and a support vector machine for the final backend classification. Support vector machines have a reputation for classifying in high-dimensional spaces without overfitting, so the utility of reducing dimensionality with an intermediate feature selection step has been questioned. We address this question by testing Zeus on a lightning classification task using data acquired from the Fast On-orbit Recording of Transient Events (FORTE) satellite.
Los Alamos National Laboratory has developed and demonstrated a highly capable system, GENIE, for the two-class problem of detecting a single feature against a background of non-feature. In addition to the two-class case, however, a commonly encountered remote sensing task is the segmentation of multispectral image data into a larger number of distinct feature classes or land cover types. To this end we have extended our existing system to allow the simultaneous classification of multiple features/classes from multispectral data. The technique builds on previous work and its core continues to utilize a hybrid evolutionary-algorithm-based system capable of searching for image processing pipelines optimized for specific image feature extraction tasks. We describe the improvements made to the GENIE software to allow multiple-feature classification and describe the application of this system to the automatic simultaneous classification of multiple features from MTI image data. We show the application of the multiple-feature classification technique to the problem of classifying lava flows on Mauna Loa volcano, Hawaii, using MTI image data and compare the classification results with standard supervised multiple-feature classification techniques.
Feature extraction from imagery is an important and long-standing problem in remote sensing. In this paper, we report on work using genetic programming to perform feature extraction simultaneously from multispectral and digital elevation model (DEM) data. We use the GENetic Imagery Exploitation (GENIE) software for this purpose, which produces image-processing software that inherently combines spatial and spectral processing. GENIE is particularly useful in exploratory studies of imagery, such as one often does in combining data from multiple sources. The user trains the software by painting the feature of interest with a simple graphical user interface. GENIE then uses genetic programming techniques to produce an image-processing pipeline. Here, we demonstrate evolution of image processing algorithms that extract a range of land cover features including towns, wildfire burnscars, and forest. We use imagery from the DOE/NNSA Multispectral Thermal Imager (MTI) spacecraft, fused with USGS 1:24000 scale DEM data.
Feature identification attempts to find algorithms that can consistently separate a feature of interest from the background in the presence of noise and uncertain conditions. This paper describes the development of a high-throughput, reconfigurable computer based, feature identification system known as POOKA. POOKA is based on a novel spatio-spectral network, which can be optimized with an evolutionary algorithm on a problem-by-problem basis. The reconfigurable computer provides speed up in two places: 1) in the training environment to accelerate the computationally intensive search for new feature identification algorithms, and 2) in the application of trained networks to accelerate content based search in large multi-spectral image databases. The network is applied to several broad area features relevant to scene classification. The results are compared to those found with traditional remote sensing techniques as well as an advanced software system known as GENIE. The hardware efficiency and performance gains compared to software are also reported.
We describe the implementation and performance of a parallel, hybrid evolutionary-algorithm-based system, which optimizes image processing tools for feature-finding tasks in multi-spectral imagery (MSI) data sets. Our system uses an integrated spatio-spectral approach and is capable of combining suitably-registered data from different sensors. We investigate the speed-up obtained by parallelization of the evolutionary process via multiple processors (a workstation cluster) and develop a model for prediction of run-times for different numbers of processors. We demonstrate our system on Landsat Thematic Mapper MSI, covering the recent Cerro Grande fire at Los Alamos, NM, USA.
KEYWORDS: Digital signal processing, Reconfigurable computing, Sensors, Image segmentation, Image processing, Remote sensing, Field programmable gate arrays, Feature extraction, Signal processing, Algorithm development
Compute performance and algorithm design are key problems of image processing and scientific computing in general. For example, imaging spectrometers are capable of producing data in hundreds of spectral bands with millions of pixels. These data sets show great promise for remote sensing applications, but require new and computationally intensive processing. The goal of the Deployable Adaptive Processing Systems (DAPS) project at Los Alamos National Laboratory is to develop advanced processing hardware and algorithms for high-bandwidth sensor applications. The project has produced electronics for processing multi- and hyper-spectral sensor data, as well as LIDAR data, while employing processing elements using a variety of technologies. The project team is currently working on reconfigurable computing technology and advanced feature extraction techniques, with an emphasis on their application to image and RF signal processing. This paper presents reconfigurable computing technology and advanced feature extraction algorithm work and their application to multi- and hyperspectral image processing. Related projects on genetic algorithms as applied to image processing will be introduced, as will the collaboration between the DAPS project and the DARPA Adaptive Computing Systems program. Further details are presented in other talks during this conference and in other conferences taking place during this symposium.
We consider the problem of pixel-by-pixel classification of a multi-spectral image using supervised learning. Conventional supervised classification techniques such as maximum likelihood classification and less conventional ones such as neural networks, typically base such classifications solely on the spectral components of each pixel. It is easy to see why: the color of a pixel provides a nice, bounded, fixed dimensional space in which these classifiers work well. It is often the case, however, that spectral information alone is not sufficient to correctly classify a pixel. Maybe spatial neighborhood information is required as well. Or maybe the raw spectral components do not themselves make for easy classification, but some arithmetic combination of them would. In either of these cases we have the problem of selecting suitable spatial, spectral or spatio-spectral features that allow the classifier to do its job well. The number of all possible such features is extremely large. How can we select a suitable subset? We have developed GENIE, a hybrid learning system that combines a genetic algorithm that searches a space of image processing operations for a set that can produce suitable feature planes, and a more conventional classifier which uses those feature planes to output a final classification. In this paper we show that the use of a hybrid GA provides significant advantages over using either a GA alone or more conventional classification methods alone. We present results using high-resolution IKONOS data, looking for regions of burned forest and for roads.
We describe the implementation and performance of a genetic algorithm (GA) which evolves and combines image processing tools for multispectral imagery (MSI) datasets. Existing algorithms for particular features can also be “re-tuned” and combined with the newly evolved image processing tools to rapidly produce customized feature extraction tools. First results from our software system were presented previously. We now report on work extending our system to look for a range of broad-area features in MSI datasets. These features demand an integrated spatio-spectral approach, which our system is designed to use. We describe our chromosomal representation of candidate image processing algorithms, and discuss our set of image operators. Our application has been geospatial feature extraction using publicly available MSI and hyperspectral imagery (HSI). We demonstrate our system on NASA/Jet Propulsion Laboratory’s Airborne Visible and Infrared Imaging Spectrometer (AVIRIS) HSI which has been processed to simulate MSI data from the Department of Energy’s Multispectral Thermal Imager (MTI) instrument. We exhibit some of our evolved algorithms, and discuss their operation and performance.
The retrieval of scene properties (surface temperature, material type, vegetation health, etc.) from remotely sensed data is the ultimate goal of many earth observing satellites. The algorithms that have been developed for these retrievals are informed by physical models of how the raw data were generated. This includes models of radiation as emitted and/or reflected by the scene, propagated through the atmosphere, collected by the optics, detected by the sensor, and digitized by the electronics. To some extent, the retrieval is the inverse of this 'forward' modeling problem. But in contrast to this forward modeling, the practical task of making inferences about the original scene usually requires some ad hoc assumptions, good physical intuition, and a healthy dose of trial and error. The standard MTI data processing pipeline will employ algorithms developed with this traditional approach. But we will discuss some preliminary research on the use of a genetic programming scheme to 'evolve' retrieval algorithms. Such a scheme cannot compete with the physical intuition of a remote sensing scientist, but it may be able to automate some of the trial and error. In this scenario, a training set is used, which consists of multispectral image data and the associated 'ground truth;' that is, a registered map of the desired retrieval quantity. The genetic programming scheme attempts to combine a core set of image processing primitives to produce an IDL (Interactive Data Language) program which estimates this retrieval quantity from the raw data.