Explanations are generated to accompany a model decision, indicating the features of the input data that were most relevant to that decision. Explanations are important not only for understanding the decisions of deep neural networks, which in spite of their huge success in multiple domains operate largely as abstract black boxes, but also for other model classes such as gradient-boosted decision trees. In this work, we propose methods, using both Bayesian and non-Bayesian approaches, to augment explanations with uncertainty scores. We believe that uncertainty-augmented saliency maps can help better calibrate the trust between human analysts and machine learning models.
Uncertainty represents the quantification of the spread of the distribution of possible ground truths that can be inferred from observed evidence. As such, uncertainty is one of the major factors in determining confidence when making decisions (i.e., uncertainty and confidence are in an inverse relationship). Bayesian statistics and subjective logic provide tools for Artificial Intelligence (AI) to derive uncertainty quantification. These processes require base rates, which are large-population determinations of probabilities that are not contextualized for the specific situation. The AI computes probabilities based upon the specific situation and context in light of historical (or training) data. As more evidence/training data becomes available for the context, the base rate gets washed out in the probability calculation. For most Army applications, an AI does not act or decide on its own, except in the rare case of complete automaticity, but rather in collaboration with at least one human user. In this paper, we propose that the ways AI represents uncertainty ought to be optimally aligned with human preferences to provide the best possible human-AI collaborative performance. Exploring this topic requires human-subjects experimentation to test how well users understand different representations of uncertainty that include base-rate information, which quantifies belief in predictions. Variations of these experiments could include different types of training to interpret uncertainty representations.
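The base-rate washout described above can be illustrated with a minimal Beta-Bernoulli sketch; the prior strength, base rate, and evidence counts below are illustrative values, not drawn from the paper:

```python
# Sketch: how a base rate gets "washed out" as context-specific evidence grows.
# Assumes a Beta-Bernoulli model; the prior strength W and base rate are
# illustrative values only.

def posterior_mean(base_rate, prior_strength, successes, trials):
    """Beta posterior mean: prior pseudo-counts plus observed evidence."""
    a = base_rate * prior_strength          # prior pseudo-successes
    b = (1.0 - base_rate) * prior_strength  # prior pseudo-failures
    return (a + successes) / (prior_strength + trials)

base_rate = 0.10   # large-population probability, not context-specific
W = 10.0           # how strongly the base rate is weighted

# Context-specific evidence: a 60% success rate, at increasing sample sizes.
# With no data the estimate equals the base rate; with abundant data it
# converges to the context-specific rate.
for n in (0, 10, 100, 10_000):
    k = int(0.6 * n)
    print(n, round(posterior_mean(base_rate, W, k, n), 3))
```

With no observations the posterior mean is the base rate itself; by ten thousand observations the base rate contributes almost nothing.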
Recent years have seen significant advances in artificial intelligence (AI) and machine learning (ML) technologies applicable to coalition situational understanding (CSU). However, state-of-the-art ML techniques based on deep neural networks require large volumes of training data; unfortunately, representative training examples of situations of interest in CSU are usually sparse. Moreover, to be useful, ML-based analytic services must be capable of explaining their outputs. We describe an integrated CSU architecture that combines neural networks with symbolic learning and reasoning to address the problem of sparse training data. We also demonstrate how explainability can be achieved for deep neural networks operating on multimodal sensor feeds. The work focuses on real-time decision making settings at the tactical edge, with both the symbolic and neural network parts of the system --- including the explainability approaches --- able to deal with temporal features.
Situational understanding is impossible without causal reasoning and reasoning under and about uncertainty, i.e., probabilistic reasoning and reasoning about the confidence in the uncertainty assessment. We therefore consider the case of subjective (uncertain) Bayesian networks. In previous work we noticed that when observations are out of the ordinary, confidence decreases because the relevant training data (the effective instantiations used to determine the probabilities of unobserved variables given the observed variables) is significantly smaller than the total training data (the total number of instantiations). For the ultimate goal of situational understanding, it is therefore of primary importance to be able to efficiently determine the reasoning paths that lead to low confidence, whenever and wherever it occurs: this can guide specific data collection exercises to reduce such uncertainty. We propose three methods to this end, and we evaluate them on the basis of a case study developed in collaboration with professional intelligence analysts.
The proliferation of real-time information on social media opens up unprecedented opportunities for situation awareness that arise from extracting unfolding physical events from their social media footprints. The paper describes experiences with a new social media analysis toolkit for detecting and tracking such physical events. A key advantage of the explored analysis algorithms is that they require no prior training, and as such can operate out-of-the-box on new languages, dialects, jargon, and application domains (where by "new", we mean new to the machine), including detection of protests, natural disasters, acts of terror, accidents, and other disruptions. By running the toolkit over a period of time, patterns and anomalies are also detected that offer additional insights and understanding. Through analysis of contemporary political, military, and natural disaster events, the work explores the limits of the training-free approach and demonstrates promise and applicability.
This paper investigates the problem of localizing an unknown number of transient emitters using a network of passive sensors measuring angles of arrival in the presence of missed detections and false alarms. It is assumed that measurements within a certain time window of interest have to be associated before they can be fused to estimate the emitter locations. Two measurement models — either that any target can generate at most one measurement per sensor or that any target can generate several measurements per sensor — are possible within this time window. These two measurement models lead to two different problem formulations: one is an S-D assignment problem and the other is a cardinality selection problem. The S-D assignment problem can be solved by the Lagrangian relaxation algorithm efficiently with a high degree of accuracy when a small number of sensors are used. The sequential m-best 2-D assignment algorithm, which is resistant to the ghosting problem due to the estimation of the emitter signal's emission time, is developed to solve the problem when the number of sensors becomes large. Simulation results show that the sequential m-best 2-D assignment algorithm is suitable for real-time processing with reliable associations and estimates. The cardinality selection formulation models a list of measurements as a Poisson point process and is solved by applying the expectation-maximization (EM) algorithm and an information criterion. The convergence of the EM algorithm to the desired global maximum requires an initialization close to the truth. Localization using passive sensors makes it difficult to obtain such an initial estimate. An assignment-based initialization approach is therefore presented. Simulation studies showed that the EM algorithm based on the assignment initialization is able to estimate the number of targets, target locations, and directions with a high degree of accuracy.
A limitation of standard Description Logics (DLs) is their inability to reason with uncertain and vague knowledge. Although
probabilistic and fuzzy extensions of DLs exist, which provide an explicit representation of uncertainty, they do not provide
an explicit means for reasoning about second order uncertainty. Dempster-Shafer theory of evidence (DST) overcomes this
weakness and provides means to fuse and reason about uncertain information. In this paper, we combine DL-Lite with
DST to allow scalable reasoning over uncertain semantic knowledge bases. Furthermore, our formalism allows for the
detection of conflicts between the fused information and domain constraints. Finally, we propose methods to resolve such
conflicts through trust revision by exploiting evidence regarding the information sources. The effectiveness of the proposed
approaches is shown through simulations under various settings.
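Dempster's rule of combination, the standard DST fusion operator referred to above, can be sketched as follows; the frame of discernment and the mass assignments are hypothetical, chosen only to illustrate the mechanics:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose focal
    elements are frozensets over a common frame of discernment."""
    combined = {}
    conflict = 0.0
    for (A, w1), (B, w2) in product(m1.items(), m2.items()):
        inter = A & B
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2  # mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting sources")
    # Normalize by 1 - K, redistributing the conflicting mass.
    return {A: w / (1.0 - conflict) for A, w in combined.items()}

# Two hypothetical sources reporting on the frame {hostile, neutral}:
H, N = frozenset({"hostile"}), frozenset({"neutral"})
theta = H | N  # the full frame (total ignorance)
src1 = {H: 0.6, theta: 0.4}
src2 = {H: 0.5, N: 0.3, theta: 0.2}
fused = dempster_combine(src1, src2)
print({tuple(sorted(A)): round(w, 3) for A, w in fused.items()})
```

The conflicting mass (here 0.18, from one source asserting hostile while the other asserts neutral) is the quantity a conflict-detection step would monitor before deciding whether trust revision is needed.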
The work considers sensor fusion in a heterogeneous network of proximity and bearings-only sensors for multiple target tracking. Specifically, various particle implementations of the probability hypothesis density filter are proposed that consider two different fusion strategies: 1) the traditional iterated-corrector approach, and 2) explicit fusion of the multitarget density. This work also investigates sensor type (proximity or bearings-only) selection via the Rényi entropy criterion. The simulation results demonstrate comparable localization performance for the two fusion methods, and they show that sensor type selection usually outperforms single sensor type performance.
In modern coalition operations, decision makers must be capable of obtaining and fusing data from diverse
sources. The reliability of these sources can vary, and, in order to protect their interests, the data they provide
can be obfuscated. The trustworthiness of fused data depends on both the reliability of these sources and their
obfuscation strategy. Information consumers must determine how to evaluate trust in the presence of obfuscation,
while information providers must determine the appropriate level of obfuscation in order to ensure both that
they remain trusted, and do not reveal any private information. In this paper, through a coalition scenario, we
discuss and formalise trust and obfuscation in these contexts and the complex relationships between them.
This work derives the Cramer-Rao lower bound (CRLB) for an acoustic target and sensor localization system
in which the noise characteristics depend on the location of the source. The system itself has been previously
examined, but without deriving the CRLB and showing the statistical efficiency of the estimator used. Two
different versions of the CRLB are derived, one in which direction of arrival (DOA) and range measurements
are available ("full-position CRLB"), and one in which only DOA measurements are available ("bearing-only
CRLB"). In both cases, the estimator is found to be statistically efficient; but, depending on the sensor-target
geometry, the range measurements may or may not significantly contribute to the accuracy of target localization.
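The bearing-only bound can be sketched numerically as the inverse of the accumulated Fisher information; this is a simplified version that assumes location-independent noise (unlike the paper's model), and the sensor layout and noise level are illustrative:

```python
import math

def bearing_only_crlb(sensors, target, sigma):
    """CRLB (2x2 covariance bound) for locating `target` from bearing
    measurements at `sensors`, each with bearing noise std `sigma` (rad).
    Simplified sketch: noise is location-independent, unlike the paper's
    location-dependent model."""
    tx, ty = target
    J = [[0.0, 0.0], [0.0, 0.0]]  # Fisher information matrix
    for sx, sy in sensors:
        dx, dy = tx - sx, ty - sy
        r2 = dx * dx + dy * dy
        # Gradient of the bearing atan2(dy, dx) w.r.t. target position.
        g = (-dy / r2, dx / r2)
        for a in range(2):
            for b in range(2):
                J[a][b] += g[a] * g[b] / sigma**2
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    # Invert the 2x2 information matrix to obtain the bound.
    return [[J[1][1] / det, -J[0][1] / det],
            [-J[1][0] / det, J[0][0] / det]]

crlb = bearing_only_crlb([(0, 0), (100, 0), (0, 100)], (50, 50), sigma=0.01)
print("position error bound:", math.sqrt(crlb[0][0] + crlb[1][1]))
```

Adding range measurements would contribute additional rank-one terms to the information matrix, which is exactly why their benefit depends on the sensor-target geometry.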
This paper investigates effects of operation parameters on multitarget tracking in proximity sensor networks. In
such a network, the sensors report a detection when a target is within the proximity; otherwise, the sensors report
no detection. Previous work has revealed the potential of multitarget tracking via the particle-based probability
hypothesis density (PHD) filter when incorporating these binary reports. This work investigates how the sensor
density, sensing range, and target separation affect the ability of the PHD filter to estimate the number of targets
in the scene and to localize these targets (as measured by four different metrics). Two possible measurement
models are considered. The disc model assumes target detection within a sensing radius, and the probabilistic
model assumes 1/r^α propagation decay of the source signal, so that the probability of detection decreases with
range r. The simulations demonstrate that the simplistic disc model is inadequate for the PHD filter to estimate the
number of targets, and that the filter under the disc model has difficulty localizing widely separated targets at
low sensor densities. On the other hand, the more realistic probabilistic model leads to a PHD filter that can
accurately estimate the number and locations of targets even for small target separations.
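The two measurement models can be sketched as detection-probability curves; the sensing radius, decay exponent, and the exact functional form of the probabilistic curve below are illustrative assumptions, not the paper's parameterization:

```python
# Sketch of the two detection models compared above, with illustrative
# parameter values (sensing radius R, decay exponent alpha, half-power
# range r0).

def pd_disc(r, R=10.0):
    """Disc model: detect with certainty inside radius R, never outside."""
    return 1.0 if r <= R else 0.0

def pd_probabilistic(r, alpha=2.0, r0=10.0):
    """Probabilistic model: detection probability falls off as the received
    signal 1/r^alpha drops; here r0 is the range at which Pd = 0.5.
    The exact curve in the paper may differ."""
    return 1.0 / (1.0 + (r / r0) ** alpha)

for r in (1.0, 5.0, 10.0, 20.0):
    print(r, pd_disc(r), round(pd_probabilistic(r), 3))
```

The hard 0/1 cutoff of the disc model carries no range information beyond "inside/outside", while the smooth curve of the probabilistic model lets the filter weight detections by range plausibility.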
In surveillance and reconnaissance applications, moving objects are followed by tracking filters that process
sequential measurements. There are two popular implementations of tracking filters: one is the covariance (or Kalman)
filter and the other is the information filter. Evaluation of tracking filters is important in performance optimization, not
only for tracking filter design but also for resource management. Typically, the information matrix is the inverse of the
covariance matrix. The covariance filter-based approaches attempt to minimize the covariance matrix-based scalar
indexes whereas the information filter-based methods aim at maximizing the information matrix-based scalar indexes.
Such scalar performance measures include the trace, determinant, norms (1-norm, 2-norm, infinity-norm, and Frobenius
norm), and eigenstructure of the covariance matrix or the information matrix and their variants. One natural question to
ask is whether the scalar track filter performance measures applied to the covariance matrix are equivalent to those
applied to the information matrix. In this paper we show that most of the scalar performance indexes are equivalent,
yet some are not. As a result, an index, if used improperly, would provide an "optimized" solution but in the wrong
sense relative to track accuracy. The simulation indicated that all seven indexes were successful when applied to the covariance
matrix. However, the failed indexes for the information filter include the trace and the four norms (as defined in
MATLAB) of the information matrix. Nevertheless, the determinant and the properly selected eigenvalue of the
information matrix were successful in selecting the optimal sensor update configuration. The evaluation analysis of track
measures can serve as a guideline to determine the suitability of performance measures for tracking filter design and resource management.
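The inequivalence of covariance-based and information-based indexes can be checked on a toy example; diagonal matrices stand in for full covariance matrices, and the candidate values are arbitrary:

```python
# Toy check: rank two candidate (diagonal) covariance matrices by scalar
# indexes, and compare with rankings of their information (inverse) matrices.
P1 = (1.0, 9.0)   # eigenvalues of covariance candidate 1
P2 = (2.0, 2.0)   # eigenvalues of covariance candidate 2

def trace(d): return d[0] + d[1]
def det(d): return d[0] * d[1]
def inv(d): return (1.0 / d[0], 1.0 / d[1])

# Covariance view: minimize the index.  Information view: maximize it.
best_by_cov_trace = min((P1, P2), key=trace)
best_by_inf_trace = max((P1, P2), key=lambda d: trace(inv(d)))
best_by_cov_det = min((P1, P2), key=det)
best_by_inf_det = max((P1, P2), key=lambda d: det(inv(d)))

print("trace agrees:", best_by_cov_trace == best_by_inf_trace)  # False
print("det agrees:  ", best_by_cov_det == best_by_inf_det)      # True
```

Because det(P⁻¹) = 1/det(P), the determinant preserves the ranking across the two views, whereas trace(P⁻¹) is not a monotone function of trace(P), consistent with the abstract's finding that the trace-style indexes fail on the information matrix.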
In this work we analyze the performance of several approaches to sniper localization in a network of mobile sensors.
Mobility increases the complexity of calibration (i.e., self localization, orientation, and time synchronization) in
a network of sensors. The sniper localization approaches studied here rely on time-difference of arrival (TDOA)
measurements of the muzzle blast and shock wave from multiple, distributed single-sensor nodes. Although
these approaches eliminate the need for self-orienting, node position calibration and time synchronization are
still persistent problems. We analyze the influence of geometry and the sensitivity to time synchronization and
node location uncertainties. We provide a Cramer-Rao bound (CRB) for location and bullet trajectory estimator
errors for each respective approach. When the TDOA is taken as the difference between the muzzle blast and
shock wave arrival times, the resulting localization performance is independent of time synchronization and is
less affected by geometry compared to other approaches.
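The claimed independence from time synchronization follows because the muzzle-blast and shock-wave arrivals are timestamped by the same node clock, so any per-node offset cancels in the difference. A minimal numeric sketch, with illustrative propagation times:

```python
# Sketch: why differencing shock-wave and muzzle-blast arrivals at the SAME
# node is independent of time synchronization.  All times are illustrative.
import random

def arrivals(t_shot, t_shock_prop, t_blast_prop, clock_offset):
    """Arrival timestamps at one node, as read by its (offset) local clock."""
    t_shock = t_shot + t_shock_prop + clock_offset
    t_blast = t_shot + t_blast_prop + clock_offset
    return t_shock, t_blast

random.seed(0)
for _ in range(3):
    offset = random.uniform(-1.0, 1.0)   # unknown per-node clock offset
    t_shock, t_blast = arrivals(0.0, 0.05, 0.30, offset)
    # The within-node difference cancels the offset entirely:
    print(round(t_blast - t_shock, 6))
```

A TDOA taken between two different nodes would retain the difference of their two offsets, which is why that formulation remains sensitive to synchronization errors.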
The next generation of night vision goggles will fuse image-intensified and long-wave infrared imagery to create a hybrid image that will enable soldiers to better interpret their surroundings during nighttime missions. Paramount to the development of such goggles is the exploitation of image quality measures to automatically determine the best image fusion algorithm for a particular task. This work introduces a novel monotonic correlation coefficient to investigate how well candidate image quality features correlate with actual human performance, which is measured by a perception study. The paper demonstrates how monotonic correlation can identify worthy features that could be overlooked by the traditional Pearson correlation.
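The motivation for a monotonic correlation coefficient can be illustrated with Spearman's rank correlation as a stand-in (the paper's coefficient is novel, so this is only an analogy, and the feature/performance data below are synthetic):

```python
import math

def pearson(x, y):
    """Standard Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(v):
    """Rank of each value (assumes no ties)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman(x, y):
    """Rank (monotonic) correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# A feature perfectly monotonically related to performance, but nonlinearly:
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
performance = [f ** 5 for f in feature]  # strongly convex, still monotone
print("Pearson:  ", round(pearson(feature, performance), 3))
print("Monotonic:", round(spearman(feature, performance), 3))
```

The Pearson score is noticeably below 1 for this perfectly monotone relationship, while the rank-based score is exactly 1, which is the kind of feature a linear correlation screen could overlook.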
Proc. SPIE 4393, Unattended Ground Sensor Technologies and Applications III
This paper extends our development of acoustical bearings-only target localization to the case of multiple moving targets. The resulting techniques can be used to locate and track targets traveling through a network of acoustical sensor arrays. Each array computes and transmits multiple direction-of-arrival (DOA) estimates to a central processor, which employs the target localization technique. In previous work, we developed maximum likelihood (ML) techniques that may or may not account for the fact that a bearing measurement points to the location of a moving target at a retarded time. By inserting a simple bearings association computation in the ML methods, we define quasi-ML techniques that can estimate the location and velocity of multiple targets using multiple bearing estimates per sensor array.
We introduce a pose estimation method for SAR imagery using the 2D continuous wavelet transform (CWT). The computational complexity of the new approach is comparable to other image-based approaches, such as ones that incorporate principal component analysis (PCA). Using the public domain MSTAR database, we show that the CWT-based method provides a better pose estimate than the PCA method.
A new paradigm in ground surveillance consists of swarms of autonomous internetted sensors that can be used for target localization and environmental monitoring. The individual component is an inexpensive device containing multiple sensor types, a processor and wireless communication hardware. Scattered over a certain region, these devices are able to detect the direction or proximity of targets. One of the most limiting factors of the devices is the battery supply. In order to conserve power, these units should be able to adjust their activities to the current situations. Energy consuming signal processing should only be performed if the quality of the raw sensor data promises a significant improvement to the localization results. We propose a self-organized control system that allows the devices to select the algorithm complexity which balances the requirements for good localization performance and energy conservation. The devices make their selection autonomously, based on their own sensor data, information that they receive from other devices in the region, and the amount of energy they have left. The capability of this system will be demonstrated via computer simulations.
This paper presents a novel scheme to detect and discriminate landmines from other clutter objects during the image formation process for ultra-wideband (UWB) synthetic aperture radar (SAR) systems. By identifying likely regions containing the targets of interest, i.e., landmines, it is possible to speed up the overall formation time by pruning the processing to resolve regions that do not contain targets. The image formation algorithm is a multiscale approximation to standard backprojection known as the quadtree that uses a 'divide-and-conquer' strategy. The intermediate quadtree data admits multiresolution representations of the scene, and we develop a contrast statistic to discriminate structured/diffuse regions and an aperture diversity statistic to discriminate between regions containing mines and desert scrub. The potential advantages of this technique are illustrated using data collected at Yuma, AZ by the ARL BoomSAR system.
This paper discusses the utility of scale-angle continuous wavelet transform features for object classification. These features are used as input to two algorithms: character recognition and target recognition in FLIR images. The corresponding recognition algorithm is robust against noise and allows data reduction. A comparative study is made between two types of directional wavelets derived from the Mexican hat wavelet and the usual template matching.
In this work, we introduce a detection scheme that is able to identify regions of interest during the intermediate stages of an image formation process for ultra-wideband (UWB) synthetic aperture radar. Traditional detection methods manipulate the data after image formation. However, this approach wastes computational resources by resolving to completion the entire scene, including areas dominated by benign clutter. As an alternative, we introduce a multiscale focus of attention (FOA) algorithm that processes intermediate radar data from a quadtree-based backprojection image formation algorithm. As the stages of the quadtree algorithm progress, the FOA thresholds a detection statistic that estimates the signal-to-background ratio for increasingly smaller subpatches. Whenever a subpatch fails a detection, the FOA cues the image formation processor to terminate further processing of that subpatch. We demonstrate that the FOA is able to decrease the overall computational load of the image formation process by a factor of two. We also show that the new FOA method provides fewer false alarms than the two-parameter CFAR FOA over a small database of UWB radar data.
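The prune-as-you-subdivide idea can be sketched as a toy quadtree recursion; the grid, the max-over-background statistic, and the threshold below are illustrative stand-ins for the paper's signal-to-background detection statistic:

```python
# Sketch of the focus-of-attention idea: recursively subdivide the scene and
# stop resolving any subpatch whose signal-to-background statistic falls
# below a threshold.  Grid, statistic, and threshold are illustrative.

def foa(grid, x, y, size, background, threshold, detections):
    patch = [grid[j][i] for j in range(y, y + size) for i in range(x, x + size)]
    sbr = max(patch) / background  # crude signal-to-background statistic
    if sbr < threshold:
        return  # cue the image former to skip this subpatch entirely
    if size == 1:
        detections.append((x, y))
        return
    h = size // 2
    for dx, dy in ((0, 0), (h, 0), (0, h), (h, h)):
        foa(grid, x + dx, y + dy, h, background, threshold, detections)

# 4x4 scene: mostly background-level clutter with one bright target pixel.
scene = [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 9, 1],
         [1, 1, 1, 1]]
hits = []
foa(scene, 0, 0, 4, background=1.0, threshold=5.0, detections=hits)
print(hits)  # only the bright pixel survives the pruning
```

Three of the four top-level quadrants are pruned at the first subdivision, which is the source of the computational savings the abstract reports.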
The Extended Fractal (EF) feature has been shown to lower the false alarm rate for the focus of attention (FOA) stage of a synthetic aperture radar (SAR) automatic target recognition (ATR) system. The feature is both contrast and size sensitive, and thus, can discriminate between targets and many types of cultural clutter at the earliest stages of the ATR. In this paper we modify the EF feature so that one can 'tune' the size sensitivity to the specific targets of interest. We show how to optimize the EF feature using target chip data from the public MSTAR database. We demonstrate improvements in performance for FOA algorithms that include the new feature by comparing the receiver operating characteristic (ROC) curves for all possible combinations of FOA algorithms incorporating EF, two-parameter CFAR, and variance features. Finally, we perform timing experiments on the fused detector to demonstrate the feasibility for implementation of the detector in a real system.
We addressed the problem of classifying 10 target types in imagery formed from synthetic aperture radar (SAR). By executing a group training process, we show how to increase the performance of 10 initial sets of target templates formed by simple averaging. This training process is a modified learning vector quantization (LVQ) algorithm that was previously shown effective with forward-looking infrared (FLIR) imagery. For comparison, we ran the LVQ experiments using coarse, medium, and fine template sets that captured the target pose signature variations over 60 degrees, 40 degrees, and 20 degrees, respectively. Using sequestered test imagery, we evaluated how well the original and post-LVQ template sets classify the 10 target types. We show that after the LVQ training process, the coarse template set outperforms the coarse and medium original sets. And, for a test set that included untrained version variants, we show that classification using coarse template sets nearly matches that of the fine template sets. In a related experiment, we stored 9 initial template sets to classify 9 of the target types and used a threshold to separate the 10th type, previously found to be a 'confusing' type. We used imagery of all 10 targets in the LVQ training process to modify the 9 template sets. Overall classification performance increased slightly and an equalization of the individual target classification rates occurred, as compared to the 10-template experiment. The SAR imagery that we used is publicly available from the Moving and Stationary Target Acquisition and Recognition (MSTAR) program, sponsored by the Defense Advanced Research Projects Agency (DARPA).
We address the problem of recognizing 10 types of vehicles in imagery formed from synthetic aperture radar (SAR). SAR provides all-weather, day-or-night imagery of the battlefield. To aid in the analysis of the copious amounts of imagery available today, automatic target recognition (ATR) algorithms, which are either template-based or model-based, are needed. We enhanced template-based algorithms by using an artificial neural network (ANN) to increase the discriminating characteristics of 10 initial sets of templates. The ANN is a modified learning vector quantization (LVQ) algorithm, previously shown effective with forward-looking IR (FLIR) imagery. For comparison, we ran the experiments with LVQ using three different-sized template sets. These template sets captured the target signature variations over 60 degrees, 40 degrees, and 20 degrees. We allowed LVQ to modify the templates, as necessary, using the training imagery from all 10 targets. The resulting templates represent the 10 target types with greater separability in feature space. Using sequestered test imagery, we compared the pre- and post-LVQ template sets in terms of their ability to discriminate the 10 target types. All training and test imagery is publicly available from the Moving and Stationary Target Acquisition and Recognition program sponsored by the Defense Advanced Research Projects Agency.
This paper investigates methods to improve template-based synthetic aperture radar (SAR) automatic target recognition (ATR). The approach utilizes clustering methods motivated from the vector quantization (VQ) literature to search for templates that best represent the signature variability of target chips. The ATR performance using these new templates is compared to the performance using standard templates. For baseline SAR ATR, the templates are generated over uniform angular bins in the pose space. A merge method is able to generate templates that provide a nonuniform sampling of the pose space, and these templates produce modest gains in ATR performance over standard templates.
This paper investigates the use of Continuous Wavelet Transform (CWT) features for detection of targets in low-resolution FLIR imagery. We specifically use the CWT features corresponding to the integration of target features at all relevant scales and orientations. These features are combined with non-linear transformations (thresholding, enhancement, morphological operations). We compare our previous results using the Mexican hat wavelet with those obtained using two types of directional wavelets: the Morlet wavelet and the Cauchy wavelets. The algorithm was tested on the TRIM2 database.
In this work, we evaluate the robustness of template matching schemes for automatic target recognition (ATR) against the effects of clutter layover. Our experiments characterize the performance of template matching ATR in various image transform domains as a function of the signal-to-clutter ratio (SCR). The purpose of these transforms is to enhance the target features in a chip while suppressing features representative of background clutter or simple noise. The ATR experiments were performed for synthetic aperture radar imagery using target chips in the public domain MSTAR database. The transforms include pointwise nonlinearities such as the logarithm and power operations. The templates are generated using the training portion of the MSTAR database at the nominal SCR. Many different ATR parameterizations are considered for each transform domain, where templates are built to represent different ranges of aspect angles in uniform angular bins of 5, 10, 15, 30, and 45 degree increments. The different ATRs were evaluated using the testing portion of the database, where synthetic clutter was added to lower the SCR.
Two texture-based features and one amplitude-based feature are evaluated as detection statistics for synthetic aperture radar (SAR) imagery. The statistics include a local variance, an extended fractal, and a two-parameter CFAR feature. The paper compares the effectiveness of focus of attention (FOA) algorithms that consist of any number of combinations of the three statistics. The public MSTAR database is used to derive receiver operating characteristic (ROC) curves for the different detectors at various signal-to-clutter ratios (SCRs). The database contains one-foot resolution X-band SAR imagery. The results in the paper indicate that the extended fractal statistic provides the best target/clutter discrimination, and the variance statistic is the most robust against SCR. In fact, the extended fractal statistic combines the intensity difference information used also by the CFAR feature with the spatial extent of the higher intensity pixels to generate an attractive detection statistic.
In images, anomalies such as edges or object boundaries take on a perceptual significance that is far greater than their numerical energy contribution to the image. The wavelet transform highlights these anomalies by representing them with significant coefficients. The contribution of a wavelet coefficient to the perceptual quality of the image is related to its magnitude. Degradation in image quality due to image compression is reflected in a reduction of the magnitude of the wavelet coefficients. Since significant wavelet coefficients appear across different scales and orientations, it is important to observe the wavelet transform at different scales and orientations. In this paper, the wavelet transform of a given image and the reconstructed images at various quality levels are represented in the form of energy density plots suggested in reference one. A quality metric is proposed based on the absolute difference between the energy densities corresponding to the original and reconstructed images. Preliminary results obtained using the scale-based image quality evaluation strategy are reported.
This paper presents a method to evaluate image quality using the continuous wavelet transform. The method utilizes a bank of filters tuned to different scales and orientations to extract the image details. The filters are designed according to the criterion suggested by Antoine and Murenzi. The wavelet transform of a given image and the reconstructed images at various quality levels are represented in the form of energy density plots. These density plots highlight image features such as edges, object boundaries and texture. Thus, they represent the details contained in the image. A quality metric is proposed based on the absolute difference between the energy densities corresponding to the original and reconstructed images. The proposed metric is used to measure the relative quality of the image. In addition, the metric is also used to study the performance of a specific ATR algorithm as a function of image quality.
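The energy-density metric can be sketched with a 1-D Haar pyramid standing in for the paper's 2-D directional CWT filter bank; the signals and the number of levels are illustrative:

```python
# Sketch of the energy-density quality metric, using a 1-D Haar transform as
# a stand-in for the 2-D directional CWT filter bank described above.

def haar_level(signal):
    """One Haar analysis step: (approximation, detail) coefficients."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def energy_density(signal, levels):
    """Detail energy per scale: the 'energy density' profile of the signal."""
    densities = []
    for _ in range(levels):
        signal, detail = haar_level(signal)
        densities.append(sum(d * d for d in detail))
    return densities

def quality_metric(original, reconstructed, levels=3):
    """Sum of absolute differences between the two energy-density profiles."""
    e1 = energy_density(list(original), levels)
    e2 = energy_density(list(reconstructed), levels)
    return sum(abs(a - b) for a, b in zip(e1, e2))

x = [0, 0, 0, 8, 8, 0, 0, 0]          # a signal with sharp edges
blurred = [0, 0, 2, 6, 6, 2, 0, 0]    # a degraded copy with softened edges
print(quality_metric(x, x))           # identical signals score 0
print(quality_metric(x, blurred) > 0) # degradation raises the metric
```

Softening the edges drains energy from the finest scale, so the per-scale energy profile shifts and the metric registers the degradation even though the total signal energy barely changes.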
The increase in the number of multimedia databases consisting of images has created a need for a quick method to search these databases for a particular type of image. An image retrieval system will output images from the database similar to the query image in terms of shape, color, and texture. For the scope of our work, we study the performance of multiscale Hurst parameters as texture features for database image retrieval over a database consisting of homogeneous textures. These extended Hurst features represent a generalization of the Hurst parameter for fractional Brownian motion (fBm), where the extended parameters quantify the texture roughness of an image at various scales. We compare the retrieval performance of the extended parameters against traditional Hurst features and features obtained from the Gabor wavelet. Gabor wavelets have previously been suggested for image retrieval applications because they can be tuned to obtain texture information for a number of different scales and orientations. In our experiments, we form a database combining textures from the Bonn, Brodatz, and MIT VisTex databases. Over the hybrid database, the extended fractal features were able to retrieve a higher percentage of similar textures than the Gabor features. Furthermore, the fractal features are faster to compute than the Gabor features.
The utility of multiscale Hurst features is determined for segmentation of clutter in SAR imagery. These multiscale Hurst features represent a generalization of the Hurst parameter for fractional Brownian motion (fBm), where these new features measure texture roughness at various scales. A clutter segmentation algorithm is described using only these new Hurst parameters as features. The performance of the algorithm was tested on measured one-foot resolution SAR data, and the results are comparable to other algorithms proposed in the literature. The advantage of the multiscale Hurst features is that they can be computed quickly and they can discriminate clutter well in unprocessed, single-polarization, magnitude-detected SAR imagery.
In this paper, we introduce the 2D continuous wavelet transform (CWT) as a tool for the detection stages of a SAR ATR system. We demonstrate that the 2D CWT, tuned to reasonable target sizes, can enhance the signal-to-clutter ratio and improve detection performance in a focus of attention algorithm as compared to the traditional CFAR algorithm. We also show that the 2D CWT can be used to estimate the size and pose of a target, which can provide important features for second-level detection. The detection and feature extraction algorithms were tested on measured one-foot resolution SAR data.
We propose a new method for terrain texture synthesis by using a generalized two-dimensional fractional Brownian motion (fBm) model called the extended self-similar (ESS) process. The utility of 2-D fBm for terrain texture modeling has been examined by some researchers. Although the fBm may provide a good model for landscapes at some scales, it will not capture the behavior of the terrain at all scales. We introduce the ESS process to model terrains at all scales, where the parameters of the ESS model provide a multiscale roughness representation of the landscape. Specifically, we define a generalized Hurst parameter which changes with respect to scale. To validate the usefulness of the new model, we show how to estimate the generalized Hurst parameters from 2-D data and how to synthesize an ESS process. The generation method is based on Fourier synthesis of the stationary ESS increments, and the algorithm has a complexity of O(N² log N) for an image of size N × N. Then, we demonstrate the relation between the generalized Hurst parameter and visual roughness through examples of synthesized images. Finally, we examine the ability of the ESS process to render real terrain data.
In this work, we use the 1D Haar transform fractal estimation algorithm to calculate the local fractal dimension estimates of 2D texture data. The new algorithm provides directed fractal dimension estimates which are used as features for texture segmentation. The method is fast due to the pyramid structure of the Haar transform and nearly optimal in the maximum likelihood sense for fBm data. We compare the low complexity of this new algorithm with the complexity of existing fractal feature extraction techniques, and test our new method on fBm data and real Brodatz textures.
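The scale-based Hurst estimation idea can be sketched in simplified 1-D form; an ordinary random walk (H = 0.5) stands in for fBm data, and the regression levels, sample size, and the approximate 2^(2Hj) variance-scaling law used below are illustrative assumptions rather than the paper's estimator:

```python
import math, random

def haar_detail_variances(signal, levels):
    """Variance of Haar detail coefficients at each pyramid level."""
    variances = []
    for _ in range(levels):
        approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
        detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
        m = sum(detail) / len(detail)
        variances.append(sum((d - m) ** 2 for d in detail) / len(detail))
        signal = approx
    return variances

def estimate_hurst(signal, levels=6):
    """Least-squares slope of log2(variance) vs level; for fBm-like data the
    detail variance grows approximately like 2^(2*H*j), so H ~ slope / 2."""
    v = haar_detail_variances(signal, levels)
    xs = list(range(levels))
    ys = [math.log2(x) for x in v]
    mx, my = sum(xs) / levels, sum(ys) / levels
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return slope / 2

# Ordinary Brownian motion (H = 0.5) as a sanity check:
random.seed(1)
walk, pos = [], 0.0
for _ in range(1 << 15):
    pos += random.gauss(0.0, 1.0)
    walk.append(pos)
print("estimated H:", round(estimate_hurst(walk), 2))
```

The pyramid structure is the source of the speed the abstract cites: each level halves the data, so the whole set of scale features costs O(N) for an N-sample signal. The estimate here is slightly biased low because the 2^(2Hj) scaling only holds asymptotically in the level index.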