Object detection on imagery captured onboard aerial platforms involves challenges different from those in ground-to-ground object detection. For example, images captured from UAVs with varying altitudes and view angles present challenges for machine learning arising from variations in appearance and scene attributes. Thus, it is essential to closely examine the critical variables that impact object detection from UAV platforms, such as the significant variations in pose, range to objects, background clutter, lighting, weather conditions, and velocity/acceleration of the UAV. To that end, in this work, we introduce a UAV-based image dataset, called the Archangel dataset, collected with a UAV that includes pose and range information in the form of metadata. Additionally, we use the Archangel dataset to conduct comprehensive studies of how the critical attributes of UAV-based images affect machine learning models for object detection. The extensive analysis on the Archangel dataset aims to advance optimal training and testing of machine learning models in general, as well as the more specific case of UAV-based object detection using deep neural networks.
Object detection from high resolution images is increasingly used for many important application areas of defense and commercial sensing. However, object detection on high resolution images requires intensive computation, which makes it challenging to apply on resource-constrained platforms such as in edge-cloud deployments. In this work, we present a novel system for streamlined object detection on edge-cloud platforms. The system integrates multiple object detectors into an ensemble to improve detection accuracy and robustness. The subset of object detectors that is active in the ensemble can be changed dynamically to provide adaptively adjusted trade-offs among object detection accuracy, real-time performance, and energy consumption. Such adaptivity can be of great utility for resource-constrained deployment to edge-cloud environments, where the execution time and energy cost of full-accuracy processing may be excessive if utilized all of the time. To promote efficient and reliable implementation on resource-constrained devices, the proposed system design employs principles of signal processing oriented dataflow modeling along with pipelining of dataflow subsystems and integration on top of optimized, off-the-shelf software components for lower level processing. The effectiveness of the proposed object detection system is demonstrated through extensive experiments involving the Unmanned Aerial Vehicle Benchmark and KITTI Vision Benchmark Suite. While the proposed system is developed for the specific problem of object detection, we envision that the underlying design methodology, which integrates adaptive ensemble processing with dataflow modeling and optimized lower level libraries, is applicable to a wide range of applications in defense and commercial sensing.
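To make the adaptivity concrete, the sketch below selects which detectors stay active under latency and energy budgets. It is a minimal illustration, not the paper's scheduler: the detector names, the accuracy/latency/energy profiles, and the simple additive ensemble model are all hypothetical.

```python
# Minimal sketch of dynamic detector-subset selection for an adaptive
# ensemble. The accuracy/latency/energy profiles below are hypothetical
# placeholders, not measurements from the paper.

from itertools import combinations

# (name, mAP contribution, latency in ms, energy in mJ) -- illustrative only
DETECTORS = [
    ("yolo_tiny",  0.42, 12.0,  90.0),
    ("ssd_mobile", 0.45, 18.0, 130.0),
    ("frcnn_r50",  0.58, 95.0, 700.0),
]

def ensemble_profile(subset):
    """Crude ensemble model: accuracy of the best member plus a small
    diversity bonus; latency/energy are summed (sequential execution)."""
    acc = max(d[1] for d in subset) + 0.02 * (len(subset) - 1)
    latency = sum(d[2] for d in subset)
    energy = sum(d[3] for d in subset)
    return acc, latency, energy

def select_subset(latency_budget_ms, energy_budget_mj):
    """Exhaustively pick the feasible subset with the best modeled accuracy.
    Fine for a handful of detectors; a real system would use profiled data."""
    best, best_acc = None, -1.0
    for r in range(1, len(DETECTORS) + 1):
        for subset in combinations(DETECTORS, r):
            acc, lat, en = ensemble_profile(subset)
            if lat <= latency_budget_ms and en <= energy_budget_mj and acc > best_acc:
                best, best_acc = subset, acc
    return best

# Tight budget -> small subset; generous budget -> richer ensemble.
print([d[0] for d in select_subset(40.0, 300.0)])
print([d[0] for d in select_subset(200.0, 1500.0)])
```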
Object detection is increasingly used onboard Unmanned Aerial Vehicles (UAVs) for various applications; however, the machine learning (ML) models for UAV-based detection are often validated using data curated for tasks unrelated to the UAV application. This is a concern because training neural networks on large-scale benchmarks has shown excellent capability in generic object detection tasks, yet conventional training approaches can lead to large inference errors for UAV-based images. Such errors arise due to differences in imaging conditions between images from UAVs and images in training. To overcome this problem, we characterize boundary conditions of ML models, beyond which the models exhibit rapid degradation in detection accuracy. Our work is focused on understanding the impact of different UAV-based imaging conditions on detection performance by using synthetic data generated with a game engine. Properties of the game engine are exploited to populate the synthetic datasets with realistic and annotated images. Specifically, it enables fine control of various parameters, such as camera position, view angle, illumination conditions, and object pose. Using the synthetic datasets, we analyze detection accuracy in different imaging conditions as a function of the above parameters. We use three well-known neural network models with different model complexities in our work. In our experiments, we observe and quantify the following: 1) how detection accuracy drops as the camera moves toward the nadir-view region; 2) how detection accuracy varies depending on different object poses; and 3) the degree to which the robustness of the models changes as illumination conditions vary.
Recently, we introduced a state-of-the-art object detection approach referred to as Multi-Expert R-CNN (ME R-CNN), which features multiple expert classifiers, each responsible for recognizing objects with distinctive geometrical features. The ME R-CNN architecture consists of multiple components: a shared convolutional network, Multi-Expert classifiers (ME), and an Expert Assignment Network (EAN). Both ME and EAN take as a common input the output of the convolutional network and also use each other's output during training. Thus, it is quite challenging to properly train all the components simultaneously so as to globally optimize the network parameters. The main innovation of the proposed work is to optimize the entire architecture with a novel training strategy in which a manually associated 'RoI-to-expert' mapping is used, instead of the direct output of ME, for training EAN. Our experiments show that the proposed training strategy speeds up training by at least 4.2x while maintaining comparable object detection accuracy.
Object detection from images captured by Unmanned Aerial Vehicles (UAVs) is widely used for surveillance, precision agriculture, package delivery, and aerial photography, among other applications. Very recently, a benchmark on object detection using UAV-collected images, called VisDrone2018, was released. However, a large performance drop is observed when current state-of-the-art object detection approaches, developed primarily for ground-to-ground images, are directly applied to the VisDrone2018 dataset. For example, the best detection model on VisDrone2018 achieved a detection accuracy of only 0.31 mAP, significantly lower than that of ground-based object detection. This performance drop is mainly caused by several challenges, such as 1) flying altitudes varying from 1000 feet to 10 feet, 2) different weather conditions such as fog, rain, and low light, and 3) a wide range of camera viewing angles. To overcome these challenges, in this paper we propose a novel adversarial training approach that aims to learn features invariant to varying altitudes, viewing angles, weather conditions, and object scales. The adversarial training draws on the "free" metadata that comes with UAV datasets, providing information about the data themselves, such as height, scene visibility, and viewing angle. We demonstrate the effectiveness of our proposed algorithm on the recently proposed UAVDT dataset, and also show that it generalizes well when applied to the different VisDrone2018 dataset. We also show the robustness of the proposed approach to variations in altitude, viewing angle, weather, and object scale.
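One common way to realize this kind of metadata-driven adversarial training is a gradient reversal layer between the feature extractor and a metadata (domain) classifier. The PyTorch sketch below illustrates that mechanism under assumptions of ours: the feature dimensions, the four altitude bins, and the use of a gradient reversal layer itself are illustrative, not necessarily the paper's exact formulation.

```python
# Hedged sketch of metadata-driven adversarial training via a gradient
# reversal layer (GRL), one common way to learn domain-invariant features.
# Dimensions and the use of a GRL here are illustrative assumptions.

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients flowing into the feature extractor,
        # so features become uninformative about the metadata-derived domain.
        return -ctx.lam * grad_output, None

features = nn.Sequential(nn.Linear(256, 128), nn.ReLU())  # stand-in backbone
domain_head = nn.Linear(128, 4)  # e.g., 4 altitude bins from "free" metadata

x = torch.randn(32, 256)               # stand-in image features
domain = torch.randint(0, 4, (32,))    # metadata-derived domain labels

f = features(x)
logits = domain_head(GradReverse.apply(f, 1.0))
loss = nn.functional.cross_entropy(logits, domain)
loss.backward()  # the detection loss would be added on the un-reversed branch
```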
Machine learning (ML) based perception algorithms are increasingly being used in the development of autonomous navigation systems for self-driving vehicles. These vehicles are mainly designed to operate on structured roads or lanes, and the ML algorithms are primarily used for functionalities such as object tracking, lane detection, and semantic understanding. On the other hand, autonomous/unmanned ground vehicles (UGVs) being developed for military applications need to operate in unstructured combat environments, including diverse off-road terrain, inclement weather conditions, water hazards, GPS-denied environments, smoke, etc. Therefore, the perception algorithm requirements are different and have to be robust enough to account for several diverse terrain conditions and degradations in the visual environment. In this paper, we present military-relevant requirements and challenges for scene perception that are not met by current state-of-the-art algorithms, and discuss potential strategies to address these capability gaps. We also present a survey of ML algorithms and datasets that could be employed to support maneuver of autonomous systems in complex terrain, focusing on techniques for (1) distributed scene perception using heterogeneous platforms, (2) computation in resource-constrained environments, and (3) object detection in degraded visual imagery.
Recently, intelligent machine agents, such as deep neural networks (DNNs), have shown unparalleled capabilities in recognizing visual patterns, objects, and semantic activities/events embedded in real-world images and videos. Hence, there has been an increasing need to deploy DNNs to the battlefield to provide the Soldier with real-time situational understanding by capturing a holistic view of the battlespace. Soldiers engaged in tactical operations can greatly benefit from leveraging advanced at-the-point-of-need data analytics running on multimodal and heterogeneous platforms in distributed and constrained network environments. The proposed work aims to decompose DNNs and distribute them over edge nodes in such a way that the trade-off between the resources available in the constrained network and recognition performance can be optimized. In this work, we decompose DNNs into two stages: an initial stage running on an edge device and the remaining portion running on an edge cloud. To effectively and efficiently divide DNNs into two separate stages, we rigorously analyze multiple widely used DNN architectures with respect to their memory size and FLOPs (floating point operations) per layer. Based on these analyses, we develop advanced splitting strategies for DNNs to handle various network constraints.
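As a toy illustration of choosing a split point from per-layer analysis, the sketch below runs a prefix of layers on the edge device, subject to a compute budget, and picks the cut with the smallest activation to transmit. The layer names, MFLOPs, and activation sizes are hypothetical placeholders, not measurements of any particular architecture.

```python
# Sketch of choosing an edge/cloud split point from per-layer profiles.
# The numbers below are hypothetical stand-ins for the memory/FLOPs
# analysis described in the text.

# (layer name, MFLOPs, output activation size in KB)
PROFILE = [
    ("conv1", 110.0, 790.0),
    ("conv2", 220.0, 390.0),
    ("conv3", 150.0, 200.0),
    ("conv4", 150.0, 200.0),
    ("fc",     50.0,   4.0),
]

def best_split(edge_mflops_budget):
    """Run a prefix of layers on the edge device and ship the smallest
    possible activation to the cloud, subject to the edge compute budget."""
    best_idx, best_kb, used = 0, float("inf"), 0.0
    for i, (_, mflops, act_kb) in enumerate(PROFILE, start=1):
        used += mflops
        if used > edge_mflops_budget:
            break
        if act_kb < best_kb:
            best_idx, best_kb = i, act_kb
    return PROFILE[:best_idx], best_kb

edge_part, tx_kb = best_split(edge_mflops_budget=500.0)
print([name for name, _, _ in edge_part], f"transmit ~{tx_kb} KB/frame")
```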
Terror attacks are often targeted at civilians gathered in one location (e.g., the Boston Marathon bombing). Distinguishing such 'malicious' scenes from 'normal' ones, which are semantically different, is a difficult task, as both types of scenes contain large groups of people with high visual similarity. To overcome this difficulty, previous methods exploited various contextual information, such as language-driven keywords or relevant objects. Although useful, they require additional human effort or datasets. In this paper, we show that using more sophisticated and deeper Convolutional Neural Networks (CNNs) can achieve better classification accuracy even without using any additional information outside the image domain. We conducted a comparative study in which we train and compare seven different CNN architectures (AlexNet, VGG-M, VGG16, GoogLeNet, ResNet-50, ResNet-101, and ResNet-152). Based on the experimental analyses, we found that deeper networks typically show better accuracy, and that GoogLeNet is the most favorable among the seven architectures for the task of malicious event classification.
Efficient and accurate real-time perception systems are critical for Unmanned Aerial Vehicle (UAV) applications that aim to provide enhanced situational awareness to users. Specifically, object recognition is a crucial element for surveillance and reconnaissance missions since it provides fundamental semantic information about the aerial scene. In this study, we describe the development and implementation of a perception framework on an embedded computer vision platform, mounted on a hexacopter, for real-time object detection. The framework includes a camera driver and a deep neural network based object detection module, and has distributed computing capabilities between the aerial platform and the corresponding ground station. Preliminary aerial real-time object detections using YOLO are performed onboard the UAV, and a sequence of images is streamed to the base station, where an advanced computer vision algorithm, referred to as Multi-Expert Region-based CNN (ME-RCNN), is leveraged to provide enhanced and fine-grained analytics on the aerial video feeds. Since annotated aerial imagery in the UAV domain is hard to obtain and not routinely available, we use a combination of aerial data and synthetic air-to-ground images of objects, such as vehicles, generated by video gaming engines for training the neural network. Through this study, we quantify the level of improvement obtained with the use of the synthetic dataset and the efficacy of using advanced object detection algorithms.
KEYWORDS: Convolutional neural networks, Electroencephalography, Data modeling, Visualization, Signal to noise ratio, Visual process modeling, Machine vision, Performance modeling, Convolution, Data fusion
Traditionally, Brain-Computer Interfaces (BCI) have been explored as a means to return function to paralyzed or otherwise debilitated individuals. An emerging use for BCIs is in human-autonomy sensor fusion where physiological data from healthy subjects is combined with machine-generated information to enhance the capabilities of artificial systems. While human-autonomy fusion of physiological data and computer vision have been shown to improve classification during visual search tasks, to date these approaches have relied on separately trained classification models for each modality. We aim to improve human-autonomy classification performance by developing a single framework that builds codependent models of human electroencephalograph (EEG) and image data to generate fused target estimates. As a first step, we developed a novel convolutional neural network (CNN) architecture and applied it to EEG recordings of subjects classifying target and non-target image presentations during a rapid serial visual presentation (RSVP) image triage task. The low signal-to-noise ratio (SNR) of EEG inherently limits the accuracy of single-trial classification and when combined with the high dimensionality of EEG recordings, extremely large training sets are needed to prevent overfitting and achieve accurate classification from raw EEG data. This paper explores a new deep CNN architecture for generalized multi-class, single-trial EEG classification across subjects. We compare classification performance from the generalized CNN architecture trained across all subjects to the individualized XDAWN, HDCA, and CSP neural classifiers which are trained and tested on single subjects. Preliminary results show that our CNN meets and slightly exceeds the performance of the other classifiers despite being trained across subjects.
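For concreteness, a minimal PyTorch sketch of a cross-subject EEG CNN in this spirit is given below: a temporal convolution followed by a spatial convolution across electrodes. The channel/sample counts and the architectural details are assumptions for illustration, not the network proposed above.

```python
# Minimal sketch of a single-trial EEG CNN: temporal filtering along the
# sample axis, then spatial filtering across electrodes. All sizes here
# are illustrative assumptions.

import torch
import torch.nn as nn

class EEGConvNet(nn.Module):
    def __init__(self, n_channels=64, n_samples=256, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            # temporal convolution over the time axis
            nn.Conv2d(1, 16, kernel_size=(1, 31), padding=(0, 15)),
            nn.BatchNorm2d(16), nn.ELU(),
            # spatial convolution across all electrodes
            nn.Conv2d(16, 32, kernel_size=(n_channels, 1)),
            nn.BatchNorm2d(32), nn.ELU(),
            nn.AvgPool2d((1, 8)), nn.Dropout(0.5),
            nn.Flatten(),
            nn.Linear(32 * (n_samples // 8), n_classes),
        )

    def forward(self, x):          # x: (batch, 1, channels, samples)
        return self.net(x)

model = EEGConvNet()
trials = torch.randn(8, 1, 64, 256)   # 8 single trials of raw EEG
print(model(trials).shape)            # -> torch.Size([8, 2])
```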
A novel approach for the fusion of heterogeneous object classification methods is proposed. In order to effectively integrate the outputs of multiple classifiers, the level of ambiguity in each individual classification score is estimated using the precision/recall relationship of the corresponding classifier. The main contribution of the proposed work is a novel fusion method, referred to as Dynamic Belief Fusion (DBF), which dynamically assigns probabilities to hypotheses (target, non-target, and an intermediate state: target or non-target) based on confidence levels in the classification results conditioned on the prior performance of individual classifiers. In DBF, a joint basic probability assignment, obtained by optimally fusing information from all classifiers, is determined by Dempster's combination rule and is easily reduced to a single fused classification score. Experiments on an RSVP dataset demonstrate that the recognition accuracy of DBF is considerably greater than that of conventional naive Bayesian fusion as well as the individual classifiers used in the fusion.
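The core of DBF is Dempster's combination rule over the three hypotheses named above. The sketch below combines two basic probability assignments (BPAs) on {target}, {non-target}, and the intermediate state {target or non-target}; the input BPA values are made up for illustration, whereas DBF derives them from each detector's precision/recall behavior.

```python
# Sketch of Dempster's rule on the three hypotheses used by DBF:
# target {T}, non-target {N}, and the intermediate state {T, N}.
# The per-detector BPAs below are made up for illustration.

def dempster(m1, m2):
    """Combine two BPAs over the frame {T, N}. Keys: 'T', 'N', 'TN'."""
    sets = {"T": {"T"}, "N": {"N"}, "TN": {"T", "N"}}
    combined = {"T": 0.0, "N": 0.0, "TN": 0.0}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = sets[a] & sets[b]
            if not inter:
                conflict += ma * mb           # mass assigned to the empty set
            else:
                key = "TN" if inter == {"T", "N"} else inter.pop()
                combined[key] += ma * mb
    # Normalize by (1 - conflict), per Dempster's rule
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

m_det1 = {"T": 0.6, "N": 0.1, "TN": 0.3}  # confident detector
m_det2 = {"T": 0.3, "N": 0.2, "TN": 0.5}  # weaker detector, more ambiguity
fused = dempster(m_det1, m_det2)
score = fused["T"]                        # single fused classification score
print(fused, score)
```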
Recognizing faces acquired in the thermal spectrum from a gallery of visible face images is a desired capability for the military and homeland security, especially for nighttime surveillance and intelligence gathering. However, thermal-to-visible face recognition is a highly challenging problem, due to the large modality gap between thermal and visible imaging. In this paper, we propose a thermal-to-visible face recognition approach based on multiple kernel learning (MKL) with support vector machines (SVMs). We first subdivide the face into non-overlapping spatial regions or blocks using a method based on coalitional game theory. For comparison purposes, we also investigate uniform spatial subdivisions. Following this subdivision, histogram of oriented gradients (HOG) features are extracted from each block and utilized to compute a kernel for each region. We apply sparse multiple kernel learning (SMKL), which is an MKL-based approach that learns a set of sparse kernel weights, as well as the decision function of a one-vs-all SVM classifier for each of the subjects in the gallery. We also apply equal kernel weights (non-sparse) and obtain one-vs-all SVM models for the same subjects in the gallery. Only visible images of each subject are used for MKL training, while thermal images are used as probe images during testing. With the subdivision generated by game theory, we achieved a Rank-1 identification rate of 50.7% for SMKL and 93.6% for equal kernel weighting using a multimodal dataset of 65 subjects. With uniform subdivisions, we achieved a Rank-1 identification rate of 88.3% for SMKL, but 92.7% for equal kernel weighting.
In this paper, a Support Vector Machine (SVM) based method to jointly exploit spectral and spatial information from hyperspectral images to improve classification performance is presented. In order to optimally exploit this joint information, we propose the novel idea of embedding a local distribution of the input hyperspectral data into a Reproducing Kernel Hilbert Space (RKHS). A Hilbert space embedding called the mean map is utilized to map a group of neighboring pixels of a hyperspectral image into the RKHS and then calculate the empirical mean of the mapped points in the RKHS. SVM-based classification performed on the mean-mapped points can fully exploit the spectral information as well as ensure spatial continuity among neighboring pixels. The proposed technique showed significant improvement over the existing composite kernels on two hyperspectral image data sets.
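A minimal sketch of the mean map idea follows: the inner product between two empirical mean maps reduces to the average kernel value over all cross pairs of pixels, so an SVM with a precomputed Gram matrix can classify neighborhoods directly. The RBF bandwidth, window size, and toy data are assumptions.

```python
# Sketch of the mean-map idea: embed each pixel's spatial neighborhood in
# an RKHS and classify the empirical means with an SVM. Bandwidth, window
# size, and the toy data below are illustrative assumptions.

import numpy as np
from sklearn.svm import SVC

def rbf(X, Y, gamma=0.05):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mean_map_kernel(groups_a, groups_b):
    """K[i, j] = mean of k(x, y) over all pixel pairs from group i and j,
    i.e. the inner product of the two empirical mean maps in the RKHS."""
    K = np.zeros((len(groups_a), len(groups_b)))
    for i, A in enumerate(groups_a):
        for j, B in enumerate(groups_b):
            K[i, j] = rbf(A, B).mean()
    return K

rng = np.random.default_rng(0)
# Each "sample" is a 3x3 neighborhood of spectra (9 pixels x 10 bands).
groups = [rng.normal(c, 1.0, size=(9, 10)) for c in (0.0, 0.0, 2.0, 2.0)]
labels = np.array([0, 0, 1, 1])

K_train = mean_map_kernel(groups, groups)
clf = SVC(kernel="precomputed").fit(K_train, labels)
print(clf.predict(mean_map_kernel(groups, groups)))  # sanity check on train
```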
Non-O157:H7 Shiga toxin-producing Escherichia coli (STEC) strains such as O26, O45, O103, O111, O121, and O145 are recognized as serious causes of outbreaks of human illness due to their toxicity. Conventional microbiological methods for cell counting are laborious and take a long time to produce results. Since optical detection methods are promising for real-time, in-situ foodborne pathogen detection, an acousto-optical tunable filter (AOTF)-based hyperspectral microscopic imaging (HMI) method has been developed for identifying pathogenic bacteria because of its capability to differentiate both the spatial and spectral characteristics of each bacterial cell from microcolony samples. Using the AOTF-based HMI method, 89 contiguous spectral images could be acquired within approximately 30 seconds with a 250 ms exposure time. In this study, we successfully developed a protocol for live-cell immobilization on glass slides, using the modified dry method, to acquire quality spectral images from STEC bacterial cells. Among the contiguous spectral imagery between 450 and 800 nm, the intensities of the spectral images at 458, 498, 522, 546, 570, 586, 670, and 690 nm were distinctive for STEC bacteria. With two different classification algorithms, Support Vector Machine (SVM) and Sparse Kernel-based Ensemble Learning (SKEL), STEC serotype O45 could be classified with 92% detection accuracy.
KEYWORDS: Hyperspectral imaging, Niobium, Detection and tracking algorithms, Target detection, Sensors, Data modeling, Digital imaging, Roads, Image classification, Binary data
In this paper, sparse kernel-based ensemble learning for hyperspectral anomaly detection is proposed. The proposed technique aims to optimize an ensemble of kernel-based one-class classifiers, such as Support Vector Data Description (SVDD) classifiers, by estimating optimal sparse weights. In this method, hyperspectral signatures are first randomly sub-sampled into a large number of spectral feature subspaces. An enclosing hypersphere that defines the support of the spectral data, corresponding to the normalcy/background data, in the Reproducing Kernel Hilbert Space (RKHS) of each respective feature subspace is then estimated using regular SVDD. The enclosing hypersphere basically represents the spectral characteristics of the background data in the respective feature subspace. The joint hypersphere is learned by optimally combining the hyperspheres from the individual RKHSs while imposing an l1 constraint on the combining weights. The joint hypersphere, representing the most compact support of the local hyperspectral data in the joint feature subspaces, is then used to test each pixel in the hyperspectral image data to determine whether or not it belongs to the local background data. The outliers are considered to be targets. A performance comparison between the proposed technique and regular SVDD is provided using the HYDICE hyperspectral images.
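The sketch below illustrates the ensemble structure on toy data: one-class models trained on random spectral subspaces, with weighted score combination. With an RBF kernel, a one-class SVM is equivalent to SVDD, which is why scikit-learn's OneClassSVM stands in here; the uniform weights are a placeholder for the l1-optimized sparse weights described above.

```python
# Sketch of the ensemble idea: one-class models on random spectral
# subspaces, combined with weights over the sub-detectors. Uniform weights
# stand in for the l1-optimized weights; data and sizes are illustrative.

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
n_bands, n_subspaces, bands_per_subspace = 120, 10, 20

background = rng.normal(0.0, 1.0, size=(500, n_bands))   # normalcy data
test_pixels = np.vstack([rng.normal(0.0, 1.0, size=(5, n_bands)),
                         rng.normal(4.0, 1.0, size=(5, n_bands))])  # anomalies last

models, subsets = [], []
for _ in range(n_subspaces):
    idx = rng.choice(n_bands, size=bands_per_subspace, replace=False)
    # RBF one-class SVM as a stand-in for SVDD on this feature subspace
    models.append(OneClassSVM(kernel="rbf", nu=0.1).fit(background[:, idx]))
    subsets.append(idx)

weights = np.full(n_subspaces, 1.0 / n_subspaces)  # stand-in for l1 weights
scores = sum(w * m.decision_function(test_pixels[:, idx])
             for w, m, idx in zip(weights, models, subsets))
print(scores < 0)  # True -> outside the joint support -> declared anomaly
```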
An ensemble learning approach using a number of weak classifiers, each conducting learning on a random subset of the spectral features (bands) of the training samples, is used to detect/identify a specific chemical plume. The support vector machine (SVM) is used as the weak classifier. The detection results of the multiple SVMs are combined to generate a final decision on a pixel's class membership. Due to the multiple learning processes conducted in the randomly selected spectral subspaces, the proposed ensemble learning can improve solution generality. This work uses a two-class approach, with samples taken from hyperspectral image (HSI) cubes collected during a release of the test chemical. Performance results, in the form of receiver operating characteristic curves, show similar performance when compared to a single SVM using the full spectrum. Initial results were obtained by training with samples taken from a single HSI cube. These results are compared to more recent results from training with sample data from 28 HSI cubes. Algorithms trained with high-concentration spectra show very strong responses when scored only on high-concentration data. However, performance drops substantially when low-concentration pixels are scored as well. Training with the low-concentration pixels along with the high-concentration pixels can improve overall solution generality and shows the strength of the ensemble approach. However, it appears that careful selection of the training data and the number of examples can have a significant impact on performance.
Recently, an SVM-based ensemble learning technique was introduced by the authors for hyperspectral plume detection/classification. The SVM-based ensemble learning consists of a number of SVM classifiers, and the decisions from these sub-classifiers are combined to generate a final ensemble decision. The SVM-based ensemble technique first randomly selects spectral feature subspaces from the input data. Each individual classifier then independently conducts its own learning within its corresponding spectral feature space and constitutes a weak classifier. These weak classifiers are combined to make an ensemble decision. The ensemble learning technique provides better performance than the conventional single SVM in terms of error rate. Various aggregating techniques, such as bagging, boosting, majority voting, and weighted averaging, were used to combine the weak classifiers, of which majority voting was found to be the most robust. Yet, the ensemble of SVMs is suboptimal: techniques that optimally weight the individual decisions from the sub-classifiers are strongly desirable to improve ensemble learning performance. In the proposed work, a recently introduced kernel learning technique called Multiple Kernel Learning (MKL) is used to optimally weight the kernel matrices of the sub-SVM classifiers. MKL iteratively performs l2 optimization on the Euclidean norm of the normal vector of the separating hyperplane between the classes (background and chemical plume) defined by the weighted kernel matrix, followed by gradient descent optimization on the l1-regularized weighting coefficients of the individual kernel matrices. Due to the l1 regularization on the weighting coefficients, the optimized weighting coefficients become sparse. The proposed work utilizes these sparse weighting coefficients to combine the decision results of the SVM-based ensemble technique. A performance comparison between the aggregating techniques, MKL and majority voting, as applied to hyperspectral chemical plume detection is presented in the paper.
This paper presents several methods for change detection in a pair of multi-temporal synthetic aperture radar (SAR) images of the same scene. Several techniques which vary in complexity were implemented and compared. Among the simple methods implemented are differencing, Euclidean distance, and image mean ratioing. These methods require minimal processing time, have little computational complexity, and incorporate no statistical information. They have demonstrated some degree of accuracy in detecting changes in SAR imagery. However, the presence of highly correlated speckle noise, misregistration errors, and nonlinear variations in SAR images motivated us to seek more sophisticated methods of change detection in order to obtain more favorable results. Therefore, methods were implemented that incorporate second-order statistics in making a change decision, in an effort to mitigate false alarms arising from the aforementioned causes. The data was first pre-whitened, and then a Wiener prediction-based method, a Euclidean distance measure, and a subspace projection method were implemented. The performance of these methods was compared using multi-look SAR images containing several targets (mines). The results are presented in the form of receiver operating characteristic (ROC) curves.
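The simple pixelwise measures mentioned above (differencing and mean ratioing) can be stated in a few lines; the sketch below applies them to a synthetic speckled image pair. The noise model, inserted target, and threshold are illustrative.

```python
# Sketch of the simple pixelwise change measures mentioned above, on a
# synthetic image pair; noise levels and thresholds are illustrative.

import numpy as np

rng = np.random.default_rng(2)
ref = rng.gamma(4.0, 25.0, size=(128, 128))          # speckle-like SAR scene
test = ref.copy()
test[40:48, 40:48] += 300.0                          # inserted "target"
test *= rng.gamma(50.0, 1 / 50.0, size=test.shape)   # multiplicative noise

diff_map = np.abs(test - ref)                        # image differencing
ratio_map = np.abs(np.log((test + 1e-6) / (ref + 1e-6)))  # log mean-ratio

for name, m in (("difference", diff_map), ("log-ratio", ratio_map)):
    detections = m > np.percentile(m, 99.5)
    print(name, "flagged pixels:", int(detections.sum()))
```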
In the past, many researchers have approached the hyperspectral-imagery anomaly-detection problem from the point of view of classical detection theory. This perspective has resulted in the development of algorithms like RX (Reed-Xiaoli) and the application of processing techniques like PCA (Principal Component Analysis) and ICA (Independent Component Analysis), algorithms and techniques that are based primarily on statistical and probabilistic considerations. In this paper we describe a new anomaly detection paradigm based on an adaptive filtering strategy known as signal subspace processing. The signal-subspace-processing (SSP) techniques on which our algorithm is based have yielded solutions to a wide range of problems in the past (e.g., sensor calibration, target detection, and change detection). These earlier applications, however, utilized SSP to relate reference and test signals that were collected at different times. For our current application, we formulate an approach that relates signals from one spatial region in a hyperspectral image to those from a nearby spatial region in the same image. The motivation and development of the technique are described in detail throughout the course of the paper.
We begin by developing the signal subspace processing anomaly detector (SSPAD) and proceed to illustrate how it arises naturally from the adaptive filtering formulation. We then compare the algorithm with existing anomaly-detection schemes, noting similarities and differences. Finally, we apply both the SSPAD and various existing anomaly detectors to a hyperspectral data set and compare the results via receiver operating characteristic (ROC) curves.
In this paper we propose a Wiener filter-based change detection algorithm for the detection of mines in Synthetic Aperture Radar (SAR) imagery. By computing second order statistics, the Wiener filter-based method has demonstrated improved performance over Euclidean distance. It is more robust to the presence of highly correlated speckle noise, misregistration errors, and nonlinear variations in the two SAR scenes. These variations may result from differences in the data acquisition systems and varying conditions during the different data collect times. A method very similar to the Mahalanobis distance was also implemented to detect mines in SAR images and has shown similar performance to the Wiener filter-based method. We present results in the form of receiver operating characteristics (ROC) curves, comparing simple Euclidean difference change detection, Mahalanobis difference-based change detection, and the proposed Wiener filter-based change detection in both global and local implementations.
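A rough sketch of the Wiener prediction idea: fit a linear (Wiener) predictor of each test-image pixel from the co-located reference-image neighborhood by least squares, and flag pixels the predictor explains poorly. This is a single global predictor on toy data; the paper also reports local implementations, and the window size and noise model here are assumptions.

```python
# Sketch of Wiener prediction-based change detection: predict each test
# pixel from the co-located reference-image neighborhood with a linear
# predictor fit by least squares, then flag large residuals. One global
# predictor is used here for brevity; window size is an assumption.

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(3)
ref = rng.gamma(4.0, 25.0, size=(64, 64))
test = 0.9 * ref + rng.normal(0, 5.0, ref.shape)   # global gain + noise
test[30:34, 30:34] += 250.0                        # change (e.g., a mine)

w = 5  # prediction window
patches = sliding_window_view(ref, (w, w)).reshape(-1, w * w)
X = np.hstack([patches, np.ones((patches.shape[0], 1))])   # affine term
y = test[w // 2:-(w // 2), w // 2:-(w // 2)].ravel()

coef, *_ = np.linalg.lstsq(X, y, rcond=None)       # least-squares Wiener fit
residual = np.abs(y - X @ coef).reshape(64 - w + 1, 64 - w + 1)
print("max residual at:", np.unravel_index(residual.argmax(), residual.shape))
```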
In this paper, we present a kernel-based nonlinear version of canonical correlation analysis (CCA), called kernel canonical correlation analysis (KCCA), for hyperspectral anomaly detection applications. CCA only measures linear dependency between two sets of signal vectors (target and background), ignoring the higher order correlations crucial for distinguishing between man-made objects and background clutter. In order to exploit nonlinear correlations, we implicitly map the two sets of data into a high dimensional feature space where correlations of nonlinear features extracted from the original data are exploited by a kernel function. A generalized eigenproblem is then formulated for KCCA. In this paper, both CCA and KCCA are applied to real hyperspectral images, and the detection performance of CCA and KCCA is compared to the well-known RX anomaly detection algorithm.
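For reference, regularized KCCA reduces to a generalized eigenproblem on the centered Gram matrices, which the sketch below solves on toy data. The kernel choice, regularization value, and data are assumptions; the paper applies the formulation to target and background sets from real imagery.

```python
# Sketch of regularized KCCA as a generalized eigenproblem
# A v = lambda B v on centered Gram matrices. Kernel, regularization,
# and toy data are illustrative assumptions.

import numpy as np
from scipy.linalg import eigh

def rbf_gram(X, gamma=0.1):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def center(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

rng = np.random.default_rng(4)
n = 60
X = rng.normal(size=(n, 8))                       # set 1 (e.g., one region)
Y = np.tanh(X) + 0.1 * rng.normal(size=(n, 8))    # nonlinearly related set 2

Kx, Ky, reg = center(rbf_gram(X)), center(rbf_gram(Y)), 1e-3
Z = np.zeros((n, n))
A = np.block([[Z, Kx @ Ky], [Ky @ Kx, Z]])
B = np.block([[Kx @ Kx + reg * np.eye(n), Z], [Z, Ky @ Ky + reg * np.eye(n)]])

evals, _ = eigh(A, B)            # generalized symmetric eigenproblem
print("leading kernel canonical correlation ~", evals[-1])
```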
In this paper, we compare several detection algorithms that are based on spectral matched (subspace) filters. Nonlinear (kernel) versions of these spectral matched (subspace) detectors are also discussed and their performance is compared with the linear versions. These kernel-based detectors exploit the nonlinear correlations between the spectral bands that are ignored by the conventional detectors. Several well-known matched detectors, such as matched subspace detector, orthogonal subspace detector, spectral matched filter and adaptive subspace detector (adaptive cosine estimator) are extended to their corresponding kernel versions by using the idea of kernel-based learning theory. In kernel-based detection algorithms the data is implicitly mapped into a high dimensional kernel feature space by a nonlinear mapping which is associated with a kernel function. The detection algorithm is then derived in the feature space which is kernelized in terms of the kernel functions in order to avoid explicit computation in the high dimensional feature space. Experimental results based on simulated toy-examples and real hyperspectral imagery show that the kernel versions of these detectors outperform the conventional linear detectors.
In this paper we present a nonlinear version of the well-known anomaly detection method referred to as the RX-algorithm. Extending this algorithm to a feature space associated with the original input space via a certain nonlinear mapping function can provide a nonlinear version of the RX-algorithm. This nonlinear RX-algorithm, referred to as the kernel RX-algorithm, is basically intractable mainly due to the high dimensionality of the feature space produced by the non-linear mapping function. However, it is shown that the kernel RX-algorithm can easily be implemented by kernelizing it in terms of kernels which implicitly compute dot products in the nonlinear feature space. Improved performance of the kernel RX-algorithm over the conventional RX-algorithm is shown by testing hyperspectral imagery with military targets.
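A toy sketch of the kernelization: the RX quadratic form is evaluated using only kernel values, via the centered Gram matrix of background samples and centered empirical kernel maps of test pixels. The RBF kernel, bandwidth, pseudo-inverse regularization, and data below are illustrative assumptions.

```python
# Sketch of kernel RX on toy data: the RX quadratic form is evaluated in
# the RKHS using only kernel evaluations. Kernel, bandwidth, and the data
# are illustrative; centering follows the usual empirical kernel map recipe.

import numpy as np

def rbf(X, Y, gamma=0.05):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(5)
background = rng.normal(0.0, 1.0, size=(100, 10))   # local background spectra
pixels = np.vstack([rng.normal(0.0, 1.0, size=(3, 10)),
                    rng.normal(3.0, 1.0, size=(3, 10))])  # last 3 anomalous

n = background.shape[0]
K = rbf(background, background)
H = np.eye(n) - np.ones((n, n)) / n
K_c = H @ K @ H                                     # centered Gram matrix

k = rbf(background, pixels)                         # empirical kernel maps
k_c = H @ (k - K.mean(axis=1, keepdims=True))       # centered kernel vectors

Kinv = np.linalg.pinv(K_c, rcond=1e-6)              # regularized inverse
scores = np.einsum("ij,ij->j", Kinv @ k_c, Kinv @ k_c)  # ||K_c^+ k_c||^2
print(scores)   # anomalous pixels should score noticeably differently
```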
An adaptive target detection algorithm for forward-looking infrared (FLIR) imagery is proposed, which is based on measuring differences between structural information within a target and its surrounding background. At each pixel in the image a dual window is opened, where the inner window (inner image vector) represents a possible target signature and the outer window (consisting of a number of outer image vectors) represents the surrounding scene. These image vectors are then preprocessed by two directional highpass filters to obtain the corresponding image gradient vectors. The target detection problem is formulated as a statistical hypotheses testing problem by mapping these image gradient vectors into two linear transformations, P1 and P2, via principal component analysis (PCA) and eigenspace separation transform (EST), respectively. The first transformation P1 is only a function of the inner image gradient vector. The second transformation P2 is a function of both the inner and outer image gradient vectors. For the hypothesis H1 (target), the difference of the two functions is small. For the hypothesis H0 (clutter), the difference of the two functions is large. Results of testing the proposed target detection algorithm on two large FLIR image databases are presented.
In this paper, an adaptive target detection algorithm for FLIR imagery is proposed that is based on measuring differences between structural information within a target and its surrounding background. At each pixel in the image a dual window is opened, where the inner window (inner image vector) represents a possible target signature and the outer window (consisting of a number of outer image vectors) represents the surrounding scene. These image vectors are preprocessed by two directional highpass filters to obtain the corresponding image edge vectors. The target detection problem is formulated as a statistical hypotheses testing problem by mapping these image edge vectors into two transformations, P1 and P2, via Eigenspace Separation Transform (EST) and Principal Component Analysis (PCA). The first transformation P1 is a function of the inner image edge vector. The second transformation P2 is a function of both the inner and outer image edge vectors. For the hypothesis H1 (target), the difference of the two functions is small; for the hypothesis H0 (clutter), the difference of the two functions is large. Results of testing the proposed target detection algorithm on two large FLIR image databases are presented.
We propose adaptive anomaly detectors that find materials whose spectral characteristics are substantially different from those of the neighboring materials. The target spectral vectors are assumed to have different statistical characteristics from the background vectors. In order to detect anomalies, we use a dual rectangular window that separates the local area into two regions: the inner window region (IWR) and the outer window region (OWR). The statistical spectral differences between the IWR and OWR are exploited by generating subspace projection vectors onto which the IWR and OWR vectors are projected. Anomalies are detected if the projection separation between the IWR and OWR vectors is greater than a predefined threshold. Four different methods are used to produce the subspace projection vectors. The four proposed anomaly detectors have been applied to HYDICE (HYperspectral Digital Imagery Collection Experiment) images, and the detection performance of each method has been evaluated.
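A simplified sketch of the dual-window test appears below: local samples are split into IWR and OWR sets, a projection vector is derived from the local data, and the separation of the projected means is the detection statistic. Using the top principal direction as the projection vector is just one stand-in for the four methods studied; the window sizes and data are toy assumptions.

```python
# Simplified sketch of the dual-window idea: compare inner-window (IWR)
# and outer-window (OWR) spectra after projecting onto a data-driven
# direction. The top PCA direction used here is an illustrative stand-in
# for the four projection-vector methods studied in the paper.

import numpy as np

def dual_window_score(iwr, owr):
    """iwr: (n_in, bands); owr: (n_out, bands). Returns the projection
    separation between the two windows' mean spectra."""
    both = np.vstack([iwr, owr])
    both = both - both.mean(axis=0)
    _, _, vt = np.linalg.svd(both, full_matrices=False)
    p = vt[0]                                  # top principal direction
    return abs(iwr.mean(axis=0) @ p - owr.mean(axis=0) @ p)

rng = np.random.default_rng(6)
owr = rng.normal(0.0, 1.0, size=(48, 30))      # background ring
iwr_bg = rng.normal(0.0, 1.0, size=(9, 30))    # background center
iwr_tgt = rng.normal(1.5, 1.0, size=(9, 30))   # anomalous center
print(dual_window_score(iwr_bg, owr), dual_window_score(iwr_tgt, owr))
```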
Target detection techniques play an important role in automatic target recognition (ATR) systems because overall ATR performance depends closely on detection results. A number of detection techniques based on infrared (IR) images have been developed using a variety of pattern recognition approaches. However, target detection based on a single IR sensor is often hampered by adverse weather conditions or countermeasures, resulting in unacceptably high false alarm rates. Multiple imaging sensors in different spectral ranges, such as visible and infrared bands, are used here to reduce such adverse effects. The imaging data from the different sensors are jointly processed to exploit the spatial characteristics of the objects. Four local features are used to exploit the local characteristics of the images generated from each sensor. A confidence image is created via feature-based fusion that combines the features to obtain potential target locations. Experimental results using two test sequences are provided to demonstrate the viability of the proposed technique.
We present an adaptive unsupervised segmentation technique, in which spectral features are obtained and processed without a priori knowledge of the spectral characteristics. The proposed technique is based on an iterative method, in which segmentation at a given iteration depends closely on the segmentation results at the previous iteration. The hyperspectral images are first coarsely segmented and then the segmentation is successively refined via an iterative spectral dissimilarity measure. The algorithm also provides reduced computational complexity and improved segmentation performance. The algorithm consists of (1) an initial segmentation based on a fixed spectral dissimilarity measure and the k-means algorithm, and (2) subsequent adaptive segmentation based on an iterative spectral dissimilarity measure over a local region whose size is reduced progressively. The iterative use of a local spectral dissimilarity measure provided a set of values that can discriminate among different materials. The proposed unsupervised segmentation technique proved to be superior to other unsupervised algorithms, especially when a large number of different materials are mixed in complex hyperspectral scenes.
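As a sketch of the initial stage only, the snippet below coarsely segments a toy hyperspectral cube with k-means under a fixed Euclidean dissimilarity; the iterative, locally adaptive spectral dissimilarity refinement described above is not reproduced here. The cube size and k are illustrative.

```python
# Sketch of the initial segmentation stage: k-means on per-pixel spectra
# with a fixed (Euclidean) dissimilarity. Cube size and k are assumptions.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
cube = rng.normal(size=(32, 32, 50))                 # toy hyperspectral cube
cube[8:24, 8:24, :] += 2.0                           # a second "material"

pixels = cube.reshape(-1, cube.shape[-1])
labels = KMeans(n_clusters=2, n_init=10).fit_predict(pixels)
segmentation = labels.reshape(32, 32)
print(np.bincount(segmentation.ravel()))             # coarse segment sizes
```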
We present an efficient segmentation algorithm to discriminate between different materials, such as painted metal, vegetation, and soils, using hyperspectral imagery. Most previously attempted segmentation techniques have used a relatively small number of infrared frequency bands that use thermal emission instead of solar radiation. This motivated the use of hyperspectral (or multispectral) imagery for segmentation purposes taken at the visible and near infrared bands with high spectral dimensionality. We propose a segmentation algorithm that uses either a pattern-matching technique using the selected band regions or a principal component analysis method. Segmentation results are provided using several hyperspectral images. We also present a band-selection process based on either pairwise performance evaluation or a band-thickening method to select the particular band regions that contain important band-value information for segmentation. A hyperspectral data set that contains a number of spectral band-value curves collected from eleven hyperspectral images is used as an evaluation data set for the band-selection process.
KEYWORDS: Computer programming, Video, Video coding, Video compression, Quantization, Image compression, Neural networks, Signal to noise ratio, Digital imaging, Neurons
Compression of video for low-bitrate communication is studied in this paper. The use of vector quantization in the H.263 framework is proposed. A variable-rate residual vector quantizer with a transform vector quantizer in the first stage is used, along with a strategy to adapt the bit-rate to the activity in the block. This ability to adapt the bit-rate is very important for very low bitrate compression. The proposed multistage quantizer combined with an adaptive arithmetic codec produced very good results. The variability in the bit-rate was achieved by using smaller block sizes in the later stages of the quantizer, along with selective quantization of only high-energy blocks at the later stages. A performance comparison of the proposed codec with H.263 indicates superior compression results, especially at bitrates less than 8 kb/s.
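A minimal sketch of the multistage residual idea: each stage vector-quantizes the residual left by the previous stage, and later stages can use smaller codebooks. The k-means codebooks, block size, and stage sizes below are illustrative simplifications; the adaptive bit allocation and arithmetic coding described above are omitted.

```python
# Sketch of multistage residual vector quantization: each stage quantizes
# the residual left by the previous one. Codebook sizes, block size, and
# k-means training are illustrative simplifications.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(8)
blocks = rng.normal(size=(2000, 16))     # 4x4 image blocks as 16-d vectors

stages, residual = [], blocks.copy()
for codebook_size in (64, 32, 16):       # later stages can be cheaper
    km = KMeans(n_clusters=codebook_size, n_init=4).fit(residual)
    stages.append(km)
    residual = residual - km.cluster_centers_[km.labels_]

# Encode/decode one block: transmit one codebook index per stage.
x = blocks[:1]
recon, r = np.zeros_like(x), x.copy()
for km in stages:
    idx = km.predict(r)
    recon += km.cluster_centers_[idx]
    r = x - recon
print("reconstruction MSE for one block:", float(((x - recon) ** 2).mean()))
```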
Compression of SAR imagery for battlefield digitization is discussed in this paper. The images are first processed to separate out possible target areas. These target areas are compressed losslessly to avoid any degradation of the images. The background information, which is usually necessary to establish context, is compressed using a hybrid vector quantization algorithm. An adaptive variable-rate residual vector quantizer is used to compress the residual signal generated by a neural network predictor. The vector quantizer codebooks are optimized for entropy coding using an entropy-constrained algorithm to further improve the coding performance. This entropy-constrained vector quantizer combination performs extremely well, as suggested by the experimental results.
In this paper, a compression algorithm is developed to compress SAR imagery at very low bit rates. A new vector quantization (VQ) technique called the predictive residual vector quantizer (PRVQ) is presented for encoding the SAR imagery. A variable-rate VQ scheme called the entropy-constrained PRVQ (EC-PRVQ), which is designed by imposing a constraint on the output entropy of the PRVQ, is also presented. Experimental results are presented for both PRVQ and EC-PRVQ at high compression ratios, and the encoded images are compared with those of a wavelet-based coder.