This PDF file contains the front matter associated with SPIE
Proceedings Volume 6967, including the Title Page, Copyright
information, Table of Contents, Introduction (if any), and the
Conference Committee listing.
For over half a century, scientists and engineers have worked diligently to advance computational intelligence. One
application of interest is how computational intelligence can bring value to our war fighters. Automatic Target
Recognition (ATR) and sensor fusion efforts have fallen far short of the desired capabilities. In this article we review
the capabilities requested by war fighters. When these requests are compared with our current capabilities, it is easy to
conclude that Combat Identification (CID) as a Family of Systems (FoS) performs poorly. The war fighter needed capable,
operationalized ATR and sensor fusion systems ten years ago, but they have not materialized. The article reviews the war fighter
needs and the current state of the art. The article then concludes by looking forward to where we are headed to provide
the capabilities required.
Automatic target detection (ATD) systems process imagery to detect and locate targets in support of intelligence,
surveillance, reconnaissance, and strike missions. Accurate prediction of ATD performance would assist in system
design and trade studies, collection management, and mission planning. Specifically, a need exists for ATD performance
prediction based exclusively on information available from the imagery and its associated metadata. In response to this
need, we undertake a modeling effort that consists of two phases: a learning phase, where image measures are computed
for a set of test images, the ATD performance is measured, and a prediction model is developed; and a second phase to
test and validate performance prediction. The learning phase produces a mapping, valid across various ATD algorithms,
which is even applicable when no image truth is available (e.g., when evaluating denied area imagery). Ongoing efforts
to develop such a prediction model have met with some success. Previous results presented models to predict
performance for several ATD methods. This paper extends the work in several ways: extension to a new ATD method,
application of the modeling to a new image set, and an investigation of systematic changes in the image properties
(resolution, noise, contrast). The paper concludes with a discussion of future research.
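As a rough illustration of the learning-phase mapping, the sketch below fits an ordinary regression from per-image measures to measured detection performance; the measure names, data, and model choice are hypothetical placeholders, not the paper's actual model:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Hypothetical learning phase: rows are test images, columns are image
    # measures computed from the imagery/metadata (e.g., contrast, clutter, GSD).
    rng = np.random.default_rng(0)
    measures = rng.uniform(size=(200, 3))
    pd_measured = 0.9 * measures[:, 0] - 0.4 * measures[:, 1] + 0.1  # measured ATD Pd

    model = LinearRegression().fit(measures, pd_measured)

    # Prediction phase: estimate Pd for new imagery where no image truth exists.
    new_measures = rng.uniform(size=(5, 3))
    print(model.predict(new_measures))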
We consider automatic target recognition (ATR) in infrared (IR) imagery using the minimum noise and correlation
energy (MINACE) distortion-invariant filter (DIF). As in our prior work (SPIE 6566-03), we consider classification of
true-class CAD targets and rejection of real clutter and unseen confuser CAD objects with range and full 360° aspect
view variations. In this work, we address rejection of new UCIR bush clutter data. We also present performance scores
for several different training and test cases with attention to filter capacity, i.e., the number of training images that can be
included in one filter before performance on the test set deteriorates appreciably. We find that range distortions, rather
than aspect-view distortions, appear to affect filter capacity more. Initial target contrast ratio tests are also presented. To more
properly address clutter, in all tests we now form the magnitude of the output correlation plane before analysis. We also
address when and why linear versus circular correlations are best. We also address DIF filter-synthesis and fast
implementation for wide area "search" test regions. This introduces new issues concerning the region over which
correlation plane energy is minimized in filter synthesis and the size of the FFT to use in tests. A key issue is that both
training and tests should use the same procedures. This is vital for training and test metrics to be comparable. We
distinguish between whether linear or circular correlation plane energy is minimized.
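As a minimal sketch of the linear-versus-circular correlation issue raised above, assuming FFT-based correlation (sizes are illustrative):

    import numpy as np

    def circular_corr(image, kernel):
        # FFTs taken at the image size wrap around: circular correlation.
        F = np.fft.fft2(image)
        K = np.fft.fft2(kernel, s=image.shape)
        return np.fft.ifft2(F * np.conj(K))

    def linear_corr(image, kernel):
        # Zero-padding both inputs to >= M + N - 1 per axis avoids wrap-around,
        # giving linear correlation at the cost of a larger FFT.
        shape = tuple(m + n - 1 for m, n in zip(image.shape, kernel.shape))
        F = np.fft.fft2(image, s=shape)
        K = np.fft.fft2(kernel, s=shape)
        return np.fft.ifft2(F * np.conj(K))

    img = np.random.rand(64, 64)
    ker = np.random.rand(16, 16)
    # Form the magnitude of the correlation plane before analysis, as above.
    mag_circ = np.abs(circular_corr(img, ker))
    mag_lin = np.abs(linear_corr(img, ker))
    print(mag_circ.shape, mag_lin.shape)   # (64, 64) vs (79, 79)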
A generic nonlinear dynamic range compression deconvolver (DRCD) is proposed. We have performed the dynamic
range compression deconvolution using three forms of nonlinearities: (a) digital implementation- A-law/μ-law, (b)
hybrid digital-optical implementation- two-beam coupling photorefractive holography, and (c) all optical
implementation- MEMS deformable mirrors. The performance of image restoration improves as the saturation
nonlinearity increases. The DRCD could be used as a preprocessor for enhancing Automatic Target Recognition (ATR)
system performance. In imaging through the atmosphere, factors such as rain, snow, haze, and pollution degrade the
information received from a target; the captured images therefore need correction before being passed to an ATR system. The
DRCD outperforms well-established image restoration filters such as the inverse and Wiener filters.
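A minimal sketch of the digital saturation nonlinearity in variant (a), using the standard μ-law companding formula (the input image is a random stand-in for a degraded image):

    import numpy as np

    def mu_law_compress(x, mu=255.0):
        # Standard mu-law companding: compresses dynamic range; larger mu
        # gives a stronger saturation nonlinearity.
        x = np.clip(x, -1.0, 1.0)
        return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

    blurred = np.random.rand(128, 128) * 2 - 1   # stand-in for a degraded image
    compressed = mu_law_compress(blurred, mu=255.0)

In the DRCD, the compressed image would then feed the deconvolution stage.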
The last five years have seen a renewal of Automatic Target Recognition applications, mainly because of the latest
advances in machine learning techniques. In this context, large collections of image datasets are essential for training
algorithms as well as for their evaluation. Indeed, the recent proliferation of recognition algorithms, generally applied to
slightly different problems, makes comparing them through clean evaluation campaigns necessary.
The ROBIN project aims to fulfil these two needs by putting unclassified datasets, ground truths, competitions and
metrics for the evaluation of ATR algorithms at the disposal of the scientific community. The scope of this project
includes single- and multi-class generic target detection and generic target recognition, in military and security contexts.
To our knowledge, this is the first time that a database of this size (several hundred thousand visible and
infrared hand-annotated images) has been publicly released.
Funded by the French Ministry of Defence (DGA) and by the French Ministry of Research, ROBIN is one of the ten
Techno-vision projects. Techno-vision is a large and ambitious government initiative for building evaluation means for
computer vision technologies, for various application contexts. ROBIN's consortium includes major companies and
research centres involved in Computer Vision R&D in the field of defence: Bertin Technologies, CNES, ECA, DGA,
EADS, INRIA, ONERA, MBDA, SAGEM, THALES.
This paper, which first gives an overview of the whole project, is focused on one of ROBIN's key competitions, the
SAGEM Defence Security database. This dataset contains more than eight hundred ground and aerial infrared images of
six different vehicles in cluttered scenes including distracters. Two different sets of data are available for each target. The
first set includes different views of each vehicle at close range in a "simple" background, and can be used to train
algorithms. The second set contains many views of the same vehicle in different contexts and situations simulating
operational scenarios.
Over the past five years, the computer vision community has explored many different avenues of research for Automatic
Target Recognition. Noticeable advances have been made and we are now in the situation where large-scale evaluations
of ATR technologies have to be carried out, to determine what the limitations of the recently proposed methods are and
to determine the best directions for future work.
ROBIN, which is a project funded by the French Ministry of Defence and by the French Ministry of Research, has the
ambition of being a new reference for benchmarking ATR algorithms in operational contexts. This project, headed by
major companies and research centers involved in Computer Vision R&D in the field of Defense (Bertin Technologies,
CNES, ECA, DGA, EADS, INRIA, ONERA, MBDA, SAGEM, THALES) recently released a large dataset of several
thousand hand-annotated infrared and RGB images of different targets in different situations.
Setting up an evaluation campaign requires us to define, accurately and carefully, sets of data (both for training ATR
algorithms and for their evaluation), tasks to be evaluated, and finally protocols and metrics for the evaluation. ROBIN
offers interesting contributions to each one of these three points.
This paper first describes, justifies and defines the set of functions used in the ROBIN competitions and relevant for
evaluating ATR algorithms (Detection, Localization, Recognition and Identification). It also defines the metrics and the
protocol used for evaluating these functions. In the second part of the paper, the results obtained by several state-of-the-art
algorithms on the SAGEM DS database (a subpart of ROBIN) are presented and discussed.
Existing techniques for hyperspectral image (HSI) anomaly detection are computationally intensive, precluding real-time
implementation. The high dimensionality of the spatial/spectral hypercube with associated correlations between spectral
bands present significant impediments to real time full hypercube processing that accurately encapsulates the underlying
modeling. Traditional techniques have imposed Gaussian models, but these have suffered from significant
computational requirements to compute large inverse covariance matrices as well as modeling inaccuracies. We have
developed a novel data-driven, non-parametric HSI anomaly detector that has significantly reduced computational
complexity with enhanced HSI modeling, providing the capability for real time performance with detection rates that
match or surpass existing approaches. This detector, based on the Support Vector Data Description (SVDD), provides
accurate, automated modeling of multi-modal data, facilitating effective application of a global background estimation
technique which provides the capability for real time operation on a standard PC platform. We have demonstrated one
second processing time on hypercubes of dimension 256×256×145, along with superior detection performance relative to
alternate detectors. Computation performance analysis has been quantified via processing runtimes on a PC platform,
and detection/false-alarm performance is described via Receiver Operating Characteristic (ROC) curve analysis for the
SVDD anomaly detector vs. alternate anomaly detectors.
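For illustration, a one-class SVM with a Gaussian kernel, which is closely related to SVDD (the two are equivalent for this kernel), can model background pixels and flag anomalies; all sizes and parameters below are illustrative, not the paper's settings:

    import numpy as np
    from sklearn.svm import OneClassSVM

    # Hypercube flattened to (num_pixels, num_bands); background training
    # pixels are drawn globally, as in a global background estimation scheme.
    hypercube = np.random.rand(64, 64, 145)
    pixels = hypercube.reshape(-1, 145)
    background = pixels[np.random.choice(len(pixels), 500, replace=False)]

    # One-class SVM with an RBF kernel plays the role of the SVDD here.
    svdd = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(background)

    # Negative decision values indicate pixels outside the learned support:
    # candidate anomalies.
    scores = svdd.decision_function(pixels).reshape(64, 64)
    anomaly_mask = scores < 0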
Recently, the 1-D spectral fringe-adjusted joint transform correlation (SFJTC) technique has been proposed as an
effective means for performing deterministic target detection in hyperspectral imagery. In this work, we explore the use
of the discrete wavelet transform (DWT) as a pre-processing tool for SFJTC-based target detection. To quantify
improvement and compare performance in the detection process, receiver operating characteristic (ROC) curves are
generated and the areas under the ROC curves (AUROC) are computed. The basic premise of this work is that selected
coefficients generated from a desired level of the DWT decomposition of the data can be used in place of the original
data for improved SFJTC-based detection. We illustrate this by conducting experiments on two different hyperspectral
scenes containing varying amounts of simulated noise. Results indicate that use of the DWT coefficients significantly
improves the detection performance, especially in the presence of noise.
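A minimal sketch of DWT pre-processing with PyWavelets, replacing each pixel's spectral signature by approximation coefficients at a chosen level (wavelet, level, and data are illustrative assumptions):

    import numpy as np
    import pywt

    def dwt_features(spectrum, wavelet="db2", level=2):
        # Multi-level 1-D DWT of a pixel's spectral signature; the chosen
        # level's approximation coefficients act as a compact, denoised
        # stand-in for the original spectrum.
        coeffs = pywt.wavedec(spectrum, wavelet, level=level)
        return coeffs[0]

    cube = np.random.rand(32, 32, 128)           # toy hyperspectral scene
    feats = np.apply_along_axis(dwt_features, 2, cube)
    print(feats.shape)                            # fewer "bands" than the input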
A novel variational method using level sets that incorporate spectral angle distance in the model for automatic target
detection is presented. Algorithms are presented for detecting both spatial and pixel targets. The new method is tested in
tasks of unsupervised target detection in hyperspectral images with more than 100 bands, and the results are compared
with a widely used region-based level sets algorithm. Additionally, techniques of band subset selection are evaluated for
the reduction of data dimensionality. The proposed method is adapted for supervised target detection and its
performance is compared with traditional orthogonal subspace projection and constrained signal detector for the
detection of pixel targets. The method is evaluated under conditions of varying complexity, such as different noise levels and target sizes.
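The spectral angle distance incorporated into the model is standard; a minimal sketch:

    import numpy as np

    def spectral_angle(x, r):
        # Spectral angle between a pixel spectrum x and a reference r (radians);
        # invariant to illumination scaling of either spectrum.
        cos = np.dot(x, r) / (np.linalg.norm(x) * np.linalg.norm(r))
        return np.arccos(np.clip(cos, -1.0, 1.0))

    pixel = np.random.rand(100)       # one pixel from a >100-band image
    reference = np.random.rand(100)   # target signature
    print(spectral_angle(pixel, reference))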
Hyperspectral (HS) data is increasingly used in target detection applications since it provides both spatial and spectral
information about the scene. One of the main challenges with HS data is handling its large volume. Multispectral data,
on the other hand, provides the information with a reduced number of bands. As a result, target detection in
multispectral imagery is more challenging due to the lack of information about the objects. In this paper, we present a new
approach to detect land mines in multispectral images. We show that applying a matched filter (MF) directly to
multispectral data is not sufficient to detect the targets, but that first selecting features based on principal component
analysis (PCA) enables the MF to detect all of the targets. We also describe a segmentation technique, the sliding
concentric window (SCW), to extract the land mines from the clutter.
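A hedged sketch of the pipeline described, using the classic covariance-whitened spectral matched filter with an optional PCA step in front (all sizes and data are illustrative):

    import numpy as np
    from sklearn.decomposition import PCA

    def matched_filter(pixels, target):
        # Classic spectral matched filter: whiten by the background covariance,
        # then project onto the (whitened) target signature.
        mu = pixels.mean(axis=0)
        cov = np.cov(pixels - mu, rowvar=False)
        w = np.linalg.solve(cov, target - mu)
        w /= (target - mu) @ w                 # normalize so MF(target) == 1
        return (pixels - mu) @ w

    bands = 8                                   # multispectral: few bands
    pixels = np.random.rand(10000, bands)
    target = np.random.rand(bands)

    # Direct MF on the raw bands vs. MF after a PCA feature-selection step.
    scores_raw = matched_filter(pixels, target)
    pca = PCA(n_components=4).fit(pixels)
    scores_pca = matched_filter(pca.transform(pixels), pca.transform(target[None])[0])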
Target Detection and Classification using Active Sensors I
Three dimensional imaging is a powerful tool for object detection, identification, and classification. 3D imaging allows
removal of partial obscurations in front of the imaged object. Traditional 3D image sensing has been Laser Radar
(LADAR) based. Active imaging has benefits; however, its disadvantages include cost, detector-array complexity, power,
weight, and size. In this keynote address paper, we present an overview of 3D sensing approaches based on passive
sensing using commercially available detector technology. 3D passive sensing will provide many benefits, including
advantages at shorter ranges. For small, inexpensive UAVs, it is likely that 3D passive imaging will be preferable to
active 3D imaging.
Feature-aided target verification is a challenging field of research, with the potential to yield significant increases in the
confidence of re-established target tracks after kinematic confusion events. Using appropriate control algorithms
airborne multi-mode radars can acquire a library of HRR (High Range Resolution) profiles for targets as they are
tracked. When a kinematic confusion event occurs, such as a vehicle dropping below MDV (Minimum Detectable
Velocity) for some period of time, or two target tracks crossing, it is necessary to utilize feature-aided tracking methods
to correctly associate post-confusion tracks with pre-confusion tracks. Many current HRR profile target recognition
methods focus on statistical characteristics of either individual profiles or sets of profiles taken over limited viewing
angles. These methods have not proven to be very effective when the pre- and post- confusion libraries do not overlap in
azimuth angle.
To address this issue we propose a new approach to target recognition from HRR profiles. We present an algorithm that
generates 2-D imagery of targets from the pre- and post-confusion libraries. These images are subsequently used as the
input to a target recognition/classifier process. Center-aligned HRR profiles, while ideal for processing, are not
easily computed in fielded systems: they require the airborne platform's center of rotation to line up with the geometric
center of the moving target, which is impossible when multiple targets are being tracked. Our algorithm is therefore
designed to work with HRR profiles that are aligned to the leading edge, i.e., the first detection above a threshold,
commonly referred to as edge-aligned HRR profiles.
Our simulated results demonstrate the effectiveness of this method for classifying target vehicles based on simulations
using both overlapping and non-overlapping HRR profile sets. The algorithm was tested on several test cases using an
input set of 0.28 m resolution XPATCH-generated HRR profiles of 20 test vehicles (civilian and military) at various
elevation angles.
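A minimal sketch of leading-edge alignment as described above: shift each profile so its first above-threshold range bin lands at a common index (threshold and sizes are illustrative):

    import numpy as np

    def edge_align(profile, threshold, lead_bin=10):
        # Leading edge = first range bin whose magnitude exceeds the threshold;
        # roll the profile so that bin lands at a common index across profiles.
        above = np.nonzero(np.abs(profile) > threshold)[0]
        if above.size == 0:
            return profile                     # no detection: leave unchanged
        return np.roll(profile, lead_bin - above[0])

    profiles = np.abs(np.random.randn(50, 128)) * np.hanning(128)
    aligned = np.array([edge_align(p, threshold=1.5) for p in profiles])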
This paper primarily investigates the use of shape-based features by an Automatic Target Recognition (ATR) system to
classify various types of targets in Synthetic Aperture Radar (SAR) images. Specifically, shapes of target outlines are
represented via Elliptical Fourier Descriptors (EFDs), which, in turn, are utilized as recognition features. According to
the proposed ATR approach, a segmentation stage first isolates the target region from shadow and ground clutter via a
sequence of fast thresholding and morphological operations. Next, a number of EFDs are computed that can sufficiently
describe the salient characteristics of the target outline. Finally, a classification stage based on an ensemble of Support
Vector Machines identifies the target with the appropriate class label. In order to experimentally illustrate the merit of
the proposed approach, SAR intensity images from the well-known Moving and Stationary Target Acquisition and
Recognition (MSTAR) dataset were used as 10-class and 3-class recognition problems. Furthermore, comparisons were
drawn in terms of classification performance and computational complexity to other successful methods discussed in the
literature, such as template matching methods. The results show that only a small number of EFDs is
required to achieve recognition rates competitive with well-established approaches.
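As a simplified stand-in for EFDs, the sketch below computes plain complex Fourier descriptors of a closed outline; true EFDs extend this with per-harmonic ellipse coefficients:

    import numpy as np

    def fourier_descriptors(contour, n_harmonics=10):
        # contour: (N, 2) array of (x, y) boundary points, ordered and closed.
        z = contour[:, 0] + 1j * contour[:, 1]
        C = np.fft.fft(z)
        # Drop C[0] (translation) and scale by |C[1]| (size): the leading
        # harmonic magnitudes form a compact, rotation-insensitive signature.
        return np.abs(C[1:n_harmonics + 1]) / np.abs(C[1])

    theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
    outline = np.stack([2 * np.cos(theta), np.sin(theta)], axis=1)  # an ellipse
    print(fourier_descriptors(outline, n_harmonics=5))

Descriptor vectors of this kind would then be the inputs to the SVM ensemble.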
Template-based classification algorithms used with synthetic aperture radar (SAR) automatic target recognition (ATR)
degrade in performance when used with spatially mismatched imagery. The degradation, caused by a spatial mismatch
between the template and image, is analyzed to show acceptable tolerances for SAR systems. The mismatch between
the image and template is achieved by resampling the test imagery to different pixel spacings. A consistent SAR dataset
is used to examine pixel spacings between 0.1069 and 0.2539 meters with a nominal spacing of 0.2021 meters.
Performance degradation is observed as the pixel spacing is adjusted. Small amounts of variation in the pixel spacing
cause little change in performance and allow design engineers to set reliable tolerances. The results also show
that using templates and images collected from slightly different sensor platforms is a very real possibility, with the
ability to predict the resulting classification performance.
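A minimal sketch of the resampling used to create the spatial mismatch, assuming cubic interpolation (the paper's exact resampling method is not specified here):

    import numpy as np
    from scipy.ndimage import zoom

    def resample_to_spacing(image, spacing_in, spacing_out):
        # Resampling to a coarser spacing shrinks the image (factor < 1);
        # to a finer spacing enlarges it. Cubic interpolation throughout.
        factor = spacing_in / spacing_out
        return zoom(image, factor, order=3)

    chip = np.random.rand(128, 128)              # nominal 0.2021 m pixels
    mismatched = resample_to_spacing(chip, 0.2021, 0.2539)
    print(mismatched.shape)                       # fewer pixels per chip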
Target Detection and Classification using Active Sensors II
This paper presents a preliminary study of information-theoretic divergence between sets of LADAR image data. This
study has been motivated by the hypothesis that despite the huge dimensionality of raw image space, related images
actually lie on embedded manifolds within this set of all possible images and can be represented in much lower-dimensional
sub-spaces. If these low-dimensional representations can be found, information-theoretic properties of the
images can be exploited while circumventing many of the problems associated with the so-called "curse of
dimensionality." In this study, PCA techniques are used to find a low-dimensional sub-space representation of LADAR
image sets. A real LADAR image data set was collected using the AFSTAR sensor and a synthetic image data set was
created using the Irma LADAR image modeling program. One unique aspect of this study is the use of an entirely
synthetic data set to find a sub-space representation that is reasonably valid for both the synthetic data set and the real
data set. After the sub-space representation is found, an information-theoretic density divergence measure (Cauchy-
Schwarz divergence) is computed using Parzen window estimation methods to find the divergence between and among
the sets of synthetic and real target classes. These divergence measures can then be used to make target classification
decisions for sets of images. In practice, this technique could be used to make classification decisions on multiple images
collected from a moving sensor platform or from a geographically distributed set of cooperating sensor platforms
operating in a target region.
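A hedged sketch of the Cauchy-Schwarz divergence between two sample sets, using the closed-form Parzen estimator for Gaussian kernels (kernel width and data are illustrative):

    import numpy as np
    from scipy.spatial.distance import cdist

    def cs_divergence(X, Y, sigma=0.5):
        # Cauchy-Schwarz divergence between Parzen density estimates of X and
        # Y. For Gaussian kernels the cross/self information potentials reduce
        # to pairwise kernel sums with effective width sigma * sqrt(2).
        d = X.shape[1]
        s2 = 2.0 * sigma ** 2
        norm = (2.0 * np.pi * s2) ** (d / 2.0)
        def potential(A, B):
            return np.mean(np.exp(-cdist(A, B, "sqeuclidean") / (2.0 * s2))) / norm
        return -np.log(potential(X, Y) /
                       np.sqrt(potential(X, X) * potential(Y, Y)))

    X = np.random.randn(200, 5)                  # e.g., PCA sub-space coordinates
    Y = np.random.randn(200, 5) + 1.0            # a second (shifted) class
    print(cs_divergence(X, Y))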
This paper proposes a radar target identification system using down range profile radar signatures. The recognition is
performed using a multi-layer perceptron trained via particle swarm optimization (PSO). The recognition results are
compared with those obtained using back propagation training. The paper also uses PSO for modeling target signatures
and extracting target scattering centers, assuming that they can be modeled as an autoregressive moving-average (ARMA) process.
Real radar signatures of commercial aircraft are used to assess the performance of the techniques proposed. The results
focus on comparing PSO based techniques with others used for target modeling and recognition.
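A minimal sketch of training a small MLP with global-best PSO in place of back propagation; the network size, PSO constants, and data are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(1)
    n_in, n_hid, n_out = 16, 8, 3          # toy down-range-profile classifier
    dim = n_in * n_hid + n_hid + n_hid * n_out + n_out

    def unpack(w):
        i = 0
        W1 = w[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
        b1 = w[i:i + n_hid]; i += n_hid
        W2 = w[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
        return W1, b1, W2, w[i:i + n_out]

    def loss(w, X, y):
        W1, b1, W2, b2 = unpack(w)
        logits = np.tanh(X @ W1 + b1) @ W2 + b2
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

    X = rng.normal(size=(90, n_in)); y = rng.integers(0, n_out, 90)

    # Standard global-best PSO with inertia and cognitive/social pulls.
    n_particles, iters = 30, 200
    pos = rng.normal(scale=0.5, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy(); pbest_f = np.array([loss(p, X, y) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.uniform(size=(2, n_particles, 1))
        vel = 0.72 * vel + 1.49 * r1 * (pbest - pos) + 1.49 * r2 * (gbest - pos)
        pos += vel
        f = np.array([loss(p, X, y) for p in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    print("best training loss:", pbest_f.min())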
Underwater mine identification persists as a critical technology pursued aggressively by the Navy for fleet protection.
As such, new and improved techniques must continue to be developed in order to provide measurable increases in mine
identification performance and noticeable reductions in false alarm rates. In this paper we show how recent advances in
the Volume Correlation Filter (VCF) developed for ground based LIDAR systems can be adapted to identify targets in
underwater LIDAR imagery. Current automated target recognition (ATR) algorithms for underwater mine identification
employ spatial based three-dimensional (3D) shape fitting of models to LIDAR data to identify common mine shapes
consisting of the box, cylinder, hemisphere, truncated cone, wedge, and annulus. VCFs provide a promising alternative
to these spatial techniques by correlating 3D models against the 3D rendered LIDAR data.
Using matched filters to find targets in cluttered images is an old idea. Human operators can interactively find threshold
values to be applied to the correlation surface that will do a good job of binarizing it into signal/non-signal pixel regions.
Automating the thresholding process with nine measured image statistics is the goal of this paper. The nine values are
the mean, maximum, and standard deviation of three images: the input image presumed to have some signal, an NxN
matched filter kernel in the shape of the signal, and the correlation surface generated by convolving the input image with
the matched filter kernel. Several thousand input images with known target locations and reference images were run
through a correlator with kernels that resembled the targets. The nine numbers referred to above were calculated in
addition to a threshold found with a time-consuming brute-force algorithm. Multidimensional radial basis functions were
associated with each nine number set. The bump height corresponded to the threshold value. The bump location was
within a nine dimensional hypercube corresponding to the nine numbers scaled so that all the data fell within the interval
0 to 1 on each axis. The sigma (sharpness of the radial basis function) was calculated as a fraction of the squared
distance to the closest neighboring bump. A new threshold is calculated as a weighted sum of all the Gaussian bumps in
the vicinity of the input 9D vector. The paper will conclude with a table of results using this method compared to other
methods.
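A hedged sketch of the Gaussian-bump predictor described above; the normalization of the weighted sum is an assumption the abstract leaves open:

    import numpy as np
    from scipy.spatial.distance import cdist

    def fit_rbf(X, t, frac=0.5):
        # One Gaussian bump per training example; height = known-good
        # threshold, sigma^2 = a fraction of the squared distance to the
        # nearest neighboring bump, as described above.
        d2 = cdist(X, X, "sqeuclidean")
        np.fill_diagonal(d2, np.inf)
        sig2 = frac * d2.min(axis=1)
        return X, t, sig2

    def predict_rbf(model, x):
        X, t, sig2 = model
        w = np.exp(-((X - x) ** 2).sum(axis=1) / sig2)
        return (w * t).sum() / (w.sum() + 1e-12)   # normalized weighted sum

    stats9 = np.random.rand(1000, 9)      # scaled image/kernel/correlation stats
    thresholds = np.random.rand(1000)     # from the brute-force search
    model = fit_rbf(stats9, thresholds)
    print(predict_rbf(model, np.random.rand(9)))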
Many traditional methods produce classification results by processing one image frame at a time.
For instance, conventional correlation filters are designed to yield well defined correlation peaks
when a pattern or object of interest is present in the input image. However, the decision process
is memory-less, and does not take advantage of the history of results on previous frames in a
sequence. Recently, Kerekes and Kumar introduced a new Bayesian approach for multi-frame
correlation that first produces an estimate of the object's location based on previous results, and
then builds up the hypothesis using both the current data as well as the historical estimate. A
motion model is used as part of this estimation process to predict the probability of the object at a
particular location. Since the movement and behavior of objects can change with time, it may be
disadvantageous to use a fixed motion model. In this paper, we show that it is possible to let the
motion model vary over time, and adaptively update it based on data. Preliminary analysis shows
that the adaptive multi-frame approach has the potential for yielding significant performance
improvements over the conventional approach based on individual frames.
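A hedged 1-D sketch of the adaptive multi-frame idea: propagate the previous posterior through the current motion-model estimate, multiply by the correlation plane as a likelihood, and re-estimate the motion from the data (all details are illustrative, not the Kerekes-Kumar formulation):

    import numpy as np

    # Toy 1-D example: a target drifts 2 cells per frame; each "correlation
    # plane" has a peak at the target plus clutter responses.
    rng = np.random.default_rng(0)
    n, true_pos = 100, 20
    planes = []
    for _ in range(30):
        c = 0.3 * rng.random(n)
        c[true_pos] = 1.0
        planes.append(c)
        true_pos = (true_pos + 2) % n

    posterior = np.full(n, 1.0 / n)           # uninformative initial prior
    velocity, prev_peak = 0, None             # adaptively estimated motion model
    for corr in planes:
        # Predict: shift by the current velocity estimate, blur for uncertainty.
        prior = np.convolve(np.roll(posterior, velocity),
                            np.ones(5) / 5.0, mode="same")
        # Update: treat the non-negative correlation plane as a likelihood.
        posterior = prior * (corr + 1e-9)
        posterior /= posterior.sum()
        # Adapt: re-estimate velocity from successive MAP locations.
        peak = int(posterior.argmax())
        if prev_peak is not None:
            velocity = peak - prev_peak
        prev_peak = peak
    print("final MAP location:", posterior.argmax())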
The Rayleigh Quotient Quadratic Correlation Filter (RQQCF) has been used to achieve very good
performance for Automatic Target Detection/Recognition. The filter coefficients are obtained as the
solution that maximizes a class separation metric, thus resulting in optimal performance. Recently, a
transform domain approach was presented for ATR using the RQQCF called the Transform Domain
RQQCF (TDRQQCF). The TDRQQCF considerably reduced the computational complexity and storage
requirements, by compressing the target and clutter data used in designing the QCF. In addition, the
TDRQQCF approach was able to produce larger responses when the filter was correlated with target and
clutter images. This was achieved while maintaining the excellent recognition accuracy of the original
spatial domain RQQCF algorithm. The computation of the RQQCF and the TDRQQCF involves the inverse
of the term A1 = Rx + Ry where Rx and Ry are the sample autocorrelation matrices for targets and
clutter respectively. It can be conjectured that the TDRQQCF approach is equivalent to regularizing A1. A
common regularization approach involves performing the Eigenvalue Decomposition (EVD) of A1, setting
some small eigenvalues to zero, and then reconstructing A1, which is now expected to be better
conditioned. In this paper, this regularization approach is investigated, and compared to the TDRQQCF.
Mixed state or hybrid state space systems are useful tools for various problems in computer vision. These
systems model complicated system dynamics as a mixture of inherently simple sub-systems, with an additional
mechanism to switch between the sub-systems. This approach of modeling using simpler systems allows for
ease in learning the parameters of the system and in solving the inference problem. In this paper, we study
the use of such mixed state space systems for problems in recognition and behavior analysis in video sequences.
We begin with a dynamical system formulation for recognition of faces from a video. This system is used to
introduce the simultaneous tracking and recognition paradigm that allows for improved performance in both
tracking and recognition. We extend this framework to design a second system for verification of vehicles across
non-overlapping views using structural and textural fingerprints for characterizing the identity of the target.
Finally, we show the use of such modeling for tracking and behavior analysis of bees from video.
This paper presents a novel analytical method measuring the impact of reference model accuracy on tactical, model-based
automatic target acquisition (ATA) algorithm performance. Military tacticians are currently governed by various
standards regarding quality requirements for georeferenced imagery and geospatial data sources. In some new generation
systems, this imagery provides the basis for generating 3D reference models for input into model-based ATA systems.
This paper analyses the criticality of this absolute coordinate accuracy requirement by assessing ATA algorithm
performance using 3D reference models created from a variety of commercially available data sources, including aerial
and terrestrial photography, and airborne laser scanner data. ATA algorithm performance is analysed using a software
tool that uses a variety of open source techniques and image processing functions typically found in tactical, air-to-ground,
model-based ATA systems. Each of the 3D reference models, extracted in a number of different areas of interest,
was matched against a corresponding sequence of infrared video data. This provided a series of results, which were
analysed as a function of both reference model accuracy and object selection and representation. Initial results indicate
that the absolute accuracy of the reference models created for this research has a minor impact on ATA algorithm
performance when compared with the impact of the content of the reference models, taking into account the complexity
of the area of interest. This suggests that a wider array of data sources may be suitable for 3D reference model
construction than is currently accepted.
The paper considers the following problem: given a 3D model of a reference target and a sequence of images of a 3D
scene, identify the object in the scene most likely to be the reference target and determine its current pose. Finding the
best match in each frame independently of previous decisions is not optimal, since past information is ignored. Our
solution concept uses a novel Bayesian framework for multi target tracking and object recognition to define and
sequentially update the probability that the reference target is any one of the tracked objects. The approach is applied to
problems of automatic lock-on and missile guidance using a laser radar seeker. Field trials have resulted in high target hit
probabilities despite low resolution imagery and temporarily highly occluded targets.
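A minimal sketch of the sequential identity update: per-frame match likelihoods multiply the running probability that each tracked object is the reference target (the likelihood values here are made up):

    import numpy as np

    def update_identity(prob, match_likelihood):
        # prob[i] = current probability that tracked object i is the reference
        # target; multiply by this frame's model-match likelihood and normalize.
        prob = prob * match_likelihood
        return prob / prob.sum()

    n_objects = 4
    prob = np.full(n_objects, 1.0 / n_objects)   # uniform before any evidence
    for like in ([0.6, 0.2, 0.1, 0.1], [0.7, 0.1, 0.1, 0.1]):
        prob = update_identity(prob, np.array(like))
    print(prob)                                   # mass concentrates on object 0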
Targeting and precision-guided munitions rely on precise image and video registration. Current approaches for
geo-registration typically utilize registration algorithms that operate in two-dimensional (2D) transform spaces,
in the absence of an underlying three-dimensional (3D) surface model. However, because of their two-dimensional
motion assumptions, these algorithms place limitations on the types of imagery and collection geometries that
can be used. Incorporating a 3D reference surface enables the use of 2D-to-3D registration algorithms and
removes many such limitations. The author has previously demonstrated a fast 2D-to-3D registration algorithm
for registration of live video to surface data extracted from medical images. The algorithm uses an illumination-tolerant
gradient-descent based optimization to register a 2D image to 3D surface data in order to globally locate
the camera's origin with respect to the 3D model. The rapid convergence of the algorithm is achieved through
a reformulation of the optimization problem that allows many data elements to be re-used through multiple
iterations. This paper details the extension of this algorithm to the more difficult problem of registering aerial
imagery to terrain data.
Automatic detection of surface objects, like vessels, in a maritime environment from images is
an important issue in naval surveillance. Two different approaches - gradient filter and background
estimation - are presented in this paper and the test results on real data, both infrared as well as visible light
images, are discussed. In the gradient approach, a gradient filter scans the sea-part of the image horizontally
and vertically resulting in peaks at locations where the gradient exceeds a predefined local threshold. In the
second approach, the background estimation, a polynomial model of the background is fitted locally to the sea-part
of the image. Using these polynomial background-estimators in the actual sea-analysis, objects are
detected. In this paper the advantages and disadvantages of both approaches are discussed.
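A hedged sketch of the background-estimation approach: fit a low-order 2-D polynomial to the sea part by least squares and threshold the residuals (order and threshold are illustrative):

    import numpy as np

    def fit_background(patch, order=2):
        # Least-squares fit of a 2-D polynomial (all terms x^i y^j, i+j<=order)
        # to the patch; returns the fitted background surface.
        h, w = patch.shape
        y, x = np.mgrid[0:h, 0:w]
        x = x.ravel() / w; y = y.ravel() / h
        terms = [x ** i * y ** j for i in range(order + 1)
                                  for j in range(order + 1 - i)]
        A = np.stack(terms, axis=1)
        coef, *_ = np.linalg.lstsq(A, patch.ravel(), rcond=None)
        return (A @ coef).reshape(h, w)

    sea = np.random.rand(64, 64) * 0.1 + np.linspace(0, 1, 64)  # gradient + noise
    background = fit_background(sea)
    residual = sea - background
    detections = residual > 3 * residual.std()   # objects stick out of the fit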
Edge features are often used in computer vision for image exploitation algorithms. A method to extract edge
features that is robust to contrast change, translation, rotation, noise and scale change is presented. This method
consists of the following steps: decompose the image into its level set shapes, smooth the shapes, locate sections
of the shape borders that have nearly constant curvature, and locate a key point based on these curve sections.
The level sets are found using the Fast Level Set Transform (FLST). An affine invariant smoothing technique was
then applied to the level set shape borders to reduce pixel effects and noise, and an intrinsic scale was estimated
from the level set borders. The final step was key point location and scale estimation using the Helmholtz
principle. These key points were found to be more resilient to large scale changes than the SIFT key points.
In this paper we introduce and test a new similarity measure for use in a template matching process for target detection
and recognition. The measure has recently been developed for multi-modal registration of medical images and is known
as phase mutual information (PMI). The key advantage of PMI is that it is invariant to lighting conditions, the ratio
between foreground and background intensity and the level of background clutter, which is critical for target detection
and recognition from the surveillance images acquired from various sensors. Several experiments were conducted using
real and synthetic datasets to evaluate the performance of PMI when compared with a number of commonly used
similarity measures including mean squared difference, gradient error and intensity mutual information. Our results show
that PMI consistently provided the most accurate detection and recognition performance.
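For reference, a minimal histogram-based mutual information between two images; PMI applies the same measure to phase rather than intensity images, and the phase-extraction step is omitted here:

    import numpy as np

    def mutual_information(a, b, bins=32):
        # Joint histogram -> joint/marginal probabilities -> MI in nats.
        joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        pxy = joint / joint.sum()
        px = pxy.sum(axis=1, keepdims=True)
        py = pxy.sum(axis=0, keepdims=True)
        nz = pxy > 0
        return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

    template = np.random.rand(32, 32)
    window = template + 0.1 * np.random.rand(32, 32)
    print(mutual_information(template, window))  # high for well-matched windows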
Humans are better at detecting targets in literal imagery than any known algorithm. Recent advances in modeling visual
processes have resulted from fMRI brain imaging with humans and the use of more invasive techniques with monkeys.
There are four startling new discoveries.
1) The visual cortex does not simply process an incoming image; it constructs a physics-based model of the image.
2) Coarse category classification and range-to-target are estimated quickly, possibly through the dorsal pathway of
the visual cortex, combining rapid coarse processing of image data with expectations and goals. This data is then
fed back to lower levels to resize the target and enhance the recognition process feeding forward through the ventral
pathway.
3) Giant photosensitive retinal ganglion cells provide data for maintaining circadian rhythm (time-of-day) and modeling
the physics of the light source.
4) Five filter types implemented by the neurons of the primary visual cortex have been determined.
A computer model for automatic target detection has been developed based upon these recent discoveries. It uses an
artificial neural network architecture with multiple feed-forward and feedback paths. Our implementation's efficiency
derives from the observation that any 2-D filter kernel can be approximated by a sum of 2-D box functions, and a 2-D
box function easily decomposes into two 1-D box functions. Further efficiency is obtained by decomposing the largest
neural filter into a high-pass filter and a more sparsely sampled low-pass filter.
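A minimal sketch of the efficiency argument: a 2-D box sum factors into two 1-D box sums, each O(1) per pixel via cumulative sums, and a few boxes can approximate a smooth kernel (the difference-of-boxes kernel below is illustrative):

    import numpy as np

    def box_filter_1d(a, radius, axis):
        # 1-D box sum along one axis via prefix sums: O(1) work per sample.
        c = np.cumsum(a, axis=axis)
        zero = np.zeros_like(np.take(c, [0], axis=axis))
        c = np.concatenate([zero, c], axis=axis)
        n = a.shape[axis]
        hi = np.clip(np.arange(n) + radius + 1, 0, n)
        lo = np.clip(np.arange(n) - radius, 0, n)
        return np.take(c, hi, axis=axis) - np.take(c, lo, axis=axis)

    def box_filter_2d(img, radius):
        # A 2-D box decomposes into a vertical then a horizontal 1-D box.
        return box_filter_1d(box_filter_1d(img, radius, 0), radius, 1)

    img = np.random.rand(256, 256)
    # A crude center-surround kernel as a difference of two boxes.
    response = box_filter_2d(img, 2) / 25.0 - box_filter_2d(img, 6) / 169.0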
The analysis of complex infrastructure from aerial imagery, for instance a detailed analysis of an airfield, requires the
interpreter, besides being familiar with the sensor's imaging characteristics, to have a detailed understanding of the
infrastructure domain. The required domain knowledge includes knowledge about the processes and functions involved
in the operation of the infrastructure, the potential objects used to provide those functions and their spatial and functional
interrelations. Since it is not possible yet to provide reliable automatic object recognition (AOR) for the analysis of such
complex scenes, we developed systems to support a human interpreter with either interactive approaches, able to assist
the interpreter with previously acquired expert knowledge about the domain in question, or AOR methods, capable of
detecting, recognizing or analyzing certain classes of objects for certain sensors. We believe, to achieve an optimal result
at the end of an interpretation process in terms of efficiency and effectiveness, it is essential to integrate both interactive and
automatic approaches to image interpretation. In this paper we present an approach inspired by the advancing semantic
web technology to represent domain knowledge, the capabilities of available AOR modules and the image parameters in
an explicit way. This enables us to seamlessly extend an interactive image interpretation environment with AOR
modules in a way that we can automatically select suitable AOR methods for the current subtask, focus them on an
appropriate area of interest and reintegrate their results into the environment.
Three-dimensional (3D) Laser Detection and Ranging (LADAR) range data is being investigated for automatic target
recognition applications. The spin-image provides a useful data representation for 3D point cloud data. In the spirit of
recent work that shows ℓ1-sparseness to be a useful data compression metric, we propose to use Nonnegative Matrix
Factorization (NMF) to help find features that capture the salient information resident in the spin-image representation.
NMF is a technique for decomposing nonnegative multivariate data into its 'parts', resulting in a compressed and usually
sparse representation. As a surrogate for measured 3D LADAR data, we generate 3D point clouds from computer-aided-design
models of two land targets, and we generate spin-images at multiple support scales. We select the support scale
that provides the highest separability between the spin-image stacks from the two land targets. We then apply NMF to
the spin-images at this support scale, and seek elements corresponding to meaningful parts of the land vehicles (e.g., a
tank turret or truck wheels), that in a joint sense should provide significant discriminative capability. We measure the
separability in the sparse NMF subspace. For measuring separability, we use the Henze-Penrose measure of multivariate
distributional divergence.
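A minimal sketch of the parts decomposition using scikit-learn's NMF on flattened spin-images (sizes and parameters are illustrative, and the data below is random rather than real spin-images):

    import numpy as np
    from sklearn.decomposition import NMF

    # Each row is a flattened (nonnegative) spin-image at the chosen support.
    spin_images = np.random.rand(400, 16 * 16)

    nmf = NMF(n_components=12, init="nndsvd", max_iter=500)
    activations = nmf.fit_transform(spin_images)  # per-spin-image part weights
    parts = nmf.components_                        # basis "parts", 12 x 256

    # Separability between two targets would then be measured on the rows of
    # activations (e.g., Henze-Penrose divergence) rather than raw spin-images.
    print(activations.shape, parts.shape)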
In the intelligence community, aerial video has become one of the fastest growing data sources, and it has been
extensively used in intelligence, surveillance, reconnaissance, tactical and security applications. This paper
presents a tracking approach to detect moving vehicles and people in such videos taken from an aerial platform.
In our approach, we combine a layer segmentation approach with background stabilization and post-tracking
refinement to reliably detect small moving objects at a relatively low processing cost. For each individual
moving object, a corresponding layer is created to maintain an independent appearance and motion model
during the tracking process. After the online tracking process, we apply a post-tracking refinement step to
link track fragments into a single consistent track ID, further reducing false alarms and increasing the detection
rate. Furthermore, a vehicle-and-person classifier is integrated into the approach to identify the moving object
categories. The classifier is based on the histogram of oriented gradients (HOG), which is more robust to
illumination variation and automatic camera gain changes. Finally, we report the results of our algorithms on a
large-scale EO and IR data set collected from the VIVID program; the results show that our approach achieves
good, stable tracking performance on this data set of more than eight hours.
Automatic target detection (ATD) is a very challenging problem for the Army in ground-to-ground scenarios using
infrared (IR) sensors. I propose an ATD algorithm based on vector quantization (VQ). VQ is typically used for image
compression where a codebook is created using the Linde Buzo Gray (LBG) algorithm from an example image. The
codebook will be trained on clutter images containing no targets thus creating a clutter codebook. The idea is to encode
and decode new images using the clutter codebook and calculate the VQ error. The error due solely to the compression
will be approximately consistent across the image. In the areas that contain new objects in the scene (objects the
codebook has not been trained on) we should see the consistent compression error plus an increased "non-training error"
due to the fact that pixel blocks representing the new object are not included in the codebook. After the decoding
process, areas in the image with large overall error will correlate to pixel blocks not in the codebook. The Kolmogorov-Smirnov distance is used to classify new objects from a reference clutter error distribution. The VQ algorithm trains on
clutter so it will never have a problem with new targets like many "trained algorithms". The algorithm is run over a data
set of images and the results show that the VQ detection algorithm performs as well as the Army benchmark algorithm.
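A hedged sketch of the scheme, substituting k-means (a close relative of LBG) for codebook training and using the two-sample Kolmogorov-Smirnov test on block reconstruction errors (block size, codebook size, and data are illustrative):

    import numpy as np
    from scipy.stats import ks_2samp
    from sklearn.cluster import KMeans

    def blocks(img, b=8):
        # Tile the image into non-overlapping b x b blocks, one per row.
        h, w = img.shape
        return (img[:h - h % b, :w - w % b]
                .reshape(h // b, b, w // b, b)
                .swapaxes(1, 2).reshape(-1, b * b))

    clutter = np.random.rand(256, 256)       # clutter-only training image
    test_img = np.random.rand(256, 256)      # new image, possibly with targets

    # k-means plays the role of the LBG algorithm in building the codebook.
    codebook = KMeans(n_clusters=64, n_init=4).fit(blocks(clutter))

    def vq_error(img):
        B = blocks(img)
        idx = codebook.predict(B)
        return np.linalg.norm(B - codebook.cluster_centers_[idx], axis=1)

    ref_err, new_err = vq_error(clutter), vq_error(test_img)
    stat, p = ks_2samp(ref_err, new_err)     # large stat => non-clutter content
    print(stat, p)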
In a FLIR image sequence, a target may disappear permanently or may reappear after some frames, and
crucial information related to the target, such as direction, position and size, is lost. If the target reappears in a
later frame, it may not be tracked again because the 3D orientation, size and location of the target might have
changed. We have proposed two methods: FKT-DCCF and FKT-PDCCF. We have implemented several tests
using DCCF and PDCCF when a target reappears in the field of view. From those test results, we
compared the DCCF and PDCCF methods by the percentage of correctly identified targets. Test
results using both mid-wave and long-wave FLIR sequences (M1415, L1805, L1702, L1920, L19NS, and
L1915) are incorporated to verify the effectiveness of the proposed algorithms.
Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN.
Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks.
Following wavelet compression, features are extracted. The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at participating nodes. Therefore, the feature-extraction method based on the Haar DWT is presented that employs a maximum-entropy measure to determine significant wavelet coefficients. Features are formed by calculating the energy of coefficients grouped around the competing clusters. A DWT-based feature extraction algorithm used for vehicle classification in WSNs can be enhanced by an added rule for selecting the optimal number of resolution levels to improve the correct classification rate and reduce energy consumption expended in local algorithm computations.
Published field trial data for vehicular ground targets, measured with multiple sensor types, are used to evaluate the wavelet-assisted algorithms. Extracted features are used in established target recognition routines, e.g., the Bayesian minimum-error-rate classifier, to compare the effects on the classification performance of the wavelet compression. Simulations of feature sets and recognition routines at different resolution levels in target scenarios indicate the impact on classification rates, while formulas are provided to estimate reduction in resource use due to distributed compression.
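A minimal sketch of Haar-DWT energy features with a simple entropy measure of coefficient significance (the exact maximum-entropy rule and coefficient grouping used in the paper are not reproduced here):

    import numpy as np
    import pywt

    def haar_energy_features(signal, max_level=5):
        # Multi-level Haar DWT; each feature is the energy of one band's
        # coefficients (approximation first, then details, coarse to fine).
        coeffs = pywt.wavedec(signal, "haar", level=max_level)
        return np.array([np.sum(c ** 2) for c in coeffs])

    def coeff_entropy(signal, level):
        # Shannon entropy of normalized coefficient energies in the requested
        # level's detail band; a stand-in for a maximum-entropy significance
        # measure when choosing how many resolution levels to keep.
        d = pywt.wavedec(signal, "haar", level=level)[1]   # cD at this level
        p = d ** 2 / (np.sum(d ** 2) + 1e-12)
        return -np.sum(p * np.log(p + 1e-12))

    x = np.random.randn(1024)              # e.g., one acoustic/seismic window
    print(haar_energy_features(x))
    print([coeff_entropy(x, L) for L in (1, 2, 3)])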
We propose several fusion techniques in the design of a hybrid composite classification system. Our composite
classifier taps into the strengths of two separate classification paradigms and examines various fusion methods
for combining the two. The first classifier uses non-numeric features, similar to those found in syntactic pattern
recognition, by exploiting the overall structure of the patterns themselves. The second method uses a more
classical feature vector method that bins the patterns and uses the maximum values within each bin in developing
the feature vector for each pattern. By using these two separate approaches, we explore conditions that allow
the two techniques to be complementary in nature, thus improving, when fused, the overall performance of the
classification system. We examine four separate fusion techniques, the Basic Ensemble Method, the Probabilistic
Neural Network, the Borda Count and the Bayesian Belief Network, using a ten-class problem in our experiments.
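A minimal sketch of one of the four rules, the Borda count: each classifier ranks the classes and the fused decision maximizes the summed ranks:

    import numpy as np

    def borda_fuse(score_lists):
        # Each row is one classifier's scores over the classes. Convert scores
        # to Borda points: worst class gets 0, best gets n_classes - 1.
        scores = np.asarray(score_lists)
        points = scores.argsort(axis=1).argsort(axis=1)
        return points.sum(axis=0).argmax()

    clf_a = [0.1, 0.7, 0.2, 0.0]            # posteriors from classifier A
    clf_b = [0.3, 0.3, 0.4, 0.0]            # posteriors from classifier B
    print(borda_fuse([clf_a, clf_b]))       # fused class index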
Small Unmanned Aerial Vehicles (UAVs) are increasingly being used in-theater to provide low-cost, low-profile aerial
reconnaissance and surveillance capabilities. However, inherent platform limitations on size, weight, and power restrict
the ability to provide sensors and communications which can present high-resolution imagery to the end-user. This paper
discusses methods to alleviate this restriction by performing on-board pre-processing of high resolution images and
downlinking the post-processed imagery. This has the added benefit of reducing the workload for a warfighter who is
already heavily taxed by other duties.
During a previous technology programme, a simple landscape and complex target geometries were modelled and
demonstrated in a COTS infrared (IR) simulation tool. A preliminary assessment of training-based ATR on real and
synthetic imagery was performed, which was presented at SPIE D&S in 2005.
The current technology programme has assessed model-based ATR on real and synthetic IR imagery for a 5-class case.
Real IR imagery was recorded during a flight campaign. A complex landscape and complex targets were modelled and
simulated in a wide variety of conditions in the IR simulation tool.
A survey was conducted regarding the current state-of-the-art of model-based ATR approaches. Another survey
concerning contour extraction methods for ATR was performed. The best ATR algorithms and contour extraction
methods were selected from the survey results. These algorithms were implemented for a multi-class ATR case and
adapted to work on the characteristics of IR imagery. The algorithms were benchmarked and compared on the simulated
and recorded IR imagery using classical measures. A process for performance assessment of multi-class ATR methods
was defined according to an ATR benchmarking concept developed by the German Fraunhofer Research Institute. The
assessment was then conducted on the algorithms using a multi-class evaluation approach.