1. Introduction

This paper presents a tutorial on the performance metrics, status, and prognosis of aided/automatic target recognition (Ai/ATR) for those who are not close to the military application of the technology but who may be able to contribute to its ultimate successful development. Ai/ATR is a generic term for automated processing functions carried out on imaging sensor data in order to perform operations ranging from simple cuing of a human observer to complex, fully autonomous object acquisition and identification. ATR is fully autonomous, as in, for example, the terminal acquisition phase of a missile seeker. Aided target recognition (AiTR) processing, in contrast, presents image annotations to a human observer, who makes the final decision as to the importance and veracity of the information generated and the action to be taken. In this paper, the imaging sensors that generate the data for the Ai/ATR processor are platform centric, including visible and electro-optic/infrared (EO/IR), 3-D LADAR, and imaging radar [e.g., synthetic aperture radar (SAR)]. EO/IR includes multi- and hyperspectral imaging. Signal processing of data from nonimaging sensors, such as acoustic, seismic, and magnetic sensors, is not considered, although these sensor outputs can be used as cues in a multisensor configuration for Ai/ATR.

2. Military Importance

Ai/ATR is an extremely important technology for military operations that has not yet realized its full tactical promise. A fully reliable Ai/ATR can enhance the lethality and survivability of the war fighter and platform. An Ai/ATR operates on sensor data in order to process information for decision making. The primary value an Ai/ATR adds to a weapons system is engagement timeline reduction for target acquisition. The rapid acquisition and servicing of targets increase the lethality and survivability of the weapons platform and soldier. Whether the tactical scenario is the onslaught of an array of combat vehicles coming through the Fulda Gap, as feared during the Cold War, or the identification of humans with intent to kill in an urban scene, identifying the threat for avoidance or engagement is paramount to survival and threat neutralization.

There are many military scenarios where a reliable Ai/ATR capability would provide an enormous capability to the soldier. Rapid wide-area search to provide alerts over large fields of regard is the classical example that has always been envisioned. Ai/ATR can also overcome unmanned air-/ground-vehicle bandwidth limitations by selecting only target information for transmission to a weapons platform. A reliable onboard Ai/ATR would select and send only target information back to the unmanned air vehicle (UAV) operator, avoiding the enormous data bandwidth needed to transmit the complete scene over the flight path, from which the operator must extract the target. Munitions precision targeting and lock-on-after-launch seekers are other examples of fully autonomous ATR.

Persistent surveillance (PS) presents a first military application opportunity for less technically sophisticated Ai/ATR in that change and anomaly detection are of primary significance: things that change in a scene, or are different, are of primary importance in PS. Temporal techniques, such as change-detection algorithms and moving-target indication (MTI), become first-step candidate approaches.
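As an illustration of how simple a first-step temporal technique can be, the following is a minimal sketch of frame-differencing change detection between two frames of the same scene. The threshold multiplier k and the assumption that the frames are already coregistered are illustrative choices, not a fielded design.

```python
import numpy as np

def detect_changes(reference: np.ndarray, current: np.ndarray, k: float = 3.0) -> np.ndarray:
    """Flag pixels whose frame-to-frame change exceeds k standard deviations.

    Assumes both frames are coregistered, same-size grayscale images.
    Returns a boolean change mask.
    """
    diff = current.astype(np.float64) - reference.astype(np.float64)
    # A global threshold from the difference statistics; a fielded system
    # would use local statistics and registration refinement instead.
    threshold = k * diff.std()
    return np.abs(diff - diff.mean()) > threshold
```

In practice, the hard part is not the differencing but the registration and the suppression of changes caused by illumination, weather, and sensor motion.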
Change detection can be a major tool in improvised explosive device (IED) detection. Disturbed earth, where a device has been buried, presents a significantly different signature than undisturbed earth: it shows a much more uniform, blackbody-like spectral signature compared to the much more structured signature of undisturbed soil.1, 2 Extremely large coverage areas, such as those required in PS or for airborne detection of IEDs along a roadway, with sufficient resolution and update rate, become driving sensor parameters. The need for ground-to-ground Ai/ATR in urban environments is amplified by the huge fields of regard (∼2π steradians), the shortness of timelines, and the need to discriminate combatant from noncombatant. Ai/ATR difficulty is extremely task dependent, and a canonical data set is always a concern for training and evaluation in a military scenario.

All three services are engaged in research and development of reliable Ai/ATR capabilities for myriad combat missions. The Army, Navy, and Air Force are pursuing Ai/ATR with sensor packages for their respective platforms to perform reconnaissance, intelligence, surveillance, target acquisition, fire control, wide-area search and track, countermine, and sensor fusion. Change detection and MTI that relate to target disposition are also of interest. Army sensor assets typically emphasize EO/IR because of sensor size, weight, and power constraints on the platform, whereas the Navy and Air Force tend to emphasize high-range-resolution and SAR radars due to the long stand-off ranges associated with ship and aircraft engagements. This paper focuses on the extremely difficult ground-to-ground missions associated with Army or Marine combat. More extensive discussion of sea and air Ai/ATR missions can be found in the unclassified open literature at the Defense Technical Information Center (DTIC).3, 4, 5, 6, 7, 8

There is a whole hierarchy of possible tasks that can be of interest for an Ai/ATR algorithm. The level of discrimination can cover a whole gamut, from detection to classification to recognition to identification. Definitions of these military tasks for EO/IR and rf/SAR can be found in the literature.9, 10, 11, 12, 13 There can be other tactical tasks that do not fall neatly into this hierarchy. For example, target tracking, aim point selection, and target prioritization are target-engagement-relevant tasks that can be candidates for automation in the target-acquisition and fire-control processes.

When the Soviet Union was the premier potential adversary of the United States and nuclear war was not considered an option, the most important land warfare conflict envisioned was tank-on-tank battles. In this scenario, the classic ATR task was detection and recognition with sufficient detail to engage the target with a weapon. Today, one of the most difficult tasks of interest is identification of intent. Whereas, in the past, detection of a human may have been sufficient, today the soldier must also determine the intent of the human detected. Is the intent of the detected human hostile? In PS for situational awareness, changes are the most important information in order to alert and bring other sensor assets to bear. Have militarily significant assets moved, or have new ones appeared? Although the technical sophistication of Ai/ATR has not progressed rapidly, the sophistication of the performance required from automated sensing has increased significantly. A very simplified diagram of a generic Ai/ATR algorithm is shown in Fig. 1.
The image from the sensor is fed into the front end of the processor, where preprocessing conditioning is performed. These can be standard image-processing techniques to reduce or remove noise, perform image orientation, etc. Features are extracted so that candidate regions of interest (ROIs) are segmented, anomalies identified, and detections declared. Higher level features are found, for example, by comparing segmented regions to templates or stored models of targets. At this point, higher level discriminations may be declared. As mentioned earlier, there exists a whole hierarchy of potential target discriminations. For ground combat, examples of these two-class discriminations are as follows: classification (tracked versus wheeled), recognition (truck versus tank), and identification (M1 tank versus T72 tank). Similar discriminations exist for air and naval warfare. In recent years, higher level discrimination may include "fingerprinting," when a specific entity identity is required, such as determining that "that" vehicle was the one that planted the IED.

There is an enormous array of algorithms that have been proposed, implemented in hardware, and tested within many Department of Defense (DOD) services and agencies. A selection of algorithm classes includes statistical, shape based (template/model), MTI, increased dimensionality (e.g., 3-D LADAR),14, 15, 16 hyper-/multispectral (MS/HS),17, 18, 19 and neural nets. Multisensor phenomenologies have been tried, including multisensor, where more than one sensor looks at the same target; multilook, where one sensor gets several looks at the target from different aspects; and multimode fusion, where sensors of different modalities sense the target (e.g., acoustic and EO signals are fused). Many variations of algorithms have been proposed and attempted in hardware and software, and a survey list of algorithms can be found in the literature.20

In order to illustrate the ground-to-ground Ai/ATR difficulty, Fig. 2 shows a representative set of IR sensor scenes with the same targets in each scene, in a variety of backgrounds. The targets are a sedan, a pickup truck, a van, and SUVs. In order to give the reader a realistic feel for the task difficulty, no annotations are given to show the targets in the scenes. The same scenes, with the targets indicated by superimposed boxes that display the ATR annotations, will be shown later in Fig. 6. These figures are also shown in order to demonstrate the difficulty of the Ai/ATR task with midwave IR thermal imagers, which, after the human eyeball and night-vision goggles, are the most prolific battlefield sensors in the U.S. Army.
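The stages just described can be collected into a processing skeleton. The sketch below is purely illustrative of the generic pipeline of Fig. 1 (preprocessing, anomaly-based detection, then template-based discrimination); the hot-spot detector, the chip size, and the normalized-correlation matcher are assumptions chosen for compactness, not any fielded implementation, and the templates are assumed to be chip-size image models.

```python
import numpy as np
from scipy import ndimage

def normalized_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-mean normalized correlation between two equal-size image chips."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def atr_pipeline(frame: np.ndarray, templates: dict, k: float = 3.0, chip: int = 16):
    """Preprocess, detect anomalous ROIs, and label each ROI by best template match."""
    conditioned = ndimage.median_filter(frame.astype(np.float64), size=3)  # noise cleanup
    # Detection stage: flag pixels k standard deviations above the scene mean.
    mask = conditioned > conditioned.mean() + k * conditioned.std()
    labels, n = ndimage.label(mask)  # group anomalous pixels into candidate ROIs
    declarations = []
    for centroid in ndimage.center_of_mass(mask, labels, list(range(1, n + 1))):
        r, c = int(centroid[0]), int(centroid[1])
        roi = conditioned[max(r - chip // 2, 0):r + chip // 2,
                          max(c - chip // 2, 0):c + chip // 2]
        if roi.shape != (chip, chip):
            continue  # skip ROIs clipped by the image border
        # Higher-level features: compare the segmented region to stored templates.
        scores = {name: normalized_correlation(roi, t) for name, t in templates.items()}
        best = max(scores, key=scores.get)
        declarations.append(((r, c), best, scores[best]))  # location, label, confidence
    return declarations
```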
3. Figures of Merit

3.1 Three Bottom-Line Figures of Merit

The three bottom-line figures of merit for Ai/ATR are receiver operating characteristic (ROC) curves, confusion matrices, and time. ROC curves show the relationship of the algorithm detection probability to the false alarms; they show how well the ATR discriminates real targets of interest from noise sources or background clutter objects. Figure 3 shows a typical set of ROC curves for a developmental Army ATR.21 The different curves correspond to using different numbers of spectral bands in the midwave IR (MWIR) and long-wave IR spectral regions with a constant false alarm rate (CFAR) decision algorithm. Movement of the family of curves up and to the left indicates higher performance.

Confusion matrices show the relationship between the real target identity and what the ATR called it. Higher level discrimination performance, such as recognition or identification, is displayed in the confusion matrices. Figure 4 shows a stylized confusion matrix for algorithm identification performance against ground combat vehicles. A detailed discussion of the considerations for the measurement of confusion matrices is given in Ref. 22.

Time to acquire the targets within the sensor field of view and field of regard is the real benefit of using Ai/ATR. Measured AiTR timeline performance, when compared to human-alone performance, has been shown to realize an order of magnitude reduction (see Ref. 23).
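A confusion matrix can be tallied directly from paired ground-truth and declared labels, as in the minimal sketch below; the two-class vehicle list is a hypothetical example in the spirit of the stylized matrix of Fig. 4.

```python
import numpy as np

def confusion_matrix(truth, declared, classes):
    """Rows are true identities, columns are ATR declarations; entry [i, j]
    counts how often true class i was called class j."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for t, d in zip(truth, declared):
        cm[index[t], index[d]] += 1
    return cm

# Hypothetical example: an identification test over two vehicle classes.
cm = confusion_matrix(["M1", "T72", "M1"], ["M1", "M1", "M1"], ["M1", "T72"])
```

A perfect identifier concentrates all counts on the diagonal; off-diagonal mass shows which identities the algorithm confuses.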
3.2 Imagery Data Set

In order to carry out a performance evaluation of an ATR algorithm for imaging sensors in terms of ROC curves and confusion matrices, it is necessary to have a relevant imagery data set. For example, if the desired ATR algorithm is for a tank fire-control mission and the main sensor is the IR gunner's primary sight, then IR imagery of a scene with threat targets, for all variants and at all poses and orientations, is required in all relevant backgrounds. For air-to-air combat and surface naval warfare, a similar set of the relevant targets is required. Another scenario, close air support, requires a similar set of ground targets but with another set of variables for target articulation (i.e., which direction the gun is pointing). The issue of the target signature set has been well documented.24

These requirements imply the generation of a library of IR imagery, which is typically classified. Not only is the imagery classified, but so are the sensor parameters of the gathering device. This has been a significant issue for military Ai/ATR development. The necessary classified imagery is easy enough to obtain, and the service laboratories gather it extensively. However, the imagery set can be quite extensive and cannot be released to noncleared organizations. For example, university researchers with noncitizen students cannot get the necessary imagery to design and test algorithms. Instead, we have had to live with algorithms developed against civilian vehicles on U.S. highways. Extrapolation to realistic military scenarios is extremely difficult, if not impossible, and there can be no free back-and-forth interaction in the development of Ai/ATR among the government labs, defense industry, and academia.

The issue of the canonical imagery data set for performance quantification is so severe that a special DoD committee, the ATR Working Group,25 has been chartered to define problem sets. Sanctioned problem sets permit the establishment of universal metrics to assess algorithm performance collectively and scientifically. The same problem set can be used to test any number of candidate algorithms and permits quantifiable difference measurements across all candidates. The problem of an unclassified, canonical stimulus set of imagery has been somewhat addressed in the last several years with the release of a specially gathered, unclassified IR imagery set for Ai/ATR algorithm development. Unclassified sensors were used to obtain MWIR and visible imagery of tactical vehicles, civilian vehicles, and people in realistic tactical scenes, with corresponding ground-truth and meteorological data. This >300-GB imagery data set is available by contacting SENSIAC26 at a cost of several hundred dollars. Although this is only one data set for one scenario, it is a significant step toward enabling the injection of a wider academic community into Ai/ATR research.

Once stimulus data have been obtained, the data must be separated into a training set and a test set. The algorithm must be trained on a relevant set of imagery that relates to the mission scenario and will expose the algorithm to all the variables that it will be expected to handle. This means the target set must be appropriate, including not only the members of the set but also their variants across environments, operational conditions, and backgrounds. Various environments are needed because the same vehicle can appear differently from day to night, season to season, and even hour to hour; diurnal and seasonal variations are especially pronounced in the infrared spectral regions. A set of relevant operational conditions is needed because the target signature will vary with whether the target is stationary, moving, firing, rained on, camouflaged, etc. Different backgrounds present a variety of confusers and competitive false targets. Because it is impossible to sample the infinitely large set of conditions, a judicious set of samples must be chosen that represents a sufficient expanse of the complete target/background space and that gives some confidence that the entire space has been faithfully represented. Agreement on this point is usually a major bone of contention between a government evaluator and an industrial developer.

In order to use the chosen data set of stimuli for Ai/ATR testing, the data set must include ground-truth data. That is, the locations of legitimate targets in the scene must be determined digitally in order to score the ATR annotations. Usually, an error box is associated with each true target, and an ATR annotation within the box is accepted as a true detection. This can be a tedious process, even with modern computer software. There are only a few laboratories in the Defense Department that routinely ground-truth imagery and score Ai/ATR algorithms for the community, source selections, and mission accomplishment. Two of these are the Army's Night Vision & Electronic Sensors Directorate (NVESD) and the Air Force's Wright-Patterson research laboratories.
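The error-box scoring rule can be stated compactly. The sketch below is a minimal illustration, assuming axis-aligned error boxes and point annotations; real scoring software also handles multiple annotations per target and multiframe association.

```python
def score_annotations(annotations, truth_boxes):
    """Label each (x, y, confidence) annotation a detection if it falls inside
    any ground-truth error box (xmin, ymin, xmax, ymax); otherwise it is a
    false alarm. Each truth box is credited with at most one detection."""
    detections, false_alarms, hit = [], [], set()
    for (x, y, confidence) in annotations:
        matched = None
        for i, (xmin, ymin, xmax, ymax) in enumerate(truth_boxes):
            if i not in hit and xmin <= x <= xmax and ymin <= y <= ymax:
                matched = i
                break
        if matched is None:
            false_alarms.append((x, y, confidence))
        else:
            hit.add(matched)
            detections.append((x, y, confidence))
    return detections, false_alarms
```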
3.3 Simulated Imagery

One might consider the utility of generating simulated imagery of tactical scenes as a surrogate for realistic test imagery, which could obviate testing against all the possible real-world scenes. This concept has been investigated and shown to be problematic.27 Testing with simulated imagery has shown that, although the detection probabilities are quite comparable between synthetic and real imagery, the false alarm rate (FAR) with simulated imagery was much different from that with real image inputs. The hypothesized cause of this difference lies in the differences between real and simulated backgrounds, where false alarms are generated: the synthetic image generators evidently produce target-confusing regions in the background that differ from those of real backgrounds. Additionally, synthetic noise generation can differ significantly from true sensor noise characteristics, so the sensor noise must be characterized extremely well to simulate it realistically.

3.4 Receiver Operating Characteristic Curve Determination

In order to test an Ai/ATR algorithm for its detection performance, as determined by its ROC curve, the requisite data set, from a relevant imager viewing a relevant operational scene, is digitized and fed into the algorithm under test. Single-frame processors process the imagery, frame by frame, and nominate image sections as targets; usually, a recognition decision is also reported. The annotations are scored by comparison to the ground truth, and an ROC curve is generated for that data set (Fig. 5). Slightly different approaches to scoring and evaluation are required for multiframe processors and those designed to look for moving targets.

The Fig. 5 curve is generated by feeding the digitized image frames into the computer that hosts the ATR algorithm. As each detection decision is made, its location is matched to the ground-truth data file for the real targets. If the ATR declaration is within an established error region of a real target, it is recorded as a true detection; if not, it is recorded as a false alarm. The higher and farther to the left the curve is, the better the performance (see Fig. 3 for a set of curves showing performance improvements as the curves move up and to the left). The generated curve is unique to the processor, scene, training set, and test set. Herein lies the bane of ATR technology: any change in target condition, location, signature, background, or processor/algorithm characteristics can cause a different annotation call. The hope for the technology is that sufficiently robust algorithms can still correctly acquire the targets in their backgrounds to give the operator enough confidence to use the system, with the commensurate improvement in combat effectiveness. The reliability with which ATR can do this in military applications is generally not acceptable in all but a few situations. The veracity of this statement is difficult to substantiate without reference to classified literature.

The method of measuring how an ATR algorithm performs (i.e., the determination of the ROC curve) is crucial to understanding what we expect an ATR to do and establishes the database characteristics needed to evaluate it. The method described here was developed at the U.S. Army's CERDEC NVESD by a team led by Carl Hoover and Clare Walters.28, 29 Other evaluations of ATR ROC curves are similar. The first thing to recognize is that most ATR algorithms today, including those that have been tested in the laboratory, are based on a CFAR parameter. That is, the target determination rests on setting a threshold for the number of false alarms that will be tolerated. Whatever the algorithm parameters calculated from processing the digital image, a confidence is established as a function of those parameters based on training against a relevant image set. The threshold can then be chosen based on the desired FAR and detection probability (Pd), which are set by the operational requirements. Once the CFAR threshold is chosen, the ATR algorithm can be tested against a test/evaluation set of imagery that is different from the training set. The representativeness of the training data set relative to the test/operational set is always a matter of intense discussion between the algorithm developer and the service evaluator. With the CFAR threshold selected, the algorithm can be run against the test imagery set.
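To make the CFAR idea concrete, the following is a minimal sketch of a one-dimensional cell-averaging CFAR detector, a textbook form of CFAR offered here only as an illustration; the fielded imaging algorithms discussed above derive their thresholds from trained confidence statistics rather than from this simple sliding-window noise estimate, and the window sizes and scale factor are arbitrary example values.

```python
import numpy as np

def ca_cfar(signal: np.ndarray, train: int = 16, guard: int = 2, scale: float = 4.0):
    """Cell-averaging CFAR: threshold each cell at `scale` times the mean of
    the surrounding training cells (guard cells excluded), so the false-alarm
    rate stays roughly constant as the background level varies.
    Returns the indices of cells declared detections."""
    hits = []
    n = len(signal)
    for i in range(train + guard, n - train - guard):
        leading = signal[i - guard - train:i - guard]
        trailing = signal[i + guard + 1:i + guard + 1 + train]
        noise_estimate = np.concatenate([leading, trailing]).mean()
        if signal[i] > scale * noise_estimate:  # fixed multiplier -> fixed FAR
            hits.append(i)
    return hits
```

Raising `scale` trades detections for fewer false alarms, which is exactly the Pd-versus-FAR trade that the ROC curve quantifies.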
All annotations of the algorithm, real target or false alarm, can then be ordered by confidence. Critical to this process is relating the algorithm's annotations to real targets based on the image ground truth; how an annotation is established and scored as a true target hit or a false alarm is critical, and software has been developed to do this. The ROC curve shown in Fig. 5 can now be generated. Starting with the highest confidence value, a point is established on the Pd-versus-FAR axes. As the computer runs down the threshold confidence values, from highest to lowest, true target detections and false alarms are plotted. As the number of annotations increases, a smooth curve similar to Fig. 3 is generated.

The representativeness of the training imagery with respect to the operational situation is another crucial consideration. Careful, judicious choices must be made by the evaluator to ensure all real targets are deployed in tactically relevant scenes. The algorithm must be stressed such that the war fighter has confidence in its use. Conversely, the algorithm developer must understand the scenario in order to design the algorithm to go after the tactically significant artifacts in the scene.

It is obvious that range to the target can be very important information for the processor to help size the window of investigation, so range to points in the image must be estimated. If the weapon system has an integral rangefinder, then range is given to the algorithm under test. If not, the algorithm usually uses some programmed technique, such as a flat-earth approximation, to estimate range, which can introduce significant errors into the range value and, consequently, the target size. Knowledge of range in the scene can greatly enhance algorithm performance. Other approaches, such as rescaling selected regions in the image to a fixed range, have also been used.

The ROC curve is generated with the probability of detection on the vertical axis and false alarms on the horizontal axis. Usually, FAR is in units of false alarms per square degree for ground combat and false alarms per square kilometer for airborne sensors; on the ground, the terrain covered by the sensor field of view extends from very close range out to the horizon, so angular units are the natural normalization. Typically, this experimental ROC curve is compared to a specification ROC curve based on a weapon system requirement to determine whether the algorithm meets the performance requirement.
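Given scored annotations (for example, from the error-box scoring sketch in Sec. 3.2), the confidence sweep just described reduces to a few lines. This is an illustrative reconstruction of the procedure, assuming each annotation carries a confidence and a true/false label and that the surveyed angular area is known.

```python
import numpy as np

def roc_points(confidences, is_target, n_true_targets, area_deg2):
    """Sweep the confidence threshold from highest to lowest, accumulating
    Pd (fraction of true targets found) and FAR (false alarms per square degree)."""
    order = np.argsort(confidences)[::-1]      # highest confidence first
    labels = np.asarray(is_target, dtype=bool)[order]
    pd = np.cumsum(labels) / n_true_targets    # running detection probability
    far = np.cumsum(~labels) / area_deg2       # running false-alarm density
    return far, pd                             # plot pd versus far
```

Each threshold setting contributes one (FAR, Pd) point; as annotations accumulate, the points trace out a smooth curve like those in Fig. 3.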
4. What's the Problem?

The extreme difficulty of the military target acquisition task has thwarted progress in the development of image-processing techniques that enable an acceptable level of performance for the war fighter in harm's way. Aided target recognition in relatively benign environments, such as low clutter, has been shown to perform at a useful level. However, medium to highly cluttered backgrounds introduce an unacceptable number of false alarms, and target variability and operational environmental conditions also have a significant degrading effect. Higher level discriminations, such as target recognition and identification, fall off significantly compared to detection. Previous technical articles on the performance of military Ai/ATR technology can be found in the literature.30, 31, 32, 33 An excellent synopsis of types of algorithms is given by Bhanu.34

4.1 False Alarms

A primary operational limitation of ATR is the false alarm problem, driven by objects in the scene that can be confused with targets (confusers) and by background clutter, which cause the operator to spend excessive time interrogating them. The problem is exacerbated by sensors that lack the resolution of visible imagers and are not familiar to normal human vision, such as thermal imagers. Additionally, in tactical situations the threat can introduce countermeasures such as camouflage and defilade. The ability of humans to discern targets is still significantly greater than that of electronic processing algorithms.35 However, humans cannot process the available information and make decisions at a fast enough rate to engage targets effectively. The electronics can process the information at a much faster rate, and that is why the military continues to pursue an effective Ai/ATR technology for military combat requirements. It has been shown that timelines for target acquisition can be reduced by an order of magnitude using Ai/ATR with a human compared to a human alone.35

Although there have been some successes in military Ai/ATR in the services, there have been significant limitations relative to the desired performance. The main challenge identified for military Ai/ATR is the level of false alarms for detection encountered in real environments. The level of false alarms in a tactical ground-to-ground scenario can be sufficient that the operator will turn off the AiTR/ATR. Besides increasing the time to acquire the real target and the operator's frustration, false alarms can be dangerous: firing at a false target gives away the position of the firing platform and makes it a target for counterfire.

4.2 Clutter

A primary limitation of ATR technology is the lack of an understanding of clutter and of a reliable clutter model that can quantify scene difficulty. This difficulty is compounded by the obvious dependence of scene difficulty on the target of interest: clutter that confuses detection of a vehicle is different from clutter that confuses detection of personnel. Clutter models that are more sophisticated than the simple signal-to-clutter models representative of human performance models are required. Examples of approaches to quantifying clutter, such as Lanterman et al.,36 appear in the literature, and there are information-theoretic approaches.37, 38 The ultimate clutter metric must surely contain some target conspicuity factor; a clutter metric that is primarily a function of signal-to-noise ratio or signal-to-clutter ratio will not show the true dependency of performance on real-world clutter. Further discussion of clutter modeling39 can be found in research funded by the Army at the Center for Image Sciences.40 Besides clutter, camouflage and signature disrupters can also degrade Ai/ATR performance, which is another major reason Ai/ATR has been very difficult for military applications.
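For reference, the simple signal-to-clutter ratio that such models must improve upon can be written in a few lines. This sketch assumes a binary target mask over the image, and the particular definition (target-to-background contrast divided by the background standard deviation) is one common convention among several; as argued above, metrics of this form do not capture true scene difficulty.

```python
import numpy as np

def signal_to_clutter(image: np.ndarray, target_mask: np.ndarray) -> float:
    """Mean target-to-background contrast divided by the standard deviation
    of the background pixels (the 'clutter'), under one common convention."""
    target = image[target_mask]
    background = image[~target_mask]
    return abs(target.mean() - background.mean()) / background.std()
```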
4.3 Target Variability

Another performance limiter for aided target acquisition is target variability under operational conditions. The target can present at any aspect and can have different signatures under different environmental, operational, and background conditions. Camouflage, concealment, and deception (decoys) increase the target dimensional space significantly. These variables plague Ai/ATRs in all the services and set the most severe limitations on the performance of this technology today. Probabilities for discriminations of higher order than detection degrade as the required sophistication increases. The limitations imposed by false alarms and variable environmental conditions might imply that the best we can hope for is aided target recognition; full automatic target recognition may be unattainable or, at best, take a long time to mature.

5. What Works?

Years of research and development, coupled with constant test-fix-test cycles for specific ad hoc mission targeting applications, have resulted in the level of maturity of Ai/ATR that we have today. It is impossible to give precise data representative of state-of-the-art performance in an unclassified forum. Shape-based approaches to ground-to-ground stationary target indication have been shown to give useful performance in low to medium clutter. The clutter level is subjective because there has been no clutter metric to accurately determine task difficulty; a reliable clutter metric remains one of the greatest unmet challenges of this technology. Shape-based algorithm suites are typically template-matching schemes or comparisons of target images to stored target models.

Ai/ATR from airborne sensor platforms has been shown to perform somewhat better than ground to ground. This is because the clutter seen from the air is not as competitive as in the ground scenario. There is also the advantage of recognizing an overhead aspect, which is not as complex as ground-to-ground aspects and is not as easily confused with overhead views of ground clutter. In addition, aircraft altitude is typically known, which makes range estimation easier, and the sensor field of view does not extend to infinity as it does on the ground. Atmospheric attenuation from the air also tends to be much less than for ground-to-ground lines of sight.

The current conflicts in southwest Asia have refocused the important application of Ai/ATR from fire control to persistent surveillance missions. Previously, the focus of the technology was on the acquisition of targets for the Comanche helicopter fire control. The paramount application today, persistent surveillance to detect hostile activity, is potentially a somewhat easier task. The approaches developed for PS that have had some success are change detection and MTI. Detection of new targets and missing targets in images, compared to previous images of the same location, has had some success. MTI from the air and from stationary ground platforms has also been demonstrated. However, MTI from moving ground platforms has problems with optical flow and confusion as to whether the motion is target or platform induced.

Two more sophisticated sensor approaches that offer more image features on which a decision can be based are MS/HSI and 3-D LADAR. However, these system concepts require increased system complexity and cost. Both add another orthogonal dimension to the decision space. MS/HSI uses the unique spectral content of objects as the discrimination metric between backgrounds and targets. MS/HSI sensors can be used in a search mode for target detection; however, they presently operate only in daytime and are large, expensive systems. 3-D LADAR–based imagers add the depth dimension to the image as another target discriminator.
The system-level downside of 3-D LADAR is that it has high power requirements, is not useful for searching, and its laser power requirements constrain the practical range that can be realized.

6. Opportunities for Advances in Aided and Automatic Target Recognition Performance

There are many potential applications throughout the services for reliable Ai/ATRs. However, except for a small number of applications, the attainable level of performance must be significantly improved to handle all the false alarms and environmental variables that are encountered in military scenarios. We cannot look to improvements in the imaging sensors being used as the front ends for Ai/ATRs; they are already pushing the limits of physics. Performance improvements must come from the ATR algorithm concepts or from the way AiTR annotations are presented to the observer in order to engage more of the observer's intellectual image processing. Any improvements realized in ATR performance will therefore be due to algorithm improvements in software rather than to improved sensor hardware.

New techniques for extracting objects from complex backgrounds are needed. These new techniques would be expected to originate in academia and could form the basis of a new springboard for image science in the pursuit of a useful military Ai/ATR capability. Candidate starting points for militarily relevant image-science approaches to Ai/ATR are pattern-theoretic approaches to understanding complex scenes41 or a recognition-by-parts approach.42, 43, 44 The recognition of a tactical, canonical geon in an image that is partially obscured could imply the presence of a target of interest. Eye-brain research could lead to more understanding of what needs to be extracted from a tactical image and presented to the operator for enhanced recognition ability. Other nonimage-based formalisms, such as category theory,45 hierarchical systems,46 and gradient vector flow,47 might also be applied to the Ai/ATR problem.

A system-level approach to increasing Ai/ATR performance is to take advantage of tactical networks on the battlefield. There is a plethora of imaging and nonimaging sensors on the battlefield that are being networked together for transmission of information, such as targets, across platforms. At each platform, the ATR could take the off-board data and build a case for each indicated onboard detection as to whether it is a target or a false alarm. For example, an unattended acoustic sensor could supply information to a tank computer that an image annotation was, in fact, another tank. An example of an existing Army system designed for this kind of decision making is the distributed common ground station. This approach to low-false-alarm Ai/ATR would be limited by the bandwidth of the tactical network; it stresses the tactical network capabilities with some sophistication improvement in the algorithm software and no impact on the sensor.
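As a sketch of how an off-board cue might firm up an onboard declaration, the following combines an image-ATR confidence with an independent acoustic cue under a naive-Bayes independence assumption; the prior, the likelihood-ratio values, and the sensor models are all hypothetical illustrations, not any fielded fusion scheme.

```python
def fuse_detection(p_target_prior: float, image_lr: float, acoustic_lr: float) -> float:
    """Naive-Bayes fusion of independent cues expressed as likelihood ratios
    P(cue | target) / P(cue | no target). Returns the posterior P(target | cues)."""
    prior_odds = p_target_prior / (1.0 - p_target_prior)
    posterior_odds = prior_odds * image_lr * acoustic_lr
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical numbers: a weak image annotation (LR 3) reinforced by a strong
# acoustic cue (LR 10) against a 5% prior raises the posterior to about 61%.
p = fuse_detection(0.05, 3.0, 10.0)
```

The point of the sketch is the architecture: a marginal onboard annotation that would otherwise be discarded, or displayed as a likely false alarm, can cross the declaration threshold when corroborated by a cheap off-board cue.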
7. Summary and Conclusions

Ai/ATR can provide significant enhancement to military weapons platforms over human-only performance. AiTR can provide enhancements to the weapons operator or intelligence analyst for fire control, surveillance, reconnaissance, intelligence, persistent surveillance, and situational awareness. ATR can provide fully autonomous target engagements, such as for missile seekers. Present use of Ai/ATR by the military has been limited by the difficulty of the automated task. However, the technology is being pursued in academia, industry, and government laboratories.

The most prevalent state-of-the-art Ai/ATR algorithms today are shape based, and their performance degrades significantly under realistic operational conditions, such as clutter and target-set variability. Ground-to-ground degradations have been shown to be more severe than those for airborne target acquisition. Enhancements to shape-based approaches can potentially provide a more robust capability; temporal techniques, such as change detection and MTI, are examples. Sensor-level improvements are MS/HSI for wide-area search and camouflage detection and 3-D LADAR for higher level recognition and identification.

Aided target recognition will mature more rapidly than ATR. By off-loading the higher level decisions to the human, AiTR can potentially provide an order of magnitude improvement in target acquisition times. This is significant in war fighting, with increasingly more importance in urban warfare.

Acknowledgments

The author thanks Lynda Graceffo for Fig. 1 and Dr. Raguveer Rao, Dr. Shuowen Hu, and Dr. Asif Mehmood for their technical comments. Special thanks go to Dr. Clare Walters for the IR imagery, his technical discussions, and for figuring out how to test these ATRs, and to Dr. Susan Young for her comments and help in organizing this paper. The author also acknowledges Carl Hoover, posthumously, for his contribution and leadership in the development of the ATR measurement facility, procedures, and techniques at the U.S. Army Night Vision Laboratory.
References

1. E. M. Winter, M. J. Schlangen, A. P. Bowman, M. R. Carter, C. L. Bennett, D. J. Fields, W. D. Aimonetti, P. G. Lucey, J. R. Johnson, K. A. Horton, T. J. Williams, A. D. Stocker, A. Oshagan, A. T. DePersia, and C. J. Sayre, "Experiments to support the development of techniques for hyperspectral mine detection," Proc. SPIE 2759, 139–148 (1996). https://doi.org/10.1117/12.241163
2. P. G. Lucey, K. A. Horton, and T. Williams, "Performance of a longwave infrared hyperspectral imager using a Sagnac interferometer and an uncooled microbolometer array," Appl. Opt. 47(28), F107–F113 (2008). https://doi.org/10.1364/AO.47.00F107
3. R.-N. P. Singh and M. A. Abdallah, "Image-based automatic target recognition," (2000).
4. A. B. Nevis, T. James, and B. J. S. Cordes, "Object detection using a background anomaly approach for electro-optic identification sensors," (2002).
5. M. J. Marlin, "Wide area search and engagement simulation validation," (2007).
6. Y. Kang, "Combat identification using multiple TUAV swarm," (2008).
7. G. Wilson, "A time-critical targeting roadmap," (April 2002).
8. J. R. Rufa, "Development of an experimental platform for testing autonomous UAV guidance and control algorithms," (March 2007).
9. J. Johnson, "Analysis of image forming systems," 249–273 (1958).
10. J. A. Ratches, "Static performance model for thermal imaging systems," Opt. Eng. 15(6), 525–530 (1976).
11. J. A. Ratches, "Night vision modeling; historical perspective," Proc. SPIE 3701, 1–12 (1999).
12. J. A. Ratches, R. H. Vollmerhausen, and R. G. Driggers, "Target acquisition performance modeling of infrared imaging systems: past, present, and future," IEEE Sens. J. 1(1), 31–40 (2001). https://doi.org/10.1109/JSEN.2001.923585
13. R. G. Driggers, J. A. Ratches, J. C. Leachtenauer, and R. W. Kistner, "Synthetic aperture radar target acquisition model based on National Imagery Interoperability Rating Scale to probability of discrimination conversion," Opt. Eng. 42(7), 2104–2112 (2003). https://doi.org/10.1117/1.1580831
14. F. Selzer and D. Gutfinger, "LADAR and FLIR based sensor fusion for automatic target classification," Proc. SPIE 1003, 236–246 (1989).
15. A. V. Forman, D. J. Sullivan, and A. W. Chang, "Parallel algorithm for automatic target recognition using laser radar imagery," Proc. SPIE 1348, 493–502 (1990). https://doi.org/10.1117/12.23503
16. M. Snorrason, H. Ruda, and A. Caglayan, "Automatic target recognition in laser radar imagery," 552–560 (1994).
17. A. El-Saba, M. S. Alam, and W. A. Sakla, "Pattern recognition via multispectral, hyperspectral and polarization-based imaging," Proc. SPIE 7696, 76961M (2010). https://doi.org/10.1117/12.850520
18. P. W. T. Yuen and M. Richardson, "An introduction to hyperspectral imaging and its application for security, surveillance and target acquisition," Imaging Sci. J. 58(5), 241–253 (2010). https://doi.org/10.1179/174313110X12771950995716
19. M. T. Eismann, J. Meola, and A. D. Stocker, "Automated hyperspectral target detection and change detection from an airborne platform: progress and challenges," 4354–4357 (2010).
20. B. Bhanu, "Automatic target recognition: state of the art survey," IEEE Trans. Aerosp. Electron. Syst. AES-22(4), 364–379 (1986). https://doi.org/10.1109/TAES.1986.310772
21. A. Mehmood and N. Nasrabadi, "Wavelet-RX anomaly detection for dual-band forward-looking infrared imagery," Appl. Opt. 49(24), 4621–4632 (2010). https://doi.org/10.1364/AO.49.004621
22. T. D. Ross, L. A. Westerkamp, R. L. Dilsavor, and J. C. Mossing, "Performance measures for summarizing confusion matrices: the AFRL COMPASE approach," Proc. SPIE 4727, 310–321 (2002). https://doi.org/10.1117/12.478692
23. D. Reago and W. Gercken, "Machine aided search: results of human performance," (1996).
24. E. G. Zelnio, F. D. Garber, L. Westerkamp, S. W. Worrell, J. J. Westerkamp, M. Jarrat, C. E. Deardorf, and P. A. Ryan, "Characterization of ATR systems," Proc. SPIE 3070, 223–234 (1997). https://doi.org/10.1117/12.281560
25. L. A. Westerkamp, T. J. Wild, D. Meredith, S. A. Morrison, J. C. Mossing, R. K. Avent, A. Bergman, A. Bruckheim, D. A. Castanon, F. J. Corbett, D. Hugo, R. A. Hummel, J. M. Irvine, B. Merle, L. Otto, R. Reynolds, C. Sadowski, B. J. Schachter, K. M. Simonson, G. Smit, and C. P. Walters, "Problem set guidelines to facilitate ATR research, development and performance assessment," Proc. SPIE 4726, 310–315 (2002). https://doi.org/10.1117/12.477039
26. SENSIAC is a DOD Information Analysis Center sponsored by the Defense Technical Information Center, www.sensiac.gatech.edu.
27. C. P. Walters, C. Hoover, and J. A. Ratches, "Performance of an automatic target recognizer algorithm against real and two versions of synthetic imagery," Opt. Eng. 39(8), 2270–2284 (2000). https://doi.org/10.1117/1.1305538
28. C. P. Walters, "Removing the automatic target recognition performance evaluation bottleneck: the C2NVEO AUTOSPEC facility," Opt. Eng. 30, 247–253 (1991). https://doi.org/10.1117/12.55806
29. C. P. Walters, "Development of imagery sets and performance evaluation methods for MTI using ground-based thermal imagers," B3–B7 (2001).
30. J. A. Ratches, C. P. Walters, R. G. Buser, and B. G. Guenther, "Aided and automatic target recognition based upon sensory inputs from image forming systems," IEEE Trans. Pattern Anal. Mach. Intell. 19(9), 1004–1019 (1997). https://doi.org/10.1109/34.615449
31. S. Chow and T. Jones, "Army's FLIR/ATR evolution path," (1989).
32. J. A. Ratches, P. Gillespie, J. Hilger, and C. P. Walters, "Autonomous/aided target recognition (ATR/AiTR) assessment," (2006).
33. B. Bhanu and T. L. Jones, "Image understanding research for automatic target recognition," IEEE Aerosp. Electron. Syst. Mag. 8(10), 15–23 (1993). https://doi.org/10.1109/62.240102
34. B. Bhanu, "Automatic target recognition: state of the art survey," IEEE Trans. Aerosp. Electron. Syst. AES-22(4), 364–379 (1986). https://doi.org/10.1109/TAES.1986.310772
35. B. Weber, "Comparison of human observer and algorithmic target detection in non-urban FLIR imagery," Opt. Eng. 44(7), 076401 (2005). https://doi.org/10.1117/1.1948147
36. A. D. Lanterman, J. A. O'Sullivan, and M. I. Miller, "Kullback-Leibler distances for quantifying clutter and models," Opt. Eng. 38(12), 2134–2146 (1999). https://doi.org/10.1117/1.602323
37. L. G. Clark and V. J. Velten, "Image characterization for automatic target recognition algorithm evaluations," Opt. Eng. 30(2), 147–153 (1991). https://doi.org/10.1117/12.55784
38. L. G. Clark, L. I. Perlovsky, W. H. Schoendorf, C. P. Plum, and T. J. Keller, "Evaluation of forward-looking infrared sensors for automatic target recognition using an information-theoretic approach," Opt. Eng. 31(12), 2618–2627 (1992). https://doi.org/10.1117/12.60009
39. S. C. Zhu, A. Lanterman, and M. Miller, "Clutter modeling and performance analysis in automatic target recognition," (1998).
40. Army Research Office funded Center for Image Sciences at Washington University, St. Louis, MO, Contract No. DAAD-19-99-1-0012 (1999–2001).
41. A. D. Lanterman, M. I. Miller, and D. L. Snyder, "The unification of detection, tracking, and recognition for millimeter wave and infrared sensors," Proc. SPIE 2562, 150–161 (1995). https://doi.org/10.1117/12.216951
42. I. Biederman, "Recognition-by-components: a theory of human image understanding," Psychol. Rev. 94(2), 115–147 (1987). https://doi.org/10.1037/0033-295X.94.2.115
43. S. Agarwal, A. Awan, and D. Roth, "Learning to detect objects in images via a sparse, part-based representation," IEEE Trans. Pattern Anal. Mach. Intell. 26(11), 1475–1490 (2004). https://doi.org/10.1109/TPAMI.2004.108
44. C. Wallraven, A. Schwaninger, S. Schuhmacher, and H. H. Bülthoff, "View-based recognition of faces in man and machine: re-visiting inter-extra-ortho," Biologically Motivated Computer Vision, Lect. Notes Comput. Sci. 2525, 651–660 (2002). https://doi.org/10.1007/3-540-36181-2
45. S. A. DeLoach, "Category theory approach to fusion of wavelet-based features," 117–124 (1999).
46. R. Song, H. Ji, S. Xia, W. Hu, and W. Yu, "Hierarchical modular structure for automatic target recognition systems," Proc. SPIE 4554, 57–61 (2001).
47. C. Xu and J. L. Prince, "Snakes, shapes, and gradient vector flow," IEEE Trans. Image Process. 7(3), 359–369 (1998).
James A. Ratches received a BS degree from Trinity College, Hartford, CT in 1964 and MS and PhD degrees from Worcester Polytechnic Institute, Worcester, MA in 1966 and 1969, respectively. He retired from the U.S. Army's Communications-Electronics Command Night Vision and Electro-Optics Directorate (NVESD) in 2009 after almost 40 years of service. During his career, he was a major contributor to the areas of modeling, simulation, characterization, and evaluation of electro-optical systems. He retired as Chief Scientist for NVESD. Since 2009, he has been the Associate Director for S&T at the U.S. Army Research Laboratory's Sensors & Electron Devices Directorate (SEDD) at Adelphi, MD. SEDD conducts fundamental research for the Army in electro-optics and photonics, signal and image processing, electronics and RF technology, and power and energy.