Proc. SPIE. 8048, Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XVII
KEYWORDS: Target detection, Signal to noise ratio, Hyperspectral imaging, Detection and tracking algorithms, Sensors, Reflectivity, Transform theory, Image classification, Heads up displays, Signal detection
This paper describes a novel approach for the detection and classification of man-made objects using discriminating
features derived from higher-order spectra (HOS), defined in terms of higher-order moments of hyperspectral signals.
Many existing hyperspectral analysis techniques are based on linearity assumptions. However, recent research suggests
that significant nonlinearity arises due to multipath scatter, as well as spatially varying atmospheric water vapor
concentrations. Higher-order spectra characterize subtle complex nonlinear dependencies in spectral phenomenology of
objects in hyperspectral data and are insensitive to additive Gaussian noise. By exploiting these HOS properties, we have
devised a robust method for classifying man-made objects from hyperspectral signatures despite the presence of strong background noise, confusers with spectrally similar signatures, and variable signal-to-noise ratios. We tested classification performance on hyperspectral imagery collected from several different sensor platforms and compared our algorithm with conventional classifiers based on linear models. Our experimental results demonstrate that our HOS algorithm produces significant reductions in false alarms. Furthermore, when HOS-based features were combined with standard features derived from spectral properties, the overall classification accuracy was substantially improved.
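The central HOS quantity behind such features is the bispectrum, the frequency-domain counterpart of the third-order moment, which vanishes for Gaussian processes (hence the insensitivity to additive Gaussian noise). As a minimal illustrative sketch (not the authors' implementation; segment length and averaging scheme are assumptions), a segment-averaged bispectrum magnitude for a 1-D signature can be estimated as follows:

```python
import numpy as np

def bispectrum_features(signal, nfft=64):
    """Estimate |B(f1, f2)| by averaging X(f1) X(f2) X*(f1+f2) over
    non-overlapping segments.  nfft and the segmentation are assumed."""
    segs = [signal[i:i + nfft] for i in range(0, len(signal) - nfft + 1, nfft)]
    B = np.zeros((nfft, nfft), dtype=complex)
    for s in segs:
        X = np.fft.fft(s - s.mean())          # remove mean before transforming
        for f1 in range(nfft // 2):
            for f2 in range(f1 + 1):          # exploit bispectrum symmetry
                B[f1, f2] += X[f1] * X[f2] * np.conj(X[f1 + f2])
    B /= max(len(segs), 1)
    return np.abs(B)                          # magnitude used as a feature map
```

For a Gaussian-noise input the averaged magnitudes decay toward zero as more segments are averaged, while nonlinearly coupled frequencies leave persistent peaks.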
This paper presents the results of an ONR-sponsored ocean-surface reconstruction project. The goal of the LIDAR project is to investigate a method suitable for obtaining the shape, and in particular the slopes, of large gravity waves, to be used in a Navy application for underwater mine detection. Toward this goal, Summus has designed, built, and tested a laser-based device for water surface measurement. A field test was conducted at the Army Field Research Facility at Kitty Hawk, North Carolina in October 1998. This paper describes the basic design of our method and the experimental results.
In this paper, we consider the problem of locating and extracting text from WWW images. A previous algorithm based on color clustering and connected components analysis works well as long as the color of each character is relatively uniform and the typography is fairly simple. It breaks down quickly, however, when these assumptions are violated. In this paper, we describe more robust techniques for dealing with this challenging problem. We present an improved color clustering algorithm that measures similarity based on both RGB and spatial proximity. Layout analysis is also incorporated to handle more complex typography. These changes significantly enhance the performance of our text detection procedure.
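One way to realize a similarity measure combining RGB and spatial proximity is a weighted sum of a color distance and a pixel-coordinate distance. The sketch below (the weights, greedy centroid assignment, and threshold are illustrative assumptions, not the paper's algorithm) clusters (r, g, b, x, y) points:

```python
import numpy as np

def combined_distance(p, q, w_color=1.0, w_space=0.5):
    """p, q are (r, g, b, x, y) tuples; the weights are assumed values."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    color = np.linalg.norm(p[:3] - q[:3])   # RGB distance
    space = np.linalg.norm(p[3:] - q[3:])   # pixel-coordinate distance
    return w_color * color + w_space * space

def cluster(points, thresh):
    """Greedy single-pass clustering: a point joins the first cluster whose
    seed point is within thresh, otherwise it starts a new cluster."""
    seeds, labels = [], []
    for p in points:
        for k, c in enumerate(seeds):
            if combined_distance(p, c) < thresh:
                labels.append(k)
                break
        else:
            seeds.append(p)
            labels.append(len(seeds) - 1)
    return labels
```

The spatial term keeps same-colored but distant regions (e.g., a character and an unrelated background patch) from collapsing into one cluster.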
A significant amount of text now present in World Wide Web documents is embedded in image data, and a large portion of it does not appear elsewhere at all. To make this information available, we need to develop techniques for recovering textual information from in-line Web images. In this paper, we describe two methods for Web image OCR. Recognizing text extracted from in-line Web images is difficult because characters in these images are often rendered at a low spatial resolution. Such images are typically considered to be 'low quality' by traditional OCR technologies. Our proposed methods utilize the information contained in the color bits to compensate for the loss of information due to low sampling resolution. The first method uses a polynomial surface fitting technique for object recognition. The second method is based on the traditional n-tuple technique. We collected a small set of character samples from Web documents and tested the two algorithms. Preliminary experimental results show that our n-tuple method works quite well. However, the surface fitting method performs rather poorly due to the coarseness and small number of color shades used in the text.
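The n-tuple technique mentioned above classifies by memorizing, for each class, the value patterns seen at fixed random tuples of pixel locations. A minimal sketch (binary images and all parameters are assumptions for illustration; this is not the paper's color-bit variant) looks like this:

```python
import random

class NTupleClassifier:
    """Minimal n-tuple classifier: each tuple of pixel positions indexes a
    per-class table of observed value patterns; classification scores a
    class by how many of its tables contain the input's patterns."""

    def __init__(self, shape, n=4, num_tuples=8, seed=0):
        rng = random.Random(seed)
        pixels = [(i, j) for i in range(shape[0]) for j in range(shape[1])]
        self.tuples = [rng.sample(pixels, n) for _ in range(num_tuples)]
        self.tables = {}  # label -> list of pattern sets, one per tuple

    def _patterns(self, img):
        return [tuple(img[i][j] for i, j in t) for t in self.tuples]

    def train(self, img, label):
        tabs = self.tables.setdefault(label, [set() for _ in self.tuples])
        for tab, pat in zip(tabs, self._patterns(img)):
            tab.add(pat)

    def classify(self, img):
        pats = self._patterns(img)
        scores = {c: sum(p in tab for tab, p in zip(tabs, pats))
                  for c, tabs in self.tables.items()}
        return max(scores, key=scores.get)
```

Extending the pixel values from binary to several color shades, as the abstract describes, enlarges the pattern alphabet without changing the table-lookup structure.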
Traditional document analysis systems often adopt a top-down framework, i.e., they are composed of various locally interacting functional components, guided by a central control mechanism. The design of each component is determined by a human expert and is optimized for a given class of inputs. Such a system can fail when confronted by an input that falls outside its anticipated domain. This paper investigates the use of a genetic-based adaptive mechanism in the analysis of complex text formatting. Specifically, we explore a genetic approach to the binarization problem. As opposed to a single, pre-defined, 'optimal' thresholding scheme, the genetic-based process applies various known methods and evaluates their effectiveness on the input image. Individual regions are treated independently, while the genetic algorithm attempts to optimize the overall result for the entire page. Advantages and disadvantages of this approach are discussed.
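The region-wise search can be pictured as a genetic algorithm whose chromosome assigns one thresholding method to each page region, with page-level fitness summed over regions. The sketch below is a generic GA under assumed parameters; the candidate method names and the fitness function are hypothetical placeholders, not the paper's actual components:

```python
import random

METHODS = ["otsu", "niblack", "fixed128"]  # hypothetical candidate set

def evolve(num_regions, score, generations=40, pop_size=20):
    """score(region_index, method_name) -> float, higher is better (assumed).
    Chromosome: one method index per region.  Top half survives (elitism);
    the rest are single-point crossover children with 10% mutation."""
    pop = [[random.randrange(len(METHODS)) for _ in range(num_regions)]
           for _ in range(pop_size)]

    def fitness(chrom):
        return sum(score(r, METHODS[m]) for r, m in enumerate(chrom))

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, num_regions) if num_regions > 1 else 0
            child = a[:cut] + b[cut:]
            if random.random() < 0.1:  # mutation: re-draw one gene
                child[random.randrange(num_regions)] = random.randrange(len(METHODS))
            children.append(child)
        pop = parents + children

    best = max(pop, key=fitness)
    return [METHODS[m] for m in best]
```

Because fitness is evaluated on the page as a whole, the GA can trade off a locally suboptimal choice in one region against better overall legibility.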
In this paper, we examine the effects of systematic differences (bias) and sample size (variance) on computed OCR accuracy. We present results from large-scale experiments simulating several groups of researchers attempting to perform the same test, but using slightly different equipment and procedures. We first demonstrate that seemingly minor systematic differences between experiments can result in significant biases in the computed OCR accuracy. Then we show that while a relatively small number of pages is sufficient to obtain a precise estimate of accuracy in the case of 'clean' input, real-world degradation can greatly increase the required sample size.
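The sample-size effect can be illustrated with a standard binomial confidence-interval calculation (a textbook back-of-the-envelope sketch, not the paper's experimental protocol; characters are assumed i.i.d., which degradation typically violates, making the real requirement even larger):

```python
import math

def required_pages(accuracy, chars_per_page, half_width, z=1.96):
    """Pages needed so a 95% Wald interval on character accuracy has the
    given half-width: n = z^2 p(1-p) / h^2 characters (i.i.d. assumed)."""
    n_chars = math.ceil(z ** 2 * accuracy * (1 - accuracy) / half_width ** 2)
    return math.ceil(n_chars / chars_per_page)
```

At 2,000 characters per page and a desired half-width of 0.1 percentage points, 'clean' input at 99% accuracy needs on the order of 20 pages, while degraded input at 90% accuracy needs nearly ten times as many, since the variance term p(1-p) grows as accuracy drops.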
Interpretation of gaps and touching characters continues to challenge current OCR designs. We approach this and other difficult problems in character recognition by deferring decisions to a stage where a character-specific knowledge base can be applied to the problem. We show how to extract and interpret saddle ridge features, at locations where there is either a narrow gap or a thin stroke. Since the color of the ideal image at these points cannot be reliably deduced from local features, special treatment is needed. Mathematically, a saddle ridge is a location where, roughly, the Hessian of the gray-scale surface has strong eigenvalues of opposite sign. The recognition module is based on the matching of subgraphs homomorphic to previously defined prototypes. It generates candidate matchings of groups of input features with each part of the prototype. In this context, each saddle ridge is decided to be a piece of a stroke or a separation between strokes. The quality of each grouping is measured by the cost of transformations carrying the candidate features into the prototype.
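The Hessian-eigenvalue criterion can be made concrete with finite differences: at a saddle-ridge candidate, the two eigenvalues of the second-derivative matrix have opposite signs and both are large in magnitude. The sketch below (a naive per-pixel test with an assumed strength threshold, not the paper's feature extractor) flags such pixels:

```python
import numpy as np

def saddle_ridge_mask(img, thresh=1.0):
    """Flag pixels where the Hessian of the gray-scale surface has strong
    eigenvalues of opposite sign.  thresh is an assumed strength cutoff."""
    gy, gx = np.gradient(img.astype(float))      # first derivatives
    gyy, gyx = np.gradient(gy)                   # second derivatives
    gxy, gxx = np.gradient(gx)
    mask = np.zeros(img.shape, dtype=bool)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            H = np.array([[gxx[i, j], gxy[i, j]],
                          [gyx[i, j], gyy[i, j]]])
            l1, l2 = np.linalg.eigvalsh((H + H.T) / 2)  # symmetrize
            if l1 * l2 < 0 and min(abs(l1), abs(l2)) > thresh:
                mask[i, j] = True
    return mask
```

On the ideal saddle surface z = x² − y² the eigenvalues are +2 and −2 everywhere, so every interior pixel is flagged; on a flat or purely convex patch, none are.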
This paper reports proof of concept of a design for recognizing postal address blocks. The system must function with varying and unspecified fonts, dot matrix printing, and poor print quality. Our design achieves tolerance to differing contrast and degraded print via grayscale analysis, and omnifont capability by encoding character shapes as graphs. The current prototype, restricted to digits, successfully recognizes degraded numeric fields. There are four major modules. First, the strokes comprising each character are detected as ridges in grayscale space. Our design is tolerant of wide contrast variation even within a single character, and produces connected strokes from dot matrix print. Second, strokes are grouped to produce line segments and arcs, which are linked to produce a graph describing the character. The third stage, recognition by matching the input character graph to prototype graphs, is described in a companion paper by Rocha and Pavlidis. Finally, secondary classification is applied to break near ties by focusing on discriminating features. The secondary classifier is described in a companion paper by Zhou and Pavlidis. Experimental results on 2000 address blocks supplied by the USPS are presented. We also report experiments on subsampling the data, which indicate that the performance at 100 dpi is very close to that at the original 300 dpi.
Certain characters are distinguishable from each other only by fine detail, and, therefore, in our method we group those characters into hierarchical categories. When the first classifier assigns a character to one of those categories, the second process, which is called disambiguation, is applied. We actually use two types of disambiguators. In one we look at the skeleton graph in finer detail and in the other we look at the original gray scale data. The first disambiguation process is geared toward resolving ties between the top two choices of the first classifier (pair disambiguation). Each procedure uses the skeleton features matched to the prototypes by the first classifier and then takes a closer look at the geometrical relations between arcs and strokes. Error analysis of the results suggested the need for the re-examination of the gray scale data. For this purpose the gray scale values of skeletons are used to find an accurate threshold in order to extract contours. Interpretations whose characteristics do not align well with the features measured from the contours are eliminated progressively. Experiments conducted with this method on address blocks supplied by the USPS indicated that the overall performance was substantially improved.
Omnifont optical character recognition proceeds by computing features on the input image and then classifying the image. Past omnifont optical character recognition techniques that use features have always binarized the image by comparing the brightness of an input pixel with a threshold level, labeling it as 'black' or 'white', and then computing the features for each character. However, for poorly printed text such binarization results in broken or merged characters and consequently incorrect features. We propose a method for computing geometrical features, such as strokes, directly from the gray scale image. To this aim we use a model of the image forming process, namely the convolution of the original binary image with the point spread function of the digitizer. We also estimate how printing distortions and noise affect the result so that we can deduce how different parts of a printed character should appear under those conditions. Detected features are then clustered for each set of samples of the training set. The clustering guides the selection of prototypes and the final classification is made by graph matching between prototypes and new (unknown) characters.
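The image-formation model, convolution of the ideal binary image with the digitizer's point spread function, can be sketched in a few lines. Here a Gaussian PSF is an assumption (the paper does not specify the kernel in this abstract), and the direct zero-padded convolution is for clarity only:

```python
import numpy as np

def render(binary_img, sigma=1.0):
    """Model the digitizer: convolve an ideal binary image with a Gaussian
    point spread function (the Gaussian shape and sigma are assumptions)."""
    size = int(3 * sigma) * 2 + 1                    # truncate at ~3 sigma
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    psf = np.outer(g, g)
    psf /= psf.sum()                                 # conserve total intensity
    pad = size // 2
    padded = np.pad(binary_img.astype(float), pad)   # zero padding
    out = np.zeros(binary_img.shape, dtype=float)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (padded[i:i + size, j:j + size] * psf).sum()
    return out
```

Under this model a thin ideal stroke appears in the gray-scale image as a smooth ridge rather than a hard black run, which is why ridge-based feature detection can sidestep binarization.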