In this paper, we study families of images generated by varying a
parameter that controls the appearance of the object/scene in each image. Each image is viewed as a point in high-dimensional space; the family of images forms a low-dimensional submanifold that we call an image appearance manifold (IAM). We conduct a detailed study of some representative IAMs generated by translations/rotations of simple objects in the plane and by rotations of objects in 3-D space. Our central, somewhat surprising, finding is that IAMs generated by images with sharp edges are nowhere differentiable. Moreover, IAMs have an inherent multiscale structure in that approximate tangent planes fitted to ε-neighborhoods continually twist off into new dimensions as the scale parameter ε varies. We explore and explain this phenomenon. An additional, more exotic kind of local non-differentiability occurs at some exceptional parameter points where occlusions cause image edges to disappear. These non-differentiabilities help explain some key phenomena in image processing. They imply that Newton's method will not work in general for image registration, but that a multiscale Newton's method will work. Such a multiscale Newton's method is similar to existing coarse-to-fine differential estimation algorithms for image registration; the manifold perspective offers a well-founded theoretical motivation for the multiscale approach and allows quantitative study of convergence and approximation. The manifold viewpoint is also generalizable to other image understanding problems.
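To make the multiscale Newton idea concrete, the sketch below (Python, with hypothetical names such as register_multiscale) estimates a 1-D translation coarse to fine: at each scale both signals are Gaussian-smoothed, which restores the differentiability that sharp edges destroy, and a few Gauss-Newton steps refine the shift before seeding the next finer scale. This is a simplified illustration of the coarse-to-fine strategy, not the paper's registration algorithm.

import numpy as np
from scipy.ndimage import gaussian_filter1d

def register_multiscale(f, g, scales=(8, 4, 2, 1), n_newton=5):
    # Estimate the shift t such that g(x) ~ f(x - t), coarse to fine.
    # Smoothing at each scale makes the squared-error objective differentiable,
    # so Newton steps are meaningful; the coarse estimate seeds the finer scale.
    # Illustrative 1-D sketch only, not the paper's algorithm.
    x = np.arange(len(f))
    t = 0.0
    for s in scales:
        fs = gaussian_filter1d(np.asarray(f, float), s)
        gs = gaussian_filter1d(np.asarray(g, float), s)
        for _ in range(n_newton):
            fw = np.interp(x - t, x, fs)       # f shifted by current estimate
            dfw = np.gradient(fw)              # f'(x - t); note d/dt f(x - t) = -f'(x - t)
            r = fw - gs                        # residual
            grad = -np.sum(r * dfw)            # d/dt of 0.5 * ||r||^2
            hess = np.sum(dfw * dfw) + 1e-12   # Gauss-Newton Hessian approximation
            t -= grad / hess
    return t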
We develop a quaternion wavelet transform (QWT) as a new multiscale analysis tool for geometric image features. The QWT is a near shift-invariant, tight frame representation whose coefficients sport a magnitude and three phase values, two of which are directly proportional to local image shifts. The QWT can be efficiently computed using a dual-tree filter bank and is based on a 2-D Hilbert transform. We demonstrate how the QWT's magnitude and phase can be
used to accurately analyze local geometric structure in images. We also develop a multiscale flow/motion estimation algorithm that computes a disparity flow map between two images with respect to local object motion.
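As a simplified, hedged illustration of the shift/phase relationship the QWT exploits, the following 1-D Python sketch (the function name local_shift_from_phase is ours) estimates a local displacement from the phase difference of two analytic signals obtained with a Hilbert transform; the actual algorithm operates on 2-D quaternion phases computed by a dual-tree filter bank.

import numpy as np
from scipy.signal import hilbert

def local_shift_from_phase(f, g, eps=1e-8):
    # Estimate the displacement d such that g(x) ~ f(x - d), per sample, from
    # analytic-signal phases: phase_g(x) - phase_f(x) ~ -omega_local * d.
    # This is a 1-D caricature of phase-based disparity estimation, not the QWT itself.
    af, ag = hilbert(f), hilbert(g)
    dphi = np.angle(ag * np.conj(af))                 # wrapped phase difference
    inst_freq = np.gradient(np.unwrap(np.angle(af)))  # local frequency (rad/sample)
    inst_freq = np.where(np.abs(inst_freq) > eps, inst_freq, eps)
    return -dphi / inst_freq                          # shift estimate in samples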
In this paper, we link concepts from nonuniform sampling, smoothness function spaces, interpolation, and wavelet denoising to derive a new multiscale interpolation algorithm for piecewise smooth signals. We formulate the problem as an optimization that balances agreement with the given samples against a wavelet-domain regularization term. For signals in the Besov space B^α_p(L_p), p ≥ 1, the optimization corresponds to convex programming in the wavelet domain. The algorithm simultaneously achieves signal interpolation and wavelet denoising, which makes it particularly suitable for noisy sample data, unlike classical approaches such as bandlimited and spline interpolation.
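A minimal sketch of this kind of wavelet-regularized interpolation, assuming an orthonormal Haar transform and an l1 (Besov-like) penalty solved by iterative soft thresholding; the function names (interpolate_l1, wavedec, waverec) and parameter choices are ours and stand in for the paper's convex program.

import numpy as np

def haar(x):
    # One-level orthonormal Haar transform (length must be even).
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def ihaar(a, d):
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def wavedec(x, levels):
    # Signal length must be divisible by 2**levels.
    coeffs, a = [], x
    for _ in range(levels):
        a, d = haar(a)
        coeffs.append(d)
    coeffs.append(a)
    return coeffs

def waverec(coeffs):
    a = coeffs[-1]
    for d in reversed(coeffs[:-1]):
        a = ihaar(a, d)
    return a

def interpolate_l1(samples, mask, levels=4, lam=0.05, n_iter=200):
    # Find x minimizing 0.5*||mask*(x - samples)||^2 + lam*||Wx||_1 by ISTA.
    # mask is 1 where a (possibly noisy) sample is available, 0 elsewhere.
    # With an orthonormal W the fidelity gradient has Lipschitz constant 1,
    # so a unit step is safe. Illustrative sketch, not the paper's solver.
    x = samples * mask
    for _ in range(n_iter):
        z = x - mask * (x - samples)
        coeffs = wavedec(z, levels)
        coeffs[:-1] = [np.sign(d) * np.maximum(np.abs(d) - lam, 0) for d in coeffs[:-1]]
        x = waverec(coeffs)
    return x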
Natural images can be viewed as combinations of smooth regions,
textures, and geometry. Wavelet-based image coders, such as the
space-frequency quantization (SFQ) algorithm, provide reasonably
efficient representations for smooth regions (using zerotrees, for
example) and textures (using scalar quantization) but do not properly
exploit the geometric regularity imposed on wavelet coefficients by
features such as edges. In this paper, we develop a representation for
wavelet coefficients in geometric regions based on the wedgelet
dictionary, a collection of geometric atoms that construct
piecewise-linear approximations to contours. Our wedgeprint
representation implicitly models the coherency among geometric
wavelet coefficients. We demonstrate that a simple compression
algorithm combining wedgeprints with zerotrees and scalar quantization
can achieve near-optimal rate-distortion performance D(R) ~ (log R)^2 / R^2 for the class of piecewise-smooth images containing smooth C^2 regions separated by smooth C^2 discontinuities. Finally, we extend this simple algorithm and propose a complete compression framework for natural images using a rate-distortion criterion to balance the three representations. Our Wedgelet-SFQ (WSFQ) coder outperforms SFQ in terms of visual quality and mean-square error.
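For intuition about the wedgelet atoms underlying the wedgeprint representation, the sketch below performs a brute-force wedgelet fit to a square image block: it searches over pairs of boundary points, splits the block by the connecting line, and approximates each side by its mean. The function name fit_wedgelet and the exhaustive search are illustrative assumptions; practical wedgelet coders use much faster searches.

import numpy as np
from itertools import combinations

def fit_wedgelet(block):
    # Brute-force wedgelet fit on an n x n block: returns (error, endpoints, means).
    # Illustrative sketch, not the fast search used in practice.
    n = block.shape[0]
    ys, xs = np.mgrid[0:n, 0:n]
    # Boundary pixel coordinates, ordered around the square (no duplicates).
    b = ([(0, j) for j in range(n)] + [(i, n - 1) for i in range(1, n)]
         + [(n - 1, j) for j in range(n - 2, -1, -1)] + [(i, 0) for i in range(n - 2, 0, -1)])
    m = block.mean()
    best = (np.sum((block - m) ** 2), None, (m, m))   # fall back to a single constant
    for (y0, x0), (y1, x1) in combinations(b, 2):
        side = (x1 - x0) * (ys - y0) - (y1 - y0) * (xs - x0) >= 0   # which side of the line
        if side.all() or (~side).all():
            continue
        m0, m1 = block[side].mean(), block[~side].mean()
        err = np.sum((block[side] - m0) ** 2) + np.sum((block[~side] - m1) ** 2)
        if err < best[0]:
            best = (err, ((y0, x0), (y1, x1)), (m0, m1))
    return best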
In the last few years, it has become apparent that traditional wavelet-based image processing algorithms and models have significant shortcomings in their treatment of edge contours. The standard modeling paradigm exploits the fact that wavelet coefficients
representing smooth regions in images tend to have small magnitude, and that the multiscale nature of the wavelet transform implies that these small coefficients will persist across scale (the canonical
example is the venerable zero-tree coder). The edge contours in the image, however, give rise to an increasing number of large-magnitude wavelet coefficients as we move down through scale to finer resolutions. But if the contours are smooth, they become simple as we zoom in on them, and are well approximated by straight lines at fine scales. Standard wavelet models exploit the grayscale regularity of the smooth regions of the image, but not the geometric regularity of the contours.
In this paper, we build a model that accounts for this geometric regularity by capturing the dependencies between complex wavelet coefficients along a contour. The Geometric Hidden Markov Tree (GHMT) assigns each wavelet coefficient (or spatial cluster of wavelet
coefficients) a hidden state corresponding to a linear approximation of the local contour structure. The shift- and rotation-invariance properties of the complex wavelet transform allow the GHMT to model
the behavior of each coefficient given the presence of a linear edge at a specified orientation --- the behavior of the wavelet coefficient given the state. By connecting the states together in a quadtree, the GHMT ties together wavelet coefficients along a contour, and also models how the contour itself behaves across scale.
We demonstrate the effectiveness of the model by applying it to feature extraction.
In this paper, we present an unsupervised scheme aimed at segmentation of laser radar (LADAR) imagery for Automatic Target Detection. A coding theoretic approach implements Rissanen's concept of Minimum Description Length (MDL) for estimating piecewise homogeneous regions. MDL is used to penalize overly complex segmentations. The intensity data is modeled as a Gaussian random field whose mean and variance functions are piecewise constant across the image. This model is intended to capture variations in both mean value (intensity) and variance (texture). The segmentation algorithm is based on an adaptive rectangular recursive partitioning scheme. We implement a robust constant false alarm rate (CFAR) detector on the segmented intensity image for target detection and compare our results with the conventional cell averaging (CA) CFAR detector.
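For reference, a minimal cell-averaging CFAR detector in Python (the baseline type of detector compared against); the window sizes, the exponential-clutter threshold factor, and the function name ca_cfar are illustrative assumptions rather than the paper's exact settings.

import numpy as np
from scipy.ndimage import uniform_filter

def ca_cfar(image, guard=2, train=8, pfa=1e-3):
    # Cell-averaging CFAR: compare each pixel against the mean of a surrounding
    # training annulus (full window minus a guard window), scaled so that
    # exponentially distributed clutter power yields the requested false-alarm rate.
    img = np.asarray(image, dtype=float)
    win = 2 * (guard + train) + 1
    g = 2 * guard + 1
    total = uniform_filter(img, size=win) * win ** 2   # window sums (reflected borders)
    inner = uniform_filter(img, size=g) * g ** 2
    n_train = win ** 2 - g ** 2
    noise_mean = (total - inner) / n_train             # local clutter power estimate
    alpha = n_train * (pfa ** (-1.0 / n_train) - 1.0)  # CA-CFAR threshold factor
    return img > alpha * noise_mean                    # boolean detection map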
In this paper, a coding theoretic approach is presented for the unsupervised segmentation of SAR images. The approach implements Rissanen's concept of Minimum Description Length (MDL) for estimating piecewise homogeneous regions. Our image model is a Gaussian random field whose mean and variance functions are piecewise constant across the image. The model is intended to capture variations in both mean value (intensity) and variance (texture). We adopt a multiresolution/progressive encoding approach to this segmentation problem and use MDL to penalize overly complex segmentations. We develop two different approaches, both of which achieve fast unsupervised segmentation. One algorithm is based on an adaptive (greedy) rectangular recursive partitioning scheme. The second algorithm is based on an optimally-pruned wedgelet-decorated dyadic partition. We present simulation results on SAR data to illustrate the performance obtained with these segmentation techniques.
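A hedged sketch of the MDL-style region cost behind such recursive partitioning: each rectangle is coded as i.i.d. Gaussian with constant mean and variance, and a split is accepted only when it lowers the total description length. The function names and the fixed per-parameter bit cost are simplifying assumptions, not the exact codelengths used in the paper.

import numpy as np

def region_codelength(region, bits_per_param=8):
    # Description length (in nats) of a region coded as i.i.d. Gaussian with
    # constant mean and variance: negative log-likelihood at the MLE plus a
    # fixed cost for the two parameters. The variance floor avoids degenerate
    # zero-variance regions. Illustrative assumption, not the paper's coder.
    n = region.size
    var = max(float(region.var()), 1e-6)
    data_cost = 0.5 * n * (np.log(2 * np.pi * var) + 1.0)
    model_cost = 2 * bits_per_param * np.log(2)
    return data_cost + model_cost

def best_split(region, axis=0):
    # Greedy test used by a recursive partitioner: does some split along `axis`
    # reduce the total codelength (including the cost of the split position)?
    no_split = region_codelength(region)
    n = region.shape[axis]
    best_cost, best_k = no_split, None
    for k in range(1, n):
        a, b = np.split(region, [k], axis=axis)
        cost = region_codelength(a) + region_codelength(b) + np.log(n)
        if cost < best_cost:
            best_cost, best_k = cost, k
    return best_k is not None, best_k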
Multiscale processing, in particular using the wavelet transform, has emerged as an incredibly effective paradigm for signal processing and analysis. In this paper, we discuss a close relative of the Haar wavelet transform, the multiscale multiplicative decomposition. While the Haar transform captures the differences between signal approximations at different scales, the multiplicative decomposition captures their ratio. The multiplicative decomposition has many of the properties that have made wavelets so successful. Most notably, the multipliers are a sparse representation for smooth signals, they have a dependency structure similar to wavelet coefficients, and they can be calculated efficiently. The multiplicative decomposition is also a more natural signal representation than the wavelet transform for some problems. For example, it is extremely easy to incorporate positivity constraints into multiplier domain processing. In addition, there is a close relationship between the multiplicative decomposition and the Poisson process, a fact that has been exploited in the field of photon-limited imaging. In this paper, we show that the multiplicative decomposition is also closely tied to the Kullback-Leibler distance between two signals. This allows us to derive an n-term KL approximation scheme using the multiplicative decomposition.
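A minimal sketch of one common form of the multiplicative decomposition for positive signals: coarse coefficients are pairwise sums, and each multiplier records the fraction of a parent's mass carried by its left child, which makes positivity constraints and Poisson/KL computations natural in the multiplier domain. The function names are ours, and the left-child-fraction convention is one of several equivalent ways to define the multipliers.

import numpy as np

def multiplicative_decomposition(x, levels):
    # Multiscale multiplicative (Haar-like) decomposition of a positive signal.
    # Signal length must be divisible by 2**levels. At each level the parent is
    # the sum of its two children, and the multiplier is the left child's share.
    multipliers = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        parent = a[0::2] + a[1::2]
        with np.errstate(invalid="ignore", divide="ignore"):
            m = np.where(parent > 0, a[0::2] / parent, 0.5)
        multipliers.append(m)
        a = parent
    return a, multipliers

def multiplicative_reconstruction(a, multipliers):
    # Perfect reconstruction: redistribute each parent's mass to its children.
    for m in reversed(multipliers):
        child = np.empty(2 * len(a))
        child[0::2] = m * a
        child[1::2] = (1.0 - m) * a
        a = child
    return a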
Besov spaces classify signals and images through the Besov norm, which is based on a deterministic smoothness measurement. Recently, we revealed the relationship between the Besov norm and the likelihood of an independent generalized Gaussian wavelet probabilistic model. In this paper, we extend this result by providing an information-theoretic interpretation of the Besov norm as the Shannon codelength for signal compression under this probabilistic model. This perspective unites several seemingly disparate signal/image processing methods, including denoising by Besov norm regularization, complexity-regularized denoising, minimum description length processing, and maximum smoothness interpolation. By extending the wavelet probabilistic model, we broaden the notion of smoothness space to more closely characterize real-world data. The locally Gaussian model leads directly to a powerful wavelet-domain Wiener filtering algorithm for denoising.
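For concreteness, the standard Besov sequence-space norm computed from wavelet detail coefficients is sketched below (1-D, orthonormal wavelets); indexing and constants vary across references, so treat this as one illustrative convention rather than the paper's exact definition.

import numpy as np

def besov_seq_norm(detail_coeffs, alpha, p, q):
    # Besov sequence-space norm from 1-D detail coefficients, with
    # detail_coeffs[j] holding the coefficients at scale j (j = 0 coarsest).
    # For an orthonormal basis the B^alpha_q(L_p) norm is equivalent to
    # ( sum_j ( 2^{j (alpha + 1/2 - 1/p)} * ||d_j||_p )^q )^{1/q}.
    # Conventions differ across references; this is an illustrative choice.
    terms = []
    for j, d in enumerate(detail_coeffs):
        lp = np.sum(np.abs(d) ** p) ** (1.0 / p)
        terms.append((2.0 ** (j * (alpha + 0.5 - 1.0 / p)) * lp) ** q)
    return sum(terms) ** (1.0 / q)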
We develop a general framework to simultaneously exploit texture and shape characterization in multiscale image segmentation. By posing multiscale segmentation as a model selection problem, we invoke the powerful framework offered by minimum description length (MDL). This framework dictates that multiscale segmentation comprises multiscale texture characterization and multiscale shape coding. Analysis of current multiscale maximum a posteriori segmentation algorithms reveals that these algorithms implicitly use a shape coder aimed at estimating the optimal MDL solution, but find only an approximation to it.
Traditional filtering methods operate on the entire signal or image. In some applications, however, errors are concentrated in specific regions or features. A prime example is images generated using computed tomography. Practical implementations limit the amount of high-frequency content in the reconstructed image, and consequently, edges are blurred. We introduce a new post-reconstruction edge enhancement algorithm, based on the reassignment principle and wavelets, that localizes its sharpening exclusively to edge features. Our method enhances edges without disturbing the low-frequency textural details.
We study the segmentation of SAR imagery using wavelet-domain Hidden Markov Tree (HMT) models. The HMT model is a tree-structured probabilistic graph that captures the statistical properties of the wavelet transforms of images. This technique has been successfully applied to the segmentation of natural texture images, documents, etc. However, SAR image segmentation poses a difficult challenge owing to the high levels of speckle noise present at fine scales. We solve this problem using a 'truncated' wavelet HMT model specially adapted to SAR images. This variation is built using only the coarse scale wavelet coefficients. When applied to SAR images, this technique provides a reliable initial segmentation. We then refine the classification using a multiscale fusion technique, which combines the classification information across scales from the initial segmentation to correct for misclassifications. We provide a fast algorithm, and demonstrate its performance on MSTAR clutter data.
We introduce a new document image segmentation algorithm, HMTseg, based on wavelets and the hidden Markov tree (HMT) model. The HMT is a tree-structured probabilistic graph that captures the statistical properties of the coefficients of the wavelet transform. Since the HMT is particularly well suited to images containing singularities (edges and ridges), it provides a good classifier for distinguishing between different document textures. Utilizing the inherent tree structure of the wavelet HMT and its fast training and likelihood computation algorithms, we perform multiscale texture classification at a range of different scales. We then fuse these multiscale classifications using a Bayesian probabilistic graph to obtain reliable final segmentations. Since HMTseg works on the wavelet transform of the image, it can directly segment wavelet-compressed images, without the need for decompression into the space domain. We demonstrate HMTseg's performance with both synthetic and real imagery.
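The following sketch illustrates the flavor of coarse-to-fine fusion with a simplified stand-in: each coarser-scale posterior is upsampled and used as a tempered prior for the class likelihoods at the next finer scale. The function fuse_multiscale and the tempering parameter are our assumptions; HMTseg's actual fusion uses a Bayesian context-labeling model.

import numpy as np

def fuse_multiscale(likelihoods, prior_strength=0.5):
    # likelihoods is a list ordered coarse -> fine; likelihoods[j] has shape
    # (n_classes, H_j, W_j) with H_{j+1} = 2*H_j. Each coarser posterior acts
    # as a tempered prior for its four child blocks at the next finer scale.
    # Simplified stand-in for HMTseg's Bayesian context fusion.
    post = likelihoods[0] / likelihoods[0].sum(axis=0, keepdims=True)
    for lik in likelihoods[1:]:
        prior = np.repeat(np.repeat(post, 2, axis=1), 2, axis=2) ** prior_strength
        post = lik * prior
        post /= post.sum(axis=0, keepdims=True)
    return post.argmax(axis=0)   # final class label per finest-scale block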
We discover a new relationship between two seemingly different image modeling methodologies: the Besov space theory and the wavelet-domain statistical image models. Besov spaces characterize the set of real-world images through a deterministic characterization of the image smoothness, while statistical image models capture the probabilistic properties of images. By establishing a relationship between the Besov norm and the normalized likelihood function under an independent wavelet-domain generalized Gaussian model, we obtain a new interpretation of the Besov norm which provides a natural generalization of the theory for practical image processing. Based on this new interpretation of the Besov space, we propose a new image denoising algorithm based on projections onto the convex sets defined in the Besov space. After pointing out the limitations of the Besov space, we propose possible generalizations using more accurate image models.
We propose a hybrid approach to wavelet-based deconvolution that comprises Fourier-domain system inversion followed by wavelet-domain noise suppression. In contrast to conventional wavelet-based deconvolution approaches, the algorithm employs a regularized inverse filter, which allows it to operate even when the system is non-invertible. Using a mean-square-error (MSE) metric, we strike an optimal balance between Fourier-domain regularization (matched to the system) and wavelet-domain regularization (matched to the signal/image). Theoretical analysis reveals that the optimal balance is determined by the economics of the signal representation in the wavelet domain and the operator structure. The resulting algorithm is fast (O(N log_2^2 N) complexity for signals/images of N samples) and is well-suited to data with spatially-localized phenomena such as edges. In addition to enjoying asymptotically optimal rates of error decay for certain systems, the algorithm also achieves excellent performance at fixed data lengths. In simulations with real data, the algorithm outperforms the conventional time-invariant Wiener filter and other wavelet-based deconvolution algorithms in terms of both MSE performance and visual quality.
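A hedged two-stage sketch of this hybrid approach, assuming the PyWavelets package and circular convolution: a Tikhonov-regularized Fourier inverse filter followed by wavelet-domain soft thresholding. The regularization and threshold values here are user-chosen constants, whereas the paper selects the balance between the two domains with an MSE criterion.

import numpy as np
import pywt  # PyWavelets, assumed available

def hybrid_deconvolve(y, h, reg=1e-2, wavelet="db4", levels=3, thresh=0.05):
    # Stage 1: regularized Fourier inversion of the blur h (circular convolution model).
    # Stage 2: wavelet-domain soft thresholding of the leaked, colored noise.
    # Simplified sketch; `reg` and `thresh` are hand-picked, not MSE-optimized.
    H = np.fft.fft2(h, s=y.shape)
    G = np.conj(H) / (np.abs(H) ** 2 + reg)            # Tikhonov-regularized inverse filter
    x_reg = np.real(np.fft.ifft2(np.fft.fft2(y) * G))
    coeffs = pywt.wavedec2(x_reg, wavelet, level=levels)
    coeffs = [coeffs[0]] + [
        tuple(pywt.threshold(d, thresh, mode="soft") for d in band)
        for band in coeffs[1:]
    ]
    return pywt.waverec2(coeffs, wavelet)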
We introduce a new image texture segmentation algorithm, HMTseg, based on wavelet-domain hidden Markov tree (HMT) models. The HMT model is a tree-structured probabilistic graph that captures the statistical properties of wavelet coefficients. Since the HMT is particularly well suited to images containing singularities, it provides a good classifier for textures. Utilizing the inherent tree structure of the wavelet HMT and its fast training and likelihood computation algorithms, we perform multiscale texture classification at various scales. Since HMTseg works on the wavelet transform of the image, it can directly segment wavelet-compressed images, without the need for decompression. We demonstrate the performance of HMTseg with synthetic, aerial photo, and document image segmentations.
Wavelet-domain hidden Markov models have proven to be useful tools for statistical signal and image processing. The hidden Markov tree (HMT) model captures the key features of the joint density of the wavelet coefficients of real-world data. One potential drawback to the HMT framework is the need for computationally expensive iterative training. In this paper, we propose two reduced-parameter HMT models that capture the general structure of a broad class of real-world images. In the image HMT (iHMT) model we use the fact that for a large class of images the structure of the HMT is self-similar across scale. This allows us to reduce the complexity of the iHMT to just nine easily trained parameters. In the universal HMT (uHMT) we take a Bayesian approach and fix these nine parameters. The uHMT requires no training of any kind. While simple, we show using a series of image estimation/denoising experiments that these two new models retain nearly all of the key structure modeled by the full HMT. Finally, we propose a fast shift-invariant HMT estimation algorithm that outperforms all other wavelet-based estimators in the current literature, both in mean-square error and visual metrics.