We consider the coding properties of multilayer LNL (linear-nonlinear-linear) systems. Such systems consist
of interleaved layers of linear transforms (or filter banks), nonlinear mappings, linear transforms, and so forth.
They can be used as models of visual processing in higher cortical areas (V2, V4), and are also interesting
with respect to image processing and coding. The linear filter operations in the different layers are optimized
for the exploitation of the statistical redundancies of natural images. We explain why even simple nonlinear
operations, such as ON/OFF rectification, can convert higher-order statistical dependencies remaining between
the linear filter coefficients of the first layer to a lower order. The resulting nonlinear coefficients can then
be linearly recombined by the second-level filtering stage, using the same principles as in the first stage. The
complete nonlinear scheme is invertible, i.e., information is preserved, if nonlinearities like ON/OFF rectification
or gain control are employed. In order to obtain insights into the coding efficiency of these systems we investigate
the feature selectivity of the resulting nonlinear output units and the use of LNL systems in image compression.
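The two claims above (invertibility under ON/OFF rectification, and conversion of higher-order dependencies to a lower order) can be illustrated with a minimal sketch. It assumes random orthonormal matrices `W1` and `W2` as stand-ins for the optimized filter banks, and a toy pair of coefficients `c1`, `c2` that share a random scale `s`; none of these names or parameter values come from the paper.

```python
import numpy as np

def on_off_rectify(c):
    """Split signed coefficients into non-negative ON and OFF channels."""
    return np.maximum(c, 0.0), np.maximum(-c, 0.0)

def on_off_invert(on, off):
    """Recover the signed coefficients; the split loses no information."""
    return on - off

rng = np.random.default_rng(0)

# Invertibility of an L-N-L cascade. Random orthonormal matrices are
# hypothetical placeholders for the optimized filter banks.
W1, _ = np.linalg.qr(rng.standard_normal((8, 8)))    # first linear layer
W2, _ = np.linalg.qr(rng.standard_normal((16, 16)))  # second linear layer
x = rng.standard_normal(8)                           # toy input signal

on, off = on_off_rectify(W1 @ x)                     # nonlinear layer
y = W2 @ np.concatenate([on, off])                   # second-level recombination

z = W2.T @ y                                         # undo the second layer
x_rec = W1.T @ on_off_invert(z[:8], z[8:])           # undo rectification, then layer 1

# Dependency conversion. Two coefficients share a random scale s, so they
# are uncorrelated, yet their magnitudes are statistically dependent.
n = 100_000
s = rng.choice([0.2, 2.0], size=n)
c1 = s * rng.standard_normal(n)
c2 = s * rng.standard_normal(n)
corr_linear = np.corrcoef(c1, c2)[0, 1]
corr_on = np.corrcoef(on_off_rectify(c1)[0], on_off_rectify(c2)[0])[0, 1]

print("reconstruction exact:", np.allclose(x_rec, x))
print("correlation before / after rectification:", corr_linear, corr_on)
```

In this toy setting the correlation between the signed coefficients is close to zero, while the ON channels are clearly positively correlated, so a second linear stage has a second-order dependency to work with.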
We investigate the hypothesis that the basic representation of space which underlies human navigation does not
resemble an image-like map and is not restricted by the laws of Euclidean geometry. For this we developed a
new experimental technique in which we use the properties of a virtual environment (VE) to directly influence
the development of the representation. We compared the navigation performance of human observers under two
conditions. Either the VE is consistent with the geometrical properties of physical space and could hence be
represented in a map-like fashion, or it contains severe violations of Euclidean metric and planar topology, and
would thus pose difficulties for the correct development of such a representation. Performance is not influenced
by this difference, suggesting that a map-like representation is not the major basis of human navigation. Rather,
the results are consistent with a representation which is similar to a non-planar graph augmented with path
length information, or with a sensorimotor representation which combines sensory properties and motor actions.
The latter may be seen as part of a revised view of perceptual processes due to recent results in psychology and
neurobiology, which indicate that the traditional strict separation of sensory and motor systems is no longer
tenable.
KEYWORDS: Neurons, Complex systems, Linear filtering, Visual process modeling, Systems modeling, Nonlinear filtering, Visualization, Optical filters, Spatial filters, Data modeling
The standard model of visual processing is based on the selective properties of linear spatial filters which are tuned to different orientations and radial frequencies. This standard model is well suited for the description of a wide range of phenomena in vision, but it is not clear whether the whole range of basic properties of early vision is entirely within the model's explanatory scope. Here we suggest that there exists a basic selective processing property in early vision which is definitely outside the explanatory scope of the standard model: the selectivity for intrinsically 2D (i2D) signals. This property has already been observed in the classical experiments of Hubel and Wiesel, and has more recently been found in more complex form in the extra-classical receptive field properties of various visual neurons. We show here that this selectivity cannot be described within the framework of linear spatial filtering, for reasons which lie at the heart of the theory of linear systems: the restriction of such systems to OR-combinations of their intrinsically 1D eigenfunctions. We present a general nonlinear framework for the modeling of i2D-selective systems which is based on AND-like combinations of frequency components, and which is closely related to the Wiener-Volterra representation of nonlinear systems. To our knowledge, i2D-selectivity is the only non-standard property for which such a theoretical framework yet exists. The framework enables the combination of the nonlinear i2D-selectivity with other basic selectivities of visual neurons, for example with simple and complex-like properties, and thus makes it possible to construct models for the variety of neurophysiological observations on i2D-selective processing in visual neurons.
As an insight of general interest for the recent discussion on second-order properties in early vision, the framework reveals the existence of extended equivalence classes in which nonlinear schemes can have very dissimilar structural properties and nevertheless lead to identical input-output relations. Finally, there is a close relation between i2D-selectivity and the higher-order statistical redundancies in natural images.
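As a hedged illustration of what an AND-like combination can achieve, the sketch below uses the determinant of the Hessian, `f_xx * f_yy - f_xy**2`, a textbook second-order operator built from products of differently oriented derivative outputs. It is chosen here only as a simple example and is not claimed to be the framework of the abstract. The test images, grid size, and function name are assumptions.

```python
import numpy as np

def hessian_determinant(f):
    """AND-like i2D measure f_xx * f_yy - f_xy**2 from central differences."""
    fy, fx = np.gradient(f)      # derivatives along rows (y) and columns (x)
    fxy, fxx = np.gradient(fx)   # second derivatives of fx along y and x
    fyy, _ = np.gradient(fy)     # second derivative of fy along y
    det = fxx * fyy - fxy ** 2
    return det[2:-2, 2:-2]       # drop borders, where one-sided differences are used

y, x = np.mgrid[0:64, 0:64].astype(float)

# Intrinsically 1D signal: an oblique sinusoidal grating (varies along one direction only).
grating = np.cos(0.3 * (0.6 * x + 0.8 * y))
# Intrinsically 2D signal: an isotropic Gaussian blob.
blob = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (2 * 4.0 ** 2))

i1d_response = np.abs(hessian_determinant(grating)).max()
i2d_response = np.abs(hessian_determinant(blob)).max()
print("grating:", i1d_response, "blob:", i2d_response)
```

The product terms cancel for the grating regardless of its orientation, whereas any single linear filter that responds to the blob also responds to some grating; this is the OR versus AND distinction in miniature.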
The perception of an image by a human observer is usually modeled as a parallel process in which all parts of the image are treated more or less equivalently, but in reality the analysis of scenes is a highly selective procedure, in which only a small subset of image locations is processed by the precise and efficient neural machinery of foveal vision. To understand the principles behind this selection of the 'informative' regions of images we have developed a hybrid system, which combines a knowledge-based reasoning system with low-level preprocessing by linear and nonlinear neural operators. This hybrid system is intended as a first step towards a complete model of the sensorimotor system of saccadic scene analysis. In the analysis of a scene, the system calculates in each step which eye movement has to be made to reach a maximum of information about the scene. The possible information gain is calculated by means of a parallel strategy which is suitable for adaptive reasoning. The output of the system is a fixation sequence, and finally, a hypothesis about the scene.
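The selection step can be sketched as choosing the fixation with the largest expected information gain. The abstract does not specify how the gain is computed, so the following assumes a standard Bayesian formulation (expected reduction in Shannon entropy over scene hypotheses); the fixation names and likelihood tables are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def expected_information_gain(prior, likelihood):
    """Expected entropy reduction for one candidate fixation.

    likelihood[h, o] = P(observation o | scene hypothesis h) at that fixation.
    """
    joint = prior[:, None] * likelihood   # P(h, o)
    p_obs = joint.sum(axis=0)             # P(o)
    gain = entropy(prior)
    for o, po in enumerate(p_obs):
        if po > 0:
            gain -= po * entropy(joint[:, o] / po)   # subtract expected posterior entropy
    return gain

prior = np.array([0.5, 0.5])              # two competing scene hypotheses
fixations = {                             # hypothetical candidate eye movements
    "uninformative": np.array([[0.5, 0.5], [0.5, 0.5]]),
    "diagnostic": np.array([[0.9, 0.1], [0.2, 0.8]]),
}
gains = {name: expected_information_gain(prior, lik) for name, lik in fixations.items()}
best = max(gains, key=gains.get)
print(gains, "next fixation:", best)
```

Repeating this step with the posterior as the new prior yields a fixation sequence and, at the end, a most probable scene hypothesis.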
KEYWORDS: Linear filtering, Neurons, Statistical analysis, Wavelets, Image filtering, Computer programming, Communication engineering, Visualization, Visual process modeling, Systems modeling
The classical approach in vision research - the derivation of basically linear filter models from experiments with simple artificial test stimuli - is currently undergoing a major revision. Instead of trying to keep the dirty environment out of our clean labs, we now put it right into the focus of scientific exploration. The new approach has a close relation to basic engineering strategies for electronic image processing since its major concept is the exploration of the statistical redundancies of the environment by appropriate neural transformations. The standard engineering methods are not sufficient, however. Even a basic biological feature like orientation selectivity requires the consideration of higher-order statistics, like cumulants or polyspectra. Furthermore, there exists an abundance of nonlinear phenomena in biological vision, for example the phase-invariance of complex cells, cortical gain control, or end-stopping, which make it necessary to consider unconventional modeling approaches like differential geometry or Volterra-Wiener systems. By use of such methods we can not only gain a deeper understanding of the adaptation of the visual system to the complex natural environment, but we can also make the biological system an inspiring source for the design of novel strategies in electronic image processing.
In this paper we analyze the properties of a repeated isotropic center-surround inhibition which includes single nonlinearities like half-wave rectification and saturation. Our simulation results show that such operations, here implemented as iterated nonlinear differences and ratios of Gaussians (INDOG and INROG), lead to endstopping. The benefits of the approach are twofold. Firstly, the INDOG can be used to design simple endstopped operators, e.g., corner detectors. Secondly, the results can explain how endstopping might arise in a neural network with purely isotropic characteristics. The iteration can be implemented as a cascade by feeding the output of one NDOG into a next NDOG stage. Alternatively, the INDOG mechanism can be activated in a feedback loop. In the latter case, the resulting spatio-temporal response properties are not separable and the response becomes spatially endstopped if the input is transient. Finally, we show that ON- and OFF-type INDOG outputs can be integrated spatially to result in quasi-topological image features like open versus closed and the number of components.
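The cascade variant can be sketched as follows. This is a minimal illustration only: the Gaussian widths, the number of iterations, the zero-padded borders, and the use of half-wave rectification alone (no saturation) are all assumptions, and whether and how strongly endstopping emerges depends on these choices.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with zero padding at the borders."""
    radius = int(3 * sigma) + 1
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    blur_1d = lambda v: np.convolve(v, kernel, mode="same")
    return np.apply_along_axis(blur_1d, 1, np.apply_along_axis(blur_1d, 0, img))

def ndog(img, sigma_c=1.0, sigma_s=2.0):
    """One ON-type stage: isotropic center minus surround, half-wave rectified."""
    return np.maximum(gaussian_blur(img, sigma_c) - gaussian_blur(img, sigma_s), 0.0)

def indog(img, n_iter=3, sigma_c=1.0, sigma_s=2.0):
    """Cascade: feed the output of each NDOG stage into the next one."""
    out = img
    for _ in range(n_iter):
        out = ndog(out, sigma_c, sigma_s)
    return out

# Toy stimulus: a horizontal line segment with two line ends.
img = np.zeros((64, 64))
img[32, 16:48] = 1.0
response = indog(img, n_iter=3)

# Compare the response at a line end with the response at the line center.
print("end:", response[32, 16], "middle:", response[32, 32])
```

An OFF-type stage would rectify the negated difference instead; the feedback variant mentioned above would replace the fixed loop by a temporal update and is not shown here.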
KEYWORDS: Visualization, Data processing, Visual system, Visual process modeling, Computer programming, Neurons, Signal processing, Associative arrays, Data modeling, Sensors
The processing and representation of motion information is addressed from an integrated perspective comprising low-level signal processing properties as well as higher-level cognitive aspects. For the low-level processing of motion information we argue that a fundamental requirement is the existence of a spatio-temporal memory. Its key feature, the provision of an orthogonal relation between external time and its internal representation, is achieved by a mapping of temporal structure into a locally distributed activity distribution accessible in parallel by higher-level processing stages. This leads to a reinterpretation of the classical concept of 'iconic memory' and resolves inconsistencies on ultra-short-time processing and visual masking. The spatio-temporal memory is further investigated by experiments on the perception of spatio-temporal patterns. Results on the direction discrimination of motion paths provide evidence that information about direction and location are not processed and represented independently of each other. This suggests a unified representation on an early level, in the sense that motion information is internally available in the form of a spatio-temporal compound. For the higher-level representation we have developed a formal framework for the qualitative description of courses of motion that may occur with moving objects.
KEYWORDS: Fractal analysis, Machine vision, Computer vision technology, Visualization, Neurons, Visual process modeling, Image filtering, Visual system, Signal processing, Human vision and color perception
Basic properties of 2-D-nonlinear scale-space representations of images are considered. First, local-energy filters are used to estimate the Hausdorff dimension, DH, of images. A new fractal dimension, DN, defined as a property of 2-D-curvature representations on multiple scales, is introduced as a natural extension of traditional fractal dimensions, and it is shown that the two types of fractal dimensions can give a less ambiguous description of fractal image structure. Since fractal analysis is just one (limited) aspect of scale-space analysis, some more general properties of curvature representations on multiple scales are considered. Simulations are used to analyze the stability of curvature maxima across scale and to illustrate that spurious resolution can be avoided by extracting 2-D-curvature features.
A coding scheme for image sequences is designed in analogy to human visual information processing. We propose a feature-specific vector quantization method applied to a multi-channel representation of image sequences. The vector quantization combines the corresponding local/momentary amplitude coefficients of a set of three-dimensional analytic band-pass filters that are selective for spatiotemporal frequency, orientation, direction and velocity. Motion compensation and decorrelation between successive frames are achieved implicitly by application of a non-rectangular subsampling to the 3D-bandpass outputs. The nonlinear combination of the outputs of filters which are selective for constantly moving one-dimensional (i.e. spatially elongated) image structures allows a classification of the local/momentary signal features with respect to their intrinsic dimensionality. Based on statistical investigations a natural hierarchy of signal features is provided. This is then used to construct an efficient encoding procedure. Thereby, the different sensitivity of human vision to the various signal features can be easily incorporated. For a first example, all multi-dimensional vectors are mapped to constantly moving 1D-structures.
This paper considers how basic geometrical properties like curvature, rigidity, and possible embeddings can be related to efficient image encoding and the statistical concept of redundancy. In particular, the redundancy of planar and parabolic patches of images as surfaces is revealed by reconstructing the original image from curvature measures that are zero for non-elliptic regions. This approach also gives a new perspective on encoding principles in biological vision.
Intrinsic signal dimensionality, a property closely related to Gaussian curvature, is shown to be an important conceptual tool in multi-dimensional image processing for both biological and engineering sciences. Intrinsic dimensionality can reveal the relationship between recent theoretical developments in the definition of optic flow and the basic neurophysiological concept of 'end-stopping' of visual cortical cells. It is further shown how the concept may help to avoid certain problems typically arising from the common belief that an explicit computation of a flow field has to be the essential first step in the processing of spatio-temporal image sequences. Signals which cause difficulties in the computation of optic flow, mainly the discontinuities of the motion vector field, are shown to be detectable directly in the spatio-temporal input by evaluation of its three-dimensional curvature. The relevance of the suggested concept is supported by the fact that fast and efficient detection of such signals is of vital importance for ambulant observers in both the biological and the technical domain.
KEYWORDS: Quantization, Signal analyzers, Image filtering, Electronic filtering, Human vision and color perception, Image compression, Visual process modeling, Bandpass filters, Linear filtering, Statistical analysis
Image decomposition via even- and odd-symmetric, size and orientation selective band-pass filters, as
suggested by the receptive field properties of visual cortical neurons, is well suited to image coding purposes.
Interpretation of the even/odd filter outputs as a complex ("analytic") signal offers the alternative of a polar
signal representation by a local amplitude and a local phase component. This is also indicated by our
measurement of a rotation symmetric shape of the two-dimensional probability density function (pdf) of the
even/odd filter outputs.
Our investigations into the properties of such a representation show that it provides an interesting separation of
the "amount of signal variation" (local amplitude) vs. the "type of signal variation" (local phase). Furthermore,
an efficient vector quantization procedure can be applied to the two-dimensional amplitude/phase vector. This
procedure divides the 2D signal space of the analytic signal into polar separable patches. Since phase
quantization errors are more tolerable at small amplitude levels, local phase is quantized dependent on the
amplitude level. While typical pdf-optimized quantizers produce an increasingly higher amplitude resolution
towards very small amplitudes, human vision allows the application of an appropriate threshold which leads to an
"irrelevance zone" wherein obviously no phase information has to be coded. Using this coding scheme good
image quality can be obtained with about 0.8 bit/pixel.
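The polar quantization idea can be sketched as follows. The ring width `amp_step`, the rule for the number of phase levels per ring, and the `threshold` of the irrelevance zone are hypothetical values chosen for illustration; the abstract does not give the actual partition.

```python
import numpy as np

def quantize_polar(even, odd, threshold=0.1, amp_step=0.25, base_phase_levels=4):
    """Quantize (even, odd) filter outputs in polar-separable patches.

    Amplitudes below `threshold` fall into an irrelevance zone and are coded
    as zero (no phase needed). The number of phase levels grows with the
    amplitude ring, so phase is coded more coarsely at small amplitudes.
    """
    z = np.asarray(even, dtype=float) + 1j * np.asarray(odd, dtype=float)
    amp, phase = np.abs(z), np.angle(z)

    ring = np.floor(amp / amp_step).astype(int)       # index of the amplitude ring
    amp_q = (ring + 0.5) * amp_step                   # reconstruct at the ring center

    n_phase = base_phase_levels * (ring + 1)          # phase levels for this ring
    phase_step = 2.0 * np.pi / n_phase
    phase_q = np.round(phase / phase_step) * phase_step

    z_q = np.where(amp < threshold, 0.0, amp_q * np.exp(1j * phase_q))
    return z_q.real, z_q.imag

even = np.array([0.05, 0.3, 1.0])
odd = np.array([0.0, 0.3, -1.0])
even_q, odd_q = quantize_polar(even, odd)
print(even_q, odd_q)
```

A uniform ring spacing is used here for brevity; a pdf-optimized or perceptually matched spacing would change only the `ring` and `amp_q` lines.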
KEYWORDS: Sensors, Signal detection, Filtering (signal processing), Visual process modeling, Human vision and color perception, Electronic filtering, Optical filters, Electronic imaging, Nonlinear filtering, Signal processing
Empirical evidence from both psychology and physiology stresses the importance of inherently
two-dimensional signals and corresponding operations in vision. Examples of this are the existence of
"bug-detectors", hypercomplex and dot-responsive cells, the occurrence of contour illusions, and interactions of
patterns with clearly separated orientations. These phenomena cannot be described, and have been largely
ignored, by common theories of size and orientation selective channels. The reason for this is shown to be
located at the heart of the theory of linear systems: their one-dimensional eigenfunctions and the "or"-like
character of the superposition principle. Consequently, a nonlinear theory is needed. We present a first
approach towards a general framework for the description of 2D-signals and 2D-cells in biological vision.
We present an image coding scheme based on the properties of the early stages of the human visual system. The image signal is decomposed via even- and odd-symmetric, frequency and orientation selective band-pass filters in analogy to the quadrature phase simple cell pairs in the visual cortex. The resulting analytic signal is transformed into a local amplitude and local phase representation in order to achieve a better match to its signal statistics. Both intra-filter dependencies of the analytic signal and inter-filter dependencies between different orientation filters are exploited by a suitable vector quantization scheme.
Inter-orientation filter dependencies are demonstrated by means of a statistical evaluation of the multidimensional probability density function. The results can be seen as an empirical confirmation of the suitability of vector quantization in subband coding. Instead of generating a code book by use of a conventional design algorithm, we suggest a feature-specific partitioning of the multidimensional signal space matched to the properties of human vision. Using this coding scheme, satisfactory image quality can be obtained with about 0.78 bit/pixel.