Visual prostheses require an effective representation method due to the limited display condition which has only 2 or 3 levels of grayscale in low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features to convey essential information. However, in scenes with a complex cluttered background, the recognition rate of the binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape from shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape from shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound the recognition.
The human visual system is an exquisitely engineered system that can serve as a model and inspiration for the design of many imaging systems. Optics and optical engineering play a key role in developing new techniques and approaches for both the study of human vision and the design of novel imaging systems. For example, advances in optical sensing and imaging have led to important discoveries about retinal image processing, and optical design tools are necessary for improving vision in patients. While advances in optics are improving our understanding of the human visual system, this understanding has also led to improvements in artificial vision systems, image processing algorithms, visual displays, and even modern optical elements and systems.
Prism distortions and spurious reflections are not usually considered when prescribing prisms to compensate for visual field loss due to homonymous hemianopia. Distortions and reflections in the high-power Fresnel prisms used in peripheral prism placement can be considerable, and the simplifying assumption that prism deflection power is independent of angle of incidence into the prisms results in substantial errors. We analyze the effects of high prism power and incidence angle on the field expansion, size of the apical scotomas, and image compression/expansion. We analyze and illustrate the effects of reflections within the Fresnel prisms, primarily due to reflections at the bases, and secondarily due to surface reflections. The strength and location of these effects differ materially depending on whether the serrated prismatic surface is placed toward or away from the eye, and this affects the contribution of the reflections to visual confusion, diplopia, false alarms, and loss of contrast. We conclude with suggestions for controlling and mitigating these effects in clinical practice.
In this retrospective we trace in broad strokes the development of image quality measures based on the study of the early
stages of the human visual system (HVS), where contrast encoding is fundamental. We find that while presenters at the
Human Vision and Electronic Imaging meetings have frequently strived to find points of contact between the study of
human contrast psychophysics and the development of computer vision and image quality algorithms, progress has not
always been made on these terms, although indirect impact of vision science on more recent image quality metrics can be
For the visual system, luminance contrast is a fundamental property of images, and is one of the main inputs of any
simulation of visual processing. Many models intended to evaluate visual properties such as image discriminability
compute perceived contrast by using contrast sensitivity functions derived from studies of human spatial vision. Such use
is of questionable validity even for such applications (i.e. full-reference image quality metrics), but it is usually
inappropriate for no-reference image quality measures. In this paper, we outline why the contrast sensitivity functions
commonly used are not appropriate in such applications, and why weighting suprathreshold contrasts by any sensitivity
function can be misleading. We propose that rather than weighting image contrasts (or contrast differences) by some
assumed sensitivity function, it would be more useful for most purposes requiring estimates of perceived contrast or
quality to develop an estimate of efficiency: how much of an image is making it past the relevant thresholds.
We have developed a mobile vision assistive device based on a head mounted display (HMD) with a video camera,
which provides image magnification and contrast enhancement for patients with central field loss (CFL). Because the
exposure level of the video camera is usually adjusted according to the overall luminance of the scene, the contrast of
sub-images (to be magnified) may be low. We found that at high magnification levels, conventional histogram
enhancement methods frequently result in over- or under-enhancement due to irregular histogram distribution of sub-images.
Furthermore, the histogram range of the sub-images may change dramatically when the camera moves, which
may cause flickering. A piece-wise histogram stretching method based on a center emphasized histogram is proposed
and evaluated by observers. The center emphasized histogram minimizes the histogram fluctuation due to image changes
near the image boundary when the camera moves slightly, which therefore reduces flickering after enhancement. A
piece-wise histogram stretching function is implemented by including a gain turnaround point to deal with very low
contrast images and reduce the possibility of over-enhancement. Six normally sighted subjects and a CFL patient were
tested for their preference of images enhanced by the conventional and proposed methods as well as the original images.
All subjects preferred the proposed enhancement method over the conventional method.
It has been observed that electronic magnification of imagery results in a decrease in the apparent contrast of the
magnified image relative to the original. The decrease in perceived contrast might be due to a combination of image blur
and of sub-sampling the larger range of contrasts in the original image. In a series of experiments, we measured the
effect on apparent contrast of magnification in two contexts: either the entire image was enlarged to fill a larger display
area, or a portion of an image was enlarged to fill the same display area, both as a function of magnification power and
of viewing distance (visibility of blur induced by magnification). We found a significant difference in the apparent
contrast of magnified versus unmagnified video sequences. The effect on apparent contrast was found to increase with
increasing magnification, and to decrease with increasing viewing distance (or with decreasing angular size). Across
observers and conditions the reduction in perceived contrast was reliably in the range of 0.05 to 0.2 log units (89% to
63% of nominal contrast). These effects are generally consistent with expectations based on both the contrast statistics
of natural images and the contrast sensitivity of the human visual system. It can be demonstrated that 1) local areas
within larger images or videos will usually have lower physical contrast than the whole; and 2) visibility of 'missing
content' (e.g. blur) in an image is interpreted as a decrease in contrast, and this visibility declines with viewing distance.
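The conversion between log units and percent of nominal contrast quoted above can be checked with a short calculation. This is only an arithmetic sketch; the function name is illustrative and not taken from the study.

```python
def log_units_to_fraction(reduction_log_units: float) -> float:
    """Fraction of nominal contrast that remains after a reduction given in log10 units."""
    return 10.0 ** (-reduction_log_units)

for reduction in (0.05, 0.2):
    remaining = log_units_to_fraction(reduction)
    print(f"{reduction} log units -> {remaining:.0%} of nominal contrast")
```

A reduction of 0.05 log units leaves about 89% of nominal contrast, and 0.2 log units leaves about 63%, matching the range reported.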
Measuring preferences for moving video quality is harder than for static images due to
the fleeting and variable nature of moving video. Subjective preferences for image
quality can be tested by observers indicating their preference for one image over another.
Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999).
Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting
and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for
the items that are compared (e.g. enhancement levels). However, Thurstone scaling does
not determine the statistical significance of the differences between items on that
perceptual scale. Recent papers have provided inferential statistical methods that produce
an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we
demonstrate that binary logistic regression can analyze preferences for enhanced video.
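As a rough illustration of how pairwise preferences can be cast as a binary logistic regression, the sketch below fits one scale value per enhancement level so that the probability of preferring level a over level b is a logistic function of the difference in their scale values. The formulation, the function names, and the preference counts are assumptions made for illustration; the abstract does not specify the regression design that was used.

```python
import math

def expand(a, b, a_wins, n_trials):
    """Turn a win count into one (a, b, a_preferred) record per trial."""
    return [(a, b, 1)] * a_wins + [(a, b, 0)] * (n_trials - a_wins)

def fit_pairwise_logistic(comparisons, n_items, steps=2000, learning_rate=1.0):
    """Fit scale values s so that P(a preferred over b) = 1 / (1 + exp(-(s[a] - s[b]))).

    Item 0 is the reference level and stays at 0; the other values are
    estimated by gradient ascent on the mean log-likelihood.
    """
    s = [0.0] * n_items
    for _ in range(steps):
        grad = [0.0] * n_items
        for a, b, a_preferred in comparisons:
            p = 1.0 / (1.0 + math.exp(-(s[a] - s[b])))
            grad[a] += a_preferred - p
            grad[b] -= a_preferred - p
        for i in range(1, n_items):  # item 0 is held fixed as the reference
            s[i] += learning_rate * grad[i] / len(comparisons)
    return s

# Hypothetical counts: 0 = original, 1 = moderate, 2 = strong enhancement.
comparisons = (
    expand(1, 0, 16, 20)     # moderate preferred over original in 16 of 20 trials
    + expand(2, 0, 12, 20)   # strong preferred over original in 12 of 20 trials
    + expand(1, 2, 14, 20)   # moderate preferred over strong in 14 of 20 trials
)
scale = fit_pairwise_logistic(comparisons, n_items=3)
print([round(v, 2) for v in scale])
```

The fitted values play the role of a perceptual scale. Statistical packages that fit logistic regression also report a standard error for each coefficient, which is what makes it possible to test whether two levels differ.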
It can be useful to present a different image to each of the two eyes while they cooperatively view the world. Such dichoptic presentation can occur in investigations of stereoscopic and binocular vision (e.g., strabismus, amblyopia) and vision rehabilitation in clinical and research settings. Various techniques have been used to construct dichoptic displays. The most common and most flexible modern technique uses liquid-crystal (LC) shutters. When used in combination with cathode ray tube (CRT) displays, there is often leakage of light from the image intended for one eye into the view of the other eye. Such interocular crosstalk is 14% even in our state-of-the-art CRT-based dichoptic system. While such crosstalk may have minimal impact on stereo movie or video game experiences, it can defeat clinical and research investigations. We use micromirror digital light processing (DLP™) technology to create a novel dichoptic visual display system with substantially lower interocular crosstalk (0.3%; remaining crosstalk comes from the LC shutters). The DLP system normally uses a color wheel to display color images. Our approach is to disable the color wheel, synchronize the display directly to the computer's sync signal, allocate each of the three (former) color presentations to one or both eyes, and open and close the LC shutters in synchrony with those color events.
Spectacle-mounted telescopic systems are prescribed for individuals with visual impairments. Bioptic telescopes are typically mounted toward the top of the spectacle lens (or above the frame) with the telescope eyepiece positioned above the wearer's pupil. This allows the wearer to use up and down head tilt movements to quickly alternate between the unmagnified wide view (through the carrier lens) and the magnified narrow field of view (available through the eyepiece). Rejection of this visual aid has been attributed mainly to its appearance and to the limited field of view through the smaller Galilean designs. We designed a wide-field Keplerian telescope that is built completely within the spectacle lens. The design uses embedded mirrors inside the carrier lens for optical pathway folding, and conventional lenses or curved mirrors for magnification power. The short height of the ocular, its position, and a small tilt of the ocular mirror enable the wearer to simultaneously view the magnified field above the unmagnified view of the uninterrupted horizontal field. These features improve the cosmetics and utility of the device. The in-the-lens design allows the telescope to be mass produced as a commodity ophthalmic lens blank that can be surfaced to include the wearer's spectacle prescription.
The normal visual system provides a wide field of view apparently at high resolution. The wide field is continuously
monitored at low resolution for navigation and detection of objects of interest. These objects are sampled using the high-resolution
fovea, applying a temporal multiplexing scheme. Most vision impairments that cause low vision impact upon
only one of the components; the peripheral low-resolution wide field or the central high-resolution fovea. The loss of one
of these components prevents the interplay of central and peripheral vision needed for normal function and causes
disability. Traditional low-vision aids improve the impacted component, but usually at a cost of a significant loss in the
surviving component. For example, magnifying devices increase resolution but reduce the field of view, while minifying
devices increase the field of view but reduce resolution. A general optical engineering approach - vision multiplexing
- is presented. Vision multiplexing seeks to provide both the wide field of view and the high-resolution information in
ways that could be accessed and interpreted by the visual system. The use of various optical and electro-optical methods
in the development of a number of new visual aids, all of which apply vision multiplexing to restore the interplay of
high-resolution and wide-angle vision using eye movements in a natural way, will be described. Vision-multiplexing
devices at various stages of development and testing illustrate the successes and difficulties in applying this approach for
patients with tunnel vision, hemianopia (half blindness), and visual acuity loss (usually due to central retinal disease).
Purpose: Patients with tunnel vision have great difficulty with mobility. We have developed an augmented-vision
head-mounted device that can provide patients with a 5x expanded field by superimposing minified edge images of a wider field
captured by a miniature video camera over the natural view seen through the display. In the minified display, objects
appear closer to the heading direction than they really are. This might cause users to overestimate collision risks, and
therefore to perform unnecessary obstacle-avoidance maneuvers. A study was conducted in a virtual environment to test
the impact of minified view on collision judgment.
Methods: Simulated scenes were presented to subjects as if they were walking in a shopping mall corridor. Subjects
reported whether they would make any contact with stationary obstacles that appeared at variable distances from their
walking path. Perceived safe passing distance (PSPD) was calculated by finding the transition point from reports of yes
to no. Decision uncertainty was quantified by the sharpness of the transition. Collision envelope (CE) size was calculated
by summing up PSPD for left and right sides. Ten normally sighted subjects were tested (1) when not using the device
and with one eye patched, and (2) when the see-through view of device was blocked and only minified images were
Results: The use of the 5x minification device caused only an 18% increase of CE (13 cm, p=0.048). Significant impact
of the device on judgment uncertainty was not found (p=0.089).
Conclusion: Minification had only a small impact on collision judgment. This supports the use of such a minifying
device as an effective field expander for patients with tunnel vision.
Foveated imaging systems applicable in various single-user displays mimic the visual system's image structure, where resolution decreases gradually away from the fovea. The main benefit is the low average image resolution while maintaining high resolution at the center of the gaze. When the end user is a human observer, it is advantageous for the foveation process to closely match the visual system parameters. This work directly applies a multichannel model of the visual system to form foveated images. A systems-engineering approach applied to the vision model produces quantitative image spectral content across the visual channels. Foveated images are constructed according to the contrast threshold and image content calculated at different eccentricities. Also, variable-resolution feature detection (edge and bar) that corresponds to early visual processing is produced, based on the available image content across the channels. Motion between shifted foveated images (required in applications such as image compression and motion compensation) is estimated using either the foveated images or the detected feature images. Results using several similarity metrics and imaging conditions show that reliable motion estimation can be achieved, while features with nonsimilar resolutions (different scales) are matched.
Simulating mobility tasks in a virtual environment reduces risk for research subjects, and allows for improved experimental control and measurement. We are currently using a simulated shopping mall environment (where subjects walk on a treadmill in front of a large projected video display) to evaluate a number of ophthalmic devices developed at the Schepens Eye Research Institute for people with vision impairment, particularly visual field defects. We have conducted experiments to study subjects' perception of "safe passing distance" when walking towards stationary obstacles. The subject's binary responses about potential collisions are analyzed by fitting a psychometric function, which gives an estimate of the subject's perceived safe passing distance, and the variability of subject responses. The system also enables simulations of visual field defects using head and eye tracking, enabling better understanding of the impact of visual field loss. Technical infrastructure for our simulated walking environment includes a custom eye and head tracking system, a gait feedback system to adjust treadmill speed, and a handheld 3-D pointing device. Images are generated by a graphics workstation, which contains a model with photographs of storefronts from an actual shopping mall, where concurrent validation experiments are being conducted.
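A minimal sketch of the psychometric-function analysis described above follows. It assumes a logistic function of obstacle distance from the walking path and a simple grid-search maximum-likelihood fit; the function names, the parameter grid, and the response counts are hypothetical and are not the study's actual procedure or data.

```python
import math

def p_collision(distance_cm, pspd_cm, spread_cm):
    """Logistic psychometric function: probability of a 'yes, I would collide' report."""
    return 1.0 / (1.0 + math.exp((distance_cm - pspd_cm) / spread_cm))

def fit_psychometric(distances_cm, n_yes, n_trials):
    """Grid-search maximum-likelihood fit; returns (pspd_cm, spread_cm)."""
    best_params, best_loglik = None, -math.inf
    for pspd_tenths in range(0, 1001):       # candidate 50% point: 0.0 to 100.0 cm
        pspd = pspd_tenths / 10.0
        for spread in range(1, 41):          # candidate spread: 1 to 40 cm
            loglik = 0.0
            for d, yes in zip(distances_cm, n_yes):
                # Clamp to keep the logarithms finite at extreme probabilities.
                p = min(max(p_collision(d, pspd, spread), 1e-9), 1.0 - 1e-9)
                loglik += yes * math.log(p) + (n_trials - yes) * math.log(1.0 - p)
            if loglik > best_loglik:
                best_params, best_loglik = (pspd, float(spread)), loglik
    return best_params

# Hypothetical data: number of "yes" reports out of 10 trials at each obstacle distance.
distances_cm = [10, 20, 30, 40, 50, 60, 70]
n_yes = [10, 9, 8, 5, 2, 1, 0]
pspd_cm, spread_cm = fit_psychometric(distances_cm, n_yes, n_trials=10)
print(pspd_cm, spread_cm)
```

In this sketch the fitted 50% point stands in for the perceived safe passing distance, and the spread parameter stands in for response variability: a larger spread means a shallower transition from "yes" to "no".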
An optical see-through head-mounted display (HMD) system integrating a miniature camera that is aligned with the user's pupil is developed and tested. Such an HMD system has a potential value in many augmented reality applications, in which registration of the virtual display to the real scene is one of the critical aspects. The camera alignment to the user's pupil results in a simple yet accurate calibration and a low registration error across a wide range of depth. In reality, a small camera-eye misalignment may still occur in such a system due to the inevitable variations of HMD wearing position with respect to the eye. The effects of such errors are measured. Calculation further shows that the registration error as a function of viewing distance behaves nearly the same for different virtual image distances, except for a shift. The impact of prismatic effect of the display lens on registration is also discussed.
An MPEG-based image contrast enhancement algorithm for people with low vision is presented. Contrast enhancement is achieved by modifying the inter- and intra-quantization matrices in the MPEG decoder during the decompression stage. The algorithm has low computational complexity and does not affect the MPEG compressibility of the original image. We propose an enhancement filter based on the visual characteristics of low-vision patients, and report the results of preference experiments with 24 visually impaired subjects. Subjects favored low to moderate levels of enhancement for two of the tested video sequences, but favored only low levels of enhancement and rejected higher enhancement for two other sequences that had fast motion.
Spectacle-mounted telescopic systems have been prescribed for visual impairment, providing magnified images of objects at farther distances. Typically, bioptic telescopes are mounted toward the top of spectacle lenses or above the frame with the telescope eyepiece positioned above the eye's pupil. This allows the wearer to alternate between the magnified narrow field of view available through the eyepiece and the unmagnified wide view through the carrier lens using head motion. The main obstacles to acceptance are the obvious appearance, limited field of the smaller Galilean telescopes, and weight of the larger Keplerian telescopes. We designed a spectacle-mounted wide-field Keplerian telescope built completely inside the spectacle lens. The design uses embedded mirrors inside the carrier lens for optical pathway folding and conventional lenses or curved mirrors. The small size of the ocular and its position with additional mirror tilt enable the user to view the magnified field simultaneously and above the unmagnified view of the uninterrupted horizontal field that is important for the user's safety. This design enables the construction of cosmetic telescopes that can be produced as a commodity lens blank and surfaced to include the patient prescription. These devices may also be useful in military and civilian applications.
In a previous study the simulation of image appearance from different distances was shown to be effective. The simulated observation distance accurately predicted the distance at which the simulated image could be discriminated from the original image. Due to the 1/f nature of natural image spatial spectra, the individual CSF used was actually tested only at one retinal spatial frequency. To test the CSF relevant for the discrimination task over a wide range of frequencies, the same simulations and testing procedure were applied to 5 contrast versions of the images. The lower contrast images probe the CSF at lower spatial frequencies, while higher contrast images test the CSF value at higher spatial frequencies. Images were individually processed for each of 4 observers using their individual CSF to represent the appearance of the images from 3 distances where they span 1, 2, and 4 deg of visual angle, respectively. Each of the 4 pictures at the 5 contrast levels and the 3 simulated distances was presented 10 times side-by-side with the corresponding original image. Images were observed from 9 different observation distances. The subject's task was to determine which of the two was the original, unprocessed image. For each simulated distance the data were used to determine the discrimination distance threshold.
This paper describes a contrast-based monochromatic fusion process. The fusion process is intended for on-board, real-time use and aims to maximize the information content in the combined image, while retaining visual cues that are essential for navigation/piloting tasks. The method is a multiscale fusion process that provides a combination of pixel selection from a single image and a weighting of the two/multiple images. The spectral region is divided into spatial sub-bands of different scales and orientations, and within each scale a combination rule for the corresponding pixels taken from the two components is applied. Even when the combination rule is a binary selection, the fused image may combine pixel values taken from the two components at various scales, since the selection is made at each scale. The visual band input is given preference in low scale, large features fusion. This fusion process provides a fused image better tuned to natural and intuitive human perception. This is necessary for pilotage and navigation under stressful conditions, while maintaining or enhancing the targeting detection and recognition performance of proven display fusion methodologies. The fusion concept was demonstrated against imagery from image intensifiers and forward looking IR sensors currently used by the US Navy for navigation and targeting. The approach is easily extendible to more than two bands.
The local contrast in an image may be approximated by the contrast of a Gabor patch of varying phase and bandwidth. In a search for a metric for such local contrast, perceived (apparent) contrast, as indicated by matching of such patterns, was compared here to the physical contrast calculated by a number of methods. The 2 cycles/deg 1-octave Gabor patch stimuli of different phases were presented side by side separated by 4 degrees. During each session the subjects (n = 5) were adapted to the average luminance, and four different contrast levels (0.1, 0.3, 0.6, and 0.8) were randomly interleaved. The task was repeated at four mean luminance levels between 0.75 and 37.5 cd/m². The subject's task was to indicate which of the two patterns was lower in contrast. Equal apparent contrast was determined by fitting a psychometric function to the data from 40 to 70 presentations. There was no effect of mean luminance on the subjects' settings. The matching results rejected the hypothesis that either the Michelson formula or the King-Smith & Kulikowski contrast (CKK = (Lmax - Laverage)/Laverage) was used by the subjects to set the match. The use of the Nominal contrast (the Michelson contrast of the underlying sinusoid) as an estimate of apparent contrast could not be rejected. In a second experiment the apparent contrast of a 1-octave Gabor patch was matched to the apparent contrast of a 2-octave Gabor patch (of Nominal contrast of 0.1, 0.3, 0.6, 0.8) using the method of adjustment. The result of this experiment rejected the prediction of the Nominal contrast definition. The local band-limited contrast measure (Peli, 1990), when used with the modifications suggested by Lubin (1995), as an estimate of apparent contrast could not be rejected by the results of either experiment. These results suggest that a computational contrast measure based on multiscale bandpass filtering is a better estimate of apparent perceived contrast than any of the other measures tested.
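To make the difference between these contrast definitions concrete, the sketch below computes the Michelson contrast and the King-Smith & Kulikowski contrast for a one-dimensional cut through a Gabor patch in cosine and sine phase, both with the same Nominal contrast. The envelope width, mean luminance, and sampling step are illustrative assumptions rather than the stimulus parameters of the study, and Laverage is taken to be the mean luminance.

```python
import math

def gabor_profile(nominal_contrast, phase_rad, mean_luminance=10.0,
                  freq_cpd=2.0, sigma_deg=0.28):
    """Luminance along a horizontal cut through a Gabor patch, sampled every 0.001 deg."""
    xs = [i / 1000.0 for i in range(-3000, 3001)]
    return [
        mean_luminance * (1.0 + nominal_contrast
                          * math.exp(-x * x / (2.0 * sigma_deg ** 2))
                          * math.cos(2.0 * math.pi * freq_cpd * x + phase_rad))
        for x in xs
    ]

def michelson(profile):
    """Michelson contrast: (Lmax - Lmin) / (Lmax + Lmin)."""
    return (max(profile) - min(profile)) / (max(profile) + min(profile))

def ckk(profile, mean_luminance=10.0):
    """King-Smith & Kulikowski contrast: (Lmax - Laverage) / Laverage."""
    return (max(profile) - mean_luminance) / mean_luminance

nominal = 0.3
cosine_patch = gabor_profile(nominal, phase_rad=0.0)
sine_patch = gabor_profile(nominal, phase_rad=math.pi / 2.0)

for name, patch in (("cosine", cosine_patch), ("sine", sine_patch)):
    print(name, round(michelson(patch), 3), round(ckk(patch), 3))
```

Both patches share one Nominal contrast, yet the two physical measures change with phase because the envelope attenuates the peaks and troughs differently. This is why phase-varied matching can distinguish among the candidate definitions.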
Evaluation of retinal images is essential to modern ophthalmic care. With the advent of image processing equipment, digital recording and processing of retinal images is starting to replace the standard film based fundus photography. The ability to enhance images is cited as one of the major benefits of this expensive technology. This paper critically reviews the practices employed in the image enhancement literature. It is argued that the papers published to date have not presented convincing evidence regarding the diagnostic value of retinal image enhancement. The more elaborate studies in radiology suggest, at best, modest diagnostic improvement with enhancement. The special difficulties associated with the demonstration of an improved diagnosis in ophthalmic imaging are discussed in terms of the diagnostic task and the selection of study populations.
The luminance emitted from a cathode ray tube (CRT) display is a nonlinear function (the gamma function) of the input video signal voltage. In most analog video systems, compensation for this nonlinear transfer function is implemented in the camera amplifiers. When CRT displays are used to present psychophysical stimuli in vision research, the specific display nonlinearity usually is measured and accounted for to ensure that the luminance of each pixel in the synthetic image properly represents the intended value. However, when using digital image processing, the linear analog-to-digital converters store a digital image that is nonlinearly related to the displayed or recorded image. The effect of this nonlinear transformation on a variety of image-processing applications used in visual communications is described.
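One consequence of this nonlinearity can be illustrated with a short calculation: averaging two stored pixel values is not the same as averaging the luminances they produce. The sketch assumes a pure power-law display with an exponent of 2.2 and 8-bit pixel values; both are illustrative assumptions, since the actual display nonlinearity has to be measured.

```python
def displayed_luminance(value, gamma=2.2, max_value=255.0):
    """Relative luminance (0 to 1) for a stored pixel value, assuming a power-law display."""
    return (value / max_value) ** gamma

dark, bright = 0.0, 255.0

# Linear processing applied to the stored values (for example, a two-pixel average).
average_of_values = (dark + bright) / 2.0
luminance_of_average = displayed_luminance(average_of_values)

# The same operation applied to the luminances themselves.
average_of_luminances = (displayed_luminance(dark) + displayed_luminance(bright)) / 2.0

print(round(luminance_of_average, 3), round(average_of_luminances, 3))
```

Under these assumptions the averaged pixel value displays at roughly 22% of maximum luminance, whereas the average of the two luminances is 50%. Linear filters applied to the stored values therefore do not act linearly on luminance.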
Digital high-pass filtering is used frequently to enhance details in scientific, industrial, and military images. High-pass filtered (HPF) images also are used both to illustrate and test models of visual perception. The visual system appears to interpret HPF images in the context of a multiplicative model of high-frequency reflectance and low-frequency illumination whenever possible. HPF images can be treated as a form of two-dimensional amplitude modulation signals. The low-frequency information, which is coded in the modulation envelope, disappears with the carrier if low-pass filtered. The envelope may be retrieved (demodulated) using one of many possible nonlinear operations followed by a low-pass filter. The compressive nonlinearity of the visual system is shown to suffice for demodulating such images. Simulations show that HPF images cannot be used to reject the hypothesis that illusions and grouping phenomena are due to low-frequency channels.
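The demodulation argument can be sketched in one dimension. The example below builds an amplitude-modulated luminance profile with no energy at the envelope frequency, applies a logarithm as a stand-in for a compressive nonlinearity, and then measures the Fourier amplitude at the envelope frequency, which takes the place of the low-pass filter. The signal parameters and the choice of a logarithm are assumptions made for illustration, not the simulation used in the paper.

```python
import cmath
import math

N = 1024
ENVELOPE_CYCLES, CARRIER_CYCLES = 2, 64

def amplitude_at(signal, cycles):
    """Amplitude of the Fourier component with the given number of cycles per signal length."""
    total = sum(v * cmath.exp(-2j * math.pi * cycles * n / len(signal))
                for n, v in enumerate(signal))
    return 2.0 * abs(total) / len(signal)

# High-frequency carrier whose amplitude is modulated by a low-frequency envelope,
# riding on a mean luminance of 1.0 (values stay between 0.4 and 1.6).
luminance = [
    1.0 + 0.4 * (1.0 + 0.5 * math.cos(2 * math.pi * ENVELOPE_CYCLES * n / N))
              * math.cos(2 * math.pi * CARRIER_CYCLES * n / N)
    for n in range(N)
]

# Compressive point nonlinearity (assumed here to be a logarithm).
compressed = [math.log(v) for v in luminance]

before = amplitude_at(luminance, ENVELOPE_CYCLES)
after = amplitude_at(compressed, ENVELOPE_CYCLES)
print(before, after)
```

Before the nonlinearity the envelope frequency carries essentially no energy, so a linear low-pass filter would recover nothing. After compression a distortion product appears at the envelope frequency, and a low-frequency channel could pick it up.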
The luminance emitted from a cathode ray tube (CRT) display is a nonlinear function (the gamma function) of the input video signal voltage. In most analog video systems, compensation for this nonlinear transfer function is implemented in the camera amplifiers. When CRT displays are used to present psychophysical stimuli in vision research, the specific display nonlinearity usually is measured and accounted for to ensure that the luminance of each pixel in the synthetic image properly represents the intended value. However, when using digital image processing, the linear analog-to-digital converters store a digital image that is nonlinearly related to the displayed or recorded image. This paper describes the effect of this nonlinear transformation on a variety of image-processing applications used in visual communications.
Image enhancement as an aid for the visually impaired may be used to improve visibility of broadcast TV programs and to provide a portable visual aid. Initial work in this area was based on a linear model. The finite dynamic range available in the video display and contamination of the enhanced image by high spatial frequency noise limited the usefulness of this model. I propose a new enhancement method to address some of the limitations of the original model. It considers the nonlinear response of the visual system and requires enhancement of sub-threshold spatial information only. This modification increases the dynamic range available by decreasing the range previously used by the linear models to enhance visible details. Implementation of an image-enhancing visual aid in a head-mounted binocular full-field virtual vision device may cause substantial difficulties. Adaptation for the patient may be difficult due to head movement and interaction of the vestibular system response with the head-mounted display. I propose an alternate bioptic design in which the display is positioned above or below the line of sight to be examined intermittently, possibly in a freeze-frame mode. Such implementation is also likely to be less expensive, enabling more users access to the device.
Invariant perception of objects is desirable. Contrast constancy assures invariant appearance of suprathreshold image features as they change their distance from the observer. Fully robust size invariance also requires equal contrast thresholds across all spatial frequencies and eccentricities so that near-threshold image features do not appear or disappear with distance changes. This clearly is not the case, since contrast thresholds increase exponentially with eccentricity. We showed that a less stringent constraint actually may be realized. Angular size and eccentricity of image features covary with distance changes. Thus the threshold requirement for invariance could be approximately satisfied if contrast thresholds were to vary as the product of spatial frequency and eccentricity from the fovea. Measurements of observers' orientation discrimination contrast thresholds fit this model well over spatial frequencies of 1 to 16 cycles/degree and for retinal eccentricities up to 23 degrees. Measurements of observers’ contrast detection thresholds from three different studies provided an even better fit to this model over even wider spatial frequency and retinal eccentricity ranges. The fitting variable, the fundamental eccentricity constant, was similar for all three studies (0.036, 0.036, 0.030, respectively). The eccentricity constant for the orientation discrimination thresholds was higher (0.048 and 0.050 for two observers, respectively). We simulated the appearance of images with a nonuniform visual system by applying the proper threshold at each eccentricity and spatial frequency. The images exhibited only small changes over a simulated 4-octave distance range. However, the change in simulated appearance over the same distance range was dramatic for patients with central visual field loss.
The changes of appearance across the image as a function of eccentricity were much smaller than in previous simulations, which used data derived from visual cortex anatomy rather than direct measurements of visual function. Our model provides a new tool for analyzing the visibility of displays and for designing equal visibility or various visibility displays.
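The invariance argument can be illustrated numerically. The sketch assumes that the threshold at eccentricity e equals the foveal threshold multiplied by exp(k · f · e), where f is spatial frequency and k is the fundamental eccentricity constant. This functional form is an assumption consistent with the description above (exponential growth governed by the product of frequency and eccentricity); the function name and the example feature are hypothetical.

```python
import math

ECCENTRICITY_CONSTANT = 0.036  # value reported above for contrast detection thresholds

def eccentricity_factor(freq_cpd, eccentricity_deg, k=ECCENTRICITY_CONSTANT):
    """Assumed threshold elevation relative to the fovea: exp(k * f * e)."""
    return math.exp(k * freq_cpd * eccentricity_deg)

# A feature at 4 cycles/deg seen 5 deg from the fovea...
near = eccentricity_factor(freq_cpd=4.0, eccentricity_deg=5.0)

# ...viewed from twice the distance: its angular size halves, so its spatial
# frequency doubles while its eccentricity halves.
far = eccentricity_factor(freq_cpd=8.0, eccentricity_deg=2.5)

print(round(near, 3), round(far, 3))
```

Because the product of spatial frequency and eccentricity is unchanged by the change in viewing distance, the eccentricity-dependent elevation is the same in both cases under this assumed form. This is the sense in which the less stringent constraint can hold.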
A miniature display device, recently available commercially, is aimed at providing a portable, inexpensive means of visual information communication. The display is head mounted in front of one eye with the other eye's view of the environment unobstructed. Various visual phenomena
are associated with this design. The consequences of these phenomena for visual safety, comfort, and efficiency of the user were evaluated: (1) The monocular, partially occluded mode of operation interrupts binocular vision. Presenting disparate images to each eye results in binocular
rivalry. Most observers can use the display comfortably in this rivalrous mode. In many cases, it is easier to use the display in a peripheral
position, slightly above or below the line of sight, thus permitting normal binocular vision of the environment. (2) As a head-mounted device, the
displayed image is perceived to move during head movements due to the response of the vestibulo-ocular reflex. These movements affect the visibility of small letters during active head rotations and sharp accelerations. Adaptation is likely to reduce this perceived image motion. No evidence for postural instability or motion sickness was noted as a result of these conflicts between visual and vestibular inputs. (3) Small displacements of the image are noted even without head motion, resulting from eye movements and the virtual lack of display persistence. These movements are noticed spontaneously by few observers and are unlikely to interfere with the display use in most tasks.