A prototype, wide-field, optical sense-and-avoid instrument was constructed from low-cost commercial off-the-shelf
components, and configured as a network of smart camera nodes. To detect small, general-aviation aircraft
in a timely manner, such a sensor must detect targets at a range of 5-10 km at an update rate of a few
Hz. This paper evaluates the
flight test performance of the "DragonflEYE" sensor as installed on a Bell 205
helicopter. Both the Bell 205 and the Bell 206 (intruder aircraft) were fully instrumented to record position and
orientation. Emphasis was given to the critical case of head-on collisions at typical general aviation altitudes and
airspeeds. Imagery from the DragonflEYE was stored for the offline assessment of performance. Methodologies
for assessing the key figures of merit, such as the signal-to-noise ratio, the range at first detection (R0) and
angular target size were developed. Preliminary analysis indicated an airborne detection range of 6:7 km under
typical visual meteorological conditions, which significantly exceeded typical visual acquisition ranges under the
On-die optics have been proposed for stand-alone image sensors. Previous works by the authors have proposed fabricating diffractive optical elements using the upper metal layers in a commercial CMOS process. This avoids the cost of the additional process steps associated with microlens fabrication, but results in a point spread function that varies with the wavelength, angle, and polarization of incident light. Wavelength and angle sensitivities have been addressed in previous works. This paper models the effects of polarization on the point spread function of the imaging system, and proposes optical and algorithmic methods for compensating for these effects. The imaging behaviors of the resulting systems are evaluated. Simulations indicate that the uncorrected system can locate point sources to within ±0.1 radian, and polarized point sources to within ±0.05 radian along the axis of polarization. A system is described that uses a polarization-insensitive optical element and a deconvolution filter to achieve a corrected resolution of ±0.05 radian, with the ability to perform imaging of non-point sources under white light illumination.
Expected temporal effects in a night vision goggle (NVG) include the fluorescence time constant, charge depletion at high signal levels, the response time of the automatic gain control (AGC) and other internal modulations in the NVG. There is also the possibility of physical damage or other non-reversible effects in response to large transient signals. To study the temporal behaviour of an NVG, a parametric Matlab model has been created. Of particular interest in the present work was the variation of NVG gain, induced by the AGC, after a short, intense pulse of light. To verify the model, the reduction of gain after a strong pulse was investigated experimentally using a simple technique. Preliminary laboratory measurements were performed using this technique. The experimental methodology is described, along with preliminary validation data.
On-die optics have been proposed for imaging, spectral analysis, and
communications applications. These systems typically require extra process
steps to fabricate on-die optics. Fabrication of diffractive optics using
the metal layers in commercial CMOS processes circumvents this
requirement, but produces optical elements with poor imaging behavior.
This paper discusses the application of Wiener filtering to reconstruction
of images suffering from blurring and chromatic aberration, and to
identification of the position and wavelength of point sources. Adaptation
of this approach to analog and digital FIR implementations is discussed,
and the design of a multispectral imaging sensor using analog FIR
filtering is presented. Simulations indicate that off-die post-processing
can determine point source wavelength to within 5% and position to
within ±0.05 radian, and resolve features 0.4 radian in size in
images illuminated by white light. The analog hardware implementation is simulated to resolve
features 0.4 radian in size illuminated by monochromatic light, and 0.7
radian with white light.
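The Wiener filtering step described above can be sketched in a few lines. This is a generic frequency-domain implementation, not the paper's analog or digital FIR design; the 2x2 box point spread function and the noise-to-signal ratio below are illustrative assumptions.

```python
import numpy as np

def wiener_deconvolve(blurred, psf, nsr=0.01):
    """Wiener deconvolution in the frequency domain.

    blurred: 2D observed image; psf: 2D point spread function of the
    same shape, centred at index [0, 0]; nsr: assumed noise-to-signal
    power ratio (the regularization term).
    """
    H = np.fft.fft2(psf)
    # Wiener filter: conj(H) / (|H|^2 + NSR)
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * G))

# Toy demonstration: blur a point source, then restore it.
img = np.zeros((32, 32))
img[16, 16] = 1.0
psf = np.zeros((32, 32))
psf[0, 0] = psf[0, 1] = psf[1, 0] = psf[1, 1] = 0.25  # 2x2 box blur
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(psf)))
restored = wiener_deconvolve(blurred, psf, nsr=1e-3)
peak = np.unravel_index(np.argmax(restored), restored.shape)
```

The regularized inverse tolerates the zeros of the box-blur transfer function, so the restored image peaks back at the original point-source location.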
On-die optics are an attractive way of reducing package size for imaging and non-imaging optical sensors. While systems incorporating on-die optics have been built for imaging and spectral analysis applications, these have required specialized fabrication processes and additional off-die components. This paper discusses the fabrication of an image sensor with neither of these limitations. Through careful design, an image sensor is implemented that uses on-die diffractive optics fabricated using a standard 0.18 micron bulk CMOS process, with simulations indicating that the resulting die is capable of acting as a standalone imaging system resolving spatial features to within ±0.15 radian and spectral features to within ±40 nm wavelength accuracy.
For objects on a plane, a "scale factor" relates the physical dimensions of the objects to the corresponding dimensions in a camera image. This scale factor may be the only calibration parameter of importance in many test applications. The scale factor depends on the angular size of a pixel of the camera, and also on the range to the object plane. A measurement procedure is presented for the determination of scale factor to high precision, based on the translation of a large-area target by a precision translator. A correlation analysis of the images of a translated target against a reference image is used to extract image shifts and the scale factor. The precision of the measurement is limited by the translator accuracy, camera noise and various other secondary factors. This measurement depends on the target being translated in a plane perpendicular to the optic axis of the camera, so that the scale factor is constant during the translation. The method can be extended to inward-looking 3D camera networks and can, under suitable constraints, yield both scale factor and transcription angle.
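The correlation step at the heart of the scale-factor measurement can be sketched as follows. This is a minimal integer-pixel version using FFT-based circular cross-correlation; the synthetic random target, the 3-pixel shift and the 1.5 mm translator displacement are illustrative assumptions (the paper's sub-pixel interpolation is not reproduced).

```python
import numpy as np

def estimate_shift(ref, moved):
    """Integer-pixel shift of `moved` relative to `ref`, from the peak
    of the circular cross-correlation computed via the FFT."""
    corr = np.fft.ifft2(np.fft.fft2(moved) * np.conj(np.fft.fft2(ref)))
    dy, dx = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
    n_rows, n_cols = ref.shape
    # Map shifts into the range [-N/2, N/2).
    if dy >= n_rows // 2:
        dy -= n_rows
    if dx >= n_cols // 2:
        dx -= n_cols
    return dy, dx

# Synthetic target texture, translated by 3 pixels along x.
rng = np.random.default_rng(0)
ref = rng.random((64, 64))
moved = np.roll(ref, 3, axis=1)
dy, dx = estimate_shift(ref, moved)

# If the precision translator moved the physical target by 1.5 mm,
# the scale factor follows directly, in mm per pixel.
translation_mm = 1.5
scale_mm_per_pixel = translation_mm / dx
```

As in the measurement procedure, the precision of the recovered scale factor is set by the accuracy of the known translation and of the image-shift estimate.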
This paper addresses the optimization of power at the circuit level in the main blocks of CMOS APS image sensors. A pixel bias current of zero during the readout period is shown to reduce the static power and enhance the settling time of the pixel. A balanced operational transconductance amplifier (OTA) has been demonstrated to be a better candidate as an amplifier when employed in a correlated double sampling (CDS) circuit or as a comparator in an analog-to-digital (A/D) converter, as compared to a Miller two-stage amplifier. Using common-mode feedback (CMFB) in an OTA can further reduce the quiescent power of the amplifier. The low power capability of a CMFB OTA is discussed in this paper by performing a comparison with a conventional OTA using a 0.18 μm technology.
The human vision system (HVS) is remarkably robust against eye distortions. Through a combination of eye movements and visual feedback, the HVS can often appropriately interpret scene information acquired from flawed optics. Inspired by biological systems, we have built an electronically and mechanically reconfigurable "saccadic" camera system. The saccadic camera is designed to efficiently examine scenes through foveated imaging, where scrutiny is reserved for salient regions of interest. The system's "eye" is an electronic image sensor used in multiple modes of resolution. We use a subwindow set at high resolution as the system's fovea, and capture the remaining visual field at a lower resolution. The ability to program the subwindow's size and position provides an analog to biological eye movements. Similarly, we can program the system's mechanical components to provide the "neck's" locomotion for modified perspectives. In this work, we use the saccadic camera to develop a "work-around" routine in response to possible degradations in the camera's lens. This is particularly useful in situations where the camera's optics are exposed to harsh conditions, and cannot be easily repaired or replaced. By exploiting our knowledge of the image sensor's electronic coordinates relative to the camera's mechanical movement, the system is able to develop an empirical distortion model of the image formation process. This allows the saccadic camera to dynamically adapt to changes in its image quality.
Visual information is of vital significance to both animals and artificial systems. The majority of mammals rely on two images, each with a resolution of 10^7-10^8 'pixels' per image. At the other extreme are insect eyes where the field of view is segmented into 10^3-10^5 images, each comprising effectively one pixel/image. The great majority of artificial imaging systems lie nearer to the mammalian characteristics in this parameter space, although electronic compound eyes have been developed in this laboratory and elsewhere. If the definition of a vision system is expanded to include networks or swarms of sensor elements, then schools of fish, flocks of birds and ant or termite colonies occupy a region where the number of images and the pixels/image may be comparable. A useful system might then have 10^5 imagers, each with about 10^4-10^5 pixels. Artificial analogs to these situations include sensor webs, smart dust and co-ordinated robot clusters. As an extreme example, we might consider the collective vision system represented by the imminent existence of ~10^9 cellular telephones, each with a one-megapixel camera. Unoccupied regions in this resolution-segmentation parameter space suggest opportunities for innovative artificial sensor network systems. Essential for the full exploitation of these opportunities is the availability of custom CMOS image sensor chips whose characteristics can be tailored to the application. Key attributes of such a chip set might include integrated image processing and control, low cost, and low power. This paper compares selected experimentally determined system specifications for an inward-looking array of 12 cameras with the aid of a camera-network model developed to explore the tradeoff between camera resolution and the number of cameras.
When a bright light source is viewed through Night Vision Goggles (NVG), the image of the source can appear enveloped in a “halo” that is much larger than the “weak-signal” point spread function of the NVG. The halo phenomenon was investigated in order to produce an accurate model of NVG performance for use in psychophysical experiments. Halos were created and measured under controlled laboratory conditions using representative Generation III NVGs. To quantitatively measure halo characteristics, the NVG eyepiece was replaced by a CMOS imager. Halo size and intensity were determined from camera images as functions of point-source intensity and ambient scene illumination. Halo images were captured over a wide range of source radiances (7 orders of magnitude) and then processed with standard analysis tools to yield spot characteristics. The spot characteristics were analyzed to verify our proposed parametric model of NVG halo event formation. The model considered the potential effects of many subsystems of the NVG in the generation of halo: objective lens, photocathode, image intensifier, fluorescent screen and image guide. A description of the halo effects and the model parameters are contained in this work, along with a qualitative rationale for some of the parameter choices.
A general purpose FPGA architecture for real-time thresholding is proposed in this paper. The hardware architecture is based on a weight-based clustering algorithm that treats thresholding as a problem of clustering background and foreground pixels. This method employs the clustering capability of a two-weight neural network to find the centroids of the two pixel groups. The image threshold is the average of these two centroids. The proposed method is an adaptive thresholding technique because, for every input pixel, the closest weight is selected for updating. Updating is based on the difference between the input pixel gray level and the associated weight, scaled by a learning rate factor.
The hardware system is implemented on an FPGA platform and consists of two pipelined functional blocks. While the first block obtains the threshold value for the current frame, the other block applies the threshold value to the previous frame. This parallelism and the simple hardware components of both blocks make this approach suitable for real-time applications, while the performance remains comparable with the Otsu technique frequently used in off-line threshold determination.
Results from the proposed algorithm are presented for numerous examples, both from simulations and experimentally using the FPGA. Although the primary application of this work is to centroiding of laser spots, its use in other applications will be discussed.
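The two-weight clustering update described above can be sketched in software. The learning rate, initial weights, and synthetic bimodal image below are illustrative assumptions; the FPGA pipelining is not modeled.

```python
import numpy as np

def cluster_threshold(pixels, lr=0.05, w_init=(0.0, 255.0)):
    """Two-weight competitive-learning threshold: each incoming pixel
    updates the nearer of two weights by the scaled difference between
    the pixel gray level and that weight; the threshold is the average
    of the two converged weights (the two cluster centroids)."""
    w = list(w_init)
    for p in pixels:
        i = 0 if abs(p - w[0]) <= abs(p - w[1]) else 1
        w[i] += lr * (p - w[i])   # move the winning weight toward the pixel
    return 0.5 * (w[0] + w[1])

# Bimodal test image: dark background near 40, bright foreground near 200.
rng = np.random.default_rng(1)
img = np.concatenate([rng.normal(40, 5, 900), rng.normal(200, 5, 100)])
rng.shuffle(img)
t = cluster_threshold(img)
```

For this synthetic image the two weights settle near the two mode centres, so the returned threshold falls roughly midway between background and foreground gray levels.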
A concept is described for the detection and location of transient objects, in which a "pixel-binary" CMOS imager is used to give a very high effective frame rate for the imager. The sensitivity to incoming photons is enhanced by the use of an image intensifier in front of the imager. For faint signals and a high enough frame rate, a single "image" typically contains only a few photon or noise events. Only the event locations need be stored, rather than the full image. The processing of many such "fast frames" allows a composite image to be created. In the composite image, isolated noise events can be removed, photon shot noise effects can be spatially smoothed and moving objects can be de-blurred and assigned a velocity vector. Expected objects can be masked or removed by differencing methods. In this work, the concept of a combined image intensifier/CMOS imager is modeled. Sensitivity, location precision and other performance factors are assessed. Benchmark measurements are used to validate aspects of the model. Options for a custom CMOS imager design concept are identified within the context of the benefits and drawbacks of commercially available night vision devices and CMOS imagers.
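The fast-frame compositing idea can be sketched as follows. This is a toy simulation under assumed event rates (a static source firing in 20% of frames, plus rare random dark counts); only event locations are stored per frame, and the composite separates the repeating source from isolated noise events.

```python
import numpy as np

rng = np.random.default_rng(2)
H = W = 32
n_frames = 200
composite = np.zeros((H, W), dtype=np.int32)

# Simulate "pixel-binary" fast frames: a faint static source at (10, 20)
# fires with 20% probability per frame; spurious noise events occur
# elsewhere with low probability.
for _ in range(n_frames):
    events = []                      # store only event locations, not images
    if rng.random() < 0.2:
        events.append((10, 20))      # photon event from the source
    if rng.random() < 0.05:
        events.append(tuple(rng.integers(0, 32, size=2)))  # noise event
    for (r, c) in events:
        composite[r, c] += 1

# In the composite, the source pixel accumulates many counts, while
# isolated noise events rarely repeat at the same location.
source = np.unravel_index(np.argmax(composite), composite.shape)
```

De-blurring of moving objects would additionally require shifting each fast frame's event list by a trial velocity before accumulation, which this static sketch omits.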
The capture of a wide field of view (FOV) scene by dividing it into multiple sub-images is a technique with many precedents in the natural world, the most familiar being the compound eyes of insects and arthropods. Artificial structures of networked cameras and simple compound eyes have been constructed for applications in robotics and machine vision. Previous work in this laboratory has explored the construction and calibration of sensors which produce multiple small images (of ~150 pixels in diameter) for high-speed object tracking.
In this paper, design options are presented for electronic compound eyes consisting of 10^1-10^3 identical 'eyelets'. To implement a compound eye, multiple sub-images can be captured by distributing cameras and/or image collection optics. Figures of merit for comparisons will be developed to illustrate the impact of design choices on the field of view, resolution, information rate, image processing, calibration, environmental sensitivity and compatibility with integrated CMOS imagers.
Whereas compound eyes in nature are outward-looking, the methodology and subsystems for an outward-looking compound-eye sensor are similar to those for an inward-looking sensor, although inward-looking sensors have a common region viewable by all eyelets simultaneously. The paper addresses the design considerations for compound eyes in both outward-looking and inward-looking configurations.
We demonstrate a non-orthogonal image sensor architecture, called here the pyramid architecture, in which 2D sampling along concentric rings replaces the 1D row sampling of classical imager architectures, and diagonal output busses replace the conventional vertical column busses. Because imager fixed pattern noise (FPN) is distributed along the output busses, the noise in classical CMOS imagers appears as vertical stripes; in our imager, the noise stripes are distributed diagonally. It is well known that the human visual system is less sensitive to obliquely oriented contrast than to orthogonal contrast, and it is therefore more sensitive to vertically distributed noise than to the diagonally oriented noise of our pyramidal imager. The pyramid architecture thus exploits this property of human vision to make the sensor's inherent FPN less apparent to the eye. Moreover, we propose a scanning scheme in which, instead of rolling over to the first ring (or row) at the end of a scan, the readout bounces back each time it reaches an edge of the pyramid imager and samples the image back to the starting ring before continuing. This produces two sets of ring integration-time profiles that, after being fused, result in a foveated increase in intra-scene dynamic range.
Damage in CMOS image sensors caused by heavy ions with moderate energy (~10 MeV) is discussed through the effects on transistors and photodiodes. SRIM (stopping and range of ions in matter) simulation results of heavy ion radiation damage to CMOS image sensors implemented with standard 0.35 μm and 0.18 μm technologies are presented. Total ionizing dose, displacement damage and single event damage are described in the context of the simulation. It is shown that heavy ions with an energy on the order of 10 MeV cause significant total ionizing dose and displacement damage around the active region in 0.35 μm technology, but reduced effects in 0.18 μm technology. The peak of displacement damage moves into the substrate with increasing ion energy. The effect of layer structure in the 0.18 μm and 0.35 μm technologies on heavy ion damage is also described.
An optical beam combined with an array detector in a suitable geometrical arrangement is well-known to provide a range measurement based on the image position. Such a 'triangulation' rangefinder can measure range with short-term repeatability below the 10^-5 level, with the aid of spatial and temporal image processing. This level of precision is achieved by a centroid measurement precision of ±0.02 pixel. In order to quantify its precision, accuracy and linearity, a prototype triangulation rangefinder was constructed and evaluated in the laboratory using a CMOS imager and a collimated optical source. Various instrument, target and environmental conditions were used. The range-determination performance of the prototype instrument is described, based on laboratory measurements and augmented by a comprehensive parametric model. Temperature drift was the dominant source of systematic error. The temperature and vibration environments and target orientation and motion were controlled to allow their contributions to be independently assessed. Laser, detector and other effects were determined both experimentally and through modeling. Implementation concepts are presented for a custom CMOS imager that can enhance the performance of the rangefinder, especially with regards to update rate.
Vanishing point and Z-transform image center calibration techniques are reported for a prototype “compound-eye” camera system which can contain up to 25 “eyelets”. One application of this system is to track a fast-moving object, such as a tennis ball, over a wide field of view. Each eyelet comprises a coherent fiber bundle with a small imaging lens at one end. The other ends of the fiber bundles are aligned on a plane, which is re-imaged onto a commercial CMOS camera. The design and implementation of the Dragonfleye prototype is briefly described. Calibration of the image centers of the eyelet lenses is performed using a vanishing point technique, achieving an error of approximately ±0.2 pixels. An alternative technique, the Z-transform, is shown to be able to achieve similar results. By restricting the application to a two-dimensional surface, it is shown that similar accuracies can be achieved using a simple homography transformation without the need for calibrating individual eyelets. Preliminary results for object tracking between eyelets are presented, showing an error between actual and measured positions of around 3.5 mrad.
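The homography mapping used for the two-dimensional-surface case can be sketched with a standard direct linear transform (DLT) fit. This is a generic implementation, not the prototype's calibration code; the four synthetic point correspondences below (a scale-and-translate map, one special case of a homography) are illustrative assumptions.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: fit H (3x3, defined up to scale) mapping
    src -> dst for four or more 2D point correspondences."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def apply_homography(h_mat, pt):
    """Map a 2D point through H using homogeneous coordinates."""
    p = h_mat @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]

# Known planar map: scale by 2, translate by (5, -3).
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dst = [(2 * x + 5, 2 * y - 3) for x, y in src]
H = fit_homography(src, dst)
mapped = apply_homography(H, (0.5, 0.5))
```

A single such fit relates image coordinates to positions on the tracked surface, which is why per-eyelet calibration can be avoided when the application is restricted to a plane.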
Night vision devices are important tools that extend the operational capability of military and civilian flight operations. Although these devices enhance some aspects of night vision, they distort or degrade other aspects. Scintillation of the NVG signal at low light levels is one of the parameters that may affect pilot performance. We have developed a parametric model of NVG image scintillation. Measurements were taken of the output of a representative NVG at low light levels to validate the model and refine the values of the embedded parameters. A simple test environment was created using a photomultiplier and an oscilloscope. The model was used to create sequences of simulated NVG imagery that were characterized numerically and compared with measured NVG signals. The sequences of imagery are intended for use in laboratory experiments on depth and motion-in-depth perception.
We demonstrate a non-orthogonal architecture for a CMOS active pixel image sensor, called here the pyramid architecture, for improved two-dimensional spatial sampling. In the pyramid architecture, 2D sampling using concentric rings replaces the 1D row sampling of the classical imager architecture, and diagonal output busses replace the conventional vertical column busses. Moreover, we propose a scanning scheme in which, instead of rolling over to the first ring (or row) at the end of image capture, the scan returns from the outer ring towards the first inner ring at the centre of the sensor. This leads to two scenes of differing integration times that, after being fused, result in a foveated increase in intra-scene dynamic range. Results from a sensor fabricated in 0.18 μm CMOS technology are presented and discussed. We also present a multi-resolution architecture that uses the pixel structure as a building block to control the acquired image resolution.
Zoom magnification is an essential element of video-based low vision enhancement systems. However, since optical
zoom systems are bulky and power intensive, digital zoom is an attractive alternative. This paper determines the visual
acuity of 15 subjects when a letter chart is viewed through a video system with various levels of digital zoom. A strategy
in which the 1:1 magnified image is obtained by combining optical magnification with digital minification gives the best
result, provided background scene information is known from the other cameras. A real-time FPGA-based system for
simultaneous zoom and smoothing is also demonstrated for text reading and enhancement.
Compound eyes are a highly successful natural solution to the issue of wide field of view and high update rate for vision systems. Applications for an electronic implementation of a compound eye sensor include high-speed object tracking and depth perception. In this paper we demonstrate the construction and operation of a prototype compound eye sensor which currently consists of up to 20 eyelets, each of which forms an image of approximately 150 pixels in diameter on a single CMOS image sensor. Post-fabrication calibration of such a sensor is discussed in detail with reference to experimental measurements of accuracy and repeatability.
A pixel-parallel image sensor readout technique is demonstrated for CMOS active pixel sensors to facilitate a range of applications where the high-speed detection of the presence of an object, such as a laser spot, is required. Information concerning the object’s location and size is more relevant than a captured image for such applications. A sensor for which the output comprises the numbers of pixels above a global threshold in both rows and columns is demonstrated in 0.18 μm CMOS technology. The factors limiting the ultimate performance of such a system are discussed. Subsequently, techniques for enhancing information retrieval from the sensor are introduced, including centroid calculations using multiple thresholds, multi-axis readout, and run-length encoding.
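The row/column projection readout can be emulated in a few lines. This sketch assumes a synthetic frame with a single bright spot; the on-chip parallel counting circuitry is of course not modeled, only its output.

```python
import numpy as np

def projection_readout(frame, threshold):
    """Emulates the pixel-parallel readout: for a global threshold,
    report the number of above-threshold pixels in each row and column."""
    binary = frame > threshold
    return binary.sum(axis=1), binary.sum(axis=0)  # row counts, column counts

def centroid_from_projections(row_counts, col_counts):
    """Centroid of the thresholded object from the two 1D projections
    alone, without access to the full 2D image."""
    total = row_counts.sum()
    rows = np.arange(row_counts.size)
    cols = np.arange(col_counts.size)
    return (rows * row_counts).sum() / total, (cols * col_counts).sum() / total

# Synthetic laser spot: a bright 3x3 block centred at (12, 20) on a dark frame.
frame = np.zeros((32, 32))
frame[11:14, 19:22] = 100.0
row_counts, col_counts = projection_readout(frame, threshold=50.0)
cy, cx = centroid_from_projections(row_counts, col_counts)
```

Only 2N counts per frame leave the sensor instead of N^2 pixel values, which is the bandwidth advantage that makes this readout attractive for high-speed spot detection.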
In this paper, we demonstrate a CMOS active pixel sensor chip, integrated with binary image processing on a single monolithic chip. A prototype chip comprising a 64 × 64 photodiode array with on-chip binary image processing is fabricated in standard 0.35 micrometer CMOS technology, with a 3.3 V power supply. The binary image processing functionality is embedded in the column structure, where one processing element is placed per column, reducing processing time and power consumption. This column processing structure is scalable to higher resolution. A 3 × 3 local mask (also called a structuring element) is implemented in every column so that row-parallel processing can be achieved with a conventional progressive scanning method.
Mosaic imagers increase field of view cost-effectively, by connecting single-chip cameras in a coordinated manner equivalent to a large array of sensors. Components that would conventionally have been in separate chips can be integrated on the same focal plane by using CMOS image sensors (CIS). Here, a mosaic imaging system is constructed using CIS connected through a bus line which shares common input controls and output(s), and enables additional cameras to be inserted with little system modification. The image bus consumes relatively low power by employing intelligent power control techniques. However, the bandwidth of the bus will still limit the number of camera modules that can be connected in the mosaic array. Hence, signal-processing components, such as data reduction and encoding, are needed on-chip in order to achieve high readout speed. One such method is described in which the number and sizes of pixel clusters above an intensity threshold are determined using a novel 'object positioning algorithm' architecture. This scheme identifies significant events or objects in the scene before the camera's data are transmitted over the bus, thereby reducing the effective bandwidth. In addition, basic modules in the single-chip camera are suggested for efficient data transfer and power control in the mosaic imager.
A novel multispectral remote sensing instrument for microsatellites is described. By using 10^2-10^3 'chipxels,' a combination of high angular resolution, large coverage region, multispectral operation, and redundancy can be achieved. Each 'chipxel' has a detector array, optics, electronics, and an intelligent bus interface.
The detection of incipient wildfires from space is optimized by high spatial resolution, redundant coverage of a large swath, modest spectral resolution, and a high image frame rate. The desired information rate can exceed 10^9 bytes/sec, which is difficult to achieve with conventional sensor designs. A design is described for a distributed sensor consisting of 10^2-10^3 identical detection modules linked by a serial bus to a central controller. Each detection module or 'chipxel' contains an intelligent bus interface, a detector array, a multiplexer, amplifiers, digitizers, local data and program memory, a local controller, and modest image reprocessing. Clock, timing, and power control can also be present. The baseline detector element is an active CMOS image sensor, although a mix of detectors can share a common readout structure. The paper will describe the specifications for a two-chip implementation of a chipxel for space-based wildfire detection, with emphasis on the intelligent bus interface, power control, and on-chip preprocessing. Key analog and digital elements of the chip have been implemented in CMOS 0.35 micrometer technology, while ancillary functions and design augmentations can be evaluated in a gate array or similar hardware.
The uniformity of the output of an integrated, quasi-linear array of bolometer elements is evaluated in terms of substrate temperature gradients, variations in bolometer thermal conductivity and temperature coefficient of resistance, and self-heating during readout. With a suitable offset compensation procedure, the array non-uniformity can be as low as a few parts per million of the DC offset voltage. Uniform substrate temperature changes as large as 10K can be tolerated.
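The offset compensation idea can be illustrated numerically. This sketch assumes a simple model of the array (a nominal 2 V DC offset with a small fixed-pattern spread and tiny readout noise); the specific numbers are illustrative, not measured values from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 64                               # bolometer elements in the array
dc_offset = 2.0                      # nominal DC offset voltage, volts
# Fixed-pattern spread across elements (assumed ~0.05% of the offset).
element_offsets = dc_offset + rng.normal(0, 1e-3, n)

# Offset compensation: measure each element once with no signal applied
# and subtract that stored reference from all subsequent readings.
reference = element_offsets.copy()
reading = element_offsets + rng.normal(0, 1e-6, n)   # readout noise only
compensated = reading - reference

# Residual non-uniformity, expressed in parts per million of the offset.
nonuniformity_ppm = compensated.std() / dc_offset * 1e6
```

After subtracting the stored per-element references, the residual spread is set by the readout noise alone, dropping from hundreds of ppm to the few-ppm level quoted above.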