In recent years, we have seen highly successful blind image deblurring algorithms that can even handle large
motion blurs. Most of these algorithms assume that the entire image is blurred with a single blur kernel. This
assumption does not hold if the scene depth is not negligible or when there are multiple objects moving differently
in the scene. In this paper, we present a method for space-varying point spread function (PSF) estimation and
image deblurring. Regarding the PSF estimation, we do not make any restrictions on the type of blur or how the
blur varies spatially. That is, the blur might be, for instance, a large (non-parametric) motion blur in one part of
an image and a small defocus blur in another part without any smooth transition. Once the space-varying PSF
is estimated, we perform space-varying image deblurring, which produces good results even for regions where the correct PSF is initially unclear. We provide experimental results with real data to demonstrate the
effectiveness of our method.
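The abstract above leaves the deblurring machinery unspecified. As a rough illustration of how a spatially varying PSF could be applied once estimated, the sketch below deconvolves overlapping tiles with per-tile Wiener filters and blends them with a smooth window; the tiling scheme, the Wiener step, and every name here are hypothetical choices for this sketch, not the authors' method.

```python
import numpy as np

def wiener_deconv(patch, kernel, nsr=0.01):
    # Frequency-domain Wiener deconvolution of one patch with one local PSF.
    H = np.fft.fft2(kernel, s=patch.shape)
    G = np.fft.fft2(patch)
    return np.real(np.fft.ifft2(np.conj(H) * G / (np.abs(H) ** 2 + nsr)))

def space_varying_deblur(image, kernels, tile=64, overlap=16):
    # Deblur each overlapping tile with its own kernel and blend the results
    # with a Hann window, so neighboring PSF estimates transition smoothly.
    H, W = image.shape
    out = np.zeros((H, W))
    norm = np.zeros((H, W))
    win1d = np.hanning(tile + overlap) + 1e-6  # avoid exactly-zero weights
    win = np.outer(win1d, win1d)
    for i, y in enumerate(range(0, H, tile)):
        for j, x in enumerate(range(0, W, tile)):
            y1 = min(y + tile + overlap, H)
            x1 = min(x + tile + overlap, W)
            psf = kernels.get((i, j), np.array([[1.0]]))  # identity if absent
            d = wiener_deconv(image[y:y1, x:x1], psf)
            w = win[: y1 - y, : x1 - x]
            out[y:y1, x:x1] += d * w
            norm[y:y1, x:x1] += w
    return out / norm
```

Because each tile carries its own kernel, a motion blur in one region and a defocus blur in another are handled independently, with the window providing the transition.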
In this paper, we investigate super-resolution image restoration from multiple images, which are possibly degraded
with large motion blur. The blur kernel for each input image is separately estimated. This is unlike many existing
super-resolution algorithms, which assume identical blur kernel for all input images. We also do not make any
restrictions on the motion fields among images; that is, we estimate dense motion field without simplifications
such as parametric motion. We present a two-step algorithm: In the first step, each input image is deblurred
using the estimated blur kernel. In the second step, super-resolution restoration is applied to the deblurred
images. Because the estimated blur kernels may not be accurate, we propose a weighted cost function for the
super-resolution restoration step, where a weight associated with an input image reflects the reliability of the
corresponding kernel estimate and the deblurred image. We provide experimental results from real video data
captured with a hand-held camera, and show that the proposed weighting scheme is robust to errors in motion deblurring.
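The exact form of the weighted cost function is not given in the abstract. A minimal toy version, assuming pre-registered inputs already at the target resolution (so the warping and downsampling operators reduce to identity), might look like the following; the cost, the regularizer, and the parameter names are all illustrative assumptions.

```python
import numpy as np

def weighted_sr_fuse(deblurred, weights, lam=0.01, iters=200, lr=0.4):
    # Gradient descent on  sum_k w_k ||x - y_k||^2 + lam ||grad x||^2,
    # where w_k reflects the reliability of the k-th deblurred image.
    x = np.average(deblurred, axis=0, weights=weights)
    for _ in range(iters):
        data_grad = sum(w * (x - y) for w, y in zip(weights, deblurred))
        # discrete Laplacian = gradient of the smoothness term (periodic edges)
        lap = (np.roll(x, 1, 0) + np.roll(x, -1, 0)
               + np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4 * x)
        x = x - lr * (2 * data_grad - 2 * lam * lap)
    return x
```

An unreliable kernel estimate simply gets a small w_k, so its deblurred image contributes less to the fused result.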
Recently we proposed frequency division multiplexed imaging (FDMI), which allows capturing multiple images in a
single shot through spatial modulation and frequency domain filtering. This is achieved by spatially modulating the
images so that different images or sub-exposures are placed at different locations in the Fourier domain. As long as there
is no overlap of the individual bands, we can recover different components by band-pass filtering the multiplexed image.
In this paper, we present a Texas Instruments DMD based implementation of FDMI. An image is formed on the DMD
chip; pixels are modulated by the micro-mirrors; and the modulated image is captured by a camera. By applying
modulation during a sub-exposure period, the corresponding sub-exposure image is ultimately recovered from the full-exposure image. Such a system could be used in a variety of applications, such as motion analysis and image deblurring.
We provide experimental results with this setup and discuss possible applications as well as limitations.
High speed videoendoscopy (HSV) is widely used for the assessment of vocal fold vibratory behavior. Due to the huge
volume of HSV data, an automated and accurate segmentation of the glottal opening is needed for objective quantification and analysis of vocal fold vibratory characteristics. In this study, a simplified dynamic programming based algorithm is presented for glottis segmentation. The underlying idea is to track the glottal edge in the gradient image, where the average gradient magnitude along the edge path is assumed to be maximal. To achieve accurate segmentation
results and enable further analysis, we addressed different aspects of the problem, including reflection removal, detection
of the posterior and anterior commissures, and determination of open and closed portions of the glottal area. Reflection removal, which is essential for robust segmentation, is also achieved by dynamic programming. The posterior and anterior commissures in each frame of HSV data help pre-define the range of the glottal area that needs to be segmented and
therefore decrease the segmentation cost. In addition to the proposed algorithm, three other methods (including active
contour, standard dynamic programming and fixed-threshold segmentation) have been implemented. The experimental
results show that the proposed algorithm is more efficient and accurate than the others.
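A toy version of the tracking idea, assuming a precomputed gradient-magnitude image and one edge point per row, could look like the sketch below; the one-column-per-row connectivity and the scoring are illustrative simplifications, not the paper's exact formulation.

```python
import numpy as np

def dp_track_edge(grad_mag):
    # Dynamic programming: find the top-to-bottom path whose cumulative
    # gradient magnitude is maximal (path length is fixed, so maximal
    # cumulative equals maximal average), moving at most one column per row.
    H, W = grad_mag.shape
    score = grad_mag[0].astype(float).copy()
    back = np.zeros((H, W), dtype=int)
    for r in range(1, H):
        new = np.empty(W)
        for c in range(W):
            lo, hi = max(0, c - 1), min(W, c + 2)
            p = lo + int(np.argmax(score[lo:hi]))  # best reachable predecessor
            new[c] = grad_mag[r, c] + score[p]
            back[r, c] = p
        score = new
    # backtrack from the best end column
    path = [int(np.argmax(score))]
    for r in range(H - 1, 0, -1):
        path.append(int(back[r, path[-1]]))
    return path[::-1]
```

On a gradient image with a clear diagonal ridge, the recovered path follows that ridge column by column.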
Measuring the type and amount of food intake of free-living (outside controlled clinical research centers) people
is an important task in nutrition research. One practical method, called the Remote Food Photography Method
(RFPM),1 is to provide camera-equipped smartphones to participants, who are trained to take pictures of
their foods and send these pictures to the researchers over a wireless network. These pictures can then be
analyzed by trained raters to accurately estimate food intake, though the process can be labor intensive. In this
paper, we describe a computer vision application to estimate food intake from the pictures captured and sent
by participants. We describe the application in detail, including segmentation, pattern classification, volume
estimation modules, and provide comprehensive experimental results to evaluate its performance.
The initial stage of many computer vision algorithms, such as object recognition and tracking, is to detect interest points in an image. Some of the existing interest point detection algorithms are robust to illumination variations to a certain extent. We have recently proposed the contrast stretching technique to improve the repeatability rate of the Harris corner detector under large illumination changes5. In this paper, the contrast stretching technique is incorporated into two scale-invariant interest point detectors, specifically the multi-scale Harris and multi-scale Hessian detectors. We show that, with the adoption of the contrast stretching technique, the performance of these detectors improves not only under illumination variations but also under variations of viewpoint, scale, blur, and compression. In addition, we discuss a GPU implementation of the proposed technique.
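The paper's exact stretching function is not reproduced here. The sketch below pairs a hypothetical percentile-based linear stretch with a basic single-scale Harris response, only to show where such a preprocessing step would sit in a detector pipeline.

```python
import numpy as np

def contrast_stretch(img, low_pct=1, high_pct=99):
    # Percentile-based linear contrast stretch to [0, 1] -- an illustrative
    # variant; the paper's stretching function may differ.
    lo, hi = np.percentile(img, [low_pct, high_pct])
    return np.clip((img - lo) / max(hi - lo, 1e-8), 0.0, 1.0)

def harris_response(img, k=0.04):
    # Harris corner response from finite-difference gradients and a
    # 3x3 box-smoothed structure tensor.
    Ix = np.gradient(img, axis=1)
    Iy = np.gradient(img, axis=0)
    def box(a):
        p = np.pad(a, 1, mode="edge")
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0
    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2
    tr = Sxx + Syy
    return det - k * tr ** 2

# detection would then run on the stretched image:
# resp = harris_response(contrast_stretch(dark_image))
```

The intuition is that stretching a dark or low-contrast frame toward full range keeps the corner response magnitudes comparable across illumination conditions, which improves repeatability.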
Focus stacking and high dynamic range (HDR) imaging are two paradigms of computational photography. Focus
stacking aims to produce an image with greater depth of field (DOF) from a set of images taken with different focus
distances, whereas HDR imaging aims to produce an image with higher dynamic range from a set of images taken
with different exposure settings. In this paper, we present an algorithm which combines focus stacking and HDR
imaging in order to produce an image with both higher dynamic range and greater DOF than any of the input
images. The proposed algorithm includes two main parts: (i) joint photometric and geometric registration and (ii)
joint focus stacking and HDR image creation. In the first part, images are first photometrically registered using an
algorithm that is insensitive to small geometric variations, and then geometrically registered using an optical flow
algorithm. In the second part, images are merged through weighted averaging, where the weights depend on both
local sharpness and exposure information. We provide experimental results with real data to illustrate the algorithm.
The algorithm is also implemented on a smartphone with Android operating system.
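As a hedged sketch of the merging step, the weights below combine a Laplacian-based sharpness cue with a mid-gray well-exposedness cue; the specific cues and their product form are illustrative assumptions, not the paper's exact rule.

```python
import numpy as np

def sharpness_weight(img):
    # Local sharpness via the absolute Laplacian response (periodic edges).
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)
    return np.abs(lap)

def exposure_weight(img, sigma=0.2):
    # Well-exposedness: pixels near mid-gray (0.5) get high weight.
    return np.exp(-((img - 0.5) ** 2) / (2 * sigma ** 2))

def merge_stack(images):
    # Weighted average of registered images; each pixel favors the input
    # that is both in focus and well exposed there.
    ws = [sharpness_weight(im) * exposure_weight(im) + 1e-8 for im in images]
    total = sum(ws)
    return sum(w * im for w, im in zip(ws, images)) / total
```

The small additive constant keeps the average defined in flat, uniformly exposed regions where both cues vanish.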
In this paper, we describe frequency division multiplexed imaging (FDMI), where multiple images are captured
simultaneously in a single shot and can later be extracted from the multiplexed image. This is achieved by
spatially modulating the images so that they are placed at different locations in the Fourier domain. The
technique assumes that the images are band-limited and they are placed at non-overlapping frequency regions
through the modulation process. The FDMI technique can be used for extracting sub-exposure information and
in applications where multiple cameras or captures are needed, such as high-dynamic-range and stereo imaging.
We present experimental results to illustrate the FDMI idea.
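The FDMI idea can be simulated numerically. In the sketch below, one image stays at baseband while the other is modulated by a horizontal cosine carrier, and both are recovered by Fourier-domain filtering. The carrier geometry and bandwidths are arbitrary choices for illustration, and the inputs are assumed band-limited, as the text requires.

```python
import numpy as np

def fdmi_multiplex(img_a, img_b, fc=0.25):
    # Keep img_a at baseband; modulate img_b with a horizontal cosine
    # carrier so its spectrum is copied to +-fc cycles/pixel.
    x = np.arange(img_a.shape[1])
    carrier = np.cos(2 * np.pi * fc * x)[None, :]
    return img_a + img_b * carrier

def fdmi_demultiplex(mux, fc=0.25, bw=0.1):
    # Recover both components: low-pass for the baseband image,
    # band-pass + demodulation + low-pass for the modulated one.
    H, W = mux.shape
    F = np.fft.fft2(mux)
    fx = np.broadcast_to(np.fft.fftfreq(W)[None, :], (H, W))
    lowpass = np.abs(fx) < bw
    img_a = np.real(np.fft.ifft2(F * lowpass))
    band = np.real(np.fft.ifft2(F * (np.abs(np.abs(fx) - fc) < bw)))
    x = np.arange(W)
    carrier = np.cos(2 * np.pi * fc * x)[None, :]
    # multiplying by the carrier moves the band back to baseband (and to
    # 2*fc, which the final low-pass removes); factor 2 restores amplitude
    demod = 2.0 * band * carrier
    img_b = np.real(np.fft.ifft2(np.fft.fft2(demod) * lowpass))
    return img_a, img_b
```

Recovery is exact only when the two spectra occupy disjoint bands; overlap between them is precisely the limitation the text mentions.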
A well-known technique in high dynamic range (HDR) imaging is to take multiple photographs, each one with
a different exposure time, and then combine them to produce an HDR image. Unless the scene is static and the
camera position is fixed, this process creates so-called "ghosting" artifacts. In order to handle non-static scenes or a moving camera, the images have to be spatially registered. This is a challenging problem because most optical flow estimation algorithms depend on the brightness constancy assumption, which clearly does not hold in HDR imaging. In this paper, we present an algorithm to estimate the dense motion field in image
sequences with photometric variations. In an alternating optimization scheme, the algorithm estimates both the
dense motion field and the photometric mapping. As latent information, occluded regions are extracted
and excluded from the photometric mapping estimation. We include experiments with both synthetic and real
imagery to demonstrate the efficacy of the proposed algorithm. We show that the ghosting artifacts are reduced
significantly in HDR imaging of non-static scenes.
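The alternating scheme can be illustrated with drastic simplifications: below, a global gain/bias model stands in for the photometric mapping, a brute-force integer translation stands in for the dense motion field, and occlusion handling is omitted. Every function here is a hypothetical sketch, not the paper's algorithm.

```python
import numpy as np

def estimate_photometric(ref, aligned):
    # Least-squares global gain/bias: aligned ~= a * ref + b.
    A = np.stack([ref.ravel(), np.ones(ref.size)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, aligned.ravel(), rcond=None)
    return a, b

def estimate_shift(ref, comp, max_shift=5):
    # Brute-force integer translation minimizing SSD -- a stand-in for
    # dense optical flow, to keep the alternation easy to follow.
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            err = np.mean((np.roll(comp, (dy, dx), axis=(0, 1)) - ref) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def alternate_register(ref, tgt, iters=3):
    # Alternate: photometrically compensate, estimate motion, then re-fit
    # the photometric map on the motion-compensated image.
    a, b = 1.0, 0.0
    for _ in range(iters):
        comp = (tgt - b) / a
        shift = estimate_shift(ref, comp)
        aligned = np.roll(tgt, shift, axis=(0, 1))
        a, b = estimate_photometric(ref, aligned)
    return shift, a, b
```

Once exposure differences are absorbed into the photometric map, the motion estimate no longer violates brightness constancy, which is the mechanism that suppresses ghosting.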
In this paper, we present a spatially adaptive method to reduce compression artifacts observed in block discrete
cosine transform (DCT) based image/video compression standards. The method is based on the bilateral filter,
which is very effective in denoising images without smoothing edges. When applied to reduce compression
artifacts, the parameters of the bilateral filter should be chosen carefully to have a good performance. To avoid
over-smoothing texture regions and to effectively eliminate blocking and ringing artifacts, in this paper, texture
regions and block boundary discontinuities are first detected; these are then used to control/adapt the spatial
and intensity parameters of the bilateral filter. Experiments show that the proposed method improves over the
standard non-adaptive bilateral filter visually and quantitatively.
The bilateral filter is a nonlinear filter that performs spatial averaging without smoothing edges; it has been shown to be an effective image denoising technique, in addition to its use in several other applications. There are two main contributions
of this paper. First, we provide an empirical study of the optimal parameter selection for the bilateral filter in
image denoising applications. Second, we present an extension of the bilateral filter: multi-resolution bilateral
filter, where bilateral filtering is applied to low-frequency subbands of a signal decomposed using an orthogonal
wavelet transform. Combined with wavelet thresholding, this new image denoising framework turns out to be
very effective in eliminating noise in real noisy images. We provide experimental results with both simulated
data and real data.
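A hedged sketch of the multi-resolution idea follows, with a simple two-band pyramid standing in for the orthogonal wavelet transform and soft thresholding on the detail band; the decomposition, parameters, and names are assumptions made for this illustration only.

```python
import numpy as np

def bilateral(img, sigma_s=2.0, sigma_r=0.1, radius=3):
    # Brute-force bilateral filter: spatial Gaussian times range Gaussian,
    # so averaging weights collapse across strong edges.
    H, W = img.shape
    pad = np.pad(img, radius, mode="edge")
    acc = np.zeros((H, W))
    norm = np.zeros((H, W))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nb = pad[radius + dy: radius + dy + H,
                     radius + dx: radius + dx + W]
            w = (np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                 * np.exp(-((nb - img) ** 2) / (2 * sigma_r ** 2)))
            acc += w * nb
            norm += w
    return acc / norm

def mr_bilateral_denoise(img, thresh=0.05):
    # Multi-resolution scheme: bilateral-filter the low-frequency band,
    # soft-threshold the detail band, then recombine.
    low = (img[0::2, 0::2] + img[0::2, 1::2]
           + img[1::2, 0::2] + img[1::2, 1::2]) / 4.0
    up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)
    detail = img - up
    low_f = bilateral(low)
    detail_t = np.sign(detail) * np.maximum(np.abs(detail) - thresh, 0.0)
    return np.repeat(np.repeat(low_f, 2, axis=0), 2, axis=1) + detail_t
```

The division of labor mirrors the paper's framework: the bilateral filter handles low-frequency noise, while thresholding suppresses high-frequency noise in the detail subband.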
Image demosaicing is a problem of interpolating full-resolution color images from so-called color-filter-array
(CFA) samples. Among various CFA patterns, the Bayer pattern has been the most popular choice, and demosaicing of the Bayer pattern has attracted renewed interest in recent years, partially due to the increased availability of source
codes/executables in response to the principle of "reproducible research". In this article, we provide a systematic
survey of over seventy published works in this field since 1999 (complementary to previous reviews22, 67).
Our review attempts to address important issues to demosaicing and identify fundamental differences among
competing approaches. Our findings suggest that most existing works belong to the class of sequential demosaicing, i.e., the luminance channel is interpolated first and the chrominance channels are then reconstructed from the recovered luminance information. We report our comparative study results with a collection of eleven competing
algorithms whose source codes or executables are provided by the authors. Our comparison is performed on
two data sets: Kodak PhotoCD (popular choice) and IMAX high-quality images (more challenging). While
most existing demosaicing algorithms achieve good performance on the Kodak data set, their performance on
the IMAX one (images with varying-hue and high-saturation edges) degrades significantly. Such observation
suggests the importance of properly addressing the issue of mismatch between assumed model and observation
data in demosaicing, which calls for further investigation on issues such as model validation, test data selection
and performance evaluation.
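The sequential strategy described above (luminance first, then chrominance) can be sketched with plain bilinear interpolation on an RGGB mosaic; this is a generic baseline for illustration, not any of the surveyed algorithms.

```python
import numpy as np

def conv2_same(img, k):
    # 2-D convolution, 'same' size, edge padding.
    kh, kw = k.shape
    pad = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2), mode="edge")
    out = np.zeros_like(img)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def demosaic_sequential(cfa):
    # Sequential bilinear demosaicing of an RGGB Bayer mosaic: green
    # (luminance proxy) first, then R-G / B-G chrominance differences.
    H, W = cfa.shape
    rm = np.zeros((H, W)); rm[0::2, 0::2] = 1  # red sites
    bm = np.zeros((H, W)); bm[1::2, 1::2] = 1  # blue sites
    gm = 1 - rm - bm                           # green sites
    # 1. interpolate green from its 4-neighbors at R/B sites
    kg = np.array([[0, .25, 0], [.25, 1, .25], [0, .25, 0]])
    g = conv2_same(cfa * gm, kg) / np.maximum(conv2_same(gm, kg), 1e-8)
    # 2. interpolate chrominance differences, guided by recovered green
    kc = np.ones((3, 3))
    r = g + conv2_same((cfa - g) * rm, kc) / np.maximum(conv2_same(rm, kc), 1e-8)
    b = g + conv2_same((cfa - g) * bm, kc) / np.maximum(conv2_same(bm, kc), 1e-8)
    return np.stack([r, g, b], axis=-1)
```

Interpolating color differences rather than raw R and B exploits inter-channel correlation, which is exactly where the survey finds the model-mismatch problem on high-saturation edges: the difference signal is no longer smooth there.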