Anisotropy-based robust focus measure for non-mydriatic retinal imaging

Andres G. Marrugo, María S. Millán, Hector C. Abril Baez, Gabriel Cristóbal, Salvador Gabarda

Journal of Biomedical Optics 17(7), 076021 (17 July 2012)
Abstract
Non-mydriatic retinal imaging is an important tool for diagnosis and progression assessment of ophthalmic diseases. Because it does not require pharmacological dilation of the patient's pupil, it is essential for screening programs performed by non-medical personnel. A typical camera is equipped with a manual focusing mechanism to compensate for the refractive errors in the eye. However, manual focusing is error prone, especially when performed by inexperienced photographers. In this work, we propose a new and robust focus measure based on a calculation of image anisotropy which, in turn, is evaluated from the directional variance of the normalized discrete cosine transform. Simulation and experimental results demonstrate the effectiveness of the proposed focus measure.

1. Introduction

Ocular fundus imaging has long played a key role in the documentation, diagnosis, and progression assessment of ophthalmic diseases. With the advent of digital cameras, ophthalmic imaging changed dramatically. Among the advantages of digital imaging are the ease and speed of access to data, fast and exact duplication, archiving and transmission, and digital image analysis techniques. Altogether, these advantages set the foundations for modern ophthalmology in the framework of telemedicine.

Fundus cameras can be mydriatic or non-mydriatic. Mydriatic fundus cameras require pharmacological dilation, while non-mydriatic cameras use a near-infrared (NIR) viewing system to exploit the patient’s natural dilation in a dark room.1 Infrared light is used to preview the retina on a video monitor. Once the monitor’s image is focused and aligned, a flash of visible light from a xenon arc lamp is fired and the image is captured. Non-mydriatic fundus cameras are equipped with a focusing mechanism that displaces a compensation lens. It is basically an aspheric objective lens design that, when combined with the optics of the eye, matches the image plane to the eye fundus. The focus control of the fundus camera is used to compensate for refractive errors in the subject’s eye. Until recently,2 these cameras were operated entirely manually, with the focusing mechanism assisted by a split-line visual aid. Manual focusing is error prone, especially when performed by inexperienced photographers, and may lead to images that require additional restoration or enhancement.3 The auto-focus feature offered in new retinal cameras is a significant advance that ultimately leads to a more robust imaging system, especially for medical screening purposes. However, the auto-focus feature still relies on the split-line mechanism, whereas in this work we propose a passive focus measure (FM) based entirely on image analysis. For further details on fundus imaging, the reader is referred to Refs. 1, 4, and 5.

In Ref. 6 we studied the performance of several state-of-the-art no-reference image quality metrics for eye fundus imaging. The most interesting finding was the close link between directional properties and image quality; in other words, anisotropy can serve as a quality metric. This was proposed by two co-authors of this paper (Gabarda and Cristóbal) in Ref. 7 and represents an important step forward in the area of no-reference quality metrics. However, given the properties of the NIR fundus focusing system, the FM should be robust to noise (spatial and temporal) and to changes in illumination and contrast. Furthermore, real-time imaging is constrained by the overhead required to compute the directional Rényi entropy as described in Ref. 7. Therefore, in this work we use our findings on how the directional content of eye fundus images degrades with defocus to define a new and robust FM based on the directional variance of the normalized discrete cosine transform (DCT). The FM proposed here could influence the design of portable retinal cameras with an autofocus function, or the manufacture of low-cost retinal cameras, because they would require fewer optical components.

1.1. Focusing

In a single-lens optical imaging system operating within the paraxial regime, focusing consists of adjusting the relative position of the object, the lens, the image sensor, or some combination of the three to obtain a focused image. Let f(x,y) be the focused image of a planar object and g_i(x,y) a sequence of images recorded for a sequence of camera parameter settings. The eye fundus is actually a curved surface; however, in our case f(x,y) corresponds to a small region of the fundus, so it can be considered an isoplanatic patch.8 We consider the variation of only one camera parameter at a time—either the lens position or the focal length. The acquired set of images can be expressed by the convolution

Eq. (1)

$$g_i(x,y) = (f \ast h_i)(x,y), \qquad i = 1, \ldots, m,$$
where h_i(x,y) is the point spread function (PSF) of the blur in the ith observation. In a practical imaging system, the image magnification and mean image brightness may change while focusing even if nothing has changed in the scene. Both parameters can in principle be normalized, although illumination normalization is the easier of the two; image magnification may simply be neglected because in most practical applications it is less than 3%.9 Ideally, h_i(x,y) = δ(x,y) and therefore g_i(x,y) = f(x,y). In practice, all h_i(x,y) have an unknown low-pass filtering effect.
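
As a toy illustration of Eq. (1), the following Python sketch generates a defocus sequence by convolving a focused image with progressively wider PSFs. The Gaussian PSFs are an assumption made here for simplicity; the paper itself simulates defocus by Fresnel propagation (Sec. 5.1).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_sequence(f, sigmas):
    """Simulate Eq. (1): g_i = f * h_i, with Gaussian PSFs of increasing
    width standing in for the unknown defocus PSFs h_i."""
    return [gaussian_filter(f.astype(float), sigma=s) for s in sigmas]

# Example: 15 frames from near-focus to strong defocus
# g = blur_sequence(f, sigmas=np.linspace(0.1, 8.0, 15))
```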

An FM may be understood as a functional defined on the image space that reflects the amount of blurring introduced by h_i(x,y). Let S be the FM with which we look for the “best” image by maximizing or minimizing S(g_i) over i = 1, …, m. A reasonable FM should be monotonic with respect to blur and robust to noise. Groen et al.10 used eight different criteria for the evaluation of focus functions. Ideally, the focus function should be unimodal, but in practice it can present several local maxima that can affect the convergence of the auto-focus procedure. Moreover, the focus curve should ideally be sharp at the top and long tailed, which can accelerate the convergence of the search procedure.
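
In code, this search reduces to an argmax over the acquired sequence; a minimal sketch, where frames and S are hypothetical names for the image sequence and any focus measure:

```python
# Pick the best-focused frame by maximizing the focus measure S over the sequence
best_index = max(range(len(frames)), key=lambda i: S(frames[i]))
```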

2. Related Works

Various FMs have been reported in the literature.2,9,11–13 They mainly consist of a focus-measuring operator that estimates the sharpness of the image; the image that yields the maximum FM is considered the focused one. Almost all FMs depend directly on the amount of high-frequency information in the image, as the high-frequency components correspond to edge information. Because well-focused images have sharper edges, they are expected to have higher-frequency content than blurred ones.13 However, the accuracy of these measures can vary with the content of the processed images. Common FMs are based on the norm of the gradient or second derivative of the image, the gray-level variance, and the Laplacian energy. Surprisingly, little is known about the performance of these methods for fundus imaging, and the literature on this subject is scarce.

To the best of our knowledge, there exist only two published works that deal with autofocusing in retinal imaging;2,14 however, both use conventional mydriatic imaging in the visible spectrum, which is not our case. In Ref. 14 the authors do not propose an FM; instead, they use several preprocessing operations to improve the performance of traditional FMs for segmentation purposes. In the recent work by Moscaritolo et al.,2 a filtering technique is proposed to assess the sharpness of optic nerve head images, but it is not compared with other methods. In this section we briefly summarize five notable approaches—including that of Moscaritolo et al.2—for later comparison with our proposed method.

The first FM, S_1, was proposed in Ref. 2. It may be defined mathematically as

Eq. (2)

$$S_1 = \mathrm{Var}\!\left( z_{\mathrm{med}}\!\left( \left| g_i - z_{\mathrm{lp}}(g_i) \right| \right) \right),$$
where z_lp(g_i) is a low-pass filtered version of g_i(x,y), z_med is a nonlinear median filter applied to the absolute difference |·| to remove noise, and Var(·) denotes the variance. Another important measure is the l_2-norm of the image gradient, also called the energy of gradient, defined as

Eq. (3)

$$S_2 = \sum_x \sum_y \left[ \frac{\partial g_i(x,y)}{\partial x} \right]^2 + \left[ \frac{\partial g_i(x,y)}{\partial y} \right]^2.$$
The third measure is the Laplacian energy. It can analyze high frequencies associated with image edges and is calculated as

Eq. (4)

$$S_3 = \sum_x \sum_y \left[ \nabla^2 g_i(x,y) \right]^2.$$
Nayar and Nakagawa15 proposed a noise-insensitive FM based on the summed modified Laplacian. When the two second partial derivatives in the horizontal and vertical directions have opposite signs, one offsets the other and the evaluated focus value is incorrect; their modification avoids this by summing the absolute value of each second partial derivative as

Eq. (5)

$$S_4 = \sum_x \sum_y \left[ \left| \frac{\partial^2 g_i(x,y)}{\partial x^2} \right| + \left| \frac{\partial^2 g_i(x,y)}{\partial y^2} \right| \right].$$
The frequency-selective weighted median (FSWM) filter16 is a high-pass nonlinear filter based on differences of medians. It is well known as a nonlinear edge detector that removes impulsive noise effectively. The FSWM combines several nonlinear subfilters, each weighted according to frequency, so that the whole acts like a bandpass filter:

Eq. (6)

$$z_F(x) = \sum_{j=1}^{P} \beta_j \hat{z}_j(x),$$
where z_F(x) is the FSWM filter output, P is the number of subfilters, β_j ∈ ℝ, and ẑ_j(x) is a weighted median filter. The FM is produced by summing the FSWM results F_x and F_y, obtained by applying the filter to the image along the horizontal and vertical directions:

Eq. (7)

$$S_5 = \sum_x \sum_y \left( F_x^2 + F_y^2 \right).$$

Subbarao and Tyan17 analyzed the robustness of three FMs: the image variance (not included here), S_2, and S_3. They recommended S_3 because of its tolerance to additive noise, although the differences among the individual measures were not significant. There are many other FMs, such as the wavelet-based FM proposed in Ref. 12 or the mid-frequency discrete cosine FM in Ref. 11, but they were not included in our study because of either their lack of robustness to noise or their complex implementation. For a review and evaluation of FMs in natural images, the reader is referred to Refs. 9 and 13.
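
For reference, a minimal sketch of the derivative-based measures S_2 to S_4, written with finite differences via SciPy; the discretization (np.gradient, a [1, -2, 1] second-difference kernel) is our choice here, not necessarily that of the cited works.

```python
import numpy as np
from scipy import ndimage

def s2_energy_of_gradient(g):
    """S2, Eq. (3): sum of squared first partial derivatives."""
    gy, gx = np.gradient(g.astype(float))
    return np.sum(gx**2 + gy**2)

def s3_laplacian_energy(g):
    """S3, Eq. (4): sum of squared Laplacian responses."""
    lap = ndimage.laplace(g.astype(float))
    return np.sum(lap**2)

def s4_summed_modified_laplacian(g):
    """S4, Eq. (5): absolute second derivatives summed so that opposite
    signs cannot cancel (modified Laplacian of Nayar and Nakagawa)."""
    g = g.astype(float)
    gxx = ndimage.convolve1d(g, [1.0, -2.0, 1.0], axis=1)
    gyy = ndimage.convolve1d(g, [1.0, -2.0, 1.0], axis=0)
    return np.sum(np.abs(gxx) + np.abs(gyy))
```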

3. Discrete Cosine Transform

The DCT is an invertible linear transformation T: ℝ^N → ℝ^N. An image is transformed to its spectral representation by projection onto a set of orthogonal two-dimensional (2-D) basis functions; the amplitudes of these projections are called the DCT coefficients. Let g(x,y), for x = 0, 1, 2, …, M−1 and y = 0, 1, 2, …, N−1, denote an M×N image, and let its DCT be denoted by T[g(x,y)] = G(u,v), given by

Eq. (8)

$$G(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} g(x,y)\, \alpha(u)\, \alpha(v) \cos\!\left[ \frac{(2x+1)u\pi}{2M} \right] \cos\!\left[ \frac{(2y+1)v\pi}{2N} \right],$$
where

Eq. (9)

$$\alpha(\xi; A) = \begin{cases} \sqrt{1/A}, & \xi = 0, \\ \sqrt{2/A}, & \text{otherwise}, \end{cases}$$
where A = M when ξ = u and A = N when ξ = v. Low-order basis functions represent low spatial frequencies, while those of higher orders represent the higher spatial frequencies (Fig. 1). Therefore, low-order coefficients depict slow spatial variations in image intensity, while those of higher orders depict rapid variations.
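
Equations (8) and (9) define the orthonormal 2-D DCT-II, which is exactly what scipy.fft.dctn computes with norm='ortho'; a minimal sketch:

```python
import numpy as np
from scipy.fft import dctn

def dct2(g):
    """Orthonormal 2-D DCT-II of an image, matching Eqs. (8)-(9)."""
    return dctn(g.astype(float), norm='ortho')

# Low-order coefficients sit in the top-left corner of the result and encode
# slow intensity variations; high-order ones encode fine detail (cf. Fig. 1).
```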

Fig. 1

Relationship between DCT coefficients and frequency components of an image.


The DCT is closely related to the discrete Fourier transform (DFT), a standard tool in signal processing, and has been reported as a suitable transform for spectral-based focusing algorithms.18 However, the DCT has a greater energy compaction property than the DFT, i.e., most of the image information tends to be concentrated in a few low-frequency DCT coefficients. This is also why the JPEG compression standard is based on the DCT. In addition, many efficient schemes for the computation of DCT exist,19 and hardware implementations are commonly available.20

3.1. Normalized DCT

The normalized DCT21 of an image is defined as

Eq. (10)

$$\tilde{G}(u,v) = \tilde{T}[g](u,v) = \frac{\left| T[g](u,v) \right|}{\sum_{(u,v)} \left| T[g](u,v) \right|}.$$
This normalization is important because it leads to invariance to changes in image contrast. To see this, let g′(x,y) = c g(x,y), where c is a non-zero scaling factor. Given that the DCT is linear, the normalized DCT of g′ is

Eq. (11)

$$\tilde{T}[g'](u,v) = \frac{c \left| T[g](u,v) \right|}{c \sum_{(u,v)} \left| T[g](u,v) \right|} = \tilde{T}[g](u,v),$$
which shows that the normalized DCT, and hence any measure based on it, is contrast invariant.
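
A minimal sketch of Eq. (10), with the invariance of Eq. (11) checked numerically:

```python
import numpy as np
from scipy.fft import dctn

def normalized_dct(g):
    """Eq. (10): DCT magnitudes divided by the sum of all magnitudes."""
    G = np.abs(dctn(g.astype(float), norm='ortho'))
    return G / G.sum()

# Contrast invariance, Eq. (11): scaling the image leaves the result unchanged.
g = np.random.rand(16, 16)
assert np.allclose(normalized_dct(g), normalized_dct(2.5 * g))
```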

To illustrate the nature of blurring and the behavior of the DCT, we take the red channel from a sharp RGB fundus image (because it most closely resembles the NIR image) and simulate the imaging system as a linear shift-invariant system, acquiring a sequence of images by varying the lens position. This was carried out by means of Fresnel propagation. In Fig. 2, we show the original sharp image, image patches of both the sharp and blurred images, and their DCT spectra (on the same log scale). Notice how the spectrum changes: there is less high- and mid-frequency content in the blurred image spectrum. In addition, in the original spectrum there are some favored orientations in the mid- and low-frequency coefficients, while in the blurred spectrum they become more uniformly distributed. Another important feature is that in the blurred spectrum the coefficients related to high frequencies have decreased significantly, and, as described in Sec. 2, many FMs are actually based on the idea of emphasizing high frequencies. While this may be true in theory, in practice there will always be noise contributing to the high-frequency content due to different acquisition conditions. Furthermore, given that the focusing mechanism involves acquiring a sequence of images, there will be spatial and temporal variations of noise.

Fig. 2

(a) Original sharp fundus image (R channel from RGB fundus image). (b) ROI from the sharp image and (c) its DCT spectrum. (d) ROI from the blurred image and (e) its DCT spectrum. For visualization purposes both spectra are shown in log scale. Coefficients with higher values are shown in red and those with lower values in blue. The blurred image spectrum is dominated by low-order coefficients.


4. Focus Measure

4.1. Measure of Anisotropy

As we have seen in the previous example, the overall nature of blurring can be described as a low-pass filtering that tends to break down the characteristic anisotropy of the original image. The FM proposed here aims to quantify this anisotropic dependence based on the normalized DCT of the image.

To define our measure, we introduce some notation. From Eq. (10), G̃(u,v) is the normalized DCT of g(x,y) of size N×N, and λ_j, for j = 1, 2, 3, is a vector along one of the three main orientations of the spectrum depicted in Fig. 3. We restrict our study to angular partitions of the spectrum roughly equivalent to the vertical, diagonal, and horizontal components of the image space. Our measure of anisotropy mainly consists in computing a difference of weighted coefficients along these orientations. Let G̃_j = {G̃(u,v) : θ = arctan(v/u), θ_j ≤ θ < θ_{j+1}}, for j = 1, 2, 3, be the set of DCT coefficients located between the angles θ_j and θ_{j+1}, with θ_j ∈ {0 deg, 30 deg, 60 deg, 90 deg}. The function ψ_{λ_j}(·) takes G̃_j as input, orthogonally projects all of its elements along the vector λ_j, and averages the elements that fall on the same discrete (u,v) coordinates after projection. With ψ_{λ_j}(·) we seek to compact the information around the three main orientations into a one-dimensional vector of N elements. To illustrate, let us compute ψ_{λ_1}(G̃_1) = [ψ_{λ_1}^1, ψ_{λ_1}^2, …, ψ_{λ_1}^N]^T, where G̃_1 is the set of coefficients located between θ_1 = 0 deg and θ_2 = 30 deg. In Fig. 3(b), we show the projection of the coefficient with coordinates (4,2) along λ_1. After projection, this coefficient has coordinates (4,1); therefore, the element ψ_{λ_1}^4 = mean[G̃(4,1), G̃(4,2)]. Consequently, we can stack all ψ_{λ_j} to form the matrix

$$\Psi = \begin{bmatrix} \psi_{\lambda_1}^1 & \psi_{\lambda_2}^1 & \psi_{\lambda_3}^1 \\ \psi_{\lambda_1}^2 & \psi_{\lambda_2}^2 & \psi_{\lambda_3}^2 \\ \vdots & \vdots & \vdots \\ \psi_{\lambda_1}^N & \psi_{\lambda_2}^N & \psi_{\lambda_3}^N \end{bmatrix}.$$
Note that the first element of each column corresponds to the dc coefficient. This coefficient does not convey any directional information about the image; however, we keep it in the matrix for the sake of completeness. To obtain a measure of anisotropy—the FM itself—from Ψ, we compute the variance of the weighted sums of its columns, given by the matrix product wΨ,

Eq. (12)

$$S_a(g) = \mathrm{Var}(w\Psi) = E\left[ (w\Psi - \mu)^2 \right],$$
where w = [w_1, w_2, …, w_N], E[·] is the expected value, and μ is the mean of the matrix product wΨ. The vector w can be regarded as a weighting scheme, through which we aim to achieve robustness to noise and illumination variation.
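
A simplified sketch of S_a follows. One caveat: the paper compacts each angular sector by orthogonal projection onto λ_j, whereas this sketch approximates that step by averaging each sector's coefficients over integer radial bins, which likewise yields an N-element vector per orientation. The sector boundaries and the final variance follow Eqs. (10) and (12).

```python
import numpy as np
from scipy.fft import dctn

def anisotropy_fm(g, w):
    """Approximate S_a, Eq. (12), for a square N x N patch g with weights w."""
    N = g.shape[0]
    G = np.abs(dctn(g.astype(float), norm='ortho'))
    G /= G.sum()                                  # normalized DCT, Eq. (10)
    u, v = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    theta = np.degrees(np.arctan2(v, u))          # angle of each coefficient
    radius = np.minimum(np.rint(np.hypot(u, v)).astype(int), N - 1)
    Psi = np.zeros((N, 3))                        # one column per orientation
    for j, (lo, hi) in enumerate([(0, 30), (30, 60), (60, 90)]):
        mask = (theta >= lo) & ((theta < hi) if j < 2 else (theta <= hi))
        for r in range(N):
            sel = mask & (radius == r)
            if sel.any():
                Psi[r, j] = G[sel].mean()         # stand-in for psi_lambda_j
    return np.var(w @ Psi)                        # variance of weighted column sums
```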

Fig. 3

(a) Vectors along the main directions of the DCT. (b) Projection of a coefficient along λ1.


4.2. DCT Coefficient Weighting

The first issue to address is the selection of a suitable w. In DCT-based pattern recognition, robustness is achieved by means of coefficient truncation.22 It is known that low frequencies are related to illumination variation and smooth regions, while high frequencies represent noise as well as small variations (such as edges and details) of the image. The middle-frequency coefficients contain useful information about the basic structure and are therefore suitable candidates for recognition.23 Consequently, a trade-off between low-frequency and high-frequency truncation should be struck to obtain an FM that is monotonic with respect to blur, unimodal, and at the same time robust to noise and illumination variations.

We decided to find a w that meets our requirements based on a training set of m images. This can be formulated as an optimization problem: the goal is to find the vector w = [w_1, w_2, …, w_N] that simultaneously optimizes K objective values {J_1(w), J_2(w), …, J_K(w)}. Every objective value J_k(w) is formulated so that the FM S_a decreases with respect to blur, S_a(g_i^k) > S_a(g_{i+1}^k) for all i = 1, …, m−1. There are K sequences g_i(x,y), all generated as described in Eq. (1), but each k stands for a different kind of noise degradation, except for k = 1, the noise-free case. In other words, we want to find a w that guarantees the monotonicity of S_a with respect to blur under different types of noise. The objective values are implicitly defined in terms of permutations of the ordered set H = {S_a(g_1), S_a(g_2), …, S_a(g_m)}. The reference permutation is π_r = {1, 2, …, m}, and any other permutation of H violates the decreasing property of S_a with respect to blur. As a result, our goal is to find a w that, for all K types of noise, produces permutations π_k equal to π_r. The objective value is defined as the l_1-norm of the difference between π_r and π_k,

Eq. (13)

$$J_k(w) := \sum_{j=1}^{m} \left| \pi_r(j) - \pi_k(j) \right|.$$
It is zero for two identical permutations and approaches zero as π_k approaches π_r. This holds for all J_k(w); hence our single aggregate objective function24 is the weighted linear sum of all J_k(w), with all weights equal to one.
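
A minimal sketch of one objective J_k, assuming fm(g, w) evaluates S_a on a single image: the ordering induced by the FM on a noise-corrupted blur sequence is compared with the reference ordering.

```python
import numpy as np

def objective_jk(w, noisy_sequence, fm):
    """Eq. (13): l1 distance between the reference permutation pi_r and the
    permutation pi_k induced by sorting the FM values in decreasing order."""
    scores = np.array([fm(g, w) for g in noisy_sequence])
    pi_k = np.argsort(-scores)                # image indices, best focus first
    pi_r = np.arange(len(noisy_sequence))     # reference: 0, 1, ..., m-1
    return int(np.abs(pi_r - pi_k).sum())     # zero when S_a decreases with blur

# Aggregate objective: J(w) = sum of J_k(w) over all K noise-degraded sequences.
```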

The solution to this problem is not straightforward, as the search space is multivariate and a unique global optimum cannot be guaranteed to exist. Therefore, we solved it using a probabilistic metaheuristic called simulated annealing,25 which provides an acceptably good solution in a fixed amount of time. Each step of the algorithm replaces the current solution by a random nearby solution, chosen with a probability that depends both on the difference between the corresponding function values and on a global parameter T (called the temperature), which is gradually decreased during the process. The dependency is such that the current solution changes almost randomly when T is large, but increasingly downhill as T goes to zero. (For further details see Ref. 24.)
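
The following is a generic simulated-annealing sketch for minimizing the aggregate objective J(w); the Gaussian neighborhood, geometric cooling, and step size are illustrative choices, not the settings used in this work.

```python
import numpy as np

def anneal_weights(J, w0, t0=1.0, cooling=0.995, steps=5000, seed=None):
    """Minimize J(w) by simulated annealing, starting from w0."""
    rng = np.random.default_rng(seed)
    w, cost, t = w0.copy(), J(w0), t0
    best_w, best_cost = w.copy(), cost
    for _ in range(steps):
        cand = np.clip(w + rng.normal(scale=0.05, size=w.shape), 0.0, None)
        c = J(cand)
        # Always accept downhill moves; accept uphill moves with probability
        # exp(-(c - cost)/T), so the walk is nearly random at high T.
        if c < cost or rng.random() < np.exp(-(c - cost) / max(t, 1e-12)):
            w, cost = cand, c
            if cost < best_cost:
                best_w, best_cost = w.copy(), cost
        t *= cooling
    return best_w
```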

4.3. Implementation

Typically, FMs are applied to a region called the focusing window, which is much smaller than the image. To achieve real-time computation, we decided to implement our measure by dividing the focusing window into subwindows. The measure is computed in the following manner (a code sketch follows the list):

1. The focusing window is divided into non-overlapping sub-images of size 16×16, chosen so that the most basic structures of the image fit within a subwindow.

2. Each subwindow image is transformed with the normalized DCT, and the FM S_a is computed.

3. An overall FM, S̄_a, is computed by taking the mean of all S_a values from the subwindows.
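
A minimal sketch of the three steps, reusing the anisotropy_fm sketch from Sec. 4.1 (with block = 16, so the weight vector w has 16 elements):

```python
import numpy as np

def overall_fm(window, w, block=16):
    """Mean of S_a over non-overlapping block x block tiles of the focusing window."""
    rows, cols = window.shape
    values = [anisotropy_fm(window[r:r + block, c:c + block], w)
              for r in range(0, rows - block + 1, block)
              for c in range(0, cols - block + 1, block)]
    return float(np.mean(values))
```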

According to this implementation, the parameter w consists of 16 elements. The noise degradations considered for the procedure described in Sec. 4.2 are Gaussian noise (σ² = 0.001), speckle noise (σ² = 0.001), and impulsive noise (d = 0.01). The resulting w is shown in Fig. 4. As expected, the first two coefficients are practically zero. What matters is the overall distribution of w rather than the individual value of each coefficient: a strong emphasis falls on the mid-frequency coefficients, and it is perhaps not surprising that the distribution resembles a bandpass filter. This finding is consistent with the work in Ref. 9, which showed that bandpass filtering causes FMs to have sharp peaks while retaining monotonicity and unimodality. Interestingly, these weights also resemble the bandpass response of the contrast sensitivity function of the human visual system; in the DCT domain, different approaches have been considered for computing visually optimized coefficients for a given image.26 A major feature of our approach is the fast computation of the FM. The average execution time per frame, for a MATLAB implementation on a PC with a 2.66-GHz Intel Core 2 Duo processor, is 40 ms. In most cases this is sufficient; if needed, implementation in a low-level programming language could significantly reduce the execution time. In addition, because we divide the focusing window into subwindows, our implementation could be further accelerated on large parallel architectures such as graphics processing units.

Fig. 4

DCT coefficient weights obtained from the optimization procedure. The distribution resembles a bandpass filter.


5. Results

5.1. Simulated Images and Robustness Assessment

To evaluate the robustness of our proposed FM S_a, we simulated the focusing procedure. We generated a sequence g_i(x,y), for i = 1, …, m, from the red channel of a sharp RGB fundus image and propagated it at different distances through a linear imaging system of fixed focal length by means of Fresnel propagation. This is equivalent to displacing the lens or the sensor to look for the optimal focus position. From this noise-free sequence, we generated six additional sequences by corrupting it with two levels of three different types of noise: Gaussian, speckle, and impulse noise. We carried out this procedure for 20 retinal images, for a total of 140 focusing sequences. Ideally, a noise-robust FM should produce the same (or a similar) focusing curve for both the noise-free and corrupted sequences. To quantify the similarity between two focusing curves S_r and S_c, we used the zero-lag normalized cross-correlation defined as

Eq. (14)

$$R(S_r, S_c) = \frac{\sum_i S_r(i)\, S_c(i)}{\sqrt{\sum_i S_r^2(i) \cdot \sum_i S_c^2(i)}},$$
where r denotes the reference curve computed from the noise-free sequence and c the curve computed from the noise-corrupted sequence. The output is 1 for perfect correlation and 0 for no correlation at all. The reason for the zero-lag calculation, as opposed to the regular cross-correlation by sliding dot product, is that we need the maxima of the curves to coincide in horizontal position as well as the profiles to match.
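
A one-function sketch of Eq. (14):

```python
import numpy as np

def zero_lag_ncc(s_ref, s_cor):
    """Eq. (14): zero-lag normalized cross-correlation of two focusing curves.
    Returns 1.0 for identical (up to a positive scale factor) curves."""
    s_ref = np.asarray(s_ref, dtype=float)
    s_cor = np.asarray(s_cor, dtype=float)
    return np.sum(s_ref * s_cor) / np.sqrt(np.sum(s_ref**2) * np.sum(s_cor**2))
```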

All FMs were computed using a focusing window of 128×128 pixels located over retinal structures. In Fig. 5, we show an example to illustrate the robustness assessment of the FMs. The FM curves represent the normalized measure value over the search space of lens positions. The highest value should be obtained when the lens is in the optimal focus position, identified by the dashed vertical line; as the lens moves away from the optimal position, the measure value should decrease proportionally to the distance. It comes as no surprise that all measures performed sufficiently well on the noise-free sequence shown in Fig. 5(a), where all curves follow a typical bell shape with a unique maximum. However, in the curves shown in Fig. 5(b)–5(d), where the focusing sequence is corrupted by different types of noise, the proposed FM S_a clearly outperforms the other measures in terms of monotonicity and unimodality. Notice that under Gaussian and speckle noise [Fig. 5(b) and 5(c)], the S_a curves are nearly identical to the noise-free S_a curve in Fig. 5(a); this result graphically shows the robustness of the proposed FM. The results for all 140 sequences are summarized in Table 1. Each value represents the average cross-correlation obtained for the 20 sequences corrupted with a specified type and level of noise for a particular FM; the overall average for each FM is shown in the last column. These results provide further evidence that the proposed FM S_a is considerably robust to noise, with an overall performance value of 0.929 and an exceptional 0.996 for the sequences corrupted with Gaussian noise of σ² = 0.001. The second- and third-best FMs were S_4 and S_1, with overall values of 0.781 and 0.502, respectively; compared with S_a, these values represent moderate to mild noise robustness. In the following section, we use these two FMs to compare with S_a on real images.

Fig. 5

Focus measure curves for the simulated images. The dashed vertical line indicates the correct focused position. (a) Noise-free images. (b) Gaussian noise (σ² = 0.001). (c) Speckle noise (σ² = 0.001). (d) Impulse noise (d = 0.01).


Table 1

Average normalized cross-correlation results for the noise robustness assessment of the focus measures, from 140 sequences generated from 20 retinal images corrupted with different types and levels of noise. The three best FMs (S_a, S_4, and S_1) are marked with an asterisk.

FM     Gaussian       Gaussian       Speckle        Speckle        Impulse      Impulse      Overall
       (σ² = 0.001)   (σ² = 0.005)   (σ² = 0.001)   (σ² = 0.005)   (d = 0.01)   (d = 0.05)   average
S1*    0.554          0.486          0.635          0.422          0.477        0.438        0.502
S2     0.524          0.499          0.468          0.408          0.476        0.462        0.473
S3     0.449          0.444          0.370          0.359          0.420        0.417        0.410
S4*    0.784          0.782          0.750          0.746          0.836        0.791        0.781
S5     0.495          0.380          0.495          0.304          0.795        0.362        0.472
Sa*    0.996          0.939          0.997          0.992          0.979        0.667        0.929

5.2. Real Images

In this subsection we show the results obtained from real NIR-focusing eye fundus images. The images have a relatively low signal-to-noise ratio (SNR), which justifies the need for a robust FM. All images were acquired using a digital fundus camera system (TRC-NW6S, Topcon, Tokyo, Japan), taking the video output from the infrared focusing system at a resolution of 640×480. The focusing system provides a compensation range of −13 D to +12 D in normal operation. For strong myopia or hyperopia, two additional compensation lenses are available, extending the range to −12 D to −33 D and +9 D to +40 D, respectively. The image sequences analyzed here were acquired by means of an in-house assembled motor mechanism that displaces the compensation lens across the whole range of normal operation.

It is well known that as a person ages, the crystalline lens of the eye gradually becomes opacified, obstructing the passage of light; this is called a cataract. A complete loss of transparency is observed only in advanced stages in untreated patients. In the early stages of cataract, retinal examination is considered practicable, although not without difficulty. For this reason, we decided to test our focusing method on healthy young subjects and on elderly subjects with first signs of cataract, not only to demonstrate its applicability to real images, but to assess its limitations as well. In this work we show results from five representative subjects of ages 27, 40, 68, 70, and 81 years, for a total of 10 eye fundi.

First, we show the effect of placing the focusing window on different regions of the retinal image. A retinal image has distinct sharp structures, such as the blood vessels and the optic disk, as opposed to the relatively uniform background. No FM is reliable unless the focusing window is placed on top of structures with edges, a fact easily appreciated from the three sets of focusing curves shown in Fig. 6, which were computed from the right eye fundus of the 27-year-old subject. The optimal focus position, identified by the dashed vertical line, was verified via the split-line focusing mechanism. The S_a curves computed from regions (b) and (c) are clearly reliable in terms of monotonicity and unimodality and coincide on the optimal focus position. Conversely, the S_1 and S_4 curves fail to produce reliable profiles, whereas the S_a curves display a steeper peak at the optimal focus position, evidence of the measure’s robustness to noise. In contrast, all measures computed from region (d) are unusable because they are dominated by noise.

Fig. 6

Focus measure curves obtained by placing the focusing window over different regions of the retinal image (a). The dashed vertical line indicates the correct focused position. Areas (b) and (c) are located over prominent retinal structures, whereas (d) is located over a relatively uniform region.


To illustrate the link between the focusing curves and image quality, in Fig. 7 we show three image details depicting the optic disk region at three different focusing positions. The image detail in Fig. 7(b) corresponds to the focused image [optimal focus position 11 in the S_a curve of Fig. 6]. Notice that this image is properly focused: it has sharp details such as the blood vessels. The other two images are blurred, demonstrating the consistency of the S_a curves with image quality or sharpness. The conclusion that emerges from this example is that, to effectively locate the best-focused image, homogeneous regions should be avoided. An adaptive technique, based on an edge detector for example, could prove useful for detecting such prominent structures and, therefore, candidate regions for applying the focusing technique automatically. The focusing curves shown hereafter, however, were all computed from a focusing window located manually over retinal structures.

Fig. 7

Image detail from Fig. 6 at different focusing positions: (a) position 6, (b) position 11 (optimal focus), and (c) position 15. The positions refer to the curves in Fig. 6(b) and 6(c).


To further analyze the performance of the FM, in Fig. 8 we show the focusing curves obtained from four of the five subjects; the ages are given in the figure caption. Of the four cases shown, only in one [Fig. 8(c)] did the S_a peak fail to coincide precisely with the optimal focus position, and even there the error is no more than a single position. The S_1 and S_4 curves are generally flatter than those of S_a, which is undesirable in a focus search strategy because it makes the optimum position difficult to distinguish in a coarse or initial search. From the curves in Fig. 8, we can also note that there appears to be little difference between the curves from young and elderly subjects. In Fig. 9, we show the focusing curves obtained from both eye fundi of the 81-year-old subject. This case is interesting in its own right because in the right eye [Fig. 9(a)] the crystalline lens has been extracted and replaced with an intraocular lens, whereas the left eye [Fig. 9(b)] is in an early stage of cataract. While both focusing curves successfully identify the optimal focus position, the curve in Fig. 9(b) is certainly flatter throughout most of the search space, most likely due to the difference in visibility and clarity between the two eyes. In general, from the comparison against S_1 and S_4, it can clearly be stated that the proposed FM S_a outperforms them in the considered cases.

Fig. 8

Focusing curves obtained from four subjects aged (a) 27, (b) 40, (c) 68, and (d) 70 years. The dashed vertical line indicates the correct focused position.


Fig. 9

Focusing curves obtained from the 81-year-old subject for each eye fundus. In the right eye (a), the crystalline lens has been extracted and replaced with an intraocular lens. The left eye (b) is in an early stage of cataract. The dashed vertical line indicates the correct focused position.


A close examination of the results reveals that the shape of the focusing curve is determined not exclusively by the degree of defocus, but also by the state of the subject’s eye and the analyzed region of the fundus. This is important because it conditions the strategy for searching for the optimal focus position. Finally, even though the results seem to indicate that the FM can be successfully applied to both young and elderly subjects, further research on a larger number and greater variety of subjects is necessary. Additionally, we encountered some difficulty with the elderly subjects related to sustaining fixation during the acquisition procedure: from an initial pool of six subjects, one was excluded from all calculations for this reason. Patient inability to establish fixation is a true challenge in fundus photography, and dealing with it is beyond the scope of this work.

6. Conclusion

In this paper, a new robust FM for non-mydriatic retinal imaging has been proposed. It is based on a measure of anisotropy, namely the weighted directional variance of the normalized DCT. The weights were calculated by means of an optimization procedure to maximize the noise robustness of the FM. Not only were the resulting weights in agreement with previous work,9 but they also provide a key insight into the design of noise-invariant FMs. By both simulation and real fundus imaging, we demonstrated the robustness and accuracy of the proposed FM, which clearly outperformed the other considered measures. The findings presented here may have a number of implications for the design and operation of auto-focusing in modern retinal cameras. Finally, in this study we included several young and elderly subjects to assess the limitations of the proposed FM. Even though we found no significant differences between their focusing curves, there was some difficulty in acquiring images from the elderly, mainly due to an inability to sustain fixation. As with all such studies, there are limitations that offer opportunities for further research; adapting our method to these variations within the patient population is a goal worth pursuing.

Acknowledgments

This research has been partly funded by the Spanish Ministerio de Economía y Competitividad and Fondos FEDER (project DPI2009-08879) and projects TEC2010-09834-E and TEC2010-20307. The first author also thanks the Spanish Ministerio de Educación for an FPU doctoral scholarship. The authors especially thank the Pérez-Cabré and the Sisquella-Cabré families for their cooperation in the experimental session.

References

1. T. J. Bennett and C. J. Barry, “Ophthalmic imaging today: an ophthalmic photographer’s viewpoint—a review,” Clin. Exp. Ophthalmol. 37(1), 2–13 (2009). http://dx.doi.org/10.1111/ceo.2009.37.issue-1

2. M. Moscaritolo et al., “An image based auto-focusing algorithm for digital fundus photography,” IEEE Trans. Med. Imag. 28(11), 1703–1707 (2009). http://dx.doi.org/10.1109/TMI.2009.2019755

3. A. G. Marrugo et al., “Retinal image restoration by means of blind deconvolution,” J. Biomed. Opt. 16(11), 116016 (2011). http://dx.doi.org/10.1117/1.3652709

4. R. Bernardes, P. Serranho, and C. Lobo, “Digital ocular fundus imaging: a review,” Ophthalmologica 226(4), 161–181 (2011). http://dx.doi.org/10.1159/000329597

5. M. D. Abramoff, M. K. Garvin, and M. Sonka, “Retinal imaging and image analysis,” IEEE Rev. Biomed. Eng. 3, 169–208 (2010). http://dx.doi.org/10.1109/RBME.2010.2084567

6. A. G. Marrugo et al., “No-reference quality metrics for eye fundus imaging,” in CAIP 2011, Lect. Notes Comput. Sci. 6854, 486–493 (2011). http://dx.doi.org/10.1007/978-3-642-23672-3

7. S. Gabarda and G. Cristóbal, “Blind image quality assessment through anisotropy,” J. Opt. Soc. Am. A 24(12), B42–B51 (2007). http://dx.doi.org/10.1364/JOSAA.24.000B42

8. P. Bedggood et al., “Characteristics of the human isoplanatic patch and implications for adaptive optics retinal imaging,” J. Biomed. Opt. 13(2), 024008 (2008). http://dx.doi.org/10.1117/1.2907211

9. M. Subbarao, T. S. Choi, and A. Nikzad, “Focusing techniques,” Opt. Eng. 32(11), 2824–2836 (1993). http://dx.doi.org/10.1117/12.147706

10. F. C. A. Groen, I. T. Young, and G. Ligthart, “A comparison of different focus functions for use in autofocus algorithms,” Cytometry 6(2), 81–91 (1985). http://dx.doi.org/10.1002/(ISSN)1097-0320

11. S. Y. Lee et al., “Enhanced autofocus algorithm using robust focus measure and fuzzy reasoning,” IEEE Trans. Circuits Syst. Video Technol. 18(9), 1237–1246 (2008). http://dx.doi.org/10.1109/TCSVT.2008.924105

12. J. Kautsky et al., “A new wavelet-based measure of image focus,” Pattern Recogn. Lett. 23(14), 1785–1794 (2002). http://dx.doi.org/10.1016/S0167-8655(02)00152-6

13. V. Aslantas and R. Kurban, “A comparison of criterion functions for fusion of multi-focus noisy images,” Opt. Commun. 282(16), 3231–3242 (2009). http://dx.doi.org/10.1016/j.optcom.2009.05.021

14. P. Liatsis and P. Kantartzis, “Real-time colour segmentation and autofocus in retinal images,” in ELMAR, 47th International Symposium, 13–18 (2005).

15. S. K. Nayar and Y. Nakagawa, “Shape from focus,” IEEE Trans. Pattern Anal. Mach. Intell. 16(8), 824–831 (1994). http://dx.doi.org/10.1109/34.308479

16. K. S. Choi, J. S. Lee, and S. J. Ko, “New autofocusing technique using the frequency selective weighted median filter for video cameras,” IEEE Trans. Consum. Electron. 45(3), 820–827 (1999). http://dx.doi.org/10.1109/30.793616

17. M. Subbarao and J.-K. Tyan, “Selecting the optimal focus measure for autofocusing and depth-from-focus,” IEEE Trans. Pattern Anal. Mach. Intell. 20(8), 864–870 (1998). http://dx.doi.org/10.1109/34.709612

18. N. N. K. Chern, P. A. Neow, and M. H. Ang, “Practical issues in pixel-based autofocusing for machine vision,” 2791–2796 (2001). http://dx.doi.org/10.1109/ROBOT.2001.933045

19. G. K. Wallace, “The JPEG still picture compression standard,” IEEE Trans. Consum. Electron. 38(1), 18–34 (1992). http://dx.doi.org/10.1109/30.125072

20. J. Ramirez et al., “A new architecture to compute the discrete cosine transform using the quadratic residue number system,” 321–324 (2000). http://dx.doi.org/10.1109/ISCAS.2000.857429

21. M. Kristan et al., “A Bayes-spectral-entropy-based measure of camera focus using a discrete cosine transform,” Pattern Recogn. Lett. 27(13), 1431–1439 (2006). http://dx.doi.org/10.1016/j.patrec.2006.01.016

22. Z. Lian and M. J. Er, “Illumination normalisation for face recognition in transformed domain,” Electron. Lett. 46(15), 1060–1061 (2010). http://dx.doi.org/10.1049/el.2010.1495

23. W. Chen, M. J. Er, and S. Wu, “Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain,” IEEE Trans. Syst. Man Cybern. B Cybern. 36(2), 458–466 (2006). http://dx.doi.org/10.1109/TSMCB.2005.857353

24. A. Suppapitnarm et al., “A simulated annealing algorithm for multiobjective optimization,” Eng. Optimiz. 33(1), 59–85 (2000). http://dx.doi.org/10.1080/03052150008940911

25. V. Granville, M. Krivanek, and J.-P. Rasson, “Simulated annealing: a proof of convergence,” IEEE Trans. Pattern Anal. Mach. Intell. 16(6), 652–656 (1994). http://dx.doi.org/10.1109/34.295910

26. A. B. Watson, “Perceptual optimization of DCT color quantization matrices,” in ICIP94, IEEE Int. Conf. Image Proc., 100–104 (1994). http://dx.doi.org/10.1109/ICIP.1994.413283
© 2012 Society of Photo-Optical Instrumentation Engineers (SPIE). 0091-3286/2012/$25.00
Andres G. Marrugo, María S. Millán, Hector C. Abril Baez, Gabriel Cristóbal, and Salvador Gabarda "Anisotropy-based robust focus measure for non-mydriatic retinal imaging," Journal of Biomedical Optics 17(7), 076021 (17 July 2012). https://doi.org/10.1117/1.JBO.17.7.076021