Straightforward path to Zernike polynomials

Anthony Yen

doi:10.1117/1.JMM.20.2.020501

28 April 2021

Journal of Micro/Nanopatterning, Materials, and Metrology, Vol. 20, Issue 2, 020501 (April 2021). https://doi.org/10.1117/1.JMM.20.2.020501

Abstract

Starting from Weierstrass’ approximation theorem, Zernike polynomials are obtained in a few straightforward steps that involve only recasting the aberration function as a double sum in polar coordinates, followed by the weighted orthogonalization of a power series. The origin of the name Fringe Zernike polynomials is also explained.

Zernike polynomials are used extensively in microlithography to characterize the imaging optics and in evaluating the resulting images. Yet few lithographers have questioned how these polynomials are obtained. Frits Zernike invented the eponymous circle polynomials as solutions of a self-adjoint differential equation subject to circular boundary conditions.¹⁻³ The angular parts of his solutions are simply $\cos m φ$ and $\sin m φ$ with $m \geq 0$, but the general expression for the radial part of the solution for $0 \leq r \leq 1$ looks quite daunting at first:

Eq. (1)

$$R_{n}^{m}(r) = \frac{r^{-m}}{\left(\frac{n-m}{2}\right)!} \left[\frac{d}{d(r^{2})}\right]^{(n-m)/2} \left\{ r^{n+m} (r^{2}-1)^{(n-m)/2} \right\} = \sum_{s=0}^{(n-m)/2} (-1)^{s} \frac{(n-s)!}{s! \left(\frac{n+m}{2}-s\right)! \left(\frac{n-m}{2}-s\right)!} r^{n-2s}.$$
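The explicit sum in Eq. (1) is less daunting once it is evaluated for a few cases. The following is a minimal sketch in Python, not part of the original letter; the function names `zernike_radial_coeffs` and `zernike_radial` are illustrative choices. It uses exact integer arithmetic, relying on the fact that each coefficient in the sum is a multinomial coefficient and therefore an integer.

```python
from fractions import Fraction
from math import factorial


def zernike_radial_coeffs(n, m):
    """Return {power: coefficient} for R_n^m(r) from the explicit sum in Eq. (1)."""
    if m < 0 or n < m or (n - m) % 2:
        raise ValueError("need n >= m >= 0 with n - m even")
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)
        coeffs[n - 2 * s] = num // den  # exact: the quotient is a multinomial coefficient
    return coeffs


def zernike_radial(n, m, r):
    """Evaluate R_n^m(r); the result is exact when r is an int or a Fraction."""
    return sum(c * r ** p for p, c in zernike_radial_coeffs(n, m).items())


print(zernike_radial_coeffs(4, 0))            # {4: 6, 2: -6, 0: 1}
print(zernike_radial(4, 0, Fraction(1, 2)))   # -1/8
```

For $n = 4$, $m = 0$ the sum gives $6 r^{4} - 6 r^{2} + 1$, the spherical aberration term.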

An alternate derivation of the above formulae is described in Born and Wolf.⁴ To understand both derivations fully, the reader has to be familiar with some specialized topics in mathematical physics. The purpose of this letter is to lessen the complexity and demonstrate that the Zernike polynomials can be obtained using straightforward mathematics in the three steps described below.

Let $W(x, y)$ be an aberration function or any function that is continuous within and on the unit circle. According to Weierstrass’ approximation theorem, $W(x, y)$ may be expressed as a polynomial to an arbitrary degree of accuracy:

Eq. (2)

$$W(x, y) = \lim_{N \to \infty} \sum_{p, q = 0}^{N} A_{pq} x^{p} y^{q},$$

where $p$ and $q$ are integers.⁵ Anticipating the split of the polynomials into radial and angular parts, our first step is to express $x^{p} y^{q}$ in polar coordinates. We accomplish this by letting $x = r \cos θ$ and $y = r \sin θ$ and making use of Euler’s formula and the binomial theorem:

$$\begin{aligned}
x^{p} y^{q} &= (r \cos θ)^{p} (r \sin θ)^{q} = r^{p+q} \left(\frac{e^{iθ} + e^{-iθ}}{2}\right)^{p} \left(\frac{e^{iθ} - e^{-iθ}}{2i}\right)^{q} \\
&= \frac{r^{p+q}}{2^{p+q} i^{q}} \sum_{s=0}^{p} \binom{p}{s} (e^{iθ})^{p-s} (e^{-iθ})^{s} \sum_{t=0}^{q} \binom{q}{t} (e^{iθ})^{q-t} (-e^{-iθ})^{t} \\
&= \frac{r^{p+q}}{2^{p+q} i^{q}} \sum_{s=0}^{p} \sum_{t=0}^{q} \binom{p}{s} \binom{q}{t} (-1)^{t} (e^{iθ})^{p+q-(s+t)} (e^{-iθ})^{s+t} \\
&= \frac{r^{p+q}}{2^{p+q} i^{q}} e^{i(p+q)θ} \sum_{l=0}^{p+q} C_{l} e^{-2ilθ},
\end{aligned}$$

where we have let $l = s + t$ and combined all the exponential terms with fixed $l$ into a single term with coefficient $C_{l}$. We then insert the above expression into Eq. (2) and get

Eq. (3)

$$W(x, y) = W(r, θ) = \sum_{m=0}^{\infty} r^{m} \sum_{l=0}^{m} C_{lm} e^{i(m-2l)θ},$$

where we have let $m = p + q$ and, just as before, combined all the exponential terms with fixed $m$ and $l$ into a single term with coefficient $C_{lm}$, which absorbs the coefficients $A_{pq}$ and the prefactors $1/(2^{p+q} i^{q})$.
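The Euler and binomial expansion of $x^{p} y^{q}$ can be checked numerically. The sketch below is an illustration only; the helper names `exponent_coeffs` and `monomial_polar` are assumptions made here. It builds the coefficients $C_{l}$ by collecting all terms with $s + t = l$ and then evaluates the final line of the expansion for comparison with the direct product.

```python
import cmath
from math import comb, cos, sin


def exponent_coeffs(p, q):
    """C_l = sum over s + t = l of C(p, s) * C(q, t) * (-1)**t, for l = 0..p+q."""
    c = [0] * (p + q + 1)
    for s in range(p + 1):
        for t in range(q + 1):
            c[s + t] += comb(p, s) * comb(q, t) * (-1) ** t
    return c


def monomial_polar(p, q, r, theta):
    """Evaluate x^p y^q through the last line of the Euler/binomial expansion."""
    n = p + q
    series = sum(c * cmath.exp(-2j * l * theta)
                 for l, c in enumerate(exponent_coeffs(p, q)))
    return r ** n / (2 ** n * 1j ** q) * cmath.exp(1j * n * theta) * series


r, theta = 0.7, 0.3
direct = (r * cos(theta)) ** 2 * (r * sin(theta))
print(exponent_coeffs(2, 1))                            # [1, 1, -1, -1]
print(abs(monomial_polar(2, 1, r, theta) - direct) < 1e-12)
```

For $p = 2$, $q = 1$ the coefficients are $1, 1, -1, -1$, so the only exponentials that appear are $e^{\pm iθ}$ and $e^{\pm 3iθ}$, in line with the pattern $e^{i(m-2l)θ}$ of Eq. (3) for $m = 3$.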

Our second step is to rearrange the terms in Eq. (3). To start, let us write down its first few terms, say from $m = 0$ to $m = 4$. They are

$$\begin{aligned}
& C_{00} + r (C_{01} e^{iθ} + C_{11} e^{-iθ}) + r^{2} (C_{02} e^{i2θ} + C_{12} + C_{22} e^{-i2θ}) \\
&+ r^{3} (C_{03} e^{i3θ} + C_{13} e^{iθ} + C_{23} e^{-iθ} + C_{33} e^{-i3θ}) \\
&+ r^{4} (C_{04} e^{i4θ} + C_{14} e^{i2θ} + C_{24} + C_{34} e^{-i2θ} + C_{44} e^{-i4θ}).
\end{aligned}$$

These terms can be rearranged so that the following pattern can be seen:

$$\begin{aligned}
& \{ C_{00} + r (C_{01} e^{iθ} + C_{11} e^{-iθ}) + r^{2} (C_{02} e^{i2θ} + C_{22} e^{-i2θ}) + r^{3} (C_{03} e^{i3θ} + C_{33} e^{-i3θ}) + r^{4} (C_{04} e^{i4θ} + C_{44} e^{-i4θ}) \} \\
&+ \{ r^{2} C_{12} + r^{3} (C_{13} e^{iθ} + C_{23} e^{-iθ}) + r^{4} (C_{14} e^{i2θ} + C_{34} e^{-i2θ}) \} \\
&+ \{ r^{4} C_{24} \}.
\end{aligned}$$

Continuing this process and making use of Euler’s formula, Eq. (3) can be expressed as

$$W(r, θ) = \sum_{m=0}^{\infty} r^{m} (A_{m} \cos mθ + B_{m} \sin mθ) + \sum_{m=0}^{\infty} r^{2} r^{m} (A'_{m} \cos mθ + B'_{m} \sin mθ) + \sum_{m=0}^{\infty} r^{4} r^{m} (A''_{m} \cos mθ + B''_{m} \sin mθ) + \dots$$

Grouping all the cosine and sine terms together, we have

Eq. (4)

$$\begin{aligned}
W(r, θ) &= \sum_{m=0}^{\infty} \cos mθ \, r^{m} (A_{m} + A'_{m} r^{2} + A''_{m} r^{4} + \dots) + \sum_{m=0}^{\infty} \sin mθ \, r^{m} (B_{m} + B'_{m} r^{2} + B''_{m} r^{4} + \dots) \\
&= \sum_{m=0}^{\infty} \cos mθ \, r^{m} \sum_{k=0}^{\infty} A_{mk} r^{2k} + \sum_{m=0}^{\infty} \sin mθ \, r^{m} \sum_{k=0}^{\infty} B_{mk} r^{2k}.
\end{aligned}$$

In reaching the above expression, no requirement of rotational symmetry about an axis had to be imposed.
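As a small worked illustration of the form of Eq. (4) (an example added here, not taken from the original derivation), consider the single monomial $x^{2} y$. Its polar form contains only sine terms with $m = 1$ and $m = 3$, each multiplied by $r^{m}$ times a power of $r^{2}$:

```latex
\begin{aligned}
x^{2} y &= r^{3} \cos^{2}\theta \, \sin\theta
         = \frac{r^{3}}{4} \left( \sin\theta + \sin 3\theta \right) \\
        &= \sin\theta \; r \left( \tfrac{1}{4} r^{2} \right)
         + \sin 3\theta \; r^{3} \left( \tfrac{1}{4} \right),
\end{aligned}
\qquad\text{so that}\qquad B_{11} = \tfrac{1}{4}, \quad B_{30} = \tfrac{1}{4},
```

with all other $A_{mk}$ and $B_{mk}$ equal to zero. The identity $\cos^{2}θ \sin θ = (\sin θ + \sin 3θ)/4$ follows from the product-to-sum formulas.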

Our third and last step involves expressing $r^{2k}$ as a linear combination of orthogonal polynomials satisfying the orthogonality relation over the interval $[0, 1]$. Once this is accomplished, both summations over $k$ can be expressed as linear combinations of these polynomials. Therefore, the first thing to do is to obtain these orthogonal radial polynomials (actually the Zernike radial polynomials) by orthogonalizing the set $\{1, r^{2}, r^{4}, \dots, r^{2k}, \dots\}$. We do this by first letting $r^{2} = u$ so that the orthogonalization is carried out on the set $\{1, u, u^{2}, \dots, u^{k}, \dots\}$. We may then associate the orthogonalization of this set with the shifted Legendre polynomials $P_{k}(u)$. The $P_{k}(u)$’s can be obtained through the Gram-Schmidt orthogonalization process on the above set or simply by making use of the formula⁶

$$P_{k}(u) = \frac{1}{k!} \frac{d^{k}}{du^{k}} \left\{ u^{k} (u-1)^{k} \right\}.$$

(See also Ref. 5, pp. 233–239. The Legendre polynomials discussed in that text are defined on the interval $[-1, 1]$ and the associated formula is called Rodrigues’ formula.) The first three shifted Legendre polynomials are $P_{0}(u) = 1$, $P_{1}(u) = 2u - 1$, and $P_{2}(u) = 6u^{2} - 6u + 1$. Therefore, we may express 1 as $P_{0}(u)$, $u$ as $\frac{P_{1}(u)}{2} + \frac{P_{0}(u)}{2}$, $u^{2}$ as $\frac{P_{2}(u)}{6} + \frac{P_{1}(u)}{2} + \frac{P_{0}(u)}{3}$, and so on. Hence any linear combination of powers of $u$ can be expressed as a linear combination of $P_{k}(u)$’s. There is only one catch, however. We have to include the common factor $r^{m} = u^{m/2}$ in Eq. (4) in the orthogonalization process, so if $G_{k}^{m}(u)$’s are the resulting polynomials, our orthogonality relation has to be, instead of $\int_{0}^{1} P_{k}(u) \cdot P_{k'}(u) \, du = \mathrm{Const.} \, δ_{kk'}$,

Eq. (5)

$$\int_{0}^{1} u^{m/2} G_{k}^{m}(u) \cdot u^{m/2} G_{k'}^{m}(u) \, du = \int_{0}^{1} G_{k}^{m}(u) G_{k'}^{m}(u) u^{m} \, du = \mathrm{Const.} \, δ_{kk'}.$$

The second integral in Eq. (5) suggests that the presence of the factor $u^{m/2}$ may be regarded as orthogonalizing the set $\{1, u, u^{2}, \dots, u^{k}, \dots\}$ with the weight $u^{m}$. The formula for the polynomials obtained by orthogonalizing the set $\{1, u, u^{2}, \dots\}$ with the weight equal to $u^{m}$ instead of 1 (which would result in shifted Legendre polynomials) is given as

Eq. (6)

$$G_{k}^{m}(u) = \frac{1}{k!} \frac{1}{u^{m}} \frac{d^{k}}{du^{k}} \left\{ u^{m} u^{k} (u-1)^{k} \right\}.$$
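Eq. (6) can be carried out mechanically on a list of polynomial coefficients. The following is a minimal sketch under that reading of the formula; the function name `g_poly` and the lowest-power-first coefficient layout are choices made for this illustration. It expands $u^{m+k} (u-1)^{k}$, differentiates $k$ times, and divides by $k!$ and by $u^{m}$.

```python
from fractions import Fraction
from math import comb, factorial


def g_poly(k, m):
    """Coefficients [c_0, ..., c_k] of G_k^m(u) from Eq. (6), lowest power first."""
    # Expand u^(m+k) * (u - 1)^k; the list index is the power of u.
    inner = [Fraction(0)] * (m + 2 * k + 1)
    for j in range(k + 1):
        inner[m + k + j] = Fraction(comb(k, j) * (-1) ** (k - j))
    # Differentiate k times with respect to u.
    for _ in range(k):
        inner = [power * c for power, c in enumerate(inner)][1:]
    # Divide by k! and by u^m (the m lowest coefficients are zero by construction).
    return [c / factorial(k) for c in inner[m:]]


print([int(c) for c in g_poly(2, 0)])   # [1, -6, 6]
print([int(c) for c in g_poly(1, 1)])   # [-2, 3]
print([int(c) for c in g_poly(2, 1)])   # [3, -12, 10]
```

With $m = 0$ the result is the shifted Legendre polynomial $6u^{2} - 6u + 1$. With $m = 1$ the results $3u - 2$ and $10u^{2} - 12u + 3$ become, after multiplying by $r$ and setting $u = r^{2}$, the radial parts of the coma terms in Table 1.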

The validity of Eq. (6) can be established as follows. First, the polynomial so generated is of degree $k$: the term of the highest power inside the brackets to be differentiated is $u^{m+2k}$, which becomes $u^{m+k}$ after $k$ differentiations and $u^{k}$ after division by $u^{m}$. Second, the following integral is valid (Ref. 6, p. 324):

$$\int_{0}^{1} G_{k}^{m}(u) \, u^{k'} u^{m} \, du = 0, \quad 0 \leq k' < k.$$

Since $G_{k'}^{m}(u)$ is a linear combination of powers of $u$ with $u^{k'}$ being the term of the highest power, Eq. (5) therefore stands.
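The weighted orthogonalization behind Eq. (5) can also be done directly by Gram-Schmidt, without using Eq. (6). The sketch below is one way to do so with exact rational arithmetic; the function names, and the normalization that scales each polynomial to the value 1 at $u = 1$, are assumptions made here (the normalization is chosen because it reproduces the entries of Table 1). The inner product uses $\int_{0}^{1} u^{i+j+m} \, du = 1/(i+j+m+1)$.

```python
from fractions import Fraction


def inner_product(a, b, m):
    """Integral over [0, 1] of a(u) * b(u) * u^m for coefficient lists (lowest power first)."""
    return sum(
        ai * bj * Fraction(1, i + j + m + 1)
        for i, ai in enumerate(a)
        for j, bj in enumerate(b)
    )


def weighted_gram_schmidt(kmax, m):
    """Orthogonalize 1, u, ..., u^kmax with weight u^m; scale each result to 1 at u = 1."""
    basis = []
    for k in range(kmax + 1):
        p = [Fraction(0)] * k + [Fraction(1)]  # start from the monomial u^k
        for b in basis:
            proj = inner_product(p, b, m) / inner_product(b, b, m)
            padded = b + [Fraction(0)] * (len(p) - len(b))
            p = [pc - proj * bc for pc, bc in zip(p, padded)]
        scale = sum(p)  # value of the polynomial at u = 1
        basis.append([c / scale for c in p])
    return basis


print([[int(c) for c in p] for p in weighted_gram_schmidt(2, 0)])  # [[1], [-1, 2], [1, -6, 6]]
print([[int(c) for c in p] for p in weighted_gram_schmidt(2, 1)])  # [[1], [-2, 3], [3, -12, 10]]
```

With weight 1 ($m = 0$) the procedure returns the shifted Legendre polynomials quoted earlier; with weight $u$ ($m = 1$) it returns $3u - 2$ and $10u^{2} - 12u + 3$, which agree with what Eq. (6) gives for $G_{1}^{1}$ and $G_{2}^{1}$.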

Now with powers of $u$ represented by $G_{k}^{m}(u)$’s, Eq. (4) can be recast as

$$W(r, θ) = \sum_{m, k = 0}^{\infty} C_{k}^{m} r^{m} G_{k}^{m}(r^{2}) \cos mθ + \sum_{m, k = 0}^{\infty} D_{k}^{m} r^{m} G_{k}^{m}(r^{2}) \sin mθ,$$

where $C_{k}^{m}$ and $D_{k}^{m}$ are the new coefficients.

The Zernike polynomial is simply $Z_{k}^{m}(r, θ) = r^{m} G_{k}^{m}(r^{2}) \left\{ \begin{matrix} \cos mθ \\ \sin mθ \end{matrix} \right. = R_{k}^{m}(r) \left\{ \begin{matrix} \cos mθ \\ \sin mθ \end{matrix} \right.$, where $R_{k}^{m}(r) = r^{m} G_{k}^{m}(r^{2})$ is called the Zernike radial polynomial. Since the angular parts are already orthogonal, as

$$\int_{0}^{2π} \cos mθ \sin m'θ \, dθ = 0, \quad \int_{0}^{2π} \cos mθ \cos m'θ \, dθ = π (1 + δ_{m0}) δ_{mm'}, \quad \int_{0}^{2π} \sin mθ \sin m'θ \, dθ = π (1 - δ_{m0}) δ_{mm'},$$

and since the $R_{k}^{m}(r)$’s satisfy

$$\int_{0}^{1} R_{k}^{m}(r) R_{k'}^{m}(r) \, d(r^{2}) = \int_{0}^{1} G_{k}^{m}(r^{2}) G_{k'}^{m}(r^{2}) r^{2m} \, d(r^{2}) = \mathrm{Const.} \, δ_{kk'}$$

because of Eq. (5), the $Z_{k}^{m}(r, θ)$’s therefore satisfy the orthogonality relation over an area bounded by the unit circle, as

$$\int_{0}^{1} \int_{0}^{2π} Z_{k}^{m}(r, θ) Z_{k'}^{m'}(r, θ) \, r \, dr \, dθ = \mathrm{Const.} \, δ_{kk'} δ_{mm'}.$$

The explicit expression for the Zernike radial polynomials can now be written down immediately as

$$R_{k}^{m}(r) = r^{m} G_{k}^{m}(r^{2}) = \frac{r^{-m}}{k!} \frac{d^{k}}{d(r^{2})^{k}} \left\{ r^{2(m+k)} (r^{2}-1)^{k} \right\}.$$

Defining $n = m + 2k$ brings us to Eq. (1) put forth originally by Frits Zernike.
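As a worked check added for illustration, take $m = 1$ and $k = 1$, so that $n = m + 2k = 3$. Eq. (6) and the explicit sum in Eq. (1) then give the same radial polynomial:

```latex
\begin{aligned}
G_{1}^{1}(u) &= \frac{1}{1!} \, \frac{1}{u} \, \frac{d}{du} \left\{ u \cdot u \, (u - 1) \right\}
              = \frac{1}{u} \left( 3u^{2} - 2u \right) = 3u - 2, \\
R_{1}^{1}(r) &= r \, G_{1}^{1}(r^{2}) = 3r^{3} - 2r, \\
\text{Eq. (1), } n = 3,\ m = 1: \quad
R_{3}^{1}(r) &= \frac{3!}{0! \, 2! \, 1!} \, r^{3} - \frac{2!}{1! \, 1! \, 0!} \, r = 3r^{3} - 2r .
\end{aligned}
```

This is the radial part of the coma terms, entries 7 and 8 of Table 1.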

Incidentally, using $k$ instead of $n$ to index the Zernike polynomials is not a bad thing. One advantage is that $k$ is independent of $m$ . The ordering sequence of the Zernike polynomials used by Zeiss and ASML is a modified version of the indexing scheme originated at the University of Arizona. We can learn the origin of this Fringe indexing scheme from Katherine Creath and Robert E. Parks’ article:⁷ “The first program for analyzing interferograms was written by Jim Rancourt, PhD 1974 (Fig. 11),[19]… Later, Loomis, PhD 1980, wrote a FRINGE MANUAL, and updated the program to output the 37 “FRINGE” Zernike polynomials,[20] and the beginning of the confusion about whose numbering of the polynomials one might be using.” Citation [19] in their article is: Optical Sciences Center, “FRINGE Software Program,” OSC Newsletter 8(12), 29 (1974). Citation [20] refers to John S. Loomis, FRINGE User’s Manual, Optical Sciences Center, University of Arizona, Tucson, AZ, November 1976. Hence we believe that it was John Loomis who invented this indexing scheme in conjunction with the wavefront-fitting program called FRINGE, originally written by Jim Rancourt. It is therefore a gross misnomer that the Zernike polynomials we lithographers use are often referred to as Fringe Zernike polynomials, as if there are various sets of such polynomials; it is the “Fringe” indexing scheme of the one and only set of Zernike polynomials!

The indexing scheme used by Zeiss and ASML is shown in Fig. 1. As one can see, rows are arranged in ascending order of $m + k$. Since the power of every radial polynomial is $n = m + 2k$ and since $(m + 2k) + m = 2(m + k)$ is fixed for every row, the rightmost entry of every row, with $m = 0$, has the highest power. Table 1 lists explicitly the Zernike polynomials according to this indexing scheme.

Fig. 1

Indexing scheme of Zernike polynomials used by Zeiss and ASML. These plots were originally generated by Marco Moers.

Table 1

Explicit expressions of the first 36 Zernike polynomials.

Index	Mathematical expression	Name	m (period)	m+k	n=m+2k (power)
1	1	Piston	0	0	0
2	$r \cos θ$	Tilt $x$	1	1	1
3	$r \sin θ$	Tilt $y$	1	1	1
4	$2 r^{2} - 1$	Focus	0	1	2
5	$r^{2} \cos 2 θ$	Astigmatism $x$	2	2	2
6	$r^{2} \sin 2 θ$	Astigmatism $y$	2	2	2
7	$(3 r^{3} - 2 r) \cos θ$	Coma $x$	1	2	3
8	$(3 r^{3} - 2 r) \sin θ$	Coma $y$	1	2	3
9	$6 r^{4} - 6 r^{2} + 1$	Spherical aberration	0	2	4
10	$r^{3} \cos 3 θ$	Three-fold $x$	3	3	3
11	$r^{3} \sin 3 θ$	Three-fold $y$	3	3	3
12	$(4 r^{4} - 3 r^{2}) \cos 2 θ$	Astigmatism $x$	2	3	4
13	$(4 r^{4} - 3 r^{2}) \sin 2 θ$	Astigmatism $y$	2	3	4
14	$(10 r^{5} - 12 r^{3} + 3 r) \cos θ$	Coma $x$	1	3	5
15	$(10 r^{5} - 12 r^{3} + 3 r) \sin θ$	Coma $y$	1	3	5
16	$20 r^{6} - 30 r^{4} + 12 r^{2} - 1$	Spherical aberration	0	3	6
17	$r^{4} \cos 4 θ$	Four-fold $x$	4	4	4
18	$r^{4} \sin 4 θ$	Four-fold $y$	4	4	4
19	$(5 r^{5} - 4 r^{3}) \cos 3 θ$	Three-fold $x$	3	4	5
20	$(5 r^{5} - 4 r^{3}) \sin 3 θ$	Three-fold $y$	3	4	5
21	$(15 r^{6} - 20 r^{4} + 6 r^{2}) \cos 2 θ$	Astigmatism $x$	2	4	6
22	$(15 r^{6} - 20 r^{4} + 6 r^{2}) \sin 2 θ$	Astigmatism $y$	2	4	6
23	$(35 r^{7} - 60 r^{5} + 30 r^{3} - 4 r) \cos θ$	Coma $x$	1	4	7
24	$(35 r^{7} - 60 r^{5} + 30 r^{3} - 4 r) \sin θ$	Coma $y$	1	4	7
25	$70 r^{8} - 140 r^{6} + 90 r^{4} - 20 r^{2} + 1$	Spherical aberration	0	4	8
26	$r^{5} \cos 5 θ$	Five-fold $x$	5	5	5
27	$r^{5} \sin 5 θ$	Five-fold $y$	5	5	5
28	$(6 r^{6} - 5 r^{4}) \cos 4 θ$	Four-fold $x$	4	5	6
29	$(6 r^{6} - 5 r^{4}) \sin 4 θ$	Four-fold $y$	4	5	6
30	$(21 r^{7} - 30 r^{5} + 10 r^{3}) \cos 3 θ$	Three-fold $x$	3	5	7
31	$(21 r^{7} - 30 r^{5} + 10 r^{3}) \sin 3 θ$	Three-fold $y$	3	5	7
32	$(56 r^{8} - 105 r^{6} + 60 r^{4} - 10 r^{2}) \cos 2 θ$	Astigmatism $x$	2	5	8
33	$(56 r^{8} - 105 r^{6} + 60 r^{4} - 10 r^{2}) \sin 2 θ$	Astigmatism $y$	2	5	8
34	$(126 r^{9} - 280 r^{7} + 210 r^{5} - 60 r^{3} + 5 r) \cos θ$	Coma $x$	1	5	9
35	$(126 r^{9} - 280 r^{7} + 210 r^{5} - 60 r^{3} + 5 r) \sin θ$	Coma $y$	1	5	9
36	$252 r^{10} - 630 r^{8} + 560 r^{6} - 210 r^{4} + 30 r^{2} - 1$	Spherical aberration	0	5	10
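The ordering of Table 1 follows a simple rule: rows of constant $m + k$ in ascending order, with $m$ decreasing along each row and the cosine term listed before the sine term. The sketch below reproduces the index, $m$, $k$, and $n$ columns from that rule. It is an illustration only; the function name `fringe_table` and the labels `"cos"`, `"sin"`, and `"1"` are choices made here, and the sketch covers only the 36 entries listed in Table 1.

```python
def fringe_table(max_sum=5):
    """List (index, m, k, n, angular) tuples ordered as in Table 1."""
    rows = []
    index = 1
    for total in range(max_sum + 1):        # total = m + k, constant along a row of Fig. 1
        for m in range(total, -1, -1):      # m decreases, so n = m + 2k increases
            k = total - m
            n = m + 2 * k
            for angular in (("cos", "sin") if m > 0 else ("1",)):
                rows.append((index, m, k, n, angular))
                index += 1
    return rows


table = fringe_table()
print(len(table))     # 36
print(table[6])       # (7, 1, 1, 3, 'cos')
print(table[35])      # (36, 0, 5, 10, '1')
```

Each row with $m + k = s$ contributes $2s + 1$ entries, so the rows $s = 0, \dots, 5$ account for $1 + 3 + 5 + 7 + 9 + 11 = 36$ polynomials.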

If the pupil function is rather roughly behaved, it may be necessary to include Zernike polynomials of very high orders. For numerical computations involving Zernike radial polynomials of $n \geq 40$ , Janssen and Dirksen suggested an alternate form of Eq. (1) with advantages in computation time, accuracy and ease of implementation.⁸ Based on Janssen and Dirksen’s integral expression, Shakibaei and Paramesran found a concise recursive relation for $R_{n}^{m} (r)$ leading to a reduction in computational complexity.⁹
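To indicate what such an alternate form can look like, the sketch below assumes the integral representation $R_{n}^{m}(r) = \frac{1}{2π} \int_{0}^{2π} U_{n}(r \cos t) \cos mt \, dt$, where $U_{n}$ is the Chebyshev polynomial of the second kind, and replaces the integral by an equispaced average. This representation is not stated in the present letter; it is an assumption that should be checked against Ref. 8 before use, and it has been verified here only against low-order entries of Table 1. The function names are illustrative.

```python
from math import cos, pi


def chebyshev_u(n, x):
    """Chebyshev polynomial of the second kind, U_n(x), by the three-term recurrence."""
    u_prev, u_curr = 1.0, 2.0 * x
    if n == 0:
        return u_prev
    for _ in range(n - 1):
        u_prev, u_curr = u_curr, 2.0 * x * u_curr - u_prev
    return u_curr


def zernike_radial_dft(n, m, r, num_points=None):
    """Estimate R_n^m(r) as an equispaced average of U_n(r cos t) * cos(m t)."""
    if num_points is None:
        # The integrand is a trigonometric polynomial of degree <= n + m,
        # so n + m + 1 equispaced samples average it without aliasing.
        num_points = n + m + 1
    total = 0.0
    for j in range(num_points):
        t = 2.0 * pi * j / num_points
        total += chebyshev_u(n, r * cos(t)) * cos(m * t)
    return total / num_points


print(round(zernike_radial_dft(4, 0, 0.5), 12))   # -0.125
print(round(zernike_radial_dft(3, 1, 0.5), 12))   # -0.625
```

The two printed values agree with $6r^{4} - 6r^{2} + 1$ and $3r^{3} - 2r$ at $r = 0.5$. Unlike the explicit sum in Eq. (1), this form involves no large alternating coefficients, which is one plausible reason for its better numerical behavior at high $n$.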

Acknowledgments

The author wishes to thank A. J. E. M. Janssen for helpful suggestions and for clarifications regarding the derivation of an equation in Ref. 8 and the derivation of Zernike polynomials in Born and Wolf.¹⁰ He also thanks Bernd Geh for helpful suggestions and extensive discussions on the broader topic associated with Zernike polynomials.

References

1. F. Zernike, “Beugungstheorie des Schneidenverfahrens und seiner verbesserten Form, der Phasenkontrastmethode,” Physica 1, 689–704 (1934). https://doi.org/10.1016/S0031-8914(34)80259-5

2. F. Zernike, “Diffraction theory of the knife-edge test and its improved form, the phase-contrast method,” J. Micro/Nanolithogr. MEMS MOEMS 1(2), 87–94 (2002). https://doi.org/10.1117/1.1488608

3. B. R. A. Nijboer, “The diffraction theory of aberrations,” (1942).

4. M. Born and E. Wolf, Principles of Optics, 7th ed., Cambridge University Press, Cambridge (1999).

5. F. W. Byron and R. W. Fuller, Mathematics of Classical and Quantum Physics, p. 239, Dover Publications, New York (1992).

6. A. Yen and S.-S. Yu, Optical Physics for Nanolithography, pp. 316–320, SPIE Press, Bellingham, Washington (2018).

7. K. Creath and R. E. Parks, “Optical metrology at the Optical Sciences Center: a historical review,” Proc. SPIE 9186, 91860T (2014). https://doi.org/10.1117/12.2064376

8. A. J. E. M. Janssen and P. Dirksen, “Computing Zernike polynomials of arbitrary degree using the discrete Fourier transform,” J. Eur. Opt. Soc. Rapid Publ. 2, 07012 (2007). https://doi.org/10.2971/jeos.2007.07012

9. B. H. Shakibaei and R. Paramesran, “Recursive formula to compute Zernike radial polynomials,” Opt. Lett. 38, 2487 (2013). https://doi.org/10.1364/OL.38.002487

10. J. Braat and P. Török, Imaging Optics, pp. 888–890, Cambridge University Press, Cambridge (2019).

Received: 13 February 2021; Accepted: 8 April 2021; Published: 28 April 2021
