We present a fast algorithm for computing the diffracted field from arbitrary binary (sharp-edged) planar apertures and occulters in the scalar Fresnel approximation for up to moderately high Fresnel numbers (≲103). It uses a high-order areal quadrature over the aperture and then exploits a single 2D nonuniform fast Fourier transform to evaluate rapidly at target points (on the order of 107 such points per second, independent of aperture complexity). It thus combines the high accuracy of edge integral methods with the high speed of Fourier methods. Its cost is O ( n2 log n ) , where n is the linear resolution required in the source and target planes, to be compared with O ( n3 ) for edge integral methods. In tests with several aperture shapes, this translates to between two and five orders of magnitude acceleration. In starshade modeling for exoplanet astronomy, we find that it is roughly 104 × faster than the state-of-the art in accurately computing the set of telescope pupil wavefronts. We provide a documented, tested MATLAB/Octave implementation. An appendix shows the mathematical equivalence of the boundary diffraction wave, angular integration, and line integral formulas and then the analysis of a new non-singular reformulation that eliminates their common difficulties near the geometric shadow edge. This supplies a robust edge integral reference against which to validate the main proposal. |
1.IntroductionThe numerical modeling of wave diffraction from thin two-dimensional (2D) screens and apertures in the Fresnel regime has many applications in optics1 and acoustics (Ref. 2, Sec. 8.4), including instrument modeling,3,4 lithography mask design,5 Fourier optics,6 coherent x-rays,7 acoustic emission,8 computer-generated binary holograms,9 starshades,10 and Fresnel zone plate imagers.11,12 This usually involves a plane or spherical wavefront hitting a binary (“0–1”) mask of given shape, although continuous opacity/phase variation is also possible. We confine ourselves to the former case, although the method can trivially accommodate the latter. Our point is to present a simple—yet seemingly overlooked—method that renders a high-accuracy numerical modeling at moderate Fresnel numbers orders of magnitude more efficient than prior methods. One motivation is starshade design.10,13–17 Given a space telescope, the goal is that a distant binary occulter blocks the direct light from a star, allowing for much dimmer exoplanets separated from it by only tens of milli-arcseconds to be imaged. The occulter shape and distance are thus optimized to give a deep shadow region, with a relative intensity on the order of across the telescope pupil, throughout a given wavelength range, while minimizing the occulter’s physical size (for practical reasons) and the angular size at the telescope. This has led to shapes with “petals” that emulate a continuous radial apodization, with radii on the order of 10 m and distances on the order of . Here the scalar18 and Fresnel approximations are superb (Ref. 17, App. A), with a small Fresnel number [defined in Eq. (2)] of typically 5 to 20. Numerical modeling is challenging, demanding at least 6-digit accuracy in amplitude to validate the shadow and many runs with different wavelengths and shapes to assess mechanical and thermal stability.16 Fixing a wavelength and propagation distance , a point in the aperture (or occulter) plane is and a target point in the detector (or pupil) plane is (see Fig. 1). We drop the constant -coordinates hereafter, so the problem is in essence 2D. In the case of a unit amplitude plane wave with wavevector incident on a planar aperture , the mathematical task is to evaluate the Fresnel integral for the scalar potential: This takes the form of a 2D convolution of the aperture’s characteristic function with a radially symmetric kernel (a complex Gaussian), with half-wave oscillation regions that are commonly called “zones” (Fig. 1). If is an effective (or maximum) radius of , then the in-plane separation is typically bounded by times a small constant. Thus the number of zones inside is on the order of and the finest oscillation scale of the integrand is . In Eq. (1), the prefactor ensures that tends to unity in the limit of a large aperture, i.e., the unimpeded wave.It is worth reviewing the origin of Eq. (1). It arises from the Kirchhoff diffraction approximation to the full Maxwell equations. This is good when aperture features are much larger than (Ref. 6. Ch. 3). Only the zeroth and first term in the Taylor expansion of the exponent in the free-space Green’s function are then kept, being the source-target distance (dotted line in Fig. 1). The Fresnel approximation is thus valid to the extent that the next term is small, implying the condition . The denominator of the Green’s function is approximated by . We refer the reader to (Ref. 1, Sec. 8.3.3) (Ref. 6, Ch. 4) for details. Note that the zeroth term gave the plane propagation phase , which is usually included as a prefactor in Eq. (1). We drop it for simplicity; it is trivial to insert. By replacing by , Eq. (1) also applies when a point source at finite distance produces a spherical incident wave.1,17,19 In the plane wave case, by Babinet’s principle (Ref. 1, Sec. 8.3.2) (Ref. 14, Sec. 2.2), the potential when defines an occulter rather than an aperture is simply given by Analytical forms for the integral Eq. (1) are known only for the straight edge, infinite slit, and rectangles (all involving the same 1D special function) (Ref. 1, Sec. 8.7) (Ref. 6, Sec. 4.5.1) and the disc.20–22 Semianalytical Bessel expansions can be useful for symmetric starshade design.10,14 But for general shapes, one is left with fully numerical methods, which fall into two main categories [sketched in Figs. 2(a) and 2(b)]:
Remark 1. Such edge integral methods should not be confused with the (more involved) boundary integral equation method, which solves the 3D Helmholtz or Maxwell equations using surface unknowns and potential theory.28,29 Our proposal combines the best features of the above two categories, namely the speed of the FFT methods with the high (and potentially high order) accuracy of edge integral methods for binary apertures. The result is an acceleration over edge integral methods of between two and five orders of magnitude. In a nutshell, the idea is, realizing that a 2D regular grid guarantees poor quadrature for integrals over , to replace it with a much better (high-order) areal quadrature scheme [see Fig. 2(c)]. Then exploiting the quadratic form in Eq. (1), as in “single FFT” methods (a)(ii), leaves one remaining problem: how to rapidly evaluate Fourier sums involving off-grid frequencies. Fortunately, fast algorithms for this task—nonuniform FFTs30,31—are quite mature and have speeds only one order of magnitude below those of plain FFTs. High accuracy relies on constructing a good areal quadrature for , which depends on its geometric description. We show that such quadratures can easily piggyback off boundary quadratures or be built independently. Our proposal is in some way related to diffraction methods that subsample the FFT32 or use chirp FFTs;4 however, it is much simpler and more general than either. Turning to the structure of this paper, Sec. 2 explains the method for arbitrary targets (Sec. 2.1) and then for gridded targets (Sec. 2.2), the latter being somewhat faster. In Sec. 3, we demonstrate the high accuracy and efficiency of the method for a smooth occulter (Sec. 3.1), two symmetric starshades (Sec. 3.2), and an aperture built from 67 million triangles (Sec. 3.3). We compare the method with the performance of a BDW edge integral code.15,27 We draw conclusions, explain how the method can be applied to perturbed starshades, and propose extensions in Sec. 4. Finally, validating the proposed nonuniform fast Fourier transform (NUFFT) method down to errors of or better demanded a high-accuracy reference edge integral method, which required new research, to which the Appendix is devoted. There we clarify that the BDW edge-integral (b)(i) and angular-integration (b)(ii) methods are equivalent and are equivalent to a more convenient line integral due to Cash.14 However, as we show, existing edge integral methods suffer numerical breakdown as targets approach (the geometric shadow edge) because they represent the (smooth) diffracted field as a sum of two discontinuous terms, one with a singular integrand. We instead present and analyze a simple, robust non-singular line integral (NSLI) that maintains close to machine accuracy for target points near or even on , without extra work, yet takes only five lines to code. Remark 2. We maintain a documented, tested, open-source MATLAB/Octave implementation of the proposed fast algorithm (using the FINUFFT library33), and the NSLI, on GitHub.34 Queries to the author are welcome. 2.Proposed MethodGiven a set of targets , , recall that the goal is to approximate Eq. (1) efficiently, i.e., to evaluate Suppose that an areal quadrature rule for the aperture has been found, that is, a set of nodes and weights , , such that, for all sufficiently smooth functions , holds to high accuracy. Specifically, one seeks a family of rules of increasing , with a high order of convergence , meaning that, for each -smooth , the error (difference between left and right sides) is . This may even hold for all , in which case the convergence is said to be super-algebraic or spectral. For instance, such a quadrature over a rectangle is given by a tensor product of 1D Gauss–Legendre rules (Ref. 35, Ch. 19). By passing those nodes through a smooth mapping of to (and multiplying the weights by its Jacobean), such rules for triangles and distorted quadrilaterals are easily made, the union of which can approximate any domain with a piecewise smooth boundary to a high order. For more node-efficient quadratures, we refer the reader to recent works on triangles,36 polygons,37,38 and domains defined by piecewise rational curves.39 Converting a general boundary into an areal quadrature for its interior is a software engineering task beyond the scope of this paper. We will be content constructing areal quadratures for three types of domains: the interior of a smooth closed curve, symmetric starshades, and unions of triangles.Since any rule Eq. (5) must resolve the Fresnel integrand oscillations, for a fixed , its number of nodes must grow like Now simply applying Eq. (5) to Eq. (4) gives a high-order accurate direct (slow) summation method that, as the results below show, can exceed accuracy requirements with reasonable numbers of nodes. (Our results show that claims such as “a single point in the shadow plane can require a trillion sine calculations at quadruple precision”14 are overly pessimistic.) The cost of this direct sum to targets would be . Our goal is now to reduce this to close to via fast Fourier methods.2.1.Fast Algorithm for Diffraction to Arbitrary Target PointsWe apply Eq. (5) to Eq. (4) and then in the second line expand each quadratic term. (This trick, common to “single FFT” methods,11,13,23 is similar to, but essentially the reverse of, that in the Bluestein method.40) This gives This factorized form allows for a three-step “fast” (in the sense of quasi-optimal scaling) algorithm:
The overall cost is thus . Recalling Eq. (6), this is . The NUFFT requires a user-chosen error tolerance , which affects the prefactor of this run time scaling, but rather weakly.33 2.2.Fast Algorithm for Target Points Lying on a GridOften sampling the diffracted wave on a dense regular Cartesian grid is sufficient. In this special case, more speed can be gained. Specifically, let be the 2D grid points defined by the product of -point regular 1D grids in the and directions, with being the grid spacing. We assume that is even. This grid has targets and (ignoring its left-most column and bottom row) is centered on the origin. If we relabel the grid points as for integer indices , and define rescaled source points and , the middle step Eq. (9) can be written which is precisely the so-called type 1 NUFFT30 (also known as the “adjoint NFFT”41). Its nonuniform points are only defined modulo and may need to be “folded” back into a valid input domain such as . If the grid width is similar in size to , then it is easy to check that such a folding is only needed if is less than the order of , i.e., the target grid under-resolves the diffracted field . This is probably not a common use case.The total cost for the regular grid case is , which, recalling Eq. (6) and , is . In practice, we find that, for the same number of targets spanning the same domain and the same tolerance , this regular grid version is around four times faster than the above arbitrary target version because the type 1 NUFFT is faster than the type 3 for the same space-bandwidth product.33 The generalization to rectangular target grids with arbitrary translations is simple, achieved through prephasing in step 1, and we will not present it here. 3.Performance Tests and ResultsWe now test the accuracy and speed of the above method in several geometries and compare it with two edge integral methods. All codes are written in MATLAB R2017a, apart from FINUFFT, which is a parallel C++ library with MEX interface. All timings are reported for double-precision arithmetic on a 4-core i7-7700HQ laptop with 32 GB RAM, using 8 threads (full hyperthreading). All codes take advantage of the multiple threads, either via vectorized MATLAB statements or via OpenMP in FINUFFT. In each geometry, we first need to describe the areal quadrature used. 3.1.Domains Defined by a Simple Smooth Closed CurveWe start by setting up a simple areal quadrature for the interior of a simple smooth closed curve. Suppose that we have good quadrature nodes and weights , indexed by , for vector line integrals on , that is, for all sufficiently smooth vector-valued functions on : where is the counter-clockwise vector line element on . Now fix a number of “radial” nodes , and let be the Gauss–Legendre (Ref. 35, Ch. 19) nodes and be their weights for the interval (0, 1). The integral over may be rewritten using a “dilation” parameterization , where and the 2D cross product is understood to give a scalar. Applying the -node rule on (0, 1) to the inner integral and Eq. (10) to the outer integral gives the “tensor product” areal quadrature for , with nodes and weights given by These nodes lie along “spokes” connecting the origin to the boundary nodes, as in Fig. 2(c). If is not star-shaped about the origin, then some of the nodes lie outside and some are negative; however, we observe little loss of accuracy unless is highly non-convex or poorly centered on the origin.Remark 3. The above “dilation” method automatically creates an areal quadrature for given only a (vector) line integral quadrature for and the convergence parameter . It thus also applies to boundaries with corners or cusps, including non-symmetric starshades. In practice, its construction time is negligible compared with that of the proposed NUFFT algorithm. Finally, we must build a vector line integral rule Eq. (10). The simplest case is when is parameterized in a counter-clockwise sense over by a smooth -periodic vector function . Then where and we apply the -point periodic trapezoid rule quadrature with nodes and equal weights . Comparing the right sides of Eqs. (10) and (13), one readsWe apply the above to build a family of areal quadratures for the kite domain with a smooth boundary , shown in Fig. 2. Its maximum radius is . We then show multiple types of error convergence for the proposed method for Fresnel diffraction in Fig. 3. The graphs not labeled “self” (i.e., blue, black, and green) show absolute error in at a single point, using as a reference solution the (converged) NSLI edge integral method in the Appendix. The other self convergence (red) graphs show the maximum error in over targets, using the converged values themselves as a reference. The “direct” (blue) graph simply sums Eq. (7) without the fast algorithm. The other graphs test two types of the proposed NUFFT method—arbitrary targets (, + signs) and gridded targets (, ∘ signs)—each at two different requested tolerances (6-digit and 12-digit). In each row of panels, the first shows convergence in (“radial” nodes), with fixed (boundary nodes), and the second vice versa. The right-most panels show, on a point grid, the converged intensity , applying Eq. (3) so that is an occulter. There are several observations as follows.
Table 1 reports CPU timings and errors for these same tasks at converged and quadrature parameters. (The areal quadrature generation is not included as it was found to be negligible.) NSLI (known to achieve 13 to 14 digits with these parameters) is used as the reference for all errors. A state-of-the-art edge integral code BDWF15 (as available in SISTER27 and documented in Ref. 34) is also tested with the same : while its median errors are as expected from its use of a second-order accurate midpoint rule, its maximum errors are larger than 1. These huge errors appear to be confined to a few target points very near , the geometric shadow edge. The main conclusions from Table 1 are as follows.
Table 1Absolute error in uoc and run times of several methods for the smooth kite occulter with converged quadrature parameters. The domain, upper two λz choices, and target region are as in Fig. 3. Blank entries in this table to be taken as repeated from above and “—” indicates not applicable. In both random and grid cases, errors are measured relative to the NSLI reference method (see Appendix). Timings for NSLI and BDWF15 are not listed for the grid cases since they are identical to the random cases. The last two rows are close to the largest Fresnel number f that the laptop can handle (errors were checked at only 104 targets in those cases). See Sec. 3.1 for other details.
Since under the hood, the NUFFT uses FFTs, the reader might worry about their RAM usage. It is very mild: for (gridded) cases, the FFT size is times (for , or twice otherwise) the requested grid size in each dimension. For , FFT dimensions scale like : for the smallest , the FFT is a tiny . This is to be compared with the FFT needed for a subpixel sampling method to reach around 6-digit accuracy in in various tests17 at similar . In the penultimate row of the table is times larger, yet the FFT is only (similar to the maximum zones), and total RAM usage is 17 GB, about the largest the laptop can handle. Another natural question is: does the method suffer at smaller than tested above? It does not: node numbers and CPU times only get smaller. In fact, by expanding the target grid in proportion to , the Fraunhofer limit is reached in a stable fashion [here the first factor in Eq. (8) should be discarded, whereas the third factor tends to unity]. 3.2.Application to Starshade ModelingIdealized starshades are described10,14,15 by a radial apodization function , where (in this section only) we use as occulter-plane polar coordinates about the origin. is 1 (indicating a fully blocking disc) for and drops in a carefully optimized fashion in to close to zero at , the maximum occulter radius, and identically 0 beyond this. Apodization over is realized via identical binary petals, each of which has an angular width at radius of . Let the function denote for the unique such that , a common definition of the principal value of an angle. Then the occulter is the “flower” shape: See Fig. 4. Note that published designs are discontinuous at (indicating a gap between petals) and at (petal tips have finite widths). In early “analytic” designs, these discontinuities were required to be no larger than about in size to minimize Arago-spot-style diffraction into the deep shadow (Ref. 14, § 4.3), but designs generated by optimization over a band10,27 have much larger gap and tip discontinuities, on the order of to , and an Arago effect that is apparently cancelled out over the band by distributed “ripples” (Ref. 14, §5) in . Recall that the task is simply to evaluate Eqs. (1) and (3) with errors in no worse than . To apply the proposed method, we build a high-order areal quadrature as follows. Since is discontinuous at , we split into the disc of radius plus each of petals. Our disc quadrature simply applies Eqs. (11) and (12) to the uniform -node line integral on its boundary, that is, Eq. (14) applied to the parameterization . We use radial nodes. We then cover each petal by nodes to handle the (outer) radial integral and then handle the (inner) arc integral at each of their node radii by angular nodes (see Fig. 4). Specifically, let be 1D Gauss–Legendre nodes and weights for the (outer) radial integral over . Similarly, let be nodes and weights for the fixed interval . Then recalling the area element , the resulting areal nodes (in Cartesians) and weights covering one petal are Other petals are obtained by rotating by multiples of . The total number of nodes is then . We fix and , leaving two (petal) convergence parameters and . In Sec. 4, we discuss applying this to perturbed (non-ideal) starshades.3.2.1.Accuracy validationIn Fig. 5, we compare the wave amplitudes along a radial slice computed by three methods for two designs of starshade: “NI2” (a small occulter with rippled profile optimized for a blue-green range16) and “HG” (a large occulter with analytic “offset hyper-Gaussian” profile14). We choose within their designed wavelength windows. The parameters used, and some CPU timings, are listed in Table 2. Since they have different geometry descriptions, we treat the two designs in turn. Table 2Parameters and CPU times for the proposed NUFFT t1 and the BDWF edge-integral to complete the same diffraction tasks, for two starshades. See Fig. 5 for comparisons of their answers. The Fresnel number f uses the maximum radius R in Eq. (2). N is the number of areal quadrature nodes and n is the number of boundary nodes.
NI2The optimized profile is available in the SISTER package27 in the form of 2462 equispaced samples covering the petal radius range . From these, we use piecewise cubic splines to interpolate at radial Gauss nodes in . Since appears to have at least 13 “bang-bang” type discontinuities (the discrete second derivative mostly takes values , for some constant , or 0), this necessarily limits accuracy to around 6 to 7 digits. By a convergence study, we found that and nodes across each petal were sufficient for areal quadrature to match this accuracy. For BDWF, we used the boundary nodes as given and used in SISTER. These have 6000 nodes per petal edge, but no nodes covering the interpetal gaps or tips (each of which is 0.03 m wide). For NSLI, we used an -node vector line integral quadrature matching the second-order accurate midpoint rule in BDWF (this match was needed to accurately handle the wide tips with a single segment). Figure 5(a) shows that the proposed NUFFT method matches both of these edge integral methods to around in in the shadow region where . NSLI agrees with to around 6 digits everywhere. However, at , the error of BDWF spikes to as approaches . Remark 4. Since , overall phase is meaningless; thus we fit the phase of BDWF to the other two methods at a single target. We then quote absolute differences in complex . This is a more predictable metric than the error in intensity , which is affected by local intensity. HGHere the profile is analytically known from Ref. 14: in , where , the maximum radius is , and . A convergence study shows that only and are needed, giving an areal quadrature of nodes. For NSLI, we used and to generate a high-order line integral quadrature using the same radii per petal, plus four Gauss nodes across each gap and tip, giving in total (see starshadeliquad in our repository34). We also fed these boundary nodes (but of course not their high-order weights) to BDWF. Figure 5(b) shows that the three methods again agree to the desired accuracy almost everywhere, apart from BDWF near where again its errors hit . Remark 5. For HG with , the errors of BDWF are summarized by 2 to 3 digits of relative accuracy overall, giving 6 to 7 digits of absolute accuracy in the deep shadow. Yet NSLI, if fed the low-order midpoint rule used inside BDWF, gives only 4-digit absolute accuracy; thus it is useless in deep shadow. Figure 5(b) thus also explores (dash-dot line) the errors of NSLI with this low-order rule and the larger : the absolute error now bottoms out at a useful . This highlights an advantage of BDWF over plain NSLI when deep shadows are modeled with poor quadrature, a subtle point explained in Remark 7. 3.2.2.Solution speedTable 2 presents CPU times for the above converged experiments on gridded targets (we omit NSLI times since they are similar to BDWF). The time to construct the areal quadrature is not included for two reasons: (i) it is never more than half the NUFFT method run time and (ii) it would be amortized away over runs at multiple wavelengths. We make the following observations.
We now turn to arbitrary targets. In Fig. 6, the intensity suppression of the two starshade designs are studied, sweeping 50 wavelengths and taking the maximum intensity over circles of varying radii ( targets at each ), using the proposed NUFFT method. The narrow-band nature of NI2 and deterioration of HG above are apparent. The entire calculation for both starshades, including quadrature generation, totals 6 s on the laptop. Finally, we report initial timing results upon having (rather crudely) inserted the NUFFT method in place of BDWF within the SISTER code base27 (see sister_mods in our repository34). Running a standard SISTER “PSF basis” generation task for the non-spinning NI2 starshade, 14 wavelengths are needed covering , at each of which targets are needed. (Targets are organized into telescope pupil grids, translated to 3149 different centers covering a sector with angle .) We had to reorganize the loop ordering since BDWF was called separately for each pupil while grouping wavelengths together for speed, whereas the NUFFT is most efficient with a single call to all targets at each wavelength. The original SISTER run time was an estimated 6.5 h (since BDWF gets about node-target pairs per second). The NUFFT method, using parameters as in the previous section and including quadrature generation, took 2.6 s. The acceleration is thus about . Remark 6. The above shows that the proposed algorithm excels in efficiency when the number of desired targets is large, resolving a region similar in size to the occulter. Since its cost is close to , dropping does not reduce run time much: then dominates, and the relative speed over edge integral methods drops in proportion. The natural question is: what is the crossover such that there is no advantage? For the NI2 starshade, since BDWF gives about , the answer is as small as for and 50 for . Thus whenever the user needs more targets than this, the NUFFT wins. 3.3.Complicated DomainAs a final example, we compute Fresnel diffraction for an aperture with a fractal boundary, specifically the standard Koch snowflake with a maximum radius . Let denote the equilateral triangle with side length , is its union with three triangles of side , is the union of with 12 triangles of side , etc., so that is the level- construction [see Fig. 7(a)]. To reach level , triangles are needed. To build an areal quadrature, the integral over each triangle is approximated by a simple node product Gauss–Legendre quadrature, by translating one vertex to the origin and then applying Eqs. (11) and (12) to the discretized line integral connecting the other two vertices (see the inset of figure). To accurately handle the oscillatory integrand as in Eq. (6), the node spacing should not exceed ; thus we designed a heuristic choice of that varied from 217 for the largest triangle to at levels and checked -convergence for Fresnel integrals for . For , the resulting total node number is about 69 million, requiring about 4 min to build in our simple implementation. The NUFFT method with is then applied to this areal quadrature to resolve the diffracted field for on a grid of target points, as shown in Fig. 7(b). For each new , this takes 4.6 s. Since has edges, an edge integral method of similar accuracy is estimated to be around to slower. We checked (via -convergence at each level) that this computed for has at least 6-digit accuracy. However, we may also interpret the calculation as an approximation to one for the limit domain with a true fractal boundary. Since the smallest triangles in have sides of , i.e., much smaller than any Fresnel zone, we are well into the regime of Richardson extrapolation in , with differences from the limit scaling like . The largest absolute change in on the grid in going from to was ; thus by extrapolation, the largest change in between and the limiting domain is around of this. Thus we may quote uniformly about 4-digit accuracy for diffraction from the limit fractal domain. (By applying Richardson to a sequence, one could in fact recover many more digits without much extra effort.) 4.Conclusion and DiscussionWe have explained a fast algorithm for Fresnel diffraction from binary aperture or occulter shapes, which achieves high accuracy via flexible areal quadrature schemes over the planar domain yet with speeds close to regular-grid FFT propagation methods, via the nonuniform FFT. Extensive tests of error convergence and CPU timings show between two and five orders of magnitude acceleration over edge integral methods at comparable accuracies. Thus at moderate Fresnel numbers , one can evaluate accurate diffraction fields from complicated shapes such as starshades almost instantaneously (0.1 s for a grid of targets). For higher , the cost starts to dominate. RAM usage starts to become a limitation only at . Along the way, we have reformulated edge integrals as an NSLI that is numerically robust (Appendix A). Although we did not exploit it, the proposed NUFFT method can trivially include smooth source-plane phase/amplitude variations while retaining accurate edge diffraction simply by multiplying by the source term at each areal node, in step 1 of Sec. 2.1. This is impossible for traditional edge integral methods, but we note exciting recent progress on including low-frequency phase variations with edge integrals (Ref. 17, Sec. 6). Our findings highlight the importance of high-order accurate quadratures, both for edge integrals and, crucially, planar integrals. We have shown how the latter may be constructed for three classes of domain (those with an existing boundary quadrature, ideal starshades, and unions of triangles). Such construction on a case-by-case basis is possible, at least up to the tolerance with which a description of is available. Yet to automate this procedure for a requested accuracy and , given “any” , raises 2D geometry representation and meshing issues common throughout engineering and scientific computing. This is a huge topic with many available tools. Given the benefits that we show in optical simulation, automated high-order areal quadratures from CAD formats should be a useful future project. 4.1.Application to Non-Ideal (Perturbed) StarshadesSince the proposed method does not exploit symmetry, it could vastly accelerate Monte Carlo studies of optical stability under the various types of realistic shape perturbations. The latter include manufacturing tolerances, in-flight misalignment, thermal distortions, and damage over time. The topic is complicated by geometry descriptions, statistical correlations, and averaging due to spinning.16 Numerical study of this is beyond the scope of this initial work. Yet, recall that once an areal quadrature exists for the desired shape , the proposed method is very simple. We explain three ways that such a quadrature can be generated as follows.
Which of the above methods, or whether another method, proves best in practice will depend on the number of runs required and spatial details of the perturbation. 4.2.Other ExtensionsBeyond the geometry-handling extensions discussed above, fruitful future directions include the following.
5.Appendix A: Fresnel Edge Integral Methods—Equivalence and Desingularization5.1.Equivalence of Edge Integral Methods for Planar Waves and AperturesDauger19 noticed that, using polar coordinates about the target , i.e., and , the Fresnel aperture integral Eq. (1) may be analytically integrated in , for each , as follows. Assuming that the target is in (geometric shadow) and that is strictly star-shaped about this target, Eq. (1) becomes where is the distance where the in-plane ray launched at angle from the target exits . If these conditions are broken, becomes multivalued. For targets outside , the second term in the square brackets must be replaced by one similar to the first but involving the value where the ray first enters . The resulting numerical method is cumbersome for a general because much effort is spent finding, and tracking as a function of , these multiple ray intersection points.19It is simpler to reformulate Eq. (1) in terms of a line integral over . Cash [Ref. 14, (46)] provides such a formula but no rigorous derivation. To remedy this, fixing the target, we write and hence , and we define the 2D vector field: After cancelling terms, its divergence is found to be simply the Fresnel integrand from Eq. (1) minus the unit 2D delta distribution at the target (the latter can be proven by excluding a small disk of radius about the target), that is, Applying the divergence theorem in , with being the unit outward normal, and, as before, taking the 2D cross product as a scalar, Substituting Eq. (18) and recalling Eq. (1) gives the line integral formula This is easily seen to be equivalent to Dauger’s formulas using the facts: (i) , (ii) , and (iii) the multiple values of correspond to folding back as is traversed.We now show that, within the Fresnel approximation, the Miyamoto–Wolf [Ref. 26, Eqs. (5.1), (5.5)] BDW formulation used by Cady15 is also equivalent to the above. Given , the plane incident wave at the aperture with unit direction vector , this states (noting that our definition excludes the phase of plane -propagation) that Here we recall that is the target-source 3D distance and define to be the unit vector pointing from the target to the source ( is notated as in standard references, but we reserve the latter for arclength). Since we are concerned with planar incidence, and . Since is implicit in Eq. (1), we insert the leading-order small-angle approximations , , and and the usual Fresnel approximation . The result is precisely Eq. (19). Thus all three edge formulations are equivalent.However, it is worth noting that BDW Eq. (20) and some formulas in Dubra–Ferrari22 have a wider range of validity than Eq. (1), Dauger’s formulas, or Eq. (19) since they allow for out-of-plane apertures and more general incident waves. 5.2.Robust Non-Singular Line Integral FormulationTo our knowledge, all edge integral numerical codes use formulas shown in the previous section to be equivalent to Eq. (19) and are thus well known to be plagued by two serious problems as follows.14,15,17,19,22
For instance, in Secs. 3.1 and 3.2.1, we saw that the BDWF code loses all accuracy near . However, once it is realized that the two problems are in fact facets of the same phenomenon, they can be made to “cancel out.” This works as follows. It is well known [Ref. 44, (6.23)] [or combining facts (i) and (ii) above] that Inserting this into Eq. (19) gives one formula that applies whether the target is inside or outside : This has no singularity as (target approaching ) because the term in square brackets is , cancelling the denominator. The integrand is as smooth as the Fresnel zones, i.e., as smooth as the diffracted field in the target plane. We believe that Eq. (22) is new.This leads to an incredibly simple yet robust code. For instance, in MATLAB, if bx and by list coordinates of nodes on , with wx and wy being the corresponding weights for a vector line integral as in Sec. 3.1, the entire NSLI code to output at a target (xi,eta) is five lines: rx = bx - xi; ry = by - eta; % components of r displacement vector r2 = rx.*rx + ry.*ry; % r^2 f = (1 - exp((1i*pi/lambdaz)*r2))./ r2; f(r2==0.0) = 0.0; % kill NaNs (target hits node) uap = sum((rx.*wy - ry.*wx).* f) / (2*pi); % cross product, quadrature Note that when a target hits a node to within machine error (), any finite value of f may be inserted in line 4 because in line 5. Yet there is a subtlety here: numerical eyebrows should immediately be raised because f involves catastrophic cancellation as . To understand why this is in fact barely a problem, we apply forward error analysis [Ref. 45, Ch. 1] and treat the real and imaginary parts separately (combining them leads to a pessimistic prediction). The imaginary part of the exp is , which, given a rounded value of r2, is computed to relative accuracy , where is the usual double precision relative error. Subtraction from 1 does not change the imaginary part. The division by r2 then results in absolute error , which then gets multiplied by the cross product, giving . Note that this holds even though r2 is necessarily inaccurate due to coordinate subtraction in line 1. Now to the real part of the exp, which is . Thus when , the real part of exp is in machine arithmetic exactly 1, which cancels the other 1 exactly, leaving zero. Since the true answer is , in this regime the final error is bounded by . On the other hand, for , catastrophic cancellation occurs: the error in the real part of exp is , so the final error is . In summary, uniformly in , the final error is bounded by . In practice, we find by comparison to the areal quadrature answers that this uniform bound is around , which is adequate. Replacing by a Taylor expansion for small (e.g., via cexprl46) could possibly gain a digit. Equation (22), in the form of the above code looped over target points, serves as our reference direct method. A documented, tested MATLAB/Octave implementation is in the repository34 bdrymeths/nsli_pts.m Remark 7. When a poor quadrature (that is, low order and few nodes) is used with deep shadow regions, the usual line integral Eq. (19) has one advantage over NSLI Eq. (22): it can in shadows achieve relative accuracy in , appropriate to the quadrature, because exactly. NSLI merely achieves absolute accuracy in ; thus it may require a better quadrature to resolve deep shadows than Eq. (19) (as implemented by, e.g., BDWF). In essence, [the “1” term in Eq. (22)] to limited accuracy, which is then poorly canceled in Eq. (3). To remedy this, our NSLI implementation also includes an option to use Eq. (19) for targets far from , combining the robustness of Eq. (22) with the deep shadow relative accuracy of traditional edge integrals. AcknowledgmentsThis work benefited from the input of the anonymous reviewers and discussions with David Spergel, Robert Vanderbei, Stuart Shaklan, and Eric Cady. The Flatiron Institute is a division of the Simons Foundation. ReferencesM. Born and E. Wolf, Principles of Optics, 6th ed.Pergamon Press, Oxford
(1980). Google Scholar
P. M. Morse and K. U. Ingard, Theoretical Acoustics, McGraw-Hill, New York
(1968). Google Scholar
M. D. Perrin et al.,
“Simulating point spread functions for the James Webb Space Telescope with WebbPSF,”
Proc. SPIE, 8442 84423D
(2012). https://doi.org/10.1117/12.925230 PSISDG 0277-786X Google Scholar
Y. Hu et al.,
“Efficient full-path optical calculation of scalar and vector diffraction using the Bluestein method,”
Light Sci. Appl., 9
(1), 119
(2020). https://doi.org/10.1038/s41377-020-00362-z Google Scholar
A. J. Bourdillon et al.,
“A critical condition in Fresnel diffraction used for ultra-high resolution lithographic printing,”
J. Phys. D, 33
(17), 2133
–2141
(2000). https://doi.org/10.1088/0022-3727/33/17/307 JPAPBE 0022-3727 Google Scholar
J. W. Goodman, Introduction to Fourier Optics, 2nd ed.McGraw-Hill(1996). Google Scholar
M. Ruiz-Lopez et al.,
“Coherent x-ray beam metrology using 2D high-resolution Fresnel-diffraction analysis,”
J. Synchrotron. Rad., 24
(1), 196
–204
(2017). https://doi.org/10.1107/S1600577516016568 JSYRES 0909-0495 Google Scholar
T. D. Mast,
“Fresnel approximations for acoustic fields of rectangularly symmetric sources,”
J. Acoust. Soc. Am., 121 3311
–3322
(2007). https://doi.org/10.1121/1.2726252 JASMAN 0001-4966 Google Scholar
P. Tsang et al.,
“Computer generation of binary Fresnel holography,”
Appl. Opt., 50
(7), B88
–B95
(2011). https://doi.org/10.1364/AO.50.000B88 APOPAI 0003-6935 Google Scholar
R. J. Vanderbei, E. J. Cady and N. J. Kasdin,
“Optimal occulter design for finding extrasolar planets,”
Astrophys. J., 665
(1), 794
–798
(2007). https://doi.org/10.1086/519452 ASJOAB 0004-637X Google Scholar
D. Serre,
“The Fresnel imager: instrument numerical model,”
Exp. Astron., 30 111
–121
(2011). https://doi.org/10.1007/s10686-010-9200-7 EXASER 0922-6435 Google Scholar
R. Wilhem and K. Laurent,
“Improvements on Fresnel arrays for high contrast imaging,”
Exp. Astron., 45 21
–40
(2018). https://doi.org/10.1007/s10686-017-9568-8 EXASER 0922-6435 Google Scholar
A. S. Lo, T. Glassman and C. Lillie,
“New worlds observer optical performance,”
Proc. SPIE, 6687 668716
(2007). https://doi.org/10.1117/12.730521 PSISDG 0277-786X Google Scholar
W. Cash,
“Analytic modeling of starshades,”
Astrophys. J., 738
(1), 76
(2011). https://doi.org/10.1088/0004-637X/738/1/76 ASJOAB 0004-637X Google Scholar
E. J. Cady,
“Boundary diffraction wave integrals for diffraction modeling of external occulters,”
Opt. Express, 20
(14), 15196
–15208
(2012). https://doi.org/10.1364/OE.20.015196 OPEXFF 1094-4087 Google Scholar
S. B. Shaklan, L. Marchen and E. Cady,
“Shape accuracy requirements on starshades for large and small apertures,”
Proc. SPIE, 10400 104001T
(2017). https://doi.org/10.1117/12.2273436 PSISDG 0277-786X Google Scholar
A. Harness et al.,
“Advances in edge diffraction algorithms,”
J. Opt. Soc. Am. A, 35
(2), 275
–285
(2018). https://doi.org/10.1364/JOSAA.35.000275 JOAOD6 0740-3232 Google Scholar
A. Harness et al.,
“Modeling non-scalar diffraction in the Princeton starshade testbed,”
Proc. SPIE, 10698 1069865
(2018). https://doi.org/10.1117/12.2310187 PSISDG 0277-786X Google Scholar
D. E. Dauger,
“Simulation and study of Fresnel diffraction for arbitrary two-dimensional apertures,”
Comput. Phys., 10
(6), 591
–604
(1996). https://doi.org/10.1063/1.168584 CPHYE2 0894-1866 Google Scholar
J. E. Harvey and J. L. Fordham,
“The spot of Arago: new relevance for an old phenomenon,”
Am. J. Phys., 52
(3), 243
–247
(1984). https://doi.org/10.1119/1.13681 AJPIAS 0002-9505 Google Scholar
G. E. Sommargren and H. J. Weaver,
“Diffraction of light by an opaque sphere. 1: Description and properties of the diffraction pattern,”
Appl. Opt., 29 4646
–4657
(1990). https://doi.org/10.1364/AO.29.004646 APOPAI 0003-6935 Google Scholar
A. Dubra and J. A. Ferrari,
“Diffracted field by an arbitrary aperture,”
Am. J. Phys., 67
(1), 87
–92
(1999). https://doi.org/10.1119/1.19195 AJPIAS 0002-9505 Google Scholar
L. Junchang and W. Yanmei,
“An indirect algorithm of Fresnel diffraction,”
Opt. Commun., 282 455
–458
(2009). https://doi.org/10.1016/j.optcom.2008.10.060 OPCOB8 0030-4018 Google Scholar
D. Mas et al.,
“Fast algorithms for free-space diffraction patterns calculation,”
Opt. Commun., 164 233
–245
(1999). https://doi.org/10.1016/S0030-4018(99)00201-1 OPCOB8 0030-4018 Google Scholar
A. F. Oskooi et al.,
“MEEP: a flexible free-software package for electromagnetic simulations by the FDTD method,”
Comput. Phys. Commun., 181
(3), 687
–702
(2010). https://doi.org/10.1016/j.cpc.2009.11.008 CPHCBZ 0010-4655 Google Scholar
K. Miyamoto and E. Wolf,
“Generalization of the Maggi-Rubinowicz theory of the boundary diffraction wave—Part II,”
J. Opt. Soc. Am., 52 626
–636
(1962). https://doi.org/10.1364/JOSA.52.000626 JOSAAH 0030-3941 Google Scholar
S. R. Hildebrandt et al.,
“Starshade imaging simulation toolkit for exoplanet reconnaissance (SISTER),”
http://sister.caltech.edu/ Google Scholar
D. Colton and R. Kress, Inverse Acoustic and Electromagnetic Scattering Theory, 93 2nd ed.Springer-Verlag, Berlin
(1998). Google Scholar
O. P. Bruno and S. K. Lintner,
“A high-order integral solver for scalar problems of diffraction by screens and apertures in three-dimensional space,”
J. Comput. Phys., 252 250
–274
(2013). https://doi.org/10.1016/j.jcp.2013.06.022 JCTPAH 0021-9991 Google Scholar
A. Dutt and V. Rokhlin,
“Fast Fourier transforms for nonequispaced data,”
SIAM J. Sci. Comput., 14 1368
–1393
(1993). https://doi.org/10.1137/0914081 SJOCE3 1064-8275 Google Scholar
J.-Y. Lee and L. Greengard,
“The type 3 nonuniform FFT and its applications,”
J. Comput. Phys., 206 1
–5
(2005). https://doi.org/10.1016/j.jcp.2004.12.004 JCTPAH 0021-9991 Google Scholar
R. Soummer et al.,
“Fast computation of Lyot-style coronagraph propagation,”
Opt. Express, 15
(24), 15935
–15951
(2007). https://doi.org/10.1364/OE.15.015935 OPEXFF 1094-4087 Google Scholar
A. H. Barnett, J. F. Magland and L. af Klinteberg,
“A parallel non-uniform fast Fourier transform library based on an ‘exponential of semicircle’ kernel,”
SIAM J. Sci. Comput., 41
(5), C479
–C504
(2019). https://doi.org/10.1137/18M120885X SJOCE3 1064-8275 Google Scholar
A. H. Barnett,
“FRESNAQ: MATLAB/Octave library for fast Frensel diffraction from apertures and occulters,”
(2020). https://github.com/ahbarnett/fresnaq Google Scholar
L. N. Trefethen, Approximation Theory and Approximation Practice, SIAM,2013). http://www.chebfun.org/ATAP Google Scholar
B. Vioreanu and V. Rokhlin,
“Spectra of multiplication operators as a numerical tool,”
SIAM J. Sci. Comput., 36 A267
–A288
(2014). https://doi.org/10.1137/110860082 SJOCE3 1064-8275 Google Scholar
S. E. Mousavi, H. Xiao and N. Sukumar,
“Generalized Gaussian quadrature rules on arbitrary polygons,”
Int. J. Numer. Methods Eng., 82
(1), 99
–113
(2010). https://doi.org/10.1002/nme.2759 IJNMBH 0029-5981 Google Scholar
H. Xiao and Z. Gimbutas,
“A numerical algorithm for the construction of efficient quadrature rules in two and higher dimensions,”
Comput. Math. Appl., 59
(2), 663
–676
(2010). https://doi.org/10.1016/j.camwa.2009.10.027 CMAPDK 0898-1221 Google Scholar
D. Gunderman, K. Weiss and J. A. Evans,
“Spectral mesh-free quadrature for planar regions bounded by rational parametric curves,”
Comput. Aided Des., 130 102944
(2021). https://doi.org/10.1016/j.cad.2020.102944 CAIDA5 0010-4485 Google Scholar
L. Bluestein,
“A linear filtering approach to the computation of discrete Fourier transform,”
IEEE Trans. Audio Electroacoust., 18 451
–455
(1970). https://doi.org/10.1109/TAU.1970.1162132 ITADAS 0018-9278 Google Scholar
J. Keiner, S. Kunis and D. Potts,
“Using NFFT 3—a software library for various nonequispaced fast Fourier transforms,”
ACM Trans. Math. Software, 36
(4), 1
–30
(2009). https://doi.org/10.1145/1555386.1555388 ACMSCU 0098-3500 Google Scholar
B. K. Alpert,
“Hybrid gauss-trapezoidal quadrature rules,”
SIAM J. Sci. Comput., 20 1551
–1584
(1999). https://doi.org/10.1137/S1064827597325141 SJOCE3 1064-8275 Google Scholar
N. Hale and L. N. Trefethen,
“New quadrature formulas from conformal maps,”
SIAM J. Numer. Anal., 46
(2), 930
–948
(2008). https://doi.org/10.1137/07068607X Google Scholar
R. Kress, Linear Integral Equations, 82 2nd ed.Springer, New York
(1999). Google Scholar
N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed.SIAM, Philadelphia
(2002). Google Scholar
W. Fullerton,
“cexprl.f: Fortran code for the complex relative error exponential,”
(1989) https://www.netlib.org/slatec/fnlib/cexprl.f Google Scholar
BiographyAlex H. Barnett received his PhD in physics from Harvard University, followed by postdoctoral work in radiology at Massachusetts General Hospital and in applied mathematics at New York University. He then taught in the Department of Mathematics at Dartmouth College for 12 years, becoming a full professor in 2017. He is a group leader for numerical analysis at the Center for Computational Mathematics, Flatiron Institute, New York, USA. His research interests include computational partial differential equations, waves, fast algorithms, integral equations, neuroscience, imaging, and inverse problems. He has published more than 50 papers, received several grants from the National Science Foundation, won the XXI International Physics Olympiad, and received Dartmouth’s Karen E. Wetterhahn Memorial Award for Distinguished Creative or Scholarly Achievement. |