Quantitative analysis of the state of polarization of light provides a powerful tool in modern science. Applications vary from microscopy, biomedical diagnosis, and astrophysics1–3 to crystallographic, material, and single-molecule studies.4,5 While the polarization state of light itself can be used to transmit information, hence presenting new opportunities in optical data storage and communications,6–9 changes in polarization induced by a material can alternatively be used for object detection10 or to characterize sample properties, such as chirality or molecular orientation.11–13
Stokes polarimeters, which allow a complete characterization of the polarization state of input light as described by the associated Stokes vector $\mathbf{S}$, comprise $N$ ($N \geq 4$) distinct measurements that can be multiplexed in time,14 frequency,15 or space.16 Fundamentally, each constituent measurement outputs an intensity $I_n$ ($n = 1, \ldots, N$), which is proportional to the projection of the incident Stokes vector onto an analysis state vector $\mathbf{W}_n$, i.e., $I_n \propto \mathbf{W}_n^T \mathbf{S}$. Central to the description and design of Stokes polarimeters is hence the so-called instrument or measurement matrix $\mathbf{W}$ formed from stacking the set of analysis vectors. In order to obtain an estimate of the Stokes vector from the set of projections $\{I_n\}$, the measurement matrix must be inverted. To limit noise propagation through this inversion process, the measurement matrix is frequently optimized. Such optimization has been performed using different metrics including the associated information content,17–19 matrix determinant,20–22 signal-to-noise ratio,23 equally weighted variance (EWV),24,25 and condition number.21–23,25–29
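The projection and inversion steps described above can be illustrated with a minimal numerical sketch. It assumes ideal analyzers whose rows take the form 0.5 * (1, s_n) with unit vectors s_n, tetrahedral analysis states, and additive white Gaussian noise; the input Stokes vector, the noise level, and the name `estimate_stokes` are hypothetical choices made only for illustration.

```python
import numpy as np

# Assumed analysis states: vertices of a regular tetrahedron on the unit sphere.
s = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)

# Assumed instrument matrix: each row is an ideal analyzer, 0.5 * (1, s_n).
W = 0.5 * np.hstack([np.ones((4, 1)), s])


def estimate_stokes(intensities, W):
    """Least-squares Stokes estimate obtained with the pseudoinverse of W."""
    return np.linalg.pinv(W) @ intensities


rng = np.random.default_rng(0)
S_true = np.array([1.0, 0.3, -0.2, 0.5])           # hypothetical input state
I_clean = W @ S_true                               # noise-free projections
I_noisy = I_clean + rng.normal(0.0, 0.01, size=4)  # additive white noise

S_exact = estimate_stokes(I_clean, W)  # recovers S_true up to rounding
S_hat = estimate_stokes(I_noisy, W)    # noisy estimate of S_true
```

In this sketch the noise on the intensities propagates to the estimate through the pseudoinverse, which is why the conditioning of the instrument matrix matters.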
Mueller matrix polarimeters, on the other hand, combine a Stokes polarimeter with the use of multiple incident polarized states so as to measure the full Mueller matrix of an object. Variation of the probing polarization states (as can be described using an analogous illumination matrix) therefore introduces additional degrees of freedom, admitting further optimization.17,28–32 Application-specific optimization of polarimeters has also been reported; for example, in detection and imaging problems the polarization contrast is a more suitable metric.33,34
Recently, the equivalence of a number of optimization metrics, namely the EWV, the condition number of the instrument matrix, and the determinant of the associated Gram matrix, was discussed by Foreman et al.35 Additionally, Foreman et al. proved that a Stokes polarimeter is optimal (as characterized by these metrics) when the set of analysis states defines a spherical 2 design36 on the unit Poincaré sphere. A re-examination of the equivalence between these metrics is, however, necessary due to an error in the proof presented in Ref. 35. The goal of this paper is, therefore, to provide a rigorous proof that the conclusions of Ref. 35 hold. Our derivations also give greater insight into the optimization of nonideal Stokes polarimeters, which is also discussed. We additionally note that our results are equally applicable to optimization of the probing states used in Mueller matrix polarimetry due to the similar matrix structure of the problem.31,37
Optimal Polarimetry with Spherical 2 Designs
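The instrument matrix $\mathbf{W}$ of a polarimeter is an $N \times 4$ matrix, the rows of which are the Stokes vectors of the polarization states being analyzed, normalized such that the polarimeter is passive. Accordingly, the instrument matrix has the parametric form:
$$\mathbf{W} = \frac{1}{2}\begin{pmatrix} 1 & \mathbf{s}_1^T \\ \vdots & \vdots \\ 1 & \mathbf{s}_N^T \end{pmatrix}, \qquad \|\mathbf{s}_n\| = 1. \tag{1}$$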
In Stokes polarimetry, one performs $N$ intensity measurements $I_n$, $n = 1, \ldots, N$, by projecting the input Stokes vector $\mathbf{S}$ onto each of the analyzers described by the rows of the matrix $\mathbf{W}$. If these measurements are stacked in an $N$-dimensional vector $\mathbf{I}$, and if we assume that the measurements are perturbed by white additive noise $\mathbf{n}$, we obtain17,23,24
$$\mathbf{I} = \mathbf{W}\mathbf{S} + \mathbf{n}.$$
To optimize the EWV, we first express the Gram matrix in block format, viz.38
Noting that and that is positive definite, it follows immediately that the first two terms in Eq. (15) are positive. We show in Sec. 6 that the third term is non-negative. Consequently, the trace in Eq. (15) is minimal when each of its three terms is minimal. The first term is constant, and the third is minimal when it vanishes, i.e., when or equivalently35. When Eq. (16) holds, minimizing is equivalent to minimizing . This optimization must be performed under the constraint that the trace of the matrix is constant, as follows from the normalization of . Indeed, since each row of the matrix is a unit-norm vector, we have 35. The form of the Gram matrix that minimizes the EWV of the instrument matrix is thus
Finally, the conditions expressed by Eqs. (16) and (20) are satisfied when the set of measurement states on the normalized Poincaré sphere, defined by $\mathbf{s}_n$, $n = 1, \ldots, N$, constitutes a spherical 2 design (see Sec. 7 for a proof) as reported in Ref. 35. A spherical $t$ design is defined as a collection of $N$ points on the surface of the unit sphere (in our case in $\mathbb{R}^3$) for which the normalized integral of any polynomial function, $f$, of degree $t$ or less is equal to the average taken over the $N$ points. The Platonic solids, i.e., the regular tetrahedron ($N = 4$), the octahedron ($N = 6$), the cube ($N = 8$), the icosahedron ($N = 12$), and the dodecahedron ($N = 20$), are well-known examples of spherical 2 designs. A geometric scheme to construct optimal polarimeters for any even $N$, any factorable odd value of $N$, and for prime has also been described in Ref. 35. Further examples of spherical designs and construction strategies can be found in Refs. 39–41. Critically, spherical 2 designs are known to exist for any $N \geq 4$, with the important exception of $N = 5$.39,41 In the context of optimal polarimetry, this implies that for $N = 5$ the constraints described by Eqs. (16) and (20) cannot be fully satisfied. Recalling Eq. (15), this arises because the second and third terms cannot be simultaneously minimized. Although the resulting measurement states do not form a spherical 2 design, the sum of these two terms, and hence the EWV, can nevertheless be minimized yielding a value of . The corresponding analysis states define a square pyramid inscribed in the unit Poincaré sphere.
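The spherical 2 design property can be checked numerically through its first and second moments: the points must sum to zero and their second-moment matrix must equal (N/3) times the identity. The following is a minimal sketch, not the authors' procedure; the function names are hypothetical, and the instrument matrix is assumed to have ideal rows 0.5 * (1, s_n).

```python
import numpy as np


def is_spherical_2_design(points, tol=1e-12):
    """Test the degree-1 and degree-2 moment conditions on the unit sphere."""
    p = np.asarray(points, dtype=float)
    n = len(p)
    on_sphere = np.allclose(np.linalg.norm(p, axis=1), 1.0, atol=tol)
    first = np.allclose(p.sum(axis=0), 0.0, atol=tol)                 # sum of s_n = 0
    second = np.allclose(p.T @ p, (n / 3.0) * np.eye(3), atol=tol)    # sum of s_n s_n^T
    return bool(on_sphere and first and second)


def ewv_trace(points):
    """Trace of the inverse Gram matrix for assumed rows 0.5 * (1, s_n)."""
    p = np.asarray(points, dtype=float)
    W = 0.5 * np.hstack([np.ones((len(p), 1)), p])
    return np.trace(np.linalg.inv(W.T @ W))


tetra = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
octa = np.vstack([np.eye(3), -np.eye(3)])
# Four coplanar states: they sum to zero but fail the second-moment condition.
square = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]], dtype=float)

checks = {name: is_spherical_2_design(pts)
          for name, pts in [("tetra", tetra), ("octa", octa), ("square", square)]}
```

Under the assumed form of the instrument matrix, `ewv_trace` evaluates to 40/N for the tetrahedron and octahedron, i.e., 10 and 40/6, respectively.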
Equivalence of Optimization Metrics
We will now demonstrate that the optimization of two other popular metrics, namely the condition number and the determinant of the Gram matrix, leads to exactly the same measurement frames as the EWV so that these three criteria are strictly equivalent.
The condition number of the instrument matrix $\mathbf{W}$ is defined by $\kappa = \|\mathbf{W}\| \, \|\mathbf{W}^{+}\|$, where $\mathbf{W}^{+}$ is the pseudoinverse matrix and $\|\cdot\|$ denotes the matrix norm. In principle, any choice of matrix norm can be made; however, within the context of polarimetry, the most common choices are either the 2-norm,42,43 defined as the maximum singular value of $\mathbf{W}$, or the Frobenius norm,27,35,43 given by38,43 $\|\mathbf{W}\|_F = \sqrt{\mathrm{tr}(\mathbf{W}^T \mathbf{W})}$. In this paper, we exclusively consider the Frobenius norm (and henceforth drop the subscript $F$). This selection is motivated by the resulting equivalence between the condition number and EWV. To prove this equivalence [for polarimeters with instrument matrix of the form of Eq. (1)], we first note that our choice of normalization of the measurement states implies that
Determinant of the Gram Matrix
The first works on Stokes polarimeter optimization considered devices with a minimal number ($N = 4$) of measurement vectors.26 Optimization of such systems used the determinant of the matrix $\mathbf{W}$ (which for this value of $N$ is square and nonsingular) as a performance metric. In this case, the optimal structure found dictated that the measurement vectors defined a regular tetrahedron on the Poincaré sphere, a result that we also found above by optimizing the EWV. We show in this section that this result comes from the strict equivalence of these two optimization metrics. This equivalence can be generalized to any value of $N$ if one considers the optimization of the determinant of the Gram matrix, since for $N > 4$ the matrix $\mathbf{W}$ itself is rectangular and its determinant is thus not defined. Note that this equivalence was mentioned in Ref. 35, but there was an erroneous step in the logic presented in that work (see Sec. 8 for more details).
We intend here to show that maximization of the determinant yields the same polynomial constraints embodied in Eqs. (16) and (20). Considering the block form of the Gram matrix in Eq. (8), its determinant can be written as38
For the second factor, we note that where , , are again the eigenvalues of the matrix , which are positive since is positive definite. Moreover, according to Eq. (17), the matrix has constant trace. Maximization of is thus once again a constrained optimization problem, which can be solved using the method of Lagrange multipliers. We will consider maximization of , which is equivalent since the logarithmic function is monotonically increasing. The Lagrange function then becomes2, that for all . As shown in Sec. 2, the second polynomial constraint expressed in Eq. (20) then follows. Therefore, we have ultimately shown that minimization of the EWV (and thus also of the Frobenius condition number of the instrument matrix) of a polarimeter yields the same set of optimality constraints as maximizing the determinant of the associated Gram matrix.
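The constrained maximization invoked above can be written out explicitly. The following is a sketch in our own notation, which is an assumption and not necessarily that of the original equations: $\lambda_i$ ($i = 1, 2, 3$) denote the positive eigenvalues, $c$ the fixed value of the trace, and $\mu$ the Lagrange multiplier.

```latex
\begin{align}
\mathcal{L}(\lambda_1,\lambda_2,\lambda_3,\mu)
  &= \sum_{i=1}^{3} \ln \lambda_i
     - \mu \left( \sum_{i=1}^{3} \lambda_i - c \right), \\
\frac{\partial \mathcal{L}}{\partial \lambda_i}
  &= \frac{1}{\lambda_i} - \mu = 0
  \quad\Rightarrow\quad \lambda_i = \frac{1}{\mu}
  \quad \text{for all } i, \\
\sum_{i=1}^{3} \lambda_i = c
  &\quad\Rightarrow\quad
  \lambda_1 = \lambda_2 = \lambda_3 = \frac{c}{3}.
\end{align}
```

Because the logarithm is strictly concave, this stationary point is the constrained maximum: the determinant is largest when all eigenvalues are equal, so that the matrix is proportional to the identity.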
The main conclusion from the analysis presented in Secs. 2 and 3 is that among all measurement matrices of the form described by Eq. (1), those that minimize the condition number, those that minimize the EWV, and those that maximize the determinant of the Gram matrix are exactly the same. Our result can thus be said to unify many previous works on polarimeter optimization, e.g., the early work of Azzam et al.26 (which optimized based on the instrument matrix determinant), Ambirajan and Look22 (based on the condition number and determinant), Sabatke et al.24 (based on the EWV and determinant), and Tyo44 (based on condition number), among many others.
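The agreement between the three metrics can be checked numerically for $N = 4$. The sketch below compares a regular tetrahedron with a randomly perturbed configuration; it assumes ideal instrument matrix rows 0.5 * (1, s_n), and the function name `metrics`, the perturbation size, and the random seed are hypothetical choices.

```python
import numpy as np


def metrics(points):
    """Return (trace of inverse Gram matrix, Frobenius condition number, Gram determinant)."""
    p = np.asarray(points, dtype=float)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)  # keep states on the unit sphere
    W = 0.5 * np.hstack([np.ones((len(p), 1)), p])    # assumed ideal instrument matrix
    G = W.T @ W                                       # Gram matrix
    ewv = np.trace(np.linalg.inv(G))                  # EWV for unit noise variance
    kappa = np.linalg.norm(W, "fro") * np.linalg.norm(np.linalg.pinv(W), "fro")
    return ewv, kappa, np.linalg.det(G)


tetra = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
rng = np.random.default_rng(1)
perturbed = tetra + 0.2 * rng.normal(size=tetra.shape)  # renormalized inside metrics()

ewv_opt, kappa_opt, det_opt = metrics(tetra)
ewv_per, kappa_per, det_per = metrics(perturbed)
```

For the tetrahedron this gives a trace of 10, a Frobenius condition number of sqrt(20), and a Gram determinant of 1/27. The perturbed configuration has a larger trace, a larger condition number, and a smaller determinant. With this normalization the squared Frobenius norm of the instrument matrix is N/2, so the squared condition number equals N/2 times the trace in both cases.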
Modeling of the instrument matrix based on Eq. (1) implies that the transmittance of each polarization analyzer and the degree of polarization of the transmitted light are both equal to one. This assumption is frequently made in polarimetry; however, it is interesting to consider the case where it is not fulfilled. In the general case, each analyzer, as described by each row of the measurement matrix, may have a different transmission , , and a different resulting degree of polarization , , such that the measurement matrix can be expressed in the form:3. Specifically, when the transmission and degree of polarization of each analysis vector are fixed (albeit arbitrary), optimization of the positions of the analysis state vectors on the normalized Poincaré sphere (i.e., of ) yields the same result regardless of whether the condition number or the EWV is used as the performance metric. The EWV, however, also depends on the transmission and polarization factors ( and ), such that this equivalence breaks down when and are not fixed for each individual measurement. Letting , , and , by following a similar logic to Sec. 2 it can be shown [in analogy to Eq. (15)] that
Another important practical question is which of the three considered metrics is the most appropriate for evaluating the performance of a polarimeter under more general conditions. Indeed, from this point of view, the metrics are not necessarily equivalent, particularly in complex noise regimes or when nonideal polarization state analyzers are used. This is most easily seen by noting that the three metrics can be expressed in the form:
Another strong advantage of the EWV is that it can be used for polarimeter optimization in the presence of nonadditive noise sources. The EWV has been used to determine the optimal measurement frames in the presence of Poisson shot noise.45,46 In this case, the covariance matrix of the Stokes estimate takes a different form to that of Eq. (5). Consequently, the EWV is no longer given by Eq. (6), and is thus not proportional to the square of the condition number. Furthermore, when measurements are simultaneously affected by several types of statistically independent noise sources, the total EWV is simply the sum of the individual EWVs for each noise source. This additive property has been recently employed to characterize the actual performance of microgrid-based polarimetric cameras in the presence of both additive detection noise and Poisson shot noise.47
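The additivity of the EWV over independent noise sources can be sketched as follows. The example assumes a pseudoinverse estimator, tetrahedral analysis states with ideal rows 0.5 * (1, s_n), intensities expressed in photon counts so that the Poisson variance of each measurement equals its mean, and hypothetical values for the Stokes vector and the additive noise level.

```python
import numpy as np


def ewv(W, cov_I):
    """EWV of the pseudoinverse estimator: trace of W^+ cov_I (W^+)^T."""
    W_pinv = np.linalg.pinv(W)
    return np.trace(W_pinv @ cov_I @ W_pinv.T)


tetra = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
W = 0.5 * np.hstack([np.ones((4, 1)), tetra])  # assumed ideal instrument matrix

S = np.array([1000.0, 300.0, -200.0, 500.0])   # hypothetical Stokes vector (photon counts)
sigma = 5.0                                    # hypothetical additive noise std

cov_additive = sigma ** 2 * np.eye(4)          # white additive noise
cov_poisson = np.diag(W @ S)                   # shot noise: variance equals mean intensity

ewv_additive = ewv(W, cov_additive)
ewv_poisson = ewv(W, cov_poisson)
ewv_total = ewv(W, cov_additive + cov_poisson)  # both sources acting together
```

Since the covariance matrices of independent noise sources add and the trace is linear, `ewv_total` equals `ewv_additive + ewv_poisson`. Note that the shot-noise covariance depends on the input Stokes vector, unlike the additive term.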
In conclusion, the key finding of the present work is that when optimizing the estimation performance of a polarimeter in the presence of additive Gaussian noise, the Frobenius condition number of the instrument matrix, the Gram determinant, and the EWV are three strictly equivalent metrics. When evaluating and comparing the performance of different polarimeters, however, or when optimizing polarimeters in the presence of nonadditive, non-Gaussian noise sources, the EWV has strong advantages compared with the other two metrics.
We have shown that optimization of the EWV, of the Frobenius condition number, or of the determinant of the Gram matrix of a Stokes polarimeter leads to the same optimal measurement structures, namely, spherical 2 designs. These structures yield a very simple closed-form expression for the covariance matrix of the Stokes vector estimator and thus of the variances of each element of the Stokes vector. These expressions constitute the fundamental limit of the estimation variance that can be reached by a Stokes polarimeter in the presence of additive noise.
Finally, we stress that although the three considered metrics are equivalent for polarimeter optimization in the presence of additive noise, the EWV has the simplest physical interpretation since it corresponds to an estimation variance, which has a clear and useful statistical meaning. As a consequence, in contrast to the two other metrics, the EWV can be used for polarimeter optimization in the presence of noise sources with nonadditive, non-Gaussian, or mixed statistics. As discussed above, this problem has already been addressed by optimizing the EWV obtained after application of the pseudoinverse estimator.45,46 Although this procedure gives satisfactory results in practice,48 it is not strictly optimal. Indeed, in the presence of nonadditive and non-Gaussian noise, by virtue of the Cramér-Rao lower bound, the appropriate criterion is the trace of the inverse Fisher information matrix.17,18 The value of this metric corresponds to the EWV of an efficient estimator (where “efficient” is meant here in the precise sense used in estimation theory49), whereas in general the pseudoinverse estimator is not efficient. The interesting problem of analyzing the differences between the optimal measurement structures found using a Fisher information-based metric and the spherical 2 designs remains as future work.
Appendix A: Positivity of the Third Term of Eq. (15)
We demonstrate in this section that the third term of the expression of in Eq. (15) is positive definite. Since the matrix is by definition a positive matrix, the numerator of this term is also positive. We, therefore, need only analyze the denominator. Considering then the singular value decomposition , where and are unitary matrices and is diagonal, it is easily seen that
Appendix B: Satisfying Eqs. (16) and (20) with Spherical Designs
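Consider a finite set of $N$ points $\mathbf{s}_n$ ($n = 1, \ldots, N$), which lie on the surface of the three-dimensional unit sphere $S^2$. The set of points is said to constitute a spherical $t$ design if for any polynomial function $f$ of order $t$ or lower:
$$\frac{1}{N}\sum_{n=1}^{N} f(\mathbf{s}_n) = \frac{1}{4\pi}\int_{S^2} f(\mathbf{s})\, d\Omega. \tag{42}$$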
Proof that Eqs. (16) and (20) can be satisfied using spherical 2 designs follows by showing that we can generate the constraints through appropriate choice of polynomial functions of second-order degree or less in Eq. (42). Considering first the case (), substitution into Eq. (42) yields:
Appendix C: Previous Derivation
The constraints derived in Sec. 2 through direct minimization of the EWV were first derived by Foreman et al. exploiting a claimed equivalence between minimizing the trace of the inverse Gram matrix and maximizing the determinant of the Gram matrix. Specifically, using the definition of the matrix inverse and Jacobi’s formula, it was first shown that the condition number can be expressed in the form:35
The authors would like to thank Dr. A. Favaro for useful discussions. M. R. F. also acknowledges financial support from the Royal Society through a Royal Society University Research Fellowship. The authors declare they have no conflicts of interest.
Matthew R. Foreman received his MPhys degree from the University of Oxford in 2006 and his PhD from Imperial College London in 2010. He has held research posts at the UK National Physical Laboratory (Teddington) and the Max Planck Institute for the Science of Light (Erlangen), where he held an Alexander von Humboldt Fellowship. Currently, he is a Royal Society University Research Fellow at Imperial College London. His research interests include theoretical aspects of nanophotonics, plasmonics, polarimetry, and random scattering and sensing.
François Goudail graduated from the École Supérieure d’Optique (Orsay) in 1992 and received his PhD in 1997 from the University of Aix-Marseille III. He was an associate professor at Fresnel Institute (Marseille) until 2005. He is now a professor at the Institut d’Optique Graduate School (Palaiseau). His research topics include information extraction in images from different types of passive and active sensors (hyperspectral, SAR, polarimetric), wavefront engineering and joint design of optical systems, and image processing algorithms.