This chapter develops methods for the description of the polarization aberrations of optical systems with an emphasis on radially symmetric optical systems such as camera lenses, microscope objectives, and telescopes.
Aberrations are deviations from ideal and desired behavior. In conventional optical design, spherical wavefronts are desired. A uniform spherical wavefront with circular profile focuses to an Airy disk; any deviation of the wavefront from spherical, any aberration, increases the size of this image and thus reduces the resolution and information content of an image. To better understand and communicate aberration information, wavefront aberration terms are defined as a basis set of functions that can be added to provide a close fit to the target wavefront, as in Section 10.7.
Similarly, polarization aberrations are deviations from a uniform amplitude and uniform polarization state. To transmit wavefronts with a uniform polarization state for arbitrary inputs, the ray paths need to be free from diattenuation and retardance. Thus, polarization aberrations can be described as the deviations from an identity Jones matrix for all rays.
This chapter provides algorithms to analyze the polarization of weakly polarizing optical elements such as lenses and mirrors and to investigate how these polarizations interact. For very weakly polarizing elements, the Pauli matrix components can be added, providing great simplification and insight. For paraxial optics, three polarization aberrations occur, aberrations that are the diattenuation or retardance equivalents of defocus (quadratic), tilt (linear), and piston (constant), as shown in Figure 15.1. For lens and mirror interfaces, the variation of diattenuation and retardance with angle of incidence about normal incidence is mainly quadratic. Thus, after describing each interface by a quadratic variation of diattenuation and retardance, the polarization defocus, tilt, and piston coefficients for a radially symmetric system are readily calculated by a paraxial calculation.
Figure 15.1 Wavefront aberration (top row) and polarization aberration (bottom row) for piston (left), tilt (middle), and defocus (right).
In geometrical optics, the paraxial region comprises those rays for which the Seidel aberrations, spherical aberration, coma, and astigmatism, are not significant, perhaps less than a tenth of a wave. For the paraxial polarization aberrations, the paraxial region spans the set of rays near the axis where second-order fits to the amplitude coefficients are accurate; see Chapter 8 for a discussion of the region of accuracy for second-order fits to Fresnel equations. The polarization paraxial region turns out to be much larger than the paraxial region for rays and wavefront. Thus, the paraxial polarization algorithms developed here are applicable to a large number of lens and mirror systems. Paraxial optical calculations provide a great simplification over ray tracing equations. An additional simplification for the polarization calculations is that the rays can be treated as propagating along the z-axis, neglecting the z-component of the electric field; hence, Jones matrices will be used for the paraxial analysis instead of 3 × 3 polarization ray tracing matrices.
The development of paraxial polarization aberrations follows these steps. A summary of the paraxial ray trace equations is presented in the Appendix to this chapter. The paraxial ray trace is used to calculate the angle of incidence across the wavefront at each interface. The amplitude coefficients, such as Fresnel equations for uncoated interfaces, are fit to quadratic equations. The combination of the paraxial angle of incidence with these quadratic amplitude relations leads to surface contributions of the polarization aberration forms, three for linear diattenuation and three for linear retardance, at each interface. These surface contributions can be summed over the interfaces to obtain a paraxial polarization aberration for the optical system. Several examples will demonstrate the utility of the paraxial polarization aberrations. Finally, Zernike polynomials are generalized for the discussion of higher-order polarization aberrations.
One interesting result is a binodal form for the polarization aberration, a form where the magnitudes of the off-axis retardance and diattenuation aberrations have two zeros in the pupil; a similar result occurs with astigmatism in tilted or decentered systems where it is called binodal astigmatism.1
As is common in aberration theory, when the terms linear, quadratic, and so on are used, approximately linear and approximately quadratic are implied.
The aberration of an optical system is its deviation from ideal performance. In an imaging system with ideal spherical or plane wave illumination, the desired output is spherical wavefronts with constant amplitude and constant polarization state centered on the correct image point. Deviations from spherical wavefronts arise from variations of optical path length of rays through the optic due to the geometry of the optical surfaces and the laws of reflection and refraction; this is described by the wavefront aberration function. Deviations from constant amplitude arise from differences in reflection or refraction efficiency between rays. Amplitude variations are amplitude aberration or apodization. Polarization change also occurs at each reflecting and refracting surface due to differences of reflectance and transmission coefficients between the s- and p-components of the light. Across a set of rays, the angles of incidence and thus the polarization change varies, so that a uniformly polarized input beam has polarization variations when exiting.2,3 For many optical systems, the desired polarization output would be a constant polarization state with no polarization change transiting the system; such ray paths through an optical system can be described by identity Jones matrices. Deviations from this identity matrix are referred to as polarization aberrations.
Polarization aberration, also called instrumental polarization, refers to all polarization changes of the optical system and the variations with pupil coordinate ρ, object location H, and wavelength λ, J(H, ρ, λ). The term Fresnel aberrations (Chapter 12) refers to those polarization aberrations that arise strictly from the Fresnel equations, that is, systems of metal-coated mirrors and uncoated lenses.2,3,4,5 Multilayer coated surfaces produce polarization aberrations with similar functional forms to uncoated surfaces with larger or smaller magnitudes, depending on the coating and wavelength. For systems with homogeneous and isotropic interfaces, polarization aberrations are predominantly related to linear diattenuation and retardance; their magnitudes are described by even functions of angle of incidence.
Since the Jones matrix has eight degrees of freedom, variations from the Jones pupil identity matrix can be described with an expansion into a set of eight Zernike polynomials for either the real and imaginary parts, or the amplitudes and phases of each element. In reflecting and refracting optical systems, most of the deviation from the identity matrix Jones pupil due to polarization aberration occurs as linear diattenuation and linear retardance. Linear retardance and linear diattenuation do not behave as vectors, and expansion of retardance aberrations into vector Zernike polynomials is not appropriate. Instead, a new mathematical object, orientors, is introduced, which provides a better description for linear retardance aberrations. Similarly, orientors provide a useful basis for the expansion of linear diattenuation aberrations. Further explanation of the orientors is provided in Section 15.5.2.
Lens and mirror surfaces are typically weak polarization elements. Weak polarization elements, as described in Section 14.8, have a small amount of diattenuation and retardance, such that the diattenuation D and retardance δ are much less than one, D ≪ 1 and δ ≪ 1.
The Jones matrices for weak polarization elements are usefully expressed as follows. First, the Jones matrix J is expressed as a sum of Pauli matrices, J = c0σ0 + c1σ1 + c2σ2 + c3σ3.
Then, the coefficient c0 of the identity matrix σ0 is factored out
c0 characterizes the average change in the amplitude and phase of the light, which are readily identified in polar coordinates,
Next, the remaining coefficients are separated into real and imaginary parts,
With the addition of the factor of one-half, the D’s and δ’s are now in units of diattenuation and retardance. This equation is the canonical form for the Jones matrix of a weak polarization element, as discussed in Chapter 14. The minus signs are chosen to remain consistent with the decreasing phase sign convention used throughout. Equation 15.5 shows the first-order series expansion of the general exponential form for the Jones matrix for diattenuators and retarders written in terms of three diattenuation components and three retardance components; see Equation 14.78.
Equation 15.5, the canonical form, is the preferred and most useful form for the Jones matrix of a ray intercept, providing a simple expression in terms of three diattenuation and three retardance components. The utility of this form for the Jones matrix is that when the diattenuation and retardance components are small, they can simply be added, as the order-dependent terms are very small. For ray intercepts, the interface Jones matrices generally describe linear, not elliptical or circular, diattenuation and retardance, so DL and δL are generally zero.
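The canonical form can be illustrated with a minimal Python sketch. It assumes the form ρ0 e^(−iφ0) [σ0 + Σ (Dk − iδk)/2 σk] suggested by the text (factor of one-half, minus sign on the retardance) and the Pauli ordering σ1 = horizontal/vertical, σ2 = ±45°, σ3 = circular; the function names and the numerical values are hypothetical and are not taken from the chapter.

```python
import cmath

# Pauli matrices in the H / 45 / circular ordering assumed for this sketch.
S0 = ((1, 0), (0, 1))
S1 = ((1, 0), (0, -1))
S2 = ((0, 1), (1, 0))
S3 = ((0, -1j), (1j, 0))

def weak_jones(rho0, phi0, d, delta):
    """Assumed canonical form: rho0*exp(-i*phi0)*[s0 + sum_k (D_k - i*delta_k)/2 * s_k]."""
    c0 = rho0 * cmath.exp(-1j * phi0)
    coeffs = [(dk - 1j * rk) / 2 for dk, rk in zip(d, delta)]
    return tuple(
        tuple(c0 * (S0[r][c] + sum(k * s[r][c] for k, s in zip(coeffs, (S1, S2, S3))))
              for c in range(2))
        for r in range(2))

def pauli_coefficients(j):
    """Complex Pauli coefficients (c0, c1, c2, c3) of a 2x2 matrix."""
    (a, b), (c, d) = j
    return ((a + d) / 2, (a - d) / 2, (b + c) / 2, 1j * (b - c) / 2)

def diattenuation_retardance(j):
    """Recover the D and delta components: real and (negated) imaginary parts of 2*c_k/c0."""
    c0, c1, c2, c3 = pauli_coefficients(j)
    ratios = [2 * c / c0 for c in (c1, c2, c3)]
    return [r.real for r in ratios], [-r.imag for r in ratios]

# Hypothetical weak element: a round trip through the canonical form.
J = weak_jones(0.9, 0.3, (0.002, -0.001, 0.0), (0.004, 0.003, 0.0))
print(diattenuation_retardance(J))
```

Dividing by c0 before taking real and imaginary parts is what separates the average amplitude and phase from the polarization-dependent part.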
The Jones matrix for a sequence of polarization interactions is the matrix product of the Jones matrices for the individual interactions. For weak polarization interactions, the resultant Jones matrix can be simplified to the sum of the Pauli coefficients for the individual interactions. The Jones matrix for a sequence of two weakly polarizing interactions, which can be two ray intercepts in a weakly polarizing optical system, is studied to observe how the polarization properties interact. This leads to a simple equation for sequences of weak polarization elements. Since most reflecting and refracting interfaces are linear diattenuators and linear retarders, not circular or elliptical, we start with an example of the Jones matrices of two linear interfaces.
Consider a light ray entering and exiting a lens with antireflection coatings (Section 13.2.1). Antireflection coatings are weakly polarizing with linear eigenpolarizations; thus, the diattenuation, D, and retardance, δ, coefficients are small,
The second subscript refers to the first or second ray intercept. There are no σ3 components. The H and 45 components depend on the planes of incidence for the ray intercepts. The Jones matrices for the two ray intercepts in canonical form are
Using the Pauli matrix identities in Section 14.2.1, σ0σ0 = σ0, σ0σ1 = σ1σ0 = σ1, σ0σ2 = σ2σ0 = σ2, and
the product J2J1 is
where χ and Χ are higher order in the D’s and δ’s. To first order, the linear diattenuation of the matrix product J2J1 is the sum of the individual diattenuation components: DH,1 + DH,2 and D45,1 + D45,2. Similarly, the linear retardance of J2J1 is the sum of the linear retardance components δH,1 + δH,2 and δ45,1 + δ45,2. This is the main simplifying result for weak polarization elements.
The σ3 (circular) components of J2J1 arise from the following product terms:
the interaction of the σ1-component of one ray intercept with the σ2-component of the other. The real part of the σ3-component is a circular diattenuation term,
which arises from the interaction of the diattenuation of one surface with the retardance of the other surface. The imaginary part of the σ3-component is a circular retardance term
It is obvious that combinations of linear retardance 45° apart generate a circular retardance component (i.e., elliptical retardance), as seen in the latter two terms above. Less obvious are the first two terms, where the interaction of two diattenuations also generates elliptical retardance, since they generate an overall rotation on the Poincaré sphere; see Example 14.3. Both χ and Χ involve the products of two small D and δ coefficients and hence are second-order terms. Thus, if the diattenuations and retardances of a ray intercept are small, for example, of the order 10⁻³, these circular terms are of the order 10⁻⁶, which is negligible. Also note that when the order is reversed, the product J1J2 is
which is the same as J2J1 except that the circular terms χ and Χ changed sign. This is an important result. To first order, the product of weakly polarizing Jones matrices is order independent, and the diattenuation and retardance contributions can just be added.
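The order independence can be checked numerically. The sketch below multiplies two weak linear Jones matrices in both orders and compares their Pauli coefficients; it assumes the canonical form with coefficients (D − iδ)/2, omits the common factor ρ0 e^(−iφ0), and uses hypothetical coefficient values of order 10⁻³.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
                 for r in range(2))

def weak_linear(dh, d45, rh, r45):
    """sigma0 + a*sigma1 + b*sigma2 with a = (D_H - i*delta_H)/2, b = (D_45 - i*delta_45)/2."""
    a = (dh - 1j * rh) / 2
    b = (d45 - 1j * r45) / 2
    return ((1 + a, b), (b, 1 - a))

def pauli_coefficients(j):
    (p, q), (r, s) = j
    return ((p + s) / 2, (p - s) / 2, (q + r) / 2, 1j * (q - r) / 2)

# Hypothetical first and second ray intercepts.
J1 = weak_linear(1e-3, 2e-3, 3e-3, -1e-3)
J2 = weak_linear(-2e-3, 1e-3, 1e-3, 2e-3)

c21 = pauli_coefficients(matmul(J2, J1))
c12 = pauli_coefficients(matmul(J1, J2))

print("sigma1:", c21[1], c12[1])   # linear terms agree for both orders
print("sigma2:", c21[2], c12[2])
print("sigma3:", c21[3], c12[3])   # circular terms are second order and change sign
```

The σ1 and σ2 coefficients of the product are the sums of the individual coefficients, while the σ3 coefficient is a product of two small numbers and reverses sign with the order.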
In weakly polarizing optical systems, the circular terms are interesting but very small and thus are usually not important. However, as the strength of the interactions increases, these circular terms become more important.
Example 15.1 Weak Polarization Aberration in a Ray
A meridional ray propagates through an uncoated lens with two surfaces in a plane at 45° between x and y. At the first ray intercept, the ray has an amplitude change ρ0,1 = 0.79889 with diattenuation DH,1 = −0.00020 and D45,1 = 0.00020. At the second ray intercept, the ray has an amplitude change ρ0,2 = 1.21583 with diattenuation DH,2 = −0.00186 and D45,2 = 0.00186. The overall Jones matrix is
Note that the diattenuations at the two surfaces are aligned; hence, the circular cross term is zero.
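The numbers of Example 15.1 can be run through a short script as a consistency check. This is a sketch that assumes the canonical form ρ0[σ0 + (DH/2)σ1 + (D45/2)σ2] for each intercept, with no retardance and no phase; the helper names are hypothetical.

```python
def matmul(a, b):
    return tuple(tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
                 for r in range(2))

def weak_linear_diattenuator(rho0, dh, d45):
    """Assumed form rho0*[sigma0 + (D_H/2)*sigma1 + (D_45/2)*sigma2]."""
    a, b = dh / 2, d45 / 2
    return ((rho0 * (1 + a), rho0 * b), (rho0 * b, rho0 * (1 - a)))

# Values quoted in Example 15.1.
J1 = weak_linear_diattenuator(0.79889, -0.00020, 0.00020)
J2 = weak_linear_diattenuator(1.21583, -0.00186, 0.00186)
J = matmul(J2, J1)

(p, q), (r, s) = J
c0 = (p + s) / 2
net_dh = (p - s) / c0          # 2*c1/c0
net_d45 = (q + r) / c0         # 2*c2/c0
c3 = 1j * (q - r) / 2          # circular cross term

print("amplitude:", c0)
print("net D_H, D_45:", net_dh, net_d45)
print("sigma3 coefficient:", c3)
```

Because both intercepts have the same ratio of DH to D45, their diattenuation axes coincide and the σ3 coefficient vanishes to rounding error.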
Section 15.2.1 demonstrated how the linear diattenuation and retardance of two weakly linearly polarizing interfaces, to first order, just add. Next, this result is extended to an arbitrary number of weakly linearly polarizing interfaces, such as optical systems formed from uncoated interfaces, antireflection coatings, or metallic reflections at small angles of incidence. A geometrical picture of this summation is provided by adding the complex Pauli coefficients in a vector-like manner.
A light ray propagating through an optical system encounters a series of weakly polarizing ray intercepts numbered q = 1, 2, …, Q, each with Jones matrix Jq expressed in the normalized Pauli summation form
Although we are principally discussing optics with interfaces with linear eigenpolarizations (zero σ3), σ3 is included here for generality; its coefficients DL,q and δL,q will usually be zero. The Jones matrix J for the weakly polarizing sequence to first order is
The complex Pauli coefficients to first order are
where
Thus, for sequences of weakly polarizing ray intercepts, the net diattenuation is approximately the sum of the individual diattenuation components, that is, a sum over the σ1 components, a sum over the σ2 components, and, if present, a sum over the σ3 components. Similarly, the net retardance is approximated by the sum of the individual σ1, σ2, and, if present, the σ3 retardance components. The diattenuations correspond to real parts and the retardances correspond to imaginary parts.
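A minimal numerical sketch of this summation rule follows. It compares the component sums with the components extracted from the exact matrix product for three hypothetical weak intercepts, again assuming coefficients of the form (D − iδ)/2 and dropping the amplitude and phase factors.

```python
from functools import reduce

def matmul(a, b):
    return tuple(tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
                 for r in range(2))

def weak(d, delta):
    """sigma0 + sum_k (D_k - i*delta_k)/2 * sigma_k for k = 1, 2, 3."""
    a, b, c = [(dk - 1j * rk) / 2 for dk, rk in zip(d, delta)]
    return ((1 + a, b - 1j * c), (b + 1j * c, 1 - a))

def components(j):
    """Diattenuation and retardance components from 2*c_k/c0."""
    (p, q), (r, s) = j
    c0 = (p + s) / 2
    ratios = [(p - s) / c0, (q + r) / c0, 1j * (q - r) / c0]
    return [z.real for z in ratios], [-z.imag for z in ratios]

# Hypothetical intercepts: ((D_H, D_45, D_L), (delta_H, delta_45, delta_L)).
INTERCEPTS = [
    ((1e-3, 2e-3, 0.0), (3e-3, -1e-3, 0.0)),
    ((-2e-3, 1e-3, 0.0), (1e-3, 2e-3, 0.0)),
    ((5e-4, -1e-3, 0.0), (-2e-3, 1e-3, 0.0)),
]

# Exact product J = J_Q ... J_2 J_1 (later intercepts multiply from the left).
exact = reduce(lambda acc, j: matmul(j, acc), [weak(d, r) for d, r in INTERCEPTS])
d_exact, r_exact = components(exact)

# First-order estimate: add the components intercept by intercept.
d_sum = [sum(d[k] for d, _ in INTERCEPTS) for k in range(3)]
r_sum = [sum(r[k] for _, r in INTERCEPTS) for k in range(3)]

print("diattenuation exact vs sum:", d_exact, d_sum)
print("retardance exact vs sum:  ", r_exact, r_sum)
```

For components of order 10⁻³ the two results differ only at second order, including the small σ3 terms generated by the products.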
Equation 15.18 leads to a vector-like geometrical representation of the summation, not in x–y space but in (σ1, σ2, σ3) space as follows. Consider just the linear diattenuation contributions from a series of three ray intercepts in four steps as in Figure 15.2. (1) The top row represents the magnitudes and orientations of the three diattenuations in x–y space as lines or diattenuation orientors (see Section 15.5.2). Diattenuations repeat after a rotation of 180°. Remember, the Pauli bases σ1 and σ2 are only 45° apart. (2) To transform diattenuation lines into the Pauli basis, the angles from the x-axis are doubled, and the lines are converted to vectors as shown on the second row. (3) The “Pauli vectors” are added as vectors yielding the combined vector in black. (4) Finally, the angle is halved and the vector is converted back into a line, as it is returned to the x–y space. Note that this is an approximate calculation in the limit of weak diattenuators that ignores the σ3 component, which would occur as a cross-product-like component out of the plane of the page. A separate and parallel Pauli-vector calculation would apply to the summation of the retardance components.
Figure 15.2 Geometrical view of the combination of weak diattenuations. (1) Three diattenuations with magnitudes indicated by line lengths and orientations θ1, θ2, and θ3. (2) Converted into Pauli coefficient “vectors” by doubling the angles; magnitudes unchanged. (3) Vector addition of Pauli coefficients. (4) Resultant linear diattenuation obtained by halving the angle from the vector addition. The same construction applies to weak retarders.
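The four-step construction of Figure 15.2 can be sketched in a few lines of Python. The function name and the sample values are hypothetical; the calculation is the first-order rule only and ignores the σ3 component, as the text notes.

```python
import math

def combine_weak_linear(elements):
    """Combine weak linear diattenuations (or retardances).

    elements: sequence of (magnitude, orientation in degrees from the x-axis).
    Returns (magnitude, orientation in degrees, folded into [0, 180)).
    """
    elements = list(elements)
    # Step 2: double the angles to move into the (sigma1, sigma2) plane.
    c1 = sum(m * math.cos(2 * math.radians(t)) for m, t in elements)
    c2 = sum(m * math.sin(2 * math.radians(t)) for m, t in elements)
    # Step 3: the sums above are the vector addition of the Pauli coefficients.
    magnitude = math.hypot(c1, c2)
    # Step 4: halve the angle to return to x-y space.
    orientation = (0.5 * math.degrees(math.atan2(c2, c1))) % 180.0
    return magnitude, orientation

print(combine_weak_linear([(0.01, 30.0), (0.02, 30.0)]))   # parallel axes add
print(combine_weak_linear([(0.01, 0.0), (0.01, 90.0)]))    # crossed axes cancel
print(combine_weak_linear([(0.01, 0.0), (0.01, 45.0)]))    # axes 45 degrees apart
```

Two equal elements 45° apart are perpendicular in the Pauli plane, so the resultant has √2 times the magnitude and an orientation halfway between them.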
A description of the polarization aberrations in the paraxial region of lens and mirror systems will be developed in this section in the form of an expansion in polarization aberration terms similar to the Seidel aberrations. A polarization aberration expansion to only second order in the diattenuation and retardance can provide an accurate polarization description of a large fraction of the object and pupil for many radially symmetric systems. A method is presented to calculate these coefficients from the paraxial ray trace combined with a Taylor series expansion description of the interface polarization, an expansion of either Fresnel equations for uncoated and metal interfaces, or thin film amplitude coefficients. A summary of paraxial ray tracing is included in the Appendix. This treatment includes the calculation of angles of incidence, planes of incidence, and the propagation vectors of skew rays, which are needed for polarization aberration calculation. These polarization aberration coefficients can also be determined by polarization ray tracing and fitting functions to the Jones pupil; an example of this is provided in Chapter 27 (Summary and Conclusions).
The optical system is described by normalized object H and pupil coordinates, for the entrance pupil ρE and exit pupil ρX as described in Section 10.7.1 and as shown in Figure 15.3.
Figure 15.3 Normalized coordinates for the paraxial polarization expansion.
The polarization aberrations for lens and mirror surfaces depend on the variation of angle of incidence and the orientation of the plane of incidence across the beam. Figure 15.4 shows the wavefront from an on-axis object incident on a spherical interface, together with the angle of incidence over the pupil for this on-axis object. The angle of incidence is zero for the ray in the center of the pupil (ρ = 0). For rays from the on-axis object point H = 0, the angle of incidence increases linearly to the marginal ray angle of incidence im at the edge of the pupil. The orientation Φ of the plane of incidence is radially oriented.
Figure 15.4 (Left) The wavefront from an on-axis object point incident on a spherical interface has an angle of incidence of zero at the center of the beam. (Right) The paraxial angle of incidence for an on-axis wavefront at a spherical surface increases linearly from the center and equals the marginal ray angle of incidence at the edge. Line lengths indicate the angle of incidence and the orientation indicates the plane of incidence.
Figure 15.5 shows the wavefront as the object moves away from the axis. Since the wavefront is spherical and the surface is spherical, the functional form remains like Figure 15.4, but is shifted and centered on whichever ray is at normal incidence, as graphed in Figure 15.6.
Figure 15.5 Wavefronts from an on-axis and two off-axis objects incident on a spherical surface. All three cases are a spherical wavefront tangent to a spherical interface.
Figure 15.6 Angle of incidence maps as the wavefronts of Figure 15.5 move off-axis. The patterns shift, but otherwise are unchanged. The ray through the center of each pattern (inside blue circle) is the chief ray for that field.
The angle of incidence, θ(H, ρ) for marginal rays and skew rays at vector object position H and pupil position ρ is calculated using the Pythagorean theorem because θ(H, ρ) has both x- and y-components:
where ic is the chief ray angle of incidence.
Consider an optical system with a series of surfaces q = 1, 2, …, Q. The paraxial angle of incidence at each surface q is a function of the chief ray angle of incidence ic,q and the marginal ray angle of incidence im,q,
where (G, H) are the x- and y-components of H, and (x, y) are the x- and y-components of ρ. For simplicity, the object is preferably placed on the y-axis, G = 0, and the angle of incidence simplifies to
or in polar pupil coordinates, ρ and ϕ,
The orientation of the plane of incidence, Φ, measured counterclockwise from the x-axis, is, as seen in Figure 15.6,
If θx = 0, the plane of incidence then intersects the x–y plane in a vertical line. The orientation of the plane of incidence is
so
The paraxial polarization aberrations for a single lens or mirror surface are obtained by combining the paraxial angle of incidence with a quadratic expression for diattenuation or retardance as a function of angle of incidence. By combining the angle of incidence functions with approximations for the Jones matrix of a surface or coating, an approximate expression for the Jones matrix of a surface is developed. The diattenuation D(θ) and retardance δ(θ) of interfaces are well approximated for all isotropic coatings and uncoated interfaces at small angles of incidence θ by simple quadratic equations,
where D2 and δ2 are coefficients of the diattenuation and retardance functions. For uncoated and coated interfaces, D2 and δ2 are found for each interface in the system from polynomial fits to the multilayer intensity reflection and transmission equations and associated diattenuation and retardance expressions; see Math Tip 13.1.
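The paraxial angle of incidence, the orientation of the plane of incidence, and the quadratic interface model can be sketched as follows. The sketch assumes that the x- and y-components of the angle of incidence are the linear combinations θx = im x + ic G and θy = im y + ic H, combined by the Pythagorean theorem as described above, and that D(θ) = D2 θ² with θ in radians; the function names and the numerical values are hypothetical.

```python
import math

def incidence(i_m, i_c, field, pupil):
    """Paraxial angle of incidence and plane-of-incidence orientation at one surface.

    i_m, i_c: marginal and chief ray angles of incidence (radians).
    field: normalized object coordinates (G, H); pupil: normalized pupil coordinates (x, y).
    Returns (theta, Phi) with Phi measured counterclockwise from the x-axis.
    """
    G, H = field
    x, y = pupil
    theta_x = i_m * x + i_c * G
    theta_y = i_m * y + i_c * H
    return math.hypot(theta_x, theta_y), math.atan2(theta_y, theta_x)

def quadratic_polarization(coef2, theta):
    """Assumed small-angle model D(theta) = D2*theta**2 (likewise delta2 for retardance)."""
    return coef2 * theta ** 2

# On-axis object, ray at the edge of the pupil along x: theta equals i_m, Phi = 0.
print(incidence(0.2, 0.1, (0.0, 0.0), (1.0, 0.0)))
# Chief ray for an object on the y-axis: theta equals i_c, plane of incidence is vertical.
print(incidence(0.2, 0.1, (0.0, 1.0), (0.0, 0.0)))
```

Since the plane of incidence is a line rather than a direction, Φ and Φ + 180° describe the same orientation.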
Consider an uncoated refracting surface, which has no retardance, only diattenuation. A pupil map of the diattenuation for an on-axis object point, shown in Figure 15.7, is obtained by squaring the magnitude of the angle of incidence map of Figure 15.6 while leaving the orientation unchanged. The diattenuation magnitude at the edge of the pupil is D2, the diattenuation of the marginal ray. This diattenuation pattern varies quadratically with pupil coordinate and so has been named diattenuation defocus. The p-Fresnel coefficient has a greater magnitude than the s-Fresnel coefficient; hence, the transmission axis is aligned with the plane of incidence. The uncoated interface has no retardance since the refractive index is real on both sides of the interface. The corresponding Jones pupil equation is
Figure 15.7 (Left) The pupil map for the diattenuation of a single refracting surface with an on-axis wavefront has a diattenuation that increases quadratically from the center and is radially oriented. The diattenuation magnitude around the edge of the pupil is the diattenuation for the interface evaluated at the marginal ray angle of incidence. The σ1 (middle) and σ2 (right) components of the diattenuation map are plotted. The red arrows point from the x–y plane to help visualize the shape. σ1 has a cos 2ϕ form and σ2 has a sin 2ϕ form.
The Pauli coefficients for diattenuation defocus are plotted in Figure 15.7 (middle and right). The functional form of diattenuation defocus, for small D2, is given by the matrix in Equation 15.30. The σ1 is positive along the x-axis and negative along the y-axis while the σ2 component is rotated by 45°. Figure 15.8 shows the output polarization state maps when 0°, 45°, 90°, and 135° linearly polarized light is incident. The edge of the pupil is brighter than the center along the axis aligned with the linear polarization and dimmer along the orthogonal axis. The polarization undergoes maximum polarization rotation at ±45° to the polarization axis. Figure 15.9 contains the corresponding maps when right and left circularly polarized light are incident. There is no polarization change at the center of the pupil. The change increases quadratically toward the edge with the light becoming elliptically polarized with a major axis parallel to the diattenuation.
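A first-order Jones pupil for diattenuation defocus can be sketched from the description above: a radially oriented diattenuation that grows as ρ², with a cos 2ϕ dependence in σ1 and a sin 2ϕ dependence in σ2. The form below, σ0 + (d_edge ρ²/2)(cos 2ϕ σ1 + sin 2ϕ σ2), is an assumption consistent with that description, not a transcription of Equation 15.30; d_edge is a hypothetical name for the diattenuation of the marginal ray.

```python
import math

def diattenuation_defocus(d_edge, rho, phi):
    """Assumed first-order Jones matrix at pupil position (rho, phi); phi in radians."""
    half = 0.5 * d_edge * rho ** 2
    c1 = half * math.cos(2 * phi)   # sigma1 coefficient
    c2 = half * math.sin(2 * phi)   # sigma2 coefficient
    return ((1 + c1, c2), (c2, 1 - c1))

# Edge of the pupil on the x-axis: x-polarized light is p-polarized and transmits more.
print(diattenuation_defocus(0.3, 1.0, 0.0))
# Edge of the pupil on the y-axis: the sigma1 coefficient changes sign.
print(diattenuation_defocus(0.3, 1.0, math.pi / 2))
# Center of the pupil: identity matrix, no polarization change.
print(diattenuation_defocus(0.3, 0.0, 0.0))
```

The value 0.3 matches the marginal ray diattenuation quoted for Figure 15.8, which is large for a first-order form and is used here only for illustration.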
Figure 15.8 The transmitted polarization ellipse map for the diattenuation map of Figure 15.7 when, from left to right, 0°, 45°, 90°, and 135° linearly polarized light is incident. Diattenuation magnitude is 0.3 for the marginal ray in this example.
Figure 15.9 The transmitted polarization ellipse for the diattenuation map of Figure 15.7 when (left) left circular and (right) right circular polarized light is incident.
Next, compare a metal mirror on-axis with the uncoated lens surface of Section 15.3.3. The variation of the intensity reflectance and phase is quadratic near the origin for most interfaces. Figure 15.10 plots the Fresnel coefficients, diattenuation, and retardance of a typical metal mirror coated with aluminum. Such reflecting surfaces have non-zero retardance, and both diattenuation and retardance are nearly quadratic with incident angle.
Figure 15.10 The Fresnel intensity reflection coefficients (upper left), phases (lower left), diattenuation (upper right), and retardance (lower right) of an aluminum-coated mirror as a function of angle of incidence are all quadratic near the origin. Red indicates s-polarization and orange indicates p-polarization. These figures are for an aluminum reflecting surface with n = 1.262 + 7.185i at 600 nm.
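The near-quadratic behavior can be checked with a short Fresnel calculation for the aluminum index quoted in the caption. This sketch assumes reflection in air, the standard Fresnel amplitude coefficients, diattenuation defined as (Rs − Rp)/(Rs + Rp), and retardance taken as the magnitude of the phase difference between rs and −rp (which is zero at normal incidence); sign conventions differ between texts, so only magnitudes are reported.

```python
import cmath
import math

N_AL = 1.262 + 7.185j   # aluminum at 600 nm, from the caption of Figure 15.10

def fresnel_reflection(n, theta):
    """Amplitude reflection coefficients (rs, rp) from air onto a medium of index n."""
    cos_i = math.cos(theta)
    cos_t = cmath.sqrt(1 - (math.sin(theta) / n) ** 2)
    rs = (cos_i - n * cos_t) / (cos_i + n * cos_t)
    rp = (n * cos_i - cos_t) / (n * cos_i + cos_t)
    return rs, rp

def diattenuation_and_retardance(n, theta):
    """Diattenuation and retardance magnitude (radians) at angle of incidence theta."""
    rs, rp = fresnel_reflection(n, theta)
    big_rs, big_rp = abs(rs) ** 2, abs(rp) ** 2
    diattenuation = (big_rs - big_rp) / (big_rs + big_rp)
    retardance = abs(cmath.phase(-rp / rs))
    return diattenuation, retardance

for deg in (0, 5, 10, 20):
    d, r = diattenuation_and_retardance(N_AL, math.radians(deg))
    print(deg, d, r)
```

Doubling a small angle of incidence should increase both quantities by close to a factor of four, which is the property the quadratic fits D2 θ² and δ2 θ² rely on.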
The general spherical interface with a multilayer reflecting or refracting coating and illuminated on-axis has both diattenuation defocus and retardance defocus, and each can have either a radial or tangential axis. Thus, for an on-axis beam, there are four possibilities for the combined signs of the diattenuation and retardance for an interface, as shown in the four columns of Figure 15.11. Metal mirrors have the form of the third column.
Figure 15.11 Combinations of retardance defocus (magenta) and diattenuation defocus (brown) can come with four combinations of signs: (left) positive retardance and positive diattenuation, (second) positive and negative, (third) negative and positive, and (right) negative and negative.
Figure 15.6 showed the variation of the paraxial polarization aberration, either diattenuation or retardance, as an object moves off-axis. The polarization aberration pattern translates in the direction H is moving. The magnitude of the σ1 coefficient is a shifted quadratic, as shown in Figure 15.12. These patterns can be decomposed into quadratic, linear, and constant terms. The quadratic term maintains the same magnitude as the on-axis pattern as the pattern shifts. The Pauli coefficients at the edge of the field of view have the form shown in Figure 15.7.
Figure 15.12 A decentered quadratic equation expressed as the sum of quadratic, linear, and constant components.
Consider an object off axis in the y-direction. The polarization aberration pattern of Figure 15.13 (left) can be expressed as a combination of a quadratic diattenuation map (second), a linear diattenuation map (third), and a constant diattenuation map (right). These are the second-order polarization aberrations. These functional forms are given the names polarization defocus, polarization tilt, and polarization piston. When they refer to diattenuation, they become diattenuation defocus, diattenuation tilt, and diattenuation piston. Similarly, for retardance, they become retardance defocus, retardance tilt, and retardance piston.
Figure 15.13 The diattenuation map for an off-axis beam (left) has a quadratic variation along the y-axis. This map can be expressed as the sum of a quadratic variation, linear variation, and constant diattenuation. These are diattenuation defocus, diattenuation tilt, and diattenuation piston aberrations. Parallel components add, crossed components subtract, and others combine as shown in Figure 15.2.
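The decomposition into defocus, tilt, and piston can be sketched compactly with complex numbers. Here a point in the (σ1, σ2) plane is stored as a complex number (real part σ1, imaginary part σ2), and the angle-of-incidence vector is stored as t = im ρ + ic H with ρ = x + iy and H = G + iH. Squaring t squares the magnitude and doubles the angle, which is the quadratic law combined with the angle doubling of Figure 15.2. This representation, the function names, and the numerical values are assumptions made for illustration.

```python
def pauli_vector(coef2, i_m, i_c, field, pupil):
    """Shifted quadratic pattern: sigma1 + i*sigma2 = coef2 * (i_m*rho + i_c*H)**2."""
    t = i_m * complex(*pupil) + i_c * complex(*field)
    return coef2 * t ** 2

def surface_terms(coef2, i_m, i_c, field):
    """Coefficients multiplying rho**2 (defocus), rho (tilt), and 1 (piston)."""
    h = complex(*field)
    defocus = coef2 * i_m ** 2
    tilt = 2 * coef2 * i_m * i_c * h
    piston = coef2 * (i_c * h) ** 2
    return defocus, tilt, piston

# Hypothetical surface, object at the edge of the field on the y-axis.
coef, i_m, i_c, field = 0.05, 0.2, 0.1, (0.0, 1.0)
defocus, tilt, piston = surface_terms(coef, i_m, i_c, field)

for pupil in [(0.3, -0.4), (1.0, 0.0), (0.0, 1.0)]:
    rho = complex(*pupil)
    direct = pauli_vector(coef, i_m, i_c, field, pupil)
    summed = defocus * rho ** 2 + tilt * rho + piston
    print(pupil, direct, summed)
```

For an object on the y-axis the tilt term is purely σ1 along the pupil y-axis and purely σ2 along the pupil x-axis, and the piston term is a negative σ1 component, that is, a vertically oriented constant.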
Diattenuation tilt and retardance tilt are linear variations in the pupil of σ1 and σ2. The σ1 component varies linearly in the meridional plane (here the y-axis) and the σ2 component varies linearly perpendicular to the meridional plane (here the x-axis), as shown in Figure 15.14. In a radially symmetric system, polarization tilt is zero on-axis and varies linearly with the object vector.
Figure 15.14 Polarization tilt, either diattenuation tilt or retardance tilt, has a linear variation of magnitude from the center and changes sign (90° rotation) passing through the center. The axis rotates by 180° around the edge of the pupil. The σ1 component varies linearly in the meridional plane, here the y-axis, and the σ2 component varies linearly in the orthogonal direction. The red arrows point from σ = 0.
One way an optical system can generate pure polarization tilt is through the combination of two polarization defocus patterns, equal in magnitude and opposite in sign, shifted in opposite directions, as shown in Figure 15.15,
Figure 15.15 Pure polarization tilt can be generated from two equal but opposite polarization defocus contributions, one shifted up, the other shifted down.
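The cancellation behind Figure 15.15 can be verified with a few lines. The sketch stores a point in the (σ1, σ2) plane as a complex number and represents a shifted defocus pattern as coef (ρ + s)², with ρ = x + iy the complex pupil coordinate and s the shift; this representation and the values are assumptions for illustration.

```python
def shifted_defocus(coef, rho, shift):
    """Polarization defocus pattern displaced by -shift in the pupil."""
    return coef * (rho + shift) ** 2

def tilt_from_pair(coef, rho, shift):
    """Sum of +coef defocus shifted one way and -coef defocus shifted the other way."""
    return shifted_defocus(coef, rho, shift) + shifted_defocus(-coef, rho, -shift)

coef, shift = 0.02, 0.25j          # hypothetical magnitude, shift along the y-axis
for rho in (0.0, 0.5, 1.0, 0.3 + 0.4j):
    print(rho, tilt_from_pair(coef, rho, shift), 4 * coef * rho * shift)
```

The quadratic and constant parts cancel and only the term 4 coef ρ s remains, which is linear in the pupil coordinate and vanishes at the pupil center.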
The other second-order polarization aberration is polarization piston, a constant diattenuation or retardance shown in Figure 15.13 (right) and Figure 15.16. For a radially symmetric system, the piston is zero on axis and increases quadratically across the field as H · H,
Figure 15.16 Polarization piston has a constant σ1 component.
One interesting pattern that occurs with second-order polarization aberration is the binodal polarization aberration shown in Figure 15.17 (left). Binodal indicates that there are two zeros in the pupil, shown by the points on the x-axis. The axis rotates by 180° around each node. Binodal polarization can be generated by the combination of polarization defocus (center) and polarization piston (right), generating zeros where the two patterns are orthogonal. This distribution of polarization is very similar to the distribution of astigmatism in binodal astigmatism.6,7
Figure 15.17 Binodal polarization aberration with two zeros or nodes, with the polarization rotating by 180° around each node. In this example, the two nodes are located on the x-axis. Binodal polarization aberration can be created by the combination of polarization defocus (center) with polarization piston (right).
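The node locations follow from setting the sum of defocus and piston to zero. In the sketch below a point in the (σ1, σ2) plane is a complex number, the defocus pattern is a ρ² with ρ = x + iy, and the piston is a constant p; the nodes are then the two square roots of −p/a. The representation and the values are assumptions chosen to reproduce nodes on the x-axis.

```python
import cmath

def binodal_nodes(defocus, piston):
    """Pupil positions where defocus*rho**2 + piston vanishes."""
    root = cmath.sqrt(-piston / defocus)
    return root, -root

def pattern(defocus, piston, rho):
    """Combined pattern as sigma1 + i*sigma2 at complex pupil coordinate rho."""
    return defocus * rho ** 2 + piston

# Hypothetical radial defocus with a vertically oriented piston (negative sigma1).
a, p = 0.02, -0.005
nodes = binodal_nodes(a, p)
print(nodes)
print([pattern(a, p, n) for n in nodes])
```

A negative σ1 piston is orthogonal to the radial defocus along the x-axis, so the two contributions cancel there, at ρ = ±√(|p|/a).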
As a light ray propagates through a series of surfaces with weak polarization aberration, the aberration contributions at each surface, either retardance or diattenuation, can be summed to calculate the overall aberration. For example, Figure 15.18 overlays the polarization contributions from three surfaces, all expressed in pupil coordinates. Here, the lines could represent either linear diattenuation or linear retardance contributions, which can be summed by the method of Section 15.2.2.
Figure 15.18 Polarization contributions from three surfaces, each with shifted polarization defocus, shown in black, purple, and orange, with small offsets for clarity.
Figure 15.19 shows another example of paraxial aberrations for an off-axis beam cascaded over three surfaces. The first column shows the net polarization, retardance, or diattenuation, for each surface, and the total aberration from the three surfaces (bottom, left). The centers of the individual surface patterns are shifted due to the off-axis beam. Since the beam is off axis, the patterns for each surface can be decomposed into a polarization defocus (second column), polarization tilt (third column), and a polarization piston term (right column). The defocus terms can be separately added to yield the total defocus (bottom row, second column). Similarly, the tilt terms can be separately added, and the piston terms can be separately added. Hence, the net polarization aberration (bottom, left) is (1) the sum of the nine terms in the upper right, or (2) the sum of the three surface contributions in the first column, or (3) the sum of the total defocus, tilt, and piston in the bottom row. The sum is performed with the Pauli representation of the polarization (Section 15.2.2).
Figure 15.19 The polarization contribution from three surfaces is shown in the left column. Each of these polarization aberration maps can be decomposed into the summation of a defocus term (second column), a tilt term (third column), and a piston term (right column). These defocus, tilt, and piston columns can also be summed separately, by adding columns, to equal the bottom row’s defocus, tilt, and piston terms, which sum to the overall polarization aberration map (lower left).
For the on-axis beam, the tilt and piston terms would be zero, so the net polarization aberration would be just the sum of the defocus terms (bottom row, second column). For a radially symmetric system, the tilt increases linearly with the field, the piston increases quadratically with the field, and the defocus is constant. Thus, a beam at twice the field angle of Figure 15.19 would have twice the tilt and four times the piston, but the same polarization defocus. Next, this paraxial polarization aberration method and these associated scaling rules will be tested with a high-étendue lens.
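The three equivalent sums described for Figure 15.19, and the scaling with field, can be checked numerically. The following is a minimal sketch, not the book's calculation: it assumes each surface contributes a weak, radially oriented quadratic pattern about a node shifted from the pupil center, and it packs the two angle-doubled (Pauli) linear components into one complex number, so that a surface with coefficient c and node z0 contributes c(z − z0)². The coefficients, node positions, and function names are hypothetical.

```python
import numpy as np

# Hypothetical quadratic coefficients and node positions (normalized pupil units)
# for three surfaces; these are illustrative values, not data from Figure 15.19.
coeffs = np.array([0.05, 0.08, 0.03])
nodes = np.array([0.4 + 0.0j, -0.7 + 0.0j, 1.2 + 0.0j])

# Sample the unit pupil; a pupil point is z = x + i*y.
x, y = np.meshgrid(np.linspace(-1, 1, 41), np.linspace(-1, 1, 41))
z = (x + 1j * y)[x**2 + y**2 <= 1.0]

def surface_map(c, z0, z):
    """Angle-doubled map of one surface: magnitude c*|z - z0|^2, and a complex
    argument of 2*arg(z - z0), i.e. a radially oriented line about the node z0."""
    return c * (z - z0) ** 2

# Sum of the three surface contributions (weak limit: contributions add).
total_by_surface = sum(surface_map(c, z0, z) for c, z0 in zip(coeffs, nodes))

# Sum of the total defocus, tilt, and piston terms.
defocus = coeffs.sum()                # multiplies z**2
tilt = (-2 * coeffs * nodes).sum()    # multiplies z
piston = (coeffs * nodes**2).sum()    # constant over the pupil
total_by_term = defocus * z**2 + tilt * z + piston

max_diff = np.max(np.abs(total_by_surface - total_by_term))

def terms_at_field(field_scale):
    """Assumes the node shifts scale linearly with field angle."""
    shifted = field_scale * nodes
    return (coeffs.sum(),
            (-2 * coeffs * shifted).sum(),
            (coeffs * shifted**2).sum())

d1, t1, p1 = terms_at_field(1.0)
d2, t2, p2 = terms_at_field(2.0)
print(max_diff, d2 / d1, t2 / t1, p2 / p1)
```

Under these assumptions the two totals agree to rounding error, and doubling the field leaves the defocus unchanged while doubling the tilt and quadrupling the piston.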
The paraxial polarization aberration method is demonstrated using the Polaris-M polarization analysis program with the seven-element lens shown in Figure 15.20. An exact polarization aberration calculation will be performed and compared to a paraxial calculation of the retardance and diattenuation defocus, tilt, and piston terms. The two calculations agree to within roughly 2% to 11% at the 10° field, and to within roughly 17% to 28% at the 30° field, a very large field angle. This example shows how the individual polarization aberration terms can be summed and also how large the paraxial region can be for polarization aberrations.
Figure 15.20 A seven-element lens (L1–L7) with several meridional ray paths drawn for the 10° field coming from infinity. The second lens L2 (orange) is cemented to the third lens L3 (green), and the sixth lens L6 (blue) is cemented to the seventh lens L7 (magenta). The stop is located between the third and fourth lenses.
In the following calculation, each lens surface has a multilayer antireflection coating. The polarizations of the coated lenses are evaluated at 500 nm for objects at infinity. Figures 15.21 and 15.22 provide the coating performance, namely the transmission amplitude and phase for s- and p-polarizations, as a function of incident angle for each interface. Notice that the phase plots show 2π discontinuities where the phase reaches ±π; these are artifacts of the arctan function used to compute the phase.
Figure 15.21 The magnitudes of the s (solid) and p (dotted) amplitude coefficients are plotted for each interface for angles of incidence from 0° to 90°. L1, L2, and so on, refer to lens one, two, and so on. F refers to the front side, toward the object, and B refers to the back side, toward the image. Beams exiting from glass into air show total internal reflection above their critical angles.
Figure 15.22 The s (solid) and p (dotted) phase coefficients are plotted in radians for each antireflection coated interface from 0° to 90°. Thin film program calculations often show 2π phase steps, such as on L3_B at 8°; these steps do not affect the Fourier transforms used for point spread function calculations, but can complicate optical path length calculations and interferogram calculation and interpretation. F for front surface, and B for back surface.
Figure 15.23 shows the diattenuation calculated at a set of angles superposed on quadratic fits to the diattenuation at each interface; the quadratic diattenuation coefficients D2,q, in diattenuation per radian squared, are provided above each plot. Note how well the quadratics fit over this range of angles. The D2,q are used in a paraxial calculation to determine the magnitudes of the diattenuation defocus, diattenuation tilt, and diattenuation piston. Similarly, Figure 15.24 shows the retardance quadratic fits at each interface with the quadratic retardance coefficients δ2,q. These fits provide the paraxial coating polarization for each interface except L2/3 and L6/7, which are cemented interfaces with no coatings.
Figure 15.23 The diattenuation calculated for each coating is plotted as a set of points over the range of angles of incidence at each surface. The quadratic fits to the diattenuation are drawn as solid lines. Above each plot is an equation for the quadratic fit with θ in radians and the value of the quadratic diattenuation coefficients D2,q for each interface in numerical form. F for front surface, and B for back surface.
Figure 15.24 The retardance in radians calculated by the thin film algorithm (dots) for each coating is plotted as a set of points over the range of angles of incidence at each surface. The quadratic fits to the retardance are drawn as solid lines. Above each plot is an equation for the quadratic fit with θ in radians and the quadratic retardance coefficient δ2,q in numerical form. F for front surface, and B for back surface.
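The quadratic fitting step can be illustrated with a short sketch. Because the multilayer coating prescriptions are not given here, the example substitutes an uncoated air-to-glass interface (n = 1.5, an assumed value) described by the Fresnel equations as a stand-in; the function name and the fitting range are also assumptions. It fits the single coefficient D2 in D(θ) ≈ D2 θ², with θ in radians.

```python
import numpy as np

def fresnel_diattenuation(theta_i, n1=1.0, n2=1.5):
    """Transmission diattenuation (Tp - Ts)/(Tp + Ts) of an uncoated interface.
    Stand-in for a coated surface, whose response needs a thin film calculation."""
    theta_t = np.arcsin(n1 * np.sin(theta_i) / n2)          # Snell's law
    ci, ct = np.cos(theta_i), np.cos(theta_t)
    ts = 2 * n1 * ci / (n1 * ci + n2 * ct)                  # s amplitude coefficient
    tp = 2 * n1 * ci / (n2 * ci + n1 * ct)                  # p amplitude coefficient
    # The flux factor n2*ct/(n1*ci) is common to s and p and cancels in the ratio.
    return (tp**2 - ts**2) / (tp**2 + ts**2)

theta = np.linspace(0.0, 0.3, 31)        # angles of incidence in radians
D = fresnel_diattenuation(theta)

# Least-squares fit of the one-parameter model D = D2 * theta**2.
D2 = np.sum(D * theta**2) / np.sum(theta**4)
max_residual = np.max(np.abs(D - D2 * theta**2))

print(f"D2 = {D2:.4f} per rad^2, max residual = {max_residual:.1e}")
```

For this stand-in interface the small-angle limit is D2 = (1 − 1/n)²/2, about 0.056 per radian squared for n = 1.5, so the fitted value should lie slightly above that because of higher-order terms.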
Angle of incidence maps for the 10° field at each surface are shown in Figure 15.25. The ray at normal incidence in each map is located where the angle of incidence becomes zero. For some surfaces, such as L2/3, the normal incidence point is in the top of the pupil, while for others, including L4_F (F for front) and L4_B (B for back), the normal incidence point is in the lower part. For some surfaces, such as L5_F and L6_F, the ray that would be at normal incidence is outside the aperture. The key at the lower right of each plot shows the length and value of the maximum angle of incidence in that plot. At each surface, the angle of incidence of the chief ray θC is the value in the center of the beam.
Figure 15.25 The angle of incidence maps for the lens of Figure 15.20 for the 10° field all have the form of the patterns of Figure 15.6, with the angles of incidence radially oriented about a ray at normal incidence and the magnitude increasing linearly from this node. F for front surface, and B for back surface.
To evaluate the paraxial polarization aberration method, a comparison will be made between the result from an exact polarization ray trace and the paraxial polarization calculation. Figure 15.26 displays pupil maps of the surface-by-surface retardance contribution calculated by polarization ray tracing. These surface-by-surface retardance maps can also be calculated from the angle of incidence map. The retardance nodes and angle of incidence nodes on each surface are located in the same place. The values of all the retardances are small, less than 0.2 radians; thus, the retardances can be summed in Pauli coefficients by the method of Section 15.2.2.
Figure 15.26 The retardance maps for the beams of Figure 15.25 for the 10° field all have the form of Figure 15.13 (left). Thus, the retardance map at each surface has a defocus, tilt, and piston component, like Figure 15.12. At each surface, the retardance of the chief ray, the value in the center of the beam, is given at the lower right. F indicates a lens front surface and B indicates a back surface.
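The claim that retardances below about 0.2 radians can be summed in Pauli coefficients can be tested with a small numerical sketch. The three retardances and fast-axis angles below are hypothetical, and the sign of the retarder phase is an assumed convention. The exact result multiplies the Jones matrices of the weak linear retarders; the weak-limit result simply adds their angle-doubled components.

```python
import numpy as np

S0 = np.eye(2, dtype=complex)
S1 = np.array([[1, 0], [0, -1]], dtype=complex)   # 0 deg / 90 deg component
S2 = np.array([[0, 1], [1, 0]], dtype=complex)    # 45 deg / 135 deg component

def linear_retarder(delta, psi):
    """Jones matrix of a linear retarder with retardance delta (rad) and axis
    orientation psi (rad). The sign of the phase is an assumed convention."""
    return (np.cos(delta / 2) * S0
            - 1j * np.sin(delta / 2) * (np.cos(2 * psi) * S1 + np.sin(2 * psi) * S2))

# Hypothetical weak retarders along one ray path.
deltas = np.array([0.10, 0.15, 0.05])        # radians
axes = np.radians([0.0, 30.0, 70.0])

# Exact: multiply the Jones matrices in order of propagation.
U = S0.copy()
for d, p in zip(deltas, axes):
    U = linear_retarder(d, p) @ U

# The product has unit determinant, so its trace equals 2*cos(retardance/2).
# This total includes the small circular part produced by non-commuting retarders.
exact = 2 * np.arccos(np.clip(U.trace().real / 2, -1.0, 1.0))

# Weak limit: add the angle-doubled (Pauli) components.
dH = np.sum(deltas * np.cos(2 * axes))
d45 = np.sum(deltas * np.sin(2 * axes))
approx = np.hypot(dH, d45)

print(f"exact = {exact:.5f} rad, Pauli sum = {approx:.5f} rad")
```

For retardances of this size the two values should differ only at the level of second-order products of the individual retardances.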
Figure 15.27 (left) shows an exact calculation of the retardance for the marginal ray at each surface from a polarization ray trace. This is the magnitude of retardance defocus for each surface. These values can be summed to yield the cumulative retardance defocus for the entire lens system. Figure 15.27 (right) shows the summation, representing the cumulative marginal ray retardance from object space through each surface. Similarly, Figure 15.28 shows an exact calculation of the retardance for the chief ray at each surface. This is the magnitude of retardance piston for each surface. Figure 15.29 (left) shows the total retardance (entrance pupil through exit pupil) for the 10° field ray paths. This retardance map can be decomposed into a retardance defocus term, calculated from the marginal ray, a retardance piston term, calculated from the chief ray, and a retardance tilt term, calculated from the product of the chief and marginal rays at each surface. Now, at this field and wavelength with these coatings, the paraxial approximation and exact polarization ray trace can be compared, and it is found that the three paraxial second-order retardance aberrations comprise more than 95% of the exact polarization aberration.
Figure 15.27 (Left) The surface-by-surface contributions to the retardance for the real marginal ray for the 10° field. (Right) The cumulative marginal ray retardance from object space through each surface increases monotonically to 0.4 radians, the exact value of the retardance defocus.
Figure 15.28 The surface-by-surface retardance contributions for the chief ray for the 10° field (left) and the accumulated values (right). The final accumulated value of 0.06 radians is the paraxial value for the retardance piston.
Figure 15.29 For the 10° field, the cumulative retardance map and its decomposition into a sum of retardance defocus, retardance tilt, and retardance piston are calculated from the paraxial ray trace. Comparing with the exact calculation values, the paraxial calculation is about 5% off for the chief ray and 2% off for the marginal ray.
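The way the defocus, piston, and tilt terms follow from the marginal ray, the chief ray, and their product can be sketched as follows. The per-surface coefficients and ray angles are hypothetical, and the sketch assumes the paraxial angle of incidence along the meridional line of the pupil is the marginal-ray angle scaled by the pupil coordinate ρ plus the chief-ray angle.

```python
import numpy as np

# Hypothetical per-surface data (not the values of the seven-element lens).
delta2 = np.array([0.20, 0.35, 0.10, 0.25])        # retardance per rad^2
i_marginal = np.array([0.30, -0.45, 0.20, 0.38])   # marginal-ray angle of incidence (rad)
i_chief = np.array([0.10, 0.05, -0.12, 0.08])      # chief-ray angle of incidence (rad)

# Paraxial coefficients summed over surfaces.
defocus = np.sum(delta2 * i_marginal**2)           # from the marginal ray
piston = np.sum(delta2 * i_chief**2)               # from the chief ray
tilt = np.sum(2 * delta2 * i_marginal * i_chief)   # from the chief-marginal product

def retardance_on_meridian(rho):
    """Surface-by-surface sum of delta2 * (angle of incidence)^2 for pupil points
    on the meridional line, where all contributions share one orientation."""
    aoi = np.outer(rho, i_marginal) + i_chief      # shape (points, surfaces)
    return np.sum(delta2 * aoi**2, axis=1)

rho = np.linspace(-1.0, 1.0, 5)
direct = retardance_on_meridian(rho)
from_terms = defocus * rho**2 + tilt * rho + piston
print(np.max(np.abs(direct - from_terms)))
```

The direct surface-by-surface sum and the three-term expression agree because each surface contributes the square of a sum of two angles; the cross term is what makes tilt depend on the product of the chief and marginal ray angles.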
Next, the exact and paraxial calculations of diattenuation are compared. Figure 15.30 maps the exact polarization ray tracing calculation of the diattenuation for each ray. The surface-by-surface diattenuation map can also be calculated from the angle of incidence map; the diattenuation nodes and angle of incidence nodes on each surface are located in the same place. The values of all the diattenuations are small, less than 0.1; hence, the diattenuations can be summed in Pauli coefficients by the method of Section 15.2.2.
Figure 15.30 The exact diattenuation maps for the beam from the 10° field from a polarization ray trace. Since the contributions are quadratic in the angle of incidence, only a few surfaces with large marginal ray angles make substantial contributions.
The diattenuation is calculated by polarization ray tracing for comparison to the paraxial polarization aberration calculation. Figure 15.31 shows the exact calculations for the diattenuation for the marginal ray at each surface, the magnitude of the diattenuation defocus for each surface. These values can be summed to the cumulative diattenuation defocus for the entire lens system. Figure 15.32 shows the exact calculation for the diattenuation for the chief ray for each surface, the magnitude of diattenuation piston for each surface. Figure 15.33 (left) shows the end-to-end diattenuation map for the 10° field. This map can be decomposed into a diattenuation defocus term, calculated from the marginal ray, a diattenuation piston term, calculated from the chief ray, and a diattenuation tilt term, calculated from the product of the chief and marginal rays at each surface. At this field and wavelength, with these coatings, the three second-order diattenuation aberrations comprise 87% of the polarization aberration.
Figure 15.31 (Left) The marginal ray diattenuation is plotted for each surface for the 10° field. (Right) The cumulative marginal ray diattenuation from object space through each surface increases monotonically. The final value of 0.13 is the lens’s diattenuation defocus.
Figure 15.32 The surface-by-surface diattenuation contributions for the chief ray (left) and the accumulated values (right) for the 10° field. The final accumulated value of 0.02 is the value for the lens’s diattenuation piston.
Figure 15.33 The cumulative diattenuation map for the 10° field (left) and its decomposition into a sum of diattenuation defocus, diattenuation tilt, and diattenuation piston (right three figures). Comparing with the exact calculation values shown on the left, the paraxial calculation is 8% off for the chief ray diattenuation, 2% off for the marginal ray at the top of the pupil, and 11% off for the marginal ray at the bottom of the pupil.
For the on-axis field (Figure 15.34), the chief ray goes down the axis and the angle of incidence is zero at each interface. Thus, the diattenuation piston, retardance piston, diattenuation tilt, and retardance tilt are uniformly zero for each interface and for the entire lens, as shown in retardance and diattenuation maps in Figure 15.34 (middle and bottom rows). In the paraxial approximation, which is to second order, the defocus aberrations do not change over the field. The tilt terms increase linearly, while the piston terms increase quadratically.
Figure 15.34 (Top) Ray paths for the on-axis field. (Middle row) The diattenuation map for the on-axis beam and the paraxial approximation. (Bottom row) The retardance map for the on-axis beam and the paraxial approximation. In both cases, only the defocus term is non-zero. In radially symmetric systems, piston and tilt are always zero on-axis.
Figure 15.35 shows the ray paths through the lens for a 30° field, along with the cumulative diattenuation map and the cumulative retardance map for the lens. Both maps are now dominated by piston (diattenuation piston and retardance piston), since piston is the aberration that increases quadratically with field.
Figure 15.35 The example lens of Figure 15.20 with the beam traced from the 30° field. The paraxial calculation differs from the exact calculation by 28% for the retardance chief ray and 17% for the diattenuation chief ray.
The paraxial polarization aberration method has been applied to a lens with large étendue to demonstrate how the second-order polarization aberrations are calculated using quadratic fits for diattenuation and retardance. The particular numerical values for this example are not so important, but the method is powerful. The method and the resulting functional forms can form a template for other polarization analyses.
The paraxial aberration expansion of Section 15.3 is excellent for approximating the Jones pupil of radially symmetric lens and mirror systems and can also accurately describe many off-axis systems, such as fold mirrors and off-axis telescopes. In other cases, the variations of diattenuation and retardance are more complex than second-order terms can accurately describe, and a set of higher-order basis functions is useful. First, vector Zernike polynomials are applied for describing higher-order variations of the electric field. Then, orientors are introduced for the expansion of angle of incidence, linear diattenuation, and linear retardance into a set of basis functions.
Consider an arbitrary polarization aberrated monochromatic wavefront described on a reference sphere. The electric field distribution can be characterized by a Jones vector function E in normalized pupil coordinates, either polar coordinates, (ρ, ϕ), or Cartesian coordinates, (x, y),
Since E is a complex two-element vector function, four scalar functions, Ax, Ay, Wx, and Wy, are required for a full description. W is the wavefront’s phase in waves.
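As a concrete illustration of the four scalar functions, the sketch below assembles a sampled Jones vector function from amplitudes Ax, Ay and phases Wx, Wy given in waves, and then recovers the x amplitude and phase. The function name, the sample profiles, and in particular the sign of the exponent are assumptions; the phase sign convention should be matched to the one used elsewhere in this book.

```python
import numpy as np

def jones_pupil_field(Ax, Ay, Wx, Wy):
    """Assemble E = (Ax*exp(-i*2*pi*Wx), Ay*exp(-i*2*pi*Wy)) with W in waves.
    The negative sign in the exponent is an assumed phase convention."""
    Ex = Ax * np.exp(-2j * np.pi * Wx)
    Ey = Ay * np.exp(-2j * np.pi * Wy)
    return np.stack([Ex, Ey], axis=-1)

# Hypothetical radial profiles sampled along one pupil radius.
rho = np.linspace(0.0, 1.0, 5)
Ax = np.ones_like(rho)
Ay = 0.2 * rho**2
Wx = 0.10 * rho**2        # waves
Wy = 0.15 * rho**2        # waves

E = jones_pupil_field(Ax, Ay, Wx, Wy)

# Recover the x amplitude and phase; valid while |W| < 0.5 wave (no phase wrapping).
Ax_rec = np.abs(E[..., 0])
Wx_rec = -np.angle(E[..., 0]) / (2 * np.pi)
print(E.shape, Ax_rec, Wx_rec)
```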
Now, consider a simpler wavefront. Many wavefronts are linearly or nearly linearly polarized. For the description of such linear vector fields, the Zernike polynomials (Section 10.7.6) have been generalized into vector Zernike polynomials, .8 To construct the vector Zernike polynomials, the Zernike’s cos(m ϕ) are replaced by the vector,
, and terms sin(m ϕ) are replaced by
, where the pair of orthogonally polarized basis vectors,
and
, are
Index m is the order of the angular part of the Zernike polynomial. The vector Zernike polynomials through order n = 4 are listed in Table 15.1 and are graphed in Figure 15.36. For each vector Zernike polynomial in Figure 15.36, there is another term that is rotated by 90°; the first three are shown in the bottom row of Figure 15.37. The vector Zernike polynomials form an orthonormal basis set,
Figure 15.36 The vector Zernike polynomials for order n = 0 through 4 for terms e = 0 and 2.
Figure 15.37 The first three vector Zernike polynomials for (top row) order e = 0 and 2, and (bottom row) order e = 1 and 3. For each vector Zernike polynomial in Figure 15.36, there is another term that is rotated by 90°; the first three are shown here.
where Γ is a normalization factor that here is chosen to equal one.
All the vector Zernike polynomials shown in Figures 15.36 and 15.37 have the same phase; therefore, the arrows are all at the end of the field line. Phase changes move the arrow around the polarization ellipse of a Jones vector, or in time, the arrow moves up and down along a linear Jones vector, as seen in Figure 15.38. Hence, the vector Zernike polynomials describe the linear polarization’s amplitude and orientation but not the phase.
Figure 15.38 Phase change moves arrowheads up and down a linear polarization ellipse.
Consider an arbitrary linearly polarized vector function E1 of the form
where the phases are equal to zero. Using the vector Zernike polynomials, Equation 15.36 can be expressed as the summation
The expansion coefficients are found from the inner product
Equation 15.37 describes the variation of the amplitude and orientation of a linear polarization state with constant phase. To describe the phase, two additional functions are needed, either
(A) an x-phase function and a y-phase function, or
(B) an average phase function (wavefront aberration function) and a phase difference function (elliptical polarization).
(A) is straightforward, so (B) will be explored. First, the average of the phases
describes a (polarization-independent) wavefront aberration contribution that can be expanded in its own set of Zernike polynomials to describe defocus, tilt, spherical aberration, coma, astigmatism, and so on. However, the light may not be uniformly linearly polarized. In this case, the vector Zernike polynomials describe the major axis of the polarization ellipse. The ellipticity of the light, ε (ρ, ϕ), varying from +1 for right circularly to −1 for left circularly polarized light, can then be expanded in an additional set of Zernike polynomials; for a linearly polarized field, these last Zernike coefficients would be all zero. Thus, in general, four sets of Zernike polynomials, counting the vector Zernike terms as two, are needed to fully describe a single polarized wavefront (i.e., from a single field point at a single wavelength) in an aberration expansion.
In Section 15.5.1, the electric field in a circular pupil was expanded into vector Zernike polynomials because of the light’s vector nature. Vectors repeat after a 360° rotation. Angle of incidence, linear retardance, and linear diattenuation are not vectors; their properties repeat after a 180° rotation. To account for this geometrical property of repetition after a 180° rotation, orientors are introduced, which provide basis functions for the expansion of angle of incidence, linear retardance, and linear diattenuation.8 These orientor basis functions are derived from the vector Zernike polynomials.
Consider the behavior of linear retardance. Two retarders with retardances δ1 and δ2 with parallel fast axes have a net retardance δ1 + δ2. The retardances still add to δ1 + δ2 after one retarder is rotated by 180°, whereas two vectors would subtract when one vector is rotated by 180°. Consider the following geometrical construction that transforms angles. If the orientation angles (fast axis angles) of linear retarders are doubled, all θ are transformed to 2θ. Now, the transformed “orientation” 2θ repeats in 360° and a vector representation for “angle-doubled linear retardance” can be used. This “double the angle” property was seen earlier in Figure 15.2, where it followed from the Pauli matrix expressions for the combination of weak linear retardance or weak linear diattenuation. Linear diattenuation has the same behavior as retardance upon rotation. If one of two diattenuators is rotated by 180°, the two diattenuators combine in the same way. Equal magnitude diattenuators with axes 90° apart have a net diattenuation that cancels.
Orientors are defined using this “angle-doubled” approach and applied to representing linear retardance, linear diattenuation, and angle of incidence functions. Consider a pupil map in polar coordinates of linear retardance magnitude δ(ρ,ϕ) and fast axis orientation ψ(ρ,ϕ), with ψ defined in the range 0 ≤ ψ ≤ 180°, such as the arbitrary example retardance map in Figure 15.39. This pupil map is transformed into a distribution of vectors with the same magnitude δ(ρ,ϕ) but with orientation 2 × ψ (ρ, ϕ). With this “double the angle” transformation, vector Zernike polynomials can now be used as a basis for linear diattenuation and linear retardance aberrations and angle of incidence maps providing a basis set for higher-order polarization aberrations. The orientors are thus vector Zernike polynomials but oriented at half the angle. Hence, an orientor is a line object with a magnitude and orientation ψ associated with a vector at twice the orientation ϕ = 2ψ. Note that for linear retardance, linear diattenuation, and angle of incidence functions, there is no phase that needs to be described, as there was for the electric field in Section 15.5.1.
Figure 15.39 Example of the conversion of a map of orientors (left) into a map of vectors (right). The orientor distribution could be an angle of incidence map, diattenuation map, or a retardance map. To create the map of vectors, an arrowhead is added to the right (positive x) end of the orientor, and the angle from the x-axis is doubled.
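The “double the angle” transformation can be sketched in a few lines. The function names are hypothetical. The example checks two properties described above: an orientor rotated by 180° maps to the same vector, and two equal-magnitude orientors 90° apart map to opposite vectors that cancel.

```python
import numpy as np

def orientor_to_vector(magnitude, psi):
    """Map an orientor (magnitude, orientation psi in radians) to the vector
    with the same magnitude at twice the angle."""
    return np.array([magnitude * np.cos(2 * psi), magnitude * np.sin(2 * psi)])

def vector_to_orientor(v):
    """Inverse map: halve the vector angle and wrap psi into [0, pi)."""
    magnitude = np.hypot(v[0], v[1])
    psi = (0.5 * np.arctan2(v[1], v[0])) % np.pi
    return magnitude, psi

a = orientor_to_vector(0.1, np.radians(10.0))
b = orientor_to_vector(0.1, np.radians(190.0))     # same orientor rotated by 180 deg
crossed = orientor_to_vector(0.1, 0.0) + orientor_to_vector(0.1, np.pi / 2)

mag, psi = vector_to_orientor(a)
print(a, b, crossed, mag, np.degrees(psi))
```

Summing weak retardance or diattenuation contributions is then ordinary vector addition in the angle-doubled space, followed by the inverse map.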
Next, the lowest-order orientor terms corresponding to the lowest-order vector Zernike polynomials will be considered. At zero order, the orientor basis set has two constant terms, and
, rotated by 45° from each other, as shown in Figure 15.40 (top row), and two corresponding vector basis functions,
and
, rotated by 90° from each other and at twice the angle from the x-axis, as shown in Figure 15.40 (bottom row). Changing the sign of any orientor rotates its map by 90°.
Figure 15.40(Top row) The two zero-order orientor pupil maps, and
, are constant distributions of lines corresponding to piston. The corresponding vector Zernike maps, oriented at twice the angle of the orientors, are shown in dark red in the row below. For negative coefficient values, the orientors are rotated by 90° and the vectors are rotated by 180°.
At first order, there are four orientors shown in Figure 15.41. Two of the first-order orientor pupil maps, and
, rotate clockwise, moving clockwise around the pupil, and are associated with the angular distributions for positive values of the Zernike radial polynomial’s coefficients,
Figure 15.41(Top) The four orientor maps at first order: ,
,
, and
. (Bottom) The corresponding Zernike vector polynomials,
,
,
, and
, are shown in magenta directly below. The
,
,
, and
terms (left two columns) correspond to the diattenuation and retardance tilt terms. The orientors are shown for positive coefficients. For negative coefficient values, the orientors are rotated by 90°.
The other two first-order orientor pupil maps rotate counterclockwise, moving clockwise around the pupil, and
, associated with the angular distributions
Six orientor basis functions are present at second order. Figure 15.42 shows the two terms with m = 0, and
. Note that the orientor’s orientation changes sign as the corresponding Zernike vector polynomial (lower row) passes through zero.
and
describe quadratic magnitude variations with constant orientation.
and
pass through zero at a radius of ρ = 1/√2 ≈ 0.707
to orthogonalize with the constant terms,
and
. Figure 15.43 shows the four terms with m = 2,
,
,
, and
.
is our linear diattenuation defocus and linear retardance defocus aberration form and is ubiquitous in describing polarization aberrations of radially symmetric systems. The other three terms are present to a much lesser extent in typical pupil function expansions.
Figure 15.42The second-order orientor pupil maps for m = 0, , and
.
Figure 15.43The second-order orientor pupil maps for m = 2, ,
,
, and
. The term
on the left corresponds to diattenuation defocus and retardance defocus.
Figures 15.44 and 15.45 continue the expansion, showing the terms at third order, while Figures 15.46, 15.47, and 15.48 show the orientors and the corresponding Zernike vector polynomials at fourth order.
Figure 15.44 The third-order orientor maps (top) and vector Zernike polynomials (bottom) for m = 1.
Figure 15.45 The third-order orientor maps (top) and vector Zernike polynomials (bottom) for m = 3.
Figure 15.46 The fourth-order orientor maps (top) and vector Zernike polynomials (bottom) for m = 0.
Figure 15.47 The fourth-order orientor maps (top) and vector Zernike polynomials (bottom) for m = 2.
Figure 15.48 The fourth-order orientor maps (top) and vector Zernike polynomials (bottom) for m = 4.
Orientors were developed to describe linear diattenuation and linear retardance distributions in pupils with a series representation based on Zernike polynomials. Such a series representation can be constructed in different ways. One algorithm for describing an arbitrary Jones pupil’s linear parts with orientors is as follows. Given a Jones pupil J (ρ, ϕ), the function is divided into a Hermitian (diattenuating) matrix function H (ρ, ϕ) and a unitary (retarding) matrix function U (ρ, ϕ) using the polar decomposition (Section 5.9.3),
Both H (ρ, ϕ) and U (ρ, ϕ) have four degrees of freedom. The retardance is decomposed following Chapter 14, using the matrix logarithm to divide U (ρ, ϕ) into Pauli components representing the average phase, ϕ, the 0° and 45° linear retardance components, δH and δ45, and the circular retardance, δL,
This determines the linear retardance part of the Jones pupil, which has corresponding linear retarder Jones matrices
Next, the linear retarder LR (δlinear, θ), which corresponds to δHσ1 + δ45σ2, has its orientation angle doubled (θ → 2θ) and is treated as a vector function, the orientor O (δlinear, 2θ), to be expanded in vector Zernike polynomials. The magnitude of the linear retardance is δlinear = √(δH² + δ45²). The phase, the “scalar” wavefront aberration, will be expanded in “scalar” or ordinary Zernike polynomials, as is usual. The linear retardance normally comprises the majority of the retardance, but any significant circular retardance can also be expanded in its own set of scalar Zernike polynomials. Characterizing and understanding the linear retardance generally has a greater priority than the circular retardance.
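The steps of this algorithm for a single pupil point can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's implementation: the polar decomposition is taken in the order J = U H and computed from the singular value decomposition, the matrix logarithm is computed from an eigendecomposition, the unitary part is written as U = exp(−iG) with an assumed sign convention, and the function names and test matrices are hypothetical.

```python
import numpy as np

S1 = np.array([[1, 0], [0, -1]], dtype=complex)     # 0 deg linear component
S2 = np.array([[0, 1], [1, 0]], dtype=complex)      # 45 deg linear component
S3 = np.array([[0, -1j], [1j, 0]], dtype=complex)   # circular component

def polar_decompose(J):
    """Return (U, H) with J = U @ H, U unitary, H Hermitian positive semidefinite."""
    W, s, Vh = np.linalg.svd(J)
    return W @ Vh, Vh.conj().T @ np.diag(s) @ Vh

def retardance_components(U):
    """Write U = exp(-i*G) and expand G = phase*I + (dH*S1 + d45*S2 + dC*S3)/2.
    The sign convention is an assumption."""
    lam, V = np.linalg.eig(U)
    G = V @ np.diag(-np.angle(lam)) @ np.linalg.inv(V)
    phase = np.trace(G).real / 2
    dH = np.trace(S1 @ G).real
    d45 = np.trace(S2 @ G).real
    dC = np.trace(S3 @ G).real
    return phase, dH, d45, dC

# Hypothetical Jones matrix: a 0.2 rad linear retarder at 25 deg times a weak diattenuator.
delta, psi = 0.2, np.radians(25.0)
U0 = (np.cos(delta / 2) * np.eye(2)
      - 1j * np.sin(delta / 2) * (np.cos(2 * psi) * S1 + np.sin(2 * psi) * S2))
H0 = np.array([[1.00, 0.02], [0.02, 0.95]], dtype=complex)
J = U0 @ H0

U, H = polar_decompose(J)
phase, dH, d45, dC = retardance_components(U)

delta_linear = np.hypot(dH, d45)           # magnitude of the linear retardance
psi_rec = 0.5 * np.arctan2(d45, dH)        # orientation, half the vector angle
vector = np.array([dH, d45])               # angle-doubled vector for the expansion
print(delta_linear, np.degrees(psi_rec), dC)
```

Repeating this at every pupil sample gives the vector field (δH, δ45) that is then expanded in vector Zernike polynomials.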
Now, the linear retardance is given by a sum of vector Zernike polynomials,
in a form similar to our Pauli matrix representation. The coefficients ,
describe the amount of each vector Zernike polynomial term, and the corresponding orientor term in the retardance pupil map, analogous to Zernike polynomial coefficients for a wavefront.
It was shown in Section 15.2.1 that for small values of linear retardance, δlinear ≪ 1, retardances in the Pauli form add. Therefore, the linear retardance orientors expressed as vector Zernike polynomials also add in the weak retardance limit. If several parts of a weakly retarding system are expressed as vector Zernike polynomials,
the resulting linear retardance distribution can be expressed approximately as the sum of the coefficients for each corresponding vector Zernike polynomial term. An example of the application of this method is found in Section VIII of the work of Ruoff and Totzeck.8
To express diattenuation maps in an expansion of orientors, the same procedure is applied as for retardance above, except that the linear diattenuation is obtained from the matrix logarithm of the Hermitian part of the Jones matrix following the procedure of Section 14.4.5.
Polarimetry is useful for measuring the polarization aberrations of optical systems and for characterizing optical and polarization components. Here, a few examples of polarization aberration measurements are provided. Optical system polarization aberrations can be measured by placing the system in the sample compartment of a Mueller matrix imaging polarimeter such as the Axometrics AxoStep. Usually, the exit pupil is imaged, measuring a Mueller matrix as a function of pupil coordinates. Maps of linear diattenuation, linear retardance, and other metrics are readily generated. Such a Mueller matrix pupil image is readily converted to a Jones pupil, but the absolute phase, the wavefront aberration, is not measured by a non-interferometric Mueller matrix image set.
Figure 15.49 (top) shows the polarimeter configuration to measure polarization aberrations for a pair of 0.55 numerical aperture microscope objectives. Collimated light from the polarization generator enters the pupil of the first objective, focuses at the joint focal point of the two objectives, is recollimated by the second objective, and measured in the polarization state analyzer. Figure 15.49 (bottom) shows a measured Mueller matrix pupil image for such a pair of objectives, specifically sold as low polarization objectives for polarization microscopes. Figure 15.50 plots the diattenuation and retardance maps calculated from the Mueller matrix image. This microscope objective pair has up to 5.4° of spatially varying retardance and 0.1 of spatially varying diattenuation. When placed between crossed linear polarizers, this pair of objectives will leak about 0.15% of the incident flux, averaged over the pupil.
Figure 15.49 (Top) A schematic of a Mueller matrix imaging polarimeter measuring the polarization aberration of a pair of microscope objectives. PS Generator is the polarization state generator and PS Analyzer is the polarization state analyzer. In this configuration, the camera is focused on the exit pupil of the microscope objectives. (Bottom) The Mueller matrix image for a microscope objective pair is close to the identity matrix with weak linear diattenuation evident in the top row and retardance in the off-diagonal elements of the lower right 3 × 3 elements.
Figure 15.50 Linear diattenuation and linear retardance pupil maps for a pair of microscope objectives are nearly radially symmetric as expected. Deviations from radial symmetry are likely due to slight tilts and decenters.
Figure 15.51 shows the diattenuation and retardance aberrations of another microscope pair where the polarization aberrations were further reduced by thin film coating design. Figure 15.52 (left) schematically shows a microscope objective between crossed polarizers, while Figure 15.52 (right) shows the corresponding flux distribution in the exit pupil.
Figure 15.51 Diattenuation (left) and retardance (right) aberrations for another microscope objective pair with a reduced polarization coating design. Leakage through crossed polarizers for the low polarization microscope objective pair.
Figure 15.52 (Left) A microscope objective between crossed polarizers, and (right) the flux distribution in the exit pupil.
When significant polarization aberrations are present, an optical system illuminated with a uniform polarization state will have polarization variations within the point spread function. To characterize these variations and the dependence of the point spread function on the incident polarization state, a Mueller matrix imaging polarimeter focuses on the image of a point object and measures the Mueller Point Spread Matrix, MPSM, as a Mueller matrix image (Section 16.5). A measured MPSM with large polarization aberration is shown in Figure 15.53. A vortex retarder (Section 5.6.3) was placed in the pupil of an imaging system with a large F/# image on a camera focal plane, and a Mueller matrix image was acquired. This vortex retarder is a half wave linear retarder whose fast axis varies as a function of pupil angle.9 The pupil image on the left side shows the retardance orientation varying by 360° around the pupil. The right side contains the MPSM. When the Stokes vector of the incident light is multiplied by the MPSM, the resulting Stokes vector function describes the flux (point spread function) and polarization state variations within the image as a Stokes vector image. Figure 15.54 shows the point spread function for a fixed incident polarization state and several analyzers, demonstrating the polarization variations within the point spread function.
Figure 15.53 (Left) The orientation of the fast axis of the half wave vortex retarder rotates by 360° around the pupil. (Right) The Mueller point spread matrix (MPSM) describes the polarization dependence of the point spread function as a Mueller matrix image.
Figure 15.54 The measured point spread function of the vortex retarder completely changes with the analyzed polarization state: (left) no analyzer, (middle) horizontal linear analyzer, (right) vertical linear analyzer. Horizontal linearly polarized light is input.
Figure 15.55 shows the depolarization aberration measured from a lens whose coatings were damaged by heat and began flaking off.10 The resulting Mueller matrix pupil image shows a few tenths of a percent depolarization in the damaged area. The undamaged area has a depolarization of only a few hundredths of a percent, more typical of coated lenses.
Figure 15.55 Depolarization index of a lens with coating damage on the right center causing about 0.005 depolarization.
Paraxial optics provides straightforward and meaningful definitions of focal length and the other “first order” properties of optical systems. Paraxial optics forms the underlying coordinate system for aberration theory. The Seidel wavefront aberrations are defined as deviations from paraxial performance. Similarly, paraxial optics forms an excellent basis for deriving the low-order forms of polarization aberration. In fact, the fraction of the étendue of an optical system that is well described by second-order polarization aberrations is generally far larger than the fraction of the étendue described by the fourth-order wavefront aberrations, that is, the region where the contributions of spherical aberration, coma, astigmatism, and field curvature are much less than one wave of optical path length.
The optical designer and optical engineer should be pleased to know that 95% of the polarization aberration of most optical systems can often be described with just three terms, polarization defocus, polarization tilt, and polarization piston.
Paraxial optics is the optics of ray paths in the vicinity of the optical axis through radially symmetric optical systems. As ray paths approach the optical axis, linear approximations to Snell’s law and the location of ray intercepts become increasingly accurate. In paraxial optics, all the rays from an object point intersect at the same image point, forming a “perfect image.” Thus, paraxial optics forms an excellent coordinate system for describing aberrations; aberrations are the deviations from paraxial behavior. Here, a brief summary of paraxial ray tracing is provided and augmented with the calculation of angles of incidence and propagation vectors for paraxial skew rays, a key result.
Paraxial optics is used to define focal length, nodal points, principal planes, pupil locations, magnification, and the other “first order” properties of optical systems. Our interest is primarily in the paraxial polarization aberrations; hence, these calculations are not treated here; the reader is referred to Field Guide to Geometrical Optics by John Greivenkamp,11 whose notation has been adopted here.
The paraxial region of an optical system is a small region close to the optical axis where the ray paths are accurately calculated by applying the linearized form of Snell’s law. At a refracting interface, Snell’s law relates the angle of incidence θ1 in an incident medium of refractive index n1 to the angle of refraction θ2 in a medium with refractive index n2,
$$n_1 \sin\theta_1 = n_2 \sin\theta_2.$$
For rays propagating very near the optical axis, the angle of incidence is very small; thus, replacing sin θ with its linear approximation θ yields the paraxial form of Snell’s law,
$$n_1 \theta_1 = n_2 \theta_2.$$
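As a rough numerical check of how quickly the linear approximation degrades, the following sketch compares exact and paraxial refraction angles for an air-to-glass interface. The function names and the sample index of 1.5 are illustrative assumptions, not values from the text.

```python
import math

def refract_exact(theta1, n1, n2):
    """Refraction angle from the exact Snell's law, n1 sin(theta1) = n2 sin(theta2)."""
    return math.asin(n1 * math.sin(theta1) / n2)

def refract_paraxial(theta1, n1, n2):
    """Refraction angle from the paraxial form, n1 theta1 = n2 theta2."""
    return n1 * theta1 / n2

n1, n2 = 1.0, 1.5  # assumed example indices (air into a generic glass)
errors = {}
for theta1 in (0.01, 0.1, 0.3):  # angles of incidence in radians
    exact = refract_exact(theta1, n1, n2)
    parax = refract_paraxial(theta1, n1, n2)
    errors[theta1] = parax - exact
    print(f"theta1 = {theta1:4.2f} rad  exact = {exact:.6f}  "
          f"paraxial = {parax:.6f}  difference = {errors[theta1]:.2e}")
```

The difference grows roughly as the cube of the angle of incidence, which is why the paraxial form is reserved for rays close to the axis.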
In calculating the ray intercepts of paraxial rays, a linear approximation to the intercept is all that is needed. Since the sag of spheres, parabolas, and other conics varies quadratically about the vertex, the paraxial ray trace can ignore the sag; the paraxial ray intercept is the intersection of the ray with the vertex plane.
Because of the linearity of paraxial optics, all paraxial rays can be formed from linear combinations of two linearly independent paraxial rays. By convention, these two rays are chosen as the marginal ray and the chief ray. The marginal ray is chosen in the y–z plane from the center of the object through the top of the entrance pupil and edge of the aperture stop. The chief ray is a ray from a point on the edge of the field of view, through the center of the entrance pupil and center of the aperture stop. We distinguish between the paraxial marginal ray and the (real) marginal ray, and the paraxial chief ray and (real) chief ray. Figure 15.56 shows an example optical system with a marginal ray in the y–z plane, a chief ray in the x–z plane, and a skew ray formed from the addition of the yz marginal ray to the xz chief ray.
Figure 15.56 A grid of collimated rays (gray) at a 30° field angle in the xz-plane propagating through an example lens, shown in perspective view, in the yz-plane, and in the xz-plane. The chief ray of the grid of rays is shown in black propagating in the xz-plane. A marginal ray in the yz-plane is shown in red. A skew ray (purple) within the grid of collimated rays can be calculated by a combination of the marginal and chief rays (1 marginal ray height + 1 chief ray height).
Using the results of a paraxial ray trace for the chief and marginal rays, together with quadratic Taylor series expansion coefficients for the Fresnel coefficients and amplitude reflection and transmission coefficients, yields simple approximate and easy-to-calculate polarization aberration expansions.
The radially symmetric optical system at its reference wavelength is defined by a set of thicknesses, tq, refractive indices, nq, and curvatures, Cq. Curvature is the reciprocal of the radius of curvature R of a surface and has units of mm−1. The index q = 0, 1, 2, …, Q − 1, Q labels the surfaces. q = 0 indicates the object surface, also indicated by subscript O. q = Q = I indicates the image surface. Subscript E indicates the entrance pupil; often, the entrance pupil is the first surface in the optical prescription. Similarly, subscript X indicates the exit pupil. Typically, q = Q − 1 will be the exit pupil. The optical power, Φq, the ability to focus light, of surface q is
$$\Phi_q = (n_q - n_{q-1})\,C_q,$$
where nq−1 and nq are the refractive indices before and after surface q. The power is measured in mm−1.
The paraxial ray trace calculation can be organized into a standard table, the y–n–u ray tracing form (Table 15.2), or the information can be organized into an equivalent data structure in a computer. The optical system parameters, Cq, tq, and nq, are entered in the first three rows. The lens surface powers, Φq, and reduced thicknesses on rows four and five are calculated.
The general paraxial ray trace procedure traces two rays, the paraxial marginal ray (from the center of the object to the edge of the entrance pupil) and full-field paraxial chief ray (from the top of the object to the center of the entrance pupil).12 All paraxial rays can be calculated from linear combinations of these two rays. The paraxial ray trace will be performed in the y–z plane. The optical system is assumed fixed. Rays are started and their paths through the system are calculated. Consider an example meridional ray in the y–z plane. The ray is started at the object plane with y0 and u0. The intercept of the ray is found with the first surface’s vertex plane, y1. The angle of incidence, θ1, is calculated from u0 and the normal. Then, the ray is reflected or refracted determining u1. The process is repeated for the second surface, the third surface, and so on, until the image is reached.
The paraxial marginal ray heights are indicated by y0, y1, y2, … yQ−1, yQ. The paraxial marginal ray angles, the angles with respect to the optical axis, are indicated by u0, u1, u2, … uQ−1, uQ. The slope of a ray, u, is positive if a counterclockwise rotation brings the axis to the ray. The paraxial angles are defined as the tangents of the actual angles; thus, the paraxial ray trace is linear with respect to ray angles and ray heights. The marginal ray angles of incidence are indicated by θ0, θ1, θ2, … θQ−1, θQ. Chief ray quantities are indicated by the same letters and subscripts except with a bar over the letter, such as $\bar{y}_q$, $\bar{u}_q$, and $\bar{\theta}_q$.
The marginal ray is started on axis with y0 = 0, and u0 chosen to strike the edge of the entrance pupil. The chief ray is started at the edge of the object with the angle $\bar{u}_0$ chosen to pass through the center of the entrance pupil. The starting values for the marginal ray, y0 = 0 and u0, are entered on the left side of the next two rows, selected so that the marginal ray intersects the edge of the entrance pupil. The starting values for the chief ray, $\bar{y}_0$ and $\bar{u}_0$, are entered in the next two rows, selected so that the chief ray intersects the center of the entrance pupil. With the marginal and chief ray starting values defined, the rays are transferred from each surface q to the next surface q + 1 with the paraxial ray transfer equations,
$$y_{q+1} = y_q + u_q\,t_q, \qquad \bar{y}_{q+1} = \bar{y}_q + \bar{u}_q\,t_q.$$
Then, the paraxial refraction equation is applied with the revised values yq and $\bar{y}_q$ to calculate the marginal ray angle
$$u_q = \frac{n_{q-1}\,u_{q-1} - y_q\,\Phi_q}{n_q}$$
and chief ray angle
$$\bar{u}_q = \frac{n_{q-1}\,\bar{u}_{q-1} - \bar{y}_q\,\Phi_q}{n_q}.$$
Transfer and refraction are repeatedly applied to complete these rows, systematically filling in the blank entries of the ray tracing form. The angles of incidence, θq, defined as the angle between a ray and the normal to its ray intercept, are then calculated for the marginal and chief ray intercepts,
$$\theta_q = u_{q-1} + y_q\,C_q, \qquad \bar{\theta}_q = \bar{u}_{q-1} + \bar{y}_q\,C_q.$$
θ is positive if a counterclockwise rotation brings the surface normal to the ray as shown in Figure 15.57. The paraxial ray trace algorithm presented in many references does not include θ; it is not needed for finding ray coordinates and cardinal points. For polarization analysis, calculating θ is a necessary step since θ is needed for evaluating Fresnel and amplitude coefficients and calculating the polarization of interfaces.
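The transfer, refraction, and angle-of-incidence steps can be organized in code rather than in a tabular form. The sketch below is a minimal illustration under stated assumptions: the angle after surface q is u_q, the thickness and index following surface q are t_q and n_q, the surface power is taken as (n_q − n_{q−1})C_q, and the angle of incidence is taken as θ_q = u_{q−1} + y_q C_q, which matches the counterclockwise sign convention described above. The function name `paraxial_trace` and the two-surface prescription are hypothetical.

```python
def paraxial_trace(curvatures, thicknesses, indices, y0, u0):
    """Trace one paraxial ray through a radially symmetric system.

    curvatures[q]  : curvature C_q of surface q (entry 0, the object, is ignored)
    thicknesses[q] : axial distance t_q from surface q to surface q + 1
    indices[q]     : refractive index n_q following surface q
    Returns lists y, u, theta indexed by surface; theta[0] is None because
    no refraction occurs at the object surface.
    """
    y, u, theta = [y0], [u0], [None]
    for q in range(1, len(curvatures)):
        # Transfer: propagate from surface q - 1 to the vertex plane of surface q.
        y_q = y[q - 1] + u[q - 1] * thicknesses[q - 1]
        # Angle of incidence between the incident ray and the surface normal.
        theta_q = u[q - 1] + y_q * curvatures[q]
        # Refraction: n_q u_q = n_{q-1} u_{q-1} - y_q * power.
        power = (indices[q] - indices[q - 1]) * curvatures[q]
        u_q = (indices[q - 1] * u[q - 1] - y_q * power) / indices[q]
        y.append(y_q)
        u.append(u_q)
        theta.append(theta_q)
    return y, u, theta

# Hypothetical biconvex singlet in air with a collimated marginal ray at height 1 mm.
C = [0.0, 0.1, -0.1]   # mm^-1
t = [10.0, 3.0, 0.0]   # mm (the last entry is not used)
n = [1.0, 1.5, 1.0]
y, u, theta = paraxial_trace(C, t, n, y0=1.0, u0=0.0)
for q in (1, 2):
    print(f"surface {q}: y = {y[q]:.4f}  u = {u[q]:.4f}  theta = {theta[q]:.4f}")
```

The same function traces the chief ray when it is called with the chief ray starting height and angle; the resulting θ values are the inputs needed for the Fresnel coefficient expansions.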
Figure 15.57 A ray (blue) with incident angle θ, paraxial angle u, and ray height y intercepting a surface with radius of curvature R.
For a reflecting surface, the power is set to −Φq. A plane mirror has a power of −1. The surface normal η at a point r on a surface is a vector in the plane of incidence that is perpendicular to the tangent plane at r. By convention, η points from the ray intercept away from the incident medium into the refracted medium. The slope of the surface normal, the scalar η, is likewise positive if a counterclockwise rotation brings the optical axis to η. The paraxial angle of η is calculated for a parabolic fit to the surface. For a spherical surface with curvature C, the sag in the y–z plane, and its second-order approximation, is
$$\mathrm{sag}(y) = \frac{1 - \sqrt{1 - C^2 y^2}}{C} \approx \frac{C\,y^2}{2}.$$
As seen in Figure 15.58, to second order, the sphere is the same as the osculating parabola. The slope of the paraxial sphere’s normal is
$$\eta_{\text{paraxial sphere}} = -C\,y.$$
Figure 15.58 Sphere (blue) and its osculating parabola (red) at the vertex match shapes to second order.
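The agreement between the sphere and its osculating parabola can be checked numerically. The sketch below assumes the standard sag expressions for a sphere and a parabola of vertex curvature C; the function names and the sample curvature of 0.1 mm−1 are illustrative choices.

```python
import math

def sag_sphere(y, C):
    """Exact sag of a sphere of curvature C at height y.

    Written in the form C y^2 / (1 + sqrt(1 - C^2 y^2)), which is algebraically
    equal to (1 - sqrt(1 - C^2 y^2)) / C but stays accurate for small y and
    remains defined for a flat surface, C = 0.
    """
    return C * y**2 / (1.0 + math.sqrt(1.0 - (C * y) ** 2))

def sag_parabola(y, C):
    """Sag of the osculating parabola, the second-order approximation."""
    return 0.5 * C * y**2

C = 0.1  # mm^-1, assumed example curvature (R = 10 mm)
for y in (0.1, 0.5, 1.0):
    diff = sag_sphere(y, C) - sag_parabola(y, C)
    # The leading neglected term of the expansion is C^3 y^4 / 8.
    print(f"y = {y:3.1f} mm  sphere - parabola = {diff:.3e}  "
          f"C^3 y^4 / 8 = {C**3 * y**4 / 8:.3e}")
```

The difference is fourth order in the ray height, so it does not affect the linear paraxial ray intercepts or the paraxial slope of the surface normal.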
The definitions of u and ηparaxial sphere are consistent with the conventional definition of slope as
$$\text{slope} = \frac{\Delta y}{\Delta z}.$$
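The paraxial transfer and refraction equations can be simplified by incorporating the refractive index into angles and thicknesses. Reduced angles ω are paraxial angles times the refractive index,
$$\omega_q = n_q\,u_q, \qquad \bar{\omega}_q = n_q\,\bar{u}_q.$$
Reduced thicknesses τ are physical thicknesses divided by the refractive index,
$$\tau_q = \frac{t_q}{n_q}.$$
With reduced quantities, the marginal and chief paraxial transfer and refraction equations take the following simplified form at the qth ray intercept:
$$y_{q+1} = y_q + \omega_q\,\tau_q, \qquad \omega_q = \omega_{q-1} - y_q\,\Phi_q,$$
$$\bar{y}_{q+1} = \bar{y}_q + \bar{\omega}_q\,\tau_q, \qquad \bar{\omega}_q = \bar{\omega}_{q-1} - \bar{y}_q\,\Phi_q.$$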
The skew aberration algorithm of Section 18.5 requires the propagation vector kq for paraxial skew rays. Because of the linearity of paraxial optics, any meridional paraxial ray can be expressed as the linear combination of two linearly independent meridional rays, such as the marginal and chief rays. An arbitrary paraxial skew ray can be expressed as the linear combination of four linearly independent paraxial rays; the chief and marginal rays in the y–z plane and the chief and marginal rays rotated into the x–z plane are used as the basis ray set.
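Thus, for the ray from the object point (G, H) through stop location (x, y), the ray intercept at the qth surface is
$$(x_q,\, y_q)_{\text{skew}} = \left(G\,\bar{y}_q + x\,y_q,\;\; H\,\bar{y}_q + y\,y_q\right),$$
where G and H are normalized object coordinates, G along the x-axis and H along the y-axis, and yq and $\bar{y}_q$ on the right-hand side are the paraxial marginal and chief ray heights. Normalization is performed such that around the edge of a circular field of view, as defined by the chief ray height on the object plane, G² + H² = 1. Likewise, the ray slope after the qth interface is
$$(u_{x,q},\, u_{y,q}) = \left(G\,\bar{u}_q + x\,u_q,\;\; H\,\bar{u}_q + y\,u_q\right).$$
For skew rays, it is necessary to distinguish between quantities measured relative to the x-axis, subscript x, and y-axis, subscript y.
The skew aberration calculation uses the propagation vectors kq specified after the qth ray intercept. The propagation vector after the qth ray intercept is along $(u_{x,q},\, u_{y,q},\, 1)$. The normalized propagation vectors, kq, are
$$\mathbf{k}_q = \frac{(u_{x,q},\; u_{y,q},\; 1)}{\sqrt{u_{x,q}^2 + u_{y,q}^2 + 1}}.$$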
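The construction of a paraxial skew ray from the marginal and chief rays can be sketched as below. This is an illustration under assumptions rather than the algorithm of Section 18.5: the skew ray is taken as G and H times the chief ray plus the normalized pupil coordinates times the marginal ray, the paraxial slopes are treated as tangents, and the names `skew_ray`, `px`, and `py` are hypothetical.

```python
import numpy as np

def skew_ray(G, H, px, py, y_m, u_m, y_c, u_c):
    """Paraxial skew ray at one surface from the marginal and chief rays.

    G, H     : normalized object coordinates along x and y
    px, py   : normalized pupil coordinates along x and y (assumed names)
    y_m, u_m : marginal ray height at the surface and slope after it
    y_c, u_c : chief ray height at the surface and slope after it
    Returns the (x, y) intercept and the unit propagation vector k.
    """
    # Linear combination of the basis rays, applied separately in x and y.
    x_q = G * y_c + px * y_m
    y_q = H * y_c + py * y_m
    u_x = G * u_c + px * u_m
    u_y = H * u_c + py * u_m
    # Paraxial slopes are tangents, so the ray direction is along (u_x, u_y, 1).
    k = np.array([u_x, u_y, 1.0])
    return (x_q, y_q), k / np.linalg.norm(k)

# Example in the spirit of Figure 15.56: chief ray in the x-z plane (G = 1, H = 0)
# plus a marginal ray in the y-z plane (px = 0, py = 1). The ray values are made up.
intercept, k = skew_ray(G=1.0, H=0.0, px=0.0, py=1.0,
                        y_m=2.0, u_m=-0.1, y_c=0.5, u_c=0.2)
print("intercept:", intercept)
print("k:", k)
```

Because the paraxial trace is linear, only the marginal and chief rays need to be traced; every skew ray follows from these sums at each surface.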
Find the polarization aberration function for a paraxial silicon interface, n = 4, with a marginal ray angle of incidence of 0.2 radians for the on-axis beam.
Express the Jones matrices for the following polarization elements in the weak polarization element form:
J1: An ideal linear diattenuator with D1 = 0.02, a transmission axis at 0°, and an average transmission of 1.
J2: An ideal linear retarder with δ2 = 0.006 and a fast axis at 45°, and an average transmission of 1.
J3: An ideal linear retarder with δ3 = 0.01 and a fast axis at θ3 = tan−1(3/4)/2, and an average transmission of 3/5.
Express J1J2 and J2J1 in weak polarization element form.
Which polarization effects are order independent?
Which polarization effects are order dependent?
Express J1J2J3 in weak polarization element form, keeping only first order terms.
Show how for J1J2J3 the first-order terms are order independent and are sums over the Pauli coefficients.
A beam of light goes through a series of five weak diattenuators, each with diattenuation D = 0.01. The diattenuation transmission axes are oriented at angles of 0°, 22.5°, 45°, 67.5°, and 90°.
Express each Jones matrix in the weak polarization element form. Find the σ1 and σ2 components for each element.
Add the components. What is the net diattenuation?
What is the orientation of the diattenuation transmission axis?
Row 1 contains eight polarization aberration maps, labeled A through H. Brown lines represent pupil maps of the magnitude and orientation of linear diattenuation. Pink lines represent linear retardance. These aberration maps are used to generate the ellipse maps in rows 2 and 3. Row 2 has been rearranged, as has row 3.
Match each ellipse map in row two, labeled 1 through 8, with the corresponding polarization aberration map and specify the incident polarization state.
Match each ellipse map in row three, labeled α through θ, with the corresponding polarization aberration map and specify the incident polarization state.
An optical system has retardance tilt for a particular object point. The diattenuation is zero everywhere in the pupil. The retardance magnitude increases linearly from the center of the pupil. The orientation ψ in degrees of the retardance fast axis depends on the angle θ in the pupil as ψ(θ) = 45 − θ/2. This retardance aberration pattern can be graphically represented as follows:
The position of the center of each line represents a location in the pupil. The length of the line represents the retardance magnitude. The orientation represents the orientation of the retardance fast axis. Assume that the maximum retardance at the edge of the pupil is much less than 1 radian. (Note: No calculations are requested or necessary in this problem.)
Where in the pupil is there no change of polarization state for all polarization states?
Where in the pupil is there no change of polarization state for 45° linearly polarized light?
If the optical system is placed in a linear polariscope with an initial horizontal linear polarizer and final vertical polarizer, what will the distribution of flux be across the exit pupil? Does the flux vary linearly or quadratically?
If the optical system is placed in a circular polariscope with an initial right circular polarizer and final left circular polarizer, describe the distribution of flux in the exit pupil.
For what incident polarization states is the polarization change, integrated across the pupil, the largest, and why? This is the total leakage in the polariscope.
A four-element lens has eight surfaces. All elements are fabricated from the same glass. A coated lens assembly is fabricated and an uncoated lens assembly is fabricated. These are to be compared. The surfaces are described by the intensity transmittances shown below. The left graph is for light entering the lenses (external refraction, odd-numbered interfaces). The right graph is for light exiting the lenses (internal refraction).
Estimate the intensity transmittance and the diattenuation down the axis for the coated and uncoated lenses.
If the marginal ray had angles of incidence of 0.8, 0.4, 0.6, 0.1, 0.2, 0.3, 0.7, 0.5 radians at each successive lens surface, estimate the diattenuation for the marginal ray path through the uncoated lens.
Repeat part (b) for the antireflection-coated lens.
The following questions (a) to (d) assume a glass where diattenuation for small angles equals D(AoI) = d2 AoI². Write a polarization aberration function in terms of σ0, σ1, σ2, and σ3 for each question. An example would be the polarization aberration function for polarization defocus with a shift along the y-axis, Jones matrix , where dH is diattenuation magnitude at edge of pupil, δH is retardance magnitude at edge of pupil, yo is pupil shift along the y-axis, and (x, y) are normalized pupil coordinates. (Feel free to make simplifying approximations.)
A collimated beam of light is deviated by 30° at a prism. The angles in air are equal at each surface as are the internal angles.
A collimated beam of diameter 2 mm enters a spherical surface of radius 10 mm, where the vertex of the spherical surface is shifted toward +x by 1 mm.
Light enters a flat glass surface of diameter 2 mm at normal incidence and then exits through a cubic phase plate z = 0.01 (x + y)³.
A converging spherical beam enters and focuses at the center of a spherical marble and then refracts out the back side.
For the following, assume a metal reflector whose diattenuation for small angles equals D(AoI) = d2 AoI² and whose retardance for small angles equals δ(AoI) = δ2 AoI².
A beam of numerical aperture 0.2 with a central ray that reflects at normal incidence from the reflector.
A beam of numerical aperture 0.2 with a central ray with k = (0, 0, 1) that reflects at normal incidence from a reflector with normal .
A beam of numerical aperture 0.2 with a central ray with k = (0, 0, 1) that reflects at normal incidence from a reflector with normal η1 = (sin (0.2), 0, cos (0.2)), and then from a second mirror with normal .
Reflection of a 2 mm diameter collimated beam from a toroid z = 0.06 x y.
Two identical pieces of sheet retarder have a retardance of δ = 0.1 ≈ 6°. As a result of the fabrication process, stretching, the fast axis of retardance varies steadily from one side of the retarder to the other, as in the figure, by ±3°. One of the retarders is then rotated by 90°, so the fast axes are crossed at the center.
Calculate the retardance of the composite retarder using the weak polarization approximation.
What higher-order effects would be present, and about how small would they be?
What would be the leakage between a horizontal and vertical polarizer?
For H = (0, h0), given the chief and marginal angles of incidence, ic and im, which point in the pupil is at normal incidence, θ = 0?
What is the condition on ic and im for θ(H = 1, ρ = 1) to equal zero?
In Figure 15.8, assume the diattenuation is 0.3 for the marginal ray. If the four wavefronts are transmitted through a vertical analyzer, what will the flux P(ρ,ϕ) be?
1. J. Sasián, Introduction to Aberrations in Optical Imaging Systems, Figure 15.8, Cambridge University Press (2013).
2. H. Kubota and S. Inoué, Diffraction images in the polarizing microscope, J. Opt. Soc. Am. 49 (1959): 191–192.
3. R. A. Chipman, Polarization analysis of optical systems, Opt. Eng. 28(2) (1989): 90–99.
4. R. A. Chipman, Polarization aberrations, PhD dissertation, Optical Sciences Center, University of Arizona, Tucson, AZ (1987).
5. J. P. McGuire and R. A. Chipman, Polarization aberrations. 1. Rotationally symmetric optical systems, Appl. Opt. 33 (1994): 5080–5100.
6. R. V. Shack and K. Thompson, Influence of alignment errors of a telescope system on its aberration field, in Proc. SPIE 251, Optical Alignment I, 146 (1980).
7. K. Thompson, Description of the third-order optical aberrations of near-circular pupil optical systems without symmetry, J. Opt. Soc. Am. A 22 (2005): 1389–1401.
8. J. Ruoff and M. Totzeck, Orientation Zernike polynomials: A useful way to describe the polarization effects of optical imaging systems, J. Micro/Nanolithogr. MEMS MOEMS 8(3) (2009): 031404.
9. S. C. McEldowney, et al., Vortex retarders produced from photo-aligned liquid crystal polymers, Opt. Exp. 16(10) (2008): 7295–7308.
10. J. Wolfe and R. A. Chipman, Reducing symmetric polarization aberrations in a lens by annealing, Opt. Exp. 12(15) (2004): 3443–3451.
11. J. E. Greivenkamp, Field Guide to Geometrical Optics, SPIE Field Guides 1 (2004).
12. B. R. Irving, et al., Code V, Introductory User’s Guide, Optical Research Associates (2001), p. 82.