9.3 Lens optimization⧉
In the FUNDAMENTALS part the thin lens did something miraculous: every ray leaving a scene point and passing through the lens reconverged to a single image point. That was a convenient lie — the paraxial linearization, $\sin\theta\approx\theta$, one wavelength, zero thickness. The previous chapter, Compound lenses, charged the lie: keep the cubic term of Snell's law, let the index depend on color, give the glass real thickness, and a single element can no longer bring all the rays from all the points at all the colors to where they belong. The fix was to add elements, and adding elements added knobs — radii, glasses, spacings — that let the designer trade one defect against another.
This chapter is about turning those knobs. The crucial reframing, and the reason this chapter belongs in a computational-photography book at all, is that choosing the knobs is an optimization problem — the very same shape of problem as the inverse problems elsewhere in the book. There is no formula for the perfect lens past first order. So you don't solve; you search. You write down a single number that measures how badly a candidate lens performs, summed over everything it has to do at once, and you let an optimizer push the knobs until that number is as small as you can get it. Lens design is the book's optimization theme cast in glass.
9.3.1 The high-level idea: design as optimization⧉
Start by being honest about why a formula is impossible. The paraxial equations of the thin lens are linear: power adds, the lensmaker's equation is closed-form, and you can solve for the focal length you want in one line. But the moment you keep the next term of $\sin\theta \approx \theta - \theta^3/6$, the equations stop being linear, and there is no longer any closed-form expression that places every ray from every field point at every wavelength through every part of the aperture onto its ideal spot. The conditions overdetermine the lens: you have more things you want to be true than you have surfaces to make them true with. Past first order, design becomes a search through a space of candidate lenses rather than a derivation of the one right answer.
So we set the problem up the way the book sets up every optimization. We need three things: the unknowns we are free to vary, an objective number to minimize, and (next section) a forward model that turns unknowns into that number.
The parameters — the unknowns. A lens is described by a list of numbers, and almost every one of them is a free variable the designer can move:
- each surface's radius of curvature — the dominant knob, since curvature is what bends rays;
- the glass of each element — which fixes both its refractive index $n$ and its dispersion (how strongly $n$ varies with wavelength). This one is special: glass comes from a catalog of a few hundred named types, so it is a discrete choice, not a continuous dial — an awkward variable for a gradient-based optimizer, and one reason global search (below) earns its keep;
- the thicknesses of the elements and the air-gap spacings between them;
- aspheric coefficients, when a surface is allowed to depart from a sphere (an extra handful of knobs per aspheric surface — Aberrations explains why one asphere can do the work of several spherical elements);
- the stop — the size and axial position of the aperture, which sets the f-number and selects which rays even participate.
Collect all of these into a single parameter vector $\mathbf{p}$. A modest lens has a dozen or two entries; a complex zoom has many more. The design space is whatever $\mathbf{p}$ can range over.
The objective — the merit function. Now the single number. We want a candidate lens to be judged by one scalar so that "better" and "worse" are unambiguous and an optimizer has something to descend. That scalar is called the merit function $\Phi(\mathbf{p})$, and its job is to sum up residual aberration over every condition the lens must satisfy at once. Concretely, $\Phi$ is a weighted sum of squared ray errors,
where $(x_i, y_i)$ is where ray $i$ actually lands on the image plane, $(\bar x, \bar y)$ is where it should land (the chief-ray or centroid target), and $w_i$ is a weight letting the designer say which conditions matter more. The three nested sums are the whole point: the lens is scored across a grid of field angles (image center out to the corner), across a set of wavelengths (classically the C, d, and F spectral lines — red, yellow-green, blue — so color errors get counted), and across a sampling of aperture/pupil zones (rays through the center of the aperture and rays through its edge). A lens that is razor-sharp on-axis in green but smeared in the corners and fringed in blue scores badly, exactly as it should.
Two things about the form of $\Phi$ are worth pausing on, because they are precisely the book's recurring structure. First, it is a sum of squared residuals — a least-squares objective, the same shape as the regression and linear-inverse problems of Slopendium/04 Computational tools; here the "residual" is a ray missing its target instead of a prediction missing a data point. Second, the trailing constraint penalties play the role of regularizers. A lens that minimized blur alone might come out the wrong focal length, three meters long, or with an element so thin its edge is negative (physically impossible to grind). So $\Phi$ carries hard constraints and penalties — hit the target focal length $f$, keep total length under a budget, keep every edge thickness positive, respect vignetting limits — that pull the solution toward something manufacturable and useful, not merely sharp. Data-fit plus regularization, the same bargain as everywhere else in the book.
The big loss integral. It helps to picture what $\Phi$ is really approximating. Imagine integrating the blur the lens produces over all the conditions it must satisfy simultaneously — every field point in the image, every wavelength in the band, every position across the pupil — one giant integral of badness over the whole space of imaging conditions:
You cannot evaluate that integral in closed form — that is the whole trouble — so you sample it: a finite grid of fields, a few wavelengths, a finite set of pupil rays. The discrete merit function above is that integral, sampled, exactly the way Monte Carlo or quadrature stands in for an integral you cannot do by hand. Drive the sampled sum down and you have driven the true blur-over-everything down. This "big loss integral, sampled to a finite sum" is the same mental picture as a training loss averaged over a dataset, and it is the bridge that makes lens design feel native to the rest of the book.
9.3.2 The forward model: ray-tracing and spot diagrams⧉
The merit function asks "where does each ray actually land?" Answering that, for a given candidate $\mathbf{p}$, is the forward model, and here it is exact ray-tracing — no paraxial shortcut.
The procedure is mechanical and is exactly what the FUNDAMENTALS part set up, run to completion. Pick a field point (say, a star at $5°$ off-axis). Launch many rays from it, fanned out to pass through a grid of points across the aperture — center, edge, top, bottom. Follow each ray surface by surface through the lens, and at every air–glass interface apply the full Snell's law,
with the true angles $\theta_1, \theta_2$ between the ray and the surface normal — the genuine non-linear law, not the linearized $n_1\theta_1 = n_2\theta_2$ the thin lens used. Bend the ray, propagate it to the next surface, refract again, and so on until it reaches the image plane. Record where it lands. Do this for every ray in the fan, for every field point, and for every wavelength (the index $n$ feeds in the wavelength, so different colors bend differently and land differently). That landing-point bookkeeping is the entire evaluation of one candidate lens.
Spot diagrams. Now plot, for one field point and one color, the landing points of all those rays on the image plane. That scatter of dots is a spot diagram (Figure 9.3.1). It is the most direct picture of how good the lens is at that condition: a tight little cluster means every ray from that point converges to nearly the same place — sharp — while a smeared blob means the rays scatter — aberrated. The shape of the blob even diagnoses which aberration dominates: a round symmetric haze is spherical aberration, a one-sided comet is coma, a cross or ellipse is astigmatism (the Aberrations chapter reads these signatures in detail). And because you trace each color separately, different wavelengths produce different clusters, displaced or differently sized — that displacement is chromatic aberration, visible at a glance.
The deeper point is that a spot diagram is the lens's point-spread function, sampled by rays. The point-spread function (PSF) is what the image of an ideal point source looks like after the optics smear it; tracing a fan of rays and seeing where they pile up is precisely a geometric estimate of that smear. So the thing the merit function measures — the size of these spots — is the geometric stand-in for image sharpness, and the same PSF that the Aberrations chapter will later hand to deconvolution.
The optimizer's job, then, is to shrink these spots across all fields and colors at once. You cannot just make the on-axis green spot tiny; the merit function is watching the corner spots and the blue spots too, and squeezing one often bulges another. The design loop (Figure 9.3.2) is the obvious one: take the current parameters, ray-trace the whole grid of fields and colors and pupil zones, collect every landing error into the merit value $\Phi$, let the optimizer propose a step in $\mathbf{p}$ that should lower $\Phi$, and repeat until it stops improving. A companion view that designers live by is root-mean-square (RMS) spot radius plotted against field angle (Figure 9.3.3): a curve showing how the average spot size grows from center to corner, which immediately reveals where in the frame the design was forced to give up quality — almost always the corners, where the constraints fight hardest.
9.3.3 From hand calculation to software⧉
It is worth remembering that this optimization was being done long before there were computers to do it — by hand, and it was brutal.
The old-fashioned way. Before electronic computing, designers traced rays by hand, one ray at a time, using trigonometric and logarithm tables to evaluate Snell's law at each surface, later with mechanical calculators and then early electronic ones. A single ray through a multi-element lens was minutes of careful arithmetic; a useful evaluation needed many rays; and the "optimizer" was the designer's own trained intuition about which curvature to nudge next. A few rays a day was a realistic pace, and a finished design could take months. The classic lenses of the field — the achromatic doublet, the Cooke triplet, the double-Gauss — were all found this way, by people who had internalized how each aberration responded to each surface and steered the design largely in their heads. That this was possible at all is a testament to how much structure the problem has; that it was so slow is why the field was transformed the instant computers arrived.
Today. Modern optical-design software does the same three things — forward model, merit function, iterative optimizer — at a scale no human can match. The industry-standard packages are Zemax OpticStudio, Synopsys CODE V, and OSLO. They ray-trace thousands of rays per evaluation of the merit function, and they drive the parameter vector with damped least squares — a Levenberg–Marquardt-style update that you will recognize from Slopendium/04 Computational tools. The "damped" part matters: a pure least-squares step on this strongly non-linear problem can overshoot into nonsense (a surface flips sign, an element collapses), so the damping factor blends the aggressive least-squares step with a cautious gradient-descent step, taking bold steps when the local quadratic model is trustworthy and timid ones when it is not. The optimizer needs the derivatives of every ray error with respect to every parameter — the Jacobian — which the software gets either by finite differences (perturb each parameter, re-trace, see how the spots move) or by differentiating the trace; assembling that Jacobian is most of the cost of each iteration.
How people actually use it. In practice a designer almost never starts from a blank sheet. A design session opens by loading a starting point — a lens from the program's built-in library, a scaled version of an existing design, or, very often, a competitor's design copied from a patent (patents quote every radius, thickness, and glass, precisely so they can be reproduced and worked around). With a sensible form on the screen, the designer sets up the problem, not the answer: the field points (center to corner), the wavelengths and their weights, the aperture/f-number, and then declares which of the lens's numbers are variables the optimizer may move and which are pinned — often by a solve, a rule that recomputes one parameter automatically to hold a property fixed (a back-focal-distance solve to keep the image in the right place, a curvature solve to hold the focal length). Next comes the merit function: usually the software's default RMS-spot or wavefront-error merit, then hand-added operands for the things the default ignores — cap the distortion, hold an edge thickness positive, bound the total length, favor a glass that is cheap and available. Only then does the designer hit optimize: damped least squares drops the design into the nearest local minimum in seconds, the designer reads the result off spot diagrams, ray-fan plots, and the MTF curve, adjusts weights or frees another variable, and re-runs — a tight, interactive loop in which the optimizer does the arithmetic and the human supplies the judgment about which compromise to chase. When local descent stalls, they unleash the global search (Zemax's "Hammer," CODE V's global synthesis) and let it grind for hours or overnight, hopping between basins; and when a design is finally good, they tolerance it and export the prescription to mechanical CAD and stray-light analysis. The craft did not vanish when the computer arrived — it moved up a level, from tracing rays by hand to steering an optimizer that traces millions.
Because the merit-function landscape is riddled with local minima — many distinct lens forms that are each locally optimal — damped least squares alone only polishes whatever design region you start it in. So the software adds global search strategies (saddle-point methods, simulated-annealing-style hops, large libraries of starting designs) that deliberately jump between basins to find genuinely different and better lens forms, not just the nearest tidy minimum. And once a good design is found, the software runs tolerancing: it perturbs every parameter by realistic manufacturing errors — a radius off by a fraction, an element decentered, a spacing loose — and checks that the merit function does not blow up, because a design that is only optimal when built perfectly is worthless on a production line. Optimizing for robustness to manufacturing error is itself a regularization of the design, in spirit identical to preferring a flat, wide minimum over a sharp, brittle one.
The bottom line is reassuring and is the whole reason this chapter exists: it is the same machinery as the rest of the book. A forward model that predicts an outcome from parameters; a merit (loss) function that scores the outcome; an iterative optimizer with regularization and a search strategy that drives the parameters down the loss. Swap "ray-trace" for "convolution" and "lens" for "image" and you are back in the inverse-problems chapter. Lens design is computational photography's optimization theme, running in hardware decades before the software caught up.
9.3.4 Tradeoffs: the design is always a compromise⧉
If the merit function could be driven to zero, lens design would be a solved, dull problem. It cannot, and the reason is that the design lives under hard constraints while several desirable axes pull against each other. You do not get to be sharp and fast and small and cheap; you buy one by spending another. The honest picture of a lens is a point on a Pareto front — a surface where improving any one quality necessarily worsens some other (Figure 9.3.4).
The axes that fight:
- Speed (low f-number) vs. aberration control. A faster lens opens the aperture wider, which means rays through the edge of the pupil — exactly the rays where spherical aberration, coma, and chromatic error are worst (the cubic term grows with aperture). Every stop of extra speed multiplies the aberrations the merit function has to fight, demanding more elements or exotic glass to hold sharpness. This is why an $f/1.2$ prime is large, heavy, and expensive while an $f/4$ lens of the same focal length is small and cheap.
- Size and weight vs. correction. More correction generally means more elements and longer paths; a pocketable lens has a tight glass budget, so its merit function is minimized under a hard size constraint, and quality is the thing that gives.
- Cost vs. sharpness. Sharpness is bought with element count, special low-dispersion glasses, and aspheric surfaces — each of which costs money to specify, source, grind or mold, and assemble. A cheaper lens is a lens told to hit its quality target with fewer and plainer ingredients.
- Zoom range vs. everything. A zoom must keep all of the above acceptable not at one focal length but across its whole range simultaneously, with groups sliding between configurations. Every additional millimeter of range tightens every other constraint, which is why a wide-range superzoom is the hardest thing in the catalog to optimize and the most visibly compromised.
This is the cue for computation to enter. If the optics is allowed to be merely good enough and cheap rather than perfect, software can finish the job: the lens-profile correction and deconvolution of the Aberrations chapter undo residual distortion, vignetting, and even some blur after the fact. The effect is to shift the operating point along the Pareto front — accept an optical design that is a little softer or a little more distorted, because you know the image-processing pipeline will recover it, and spend the savings on size, speed, or cost. The lens and the algorithm are co-designers, splitting the work.
The end-to-end turn. The deepest version of this idea, and the one the Advanced part develops in full, is to stop optimizing the optics in isolation at all. Classical lens design minimizes a merit function defined on the optics — spot size, wavefront error — as if a sharp geometric image were the final product. But it is not: the sensor and a reconstruction algorithm come after the glass, and what we actually care about is the quality of the final picture. So make the whole pipeline — lens, sensor, and reconstruction network — differentiable end to end, and optimize the merit of the output image, with the optical parameters and the algorithm's parameters trained jointly. When you do this, the optimizer is free to discover lenses that look "wrong" by classical optical metrics — a deliberately blurred or coded PSF, a design no hand-calculator would ever have drawn — because that PSF, paired with its matched reconstruction, yields a better final image than a classically sharp lens would. This is the same optimization machinery once more, now with the loss moved all the way to the end of the pipeline and the lens treated as just another layer of learnable parameters. Coded apertures, wavefront coding, and differentiable optics in the Advanced part are this idea carried to its conclusion — the ultimate expression of optics and computation as two ends of one design space.
9.3.5 Tolerancing: from the nominal design to a manufacturable one⧉
Everything so far has quietly assumed a perfect world. The optimizer hands back a prescription — a table of radii, thicknesses, glasses, and air gaps — and the spot diagrams that go with it are computed as if every one of those numbers were realized exactly, every surface ground to its ideal sphere, every element seated dead-center on the axis. Call that the nominal design. The trouble is that no shop builds the nominal design. A radius comes off the grinder a fraction of a percent off target; the glass in this particular melt has an index a few parts in ten thousand from the catalog value; an element sits a few microns decentered in its barrel and a hair tilted. A design that is paper-sharp at the nominal can be junk on the bench. Tolerancing is the discipline that closes this gap: it budgets the inevitable errors, predicts how the lens will actually perform once those errors are baked in, and — this is the part that surprises people new to the field — it is very often what really sets the cost and the element count, far more than the nominal design ever did. Two lenses can have nearly identical nominal performance and differ tenfold in price because one is forgiving of manufacturing error and the other is not.
The error sources. It helps to enumerate what actually goes wrong, because each entry is a parameter the optimizer treated as exact and the factory cannot. Surface by surface, the radius of curvature comes out slightly wrong (the tool wears, the polish runs long); each thickness and air-gap spacing lands within some machining and assembly tolerance; the glass index and dispersion vary not just by manufacturing slop but melt to melt — the same catalog glass type, poured in different batches, differs measurably, which is why precision lenses are sometimes recomputed for the measured index of the actual melt. On top of those per-number errors are the shape and placement errors the prescription does not even list: wedge (the two faces of an element not quite parallel/coaxial, so it acts like a weak prism), surface irregularity (the polished surface departing from a perfect sphere — bumps and astigmatic figure the test interferogram catches), and, at assembly, the decenter, tilt, and roll of each element relative to the optical axis. The decenters and tilts are usually the most damaging because they break the rotational symmetry the design relied on — a perfectly centered system has no on-axis coma, but a decentered one does.
Sensitivity analysis versus Monte Carlo. With a list of error sources and a guess at how big each can be, there are two complementary questions to ask, and the software answers both. The first is sensitivity analysis: perturb one tolerance at a time, re-evaluate the merit function (or the MTF), and see how much the performance moves. This ranks the tolerances — it tells you which handful of the dozens of numbers are the cost drivers, the ones whose tight tolerance you are really paying for, and which can be sloppy with no consequence. The second question is statistical: given the whole distribution of errors at once, what does the population of as-built lenses actually look like? That is answered by Monte Carlo — draw many random lenses, each with every parameter perturbed by a random draw from its tolerance distribution, evaluate each, and build up the distribution of as-built performance. From that distribution you read the yield: the fraction of production lenses that will clear the acceptance threshold (Figure 9.3.5). Sensitivity analysis tells you where to spend your tolerance budget; Monte Carlo tells you what you will actually ship and at what scrap rate. The two together are how a design that looks fine on paper is judged buildable or not.
Compensators. The lever that makes tight tolerances affordable is the compensator: a small number of adjustments deliberately left free, to be set per unit at assembly, that absorb the accumulated error. The most common is simply refocus — every camera body refocuses anyway, so the nominal back-focal distance need not be held perfectly; the factory (or the autofocus) takes up the slack. Beyond that, a designer may leave one air-space adjustable, to be shimmed to its best value while the lens is on the test bench, or leave one element's centering free to be shimmed or actively aligned until the on-axis image is clean. Because a compensator can claw back the performance that the random errors threw away, every other tolerance in the system can be loosened — and loosening tolerances is exactly how a lens gets cheaper. Tolerancing is therefore not just a pass/fail check at the end; it feeds back into the design, sometimes pushing the designer to add an element or rearrange the form specifically to make the lens tolerant, even at a small cost in nominal performance.
The punchline. The nominal design — the thing the merit function in the previous sections so carefully minimized — turns out to be the easy part. Getting the spot diagrams tight in software is a solved, routine exercise; what decides whether a lens is cheap, feasible, and shippable is its as-built performance and its yield. And this loops straight back to the theme of the chapter. One way to survive loose tolerances is to budget them generously and then lean on computational correction to recover the residual — the lens-profile correction and PSF deconvolution of the Aberrations chapter clawing back the distortion, vignetting, and softness that loosened tolerances let through. A lens that the pipeline is going to sharpen and de-distort anyway need not hold its corners to interferometric perfection on the production line. This is the same bargain as the Pareto shift above, now stated in the language of the factory: loosen the optics, lean on computation, and the cost falls. It is one more reason the pairing of cheap optics with computation keeps winning.