9.4 A short bestiary of classic designs⧉
The previous chapter left us with a dilemma. A single thin lens hands the designer exactly one free number — its power, the reciprocal of its focal length, $1/f$ — and through the lensmaker's equation that power is already spent on setting the focal length. There is essentially nothing left over to also flatten the field, to bring the marginal rays to the same focus as the paraxial ones, to keep red and blue together, or to straighten the lines. So a lone element fails several aberrations at once, by construction. The way out is structural: add elements. Each new element contributes two more refracting surfaces (two more curvatures to choose), one more glass (a fresh index and a fresh dispersion), and, if it floats on a gap, an air space whose width is another parameter — new knobs that can be turned to make the aberrations of one element cancel those of another rather than accumulate. A compound lens is exactly that: a system of elements whose individual flaws are arranged to subtract.
There is a cost, which is why lenses are not built from a hundred elements. Each one adds weight, bulk, and money, and — the concern of the Glare suppression chapter — each air–glass surface reflects a few percent of the light, so more surfaces mean more stray reflections and lower contrast. The art of lens design is therefore to reach the required quality with as few elements as possible. The designs that follow are, in effect, the historical record of discovering the minimum stack that buys each level of correction — a small bestiary of arrangements that recur, with variations, in nearly every lens ever sold. We name them and show the family resemblances; the quantitative argument for why each works was set up in the previous chapter, and the achromat in particular is derived there in full (the achromat condition). (Figure 9.4.1)
The achromatic doublet. The smallest compound lens worth the name is two elements: a positive lens of low-dispersion crown glass bonded to a negative lens of high-dispersion flint glass, usually cemented face-to-face so the pair acts as one unit. The negative flint reverses the color spread of the positive crown, so that two wavelengths — typically a red and a blue — are brought back to a common focus while the pair keeps net positive power. It is the minimum system that corrects chromatic aberration (the colored halos and fringes of the previous chapter), and the extra surfaces let the designer simultaneously tame the spherical aberration a single element of that power would carry. The doublet is everywhere — the front objective of cheap telescopes and binoculars, and a building block buried inside larger lenses. Its working principle, pairing glasses of opposite dispersion so the chromatic terms cancel, is the achromat condition from Aberrations and optical challenges.
The Cooke triplet. Three elements — positive, negative, positive — with the diaphragm by the central negative element. This is the celebrated minimum: the smallest design with enough free parameters (six surface curvatures, two air gaps, three glass choices, and the bending of each element) to correct all of the primary (Seidel) aberrations — spherical, coma, astigmatism, field curvature, distortion — and chromatic aberration at once. With one fewer element you cannot reach all of them together; with the triplet you can, which is why it is the ancestor of an enormous family of inexpensive lenses. Many simple primes are recognizable triplets or lightly modified ones — splitting an element into a cemented pair, for instance, to buy a little more correction. Designed by H. Dennis Taylor in the 1890s, it is the historical hinge between the single-element era and modern multi-element design.
The double-Gauss (Planar). When you want a fast normal lens — $f$/2 or wider — the triplet runs out of freedom, and the dominant solution is the double-Gauss, six or seven elements in two groups with rough mirror symmetry about the aperture stop. The symmetry is the whole point: a perfectly symmetric system, used at unit magnification, automatically cancels the odd aberrations — coma, distortion, and lateral chromatic aberration — because each half introduces them with the opposite sign, so they subtract. Real photographic double-Gauss lenses are only nearly symmetric (they image distant objects, not at 1:1), but enough of the cancellation survives that this layout has dominated fast standard lenses for a century. Zeiss's Planar is the canonical name; most 50 mm $f$/1.4 and $f$/1.8 lenses are its descendants. (Figure 9.4.2)
Telephoto and retrofocus. The two remaining classics solve a different problem — not image quality but physical length. Naively a lens of focal length $f$ ought to be roughly $f$ long, which is awkward for a 300 mm telephoto (too long to carry) and impossible for a wide-angle on a camera with a mirror in the way (too short to clear it). Both are fixed by ordering a positive and a negative group so as to slide the rear principal plane — defined in the next section — to a more convenient place. (Figure 9.4.3)
- A telephoto lens puts a positive group in front and a negative group behind. The rear negative group pushes the rear principal plane forward, out past the front of the lens, so the whole assembly is physically shorter than its focal length — a 300 mm telephoto in a barrel only 200 mm long. This is what "telephoto" properly means: not merely "long lens" (the everyday usage) but this length-shortening positive-then-negative construction.
- A retrofocus lens (also called inverted telephoto) does the reverse: a negative group in front, a positive group behind, pushing the rear principal plane backward and so lengthening the back-focal distance — the air gap between the last glass and the sensor. A wide-angle lens has a short focal length and would normally sit almost on the sensor, but on a single-lens reflex (SLR) camera the swinging mirror needs room, so the lens must stand off; retrofocus design buys that standoff. It is no accident that mirrorless cameras, with no mirror box to clear, can use simpler, smaller wide-angle lenses — they no longer pay the retrofocus tax.
The zoom. Finally, the zoom lens is a system whose focal length is variable. Internally one group of elements — the variator — slides along the axis to change the magnification and hence $f$, while a second group — the compensator — slides in a coordinated way to keep the image plane fixed so the picture stays in focus as you zoom (mechanically the two motions are coupled by a cam). The price is steep: a zoom must be reasonably corrected at every focal length in its range, not just one, so it carries many elements and many compromises, and it is the hardest kind of lens to optimize well. Almost every aberration trade-off in this part gets harder when the geometry itself is allowed to move. (Figure 9.4.4)
The throughline of the bestiary is that element count is correction made visible. A two-element doublet fixes color; three elements fix the primary aberrations; six or seven buy a fast normal lens; ten to twenty buy a high-performance zoom. Each stage spends glass to cancel one more class of the errors catalogued in the previous chapter. The next section gives the language to describe any of these stacks — however many elements — as a single equivalent system.
9.4.1 The lens as a system: cardinal points, pupils, f-number, T-stop⧉
The liberating idea of this chapter is that you almost never need to think about the individual elements. No matter how many lenses are crammed into the barrel, the system as a whole behaves, for the purpose of locating images, exactly like an ideal thin lens — provided you measure distances from the right reference planes and the aperture at the right opening. Those reference planes and openings are the system's cardinal points and pupils, and they are what a lens's published specifications — focal length, f-number, angle of view — actually refer to.
Principal planes and the thick-lens equation. A thin lens has one obvious plane to measure from: the lens itself. A thick, multi-element system has no single such plane, but it turns out to have two effective ones, the front and rear principal planes, written $H$ and $H'$. They are the planes from which object and image distances must be measured for the familiar equation to hold: with $u$ measured from $H$ and $v$ measured from $H'$,
holds for the entire system, with $f$ the system's effective focal length. The whole black box then acts like a single thin lens placed at $H$ for the object side and at $H'$ for the image side. The two planes need not lie inside the glass at all — and indeed, as we just saw, the telephoto and retrofocus designs are precisely tricks for moving $H'$ to where it is wanted. This is also why the effective focal length (measured from $H'$) and the back focal length (measured from the last glass surface to the sensor) are different numbers: the telephoto's effective focal length is large while its back focal length is small, and the retrofocus is the reverse. (Figure 9.4.5)
Nodal points. A second pair of cardinal points, the nodal points, are defined by a different property: a ray aimed at the front nodal point leaves the rear nodal point traveling parallel to its original direction — it crosses the axis undeviated in angle. When the lens has air on both sides (the ordinary camera case) the nodal points coincide with the principal points. The front nodal point is the pivot you rotate a camera about to shoot a panorama with the near and far objects holding still relative to each other (it is sometimes loosely called the "no-parallax point," which more precisely is the entrance pupil, discussed next).
Entrance and exit pupils. Somewhere inside the lens is the aperture stop — the diaphragm whose blades you open and close. But you rarely see the stop directly; you see its image formed by the glass in front of or behind it. Looking into the front of the lens, the apparent opening is the entrance pupil (EP) — the image of the aperture stop projected into object space by the elements ahead of it. Looking into the back, you see the exit pupil, the image of the stop projected into image space by the elements behind it. The entrance pupil matters enormously because it is the opening that actually gathers light: a scene point sends into the lens exactly the cone of rays that fills the entrance pupil. So the entrance pupil — not the physical front element, and not the physical stop — is the effective aperture of the system. It is also the true center of perspective: rotating the camera about it gives genuinely parallax-free panoramas, which is why it, rather than the front nodal point, is the precise "no-parallax point."
The f-number, pinned to the entrance pupil. FUNDAMENTALS defined the f-number $N = f/D$ as focal length over aperture diameter, controlling both light-gathering (a stop is a factor of $\sqrt 2$ in diameter, a factor of $2$ in area and exposure) and depth of field. Now we can say which diameter: the entrance-pupil diameter $D_{\text{EP}}$, so that
Using the front-element diameter instead is a genuine mistake — in a retrofocus or telephoto lens the front element can be much larger or smaller than the entrance pupil, and only $D_{\text{EP}}$ gives the right light-gathering and depth of field. Off-axis, the barrel and the front and rear elements can clip the pupil — the oblique cone of light no longer passes cleanly — which darkens the corners (optical vignetting, distinct from the natural $\cos^4$ falloff of FUNDAMENTALS) and squeezes out-of-focus highlights into "cat's-eye" shapes. We pick that up in Depth of field; here it is enough to note that the pupil, not the stop, is what gets vignetted.
The T-stop. The f-number quietly assumes that all the light entering the entrance pupil reaches the sensor — that the lens transmits 100% of it. Real glass absorbs a little and, more importantly, every air–glass surface reflects a few percent (which we attack with coatings in Glare suppression), so a multi-element lens transmits only a fraction $\tau < 1$ of the light. Two lenses of the same f-number can therefore deliver different actual exposures. Cinematographers, who cannot tolerate a brightness jump when they switch lenses, mark their lenses not in f-stops but in T-stops, which fold in the measured transmittance:
The square root appears because exposure goes as the square of the f-number (it is an area), so a transmittance $\tau$ reduces the effective area by $\tau$ and raises the equivalent f-number by $1/\sqrt{\tau}$. A T-stop of $T$ thus means "this lens gives the exposure that a perfect (lossless) $f$/$T$ lens would," which is exactly the number you need to set exposure reliably. As coatings push $\tau$ toward 1 (well above 0.9 for a good modern multi-coated lens), the T-stop creeps back toward the f-number.
How many elements, in practice. Putting it together: the visible cost of cancelling every aberration in glass is the element count, and modern lenses are extravagant. A high-performance zoom or a fast prime commonly runs 10 to 20-plus elements in several groups, sprinkled with aspheric surfaces and exotic low-dispersion glasses to do more correction per element (both detailed in Aberrations correction). Even a smartphone "lens," for all that it looks like a single dot, is a stack of five to eight tiny molded plastic or glass elements, several strongly aspheric. Every one of those elements is there to cancel something — and the central argument of this whole part is that you do not have to cancel everything in glass. You can let the optics get close and finish the job in software: correct the residual distortion, vignetting, and color computationally, which is cheaper than another aspheric element. That trade — glass against computation — is the spine of the chapters ahead, and the cardinal-point picture built here is what lets us reason about it cleanly: it tells us exactly what the optics is promising, so we know precisely what the software must clean up.