8.1 Recap: tone mapping⧉
A camera sensor can record a scene spanning many thousands to one in brightness — a sunlit window and the shadowed corner of the same room in a single capture. A print or a screen can show maybe a couple hundred to one. Tone mapping is the act of squeezing the former into the latter without the result looking either washed-out or crushed to mud. It is one of the oldest problems in the medium: long before anyone wrote code, a photographer in a darkroom faced exactly the same gap between what the negative held and what the paper could print, and answered it with the same two moves we still make digitally — a curve applied to the whole frame, and local adjustments that brighten or darken one region without touching the rest.
This chapter introduces no new algorithm. Every method named below was developed in an earlier chapter; the point here is to put them on one map, see why they all exist (they are five answers to a single question), and trace each back to the darkroom craft it mechanized. If you remember one thing, let it be the organizing question: how do you compress a scene's overall range while keeping its local contrast? A pixel-by-pixel curve cannot do both at once. Everything interesting in tone mapping is a way around that impossibility.
8.1.1 Global vs local, re-hashed⧉
A global tone-mapping operator applies one curve to every pixel — the output is a function of that pixel's intensity alone, with no regard for its neighbors. Gamma, a photographic S-curve, Reinhard's global operator (Reinhard et al. 2002): all are a single monotone function $L_\text{out} = f(L_\text{in})$ applied identically everywhere. This is simple, fast, and order-preserving (brighter-in stays brighter-out, so no reversals). It is also fundamentally limited.
Here is the limitation, and it is worth stating sharply because it is the reason the whole local family exists. A global curve that compresses the range must be flat — its slope must be much less than one — across the broad span from shadows to highlights, or the ends won't fit in the display's range. But the slope of the tone curve is the local contrast gain: a difference of $\Delta L$ in the input comes out as $f'(L)\,\Delta L$. Flatten the curve to fit the range and you simultaneously flatten every local detail riding on top of it. Open up the curve to keep the detail and the range no longer fits — the highlights clip, the shadows block up. One curve cannot both squash the global range and preserve local contrast, because the same number — the curve's slope — controls both (Figure 8.1.1).
A local operator escapes the bind by letting the mapping depend on a pixel's neighborhood. Intuitively: decide how bright the region should be (and compress that, the large-scale variation), then lay the fine local detail back on top at full strength. Because the compression now acts on a slowly-varying "overall brightness" rather than on every pixel's raw value, the local contrast is spared. The recurring price is the halo: if the estimate of "overall brightness" bleeds across a strong edge — say it averages the bright sky into the dark mountain just below it — then near that edge the operator pushes the wrong way, and a bright or dark fringe appears hugging the silhouette. Halos (and, in the worst case, gradient reversals) are the failure mode of local tone mapping, introduced in BASIC (Basic image processing and ISP#Tone mapping) and the thing every method below is in some part designed to avoid.
So the split is clean. Global: one curve, fast, safe, but range-versus-detail is a zero-sum tradeoff. Local: separate large-scale brightness from local detail and treat each on its own terms — strictly more powerful, at the risk of halos. The next section is a tour of the five ways people have drawn that separation.
8.1.2 A taxonomy of local methods⧉
Every local operator answers the same question — separate "overall brightness," which we compress, from "local detail," which we keep — and they differ only in how they draw that separation and how they avoid halos at edges. Five mechanisms, each developed in full elsewhere; here is the map (Figure 8.1.2).
- Gradient-domain (Fattal et al. 2002 — Fattal, Lischinski and Werman). Work on the image's gradient field rather than its pixels. A high-dynamic-range scene has a few enormous gradients — the edge of the window, the rim of the lamp — swamping the many small gradients of ordinary detail. Attenuate the large gradients (shrink the big steps, leave the small ones alone) and then reconstruct the image whose gradients best match the edited field — a Poisson solve, $\nabla^2 f = \operatorname{div}\mathbf{v}$. The huge range collapses while local contrast survives, because we manipulated differences and let the absolute level fall out of the solve. This is the tone-mapping face of L9 — the eye cares about gradients, not absolute values — and it is developed in full in Poisson image editing. Integrate in log, where a multiplicative contrast becomes a constant gradient (the encoding point below).
- Bilateral base/detail (Durand & Dorsey 2002 — Durand and Dorsey). Split the (log) image into a base layer — large-scale, the "overall brightness" — and a detail layer — everything finer — then compress the base, keep the detail, recombine. The trick is that the base is computed with the bilateral filter: a smoothing that averages only over neighbors of similar intensity, so it blurs within a region but refuses to blur across a strong edge. That edge-respecting average is exactly L4 — edge-preserving = affinity, where the affinity between two pixels is keyed on their intensity difference — which is why the base/detail split does not produce halos the way a plain Gaussian blur would (a Gaussian base would average sky into mountain and the recombination would ring). The canonical edge-aware local operator; developed in Bilateral filtering.
- Local Laplacian (Paris et al. 2011 — Paris, Hasinoff and Kautz). Manipulate a Laplacian pyramid directly, applying a per-coefficient remapping function that controls local contrast band by band. Its signal property is that it avoids halos by construction — rather than estimate one edge-preserving base and risk it leaking across an edge, it processes the multiscale decomposition so that local contrast is shaped without the artifacts the bilateral can still produce on hard cases. Same goal as the bilateral split, cleaner edge behavior; also in Bilateral filtering.
- Exposure fusion (Mertens et al. (exposure fusion) — Mertens, Kautz and Van Reeth). Given a bracket of exposures of the same scene, blend them by per-pixel quality weights — favoring pixels that are well-exposed, locally contrasty, and saturated — combined in a pyramid so the blend is seamless across scales. The remarkable part is what it skips: it produces a pleasing low-dynamic-range image without ever forming an HDR radiance map and tone-mapping it. It straddles the single/multi-image boundary (it consumes a bracket, like an HDR merge, but its output is a display image, like a tone map) — which is why its full treatment lives with HDR in Multiple exposure imaging; it appears here because mechanically it is a local tone operator.
- Learned (Gharbi et al. 2017 — HDRnet; trained on the FiveK corpus of expert retouching, Bychkovsky et al. 2011). Predict the local tone mapping from data: a network outputs the coefficients of a local, bilateral-grid affine transform, learned to imitate expert edits. This is L8 in the tone-mapping setting — the operator itself, rather than a hand-tuned curve or filter, is fit to a dataset. Developed in Deep learning.
The table to land is simple: one goal, five mechanisms. Separate large-scale brightness from local detail — via a gradient field, a base/detail split, a pyramid remapping, a multi-exposure blend, or a learned predictor — compress the former, keep the latter (Figure 8.1.2). The differences are engineering: how cleanly the separation respects edges, how it avoids halos, whether it needs one image or a bracket, whether it is hand-designed or learned.
8.1.3 The darkroom ancestor: dodge & burn and the Zone System⧉
It is tempting to read tone mapping as a digital invention — a thing computational photography added. It is the opposite. Photographers have controlled tone locally since the wet darkroom, and the digital operators above are, almost without exception, the mechanization of a craft more than a century old.
Start with the manual local operators. Under the enlarger, the print is exposed by light projected through the negative. Dodging is holding a piece of card (or a hand) between the light and a region of the paper for part of the exposure, so that region receives less light and prints lighter — you "hold back" the shadows so they don't block up. Burning is the reverse: after the base exposure, you give one region extra light through a hole in a card, so it prints darker — you "burn in" a bright sky so it doesn't blow out. Dodge and burn are local tone mapping with the hands — exactly the move of brightening or darkening one region while leaving the rest alone, done with cardboard and timing instead of a bilateral filter. The digital local operators automate it and make it per-pixel.
The global side has an analog ancestor too: the characteristic curve of film and paper — the H&D curve, plotting developed density against the logarithm of exposure. Its shape is the familiar toe, straight-line, shoulder: a gently-rising toe in the shadows (which gradually compresses the darkest values), a roughly straight middle (the linear-response region), and a shoulder that rolls off in the highlights (so the brightest values compress gracefully into the paper's maximum density rather than clipping abruptly). That toe-and-shoulder shape is a global tone curve realized in chemistry — and it is exactly what a digital global operator's S-curve emulates: soft shoulder for highlight roll-off, lifted toe for shadow compression, monotone in between. (Note that the H&D curve is plotted against log exposure, the photographer's natural axis — stops, L7 — and the reason the multiplicative world of light is best handled in log, L2.)
Tying the two together is the Zone System (Ansel Adams; Adams, The Negative, Adams, The Print). It is a disciplined method for mapping the scene's luminances onto the print's tones: meter the important values in the scene, decide which print tone — which "zone," from black through middle gray to paper white — each should land on, then control exposure and development of the negative, and the dodging and burning of the print, to put them there. The Zone System is tone mapping as deliberate craft — it sets the global transfer (where the metered values land on the curve) and the local adjustments (which regions get dodged or burned), and it did all of this decades before any computation existed.
So the bridge to land is exact. Digital global tone mapping = the characteristic curve — toe, straight-line, shoulder, now an arbitrary function instead of fixed chemistry. Digital local tone mapping = dodge and burn plus the Zone System — region-by-region brightening and darkening, now automatic and per-pixel instead of by hand under the enlarger (Figure 8.1.3). The computation did not invent tone control; it mechanized a century-old craft and made it instant and repeatable. This is the same continuity Adams drew when he called the negative the score and the print the performance — the negative records what was there, the print is an interpretation, and tone mapping is where the interpretation happens (cross-ref Human factors and the art of photography; the Introduction's score-and-performance analogy).