8.0 SINGLE-IMAGE COMPUTATIONAL PHOTOGRAPHY
A single photograph is a measurement, and like every measurement it threw something away. The lens blurred the scene; the sensor sampled it on a finite grid and added noise; a foreground hair, a saturated highlight, a power line you wish you hadn't framed all collapsed into the same flat array of pixels. This part asks how much you can get back — and, just as honestly, how much you can only get by making it up. Its tasks look unrelated at first glance: turn a small image into a larger sharp one, undo camera shake, strip the fog out of a landscape, paint over a removed object, lift a person cleanly off their background, split a pixel into "light" and "surface," or repaint a snapshot in someone else's style. They are one part because they are all the same shape of problem. A known physical process — blur, downsample, scatter, mix, mask — ran forward and produced the photo you have. You want to run it backward. And running it backward is ill-posed: the forward process destroyed information, so infinitely many scenes are consistent with the one image you measured. The data alone cannot choose among them.
What chooses is a prior: a model of what real images look like. This is the spine of the whole part, and it is worth stating bluntly: the prior is not optional. When a measurement genuinely destroys information (frequencies above the sensor's Nyquist limit, the contents of a hole, the foreground color hidden behind a wisp of hair), no amount of cleverness recovers it from the data alone. Naïvely inverting the process also amplifies noise, because it divides by the near-zero frequency response at exactly the frequencies the blur suppressed. A prior is what selects an answer. It is a load-bearing part of the algorithm, not a tuning knob you sprinkle on at the end. Every chapter here is the same recipe, an ill-posed inverse problem rescued by a prior, written out with a different physics and a different prior in the slot. The formal version is one line you will see again and again: recover $\hat{x}$ by trading fidelity to the measurement against plausibility under the prior, $\hat{x} = \arg\min_x \tfrac{1}{2}\lVert Ax - y\rVert^2 + \lambda\,\Phi(x)$, where $A$ is the forward model, $y$ is the measured image, $\Phi$ is the prior, and $\lambda$ sets the balance between the two.
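To see both halves of that claim at once, here is a minimal sketch in one dimension. It assumes a circular Gaussian blur for $A$ (so everything diagonalizes under the FFT) and the simplest possible prior, $\Phi(x) = \tfrac{1}{2}\lVert x\rVert^2$, chosen only because it has a closed-form minimizer. The test signal, noise level, and $\lambda$ are illustrative values, not ones taken from later chapters.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256

# Hypothetical test scene: a piecewise-constant 1-D signal.
x = np.zeros(n)
x[60:120] = 1.0
x[150:170] = 0.5

# Forward model A: circular convolution with a Gaussian kernel (assumed, sigma = 1.5).
t = np.arange(n)
d = np.minimum(t, n - t)              # circular distance from sample 0
k = np.exp(-0.5 * (d / 1.5) ** 2)
k /= k.sum()
K = np.fft.fft(k)                     # frequency response of the blur

# Measurement y = A x + noise.
y = np.real(np.fft.ifft(K * np.fft.fft(x))) + 1e-3 * rng.standard_normal(n)
Y = np.fft.fft(y)

# Naive inverse: divide by K, which is tiny at high frequencies.
x_naive = np.real(np.fft.ifft(Y / K))

# Regularized inverse for Phi(x) = 0.5 * ||x||^2:
# the minimizer is conj(K) * Y / (|K|^2 + lam), frequency by frequency.
lam = 1e-3
x_reg = np.real(np.fft.ifft(np.conj(K) * Y / (np.abs(K) ** 2 + lam)))

err_naive = np.linalg.norm(x_naive - x)
err_reg = np.linalg.norm(x_reg - x)
print(f"naive error: {err_naive:.3g}   regularized error: {err_reg:.3g}")
```

The naive estimate is swamped by amplified noise even though the noise added to $y$ is small, while the regularized one stays close to the scene. A quadratic $\Phi$ is a weak prior (it still blurs edges); the point of the sketch is only that some $\Phi$ has to be there.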
The honest distinction that runs through the part is reconstruction versus hallucination. Both make a better-looking picture; only one adds truth. A reconstruction prior fuses detail that was genuinely measured somewhere — sub-pixel-shifted frames in a burst, frequencies the optics merely attenuated rather than killed, structure copied from elsewhere in the same image. It is recovered: verifiable, repeatable, faithful. A hallucination prior invents detail that is plausible for natural images but was never in the measurement — pores on a face, the digits on a distant sign, a whole region behind a deleted object. It can look stunning and be wrong. Keeping that line in view is the difference between a tool and a forgery: a super-resolved licence plate and an inpainted scene are not evidence, however sharp they look.
The roadmap walks from recovering measured detail toward inventing plausible detail, then to combining and re-styling images. Super-resolution and image priors sets up the whole part: it writes the super-resolution forward model $y = (k * x)\downarrow_s + n$ (blur by a kernel $k$, downsample by a factor $s$, add noise $n$), shows why inverting it needs a prior, separates genuine multi-frame reconstruction (burst super-resolution, hand tremor turned into extra samples) from learned single-frame hallucination, and lands the part's most reusable abstraction: any denoiser is a prior you can plug into any solver (Plug-and-Play and regularization by denoising (RED)), an idea that diffusion models carry to its continuous limit. Blind deblurring is the hard sequel: invert a blur when the image is noisy (why the naïve inverse explodes, and the Wiener filter as the regularized fix) and when the blur kernel itself is unknown (a sparse-gradient prior breaks the chicken-and-egg tie). Dehazing is the same shape in a different costume: atmospheric scattering, $I = Jt + A(1-t)$, with $J$ the haze-free scene, $t$ the transmission, and $A$ the airlight (not the forward operator $A$ of the general recipe), made solvable by one clever statistical prior (the dark channel). Together these three are the "recover what the measurement attenuated" stretch.
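Why the super-resolution model cannot be inverted from data alone fits in a few lines. The sketch below assumes the simplest instance of $y = (k * x)\downarrow_s$: $k$ is an $s \times s$ box filter aligned with the sampling grid and noise is omitted, so the forward model reduces to block averaging. The function name `forward` and the checkerboard perturbation are illustrative choices.

```python
import numpy as np

def forward(x, s):
    """Box-blur then downsample by s: each s-by-s block is replaced by its mean."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

rng = np.random.default_rng(0)
s = 2
x1 = rng.random((8, 8))               # one candidate scene

# A checkerboard sums to zero inside every 2x2 block, so the forward model cannot see it.
i, j = np.indices(x1.shape)
checker = np.where((i + j) % 2 == 0, 1.0, -1.0)
x2 = x1 + 0.3 * checker               # a visibly different scene

y1, y2 = forward(x1, s), forward(x2, s)
same_measurement = np.allclose(y1, y2)
different_scenes = not np.allclose(x1, x2)
print(same_measurement, different_scenes)
```

Both scenes produce the same low-resolution image, so nothing in $y$ can say which one was photographed. Any method that returns a single sharp answer has picked it with a prior.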
The next chapters push the dial toward invention. Inpainting, texture synthesis is the purest case of all: a hole has zero data, so 100% of the fill comes from the prior. The chapter is a ladder of priors from weakest to strongest: smoothness, self-similarity (copy patches from elsewhere), a photo database ("the internet is your prior"), and a learned generative model. Patch match is the engine that makes the copy-a-patch methods fast: a randomised nearest-neighbour search, with Shift-Map as its graph-cut cousin, which minimizes a single global energy instead. Compositing, segmentation and matting combines images rather than fixing one: cut the subject out and blend it in. The cut is a graph cut, and the soft matte $C = \alpha F + (1-\alpha)B$ (observed color $C$, foreground $F$, background $B$, opacity $\alpha$) is another brutally ill-posed inverse rescued by a constraint or prior. Illumination related effects in a single image factors a pixel into the light × surface product the world actually produced (intrinsic images $I = R\cdot S$, with reflectance $R$ and shading $S$, plus reflection, shadow, and highlight removal), each ill-posed because the world is multiplicative and one pixel can't say which factor is which.
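How ill-posed the matte is can be read off by counting. As a worked illustration, assuming RGB color and no trimap or other constraint, each pixel supplies three equations:

$$
\begin{pmatrix} C_r \\ C_g \\ C_b \end{pmatrix}
= \alpha \begin{pmatrix} F_r \\ F_g \\ F_b \end{pmatrix}
+ (1-\alpha) \begin{pmatrix} B_r \\ B_g \\ B_b \end{pmatrix},
\qquad
\text{unknowns: } \alpha,\ F_r, F_g, F_b,\ B_r, B_g, B_b \ \ (7).
$$

Even if the background is known exactly, four unknowns remain against three equations. For example, with $B = (0, 0, 1)$ the observation $C = (\tfrac{1}{2}, 0, \tfrac{1}{2})$ is explained equally well by $\alpha = \tfrac{1}{2},\ F = (1, 0, 0)$ (a half-transparent red) and by $\alpha = 1,\ F = (\tfrac{1}{2}, 0, \tfrac{1}{2})$ (an opaque purple).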
Two chapters close the part by changing the goal from faithful to expressive. Style transfer imposes the look of a reference onto a target while keeping its content, running from classical patch-and-statistics methods to neural style (style as the Gram-matrix correlations of deep features) and learned image-to-image translation. Non-photorealistic rendering is its rendering-side cousin — abstracting and re-drawing a photograph as paint, ink, or line. And a short Recap: tone mapping gathers the tone-mapping thread that has run since the basics into one taxonomy and ties it back to the darkroom — a backward-looking consolidation of a thread that ran through the whole book, not new algorithms.
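The phrase "style as the Gram-matrix correlations of deep features" can be made concrete with a small sketch. It assumes a feature map of shape `(C, H, W)` and normalizes by the number of spatial positions (conventions vary). The random array below is only a stand-in for features that would come from a trained network.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a (C, H, W) feature map, averaged over positions."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 6, 6))     # stand-in for deep features

# Scramble the spatial layout, using the same permutation in every channel.
perm = rng.permutation(6 * 6)
shuffled = feats.reshape(4, -1)[:, perm].reshape(4, 6, 6)

g = gram_matrix(feats)
g_shuffled = gram_matrix(shuffled)
print(np.allclose(g, g_shuffled))
```

The Gram matrix is unchanged by the shuffle: it records which feature channels fire together but not where, which is why it can serve as a description of style that is separate from content.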
Through all of it, watch for the same recurring characters. Affinity (how alike two pixels are in color or intensity, read as how much they belong together) underlies the matting Laplacian, the graph-cut smoothness term, and the colorization weight: one object wearing many hats. The denoiser-as-prior of the super-resolution chapter becomes a diffusion model at its sampleable limit, the leap from a prior you can only score with to one you can draw images from. And a learned operator is, throughout, just the data-fit-plus-prior skeleton with the hand-designed prior swapped for a data-driven $\Phi$. Hold onto one sentence and the chapters stop looking like a grab-bag of tricks: get more out of one image than it seems to contain, by deciding, honestly, what to measure and what to assume.
Contents of this part
- 8.1 Recap: tone mapping
- 8.2 Super-resolution and image priors
- 8.3 Blind deblurring
  - Deblurring in the presence of noise: why naive inversion fails
  - The Wiener filter: the regularized, noise-aware inverse
  - Blind deblurring: estimating the kernel and the image
  - A more realistic blur model: spatially-varying (camera-shake) blur
  - Decolorization (Color2Gray) as gradient-matching optimization
  - Colorization as an optimization (the inverse-problem framing)
- 8.4 Dehazing
- 8.5 Style transfer
- 8.6 Inpainting, texture synthesis
  - Inpainting as filling unmeasured pixels: the spectrum of priors
  - PDE / diffusion-based inpainting
  - Texture synthesis (Efros–Leung; Efros–Freeman quilting)
  - Exemplar inpainting: clone, healing brush, and object removal (Criminisi)
  - Data-driven scene completion (Hays & Efros)
  - Deep inpainting (context encoders → partial/gated conv → diffusion)
  - Highlight / specular recovery
  - Epitomes: a compact patch model
- 8.7 Patch match
- 8.8 Compositing, segmentation and matting
  - Segmentation: cutting the object out
  - The matting equation and the alpha channel
  - Trimaps: turning the impossible inverse into a tractable one
  - Natural-image matting: Bayesian and closed-form
  - Easy matting I: blue/green/chroma keying (constrain the background)
  - Easy matting II: IR, depth and multi-flash matting (measure the separation)
  - Rotoscoping and video matting (coherence over time)
  - Learned segmentation and matting (the prior, learned)
  - Harmonization: making the composite belong
  - Where the blends live: Poisson, pyramid, photomontage, and fusion (cross-refs)
- 8.9 Illumination related effects in a single image
- 8.10 Non-photorealistic rendering
  - What NPR is for, and the one idea
  - Stroke-based / painterly rendering (and the brush p-set)
  - Edge-preserving abstraction: bilateral + Difference-of-Gaussians
  - Example-based stylization and the bridge to neural style
  - Region-based stylization: stained glass, low-poly, mosaics
  - Artistic screening and halftoning