4.0 COMPUTATIONAL TOOLS

Everything before this part built an image up: the physics of light, the optics and sensing that turn a scene into a grid of numbers, and the basic pipeline that cleans and shapes those numbers into a finished photograph. This part is a change of register. It does not add a new stage to the imaging pipeline. Instead it assembles the general-purpose computational machinery the rest of the book reaches for again and again — the tools we use to recover an image we did not directly measure, or generate one that never existed. Almost every later technique — deblurring, super-resolution, compositing, high-dynamic-range merging, video, relighting — is, at bottom, one of these tools pointed at a particular imaging problem. So we gather them here, once, before the application parts begin, and learn each well enough that the applications read as variations on a theme.

The unifying picture is the inverse problem. A camera applies some operation to the world — a blur, a downsampling, a color transform, the loss of a region behind an occluder — and hands us the result. We want to run that arrow backwards: given the measurement, recover the scene. Written as a forward model, the measurement is $y = Ax$, where $x$ is the image we want, $A$ is the operation the imaging process applied, and $y$ is what we got. Recovery is then "undo $A$." The catch, which animates this whole part, is that undoing $A$ is almost never clean: the measurement may be noisy, and worse, $A$ often throws information away (a blur erases fine detail; a downsample discards pixels), so infinitely many images $x$ are consistent with the same $y$. The recurring cure is a prior — a sense of what a real image looks like — used to choose among all the consistent answers. Every chapter here is the same sentence said louder: data-fit plus a prior. Stay faithful to what was measured, and among all faithful answers prefer the one that looks like a real image. What changes from chapter to chapter is only where the prior comes from.
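To make "data-fit plus a prior" concrete, here is a minimal sketch under stated assumptions. The forward model (averaging adjacent pairs of samples, a toy 2x downsample), the smoothness prior (a squared finite-difference penalty), and the weight `lam` are all illustrative choices of ours, not something the later chapters commit to. The sketch first shows that two different signals produce the same measurement, then lets the prior pick one answer by minimizing $\tfrac12\|Ax - y\|^2 + \tfrac{\lambda}{2}\|Dx\|^2$.

```python
import numpy as np

n = 8

# Hypothetical forward model A: average adjacent pairs (a toy 2x downsample).
A = np.zeros((n // 2, n))
for i in range(n // 2):
    A[i, 2 * i] = 0.5
    A[i, 2 * i + 1] = 0.5

x_true = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0])
y = A @ x_true

# Any alternating +1/-1 pattern is invisible to A, so it can be added freely.
null_vec = np.tile([1.0, -1.0], n // 2)
x_other = x_true + null_vec
same_measurement = np.allclose(A @ x_other, y)  # two signals, one measurement

# Illustrative prior: penalize squared differences between neighbours.
D = np.diff(np.eye(n), axis=0)  # (n-1) x n finite-difference matrix


def roughness(x):
    """Squared norm of neighbour differences; small for smooth signals."""
    return float(np.sum((D @ x) ** 2))


# Minimize 0.5*||Ax - y||^2 + 0.5*lam*||Dx||^2 via its normal equations.
lam = 1e-3  # assumed prior weight, chosen only for illustration
x_hat = np.linalg.solve(A.T @ A + lam * (D.T @ D), A.T @ y)

print("same measurement:", same_measurement)
print("roughness of x_other:", roughness(x_other))
print("roughness of x_hat:", roughness(x_hat))
```

The data term alone cannot tell `x_true` from `x_other`; only the added penalty does. A different prior would select a different answer from the same measurement, which is the sense in which the chapters differ only in where the prior comes from.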

The four chapters trace exactly that progression, from a hand-built prior to one you can sample from.

Linear Inverse Problems and Regression sets up the frame and the cheapest version of the prior. It takes the cleanest example, non-blind deblurring, where the blur $A$ is known, and shows that recovery is regression: find the image $x$ whose blurred version best matches the photo, $\hat x = \arg\min_x \tfrac12\|Ax - y\|^2$. The decisive trick is that you never build the matrix $A$; it is far too large (millions by millions for a normal photo). Iterative solvers such as gradient descent, conjugate gradient, and multigrid only ever need to apply the operation and its transpose, and for a convolution the fast Fourier transform (FFT) applies, or even inverts, the operation without ever forming it. That same solver machinery resurfaces in gradient-domain editing, Poisson cloning, colorization, and matting.

Machine learning keeps that recipe and replaces the hand-designed half. The classical prior (a smoothness penalty, an inverse filter, a clever heuristic) becomes a function $f_\theta$ learned from data: collect example pairs of degraded input and clean target, and fit the operator to them. This chapter is the model-agnostic principle, and its real lesson is that the data is as much the method as the model: the synthetic degradations, realistic noise models, and benchmark datasets are what make a learned operator transfer to real photographs.

Deep learning is that idea made concrete: a survey of what deep networks actually learn for single-image tasks (denoising, demosaicking, super-resolution, colorization, monocular depth, retouching), plus the generative image-to-image translators and the learned perceptual metrics (such as the learned perceptual image patch similarity, LPIPS) that judge results where pixel-error metrics fail.

Generative AI and diffusion pushes the prior to its limit. The earlier priors were ones you could only evaluate: score an image, or denoise it. A generative model learns the distribution of natural images $p(x)$ well enough that you can sample fresh images from it. That single shift, from scoring to drawing, unlocks unconditional generation, text- and image-conditioned synthesis, and posterior sampling for inverse problems. And the dominant engine, diffusion, turns out to be exactly the denoiser of the earlier chapters run in a loop.
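The "never build the matrix" idea from the first chapter summary can be sketched in a few lines. The version below is a minimal illustration under assumptions of our own: a 1D signal instead of an image, circular boundary conditions, a noise-free measurement, and a mild three-tap kernel chosen so the problem is well conditioned. The helper names `make_blur_ops` and `conjugate_gradient` are hypothetical. The solver only ever calls functions that apply $A$ and $A^\top$; it solves the normal equations $A^\top A x = A^\top y$ of the least-squares objective $\tfrac12\|Ax - y\|^2$.

```python
import numpy as np


def make_blur_ops(kernel, n):
    """Return functions applying a circular blur A and its transpose A^T."""
    k = np.zeros(n)
    k[: len(kernel)] = kernel
    k = np.roll(k, -(len(kernel) // 2))  # put the kernel centre at index 0
    K = np.fft.fft(k)

    def A(x):
        # Circular convolution, computed as a pointwise product in Fourier.
        return np.real(np.fft.ifft(K * np.fft.fft(x)))

    def At(v):
        # Transpose of a real circular convolution: conjugate the spectrum.
        return np.real(np.fft.ifft(np.conj(K) * np.fft.fft(v)))

    return A, At


def conjugate_gradient(apply_M, b, iters=200, tol=1e-10):
    """Solve M x = b for symmetric positive-definite M given only x -> M x."""
    x = np.zeros_like(b)
    r = b - apply_M(x)  # residual
    p = r.copy()        # search direction
    rs = r @ r
    for _ in range(iters):
        if np.sqrt(rs) < tol:
            break
        Mp = apply_M(p)
        alpha = rs / (p @ Mp)
        x += alpha * p
        r -= alpha * Mp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


n = 64
rng = np.random.default_rng(0)
x_true = rng.standard_normal(n)

A, At = make_blur_ops(np.array([0.2, 0.6, 0.2]), n)
y = A(x_true)  # the "photo": a blurred, noise-free measurement

# Normal equations A^T A x = A^T y, using only applications of A and A^T.
x_hat = conjugate_gradient(lambda v: At(A(v)), At(y))
print("max recovery error:", np.max(np.abs(x_hat - x_true)))
```

With noise, or with a kernel whose spectrum touches zero, this unregularized solve would amplify errors badly; that is where the prior term enters, typically as an extra operator added inside `apply_M`.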

It is worth naming the throughline, because the part's big lessons compound rather than compete. A linear operator is easiest to invert in the basis where it is diagonal — for convolution, that basis is Fourier (L5). A learned operator swaps a hand-designed prior for one fit to data (L8). When the measurement destroys information, the prior is not optional — it is the only thing that picks an answer (L10). And a generative model is a prior you can sample (L11). Carry those four and you carry most of the conceptual weight of the rest of the book.
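The first of those lessons can be checked numerically on a tiny case. In the sketch below (our own illustrative setup: a length-8 signal, circular boundaries, and a three-tap kernel whose spectrum stays away from zero), we build the circular-convolution matrix explicitly, which is affordable only because the problem is tiny, and confirm that changing to the Fourier basis leaves a diagonal matrix whose entries are the FFT of the kernel. Inverting the operator then reduces to one division per frequency.

```python
import numpy as np

n = 8

# Illustrative blur kernel, centred at index 0 with circular wrap-around.
k = np.zeros(n)
k[0], k[1], k[-1] = 0.6, 0.2, 0.2

# Explicit circulant matrix: C[i, j] = k[(i - j) mod n], so C @ x is the
# circular convolution of k with x.
C = np.stack([np.roll(k, j) for j in range(n)], axis=1)

# DFT matrix F, so that F @ x equals np.fft.fft(x).
F = np.fft.fft(np.eye(n))

# Change of basis: in Fourier coordinates the operator is diagonal.
C_fourier = F @ C @ np.linalg.inv(F)
off_diagonal = C_fourier - np.diag(np.diag(C_fourier))
print("largest off-diagonal magnitude:", np.max(np.abs(off_diagonal)))

# Inversion in that basis is a per-frequency division.
x = np.arange(n, dtype=float)
y = C @ x
x_rec = np.real(np.fft.ifft(np.fft.fft(y) / np.fft.fft(k)))
print("max inversion error:", np.max(np.abs(x_rec - x)))
```

The division is safe here only because no frequency of this particular kernel is close to zero. When some are, the corresponding information is gone from the measurement and the division blows up, which is the third lesson in miniature.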

A closing word on what these chapters are not. They are tools, not applications, so they lean on the linear-algebra, optimization, probability, and machine-learning background gathered in the Refreshers appendix rather than re-teaching it — in particular, we do not teach neural networks from scratch here. And they keep their examples deliberately generic: deblurring stands in for every linear inverse problem, a U-Net for every learned operator, diffusion for every generative prior. The point is the machinery. Once you have it, most of what follows is a matter of choosing the forward model $A$ and the prior — and turning the crank.
