Computational Photography, an AI-powered Slopendium — 18 Image Forensics and Authentication
Part 18 IMAGE FORENSICS AND AUTHENTICATION
The rest of this book is mostly about *making* images — better, brighter, sharper, or out of nothing. This part is about *believing* them. Cheap editing tools, and now generative models that synthesize a convincing photograph from a sentence, have severed the old reflexive link between "photograph" and "something that happened." Two responses run in parallel. **Forensics** works on an image you are handed cold — no cooperation, possibly an adversary on the other side — and looks for the statistical and physical fingerprints that authentic capture leaves and editing or generation disturbs. **Authentication** works the other way around: it asks creators and cameras to *sign* their work and every edit to it, so trust comes from a verifiable record rather than a hunt for tells. Neither is sufficient alone — forensics gives evidence, not proof, and provenance is opt-in — so the honest position is that they are complementary.
18.1 Image Forensics
fig-focus-stacking
fig-focus-stacking · optics-chapter illustrative figure (07-07 Figure 4): a stepped-focus stack → sharpness selection → all-in-focus composite. Synthetic per-slice defocus on one photo (`sourced/corn-cobs.jpg`, © Frédo Durand) — license-safe. The full real-data treatment lives in part-08 `fig-focalstack-*`
fig-telephoto-vs-retrofocus
fig-telephoto-vs-retrofocus · two two-group schematics — telephoto (+ then −, principal plane H′ pushed in front → physical length < f) vs retrofocus/inverted-telephoto (− then +, long back-focal distance to clear the SLR mirror); marks f vs physical length, H′, F′
fig-correspondence-then-transport
fig-correspondence-then-transport · the L17 spine — one scene displaced (two views / two faces / two frames / one long frame), each resolved by estimating a coordinate map (homography, morph field, flow, track, motion vector, camera path) then transporting pixels by one shared inverse-warp engine; the finding is hard, the moving is plumbing 🟨
fig-thick-lens-wave-sim
fig-thick-lens-wave-sim · live 2-D FDTD wave sim of a **thick biconvex lens** focusing: a point source's diverging wave is reshaped by the slower glass into a converging one that meets at an image point; focus tracks 1/f = 2(n−1)/R and 1/v = 1/f − 1/u (Fundamentals → Lens image formation, Fermat/equal-path)
fig-jpeg-artifacts
fig-jpeg-artifacts · JPEG blocking & ringing at low quality (original vs q=8, zoomed crop) 🟨
fig-illum-times-reflectance
fig-illum-times-reflectance · light = illumination × reflectance (per-wavelength)
fig-portrait-lighting-sim
fig-portrait-lighting-sim · live portrait-lighting simulator (web edition): three soft area lights (key/fill/kicker — az/el/extent/intensity/colour, area-light-supersampled soft shadows) on a 3D face with a physiological melanin/hemoglobin skin model; a from-behind setup view with Meshy studio umbrellas + a camera rig; lighting presets; static fallback is a screenshot. *3D umbrella generated with Meshy AI.*
The premise of all blind forensics: a real photograph is the output of a specific **physical pipeline** — a particular sensor, a color-filter mosaic, a demosaicking algorithm, a lens, a JPEG encoder, a single illumination of a single 3-D scene — and that pipeline stamps the pixels with **consistent low-level regularities**. Splice two photos together, paint something out, scale a pasted region, or synthesize an image from a network, and those regularities are broken locally or never reproduced. Forensics is the art of modeling the expected regularity and flagging where it fails.
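As one concrete instance of "model the expected regularity and flag where it fails", the sketch below measures the 8×8 grid that a JPEG encoder leaves behind and looks for tiles where that grid is missing. It is a minimal illustration under stated assumptions, not a production detector: the names `blockiness` and `blockiness_map` are hypothetical, the "compressed" image is a crude synthetic stand-in (each block reduced to its mean) rather than real JPEG output, and the tile size and test data are arbitrary choices.

```python
import numpy as np

def blockiness(gray, block=8):
    """Mean horizontal step across block boundaries divided by the mean step inside blocks."""
    g = np.asarray(gray, dtype=np.float64)
    step = np.abs(np.diff(g, axis=1))        # step[:, j] = |g[:, j+1] - g[:, j]|
    cols = np.arange(step.shape[1])
    on_grid = (cols % block) == block - 1    # j = 7, 15, ... straddles a block boundary
    return step[:, on_grid].mean() / (step[:, ~on_grid].mean() + 1e-12)

def blockiness_map(gray, tile=64, block=8):
    """Blockiness per tile; tile must be a multiple of block so the grid phase is preserved."""
    g = np.asarray(gray, dtype=np.float64)
    rows, cols = g.shape[0] // tile, g.shape[1] // tile
    out = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            out[r, c] = blockiness(g[r * tile:(r + 1) * tile, c * tile:(c + 1) * tile], block)
    return out

# Synthetic demo (assumed data, not a real photograph).
rng = np.random.default_rng(0)
ungridded = rng.normal(128, 20, (256, 256))            # texture with no block grid

# Crude stand-in for heavy JPEG compression: keep only each 8x8 block's mean.
block_means = ungridded.reshape(32, 8, 32, 8).mean(axis=(1, 3))
blocky = np.kron(block_means, np.ones((8, 8))) + rng.normal(0, 0.5, (256, 256))

# Splice: paste a region that never went through the "encoder".
spliced = blocky.copy()
spliced[64:128, 128:192] = ungridded[64:128, 128:192]

scores = blockiness_map(spliced)
print(np.round(scores, 1))   # the pasted tile (row 1, col 2) stands out with a score near 1
```

A score near 1 means steps across block boundaries look like steps everywhere else, so no grid is present; a score well above 1 means the grid is there. In this toy setup the host image is uniformly blocky and the pasted tile is not, so the inconsistency is the evidence, which mirrors the premise above. Real images need a measure that is robust to scene content, to grid misalignment after cropping, and to recompression.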
18.2 Authentication and Provenance (C2PA)
fig-diffraction-aperture-size
fig-diffraction-aperture-size · two apertures: a smaller one diffracts more (spread θ ≈ λ/D) 🟨
⬜ figure not yet created
their complementarity [fig-provenance-vs-forensics
fig-telephoto-vs-retrofocus
fig-telephoto-vs-retrofocus · two two-group schematics — telephoto (+ then −, principal plane H′ pushed in front → physical length < f) vs retrofocus/inverted-telephoto (− then +, long back-focal distance to clear the SLR mirror); marks f vs physical length, H′, F′
fig-crop-focal-length
fig-crop-focal-length · cropping = changing focal length: full frame vs a crop box ≡ a longer-focal-length capture, upsampled 🟨
fig-rotation-challenge
fig-rotation-challenge · Andrew Adams' rotation challenge: rotate a full turn in N steps of 360/N° (each step bilinear-resamples the PREVIOUS result), for N∈{10,60,360}; track a validity mask through identical rotations so corners that ever leave the frame go black. Result returns to 0° but is resampled N times → accumulated blur/mush + shrinking valid region toward the inscribed disc. More, smaller rotations compound MORE damage (rotate k× by 360/N, not once by k·360/N)
⬜ figure not yet created
**Durable Content Credentials** — manifest + invisible **watermark** + **fingerprint**, so a screenshot that strips the metadata can still be **recovered** by matching the watermark/fingerprint to a provenance store [fig-durable-credentials fig-editing-lr-vs-ps
fig-pinhole-imaging
fig-pinhole-imaging · imaging-scenario series (2/3): add a pinhole to the bare sensor — one ray per scene point → a dim **inverted** image (same tree+sensor+colours as fig-bare-sensor-averaging)
fig-cos4-vignetting
fig-cos4-vignetting · natural vignetting: relative illumination ∝ cos⁴θ falling toward the corner, with an image-corner darkening illustration (companion to `fig-cos4-falloff`, ties θ to image radius + a picture)
Forensics is a losing race in the limit: as generators improve, the pixel-level tells vanish. Authentication changes the game by **not relying on the pixels to confess**. Instead it asks the *honest* parts of the pipeline — the camera, the editing app, the AI tool, the publisher — to **cryptographically sign what they did**, and binds that signed record to the image. A reader then **verifies a signature against a trust list** rather than hunting for artifacts. The catch, which we keep front and center, is that this is **opt-in**: it can prove an image *is* what it claims, but a missing or stripped credential proves nothing — so it complements forensics rather than replacing it.
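The sign-then-verify flow described above can be sketched in a few lines. This is a toy under loud assumptions, not the C2PA format: an HMAC with a shared secret stands in for the public-key signatures a real credential system uses, the "trust list" is a plain dictionary, and the names `TRUST_LIST`, `sign_manifest`, and `verify` are hypothetical. What it does show is the shape of the check: hash the pixels, sign a claim that contains the hash, and have the reader verify the signature and the hash instead of inspecting the pixels.

```python
import hashlib
import hmac
import json

# ASSUMPTION: a shared-secret HMAC stands in for a public-key signature.
# A real system would verify against signer certificates, not secret keys.
TRUST_LIST = {"example-camera": b"demo-secret-key"}   # hypothetical signer -> key

def _payload(claim: dict) -> bytes:
    # Canonical serialization so signer and verifier hash identical bytes.
    return json.dumps(claim, sort_keys=True).encode()

def sign_manifest(pixels: bytes, signer: str, action: str) -> dict:
    """Bind a signed claim ("who did what") to the exact image bytes."""
    claim = {
        "signer": signer,
        "action": action,
        "content_hash": hashlib.sha256(pixels).hexdigest(),
    }
    signature = hmac.new(TRUST_LIST[signer], _payload(claim), hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}

def verify(pixels: bytes, manifest) -> str:
    """Return a verdict string; a missing manifest is 'unknown', never 'fake'."""
    if manifest is None:
        return "unknown"                    # opt-in: absence proves nothing
    claim = manifest["claim"]
    key = TRUST_LIST.get(claim["signer"])
    if key is None:
        return "untrusted-signer"
    expected = hmac.new(key, _payload(claim), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return "invalid-signature"          # the claim was altered after signing
    if claim["content_hash"] != hashlib.sha256(pixels).hexdigest():
        return "content-mismatch"           # the pixels changed after signing
    return "verified"

photo = b"\x00\x01\x02 stand-in for raw image bytes"
credential = sign_manifest(photo, "example-camera", "captured")

print(verify(photo, credential))            # verified
print(verify(photo + b"!", credential))     # content-mismatch
print(verify(photo, None))                  # unknown
```

The third verdict is the design point: a stripped or absent credential yields "unknown", so the verifier never treats missing provenance as evidence of fakery. That is where forensics has to take over.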