💬Comments welcome. To leave a note, select any text and click the note / highlight button that pops up — or open the panel with the tab at the top-right (‹). Notes are visible only inside our private review group.
Computational Photography, an AI-powered Slopendium — 06 Warping and morphing
expand to📖 Full book outlinejump to1 parts · 7 chapters · 17 sections · 16 figures embedded · 9 placeholders · double-click a figure to enlarge
Part 6 WARPING AND MORPHING
💡 **Big lesson (L17 · transport half):** almost every geometric technique is **(1) establish a correspondence / coordinate map**, then **(2) transport pixels along it** by warping and resampling. This part owns step (2) — one shared engine serving rectification, panoramas, morphing, stabilization, and frame interpolation — and the ways to *build* a map from sparse control. The hard *estimation* half (step 1) is [[Matching pixels across space and time]].
6.1 Warping and resampling
fig-warp-domain-vs-range
fig-warp-domain-vs-range · one image, two orthogonal arrows — the domain arrow bends the grid (a pixel slides, carrying its colour; a warp), the range arrow recolours in place (a point/tone op); warping moves where, not what 🟨
⬜ figure not yet created
`fig-forward-warp-holes` (forward/input-driven warp scatters input pixels to fractional output locations → gaps (uncovered output) and pile-ups (many inputs to one output)) fig-forward-warp-holes
fig-inverse-warp-lookup
fig-inverse-warp-lookup · inverse (output-driven) warping, the useful idiom — loop over output pixels, push each back through $f^{-1}$, sample there; every output covered once, the only difficulty a clean resampling 🟨
fig-recon-filters-1d
fig-recon-filters-1d · nearest (boxy staircase), linear (kinked through every sample), and cubic (smooth with mild overshoot) reconstruction of the same 1-D sample row — the sharpness/smoothness trade 🟨
⬜ figure not yet created
`fig-bilinear-weights` (the four surrounding pixels and the fractional-coordinate area weights of bilinear interpolation) fig-bilinear-weights
fig-minify-aliasing-moire
fig-minify-aliasing-moire · the L16 picture — a fine grating decimated naively (folds into false moiré) vs area-averaged first then decimated (clean); same size, opposite outcomes, decided by prefiltering 🟨
⬜ figure not yet created
`fig-kernel-scaling-minify` (the reconstruction kernel widened by the downscale factor so it averages over the right footprint — "gather farther when shrinking") fig-kernel-scaling-minify
fig-dof-ladder
fig-dof-ladder · the degrees-of-freedom ladder — a unit square under translation (2) → similarity (4) → affine (6) → projective (8) → free-form ($\infty$), each rung breaking a preserved property 🟨
fig-mesh-triangulation-warp
fig-mesh-triangulation-warp · piecewise-affine mesh warp — control matches triangulated, each triangle carrying its own affine map; exact and fast but only $C^0$, creasing along shared edges 🟨
⬜ figure not yet created
`fig-tps-rbf-warp` (a few control points displaced → a smooth thin-plate-spline field filling the gaps, minimal bending) fig-tps-rbf-warp
fig-beier-neely-segments
fig-beier-neely-segments · Beier–Neely feature-line warp — before/after directed segments, one segment's $(u,v)$ frame, and the distance-weighted blend of several segments' proposed displacements ($w_i=(\ell_i^p/(a+d_i))^b$) 🟨
fig-arap-vs-affine
fig-arap-vs-affine · drag one handle — an unconstrained affine/least-squares warp shears and stretches, vs an as-rigid-as-possible / MLS warp that keeps each neighbourhood near a rotation so the shape bends without stretching 🟨
💡 **Big lesson (L16 · Prefilter before you downsample):** a warp that *shrinks* the image (minifies) is throwing samples away — and if you just decimate (keep every $N$th source pixel, or sample a narrow reconstruction kernel) the detail too fine for the new grid **folds down into bold false moiré**, not into nothing. The fix is to **area-average first**: widen the resampling kernel in proportion to the downscale factor $s$ (so it gathers over the whole footprint each output pixel now covers) and renormalize — that low-pass blur is the prefilter. Enlarging needs *no* prefilter (clamp the kernel width to 1); shrinking *always* does. (→ see Big lesson **L16**, first placed in BASIC → Resampling; the frequency-domain *why* is **L5** — aliasing folds frequencies above the new Nyquist. The same lesson governs every geometric resample in this part: rectification corners that shrink, stabilization crops, optical-flow rendering.)
⚠️ **Candidate new part-spine lesson (flag only — not registered).** Almost every algorithm in this part follows one shape: **establish a correspondence / coordinate map between two images (or an image and a target frame), then transport pixels along it.** Warping makes the map explicit and applies it; perspective rectification *solves* the map (a homography) from constraints; panorama stitching *estimates* the map from features; morphing *interpolates between* two maps; optical flow *estimates a dense map* from brightness constancy; stabilization *smooths a map over time*; frame interpolation *renders along* a map. A lesson like **"find the coordinate map, then move the pixels"** would unify the part the way L10 unifies the inverse-problem part. Worth registering as a part-spine lesson for MOTION/WARPING — *noted here, deliberately not registered* (registry-first convention: register in [[Big Lessons]] before placing).
equations
warp as a domain map $\text{out}(x,y)=\text{in}\big(f^{-1}(x,y)\big)$ (move the domain, keep the range)
**forward** warp $\text{out}(f(x,y))\leftarrow \text{in}(x,y)$ (scatters → holes)
1-D **linear** reconstruction $\text{im}(x)= (1-\alpha)\,\text{im}[\lfloor x\rfloor]+\alpha\,\text{im}[\lceil x\rceil]$, $\alpha=x-\lfloor x\rfloor$
**bilinear** = separable 1-D linear in $x$ then $y$ (four-neighbour area weights)
general **convolution resampling** $\text{out}(x)=\sum_{x'}k\big(f^{-1}(x)-x'\big)\,\text{in}(x')$ with analytic kernel $k$ centred at the fractional source location
**minification kernel scaling** $k_s(t)=\tfrac1s\,k(t/s)$ for downscale factor $s$ (widen + renormalize → area-average = prefilter, **L16**)
**affine** $\begin{psmallmatrix}x'\\y'\end{psmallmatrix}=A\begin{psmallmatrix}x\\y\end{psmallmatrix}+\mathbf t$ (6 DOF)
**projective / homography** $\begin{psmallmatrix}x'\\y'\\w'\end{psmallmatrix}=H\begin{psmallmatrix}x\\y\\1\end{psmallmatrix}$, then $(x'/w',y'/w')$ (8 DOF
solve from 4 correspondences — [[Manual panorama stitching from multiple views]])
**TPS / RBF** field $f(\mathbf p)=A\mathbf p+\mathbf t+\sum_i w_i\,\phi(\lVert\mathbf p-\mathbf c_i\rVert)$ with $\phi(r)=r^2\log r$ (thin-plate radial basis)
**Beier–Neely** single-segment map via the segment frame $u=\dfrac{(\mathbf X-\mathbf P)\cdot(\mathbf Q-\mathbf P)}{\lVert\mathbf Q-\mathbf P\rVert^2},\ v=\dfrac{(\mathbf X-\mathbf P)\cdot\perp(\mathbf Q-\mathbf P)}{\lVert\mathbf Q-\mathbf P\rVert}$, then $\mathbf X'=\mathbf P'+u(\mathbf Q'-\mathbf P')+\dfrac{v\,\perp(\mathbf Q'-\mathbf P')}{\lVert\mathbf Q'-\mathbf P'\rVert}$, multi-segment weight $w_i=\Big(\dfrac{\ell_i^{p}}{a+d_i}\Big)^{b}$.
6.2 Resampling for complex spatial transforms
6.3 Morphing
fig-morph-crossfade-ghost
fig-morph-crossfade-ghost · why a cross-dissolve ghosts — two misaligned faces averaged at $t=0.5$ superimpose features into a translucent four-eyed, two-mouthed ghost; the blend handled colour but ignored position
⬜ figure not yet created
the failure that motivates everything)
fig-morph-shape-vs-color
fig-morph-shape-vs-color · the two orthogonal knobs of a morph — range (colour) as the cross-dissolve and domain (shape) as feature-position interpolation; a true morph is their product, the square's corners labelling the four extremes 🟨
fig-morph-recipe
fig-morph-recipe · the three-step morph at $t=0.5$ — interpolate the features to the midpoint shape, warp both images to that same shape, then cross-dissolve; a sharp single face because alignment preceded the blend 🟨
⬜ figure not yet created
`fig-beier-neely-oneseg` (a single before/after line pair: the $(u,v)$ coordinate frame along/perpendicular to the segment, and how a point $X$ maps to $X'$) fig-beier-neely-oneseg
⬜ figure not yet created
`fig-beier-neely-multiseg` (several line pairs, each proposing an $X'_i$, combined by a distance/length **weighted average** — the field warp) fig-beier-neely-multiseg
fig-mesh-vs-field-morph
fig-mesh-vs-field-morph · two ways to drive the same morph — mesh (per-triangle affine, strictly local, fiddly, can crease) vs field/Beier–Neely (a few control lines, globally-supported distance-weighted field, sparse but fold-prone) 🟨
fig-view-morph-prewarp
fig-view-morph-prewarp · why two views need view morphing — a naive 2-D morph bends straight edges and shrinks a rotating object, vs view morphing (prewarp to aligned epipolar lines → interpolate → postwarp) keeping lines straight and shape intact 🟨
💡 **Big lesson (recurrence — *align before you blend*):** averaging two images only makes sense once their features sit at the *same place*. A plain cross-dissolve blends colors at fixed coordinates, so any misalignment **double-exposes into a ghost**; the cure is to **warp into correspondence first, then average** — the same move as panorama de-ghosting and multi-frame super-resolution's sub-pixel alignment. Morphing is this lesson taken to its limit: not just align-then-average, but interpolate the *alignment itself* over $t$. (Companion to the *capture-then-decide / align-then-merge* family; the dense automatic aligner is [[Optical flow]].)
6.4 Morphable models
6.5 Shape-preserving warping
6.6 Video in-betweening
6.7 Perspective distortion and its correction
fig-keystoning-cause
fig-keystoning-cause · keystoning is the tilt, not the lens — a camera tilted up makes world-parallel verticals converge (façade → trapezoid), while the fronto-parallel shot keeps them parallel; projection preserves lines, not parallelism 🟨
fig-rectify-before-after
fig-rectify-before-after · rectifying by homography — a keystoned input with four corner handles dragged onto a known rectangle (vanishing point marked) re-rendered fronto-parallel by $\text{out}(\mathbf x)=\text{in}(H^{-1}\mathbf x)$ 🟨
⬜ figure not yet created
`fig-vanishing-point-autosolve` (detected parallel-line bundles meeting at vanishing points → the homography that sends the vertical VP to infinity straightens the verticals) fig-vanishing-point-autosolve
fig-tilt-shift-scheimpflug
fig-tilt-shift-scheimpflug · fixing perspective in the lens — shift moves the sensor within an oversized image circle while staying parallel to the façade (no converging verticals), tilt sets the focus plane via the Scheimpflug condition 🟨
⬜ figure not yet created
`fig-rectify-resample-stretch` (the rectified output's non-uniform resolution: the far/top corners are enlarged and softened while the near edge is compressed — why one homography only rectifies a *plane*). fig-rectify-resample-stretch
💡 **Big lesson (L16 recurrence · prefilter the shrinking corners):** rectification does not resample uniformly — to make the *far* (top) of the façade as wide as the *near* edge, the homography **enlarges** the top corners (upsample → just interpolate) and **compresses** other regions (downsample → must prefilter, or alias). So a faithful rectifier applies the **prefilter-before-downsample** discipline *per region*, with the kernel footprint set by the **local Jacobian** of $H$ (how much each area shrinks). (→ see Big lesson **L16**, [[Warping and resampling]] / BASIC Resampling.)
equations
the rectifying map is a **homography** $\mathbf x' \simeq H\mathbf x$ (homogeneous
divide by $w'$) — the projective warp of [[Warping and resampling]]
solve $H$ from **4 correspondences** (building corners → target rectangle), each pair giving 2 linear equations, $H$ up to scale → 8 DOF (least squares / SVD, per [[Manual panorama stitching from multiple views]])
applied by **inverse warp + resample** $\text{out}(\mathbf x)=\text{in}\big(H^{-1}\mathbf x\big)$
the **auto** constraint: choose $H$ so the bundle of (imaged) parallel lines maps to a **vanishing point at infinity** (verticals become vertical / parallel).