
6.2 Antialiasing for complex transforms

Take the warp engine of the previous chapter and point it at a checkerboard floor receding to the horizon. Near the camera, one output pixel covers about one tile, and the round prefilter of Warping and resampling handles it perfectly. But follow the floor into the distance and something breaks: toward the horizon a single output pixel spans many tiles in depth while still covering only one tile across. The patch of source that lands in that pixel is not a little square — it is a long, thin sliver, dozens of times longer than it is wide. Average that sliver with a round blur and you are stuck with an impossible choice. Size the blur to the sliver's width and the length goes under-filtered, so the depth direction aliases into crawling moiré. Size it to the sliver's length and the width goes over-filtered, so the whole far field smears to grey. One radius cannot serve two wildly different extents. This chapter is about giving the prefilter an orientation so it can do both jobs at once.

💡 Big lesson (L16, anisotropic)

The prefilter-on-minify discipline — L16, area-average before you throw samples away — is direction-dependent the moment the warp stops being a similarity. A general warp compresses the source more along one axis than another, so the correct prefilter is not a round blur but an oriented ellipse: average a lot along the compressed direction, barely at all across the sharp one. Isotropic filtering must then either alias the compressed axis (round filter sized to the short axis) or blur the sharp one (sized to the long axis); anisotropic filtering — EWA and its approximations — does both jobs at once by matching the kernel to the footprint ellipse. It is the same lesson as 6.1's round prefilter, one dimension richer: shape the low-pass to the shape of the footprint, not just its size. (Big lesson L16, first placed in BASIC — Resampling and reused in Warping and resampling; the frequency-domain why — aliasing folds everything above the local Nyquist — is L5, see Linearity, Fourier, aliasing and deblurring.)

6.2.1 When the footprint stops being a square — the warp Jacobian

Recall the engine. Inverse warping reads the source at $f^{-1}(x,y)$: for each output pixel, pull it back through the inverse map and look up the colour there. But an output pixel is not a point — it is a little area, and what we honestly owe it is the average of the source over the whole region that maps into it. Under a uniform scale that region is a square, the isotropic L16 kernel covers it exactly, and we are done. Under a general warp it is not a square, and the whole chapter follows from what shape it actually is.

To find that shape, linearize the inverse warp about the pixel. Its first-order behaviour there is captured by the Jacobian $J=\partial f^{-1}/\partial(x,y)$, the $2\times2$ matrix of partial derivatives of the source coordinates with respect to the screen coordinates. A linear map sends a circle to an ellipse, so the unit screen circle — one output pixel — maps to an ellipse in source space: the footprint. Its geometry is read straight off $J$ through the singular value decomposition. The singular values $\sigma_1\ge\sigma_2$ are the ellipse's major and minor radii; the singular vectors $\hat e_1,\hat e_2$ are their directions. Equivalently, the footprint covariance is $\Sigma=JJ^{\top}$, whose eigenvalues are $\sigma_1^2,\sigma_2^2$. The single number that decides whether any of this matters is the anisotropy ratio $\rho=\sigma_1/\sigma_2$: when $\rho=1$ the warp is a similarity (uniform scale plus rotation), the footprint is a circle, and we are back to the isotropic prefilter of Warping and resampling. When $\rho>1$ the footprint is genuinely stretched, and a round filter can no longer match it (Figure 6.2.1).

fig-footprint-ellipse-jacobian
Figure 6.2.1. One output pixel, back-projected through a perspective warp. Left: the unit screen circle (one output pixel) is pushed through the local inverse-warp Jacobian $J=\partial f^{-1}/\partial(x,y)$ and lands in source space as an ellipse, the footprint. Its major and minor radii are the singular values $\sigma_1\ge\sigma_2$ of $J$, along the singular vectors $\hat e_1,\hat e_2$; a round footprint survives only under a similarity ($\sigma_1=\sigma_2$). Right: a regular grid under a real perspective warp. An output pixel back-projects to a nearly isotropic circle near the camera but to a long stretched ellipse toward the horizon. The footprint shape is set by the local Jacobian.
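The recipe above (form $\Sigma=JJ^{\top}$, read off the radii and the ratio) can be sketched in a few lines. This is a minimal illustration in plain Python, assuming $J$ is passed as a nested tuple `((a, b), (c, d))`; the function name `footprint` and the return layout are choices made here, not anything fixed by the text. It uses the closed-form eigenvalues of a symmetric $2\times2$ matrix rather than a general SVD routine.

```python
import math

def footprint(J):
    """Return (sigma1, sigma2, theta, rho) for a 2x2 Jacobian J = ((a, b), (c, d)).

    sigma1 >= sigma2 are the singular values of J (ellipse radii), theta is the
    angle of the major axis in source space (radians), rho = sigma1 / sigma2.
    """
    (a, b), (c, d) = J
    # Footprint covariance Sigma = J J^T (symmetric 2x2).
    s_uu = a * a + b * b
    s_uv = a * c + b * d
    s_vv = c * c + d * d
    # Closed-form eigenvalues of a symmetric 2x2 matrix: mean +/- diff.
    mean = 0.5 * (s_uu + s_vv)
    diff = math.hypot(0.5 * (s_uu - s_vv), s_uv)
    sigma1 = math.sqrt(mean + diff)
    sigma2 = math.sqrt(max(mean - diff, 0.0))
    # Orientation of the eigenvector belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2.0 * s_uv, s_uu - s_vv)
    rho = sigma1 / sigma2 if sigma2 > 0.0 else math.inf
    return sigma1, sigma2, theta, rho
```

For a similarity (a rotation times a uniform scale) the two radii coincide and `rho` comes out as 1; for `J = ((1, 0), (0, 5))` the major axis points along $v$ and `theta` is $\pi/2$.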

The canonical offender is exactly the floor we opened with: a checkerboard plane rendered in perspective, or rectified to perspective. Near the camera one pixel covers one tile and $\rho\approx1$; toward the horizon one pixel covers many tiles in depth but only one across, and $\rho$ climbs into the tens or hundreds. That is where every isotropic filter visibly fails, and the failure is not a matter of taste but of geometry — a round kernel has one radius, and the footprint has two very different ones. The fix is to make the kernel an ellipse that matches the footprint: the L16 prefilter, now carrying an orientation. Figure 6.2.2 shows the three outcomes side by side on a genuine perspective checkerboard.

fig-isotropic-vs-anisotropic-floor
Figure 6.2.2. A receding checkerboard floor, filtered three ways (a real perspective render, each pixel filtered in numpy). Point sample (one tap per pixel): sharp up close but the compressed depth axis aliases into crawling moiré toward the horizon. Isotropic prefilter (a round average sized to the major axis): the moiré is gone, but the sharp across-axis is over-blurred and the far field washes to flat grey far too early. Anisotropic / EWA (an elliptical average matched to the footprint): tiles stay crisp into the distance with no moiré — the kernel averages hard along depth and gently across it. One radius cannot win; an oriented ellipse can.
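As a back-of-envelope check on how fast $\rho$ grows, consider an idealized floor. The setup is an assumption made for illustration, not the render used in the figure: a pinhole camera with unit focal length at height $h$ above the floor, looking horizontally, with screen coordinates $(x,y)$ where $y>0$ is measured downward from the horizon, and floor coordinates $(u,v)$ with $u$ lateral and $v$ depth. Then

$$ x=\frac{u}{v},\quad y=\frac{h}{v} \quad\Longrightarrow\quad f^{-1}(x,y)=\Big(\frac{hx}{y},\ \frac{h}{y}\Big),\qquad J=\begin{pmatrix} h/y & -hx/y^{2}\\ 0 & -h/y^{2}\end{pmatrix}. $$

On the centre line $x=0$ the matrix is diagonal, so the singular values are $h/y$ (across) and $h/y^{2}$ (in depth), and for $y<1$

$$ \rho=\frac{h/y^{2}}{h/y}=\frac{1}{y}. $$

Under these assumptions the anisotropy is the reciprocal of the distance below the horizon measured in focal lengths: a row one hundredth of a focal length below the horizon has $\rho\approx100$, while a row a full focal length down has $\rho=1$. Rescaling to pixel units multiplies both singular values by the same pixel size and leaves $\rho$ unchanged.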

6.2.2 EWA — the elliptical weighted average

The gold-standard answer is to weight each source texel inside the footprint by an elliptical Gaussian matched to the ellipse, then normalize. This is the elliptical weighted average (EWA), due to [@greene-heckbert-1986] and developed in full in [@heckbert-1989]. It is precisely the reconstruct-and-prefilter convolution of Warping and resampling with the kernel finally allowed to be non-round: the round Gaussian of the isotropic prefilter becomes an anisotropic Gaussian whose covariance is the footprint covariance $\Sigma=JJ^{\top}$.

What makes EWA practical, and not merely correct, is that an ellipse is the level set of a quadratic. Write the Mahalanobis form

$$ Q(\mathbf u)=\mathbf u^{\top}\Sigma^{-1}\mathbf u = A\,u^2 + B\,uv + C\,v^2, $$

where $\mathbf u=(u,v)$ is the offset of a texel from the footprint centre. Then a texel lies inside the footprint exactly when $Q\le F$ (the cutoff $F$ fixes how many standard deviations of the Gaussian we keep), and its weight is the elliptical Gaussian $w=\exp(-\tfrac12 Q)$. So the whole filter is a tidy loop: walk the axis-aligned bounding box of the ellipse, accumulate $w\cdot\text{texel}$ wherever $Q\le F$, and divide by $\sum w$. The conic both tests membership and supplies the weight (Figure 6.2.3).
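The loop just described can be sketched directly. This is a minimal, unoptimized illustration in plain Python, assuming a single-channel image stored as a list of rows (`img[v][u]`), a Jacobian given as `((a, b), (c, d))`, and a cutoff of `F = 4.0` (two standard deviations); the function name `ewa_sample`, the default cutoff, and the nearest-texel fallback are choices made here, not part of the published method. It evaluates $Q$ directly at each texel and clips the bounding box to the image instead of wrapping or padding.

```python
import math

def ewa_sample(img, cu, cv, J, F=4.0):
    """Elliptical weighted average of img around the footprint centre (cu, cv)."""
    (a, b), (c, d) = J
    # Footprint covariance Sigma = J J^T.
    s_uu = a * a + b * b
    s_uv = a * c + b * d
    s_vv = c * c + d * d
    det = s_uu * s_vv - s_uv * s_uv
    if det <= 0.0:
        raise ValueError("degenerate footprint: clamp the minor axis first")
    # Q(u, v) = A u^2 + B u v + C v^2, with Sigma^{-1} = [[A, B/2], [B/2, C]].
    A = s_vv / det
    B = -2.0 * s_uv / det
    C = s_uu / det
    # Axis-aligned bounding box of the ellipse Q = F, clipped to the image.
    half_u = math.sqrt(F * s_uu)
    half_v = math.sqrt(F * s_vv)
    height, width = len(img), len(img[0])
    u_lo = max(0, math.ceil(cu - half_u))
    u_hi = min(width - 1, math.floor(cu + half_u))
    v_lo = max(0, math.ceil(cv - half_v))
    v_hi = min(height - 1, math.floor(cv + half_v))
    num = den = 0.0
    for v in range(v_lo, v_hi + 1):
        dv = v - cv
        for u in range(u_lo, u_hi + 1):
            du = u - cu
            q = A * du * du + B * du * dv + C * dv * dv
            if q <= F:                      # the conic tests membership ...
                w = math.exp(-0.5 * q)      # ... and supplies the weight
                num += w * img[v][u]
                den += w
    if den == 0.0:
        # Footprint missed every texel centre: fall back to the nearest texel.
        u = min(max(int(round(cu)), 0), width - 1)
        v = min(max(int(round(cv)), 0), height - 1)
        return float(img[v][u])
    return num / den
```

On an image of one-texel-wide vertical stripes, a footprint stretched along $u$ (`J = ((8, 0), (0, 1))`) returns roughly the mean grey, while one stretched along $v$ and narrow in $u$ returns the value of the single stripe it sits on: the same kernel, oriented two ways.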

fig-ewa-gaussian-ellipse
Figure 6.2.3. EWA over the source texels. The footprint ellipse carries an elliptical-Gaussian weight $w=\exp(-\tfrac12 Q)$, laid over the texel grid: texels near the centre count fully, texels toward the rim fade out. The dashed conic bounding box is the axis-aligned rectangle the loop scans; inside it, a texel contributes iff $Q\le F$. Along a scanline the quadratic $Q$ updates by the incremental march $Q(u{+}1,v)=Q(u,v)+(2Au+Bv+A)$, whose own increment is the constant $2A$, so each texel's weight costs about two additions and a table lookup for $\exp$, with no multiplies needed to evaluate $Q$.

That incremental update is the elegant part and the reason EWA was already practical in 1986. Because $Q$ is a quadratic in the integer texel coordinates, its first difference along a scanline, $Q(u{+}1,v)-Q(u,v)=2Au+Bv+A$, is merely linear, and that quantity's own increment is the constant $2A$. So you carry two running sums, $Q$ itself and its per-step increment, and advance both by addition. The per-texel cost of the weight collapses to about two adds and one table lookup for the exponential; evaluating $Q$ takes no multiplies in the inner loop at all, and the only multiply left is the weight times the texel.
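The two-running-sums idea is easy to check numerically. The sketch below, with the hypothetical helper name `q_along_scanline`, marches $Q$ along one scanline by forward differences, assuming unit steps in $u$ starting from an offset `u0`; it is an illustration of the differencing, not a full EWA inner loop.

```python
def q_along_scanline(A, B, C, u0, v, n):
    """Return [Q(u0, v), Q(u0 + 1, v), ..., Q(u0 + n - 1, v)] using only additions
    inside the loop, where Q(u, v) = A u^2 + B u v + C v^2."""
    q = A * u0 * u0 + B * u0 * v + C * v * v   # Q at the first texel (set-up cost)
    dq = A * (2 * u0 + 1) + B * v              # first difference Q(u0+1, v) - Q(u0, v)
    ddq = 2 * A                                # constant second difference
    values = []
    for _ in range(n):
        values.append(q)
        q += dq        # advance Q
        dq += ddq      # advance its increment
    return values
```

The multiplies are confined to the per-scanline set-up; the loop body is two additions per texel. In floating point the marched values drift slightly from direct evaluation over long runs, which is one reason real implementations restart the march on every scanline.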

The cost that does bite is the ellipse area, which is proportional to $\sigma_1\sigma_2$ and explodes at grazing angles where $\sigma_1$ runs away. Two guards keep it bounded. First, run EWA on a MIP pyramid: pick the pyramid level at which the minor axis is about one texel, so that at that level the ellipse is about one texel wide and $\rho$ texels long. The prefiltered pyramid has already done the across-axis averaging for you, so EWA only has to integrate along the major axis. Second, apply a max-anisotropy clamp with a one-texel floor on the minor axis, $\sigma_2\leftarrow\max(\sigma_2,\ \sigma_1/\rho_{\max},\ 1)$: beyond the cap the minor axis is widened, so you accept a little extra blur across the footprint rather than an unbounded sample count. (Shortening the major axis instead, $\sigma_1\leftarrow\min(\sigma_1,\rho_{\max}\sigma_2)$, bounds the work equally well but under-filters the compressed direction and lets the aliasing back in.) These two together turn EWA from "exact but occasionally ruinous" into "nearly exact and bounded."
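A minimal sketch of the two guards together, assuming the cap is applied by widening the minor radius (the choice that trades for blur rather than aliasing) and that pyramid level $k$ halves the resolution $k$ times. The function name `clamp_footprint`, the default `rho_max = 16`, and the fractional level of detail are illustrative assumptions.

```python
import math

def clamp_footprint(sigma1, sigma2, rho_max=16.0):
    """Return (lod, major, minor): the pyramid level to filter at, and the
    ellipse radii measured in texels of that level."""
    # Widen the minor radius to respect the anisotropy cap and the one-texel floor.
    minor = max(sigma2, sigma1 / rho_max, 1.0)
    # Under magnification the major radius can fall below the floor as well.
    major = max(sigma1, minor)
    # Level at which the (clamped) minor radius is one texel.
    lod = math.log2(minor)
    scale = 2.0 ** lod
    return lod, major / scale, minor / scale
```

With `sigma1 = 64`, `sigma2 = 2` and the default cap, the minor radius is widened to 4, the level is 2, and the ellipse at that level is 16 texels long by 1 wide, so the work per pixel stays at about `rho_max` texels however far $\sigma_1$ runs away.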

6.2.3 Feline and anisotropic MIP probing — EWA on a GPU budget

Integrating the entire ellipse is more than real-time rendering can afford per pixel, so hardware uses a clever approximation. The move, due to [@mccormack-etal-1999] and named Feline (for Fast Elliptical Lines), is to stop integrating the area and integrate a line instead: approximate the footprint ellipse by a row of circular probes strung along its major axis. Each probe is an ordinary trilinear MIP sample sized to the minor axis $\sigma_2$, so it averages correctly across the ellipse by reusing the prefiltered pyramid, and the $N$ probes, spaced along the major axis $\hat e_1$ and summed with Gaussian weights, cover the ellipse's length. The ellipse's area integral becomes a short sum of cheap point queries (Figure 6.2.4).

fig-feline-probe-line
Figure 6.2.4. Feline approximates the footprint ellipse by a line of trilinear probes. Instead of integrating the whole ellipse, lay $N$ circular probes along the major axis $\hat e_1$; each is a standard trilinear MIP sample sized to the minor axis $\sigma_2$ (so it averages correctly across the ellipse and reuses the prefiltered pyramid), and the probes are Gaussian-weighted along the length. This is essentially what "$16\times$ anisotropic" means on a GPU: up to 16 trilinear taps along the footprint's long axis. Trilinear alone is the degenerate $N{=}1$ case: one isotropic probe forced to pick a single LOD for both axes, which over-blurs the sharp one.

This is, in essence, the meaning of the "16× anisotropic" slider in a graphics driver: hardware anisotropic filtering follows this scheme, with up to $N=16$ trilinear taps marched along the footprint's long axis (exact probe placement and weighting are implementation-specific). Plain trilinear filtering (Williams' MIP-map, [@williams-1983]) is the $N{=}1$ degenerate case: a single isotropic probe that must choose one level of detail for both axes at once, and therefore over-blurs whichever axis is sharper. Feline restores the lost axis for a handful of taps rather than the full ellipse-area cost of EWA. The trade-off is the obvious one: more probes mean a closer match to true EWA and more memory bandwidth, capped by the same $\rho_{\max}$ as before. Because each probe is a minor-axis-sized trilinear tap into the existing pyramid, every probe is $O(1)$, which is the entire appeal.
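The probe layout can be sketched as follows. This is a simplified illustration in the spirit of Feline, not the published algorithm's exact probe count, spacing, or weights: it assumes one probe per unit of anisotropy, probes spread evenly so the end probes just touch the ends of the ellipse, and Gaussian weights that reach two standard deviations at the end probes. The trilinear lookup is passed in as a hypothetical callback `sample_trilinear(u, v, lod)`.

```python
import math

def probe_line(cu, cv, e1, sigma1, sigma2, max_probes=16):
    """Return (lod, probes): the pyramid level for every probe and a list of
    (u, v, weight) along the unit major-axis direction e1. Weights sum to 1."""
    # Widen the minor radius so the probe count never exceeds the cap.
    minor = max(sigma2, sigma1 / max_probes, 1.0)
    n = max(1, min(max_probes, math.ceil(sigma1 / minor)))
    lod = math.log2(minor)                  # each probe is sized to the minor axis
    half_len = max(sigma1 - minor, 0.0)     # probe centres stay inside the ellipse
    raw = []
    for i in range(n):
        t = 0.0 if n == 1 else -1.0 + 2.0 * i / (n - 1)   # position in [-1, 1]
        w = math.exp(-0.5 * (2.0 * t) ** 2)               # assumed Gaussian falloff
        raw.append((cu + t * half_len * e1[0], cv + t * half_len * e1[1], w))
    total = sum(w for _, _, w in raw)
    return lod, [(u, v, w / total) for u, v, w in raw]

def feline_sample(sample_trilinear, cu, cv, e1, sigma1, sigma2, max_probes=16):
    """Weighted sum of trilinear probes along the footprint's major axis."""
    lod, probes = probe_line(cu, cv, e1, sigma1, sigma2, max_probes)
    return sum(w * sample_trilinear(u, v, lod) for u, v, w in probes)
```

With `sigma1 == sigma2` the function returns a single probe at the centre, which is plain trilinear filtering; raising `max_probes` lengthens the line the filter can follow before the minor radius has to be widened.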

6.2.4 Other footprint integrators — summed-area tables and ripmaps

EWA and Feline buy arbitrarily oriented anisotropy. Two older structures buy the cheaper, axis-aligned kind, and they are worth knowing both for where they still live and for the precise way they fall short.

A summed-area table (SAT), due to [@crow-1984], precomputes the integral image $S$ — the running sum of all pixels above and to the left of each location. With $S$ in hand, the average over any axis-aligned box is just four corner lookups,

$$ \frac{S(x_1,y_1)-S(x_0,y_1)-S(x_1,y_0)+S(x_0,y_0)}{\text{area}}, $$

an $O(1)$ operation for any box size with no pyramid at all. There are two catches. It handles axis-aligned rectangles only: there is no way to tilt the box, so it can never match a rotated footprint ellipse, only an upright one. And the partial sums grow enormous (they accumulate the whole image), so a SAT is badly precision-hungry and wants wide integer or float storage. A ripmap is the complementary idea: an anisotropic MIP-map that is separately pre-shrunk in $x$ and in $y$, forming a 2-D grid of levels indexed by $(\text{LOD}_x,\text{LOD}_y)$, so an axis-aligned anisotropic footprint reads one matching level directly. It too is blind to rotation, and it costs about 4× the storage of the base image, three times that of an ordinary MIP-map (Figure 6.2.5).
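The four-corner lookup is short enough to write out. This is a minimal sketch in plain Python, assuming a single-channel image as a list of rows and a table padded with one leading row and column of zeros, so that `S[y][x]` holds the sum of all pixels strictly above and to the left of `(x, y)`. With that convention the box is half-open, `x0 <= x < x1` and `y0 <= y < y1`; the names `build_sat` and `box_mean` are illustrative.

```python
def build_sat(img):
    """Summed-area table with a zero border: S has shape (h + 1) x (w + 1)."""
    h, w = len(img), len(img[0])
    S = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            S[y + 1][x + 1] = S[y][x + 1] + row_sum
    return S

def box_mean(S, x0, y0, x1, y1):
    """Mean of the image over the half-open box [x0, x1) x [y0, y1)."""
    total = S[y1][x1] - S[y1][x0] - S[y0][x1] + S[y0][x0]
    return total / ((x1 - x0) * (y1 - y0))
```

The cost of `box_mean` is the same four reads whether the box covers four pixels or the whole image. The precision issue shows up in `build_sat`: the bottom-right entry is the sum of every pixel, so the table needs far more dynamic range than the image it was built from.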

fig-sat-ripmap
Figure 6.2.5. Two axis-aligned footprint integrators. Left, summed-area table: the box average over any axis-aligned rectangle is four corner lookups of the integral image $S$, $(S_{11}-S_{01}-S_{10}+S_{00})/\text{area}$, which is $O(1)$ at any box size, but only for upright boxes (no rotation), and precision-hungry because the partial sums grow huge. Right, ripmap: a 2-D grid of levels separately pre-shrunk in $x$ and in $y$, indexed by $(\text{LOD}_x,\text{LOD}_y)$; an axis-aligned anisotropic footprint reads one level directly, at 4× the storage of the base image (3× that of a MIP-map). Both capture axis-aligned anisotropy only; neither can tilt to a rotated ellipse.
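The storage figures follow from two geometric series. As an idealization, assume every level halves the resolution along the axis it shrinks and the chain runs all the way down, so the sums can be taken to infinity. Measured in units of the base image,

$$ \underbrace{\sum_{i\ge0}\sum_{j\ge0}2^{-i}\,2^{-j}}_{\text{ripmap}}=2\cdot2=4,\qquad \underbrace{\sum_{k\ge0}4^{-k}}_{\text{MIP-map}}=\frac{4}{3}, $$

so a ripmap takes about four times the base image, or three times an ordinary MIP-map. The MIP-map levels are the diagonal $i=j$ of the ripmap grid; the extra storage is all the off-diagonal, anisotropically shrunk levels.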

Where each sits is then clear. SATs and ripmaps give axis-aligned anisotropy cheaply; EWA and Feline give arbitrarily oriented anisotropy — which is the case that actually arises under perspective, where the receding-floor ellipse points along whatever direction the depth gradient happens to take. So in practice GPUs ship Feline-style probing for texturing, while summed-area tables survive less in texture filtering than in integral-image tricks elsewhere — box filters, Viola–Jones face detection, local-contrast normalization — wherever a fast axis-aligned box average is the whole job.

6.2.5 Footprint estimation, clamping, and when to reach for it

Everything above assumes we know the footprint, which means knowing the Jacobian $J$. Happily it usually comes for free. In a renderer, the screen-space derivatives of the texture coordinates ($\partial u/\partial x$ and the rest, exposed on the GPU as the ddx/ddy instructions) are the two columns of $J$ directly. In an image warp it is the local derivative of the $f^{-1}$ you are already evaluating to do the inverse lookup, obtained analytically or by differencing neighbouring pixels. Either way, two derivative vectors give the $2\times2$ matrix $J$, from which $\Sigma=JJ^{\top}$ and the ellipse follow. And the chapters join up exactly at the similarity case: for pure minification with no warp, $J$ is a scaled identity, so $\Sigma$ is a scaled identity too, the footprint is a circle whose radius is the downscale factor, and EWA degenerates back into the isotropic L16 prefilter of Warping and resampling. The two treatments are one continuous story, with this chapter the strictly more general end.
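For an image warp with no analytic derivative to hand, the Jacobian can be estimated by differencing the inverse map itself. The sketch below uses central differences and assumes `inv_warp(x, y)` is any callable returning source coordinates `(u, v)`; the function name `numerical_jacobian` and the half-pixel default step are illustrative choices.

```python
def numerical_jacobian(inv_warp, x, y, h=0.5):
    """Central-difference estimate of J = d(u, v)/d(x, y) at screen point (x, y).

    Returns ((du/dx, du/dy), (dv/dx, dv/dy)); the two columns are the
    derivative vectors along screen x and screen y.
    """
    u_r, v_r = inv_warp(x + h, y)
    u_l, v_l = inv_warp(x - h, y)
    u_d, v_d = inv_warp(x, y + h)
    u_t, v_t = inv_warp(x, y - h)
    inv = 1.0 / (2.0 * h)
    return (((u_r - u_l) * inv, (u_d - u_t) * inv),
            ((v_r - v_l) * inv, (v_d - v_t) * inv))
```

A step of half a pixel makes each difference span exactly one output pixel, which is the scale the footprint is meant to describe. The estimate is exact for affine warps and only as good as the local smoothness of the warp elsewhere.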

A few clamps keep the whole thing stable. Cap the anisotropy at $\rho_{\max}$ to bound the work (the probe count, or the ellipse area). Floor the minor radius at one texel so you never try to sample below the base level of the pyramid. And be wary at silhouettes and discontinuities: a linearization is only valid where the warp is smooth, so across a depth edge the estimated Jacobian lies, and a slightly-blurred, clamped answer beats a confidently sharp wrong one.

Finally, the honest decision aid — because this machinery is not always worth its cost. A gentle affine, a panorama reprojection, a mild lens-distortion correction: these are close enough to similarities that the isotropic L16 prefilter (a MIP-map plus trilinear) is entirely sufficient, and reaching for EWA buys nothing. Save the anisotropic machinery for strong perspective, swirls, and grazing-angle minification — anywhere the footprint ellipse gets genuinely long and one axis would otherwise alias. That is precisely the rectifier's non-uniform Jacobian in Perspective distortion and its correction, and the receding-ground case we opened with: the warps where the floor crawls unless you average along the ellipse.

One last connection points forward. The same elliptical Gaussian runs in reverse for point-based rendering: rather than gathering source texels into an output pixel, [@zwicker-etal-2002] forward-splats each point into the image as an elliptical-Gaussian footprint and accumulates. It is the antialiasing dual of EWA texture filtering — gather versus scatter of the very same kernel — and a direct ancestor of today's Gaussian-splatting view synthesis, where a scene is a cloud of anisotropic Gaussians splatted into each new view. (Cross-reference the novel-view-synthesis material.) The ellipse you matched to a pixel's footprint here is the same ellipse a splat paints there.

