💬Comments welcome. To leave a note, select any text and click the note / highlight button that pops up — or open the panel with the tab at the top-right (‹). Notes are visible only inside our private review group.
jump to

2.4 Image measurements as integrals

A camera makes a number out of light. Point it at a scene, open the shutter, and where there was a continuous, infinitely detailed field of light there is now a grid of integers. The question of this chapter is exactly what each of those integers is — what physical quantity a pixel value reports, and what governs how trustworthy it is.

The answer is short enough to state up front: a pixel value is an integral. The light reaching the camera is not one ray but a whole bundle, spread over a range of directions (the aperture), a patch of the sensor (the pixel's area), a band of colors (wavelength), and a stretch of time (the exposure). The sensor adds all of it up. Everything else in this chapter elaborates that one sentence — first by naming the full field of light a camera draws from (the plenoptic function), then by writing the measurement itself as that integral and connecting its four axes to the camera's controls, and finally by describing the physical device that performs it (the sensor). The inevitable jitter in the result — noise — traces straight back to the fact that the thing being integrated is a stream of discrete, randomly arriving photons; it is the subject of the next chapter.

Sidebar — photography as objective measurement

It is worth pausing on the word measurement, because the digital sensor sits at the end of a long historical arc. Photography began as a way to make pictures, and for most of its life a photograph was a perceptual, interpretive object: a film stock had a "look," exposure and development were choices, and the print was a performance of the negative, not a readout of it (the score and the performance). But across that whole history the medium bent steadily toward objective measurement — each advance made the photograph a more faithful, more quantitative record of the light that was actually there, frozen at one instant. The digital sensor is the far end of that arc. Because each photosite literally counts photons, a raw image is, to a good approximation, a calibrated radiometric measurement — a physical map of scene radiance, in real units, of a single snapshot of time — rather than merely a pleasing rendering of one. That shift is what makes the rest of this book possible: you can only merge exposures into high dynamic range, recover depth, or undo a blur if the pixel numbers genuinely mean something physical. (The intro's generation → measurement → generation arc tells the same story from the other side — photography swung hard toward measurement, and only now, with generative AI, swings part of the way back.)

2.4.1 The plenoptic function

Before we can say what a pixel measures, it helps to name everything there is to measure. Stand in a room and ask: what is all the light here, before any camera, eye, or sensor selects from it? At every point in space, light is passing through in every direction, at every wavelength, at every instant. The function that captures all of it at once is the plenoptic function (from plenus, full, and optic — "full optic") introduced by Adelson and Bergen.

Parameterize a single ray of light by its 3D position $(x, y, z)$, its 2D direction $(\theta, \phi)$, its wavelength $\lambda$, and the time $t$, and the plenoptic function returns the radiance carried by that ray:

$$ P(x, y, z, \theta, \phi, \lambda, t). $$

That is seven numbers in, one radiance out — a seven-dimensional description of every ray of light filling all of space, in every direction and color, at every moment. It is everything there is to see, from everywhere, at once (Figure 2.4.1).

Seven is the classical count, and it captures everything an ordinary camera responds to — but a ray of light carries one more property the standard plenoptic function omits: its polarization, the orientation in which the light's electromagnetic field oscillates. Add it as an eighth axis and $P$ gains a polarization argument. We are largely blind to it, so it is easy to forget; yet it is physically there on every ray, and it is information — the sky is partially polarized in a pattern that follows the sun, reflections off glass and water are strongly polarized (which is how a polarizing filter cuts glare), and the surface orientation of an object leaves a polarization signature. Many animals read it directly (bees navigate by the sky's polarization pattern; mantis shrimp even sense circular polarization — see Human (and animal) vision and color), and polarization cameras sample it on purpose for glare removal, reflection separation, and shape-from-polarization. It is one more dimension of the light that a measurement can choose to integrate over (throwing it away, as most sensors do) or resolve (as a polarization camera does) — the same capture-or-discard choice the rest of this chapter draws for direction, wavelength, and time.

fig-plenoptic-ray
Figure 2.4.1. The plenoptic function. A single ray of light is specified by its 3D position $(x, y, z)$, its 2D direction $(\theta, \phi)$, its wavelength $\lambda$, and the time $t$ — seven numbers. The plenoptic function $P(x, y, z, \theta, \phi, \lambda, t)$ returns the radiance of every such ray filling space. The second panel shows a camera measuring only a 2D projection of it: each pixel sums all the rays reaching it through the aperture, discarding their direction.

The reason to introduce such an extravagant object is that it makes one idea precise: any imaging operation is the measurement of some portion of the plenoptic function. The eye samples a 2D slice; a video camera samples a 2D slice that also varies over time; a depth sensor recovers part of the geometry hidden in it; a light-field camera grabs a richer 4D chunk. Each imaging device is just a different way of sampling $P$.

In particular, a conventional camera records only a 2D projection. Each pixel collects all the rays that reach it through the aperture and sums them, discarding their direction — it keeps that light arrived, not from which direction within the aperture it came. That discarded directional information is exactly what light-field and multi-aperture cameras (Advanced part) work to preserve, so they can refocus or compute depth after the shot. For an ordinary camera it is thrown away, and the throwing-away is an integral. The next section makes that literal.

2.4.2 The pixel integral

The plenoptic function now pays off concretely. We said a conventional camera measures only a 2D projection of $P$; we can now say exactly what each pixel of that projection holds. A pixel does not sample a single ray — it integrates a slice of $P$ over four of its plenoptic dimensions at once, each weighted by how the pixel responds there (Figure 2.4.2):

  1. the aperture — all the ray directions $(\theta, \phi)$ the lens admits toward this pixel, a solid angle set by the f-number;
  2. the pixel's area $(x, y)$ on the sensor — a finite patch, not a mathematical point;
  3. wavelength $\lambda$ — weighted by the sensor's and color filter's spectral response; and
  4. the exposure time $t$ — the interval the shutter is open.

Schematically, the value recorded at a pixel is

$$ \text{pixel value} \;=\; \iiiint \underbrace{P(x, y, \theta, \phi, \lambda, t)}_{\text{plenoptic radiance}}\; \underbrace{r(\lambda)\, k(x, y, \theta, \phi)}_{\text{pixel's response}}\; d(\text{aperture})\; d(\text{area})\; d\lambda\; dt, $$

the scene radiance integrated over aperture × pixel area × wavelength × time, weighted by the pixel's spectral response $r(\lambda)$ and the geometric weight $k$ that the aperture and footprint impose. Stated in one line: measurement is integrating the plenoptic function against the pixel's sampling kernel — and the support of that kernel along each axis is exactly a photographic knob. Because the underlying capture counts photons, this integral is linear in scene radiance — a fact we will need both for high-dynamic-range merging and for the noise analysis later in the chapter.

fig-pixel-measurement-integral
Figure 2.4.2. A pixel value as a four-dimensional integral — a measured slice of the plenoptic function. The thin lens (drawn as an ellipse) admits a cone of rays; the value recorded at one finite-area pixel is the scene radiance integrated over (1) the aperture — which rays/directions the lens admits, (2) the pixel's area on the sensor, (3) wavelength (the spectral response), and (4) the exposure time. Aperture → depth of field; pixel area → sampling/antialiasing; time → motion blur; wavelength → color.

The four domains are the four knobs. What makes this more than bookkeeping is that each integration domain is governed by a control the photographer actually sets — the integral is the camera's exposure settings, written as mathematics:

The one familiar exposure control missing from this list is ISO, and the omission is instructive: ISO is not an integration limit. It scales the result of the integral after the light is collected (analog or digital gain), rather than changing how much light is gathered — which is exactly why pushing ISO amplifies noise along with signal, as the noise section will show.

The deeper reason to write capture this way is that each of the four integration domains earns its own topic later, and much of photography is the craft of choosing how to shape these four integrals — every later sensing topic is just what happens when you change the support of one axis:

The payoff of the kernel framing is that the rest of the sensing story reduces to a single question — which kernel, how wide, and what does it lose: antialiasing widens the spatial kernel, depth of field is the aperture kernel, motion blur is the time kernel, and demosaicking inverts the wavelength multiplexing. Each is also a source of noise: a narrower kernel along any axis — a smaller footprint, a shorter exposure, a stopped-down aperture — integrates fewer photons, and fewer photons mean a noisier estimate, which is exactly where the next chapter picks up.

Off-axis ($\cos^4$) falloff. One prediction falls straight out of this irradiance integral. Rays reaching the corners of the frame arrive obliquely, so the same scene radiance delivers less irradiance there: the aperture-and-area geometry of the pixel integral works out to a $\cos^4\theta$ dimming toward the edges, called natural vignetting (Figure 2.4.3). It is distinct from optical vignetting — lens elements physically blocking edge rays, covered in Optics — in that natural vignetting is a clean consequence of the pixel-integral geometry alone.

fig-cos4-falloff
Figure 2.4.3. Natural vignetting: the $\cos^4\theta$ falloff. Relative pixel irradiance is plotted against the off-axis angle $\theta$, beside a preview of a frame darkening toward its corners. Even with uniform scene radiance, corner pixels receive less light, falling as $\cos^4\theta$ — the combined effect of oblique incidence, the foreshortened aperture, and the longer path to the corner. This is the geometry of the sensor irradiance integral, separate from optical vignetting.

This completes the picture-forming side of measurement: a camera selects a slice of all the light in the world — the plenoptic function — and integrates it over aperture, area, wavelength, and time. The device that performs that integral physically — a grid of photosites counting photons, read out as numbers, and the shutter that times them — is the subject of the next chapter; the unavoidable jitter in the photon count, the noise it imposes, follows after that.