| symbol | meaning (this chapter) | note |
|---|---|---|
| $S(x,y,\lambda)$, $S(\lambda)$ | the scene spectrum (spectral radiance) at a pixel — power (or reflectance) as a function of wavelength | from Color technology; the continuous object RGB samples. Note: $S$ here is the spectrum, not the shading $S$ of the sibling chapter Intrinsic images with time lapse |
| $\lambda$ | wavelength; the third axis of the data cube | from Notations |
| $R_c(\lambda)$ | a color channel's spectral sensitivity (three broad, overlapping curves for RGB) | from Color technology |
| $R_b(\lambda)$ | a hyperspectral band's spectral sensitivity — narrow, near-delta at $\lambda_b$ | new (this chapter); many of them |
| $I_c = \int S R_c\,d\lambda$ | a channel value as the projection of the spectrum onto $R_c$; lossy, many-to-one (metamers) | from Color technology; the equation RGB and hyperspectral share |
| $I(x,y,\lambda)$, $I_b(x,y)$ | the hyperspectral data cube — an image stack indexed by wavelength; $I_b$ its $b$-th band | new (this chapter); the chapter's central object |
| $\mathrm{NDVI}$ | $(I_{\mathrm{NIR}}-I_{\mathrm{red}})/(I_{\mathrm{NIR}}+I_{\mathrm{red}})$ — a per-pixel band ratio vegetation index | new; an L1 normalized/multiplicative index |
| LCTF / AOTF | liquid-crystal / acousto-optic tunable filter — no-moving-parts spectral scanners | new (capture hardware) |


10.11 Hyperspectral imaging, color wheels

Stand in front of a fresh leaf and a plastic leaf dyed to match it. To your eye, and to your camera, they can be the same green — identical R, G, B. Yet the light coming off them, measured wavelength by wavelength across the spectrum, is not remotely the same: the real leaf has a sharp cliff in reflectance just past the red, where chlorophyll stops absorbing and the near-infrared (NIR) floods back, and the plastic has nothing of the kind. Your camera cannot see that cliff. It collapsed the whole spectrum into three numbers at the instant of capture, and the cliff fell into the gap between them. Hyperspectral imaging is the refusal to make that collapse — to record, at every pixel, not a color but a spectrum: tens or hundreds of narrow wavelength bands instead of three broad ones.

This is the part's one move, transplanted to a new axis. We have stacked frames along exposure to beat dynamic range and along focus to beat depth of field; here we stack along wavelength, deferring "which three colors?" — or really, which spectral question — to long after capture (the L14 framing, made precise in the box below). Record the full spectral set, and you can ask, days or years later, what is this thing actually made of, a question RGB threw away before you ever thought to ask it.

💡 Big lesson (L14 — wavelength axis)

Capture the full set, decide later. RGB discards the spectrum at the moment of capture: three broad, overlapping bands, irreversibly mixed into three numbers. Hyperspectral imaging instead records the whole spectrum per pixel and decides spectral questions afterward — the same defer-the-decision move as HDR on the exposure axis, focal stacks on the focus axis, and light fields on the aperture axis, now run along the wavelength axis. The cost is the usual one: more data, more light, more capture time. The payoff is that you can ask what is this made of? — match a material, spot disease, expose a forgery — long after the shutter has closed. (Registered as L14 in Big Lessons; first appears in this part's introduction; this is its wavelength-axis recurrence, and the light-field/plenoptic camera in Advanced computational photography is the same idea for the aperture axis.)

10.11.1 Why three numbers aren't enough — RGB as a 3-sample projection

Recall, from Human (and animal) vision and color and Color technology, exactly what a camera channel does. Light arrives at a pixel carrying a continuous spectrum $S(\lambda)$ — power as a function of wavelength. A color channel does not record that function; it records a single integral of it against a fixed spectral sensitivity $R_c(\lambda)$,

$$ I_c \;=\; \int S(\lambda)\,R_c(\lambda)\,d\lambda . $$

Read back: each channel multiplies the incoming spectrum by its own sensitivity curve and sums the result to one number. RGB does this with three broad, overlapping sensitivities — roughly the long-, medium-, and short-wavelength responses our cones inspired — so a whole continuous function is summarised by three dot products. That projection is brutally lossy and many-to-one: it is entirely possible for two physically different spectra to produce identical RGB. Those are metamers, and they are not an edge case — they are the everyday currency of color reproduction (the leaf and its plastic twin, two paints, two inks).
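To make the many-to-one claim concrete, the sketch below discretises the integral as a Riemann sum and builds a second spectrum that differs from the first only in a direction the three channels cannot register. The Gaussian sensitivity curves, the 400 to 700 nm grid, and the helper names (`project`, `remove_visible_part`) are illustrative assumptions, not measured camera data.

```python
import math

# Assumed wavelength grid: 400-700 nm in 10 nm steps.
wavelengths = list(range(400, 701, 10))
d_lambda = 10.0

def gaussian(center, width):
    """Hypothetical bell-shaped sensitivity sampled on the grid."""
    return [math.exp(-0.5 * ((w - center) / width) ** 2) for w in wavelengths]

# Illustrative stand-ins for R_c(lambda); real camera curves differ.
sensitivities = {
    "R": gaussian(600, 40),
    "G": gaussian(540, 40),
    "B": gaussian(460, 40),
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(spectrum):
    """I_c ~= sum_k S(lambda_k) R_c(lambda_k) d_lambda for each channel c."""
    return {c: dot(spectrum, r) * d_lambda for c, r in sensitivities.items()}

def remove_visible_part(perturbation):
    """Subtract from a perturbation its component in span{R_R, R_G, R_B}."""
    basis = []
    for r in sensitivities.values():
        v = list(r)
        for b in basis:  # Gram-Schmidt: orthogonalise against earlier vectors
            coeff = dot(v, b)
            v = [vi - coeff * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        basis.append([vi / norm for vi in v])
    hidden = list(perturbation)
    for b in basis:
        coeff = dot(hidden, b)
        hidden = [hi - coeff * bi for hi, bi in zip(hidden, b)]
    return hidden

# A smooth base spectrum plus a ripple stripped of everything the channels see.
spectrum_a = [0.5 + 0.2 * math.sin((w - 400) / 50) for w in wavelengths]
ripple = [math.sin((w - 400) / 15) for w in wavelengths]
hidden = remove_visible_part(ripple)
spectrum_b = [a + 0.05 * h for a, h in zip(spectrum_a, hidden)]  # 0.05: arbitrary scale

rgb_a = project(spectrum_a)
rgb_b = project(spectrum_b)
largest_gap = max(abs(a - b) for a, b in zip(spectrum_a, spectrum_b))
print(rgb_a, rgb_b, largest_gap)
```

The two spectra differ sample by sample, yet their three channel values agree to rounding error: under these assumed sensitivities they are a metamer pair.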

What gets lost is not noise; it is information that can be exactly the information you need. Two pigments on a canvas, two species of plant, fresh produce and the same fruit a day from spoiling, the forger's ink and the original it imitates — these routinely look identical to RGB and differ plainly in their spectra. The data that would tell them apart is precisely what the three-number projection discarded (Figure 10.11.1).

The generalisation is then almost trivial. Keep the very same equation, but swap the three broad sensitivities for many narrow band responses $R_b(\lambda)$, each a near-delta spike at a wavelength $\lambda_b$:

$$ I_b(x,y) \;=\; \int S(x,y,\lambda)\,R_b(\lambda)\,d\lambda \;\approx\; S(x,y,\lambda_b) . $$

Now each pixel yields not three numbers but a sampled spectrum $S(x,y,\lambda_b)$ — and stacking those samples across all bands gives the hyperspectral data cube $I(x,y,\lambda)$: an image stack whose third axis is wavelength, not focus or exposure (Figure 10.11.2). Seen this way, RGB is just this cube with three broad bands, and a Bayer color-filter array is a three-band snapshot mosaic — the most impoverished hyperspectral camera there is (see Color technology). The cube is the general object; color is a thin slice of it.
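A small numerical sketch of the near-delta approximation and of the cube layout may help here. Everything specific in it is an assumption made for illustration: the leaf-like spectrum is invented, the band responses are unit-area Gaussians with a 3 nm standard deviation, and the cube is stored as nested lists indexed `cube[y][x][b]`.

```python
import math

# Assumed fine grid standing in for the continuous wavelength axis (nm).
grid = list(range(400, 1001))  # 1 nm steps
d_lambda = 1.0

def leaf_like_spectrum(w):
    """Made-up reflectance: a green bump plus a red-edge rise near 715 nm."""
    green_bump = 0.10 * math.exp(-0.5 * ((w - 550) / 40) ** 2)
    red_edge = 0.45 / (1.0 + math.exp(-(w - 715) / 12))
    return 0.05 + green_bump + red_edge

def narrow_band(center, sigma=3.0):
    """Hypothetical R_b: a narrow Gaussian normalised to unit area."""
    raw = [math.exp(-0.5 * ((w - center) / sigma) ** 2) for w in grid]
    area = sum(raw) * d_lambda
    return [r / area for r in raw]

band_centers = list(range(420, 981, 10))  # lambda_b for each band
bands = [narrow_band(c) for c in band_centers]

def sample_spectrum(spectrum):
    """I_b = sum_k S(lambda_k) R_b(lambda_k) d_lambda for every band b."""
    return [sum(s * r for s, r in zip(spectrum, band)) * d_lambda for band in bands]

# A tiny 2 x 2 scene: every pixel gets a scaled copy of the same spectrum.
brightness = [[1.0, 0.8], [0.6, 0.4]]
base = [leaf_like_spectrum(w) for w in grid]
cube = [[sample_spectrum([k * s for s in base]) for k in row] for row in brightness]

# cube[y][x] is the sampled spectrum at a pixel; cube[y][x][b] is one band value.
pixel_spectrum = cube[0][0]
worst_error = max(
    abs(i_b - leaf_like_spectrum(c)) for i_b, c in zip(pixel_spectrum, band_centers)
)
print(len(cube), len(cube[0]), len(pixel_spectrum), worst_error)
```

With bands this narrow, each band value lands close to the spectrum evaluated at the band centre; widening `sigma` makes the gap grow, most visibly around the red edge.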

A word on vocabulary, because the two terms are used loosely. Multispectral imaging means a handful of bands — the seven of Landsat's Thematic Mapper, or an ordinary camera with a near-infrared channel bolted on. Hyperspectral means many contiguous, narrow bands — tens to hundreds — packed densely enough that each pixel carries a quasi-continuous spectrum. Same idea, different sampling density along $\lambda$; the line between them is one of degree, not of kind.

**Figure 10.11.1.** Two patches, one color, different spectra. **Left:** two materials — say a real leaf and a color-matched plastic, or two greens of paint — photographed under the same light, rendering to **identical RGB**: side by side the camera cannot tell them apart. **Right:** their full reflectance spectra $S(\lambda)$, plotted over the visible and into the near-infrared, are visibly different — different peaks, and (for the leaf) a sharp red-edge rise into the NIR that the plastic lacks. The three broad RGB sensitivities $R_c(\lambda)$ are overlaid faintly to show how both spectra integrate to the same three numbers: the patches are **metamers**. Hyperspectral sampling, dense along $\lambda$, separates what RGB merged.

**Figure 10.11.2.** The hyperspectral data cube $I(x,y,\lambda)$. A stack of co-registered images, one per narrow wavelength band, drawn as a literal cube whose **third axis is wavelength** (contrast with the focus axis of a focal stack or the exposure axis of an HDR bracket). One pixel is pulled out two ways: as its full **sampled spectrum** — a dense curve of intensity versus $\lambda$ across the band stack — and, beside it, as the mere **three RGB samples** a normal camera would have kept. The cube is the general object; RGB is the same cube with only three broad bands.

10.11.2 Building the spectral stack — filter wheels, tunable filters, pushbroom, snapshot

Having decided to fill a cube $I(x,y,\lambda)$, the engineering question is how. A sensor is a two-dimensional array; the cube is three-dimensional; something has to give. Every hyperspectral camera is a different answer to "which dimension do I trade away to get the third?" — and, exactly as with the focal and exposure stacks, the unifying view is L14: each method fills the cube by a different schedule, trading time against spatial resolution against spectral resolution (Figure 10.11.3).

Spectral scanning — the filter wheel and the tunable filter. The most direct strategy: put a narrow bandpass filter in front of the sensor, take a full-frame image in that one band, then change the filter and shoot again, sweeping through the bands one at a time. A mechanical color wheel rotates discrete glass filters into the optical path — the literal "color wheel" of the chapter's title, and the oldest trick in the book (Prokudin-Gorskii's three-filter color photographs of 1900s Russia are the three-band ancestor). Better, an electronically tunable filter — a liquid-crystal tunable filter (LCTF) or an acousto-optic tunable filter (AOTF) — sweeps its passband with no moving parts, fast and programmable. The trade is clean: you keep full spatial resolution and can take as many bands as you have patience for, but the capture is sequential in time, so the scene and camera must hold perfectly still across the whole sweep. This is the natural choice for the lab bench, the microscope, and a painting on an easel — anything that will sit still.

Spatial scanning — pushbroom (line-scan). Here you give up imaging the whole frame at once. A slit admits a single line of the scene, and a dispersing element — a prism or grating — spreads that line's light across the sensor by wavelength. The result is that one sensor axis records position along the line and the other records wavelength: you capture one spatial line in all its bands simultaneously. To fill the second spatial dimension you sweep the line across the scene, usually by moving the platform. This is the right architecture precisely when the scene is already moving past the sensor — a satellite or aircraft with the ground scrolling beneath it (hence the name "pushbroom": the imaged line sweeps the ground like the bristles of a broom pushed forward), or a conveyor belt of produce or crushed mineral streaming past an inspection head. The trade: every band of a line arrives at once, but you need relative motion and careful line-to-line registration to assemble a clean cube.
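The assembly step can be sketched in a few lines. The sizes, the synthetic readout `capture_line`, and the helper names are hypothetical, and the sketch assumes ideal registration: each readout lands exactly one row further along the scan.

```python
# Hypothetical sizes: 4 scan lines, 5 pixels along the slit, 3 bands.
NUM_LINES, LINE_WIDTH, NUM_BANDS = 4, 5, 3

def capture_line(y):
    """Stand-in for one sensor readout: frame[x][b], position by wavelength.

    The values are synthetic (they encode y, x and b) so the assembly can be checked.
    """
    return [[100 * y + 10 * x + b for b in range(NUM_BANDS)] for x in range(LINE_WIDTH)]

# Sweep: one readout per platform position, appended in scan order.
cube = [capture_line(y) for y in range(NUM_LINES)]  # cube[y][x][b]

def band_image(cube, b):
    """Slice the cube at one band to get an ordinary 2-D image."""
    return [[pixel[b] for pixel in line] for line in cube]

def pixel_spectrum(cube, y, x):
    """All band values recorded at one ground position."""
    return cube[y][x]

last_band = band_image(cube, 2)
print(last_band[3][4], pixel_spectrum(cube, 1, 2))
```

Each readout already holds every band for its line, so stacking readouts fills the cube one row at a time; in a real system the rows would first need the registration the paragraph above mentions.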

Snapshot spectral imaging. The third answer captures the entire cube in one exposure, by spending spatial pixels on spectral bands. A spectral filter mosaic — a Bayer-style array, but with many band filters tiled instead of just R, G, B — gives each tiny neighbourhood of pixels a full set of bands; computational and coded designs do the same trick more cleverly (surveyed by Hagen & Kudenov 2013). The trade is the obvious one: a single shot, so motion and even video are fine, paid for by a drop in spatial resolution, since the sensor's pixel budget is now split between where and which wavelength.
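The resolution cost of a filter mosaic shows up directly when a raw frame is unpacked. The sketch below assumes a hypothetical 2 x 2 tile holding four bands and uses the simplest possible unpacking, one output pixel per tile with no interpolation; practical pipelines usually interpolate instead.

```python
# Assumed 2 x 2 tile of band indices, repeated across the sensor.
TILE = [[0, 1],
        [2, 3]]
TILE_H, TILE_W = len(TILE), len(TILE[0])
NUM_BANDS = TILE_H * TILE_W

def demosaic_by_binning(raw):
    """Turn a raw mosaic frame raw[y][x] into a low-resolution cube[ty][tx][b].

    Each tile becomes one output pixel; no interpolation is attempted.
    """
    height, width = len(raw), len(raw[0])
    cube = []
    for ty in range(height // TILE_H):
        row = []
        for tx in range(width // TILE_W):
            spectrum = [0] * NUM_BANDS
            for dy in range(TILE_H):
                for dx in range(TILE_W):
                    band = TILE[dy][dx]
                    spectrum[band] = raw[ty * TILE_H + dy][tx * TILE_W + dx]
            row.append(spectrum)
        cube.append(row)
    return cube

# Synthetic 4 x 4 raw frame whose values encode their own position.
raw = [[10 * y + x for x in range(4)] for y in range(4)]
cube = demosaic_by_binning(raw)
print(len(cube), len(cube[0]), cube[1][0])
```

A 4 x 4 sensor yields a 2 x 2 x 4 cube: the sixteen measurements are all kept, but they are now shared between position and wavelength.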

The most ambitious snapshot designs are unapologetically computational, and they pull the chapter back onto the part's other spine — the inverse problem. In coded aperture snapshot spectral imaging (CASSI) (Wagadarikar et al. 2008), a coded aperture and a dispersive element — a prism or grating — sit in the optical path and together multiplex the whole $(x,y,\lambda)$ cube onto a single 2-D sensor frame: the mask blocks a patterned subset of rays and the disperser shears each band sideways by its wavelength, so every sensor pixel ends up summing a coded mixture of wavelengths from a small neighbourhood. No band is read cleanly; the measurement is one scrambled projection of the entire cube. Recovering the cube is then exactly the compressive-sensing reconstruction we met for coded imaging — find the $I(x,y,\lambda)$ whose coded, sheared projection matches the measured frame, an under-determined fit pinned down by a prior: that a natural spectrum is sparse in some basis, or, increasingly, a learned spectral prior that fills in the rest. The bargain is the one this whole part keeps making, now spent the other way round: instead of trading time (the filter wheel's sequential sweep) or spatial pixels (the mosaic) for the cube, CASSI trades a reconstruction — solving an inverse problem in software — for a genuine single-shot capture, fast enough for motion and video where the scanners cannot follow. It is the same coded-aperture, compressive-sensing move as single-pixel and coded-aperture imaging; the machinery is developed under compressive sensing / coded imaging in Advanced computational photography.
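The forward model described above (mask, shear, sum) is short enough to write out. This is a toy sketch under stated assumptions, not the optics of any particular instrument: the mask is binary and acts before the disperser, and the disperser shifts band `b` sideways by exactly `b` pixels. The function name `cassi_forward` is made up for this example, and only the measurement is simulated; the reconstruction is not attempted.

```python
def cassi_forward(cube, mask):
    """Single-frame measurement from cube[y][x][b] and a binary mask[y][x].

    Assumptions for this sketch: the mask acts before the disperser, and the
    disperser shifts band b sideways by exactly b pixels.
    """
    height, width, num_bands = len(cube), len(cube[0]), len(cube[0][0])
    frame = [[0.0] * (width + num_bands - 1) for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue  # ray blocked by the coded aperture
            for b in range(num_bands):
                frame[y][x + b] += cube[y][x][b]  # sheared, then summed
    return frame

# Tiny synthetic example: 2 x 3 pixels, 3 bands.
cube = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
        [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]]
mask = [[1, 0, 1],
        [1, 1, 0]]
frame = cassi_forward(cube, mask)
print(frame)
```

The cube here has 18 unknowns and the frame has only 10 measurements, several of them sums of different bands from neighbouring pixels, which is why recovery needs a prior rather than a direct inversion.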

Step back and the three strategies fall on a clean tradeoff triangle. You want all three of {full spatial resolution, full spectral resolution, single-shot / fast}; you can cheaply have any two. Spectral scanning sacrifices time (the scene must freeze); pushbroom demands motion (you must sweep); snapshot sacrifices spatial resolution (the mosaic spends pixels on wavelength). There is no free corner — which is simply the L14 cost (data, light, time) made geometric.
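A back-of-envelope tally makes the triangle tangible. The function below is an idealised sketch with hypothetical parameter names: it ignores optics, readout limits, and overlap, and it takes a mosaic tile of side `tile_side` to hold `tile_side**2` bands.

```python
def capture_budget(sensor_h, sensor_w, num_bands, scene_lines, tile_side):
    """Rough exposure counts and cube sizes (rows, columns, bands) per schedule."""
    return {
        "spectral_scan": {
            "exposures": num_bands,  # one full frame per band
            "cube": (sensor_h, sensor_w, num_bands),
        },
        "pushbroom": {
            "exposures": scene_lines,  # one readout per scan line
            # one sensor axis holds position along the slit, the other wavelength
            "cube": (scene_lines, sensor_w, min(num_bands, sensor_h)),
        },
        "snapshot_mosaic": {
            "exposures": 1,
            "cube": (sensor_h // tile_side, sensor_w // tile_side, tile_side ** 2),
        },
    }

budget = capture_budget(
    sensor_h=1024, sensor_w=1024, num_bands=100, scene_lines=1024, tile_side=4
)
for name, entry in budget.items():
    print(name, entry["exposures"], entry["cube"])
```

For this assumed 1024 x 1024 sensor, the two scanners reach a 1024 x 1024 x 100 cube at the price of 100 or 1024 exposures, while the mosaic gets its cube in one exposure but at 256 x 256 x 16.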

**Figure 10.11.3.** Three ways to fill the cube, three panels. **(a) Spectral scan:** a filter wheel or tunable filter in front of the sensor admits one narrow band at a time; a stack of full-frame images accumulates along $\lambda$ as the passband steps — full spatial resolution, but sequential, so the scene must hold still. **(b) Pushbroom line-scan:** a slit selects one line of the scene and a prism/grating disperses it so the sensor reads position-along-the-line on one axis and wavelength on the other; sweeping the line (platform motion) fills the frame — all bands per line at once, but needs relative motion. **(c) Snapshot mosaic:** a Bayer-like array of many band filters captures the whole cube in one exposure — single shot, but spatial resolution is traded for spectral bands. Below, the **tradeoff triangle**: full spatial, full spectral, single-shot — pick two.

10.11.3 What it's for — material ID, agriculture, art and beyond

A spectrum per pixel changes the kind of question you can answer. RGB answers "what color is it?"; a spectrum answers "what is it made of?" — and that shift drives every application.

Material identification and unmixing. A pixel's spectrum is a fingerprint. Lay it against a library of known spectral signatures and you can classify the material — this mineral, that plastic, this species of crop — pixel by pixel, building a material map of the scene. And because a single ground pixel from orbit may straddle several materials, you can go further and unmix it, decomposing the measured spectrum into a weighted sum of pure-material signatures to recover how much of each is present. This is the founding use of imaging spectrometers in remote sensing — the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS), run by NASA's Jet Propulsion Laboratory (JPL), and its descendants map geology and mineralogy from the air this way.
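Both operations reduce to a little linear algebra on the pixel spectrum. The sketch below uses two common, simple choices that the text does not prescribe: the angle between spectra as the matching score, and unconstrained least squares for a two-material mixture. The five-band signatures are invented for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def spectral_angle(u, v):
    """Angle (radians) between two spectra; insensitive to overall brightness."""
    cosine = dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))
    return math.acos(max(-1.0, min(1.0, cosine)))

def classify(spectrum, library):
    """Return the library name whose signature makes the smallest angle."""
    return min(library, key=lambda name: spectral_angle(spectrum, library[name]))

def unmix_two(spectrum, end_a, end_b):
    """Least-squares weights (w_a, w_b) with spectrum ~= w_a*end_a + w_b*end_b.

    Solves the 2 x 2 normal equations by Cramer's rule; no non-negativity or
    sum-to-one constraint is enforced in this sketch.
    """
    aa, ab, bb = dot(end_a, end_a), dot(end_a, end_b), dot(end_b, end_b)
    sa, sb = dot(spectrum, end_a), dot(spectrum, end_b)
    det = aa * bb - ab * ab
    return ((sa * bb - sb * ab) / det, (sb * aa - sa * ab) / det)

# Made-up five-band signatures, for illustration only.
library = {
    "vegetation": [0.05, 0.10, 0.05, 0.45, 0.50],
    "soil":       [0.15, 0.20, 0.25, 0.30, 0.35],
}
shaded_leaf = [0.5 * v for v in library["vegetation"]]  # darker, same shape
mixed = [0.3 * v + 0.7 * s for v, s in zip(library["vegetation"], library["soil"])]

label = classify(shaded_leaf, library)
weights = unmix_two(mixed, library["vegetation"], library["soil"])
print(label, weights)
```

The shaded leaf still matches the vegetation signature because the angle ignores brightness, and the mixed pixel comes back as roughly 30% vegetation and 70% soil. Real unmixing typically handles more endmembers and adds constraints on the weights.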

Agriculture and vegetation. Healthy plants do something striking just past the visible: they absorb red light (chlorophyll) but reflect near-infrared strongly, producing the sharp "red edge" we met in the leaf example. The workhorse index that exploits it is the normalised difference vegetation index (NDVI),

$$ \mathrm{NDVI} \;=\; \frac{I_{\mathrm{NIR}} - I_{\mathrm{red}}}{I_{\mathrm{NIR}} + I_{\mathrm{red}}} . $$

Read it back: the contrast between the near-infrared and red bands, normalised by their sum. High NDVI means vigorous, well-watered foliage; a drop flags water stress, disease, or senescence — often before any change is visible to the eye. Note the shape of the formula. It is a per-pixel band ratio, and that is no accident: it is the L1/L2 multiplicative-and-log lesson in miniature. A ratio of bands cancels the overall illumination level — bright sun or shade scales NIR and red together and divides out — leaving an intrinsic property of the surface. Normalising by the sum bounds the index to $[-1,1]$ whenever both band values are non-negative. This is the same instinct as working in log to turn a multiplicative scene into an additive one (see BASIC point operations): when the thing you care about is a proportion, you build a ratio, not a difference.
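The cancellation is easy to check numerically. In this sketch the reflectance values are made up, the band images are plain nested lists, and pixels where the denominator is zero are set to 0.0, which is an arbitrary convention chosen for the example since the index is undefined there.

```python
def ndvi(nir, red):
    """Per-pixel NDVI for two band images given as nested lists.

    Pixels where NIR + red is zero are returned as 0.0 (an arbitrary choice
    for this sketch; the index is undefined there).
    """
    return [
        [(n - r) / (n + r) if (n + r) != 0 else 0.0 for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]

# Made-up reflectances: left column leaf-like, right column soil-like.
nir = [[0.50, 0.30], [0.45, 0.28]]
red = [[0.05, 0.25], [0.06, 0.24]]

index_sun = ndvi(nir, red)

# Same scene at one third of the illumination: both bands scale together.
shade = 1.0 / 3.0
index_shade = ndvi(
    [[shade * v for v in row] for row in nir],
    [[shade * v for v in row] for row in red],
)
print(index_sun, index_shade)
```

The leaf-like pixels score near 0.8 and the soil-like pixels near 0.1, and the two illumination levels give the same index up to rounding, because the common scale factor appears in both numerator and denominator.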

Art conservation and forensics. Train a multispectral or hyperspectral camera on a painting and the surface turns translucent to questions the eye cannot pose. Infrared bands penetrate paint layers to reveal underdrawings and pentimenti — the artist's changes of mind; subtle spectral differences betray later retouching and overpaint; matching pigment spectra to a reference library identifies the pigments and dates them; and a forgery can be unmasked when its modern inks or pigments have spectra the period's materials never had. The discipline documents and authenticates by seeing colors that, to us, are not different colors at all.

And beyond. Food inspection grades ripeness and catches contamination on the line; medical and biological imaging reads tissue oxygenation from haemoglobin's spectrum and runs fluorescence microscopy through LCTF/AOTF filters; recycling and sorting plants pick plastics apart by their infrared signatures faster than any human could.

In every one of these the discipline is the same — record the full spectral set, then ask the question: which material? healthy or stressed? original or forgery? — long after the light has been captured. The wavelength axis thus takes its place alongside exposure (HDR), focus (focal stacks), and aperture (light fields) as one more dimension along which you can refuse to decide at the shutter, and defer the decision into software.

Big lessons of this chapter

The recurring principles from this chapter, gathered for review.

💡 Big lesson (L14 — wavelength axis)
