17.1 Human factors and the art of photography

Every other chapter in this part has been about getting the picture right: focused, exposed, the colors faithful, the geometry honest. None of that makes a picture good. A perfectly exposed, perfectly sharp snapshot of a cluttered kitchen counter is still a bad photograph, and a grainy, slightly soft frame of the right moment can be a great one. So this chapter changes the question. It asks not how the camera works but how a person makes an image worth looking at, and, just as importantly, it offers a vocabulary for talking about pictures. No rule guarantees a good photograph, but you can learn to see why one works and another does not, and the fastest way to improve is to take a lot of pictures and ruthlessly critique your own.

There is no recipe for creativity, and this chapter will not pretend to give one. What it can give is the set of moves that working photographers reach for, distilled from a 313-slide lecture into a handful of rules of thumb. Almost all of them are tactics for one underlying idea, the throughline of the whole chapter: layering and separation — making your subject stand out from everything around it. Hold that idea; we are about to make it the chapter's one big lesson.

💡 Big lesson — L13 · Separate the subject from its surroundings, and commit

The recurring move of making a good picture is separation: lift the subject out of its background so the eye knows, instantly and without effort, what to look at. You have four levers to do it — depth of field (throw the background out of focus), framing and viewpoint (find a clean background, build layers front to back), light (a rim light, a pool of brightness, contrast against shadow), and color (a warm subject on a cool ground, or a saturated subject on a desaturated one). Pick a lever — and then commit. Whatever you do — off-center the subject, blur the background, tilt the horizon, push the saturation — do it boldly or not at all. The timid half-measure, the subject that is almost off-center, the background that is sort of blurred, is the amateur's tell. Most of a photograph's craft is deciding what to leave out, and then leaving it out without flinching. Every rule in this chapter is one tactic in service of this single idea.

17.1.1 Make better photos

Start with the practical core: the rules that turn a snapshot into a photograph. They fall into composition (arranging the frame), light (the raw material), portraits (the special case of people), and a light touch in post. Each is a way to do separation.

Composition

Control the background. This is the single highest-leverage fix in all of photography, because a busy background kills more snapshots than anything else: a telephone pole growing out of someone's head, a bright distraction in the corner, a tangle of clutter competing with the subject for attention. You have a whole toolbox for taming it: move your feet to put a cleaner wall or sky behind the subject; open the aperture for a shallow depth of field (DoF) that melts the background to a smooth blur; reach for a telephoto lens, which compresses the background and narrows the field so less of it is in the frame; crop or get close so the clutter never makes it into the picture; clean it up afterward (clone-stamp or Poisson editing); desaturate or darken the surroundings; or, when color is the problem, go black and white and let the distractions fall away into tone. The clean background of Figure 17.1.1 is the same picture as its cluttered twin; the photographer simply moved a few feet and opened the aperture.

Figure 17.1.1. Control the background. Left: a portrait shot against a cluttered scene — a sign, a passerby, a bright window — all fighting the subject for attention. Right: the same subject, after the photographer moved a few feet to put a plain wall behind them and opened the aperture to soften what remained. Nothing about the subject changed; the picture went from busy to clean by managing the background alone.

Simplify — get close. A photograph is stronger when it says one thing. The slogan is blunt: if you can't make it good, make it tight. When a scene is not coming together, filling the frame with the one element that matters is almost always an improvement — it forces the simplification the composition needed anyway.

Off-center the subject. The rule of thirds — place the subject on one of the lines a third of the way in, rather than dead center — gives a frame energy and room. Leave space ahead of a moving subject or in front of a gaze, so the eye has somewhere to travel. The rule is a default, not a law: deliberate dead-center framing can be powerful for symmetry — but if you center, center on purpose, and do it exactly. (This is L13 in miniature: off-center boldly, or center exactly; the near-miss reads as a mistake.)

Figure 17.1.2. Placement and commitment. Left: the subject sits slightly off dead-center — neither centered nor clearly to one side — and the frame feels accidental. Middle: the subject placed firmly on a rule-of-thirds line, with room ahead of its gaze, and the frame gains energy. Right: the subject placed dead-center for deliberate symmetry. The two committed framings both work; the timid near-miss does not.
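The placement rule above can be made concrete with a few lines of arithmetic. The following is a minimal sketch, not a composition tool: the function names and the two thresholds (`center_tol`, `near_miss`) are illustrative assumptions chosen only to show the difference between a committed framing and a near-miss.

```python
def thirds_grid(width, height):
    """Return the rule-of-thirds lines and their four intersections for a frame."""
    xs = (width / 3, 2 * width / 3)    # the two vertical lines
    ys = (height / 3, 2 * height / 3)  # the two horizontal lines
    points = [(x, y) for x in xs for y in ys]
    return xs, ys, points


def placement(subject_x, width, center_tol=0.02, near_miss=0.10):
    """Classify horizontal placement of a subject.

    Offsets are fractions of the frame width measured from the center.
    Both thresholds are hypothetical values picked for illustration.
    """
    offset = abs(subject_x / width - 0.5)
    if offset <= center_tol:
        return "centered"
    if offset < near_miss:
        return "near-miss"
    return "off-center"


xs, ys, points = thirds_grid(6000, 4000)
print(xs)                      # vertical thirds lines, in pixels
print(placement(3000, 6000))   # dead center
print(placement(3400, 6000))   # slightly off center
print(placement(4000, 6000))   # on the right-hand thirds line
```

In a 6000-pixel-wide frame the thirds lines sit at 2000 and 4000 pixels, about 17% of the width away from center, which is why a subject only 7% off center reads as neither choice.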

Give the subject room to breathe with negative space — empty, quiet area around the subject that isolates it and lets it stand alone. Pick the viewpoint: "move your feet," because the angle beats the zoom almost every time. Get low or down to eye level for people and especially animals (a dog photographed from standing-adult height looks small and dismissed; the same dog at its own eye level looks like a character), and occasionally go high for a fresh look. Mind the geometry: keep verticals straight, avoid near-parallel lines and accidental alignments (you can correct converging verticals later with a homography, but it is better to level the camera), build the composition on diagonals, and try an unusual angle when the obvious one is dull. Match the frame orientation to the subject — a standing person or a waterfall wants a vertical (portrait) frame; a landscape or a reclining figure wants a horizontal one. Before you press the shutter, sweep the edges of the frame with your eye to catch a distraction creeping in or a hand half-cut by the border. And for landscapes, remember a foreground element — a rock, a branch, a path — adds depth and a way into the picture.

Light

It's the light that counts. The same subject in good light and bad light is two different photographs, and the difference is larger than almost any choice of lens or composition. Soft, directional light flatters — it wraps around form, models depth gently, and forgives texture; harsh, overhead midday sun rarely does, carving hard black shadows under eyes and noses. The easiest good light is open shade or an overcast sky: a giant soft, even source with no harsh shadows at all. For warmth and drama, shoot at golden hour — the hour after sunrise or before sunset, when the sun is low, warm, and directional — or at blue hour, the cool twilight just after the sun is gone. When the light is contrasty and you must keep it, fill the shadows: a fill flash, a bounce flash fired off a wall or ceiling (roughly 45°) so it arrives soft and from above, a diffuser over the source, or a reflector kicking light back into the shadow side. The one thing never to do is fire a bare flash straight at the subject from the camera — flat, harsh, red-eyed, and unmistakably amateur.

Figure 17.1.3. The light is the picture. Left: a face under harsh midday sun — hard-edged shadows in the eye sockets and under the nose, blown highlights on the forehead. Right: the same face in open shade or golden-hour light — soft, directional, the shadows gentle and the skin evenly lit. Same subject, same camera; only the light changed.

The deeper idea under all of these rules is that they are about managing dynamic range. A scene's contrast — the ratio between its brightest and darkest parts — is often wider than the sensor (or the print) can hold, and harsh light makes it worse. Soft light, fill, bounce, and reflectors all compress that range so it fits. The same goal drives studio three-point lighting (a key light to shape, a fill to lift the shadows to a chosen ratio, a rim to separate the subject from the background — separation again, this time with light), and it is exactly what high dynamic range (HDR) capture and tone mapping do computationally in post. The photographer with reflectors and the algorithm with bracketed exposures are solving the same problem.
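To see why fill light compresses dynamic range, it helps to measure contrast in stops, where each stop is a factor of two in luminance. The sketch below is illustrative only: the luminance values are made-up numbers in arbitrary units, and it assumes the fill adds the same amount of light to the bright and dark regions.

```python
import math


def stops(brightest, darkest):
    """Contrast between two luminances, in photographic stops (factors of two)."""
    return math.log2(brightest / darkest)


def stops_with_fill(brightest, darkest, fill):
    """Contrast after adding the same amount of fill light to both regions."""
    return stops(brightest + fill, darkest + fill)


# Hypothetical scene: sunlit highlight at 4096 units, deep shadow at 4 units.
print(stops(4096, 4))                # 10 stops of scene contrast
print(stops_with_fill(4096, 4, 28))  # about 7 stops once the shadows are lifted
```

The fill barely changes the highlight (4096 becomes 4124) but multiplies the shadow by eight, so almost all of the compression comes from lifting the dark end.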

Portraits

A portrait is a composition and a lighting problem with one extra, non-negotiable rule: it's all about the eyes. Focus on the near eye (with shallow depth of field, that is where sharpness must land), and look for a catchlight — the little reflected highlight that makes an eye look alive rather than dead. Shoot near eye level, not down at the subject. A short telephoto (≈ 85–135 mm) flatters faces: it lets you stand back far enough that the nose-to-ear perspective looks natural rather than bulbous, and it compresses the background into a soft wash. Light the face softly and warm the white balance (WB) a touch — cool skin looks ill. The whole craft of lighting and makeup comes down to shaping a face with light and shadow: a broad soft key from slightly above and to one side does most of the work. And over-shoot — expressions are fleeting and cheap to capture; you are fishing for the one frame where everything aligns.
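The claim that a short telephoto flatters faces is really a claim about shooting distance, and a rough calculation shows the size of the effect. This is a sketch under stated assumptions: a far-field pinhole model, a full-frame sensor held vertically (36 mm tall), a head-and-shoulders framing 0.5 m tall, and a nose that sits about 10 cm in front of the ears. All of these numbers and function names are illustrative.

```python
def shooting_distance_m(focal_mm, subject_height_m, sensor_height_mm=36.0):
    """Approximate distance needed to fill the frame height with the subject.

    Far-field pinhole approximation: image size = focal * object size / distance.
    """
    return focal_mm * subject_height_m / sensor_height_mm


def nose_to_ear_ratio(distance_m, face_depth_m=0.10):
    """How much larger the nose renders than the ears, which are face_depth_m farther away."""
    return (distance_m + face_depth_m) / distance_m


for focal in (24, 85, 135):
    d = shooting_distance_m(focal, 0.5)
    print(focal, round(d, 2), round(nose_to_ear_ratio(d), 2))
```

With these assumptions a 24 mm lens forces you to about a third of a metre, where the nose is drawn roughly 30% larger than the ears; at 85 mm you stand more than a metre back and the exaggeration falls below 10%. The lens itself does not change perspective; the distance it lets you keep does.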

Then, lightly, in post

Most of the work is done at capture, but a little processing finishes the image (and these are exactly the operations the rest of the book automates — see the Color technology and image-signal-processing chapters, and the HDR chapter). Shoot RAW so you keep the full sensor data and the latitude to adjust later. Then: set the white balance, fix the exposure, manage contrast with a tone curve, dodge and burn to lift or darken local regions (the manual ancestor of every local-tone operator), crop for the composition you wish you had nailed in camera, and boost saturation or vibrance a touch for life. For a scene whose contrast outran the sensor, merge an HDR bracket and tone-map it back to a displayable range.
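As one concrete example of the steps above, a contrast tone curve can be sketched as a blend between the identity and an S-shaped curve on normalized pixel values. This is an illustrative stand-in, not the curve any particular raw converter uses; the smoothstep shape and the `strength` parameter are assumptions.

```python
def s_curve(x, strength=0.5):
    """Contrast tone curve on a normalized value x in [0, 1].

    Blends the identity with a smoothstep S-curve. strength=0 leaves x
    unchanged; strength=1 applies the full smoothstep.
    """
    x = min(1.0, max(0.0, x))
    smooth = x * x * (3.0 - 2.0 * x)
    return (1.0 - strength) * x + strength * smooth


for value in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(value, s_curve(value, strength=0.3))
```

Black, white, and middle gray stay fixed while shadows are pushed down and highlights up, which is what "adding contrast" means. Keeping `strength` small is the numerical version of a light touch.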

The governing principle of post is one word: restraint. Whatever you do, do it less. Over-sharpened, over-saturated, over-HDR'd images announce themselves instantly, and pulling every slider back from where it "looks good" to where it looks finished is the mark of an edited image that does not look edited. This is L13's twin: commit at capture, restrain in post.

17.1.2 Typical shooting scenarios

Different genres reward different settings. Here is a quick field guide — for each genre, the lens, the exposure priorities, and the one thing that matters most.

Portrait — a short telephoto (85–135 mm); a wide aperture for shallow depth of field and subject isolation; focus on the near eye; soft, directional light; a slightly warm white balance. The one thing: the eyes.
Landscape — a wide lens; a small aperture (f/8–f/16) for deep depth of field front to back; low ISO on a tripod; focus near the hyperfocal distance; a foreground element for depth; shoot at golden or blue hour; bracket for HDR when the sky outruns the land. The one thing: depth and light.
Wildlife — a long telephoto (300–600 mm); a fast shutter to freeze motion; continuous autofocus (AF) with burst shooting; raise the ISO as needed; get down to the animal's eye level; and patience above all. The one thing: get close and stay sharp.
Studio — controlled strobes in a key/fill/rim setup; low ISO; aperture set for the depth of field you want; mind the flash sync speed; shape the light with a softbox or reflector; often shooting tethered. The one thing: you own the light, so shape it.
Sports and action — a very fast shutter (1/1000 s or shorter) to freeze the subject; continuous-AF tracking at a high frame rate; a wide aperture and high ISO to afford that shutter; or, for a sense of motion, drop to a slower shutter and pan with the subject to blur the background while keeping the subject sharp. The one thing: freeze the subject, suggest the speed.
Architecture and real estate — a wide or tilt-shift lens to keep verticals straight (or correct the converging verticals later with a homography); a small aperture; level the camera; and bracket interiors to hold both the bright window and the dim room. The one thing: straight lines and the window-vs-room dynamic range.
Macro — high magnification, its own beast, and its own section next.
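The landscape entry above mentions focusing near the hyperfocal distance. A minimal sketch of that calculation follows, using the standard formula H = f²/(N·c) + f. The circle of confusion `coc_mm=0.03` is a commonly quoted full-frame value and is an assumption here, as are the function names.

```python
def hyperfocal_mm(focal_mm, f_number, coc_mm=0.03):
    """Hyperfocal distance H = f^2 / (N * c) + f, in millimetres."""
    return focal_mm ** 2 / (f_number * coc_mm) + focal_mm


def near_limit_mm(focal_mm, f_number, coc_mm=0.03):
    """Nearest acceptably sharp distance when focused at H (about H / 2)."""
    return hyperfocal_mm(focal_mm, f_number, coc_mm) / 2.0


h = hyperfocal_mm(24, 11)
print(round(h / 1000, 2), "m hyperfocal")
print(round(near_limit_mm(24, 11) / 1000, 2), "m near limit")
```

For a 24 mm lens at f/11 this gives a hyperfocal distance a little under 1.8 m, so focusing there keeps everything from roughly 0.9 m to the horizon acceptably sharp, which is what lets a foreground rock and a distant ridge share one frame.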

17.1.3 Macro photography

Macro photography means shooting small subjects at high magnification — an insect, a flower's stamen, a snowflake — at life-size (1:1), where the subject projects onto the sensor at its true size, or greater. It is technically demanding for one dominant reason.

The central problem is depth of field. At 1:1 the in-focus band is on the order of millimetres even with the aperture stopped well down: focus on a dragonfly's eye and its wings are already a blur. Everything in macro technique is a response to this. You can stop down (f/8–f/16 and beyond) to deepen the band, but only so far, because at very small apertures diffraction starts softening the whole image, and at high magnification that ceiling is real and close, so brute-force stopping down runs out fast. The computational escape is focus stacking: shoot a stack of frames at stepped focus distances and combine the sharp slice from each (this is the focus-stacking story of the multiple-exposure chapter, and an instance of the capture-everything-decide-later idea; see L14). And there is a free lunch hiding in sensor size: small sensors get macro depth of field almost for free, because for the same framing and f-number depth of field grows as the sensor shrinks (the image-formation chapter derives this), which is why a phone or a compact can do casual macro that a full-frame camera struggles with.
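The numbers behind this paragraph can be sketched with two textbook approximations: close-up depth of field DoF ≈ 2·N·c·(m+1)/m², and a diffraction blur of about 2.44·λ·N·(1+m), where N·(1+m) is the effective f-number at magnification m. The values below assume a symmetric lens, a 0.03 mm circle of confusion, and green light at 550 nm; the function names and the 30% overlap between stack slices are illustrative choices.

```python
import math


def macro_dof_mm(f_number, magnification, coc_mm=0.03):
    """Total depth of field at close range: 2 * N * c * (m + 1) / m^2."""
    m = magnification
    return 2.0 * f_number * coc_mm * (m + 1.0) / (m * m)


def airy_disk_mm(f_number, magnification, wavelength_mm=0.00055):
    """Diffraction blur diameter 2.44 * lambda * N_eff, with N_eff = N * (1 + m)."""
    return 2.44 * wavelength_mm * f_number * (1.0 + magnification)


def frames_for_stack(subject_depth_mm, f_number, magnification,
                     overlap=0.3, coc_mm=0.03):
    """Focus steps needed to cover the subject, with slices overlapping by `overlap`."""
    step = macro_dof_mm(f_number, magnification, coc_mm) * (1.0 - overlap)
    return math.ceil(subject_depth_mm / step)


for n in (8, 16):
    print(n, round(macro_dof_mm(n, 1.0), 2), round(airy_disk_mm(n, 1.0), 4))
print(frames_for_stack(10.0, 8, 1.0))
```

Under these assumptions, f/16 at 1:1 yields just under 2 mm of depth of field while the diffraction blur (about 0.043 mm) already exceeds the 0.03 mm sharpness criterion. Backing off to f/8 keeps diffraction in check but halves the band to about 1 mm, so a subject 10 mm deep needs on the order of 15 stacked frames.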

Working distance versus magnification is the other thing to understand, and it surprises people. A longer macro focal length buys you more working distance (physical room between the front of the lens and the bug) at the same magnification. That extra room matters enormously, because at macro distances the lens is so close that it shadows the subject, and so close that a live subject spooks. Crucially, going longer does not change the depth-of-field band: at the same magnification and f-number you get the same DoF, whatever the focal length (again, the image-formation chapter). So a 100 mm and a 180 mm macro lens framing the same bug at 1:1 at the same aperture give the same razor-thin focus band; the longer one just lets you stand back.
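The trade described above follows from the thin-lens relation between subject distance and magnification, s = f·(1 + 1/m). The sketch below uses that idealization only; the function name is illustrative, the distance is measured from the lens's principal plane rather than its front element, and real macro lenses (which often shorten their focal length as they focus close) will give less room than this suggests.

```python
def object_distance_mm(focal_mm, magnification):
    """Thin-lens subject distance for a given magnification: s = f * (1 + 1/m)."""
    return focal_mm * (1.0 + 1.0 / magnification)


for focal in (60, 100, 180):
    print(focal, object_distance_mm(focal, 1.0))
```

At 1:1 the ideal subject distance is simply twice the focal length, so the 180 mm lens sits nearly twice as far from the bug as the 100 mm lens while projecting the same life-size image.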

The gear follows from this: a true macro lens, or cheaper routes — extension tubes (spacers that let any lens focus closer), a reversed lens, or close-up diopters (filters that act like reading glasses) — plus a focusing rail to step focus precisely for a stack. The light is its own challenge: the subject is so close that the lens shadows it, so macro shooters use a ring flash or twin flash around the lens, with diffusers, and a tripod for anything that will hold still. Finally, a delightful corner: macro 3D via hypostereo — a stereo pair shot with the two viewpoints spaced only millimetres apart (a tiny baseline), which gives natural-looking depth on a subject just centimetres from the lens, where a normal eye-spaced baseline would exaggerate it into a cardboard cutout (cross-reference the stereo chapter).

17.1.4 Special effect photography

Some pictures are made by abusing the camera in productive ways — usually by playing games with time and viewpoint.

Light painting is the long exposure's party trick: open the shutter for seconds in a dark room and move a light — a flashlight, a sparkler, your phone — through the frame (or move the camera). The sensor accumulates every position the light passed through, so the trail draws itself into the image as a glowing line, while the dark surroundings record almost nothing (this is exactly the integration over exposure time from the exposure chapter).

Figure 17.1.4. Light painting. A multi-second exposure in a dark space while a handheld light is swept through the air; the sensor accumulates the light's whole path into a single luminous trail, while the unlit surroundings stay dark. The "drawing" exists nowhere at any instant — only in the integral over the exposure.
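The idea that the trail exists only in the integral can be shown with a toy simulation. This is a minimal sketch on a tiny grid of numbers, assuming a single point light that visits one pixel per time step; the function name and the `ambient` parameter are illustrative.

```python
def light_painting(width, height, path, ambient=0.0):
    """Accumulate a moving point light over an exposure.

    path is a list of (x, y) pixel positions, one per time step while the
    shutter is open. Returns a height x width grid of accumulated exposure.
    """
    image = [[0.0] * width for _ in range(height)]
    for x, y in path:
        # Each time step adds the instantaneous scene to the running sum:
        # a little ambient light everywhere, plus the lamp at one pixel.
        for row in image:
            for col in range(width):
                row[col] += ambient
        if 0 <= x < width and 0 <= y < height:
            image[y][x] += 1.0
    return image


sweep = [(i, 2) for i in range(5)]   # lamp moves left to right along row 2
for row in light_painting(5, 5, sweep):
    print(row)
```

At every individual time step only one pixel is lit, yet the accumulated image contains a full horizontal line. Raising `ambient` above zero shows why the room has to be dark: the background accumulates over every step too.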

Multi-Fredo / clone shots lock the camera on a tripod and shoot several frames of the same person standing in different spots, then composite them into one frame — one subject appearing many times at once, shaking hands with himself. It works precisely because the camera is locked off: the background is identical across frames, so the composite is just a matter of choosing, per region, which frame to show.

Figure 17.1.5. The clone shot. A tripod-locked camera captures several frames of one person posed in different positions within an unchanging scene; compositing the frames places all the copies into a single image. The locked viewpoint makes the background pixels identical frame to frame, so the merge reduces to a per-region choice of which frame to reveal.
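The per-region choice can be sketched in its crudest form as a per-pixel choice. The code below is one simple way to automate it, not the method the text prescribes: it assumes grayscale frames stored as nested lists, estimates the empty background as the per-pixel median (which requires the person to cover any given pixel in fewer than half the frames), and keeps whichever frame differs most from that background. Real composites typically rely on hand-painted masks or smarter seam selection.

```python
from statistics import median


def clone_composite(frames):
    """Merge tripod-locked frames so every copy of the subject appears at once.

    frames is a list of equally sized 2D grids of grayscale values.
    """
    height, width = len(frames[0]), len(frames[0][0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [frame[y][x] for frame in frames]
            background = median(values)
            # Keep the value that stands out most from the static background.
            out[y][x] = max(values, key=lambda v: abs(v - background))
    return out


# Background level 10; the "person" (value 200) stands at x = 0, 2, 4 in turn.
frames = [
    [[200, 10, 10, 10, 10, 10]],
    [[10, 10, 200, 10, 10, 10]],
    [[10, 10, 10, 10, 200, 10]],
]
print(clone_composite(frames))
```

The sketch only works because the background pixels agree across frames; with a handheld camera the median would no longer isolate the background and the selection would fall apart.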

Box rigs and multi-camera setups take the viewpoint game to hardware: bullet-time and array rigs surround the subject with many cameras fired together (one instant from many viewpoints), and beam-splitter rigs put two cameras behind one optical axis. The slogan is that you are capturing one moment from many viewpoints — or one viewpoint across many instants.

Figure 17.1.6. Multi-camera capture rigs. Left: a bullet-time array — many cameras arranged around the subject and triggered simultaneously, so a single frozen instant can be viewed from a sweep of angles. Right: a beam-splitter rig sharing one optical axis between two cameras. These rigs trade hardware for the ability to sample viewpoint (or time) densely at the moment of capture.

And a grab-bag of others, each a deliberate misuse: tilt-shift / miniature faking (a narrow plane of focus that fools the eye into reading a real scene as a tabletop model), intentional camera movement during a long exposure for painterly smears, prism and kaleidoscope tricks held in front of the lens, and infrared and ultraviolet photography that records light the eye cannot see.

17.1.5 Fun artsy stuff

Scattered through the book are artists and gadgets that stretch what "a photograph" even is — worth meeting because they reframe the technical ideas as creative ones. David Hockney's photo-collage "joiners" assemble a scene from dozens of overlapping snapshots, each with its own viewpoint and instant — a hand-made, deliberately wrong panorama that captures how the eye actually roams a scene rather than how a single exposure flattens it. Jason Salavon averages large sets of images (every Playboy centerfold, every frame of a movie) into ghostly statistical composites — the mean image as art, and a preview of the data-driven turn the book takes later. There is the face-detector turned loose on clouds, finding faces where there are none (pareidolia made computational); Bill-style action visualizations that lay a whole motion out across one frame; non-linear perspective and push-broom cameras that violate the single-center-of-projection rule on purpose; multi-perspective panoramas that stitch many viewpoints into one impossible-but-readable image; and a family of conceptual cameras — a text-output camera that describes the scene in words instead of pixels, a blind camera that shoots without a viewfinder, an anti-cliché camera that refuses to take the postcard shot everyone else takes. Each is a thumbnail of an idea the book develops more seriously elsewhere; they are here to keep the engineering playful.

17.1.6 Perception of art

Why do good pictures look the way they do? Because the visual system that makes them is the same one that reads them, and art that works tends to be art that matches how we see (the framing is Aaron Hertzmann's "why does art look the way it does," and Margaret Livingstone's Vision and Art; it picks up directly from the perception chapter).

Luminance versus color. The luminance channel — brightness, regardless of hue — is what carries form, depth, and motion for us; color is comparatively a passenger. So art built on equiluminant color (regions that differ in hue but match in brightness) looks oddly unstable and shimmery, because the brightness-based machinery that locates edges and depth has nothing to grip — exactly the restless quality of much Impressionist and Op-art work. This ties straight back to the chroma contrast-sensitivity function of the spatial-vision chapter: we resolve color far more coarsely than brightness.
Central versus peripheral vision. The eye is sharp only at the fovea and goes blurry toward the periphery. That single fact explains the Mona Lisa's elusive smile — the smile lives in low-spatial-frequency information that your peripheral vision picks up when you look at her eyes, but dissolves when you look straight at her mouth — and it explains why a loose, suggestive sketch reads cleanly from across the room: at a distance, or in the periphery, the missing detail was never going to be resolved anyway, so the brain fills it in.
Shortcuts that match perception. Artists discovered, by trial and centuries of error, tricks that exploit the visual system: edges and line drawings work because we encode the world as edges in the first place; exaggerated contrast at boundaries works because lateral inhibition already exaggerates it for us (the Mach-band effect); chiaroscuro and cast shadows read as depth because shadow is a depth cue the brain trusts; atmospheric perspective (distant things hazier and bluer) works because real scattering does exactly that, so we read haze as distance.
The camera versus the eye. A photograph flattens what the eye explores over time. The eye foveates around a scene, refocusing and re-exposing point by point, building a rich mental image with no single viewpoint, focus, or exposure; a camera takes one exposure, at one focus, from one point, all at once. That gap is a large part of why "the photo doesn't look like I remember it" — and, not coincidentally, much of computational photography (HDR, focus stacks, panoramas) is an attempt to close it.
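The equiluminance idea in the first item above can be made concrete by computing luminance from color. This sketch uses the Rec. 709 luminance weights on linear-light RGB values, which is an assumption about the color space; the function names are illustrative, and perceptual equiluminance varies somewhat between observers and displays, so treat the result as a starting point rather than a guarantee.

```python
REC709 = (0.2126, 0.7152, 0.0722)  # luminance weights for linear R, G, B


def luminance(rgb):
    """Relative luminance of a linear-light RGB triple with components in [0, 1]."""
    return sum(w * c for w, c in zip(REC709, rgb))


def equiluminant_green(red_level):
    """Green level whose luminance matches a pure red of the given level."""
    return red_level * REC709[0] / REC709[1]


red = (1.0, 0.0, 0.0)
green = (0.0, equiluminant_green(1.0), 0.0)
print(luminance(red), luminance(green))
```

A full-strength red is matched by a green at only about 30% of full strength, because green contributes far more to luminance. A pattern painted in those two colors differs strongly in hue while giving the brightness channel almost nothing to work with.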

Forward tie-ins: this is the seed of non-photorealistic rendering and stylization (Advanced) and of aesthetics and saliency models in the machine-learning chapters.

Big lessons of this chapter

The recurring principles from this chapter, gathered for review.

💡 Big lesson — L13 · Separate the subject from its surroundings, and commit

symbol / term	meaning (this chapter)	note / clash
$f/N$	aperture as an $f$-number (e.g. f/8–f/16) — small $N$ = wide aperture, shallow depth of field	from image formation; reused here as a craft control
1:1	macro magnification ratio: subject projected at life size on the sensor	new notation; greater than 1:1 = larger-than-life
DoF	depth of field — the range of distances acceptably in focus	introduced/expanded in image formation; spelled out on first use here
HDR	high dynamic range (capture + tone mapping)	spelled out here; full treatment in the HDR chapter
AF	autofocus (single vs continuous AF)	spelled out here
WB	white balance	introduced in the Color technology chapter; spelled out here
