💬Comments welcome. To leave a note, select any text and click the note / highlight button that pops up — or open the panel with the tab at the top-right (‹). Notes are visible only inside our private review group.
jump to

22.16 How this book was created

This book was written with an AI collaborator — Anthropic's Claude — as a deliberate experiment in AI-assisted authoring. That is a slightly self-referential thing to admit in a book whose final parts are about machine learning and generative models, and the irony is intentional: it seemed dishonest to write about computation reshaping a craft while pretending this craft had been untouched. So this appendix describes, plainly, how the outline, the figures, the prose, and even the book's own tooling came to be. The short version is that an AI did a great deal of the drafting, building, and checking, and a human did all of the deciding. The longer version is below, because the method turned out to be as interesting as the result, and because the failure modes everyone rightly worries about — hallucinated facts, a soup of inconsistent notation, prose that drifts into nonsense over hundreds of pages — are exactly the things the method was built to prevent.

22.16.1 Two documents, not one

The central design decision was to keep two living documents apart. The first is the outline — an annotated specification of what the book says: its structure, the order of ideas, and, attached to each section, the figures it needs, the equations and symbols it uses, its prerequisites, and the papers and slides it draws on. The second is the prose — the actual text a reader sees, which is about how it reads. A compilation step turns the outline (plus a set of supporting files) into draft prose, the way a compiler turns source into a program. The point of the separation is leverage: structure and content live in one place where they can be reasoned about and rearranged cheaply, while voice and phrasing live in another.

Crucially, the arrow runs both ways. When the human author edits the prose, the change is triaged back into the right home: a change to content (a new point, a reordering, a corrected claim) updates the outline; a change to voice (word choice, rhythm, "spell this out") updates a style guide; and a purely local polish is simply locked in place so it is never silently overwritten. This split is what makes the system's central promise hold — that re-compiling the book reproduces essentially the edits the author made, rather than regenerating something new and forgetting the corrections. The outline is therefore kept as a strict superset of the prose's content, so nothing the writing made emerge is lost on the next pass.

22.16.2 Compiling a section

Large language models are at their worst in long, meandering conversations, where early mistakes calcify and context "rots." The antidote here is that files are the durable memory and the chat is disposable. Each section is compiled in a fresh, clean context from an explicit capsule of exactly what it needs — never from accumulated conversation. That capsule holds the section's outline bullets, a one-line statement of the chapter's thesis and the neighbouring sections for continuity, the in-scope notation and the relevant style rules, and — this is the important part — the source material.

Two kinds of source feed every section, and they play different roles. The first is the body of teaching this book grew out of: the lecture slides and cleaned transcripts from the course, transcribed into machine-readable digests. These supply pedagogy and voice — the specific framing, the worked examples, the analogies, and the order in which a human teacher found it natural to reveal things. The second is the technical literature: the cited papers and textbooks, kept on disk and consulted for the exact definitions, results, and attributions. These supply precision. The division of labour is deliberate — slides and transcripts decide how to explain, papers decide what is true — and grounding every claim in a named source is the first and best defence against an AI confidently inventing things. The rule throughout is to write from the sources, not from the model's own memory.

22.16.3 Figures as code

None of the book's diagrams or example images is drawn by hand. Each figure is a small program that builds it, so that every figure regenerates from source, stays consistent with the others, and updates automatically when its data or style changes — the same reproducibility ethic the book preaches for image-processing code. Two little libraries do the work: one holds the actual image operations a figure demonstrates (exposure, contrast, tone mapping, convolution, pyramids, demosaicking, and so on), written once so a figure and the book's prose can never disagree about what an operation does; the other holds a single shared visual style — one palette, one set of fonts, consistent labels — applied everywhere. A figure is specified in the outline alongside the text it supports, built to a legibility standard (no text smaller than the body type, nothing clipped or overlapping), and checked against the prose so that the writing actually leverages each figure rather than merely gesturing at it. Photographs used as examples are the author's own, or carefully credited Creative-Commons images.

22.16.4 Generative imagery: cover art and 3D

The line the previous section draws — figures are code, never guessed — is exactly what lets the book use generative models freely for the decorative imagery, where invention is the point rather than evidence. The cover art was made with a text-to-image model (Reve), and the book's 3D assets — the fantastical cameras and optical artifacts that populate the interactive demos and the animated cover — were generated from text prompts with Meshy, a text-to-3D tool, then placed and lit in a small three.js scene. A pleasing recursion: a book about computational imaging, illustrated by computational imaging's newest tools. The rule stays firm — generative models are never trusted to draw a ray diagram or a Bayer grid (a hallucinated arrow would be a lie); they make ornament, not evidence, and every caption that uses a generated model says so.

fig-cover-museum-3d
Figure 22.16.1. The cover as a living exhibit. An animated 3D "Museum of Computational Photography": a procedural, themed exhibit hall whose pedestals hold Meshy-generated fantastical cameras and optical artifacts — a prism, a plenoptic lens array, a pinhole camera — each slowly turning in its glass case, with a bigger camera on the central pedestal and a glowing title overhead. It comes in several styles (steampunk, cyberpunk, retro-futurist, medieval, da Vinci); reshuffle for a new arrangement. In the printed edition this stands in as a still. The cameras and artifacts are 3D models generated with Meshy AI.

22.16.5 Verification and review

A chapter is not allowed to start drafting until it passes a readiness gate: its outline must be fleshed out, its prerequisites declared and satisfied upstream, its figures built (or stub-placeheld), its equations and notation locked, its references resolving to real documents, and its source coverage checked. Only then does prose get generated. After a draft exists, it is run past a panel of automated reviewers, each looking through a different lens: a technical reviewer hunting for inaccuracies and missing assumptions; a pedagogical reviewer asking where a student would get lost; a prerequisite reviewer flagging background that is assumed but never introduced; a style reviewer enforcing that the whole thing reads as if written by a single author; and further checks for figure–text fit, citation correctness, and faithful coverage of the underlying slides and papers. Their findings collect into a running issues list and steer the next revision. Mechanical checks — consistent notation, language-isolated code, sane value-range and encoding assumptions — run separately and automatically. The verifiers do not replace human review; they make the human review land on the things that actually matter.

One optional step in that loop is a deliberate second opinion from a different model. Every reviewer just described is, in the end, the same family of model that wrote the prose, so it inherits the author-model's blind spots. To decorrelate the review, the pipeline can hand a slice of the book — the outline or the drafts, at chapter, part, or whole-book granularity — to an independent model from another vendor (in practice an OpenAI model) for a cross-check. Its feedback returns in a fixed two-part shape — a high-level summary, then an itemized issues list — with each issue tagged for triage: small, unambiguous corrections the editing AI applies directly, and judgment calls escalated to the human editor. The tool is deliberately read-only and sandboxed (it runs no version control and writes only its own report), so it can run alongside the main build without interfering. The value is not more review for its own sake; it is decorrelated review — a fresh model tends to catch exactly what the resident one is systematically blind to.

22.16.6 Keeping a long book coherent

The hardest problem in generating a book piece by piece is that the pieces must add up to one object. Three measures hold it together. First, a set of single-source registries: one master notation table owns every symbol and its meaning, one glossary owns every key term's definition, and a registry of the book's recurring "big lessons" (that much of imaging is multiplicative; that color is non-orthogonal and non-negative; that the right basis diagonalizes a problem; that edge-preserving filtering is really about affinity; that quantization is rarely the real culprit) records where each principle first appears and everywhere it recurs. Because symbols, terms, and themes are defined in exactly one place and referenced everywhere else, they cannot drift. Second, a dependency graph: each chapter declares what it provides and what it requires, and the build checks that nothing is used before it is introduced — and propagates a change in a shared definition to every chapter that leans on it. Third, version control: every change to both the prose and the tooling is committed, so the generated and the edited versions sit side by side, edits can be extracted as differences, and the whole history is recoverable. A verbatim log of every instruction and a dated process changelog accompany the repository, so the method itself — which evolved continuously as we learned — can be reconstructed and explained, which is much of what this appendix is built from.

22.16.7 The toolchain

The prose is written once in a neutral markup and rendered to several outputs — print/PDF and a web edition — from a single source, so it is proofread once rather than per-format. The same single-source trick produces the book's two flavours, one with Python code and one with C++: the shared explanation is written once and language-specific snippets are swapped in per build, so a reader ever sees only one language, never a confusing both. Derived artifacts — the slide digests, the figures, and eventually the compiled chapters, the index, and the glossary — follow an incremental, build-style discipline: each records the inputs it was built from and is rebuilt only when those inputs actually change, so a small edit does not trigger a wholesale regeneration. The book even carries a version number that ticks up on each recompile.

22.16.8 What the machine did, and what it did not

It is worth being precise about the division of labour, because "written by AI" is both an overstatement and an undersell. The AI was extremely good at the things that are laborious but well-specified: drafting fluent prose from a tight, source-backed outline; writing and debugging the figure-generation code; performing the mechanical bookkeeping — notation tables, cross-references, consistency sweeps — that no human enjoys; and building the very tooling that ran the process. It gave one author the throughput of a small team. It was not trusted with judgment. A human chose the book's structure and argument, decided what mattered and what to cut, supplied the taste, caught the subtle errors that survive plausible-sounding prose, and owns every claim that remains. The safeguards described above — source-grounding against fabrication, fresh per-section context against drift, single-source registries against incoherence, and a reviewer panel feeding a human editor — exist precisely because an unsupervised model would fail in each of those ways. The honest summary is that the AI is a powerful collaborator and a tireless build engine, and the responsibility, like the byline, stays human.

22.16.9 Who wrote what: a per-part estimate

In the spirit of openness, here is an honest attempt to quantify the division of labour — part by part, for the outline (the structure and content) and for the figures. A caveat first, because the numbers are softer than they look. "Authorship" of the outline does not cleave cleanly: the human author set the structure, chose every topic, and supplied the course material the book grows from, yet a great deal of the connective outline text — the sidebars, the cross-references, the worked framings — was drafted by the AI from those seeds and then edited. So a low "human %" on the outline does not mean the human was absent; it means the human directed and the machine expanded. Conversely, a figure counted as "AI-originated" was still requested in general terms, reviewed, and kept or killed by the human. The percentages below are estimates reconstructed from the verbatim prompt log, rounded to the nearest 5%, and meant to convey proportions, not precision.

Two columns, two questions:

PartOutline — human %Figures — human-specified %
Intro4515
Fundamentals4040
Basic image processing & ISP4035
Computational tools (ML/DL/diffusion)205
Edges matter3520
Warping and morphing3015
Matching pixels across space & time3015
Single-image computational photography2010
Optics, lenses & aberration correction4030
Multiple-exposure imaging (HDR, panoramas)4020
Many images & photo collections2510
Video3010
Advanced computational photography205
3D and depth2510
Revealing the invisible3510
Adjacent fields & applications205
Human factors & the art of photography3010
Image forensics & authentication205
Systems305
Performance engineering & Halide4510
Conclusions & discussion255
Appendices (incl. problem sets)4550
Whole book (content-weighted)≈ 30–35≈ 20

Two patterns are worth reading off the table. The human share of the outline is highest exactly where the book leans on the author's own teaching and research — Fundamentals, Basic, Optics, Multiple-exposure, Performance/Halide, and the problem-set Appendices — all rooted in the MIT course's lectures, slides, and assignments; it is lowest in the survey-style later parts, where the AI expanded thin topic lists into structured outlines that the author then pruned. The human share of the figures is concentrated in the parts that have actually been drafted and iterated (Fundamentals and Basic), where many figures were requested by name — and high again in the Appendices, whose figures are the author's own course-handout illustrations. Across the whole book, roughly a third of the outline's substance traces to explicit human direction (with the rest AI-expanded under that direction), and about a fifth of the figures were specified in detail by the author (with the rest proposed by the AI and then accepted, revised, or rejected). Every page, figure, and number, whoever first drafted it, was reviewed and is owned by the human author.