11.9 Personalized priors
A prior is whatever a model assumes about the answer before it sees the degraded or missing input. For an ill-posed problem — and restoring or generating a face is deeply ill-posed — the prior is not a tie-breaker but the deciding vote: the input narrows the space of answers, and the prior picks among everything that survives. A generic image prior, trained on everyone, knows what a face looks like, and it will fill an eight-pixel-wide blur or a missing region with the most statistically plausible face it can. A personalized prior knows what your face looks like, and it fills the same gap with you. When the task is to recover or depict a specific, recognizable subject, those are not small differences in quality — they are different people, and only one of them is correct.
The wager is the same one the rest of this part keeps making, just aimed at a single person instead of the whole world. Inpainting Using Millions of Photographs bet that, at the scale of millions of photographs, some image already contains a plausible completion of almost any scene; Pix 2 GPS bet that a giant geotagged corpus is a model of how the planet looks. Personalized priors make the inverse bet: the collection is tiny — a handful to a few dozen of someone's own photographs — but it is exactly the right collection, and for the narrow job of being about that subject it beats a model trained on everyone. Scale is traded for specificity, and for restoring or synthesizing a particular person, specificity is what matters.
11.9.1 Personalized restoration
Super-resolution, denoising, and deblurring are all instances of the same ill-posed inverse problem from Super-resolution and image priors: a forward model threw away information — blur, downsampling, noise — and infinitely many sharp images are consistent with the degraded one you hold. The prior is what chooses. A generic face prior (a GAN or diffusion model trained on thousands of faces) chooses the most plausible face in general: it will happily synthesize a crisp, symmetric, attractive face that has nothing to do with the person actually in the photo. It looks sharp and it looks like a face, which is precisely why the failure is so easy to miss — the hallucinated detail is plausible, just wrong, the same "plausible but wrong" trap that haunts the DNG-rendering reference.
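One common way to write this down is maximum a posteriori (MAP) estimation. It is offered here as a sketch, not as the formulation used by any particular method. The symbols are assumptions introduced for illustration: $x$ is the unknown sharp image, $y$ the degraded observation, $A$ the forward operator (blur and downsampling), $n$ the noise, and $p(x)$ the prior.

```latex
\[
  y = A x + n, \qquad
  \hat{x} \;=\; \arg\max_{x}\; p(x \mid y)
          \;=\; \arg\max_{x}\; \bigl[\, \log p(y \mid x) + \log p(x) \,\bigr]
\]
```

Because $A$ discards information, many different $x$ give almost the same data term $\log p(y \mid x)$. Among those candidates the prior term $\log p(x)$ settles the answer, which is the sense in which the prior chooses.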
A personalized restoration prior closes that gap by feeding the model the subject's own clean photographs (high-resolution, well-exposed captures of the same person) as exemplars or as an explicit identity constraint. Now the model is not asked to invent eyes from the population average; it can supply the exact eye shape, the mole on the left cheek, the asymmetric hairline, the way this particular face creases when it smiles. This is identity-specific detail that no generic prior could know, because it is not in any population statistic. The collection acts as a per-subject memory that the restoration is allowed to copy from (Figure 11.9.1).
Why specificity wins is worth stating precisely, because it is the whole thesis in miniature. Both priors are "correct" in the weak sense that both produce a sharp, face-shaped output consistent with the blurry input. But correctness for a known person is not plausibility; it is identity fidelity. The generic prior favors the output that is most probable under "faces in general"; the personalized prior favors the one most probable under "this specific face." For a stranger, those nearly coincide. For someone recognizable (and recognizability is the only reason you cared to restore the photo) they diverge sharply, and the personalized answer is the one a friend would accept as them.
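The contrast can be made concrete with a toy linear-Gaussian model. This is a minimal sketch under stated assumptions, not a face restorer: the "face" is an 8-number feature vector, the degradation is 2x averaging plus noise, and both priors are isotropic Gaussians. The function name `map_restore` and all numeric settings are hypothetical choices made for the illustration.

```python
import numpy as np

def map_restore(y, A, prior_mean, prior_std, noise_std):
    """MAP estimate of x for y = A x + noise under an isotropic Gaussian prior."""
    d = A.shape[1]
    precision = A.T @ A / noise_std**2 + np.eye(d) / prior_std**2
    rhs = A.T @ y / noise_std**2 + prior_mean / prior_std**2
    return np.linalg.solve(precision, rhs)

rng = np.random.default_rng(0)
d = 8
population_mean = np.zeros(d)        # stands in for "faces in general"
subject = rng.normal(0.0, 1.0, d)    # the true identity (toy feature vector)

# Forward model: average adjacent pairs (2x downsampling), then add noise.
A = np.kron(np.eye(d // 2), np.full((1, 2), 0.5))
noise_std = 0.05
y = A @ subject + rng.normal(0.0, noise_std, d // 2)

# Personal collection: five clean captures of the same subject.
clean_photos = subject + rng.normal(0.0, 0.1, (5, d))

# Generic prior: wide, centered on the population.
generic = map_restore(y, A, population_mean, 1.0, noise_std)
# Personalized prior: narrow, centered on the subject's own photos.
personal = map_restore(y, A, clean_photos.mean(axis=0), 0.1, noise_std)

for name, x_hat in [("generic", generic), ("personal", personal)]:
    fit = np.linalg.norm(A @ x_hat - y)          # consistency with the degraded input
    identity = np.linalg.norm(x_hat - subject)   # distance from the true subject
    print(f"{name:9s} data misfit {fit:.3f}   identity error {identity:.3f}")
```

Both estimates explain the degraded observation closely, so the data alone cannot tell them apart. They differ in the detail that downsampling destroyed: the generic prior fills it from the population mean, the personalized prior from the subject's own photographs.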
11.9.2 Personalizing generative models
The same move powers a more recent and more visible application: teaching a large pretrained generative model a new subject from a few example images, so that it can then place that subject in scenes and styles the subject was never photographed in. This is the generative cousin of restoration: instead of recovering lost pixels of a known person, you synthesize novel pixels of them. It leans on the same insight, that a tiny personal set, folded into the model, beats anything the generic model knew on its own. Two methods define the space, and they sit at opposite ends of a cost/fidelity trade-off.
DreamBooth (Ruiz et al., 2023) fine-tunes the model's weights. Take a pretrained text-to-image diffusion model (Deep learning), bind the subject to a rare, unique token ("a photo of a [V] person"), and continue training on just a handful of images of the subject, typically three to five. The weights shift just enough to associate that token with this specific identity, so afterward a prompt like "a [V] person as an astronaut" renders that person in a spacesuit. To keep the fine-tuning from collapsing the whole class onto one face, forgetting what "person" means in general, DreamBooth adds a prior-preservation loss that interleaves generic examples of the class, generated by the pretrained model itself, with the subject images. That is a telling acknowledgment that personalization is a deliberate, controlled overfitting.
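The shape of that objective can be sketched in a few lines. This is a schematic stand-in, not the DreamBooth implementation: the "images" are random vectors, the denoiser is a hypothetical callable, there is no diffusion noise schedule, and the names `denoising_loss`, `personalization_loss`, and `prior_weight` are assumptions made for illustration. The class examples here are random draws, whereas in the method described above they come from the pretrained model.

```python
import numpy as np

def denoising_loss(denoise, images, prompt, rng, noise_level=0.5):
    """Mean squared error between clean images and the reconstruction of their noised copies."""
    noisy = images + noise_level * rng.normal(size=images.shape)
    return float(np.mean((denoise(noisy, prompt) - images) ** 2))

def personalization_loss(denoise, subject_imgs, class_imgs, rng, prior_weight=1.0):
    """Subject reconstruction term plus a weighted prior-preservation term."""
    subject_term = denoising_loss(denoise, subject_imgs, "a [V] person", rng)
    prior_term = denoising_loss(denoise, class_imgs, "a person", rng)
    return subject_term + prior_weight * prior_term, subject_term, prior_term

rng = np.random.default_rng(0)
subject_mean = rng.normal(size=16)
subject_imgs = subject_mean + 0.05 * rng.normal(size=(4, 16))  # a few photos of one subject
class_imgs = rng.normal(size=(32, 16))                         # diverse generic class examples

def collapsed(noisy, prompt):
    """A model that has forgotten the class: it returns the subject for every prompt."""
    return np.broadcast_to(subject_mean, noisy.shape)

total, subject_term, prior_term = personalization_loss(collapsed, subject_imgs, class_imgs, rng)
print(f"subject term {subject_term:.4f}   prior term {prior_term:.4f}   total {total:.4f}")
```

The collapsed model scores well on the subject term and badly on the prior term, so the combined loss penalizes exactly the failure the paragraph describes. With `prior_weight=0.0` the collapse would go unpunished.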
Textual inversion (Gal et al., 2022) is the lighter-weight cousin: it freezes the model entirely and instead learns a single new embedding, a pseudo-word in the model's vocabulary, that summons the subject when fed to the unchanged network. Nothing about the model's weights changes; all the new knowledge lives in one learned vector that you can save, share, and drop into prompts. It is cheaper, more portable, and less prone to damaging the base model, at some cost in fidelity for hard subjects. Both methods do the same essential thing, encoding a person or object into a generative prior from a tiny personal set, and differ only in where they put the new knowledge: DreamBooth in the weights, textual inversion in a learned embedding fed to a frozen model.
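The "frozen model, one learned vector" idea can be illustrated with a toy. This is a sketch under strong assumptions, not textual inversion itself: the frozen generator is a fixed linear map `W` rather than a diffusion model, the images are 16-number vectors, and training is plain gradient descent on a squared error. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim, image_dim = 4, 16

W = rng.normal(size=(image_dim, embed_dim))  # frozen "generator" weights
W_before = W.copy()                          # kept only to show W never changes

def generate(embedding):
    """Frozen generator: maps an embedding vector to an image vector."""
    return W @ embedding

# Five slightly varying "photos" of one subject.
subject_imgs = generate(rng.normal(size=embed_dim)) + 0.05 * rng.normal(size=(5, image_dim))

def loss(embedding):
    """Squared reconstruction error, summed over pixels and averaged over photos."""
    return float(np.mean(np.sum((generate(embedding) - subject_imgs) ** 2, axis=1)))

embedding = np.zeros(embed_dim)                 # the only trainable parameters
step = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2)  # 1 / Lipschitz constant of the gradient
initial_loss = loss(embedding)

for _ in range(200):
    residual = generate(embedding) - subject_imgs.mean(axis=0)
    grad = 2.0 * W.T @ residual                 # gradient of loss with respect to the embedding
    embedding -= step * grad

final_loss = loss(embedding)
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}; weights unchanged: {np.array_equal(W, W_before)}")
```

Everything learned about the subject ends up in the 4-number `embedding`, which could be saved and reused with the same frozen `W`. A weight-tuning approach in this toy would update `W` instead.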
11.9.3 The bargain and its ethics
The recurring empirical finding is blunt: for restoring or synthesizing a specific subject, the personalized prior wins decisively on identity fidelity — and it pays for that win twice. It pays in generality, because a model bent toward one face is, to that degree, worse at faces in general (the reason DreamBooth needs its prior-preservation loss). And it pays in overfitting and memorization: a prior built from a dozen images can simply regurgitate them, reproducing a training photo's background, pose, or lighting wholesale instead of generalizing — the personal-scale version of the same memorization risk that haunts large generative models. The specificity that makes the prior good is inseparable from the specificity that makes it brittle.
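One simple way to look for regurgitation is to compare each output against the personal collection and flag near-copies. The sketch below is an assumption-laden illustration: it uses raw pixel distance on toy vectors, and the function names and the threshold of 0.05 are hypothetical. A practical check would more plausibly compare perceptual or identity features, and would need a threshold tuned to the data.

```python
import numpy as np

def nearest_training_distance(sample, training_set):
    """Smallest root-mean-square pixel difference between a sample and any training image."""
    diffs = training_set - sample
    return float(np.sqrt(np.mean(diffs ** 2, axis=1)).min())

def looks_memorized(sample, training_set, threshold=0.05):
    """Flag a sample that sits closer than the threshold to some training image."""
    return nearest_training_distance(sample, training_set) < threshold

rng = np.random.default_rng(0)
training_set = rng.uniform(size=(12, 64))   # a dozen toy "photos" with values in [0, 1]

near_copy = training_set[3] + 0.01 * rng.normal(size=64)  # a training photo, barely perturbed
novel = rng.uniform(size=64)                              # an unrelated image

print("near copy flagged:", looks_memorized(near_copy, training_set))
print("novel image flagged:", looks_memorized(novel, training_set))
```

A check of this kind catches only wholesale copying. An output that reuses a training photo's pose or lighting while changing other pixels can pass it, which is why pixel distance is a weak proxy for the risk described above.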
The deeper cost is ethical, and it is not a side note — it is the structural consequence of the method working. A model that faithfully reproduces a particular person is, by construction, a tool for impersonation: the very identity fidelity that lets you restore Grandmother's blurred portrait also lets someone synthesize a convincing photograph of a person doing something they never did. Deepfakes are not an abuse of personalized priors; they are personalized priors used as designed, on a subject who did not consent. So the central questions become consent, ownership, and likeness — whose photos may be used to build whose prior, who may invoke it, and what recourse a person has against a faithful synthetic double. This is the personal-scale echo of the always-on privacy problem in Life logging cameras: there, the harm was a camera that recorded everyone without asking; here, it is a model that can depict anyone without asking.
For an ill-posed problem, the prior decides the answer — so for restoring or depicting a specific person, the right prior is their own photographs. A generic model trained on everyone fills the gap with the most plausible anyone; a personalized model trained on a tiny personal collection fills it with the actual you. This is the part's data-as-prior creed (Inpainting Using Millions of Photographs, Pix 2 GPS) turned inward: the collection shrinks from millions of strangers' photos to a dozen of your own, and that shrinkage is the whole point — it is small, but it is exactly the right collection. The same specificity that buys identity fidelity (over the generic average) also buys overfitting, memorization, and a consent problem, because a faithful model of one person is, unavoidably, a tool for impersonating them.