The book reads best front to back: it was sequenced so that every idea lands on ground prepared by an earlier chapter. But it was also built for readers with deadlines, and the dependencies between parts are explicit enough that several shorter paths work well. This page maps the dependencies first, then offers four tested reading paths.
How the Parts Depend on Each Other
Part I is the foundation for everything: its array conventions, convolution, interpolation, and frequency ideas are assumed by all later parts. Part II splits into two strands: the feature-and-matching strand (Chapters 9 to 11, 15, 16) enriches Part III but is not strictly required by it, while the geometry strand (Chapters 12 to 14) is a hard prerequisite for the 3D material later. Part III is the gateway to Part IV: generative models are deep networks, and Part IV assumes you can read PyTorch, train a model, and reason about CNNs and transformers. Table F5.1 summarizes the load-bearing edges.
| If you are heading for | Make sure you have read |
|---|---|
| Part III (deep learning) | Chapter 0, plus Chapters 2, 3, and 5 from Part I |
| Chapter 27 (depth, 3D, NeRF, splatting) | Chapters 12 to 14 from Part II |
| Part IV (generative models) | Chapters 18 to 22 and 25 from Part III, plus Chapters 3, 4, and 7 from Part I |
| Chapter 36 (video, 3D, world generation) | Chapters 26 and 27, plus Chapter 33 |
| The capstone project | At least one full pass through Parts I and III |
Table F5.1. The dependency edges that matter when skipping ahead.
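If you would rather check a skip-ahead plan mechanically, the rows of the table can be treated as a small lookup. The sketch below is illustrative only and is not part of the book's code: the names `PREREQS` and `missing_prereqs` are hypothetical, the capstone row is left out because it names whole parts rather than chapters, and the code makes no attempt to model which part a given chapter belongs to.

```python
# Direct prerequisites transcribed from Table F5.1.
# Keys are the table's "heading for" entries; values are chapter numbers.
# The capstone row is omitted: it names whole parts, not chapters.
PREREQS = {
    "Part III": {0, 2, 3, 5},
    "Chapter 27": {12, 13, 14},
    "Part IV": {18, 19, 20, 21, 22, 25, 3, 4, 7},
    "Chapter 36": {26, 27, 33},
}


def missing_prereqs(target, already_read):
    """Return the sorted chapters still to read before `target`.

    Rows are chained: if a required chapter has its own row in the
    table (as Chapter 27 does), that row's chapters are required too.
    """
    needed = set()
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for chapter in PREREQS.get(node, ()):
            needed.add(chapter)
            # Follow the chapter's own row, if the table has one.
            stack.append(f"Chapter {chapter}")
    return sorted(needed - set(already_read))


if __name__ == "__main__":
    print(missing_prereqs("Chapter 36", already_read={26, 33}))
    print(missing_prereqs("Part III", already_read={0, 2}))
```

For a reader who has finished Chapters 26 and 33 and asks about Chapter 36, the function returns Chapters 12, 13, 14, and 27, because the Chapter 27 row is followed in turn.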
Three kinds of material sit outside the dependency graph and can be visited at any time: the Tools of the Trade chapters (8, 17, 29, 38) are standalone references; Appendix A backstops the mathematics on demand; and Appendix B's dataset catalog is useful from the first project onward.
Path 1: The Engineer
You want to ship a working vision feature soon and deepen later. Read Chapter 0 carefully, then Chapters 2, 3, and 5 from Part I (histograms, filtering, geometric transforms: the preprocessing you will actually write). From Part II take Chapters 9 and 10, plus Chapter 12 if your problem involves real cameras. Then commit to the Part III core: Chapters 18 through 21, then 23 (detection) or 24 (segmentation) depending on your task, and Chapter 28 when it is time to deploy. Keep Chapters 8 and 29 open as references throughout. Return for the rest of Part II and for Part IV when the shipped feature buys you time.
Path 2: The Researcher
You want depth, current methods, and the trail into the literature. Skim Part I for notation (Chapters 3, 4, and 7 reward close reading; their ideas return as learned methods), read the geometry strand of Part II (Chapters 12 to 14) in full, then read Parts III and IV completely and in order. Pay particular attention to the Research Frontier callouts and the chapter bibliographies, which are curated as entry points into the 2024 to 2026 literature. Appendix B gives you the benchmark landscape for positioning experiments.
Path 3: The Generative Practitioner
You came for diffusion models and want the shortest honest route there. Read Chapter 0, then Chapters 3, 4, and 7 from Part I: convolution, frequency, and denoising are the conceptual backbone of diffusion. Skip Part II for now. From Part III read Chapters 18, 19, 21, 22, and 25 (networks, CNNs, training, transformers, and CLIP-style foundation models; Chapter 24 is worth adding for promptable segmentation in editing workflows). Then read Part IV in full, in order: the chapters from VAEs through GANs to diffusion are written as one continuous argument, and Chapters 34 and 35 assume their predecessors. Do not skip Chapter 37; evaluation, provenance, and licensing are part of the craft.
Path 4: Self-Study
You have time and want mastery. Read front to back at a steady one to two chapters per week; at that pace the book takes six to nine months. Do the exercises: the conceptual ones before moving on, the coding ones for any chapter touching your interests, the analysis ones whenever a claim surprises you. Start sketching your capstone while reading Part III so that Part IV can feed it. Appendix C offers ready-made week-by-week schedules if you prefer external structure, and Appendix D expands the pathways on this page with finer-grained guidance per audience.
Whichever Path You Take
Run the code; reading about images without looking at any is a strange diet. Use the cross-reference links generously; they exist so that a forgotten prerequisite is one click away rather than one guilt trip away. And when a chapter ends, glance at its What's Next section even if you are about to jump elsewhere: it is the narrative thread that keeps the four parts feeling like one book.