Part I: Image Processing
Chapter 8: Tools of the Trade: The Image Processing Stack

"Eight chapters of beautiful theory, and what production taught me is this: half of vision engineering is knowing which library to import, and the other half is remembering what it just did to your channel order."

A Battle-Hardened Imaging Pipeline

Chapter Overview

Every chapter in Part I taught a body of theory and then reached for a library to make it run: Chapter 3 built convolution by hand and then called cv2.filter2D; Chapter 5 derived homographies and then trusted cv2.warpPerspective; Chapter 7 ended with denoisers that scikit-image ships as one-liners. What we never paused to do is compare the tools themselves: why four different libraries exist for what looks like one job, which one to reach for first, how to make any of them fast, and where the test data and quality numbers that validate your pipeline come from. This chapter is that pause, the pit stop at the end of Part I.

It is built as a reference, not a narrative. Section 8.1 maps the library landscape: OpenCV, scikit-image, Pillow, and SciPy's ndimage, their design philosophies, their incompatible conventions (BGR versus RGB, integers versus floats), and a Rosetta stone of equivalent calls. Section 8.2 is about speed: why Python loops lose by factors of hundreds, how NumPy vectorization, OpenCV's optimized backends, Numba, and GPU arrays each buy back performance, and how to measure honestly before optimizing at all.
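The speed gap between Python loops and vectorized code is easy to see for yourself. The following is a minimal sketch, not the benchmark from Section 8.2: the function names `brighten_loop` and `brighten_vectorized`, the gain of 1.5, and the 128 by 128 test image are all illustrative assumptions. It checks that both versions agree before timing either, then takes the best of several runs, which is one simple way to measure honestly.

```python
import timeit

import numpy as np


def brighten_loop(img, gain):
    """Scale and saturate one pixel at a time in pure Python."""
    out = np.empty_like(img)
    height, width = img.shape
    for y in range(height):
        for x in range(width):
            # Truncate toward zero, then saturate at the uint8 maximum.
            out[y, x] = min(255, int(float(img[y, x]) * gain))
    return out


def brighten_vectorized(img, gain):
    """Same operation expressed as whole-array NumPy calls."""
    return np.clip(img * gain, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)

# Correctness first: a fast wrong answer is not an optimization.
assert np.array_equal(brighten_loop(img, 1.5), brighten_vectorized(img, 1.5))

# Best-of-N timing reduces noise from other processes on the machine.
t_loop = min(timeit.repeat(lambda: brighten_loop(img, 1.5), number=1, repeat=3))
t_vec = min(timeit.repeat(lambda: brighten_vectorized(img, 1.5), number=5, repeat=3)) / 5

print(f"loop: {t_loop * 1e3:.2f} ms, vectorized: {t_vec * 1e3:.3f} ms, "
      f"speedup: {t_loop / t_vec:.0f}x")
```

The exact speedup depends on the machine, the image size, and the NumPy build, so treat the printed ratio as a local measurement rather than a constant.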

Section 8.3 gathers the experimental infrastructure: the standard test images every paper uses, the datasets that benchmark denoising, super-resolution, and deblurring, and the tooling that computes PSNR, SSIM, LPIPS, and their no-reference cousins, including the sharp edges in their default parameters. Section 8.4 closes with a curated, annotated reading list: the books, courses, documentation trails, and classic papers behind Part I, plus a lightweight workflow for staying current without drowning in arXiv.
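One example of the kind of sharp edge a metric can hide is the data range. The sketch below is a hand-rolled PSNR in NumPy, using the standard definition 10 log10(R^2 / MSE) with R the data range; the helper name `psnr` and the synthetic noisy image are assumptions for illustration, not the tooling Section 8.3 describes. It shows that the same image pair scores identically as uint8 or float only when the range matches the representation.

```python
import numpy as np


def psnr(reference, test, data_range):
    """Peak signal-to-noise ratio in dB for a given data range."""
    # Cast to float64 first: subtracting uint8 arrays would wrap around.
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


rng = np.random.default_rng(0)
clean_u8 = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noise = rng.normal(0.0, 10.0, size=clean_u8.shape)
noisy_u8 = np.clip(clean_u8 + noise, 0, 255).round().astype(np.uint8)

# The same pair of images in the float [0, 1] convention.
clean_f = clean_u8 / 255.0
noisy_f = noisy_u8 / 255.0

psnr_u8 = psnr(clean_u8, noisy_u8, data_range=255)    # correct for uint8
psnr_f = psnr(clean_f, noisy_f, data_range=1.0)       # correct for floats
psnr_wrong = psnr(clean_f, noisy_f, data_range=255)   # range mismatch

print(f"uint8: {psnr_u8:.2f} dB, float: {psnr_f:.2f} dB, "
      f"mismatched range: {psnr_wrong:.2f} dB")
```

The mismatched call inflates the score by 20 log10(255), roughly 48 dB, which is why it pays to pass the data range explicitly rather than rely on whatever a library infers from the dtype.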

Read Section 8.1 now if you read nothing else; it will save you the single most common class of bug in imaging code. Keep the rest bookmarked and return when a pipeline is too slow, an experiment needs data, or a metric needs computing. This is also the first of the book's four "Tools of the Trade" chapters: Chapter 17 does the same service for classical vision, Chapter 29 for deep learning, and Chapter 38 for generative models. Each part of the book ends by consolidating its workshop.
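The class of bug that Section 8.1 targets can be sketched with NumPy alone, without importing any of the four libraries. The two-pixel image and the helper `looks_red` below are hypothetical, chosen only to make the failure visible: the same array read in the wrong channel order turns red into blue, and uint8 arithmetic wraps around silently.

```python
import numpy as np

# A 1x2 image in RGB order: one pure-red pixel, one pure-blue pixel.
rgb = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps RGB and BGR channel order.
bgr = rgb[..., ::-1]


def looks_red(pixel):
    """Hypothetical check that assumes the pixel is in RGB order."""
    r, g, b = (int(v) for v in pixel)
    return r > 200 and g < 50 and b < 50


print(looks_red(rgb[0, 0]))  # True: the data matches the assumed order
print(looks_red(bgr[0, 0]))  # False: same pixel, wrong order, red reads as blue

# Integer versus float conventions: uint8 in [0, 255], float in [0, 1].
as_float = rgb.astype(np.float64) / 255.0
round_trip = np.round(as_float * 255.0).astype(np.uint8)

# uint8 array arithmetic wraps modulo 256 instead of saturating.
a = np.array([200], dtype=np.uint8)
b = np.array([100], dtype=np.uint8)
wrapped = a + b                                                   # 300 mod 256
saturated = np.clip(a.astype(np.int16) + b, 0, 255).astype(np.uint8)

print(wrapped[0], saturated[0])  # 44 255
```

None of these operations raises an error or a warning, which is what makes the bug class so common: the pipeline keeps running and only the picture is wrong.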

Big Picture

Mastery of image processing is mastery of a stack, not a single library: NumPy arrays underneath, four complementary libraries on top, performance tools beside them, and test data plus metrics to keep everyone honest. The theory of Chapters 1 through 7 does not change; the tooling decides whether applying it takes an afternoon or a sprint.

Learning Objectives

Prerequisites

This chapter consolidates all of Part I, so any of Chapters 1 through 7 enriches it, but only two chapters are essential. Chapter 0: Foundations: The Python Imaging Stack installed the libraries this chapter compares, and Chapter 1: Digital Image Fundamentals established images as NumPy arrays with dtypes and channel conventions, the vocabulary in which every comparison here is phrased. The performance discussion in Section 8.2 leans on the convolution costs analyzed in Chapter 3: Spatial Filtering & Convolution, and the metrics tooling in Section 8.3 builds on the PSNR and SSIM definitions from Chapter 1 and the restoration tasks of Chapter 7: Image Restoration & Enhancement.

Chapter Roadmap

Fun Fact

The four libraries this chapter compares were born within about fifteen years of one another, but in four different worlds: Pillow descends from PIL, started in 1995 to put images on the early web; OpenCV began at Intel in 1999 to sell more CPUs by making vision fast; SciPy's ndimage grew out of scientific Python in the early 2000s; and scikit-image emerged from the SciPy community in 2009 to give NumPy users a Pythonic alternative. None of them set out to be part of a "stack". Your import block is an accident of history that happens to work.

What's Next?

With the Part I workshop organized, the book changes gears. Chapter 9: Edges, Lines & Curves opens Part II: Classical Computer Vision, where the question stops being "how do I improve this image?" and becomes "what is in it?". The gradients of Chapter 3 sharpen into the Canny edge detector, and the Hough transform begins pulling geometric structure out of pixels. Everything in this chapter's toolbox comes along for the ride.

Bibliography & Further Reading

Foundational Papers

Harris, C.R. et al. "Array programming with NumPy." Nature 585 (2020). Nature (open access)

The paper describing the array model underneath every library in this chapter: dtypes, strides, broadcasting, and the vectorization story Section 8.2 builds on.

van der Walt, S. et al. "scikit-image: image processing in Python." PeerJ 2:e453 (2014). PeerJ (open access)

The design paper for scikit-image: why it standardizes on floats in [0, 1], how it relates to NumPy and SciPy, and the API philosophy Section 8.1 contrasts with OpenCV's.

Virtanen, P. et al. "SciPy 1.0: fundamental algorithms for scientific computing in Python." Nature Methods 17 (2020). Nature Methods (open access)

Documents the SciPy project, including the ndimage subpackage whose n-dimensional filters quietly power scikit-image internals.

Wang, Z., Bovik, A., Sheikh, H., and Simoncelli, E. "Image Quality Assessment: From Error Visibility to Structural Similarity." IEEE TIP (2004). IEEE Xplore

The SSIM paper. Section 8.3 shows that scikit-image's defaults do not match this paper's protocol, and which flags restore it.

Zhang, R., Isola, P., Efros, A., Shechtman, E., and Wang, O. "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric." CVPR (2018). arXiv:1801.03924

Introduces LPIPS, the learned perceptual metric of Section 8.3, which often parts ways with PSNR in exactly the cases where human judgment does too.

Books

Gonzalez, R. and Woods, R. "Digital Image Processing." Pearson, 4th ed. (2018). Companion website

The standard reference textbook for everything in Part I; Section 8.4 maps its chapters onto this book's.

Szeliski, R. "Computer Vision: Algorithms and Applications." Springer, 2nd ed. (2022). Free online edition

Free, encyclopedic, and current; its Chapter 3 condenses most of this book's Part I into eighty dense pages with full references.

Tools & Libraries

OpenCV 4.x documentation. docs.opencv.org

The authoritative API reference for the most heavily used library in this book, including the optimization and Transparent API material of Section 8.2.

Pillow documentation. pillow.readthedocs.io

Reference for the friendly fork of PIL: image I/O across dozens of formats, EXIF handling, and the Image object model Section 8.1 contrasts with raw arrays.

CuPy documentation. docs.cupy.dev

The NumPy-compatible GPU array library of Section 8.2, including cupyx.scipy.ndimage, a GPU mirror of the SciPy filters used throughout Part I.

Riba, E. et al. "Kornia: an Open Source Differentiable Computer Vision Library for PyTorch." WACV (2020). arXiv:1910.02190

Classical image processing reimplemented as differentiable PyTorch operators; the bridge between this chapter's tools and the deep learning stack of Part III.

Chen, C. and Mo, J. "IQA-PyTorch: Toolbox for Image Quality Assessment." GitHub repository

One pip install pyiqa away from forty-plus quality metrics, classical and learned; the workhorse of Section 8.3's metric tooling.

Datasets & Benchmarks

USC-SIPI Image Database. sipi.usc.edu/database

The historical home of the classic test images, maintained since 1977; Section 8.3 explains which of its residents are still citable.

Kodak Lossless True Color Image Suite. r0k.us/graphics/kodak

Twenty-four film scans that became the default benchmark for denoising and compression papers; small, free, and everywhere.

Agustsson, E. and Timofte, R. "NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study (DIV2K)." CVPRW (2017). Dataset page

The 2K-resolution dataset behind nearly every modern super-resolution benchmark, used by the restoration evaluations referenced in Section 8.3.