Section 6.6: Contours, Moments & Shape Descriptors

"Translate me, rotate me, shrink me to a thumbnail: my seven numbers will not budge. Hold me up to a mirror, though, and exactly one of them changes its mind."
A Defiantly Invariant Hu Moment

Big Picture

A blob's boundary carries nearly everything worth knowing about its shape, and this section compresses that boundary, step by deliberate step, into a handful of numbers that do not change when the object slides, turns, or shrinks, so that recognizing a shape becomes comparing two short vectors. The pipeline runs: trace the boundary into a polygon with a hierarchy of holes, measure the polygon (perimeter, area, vertices, convexity), then distill it through the moment invariance chain into descriptors. At the end we assemble the chapter's promised payoff: a complete, auditable shape classifier in a page of code.

Section 6.5 collapsed shapes onto their centerlines; this final section walks their edges. We have met boundaries twice already: the morphological gradient of Section 6.3 produced a rim of boundary pixels, and Section 6.4's regionprops quietly consumed boundaries to report perimeters and orientations it did not explain. Both left the boundary as an unordered set. The upgrade in this section is order: a contour is the boundary traced in sequence, a closed polygon you can walk, simplify, integrate over, and compare. Order is what turns a rim into geometry, and geometry into recognition, completing the journey this chapter began when a threshold from Chapter 2 first split the world into foreground and background.

1. From Pixel Sets to Boundaries: Border Following Intermediate

The standard tracing algorithm is Suzuki and Abe's border following (1985), the engine inside cv2.findContours. It raster-scans the binary image for a foreground pixel whose left neighbor is background (a border start), then walks counterclockwise around the component, keeping foreground on one side, until it returns to the start. Crucially, it does this for every border in the image, outer boundaries and hole boundaries alike, and records who surrounds whom. The connectivity conventions of Section 6.1 are baked in exactly as Rosenfeld prescribed: 8-connected foreground borders, 4-connected holes, which is why a diagonal chain of pixels traces as one contour rather than many.

The "who surrounds whom" record is the contour hierarchy, and choosing how much of it to keep is the first API decision. cv2.RETR_EXTERNAL keeps only outermost boundaries (fastest, blind to holes); cv2.RETR_LIST returns every contour with no structure; cv2.RETR_CCOMP arranges them into exactly two levels, outer boundaries and their holes, which is the workhorse mode for parts on a tray; and cv2.RETR_TREE keeps the full nesting tree, which a washer lying inside a larger ring genuinely needs. The companion flag cv2.CHAIN_APPROX_SIMPLE compresses straight runs of boundary pixels to their endpoints (a rectangle becomes 4 points instead of hundreds) while cv2.CHAIN_APPROX_NONE keeps every boundary pixel. Figure 6.6.1 shows a three-deep scene and the tree that RETR_TREE builds from it, including the four-integer record [next, previous, first_child, parent] that OpenCV returns per contour.

Figure 6.6.1 Border following finds every boundary and records the nesting. Left: a plate (C0) with a hole (C1) containing an island (C2), beside a separate disc (C3). Right: the RETR_TREE hierarchy as OpenCV encodes it, one [next, previous, first_child, parent] quadruple per contour; C0 and C3 are siblings at depth 0, C1 is C0's child, C2 is C1's child. Nesting depth parity tells you instantly whether a contour bounds an object or a hole.

The code below traces the scene of Figure 6.6.1 and walks the hierarchy, computing each contour's depth by following parent pointers. Depth parity then classifies every contour without any geometry at all, an idea Section 6.4's Euler number expressed as a single global count and the hierarchy now itemizes:

# Trace every border with RETR_TREE and recover each contour's nesting depth
# by climbing parent pointers. Depth parity alone (even = object, odd = hole)
# classifies every border without touching pixel geometry.
import cv2

bw = cv2.imread("plate_scene.png", cv2.IMREAD_GRAYSCALE)
contours, hierarchy = cv2.findContours(bw, cv2.RETR_TREE,
                                       cv2.CHAIN_APPROX_SIMPLE)
print(f"{len(contours)} contours found")
for i, (nxt, prev, child, parent) in enumerate(hierarchy[0]):
    depth, p = 0, parent
    while p != -1:                      # climb parent links to the root
        depth += 1
        p = hierarchy[0][p][3]
    kind = "object border" if depth % 2 == 0 else "hole border"
    print(f"C{i}: depth {depth} ({kind:13s}) "
          f"{len(contours[i]):4d} pts  child={child:2d} parent={parent:2d}")
# Output:
# 4 contours found
# C0: depth 0 (object border )  482 pts  child= 1 parent=-1
# C1: depth 1 (hole border   )  168 pts  child= 2 parent= 0
# C2: depth 2 (object border )   62 pts  child=-1 parent= 1
# C3: depth 0 (object border )  139 pts  child=-1 parent=-1

Code Fragment 1: Walking the RETR_TREE hierarchy of Figure 6.6.1's scene: each contour's nesting depth is recovered by climbing parent pointers, and depth parity alone separates object borders (even) from hole borders (odd), no pixel geometry required.

2. Contour Geometry: Perimeter, Area & Polygon Simplification Beginner

Once a boundary is an ordered polygon, classical geometry applies verbatim. cv2.contourArea evaluates the shoelace formula (Green's theorem specialized to polygons):

$$ A \;=\; \frac{1}{2} \,\Bigl|\, \sum_{i=0}^{n-1} \bigl( x_i\, y_{i+1} - x_{i+1}\, y_i \bigr) \Bigr|, $$

which is why its answer differs slightly from the pixel count Section 6.4 reported as area: one integrates a polygon threaded through pixel centers, the other counts squares. The discrepancy is worth a digit of caution on small blobs and nothing more. Perimeter via cv2.arcLength deserves more caution: summing pixel steps along a staircase systematically overestimates the smooth boundary length (a digital diagonal line of length $L$ measures up to about 8 percent long), so any spec written in millimeters of perimeter should be calibrated against parts of known size rather than trusted absolutely. Derived ratios inherit the bias; we return to this in Exercise 6.6.3.

The third primitive is simplification. The Douglas-Peucker algorithm inside cv2.approxPolyDP recursively keeps only vertices that deviate from a straight chord by more than a tolerance $\varepsilon$, typically set as a fraction of the perimeter. It converts a noisy 400-point boundary into the 4-point quadrilateral that the document scanner of Chapter 5 hunted for, or the 6-point hexagon that will identify a nut in Section 5 below. Together with the convex hull (cv2.convexHull) these produce the working measurements of contour analysis:

# The contour-geometry starter kit on the largest object: shoelace area,
# staircase perimeter, Douglas-Peucker simplification, and convex hull, then
# two first shape descriptors (circularity and solidity) built from them.
import cv2
import numpy as np

contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
c = max(contours, key=cv2.contourArea)            # largest object

area  = cv2.contourArea(c)                        # shoelace / Green's theorem
perim = cv2.arcLength(c, closed=True)
approx = cv2.approxPolyDP(c, 0.02 * perim, closed=True)
hull   = cv2.convexHull(c)

print(f"area {area:.0f} px^2, perimeter {perim:.1f} px")
print(f"vertices after Douglas-Peucker: {len(approx)}")
print(f"circularity {4*np.pi*area/perim**2:.3f}, "
      f"solidity {area/cv2.contourArea(hull):.3f}")
# Representative output for a slightly dog-eared rectangular label:
# area 50412 px^2, perimeter 916.4 px
# vertices after Douglas-Peucker: 4
# circularity 0.754, solidity 0.988

Code Fragment 2: The contour-geometry starter kit on one object: shoelace area, staircase perimeter, Douglas-Peucker simplification at 2 percent of perimeter (recovering the label's 4 corners despite boundary noise), and two first descriptors, circularity and solidity, computed from the same ingredients.

3. Moments & the Invariance Chain Intermediate

Geometry measures a shape as placed; recognition needs numbers that survive placement. The instrument is the family of image moments, borrowed directly from mechanics: treat the foreground as a sheet of unit mass and compute

$$ m_{pq} \;=\; \sum_{(x,\,y) \in A} x^p\, y^q . $$

The zeroth moment $m_{00}$ is the area; the first moments give the centroid $\bar x = m_{10}/m_{00}$, $\bar y = m_{01}/m_{00}$, the formula Section 6.4 promised we would derive. Second moments describe how mass spreads around the centroid, and from them comes the orientation that regionprops reports: the major axis of the best-fit ellipse sits at

$$ \theta \;=\; \tfrac{1}{2}\, \operatorname{atan2}\!\bigl( 2\mu_{11},\; \mu_{20} - \mu_{02} \bigr), $$

where $\operatorname{atan2}(b, a)$ is the two-argument arctangent (the quadrant-aware angle of the vector $(a, b)$, available as numpy.arctan2) and the $\mu_{pq}$ are the central moments, the same sums computed about the centroid, $\mu_{pq} = \sum (x - \bar x)^p (y - \bar y)^q$. Centering is the first rung of a ladder that Figure 6.6.2 lays out in full, and each rung discards exactly one nuisance.

Rung one, translation. Central moments do not change when the object translates, since the centroid moves with it. Rung two, scale. Dividing by the right power of area, $\eta_{pq} = \mu_{pq} / m_{00}^{\,1 + (p+q)/2}$, yields normalized central moments that also ignore size; the exponent is exactly the one that cancels scale, since enlarging a shape by a factor $s$ multiplies $\mu_{pq}$ by $s^{\,2 + p + q}$ and the area $m_{00}$ by $s^2$.

Rung three, rotation. The final rung is Hu's 1962 construction: seven polynomial combinations of the $\eta_{pq}$ up to third order that are additionally unchanged by rotation. The first two are

$$ \phi_1 = \eta_{20} + \eta_{02}, \qquad \phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2, $$

and five more follow in the same spirit. The seventh is special: it is a skew invariant whose magnitude survives reflection but whose sign flips when the shape is mirrored, because mirroring negates every central moment with an odd power of one coordinate, and $\phi_7$ alone is built from an odd combination. One bit of asymmetry survives all that quotienting, and Section 5's practical example will spend exactly that bit.

Figure 6.6.2 The moment invariance chain. Raw moments depend on position, scale, and rotation; subtracting the centroid forgets translation, normalizing by a power of area forgets scale, and Hu's seven polynomial combinations forget rotation. Each stage discards a nuisance variable on purpose, and quantities you still need (the centroid as a pick-point, $\theta$ as a grasp angle) must be read off before the stage that destroys them.

Hu invariants have a wide dynamic range ($\phi_1$ might be $10^{-3}$ while $\phi_7$ is $10^{-21}$), so in practice everyone compares them on a signed log scale, $h_i = -\operatorname{sign}(\phi_i)\,\log_{10}\lvert\phi_i\rvert$. The demonstration below puts the whole chain to the test: a wrench mask is rotated 37 degrees and shrunk to 60 percent, then mirrored, and the log-scale invariants are printed for all three. Watch the last column:

# Test the invariance chain experimentally: rotate-and-scale a wrench mask and
# also mirror it, then print the seven log-scale Hu values for all three. Six
# values hold steady; only the sign of phi_7 flips under the reflection.
import cv2
import numpy as np

def log_hu(mask):
    """Seven Hu invariants on a signed log10 scale."""
    hu = cv2.HuMoments(cv2.moments(mask, binaryImage=True)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

mask = cv2.imread("wrench_mask.png", cv2.IMREAD_GRAYSCALE)
h, w = mask.shape
M = cv2.getRotationMatrix2D((w/2, h/2), angle=37, scale=0.6)
moved    = cv2.warpAffine(mask, M, (w, h))    # rotated + scaled copy
mirrored = cv2.flip(mask, 1)                  # left-right reflection

for name, m in [("original", mask), ("rot 37, scale 0.6", moved),
                ("mirrored", mirrored)]:
    print(f"{name:18s}", np.round(log_hu(m), 2))
# Representative output (phi_1 .. phi_7, log scale):
# original           [ 2.95  6.61  9.69 10.31 20.43 13.66 -20.65]
# rot 37, scale 0.6  [ 2.95  6.60  9.70 10.33 20.47 13.66 -20.70]
# mirrored           [ 2.95  6.61  9.69 10.31 20.43 13.66  20.65]

Code Fragment 3: The invariance chain verified experimentally: rotating and rescaling the wrench perturbs the seven log-scale Hu values only in the second decimal (discretization noise from resampling), while mirroring leaves six of them untouched and flips precisely the sign of $\phi_7$, the one skew invariant that remembers handedness.

Key Insight: Invariance Is a Budget You Spend Deliberately

Every rung of the chain in Figure 6.6.2 destroys information, and that is its job: position, scale, and orientation are nuisances when classifying, so you quotient them out. But they are the answer when gripping or aligning, so a real pipeline reads the centroid and $\theta$ off the early rungs, then climbs the rest of the ladder for recognition. And sometimes an invariance is a bug: a classifier built on $\phi_1$ through $\phi_6$ (or on matchShapes, which compares magnitudes) is constitutionally incapable of telling a part from its mirror image. Choose which transformations your numbers should forget; do not let the library choose for you.

Fun Fact

Hu published his seven invariants in 1962, and for decades they were taught, implemented, and shipped as a canonical set. It took until 2000 for Jan Flusser to prove the set is neither independent ($\phi_3$ is expressible from the others) nor complete. Thirty-eight years of production use before anyone audited the algebra; the fix (Flusser's complete invariant systems) exists, but the seven originals remain the default in every library, a monument to "good enough, shipped first."

4. A Descriptor Vector for Every Blob Beginner

Moments are one family of shape numbers; the geometry of Section 2 supplies the rest. In practice a small, interpretable descriptor vector covers an enormous range of industrial decisions. Circularity, $4\pi A / P^2$, is 1 for a perfect disk and falls as the boundary wrinkles or elongates. Aspect ratio from cv2.minAreaRect captures elongation without caring about orientation. Extent (area over bounding-box area) separates crosses from blobs; solidity (area over convex-hull area) drops the moment a bite, crack, or concavity appears; the Douglas-Peucker vertex count reads off polygonal class directly (4 means quadrilateral, 6 means hexagon); and the hierarchy's hole count from Section 1 is the topological feature no amount of boundary noise can fake. Each is one or two lines on top of code we have already written, and each has a failure mode you can state in a sentence, which is exactly what makes the vector auditable.

When classes are too subtle for thresholds on individual descriptors, the next step up is comparing whole invariant vectors against labeled templates, which is nearest-neighbor classification in descriptor space and the conceptual ancestor of everything in Chapter 16. OpenCV ships the comparison ready-made:

Library Shortcut: cv2.matchShapes Is a Shape Metric in One Line

Hand-rolling template comparison means computing moments, building Hu vectors, applying the signed log map, guarding against zero moments, and choosing a distance: about 30 lines of fussy code. cv2.matchShapes(c1, c2, cv2.CONTOURS_MATCH_I1, 0) is one line, a 30-to-1 reduction, and handles all of it internally: it forms $m_i = \operatorname{sign}(\phi_i) \log_{10} \lvert \phi_i \rvert$ for both shapes and returns $\sum_i \lvert 1/m_i^A - 1/m_i^B \rvert$ (methods I2 and I3 are alternative formulas on the same ingredients). Near-zero is a match. It accepts contours or whole grayscale images, and because it compares magnitudes only, it is mirror-blind by design; when handedness matters, check the sign of $\phi_7$ yourself as in Section 3.

Common Misconception: Hu Moments Are Not a Robust Shape Recognizer

The clean invariance demo of Section 3 (six numbers barely twitching under rotation and rescaling) tempts learners to treat Hu moments and cv2.matchShapes as a dependable general-purpose shape recognizer. In practice they are fragile and weakly discriminative. They are global integrals over every foreground pixel, so a single occlusion, a touching neighbor, a broken-off corner, or even the staircase discretization of Section 2 shifts the higher-order values noticeably (note how $\phi_5$ through $\phi_7$ already wander in the demo's second decimal from resampling alone). The dynamic range is enormous, so a distance on the raw $\phi_i$ is dominated by $\phi_1$ unless you take the signed-log transform first, and even then very different shapes can land at nearly the same Hu vector because seven scalars cannot capture fine boundary detail. Use them where they earn their keep: coarse, clean, fully-visible silhouettes against a controlled background, as a fast pre-filter or the auditable handedness check of the bracket example. When parts can be occluded, cluttered, or subtly different, reach instead for the richer descriptors of Chapter 10 and the learned features of Chapter 25.

5. The Complete Shape Classifier Intermediate

Now the payoff the chapter overview promised: a complete shape classifier in a page of code, every stage of which this chapter built. Threshold with Otsu (Chapter 2), clean with an opening (Section 6.3), trace with hierarchy (Section 1), describe (Sections 2 to 4), and decide with rules a quality auditor can read aloud. The parts tray contains screws, washers, and hex nuts; topology and three descriptors are enough to tell them apart:

# The chapter's capstone: threshold, open, trace with a two-level hierarchy,
# describe each outer contour with five numbers, and decide with four readable
# rules. Topology (the presence of a hole) does the heaviest classification lifting.
import cv2
import numpy as np

def describe(c):
    """Descriptor vector for one outer contour."""
    area  = cv2.contourArea(c)
    perim = cv2.arcLength(c, True)
    circularity = 4 * np.pi * area / perim**2 if perim > 0 else 0.0
    solidity = area / cv2.contourArea(cv2.convexHull(c))
    (w, h) = cv2.minAreaRect(c)[1]                  # oriented box sides
    aspect = max(w, h) / max(min(w, h), 1e-6)
    verts  = len(cv2.approxPolyDP(c, 0.02 * perim, True))
    return dict(area=area, circularity=circularity,
                solidity=solidity, aspect=aspect, vertices=verts)

def classify(d, has_hole):
    if has_hole and d["circularity"] > 0.80:        # round ring
        return "washer"
    if has_hole and 5 <= d["vertices"] <= 7:        # hexagonal ring
        return "nut"
    if not has_hole and d["aspect"] > 2.5:          # long and solid
        return "screw"
    return "reject"                                 # fail loud, not plausible

img = cv2.imread("tray.png", cv2.IMREAD_GRAYSCALE)
_, bw = cv2.threshold(img, 0, 255,
                      cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
bw = cv2.morphologyEx(bw, cv2.MORPH_OPEN, se)       # de-speckle first

contours, hier = cv2.findContours(bw, cv2.RETR_CCOMP,
                                  cv2.CHAIN_APPROX_SIMPLE)
counts = {}
for i, c in enumerate(contours):
    if hier[0][i][3] != -1:          # skip hole borders themselves
        continue
    if cv2.contourArea(c) < 200:     # dust gate
        continue
    has_hole = hier[0][i][2] != -1   # any child contour means a hole
    label = classify(describe(c), has_hole)
    counts[label] = counts.get(label, 0) + 1
print(counts)
# Representative output on a 24-piece kit tray:
# {'screw': 12, 'washer': 8, 'nut': 4}

Code Fragment 4: The chapter's capstone in one page: Otsu threshold, morphological opening, two-level contour retrieval, a five-number descriptor per outer contour, and four readable rules that sort a fastener tray into screws, washers, and nuts, with topology (the hole) doing the heaviest lifting and an explicit reject class for everything else.

Three habits in this code repay study. The reject class exists because a rule-based system should fail loudly on the unexpected, never shrug a foreign object into the nearest class. The hole feature comes from the hierarchy, not from any intensity heuristic, so grease stains cannot fake it. And every threshold (0.80, 2.5, 200) traces to a measurable property of golden samples, so recalibrating for a new part is a measurement session, not a retraining run. When classes finally overlap in every descriptor you can write down, that is the boundary of this method and the on-ramp to the learned features of Chapter 10 and the classifiers of Chapter 16. The chapter's five verbs (threshold, clean, label, measure, decide) are now all in your hands; the Hands-On Lab at the end of this section asks you to assemble them into one complete, runnable inspection station that you build and run end to end.

Practical Example: The Two Brackets That Were the Same Shape

Who: A vision engineer at a furniture-hardware supplier whose kitting line packs door-hinge sets: screws, washers, and a left-hand and a right-hand mounting bracket per kit.

Situation: A camera over a backlit tray verified each kit before sealing, using a descriptor classifier much like this section's. Screws and washers sorted perfectly. The two brackets did not: kits kept shipping with two left brackets, and the resulting installer complaints were the warranty department's top line item.

Problem: The brackets are exact mirror images. Area, perimeter, circularity, solidity, aspect ratio, vertex count, $\phi_1$ through $\phi_6$, and cv2.matchShapes are all reflection-blind by construction, so every number the system computed was identical for both parts. The team's first instinct, training a small CNN, stalled on explainability requirements and a 5 ms cycle budget.

Decision: Use the one reflection-sensitive number in the standard toolkit: the sign of $\phi_7$. They verified on 2,000 golden samples that $\lvert\phi_7\rvert$ for this bracket sits far from zero (the shape is strongly chiral, meaning it differs clearly from its mirror image), then added a single rule: positive sign means right-hand, negative means left-hand, magnitude below a guard threshold means reject for manual inspection.

Result: Left/right separation went to 100 percent on a 50,000-part audit at no measurable cycle-time cost, the guard threshold caught a handful of bent (and therefore nearly symmetric) brackets as a free bonus, and the auditors signed off on a decision rule that fits in one sentence.

Lesson: Invariances are design choices. The team did not need a richer model; they needed one number that deliberately kept the asymmetry every other number had been engineered to forget. The illustration below dramatizes exactly this trap.

Two L-shaped brackets that are exact mirror images stand by a small mirror while a confused measuring robot holds up two identical report cards, with only a tiny plus-and-minus badge differing between the parts, dramatizing how every reflection-invariant shape descriptor reads identical for mirror-image parts except the sign of the seventh Hu moment. — Mirror-image parts defeat every reflection-blind descriptor at once; the only rescue is the one number engineered to remember handedness, the sign of the seventh Hu moment.

Research Frontier: Contours and Topology Inside the Learning Loop

This section's objects are thriving inside modern pipelines. SAM 2 (Ravi et al., 2024, arXiv:2408.00714) made promptable segmentation a commodity, and the masks it emits are post-processed in production with exactly this section's toolkit: findContours, hole-aware hierarchies, polygon simplification for annotation formats, and descriptor gates to reject implausible masks; the measurement layer of Chapter 24's models is this chapter. Topology is becoming a training signal too: Betti-matching losses (Stucki et al., ICML 2023, with an efficient 3D extension in 2024) penalize a segmentation for predicting the wrong number of holes or components, differentiating the very quantities our hierarchy walk counted. And orientation-from-moments has a learned descendant: oriented-bounding-box heads in Ultralytics YOLO11 (2024) regress per-object angles end to end for aerial and industrial imagery, while the broader handcrafted-descriptor program of this section is carried forward by the learned representations of Chapter 25, which trade auditability for breadth.

Exercise 6.6.1: One Bit of Handedness Conceptual

Show algebraically why mirroring an image about the $y$ axis negates every central moment $\mu_{pq}$ with odd $p$ and leaves the rest unchanged, and use this to argue that $\phi_1$ through $\phi_6$ are reflection-invariant while $\phi_7$ flips sign. Then explain why cv2.matchShapes cannot distinguish a part from its mirror image, and describe two industrial situations from this chapter's examples where that blindness is harmless and one where it ships the wrong part.

Exercise 6.6.2: An Alphabet Retrieval Engine Coding

Render the 26 uppercase letters as binary masks with cv2.putText at a large font size. Build a template library of log-scale Hu vectors, then query it with rotated (random angles) and rescaled (0.5x to 2x) versions of each letter using nearest neighbor on the Hu vectors, and separately with cv2.matchShapes. Report the confusion matrix for each method. Which letter pairs collapse under rotation invariance (consider N and Z, M and W under mirroring), and which descriptor from Section 4 would you add to break each tie?

Exercise 6.6.3: The Circularity Ceiling Analysis

Rasterize perfect disks of radius 2 to 200 pixels and plot measured circularity $4\pi A / P^2$ using cv2.contourArea and cv2.arcLength against radius. Explain the curve's small-radius behavior and its large-radius asymptote (it does not reach 1.0; quantify the gap and attribute it to the staircase perimeter bias of Section 2). Repeat with skimage.measure.perimeter_crofton, which estimates perimeter by the Crofton intersection formula, and recommend, with numbers, a minimum part diameter in pixels below which a circularity-based accept/reject gate should not be trusted.

Hands-On Lab: A Backlit Fastener-Inspection Station

Duration: ~75 minutes Intermediate

Objective

Build a complete, runnable binary-vision inspection station that synthesizes its own backlit parts tray, then runs the chapter's full five-verb pipeline (threshold, clean, label, measure, decide) to count and classify screws, washers, and hex nuts, flag defects, and render an annotated overlay plus a pass/fail report, with no external image files required.

What You'll Practice

Otsu thresholding into a binary mask, then morphological opening and closing to remove speckle and fill pinholes (Sections 6.2 and 6.3).
Labeling connected components and reading per-object region properties (Section 6.4).
Tracing contours with a two-level hierarchy to detect holes by topology, not intensity (Section 1 of this section).
Building a small, interpretable descriptor vector (circularity, solidity, aspect ratio, vertex count, hole count) and a rule-based classifier with an explicit reject class (Sections 4 and 5).
Using the sign of the seventh Hu moment as an auditable handedness check (Section 3).

Setup

One environment, three libraries, zero downloads. The tray is drawn synthetically with OpenCV so the lab is fully self-contained and reproducible.

pip install opencv-python numpy

Steps

Step 1: Synthesize a backlit parts tray

A backlit station gives near-binary images: bright background, dark parts. Draw a few screws (filled rounded rectangles), washers (annuli), and hex nuts (filled hexagons with a central hole) on a bright canvas, at random positions and angles, then add mild sensor noise so the cleanup steps have something to do.

import cv2
import numpy as np

rng = np.random.default_rng(0)

def make_tray(h=480, w=640):
    img = np.full((h, w), 235, np.uint8)            # bright backlit canvas
    truth = []                                       # ground-truth labels

    def place(draw_fn, label, n):
        for _ in range(n):
            cx, cy = rng.integers(70, w - 70), rng.integers(70, h - 70)
            ang = rng.uniform(0, 180)
            draw_fn(img, cx, cy, ang)
            truth.append(label)

    # TODO: implement three drawing helpers that paint DARK parts (value ~40):
    #   - a screw: a long thin rounded rectangle (use cv2.fillPoly on a rotated box)
    #   - a washer: a filled dark disk with a bright disk punched in its center
    #   - a nut:    a filled dark regular hexagon with a bright disk punched out
    # Hint: build points, rotate with cv2.getRotationMatrix2D, then cv2.fillPoly.

    place(draw_screw, "screw", 4)
    place(draw_washer, "washer", 3)
    place(draw_nut, "nut", 2)

    noise = rng.normal(0, 6, img.shape)
    img = np.clip(img.astype(float) + noise, 0, 255).astype(np.uint8)
    return img, truth

tray, truth = make_tray()
cv2.imwrite("tray.png", tray)
print("parts placed:", truth)

Hint

For the hexagon, generate six vertices with angle = np.deg2rad(60*k), scale by a radius, offset by the centre, rotate the whole point set, and pass them to cv2.fillPoly. To punch a hole, draw a bright filled circle (value 235) on top of the dark part after drawing it.

Step 2: Threshold and clean the mask

Convert the grayscale tray to a binary foreground mask with Otsu, then de-speckle with an opening and seal pinholes with a closing. This is the "threshold" and "clean" verbs from the chapter's five-step pipeline.

# TODO: Otsu-threshold `tray` so DARK parts become white foreground.
#   Hint: cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU with thresh value 0.
# TODO: build a 5x5 elliptical structuring element with cv2.getStructuringElement.
# TODO: apply MORPH_OPEN then MORPH_CLOSE with cv2.morphologyEx to get `clean`.

Hint

Opening (erode then dilate) deletes structures smaller than the element, removing isolated noise pixels; closing (dilate then erode) fills small gaps. Order matters: open first to kill speckle, then close to repair the parts. An ellipse element avoids the blocky artifacts a square leaves on round parts.

Step 3: Sanity-check with connected components

Before contour analysis, label the cleaned mask and print one row per blob. This is the "label" and a first pass of the "measure" verb, and it tells you immediately whether cleanup left the right number of parts.

# TODO: call cv2.connectedComponentsWithStats(clean, connectivity=8).
# TODO: loop over labels 1..n-1, printing area (stats[i, cv2.CC_STAT_AREA])
#       and centroid; skip components with area < 200 as dust.
# Expected: roughly 9 surviving components, one per placed part.

Hint

connectedComponentsWithStats returns (count, labels, stats, centroids). Label 0 is always the background. If you see far more than nine survivors, your opening element is too small; if parts merged, two were drawn touching and a watershed split (Section 6.5) would be needed.

Step 4: Describe each part with a shape-descriptor vector

Trace outer contours with a two-level hierarchy so a hole is detected by topology, then compute the five-number descriptor from Section 4 for every part. This is the heart of the "measure" verb.

def describe(c):
    area  = cv2.contourArea(c)
    perim = cv2.arcLength(c, True)
    # TODO: circularity = 4*pi*area / perim**2  (guard perim > 0)
    # TODO: solidity    = area / contourArea(convexHull(c))
    # TODO: aspect      = longer side / shorter side of cv2.minAreaRect(c)
    # TODO: vertices    = len(cv2.approxPolyDP(c, 0.02*perim, True))
    # return a dict(area=, circularity=, solidity=, aspect=, vertices=)
    ...

contours, hier = cv2.findContours(clean, cv2.RETR_CCOMP,
                                  cv2.CHAIN_APPROX_SIMPLE)
# TODO: for each OUTER contour (hier[0][i][3] == -1) with area >= 200,
#       record has_hole = hier[0][i][2] != -1 and its descriptor.

Hint

In RETR_CCOMP mode the hierarchy quadruple is [next, previous, first_child, parent]. An outer contour has parent == -1; it owns a hole when first_child != -1. Skip the hole borders themselves so you do not double-count.

Step 5: Classify with readable rules and a reject class

Turn descriptors into decisions with the auditable rule set from Section 5. Topology (the hole) does the heavy lifting; an explicit reject class fails loudly on anything implausible. This is the "decide" verb.

def classify(d, has_hole):
    # TODO: washer  -> has_hole and circularity > 0.80
    # TODO: nut     -> has_hole and 5 <= vertices <= 7
    # TODO: screw   -> (not has_hole) and aspect > 2.5
    # TODO: anything else -> "reject"
    ...
# TODO: tally counts per label and compare against `truth` from Step 1.

Hint

Every threshold (0.80, 2.5, 200) should come from measuring your synthetic golden samples, not from guessing. If a nut is misread as a washer, its hexagon was simplified to too few vertices; widen the vertex band or lower the Douglas-Peucker tolerance.

Step 6: Render an annotated overlay and a pass/fail report

Draw each contour on a color copy of the tray, colored by predicted class, with the centroid and label printed. Then emit a one-line verdict: PASS when the counts match the expected kit (4 screws, 3 washers, 2 nuts, 0 rejects), FAIL otherwise. This is the deliverable a line operator would actually see.

# TODO: overlay = cv2.cvtColor(tray, cv2.COLOR_GRAY2BGR)
# TODO: for each classified part, draw its contour (color by class) and
#       cv2.putText the label at the centroid.
# TODO: cv2.imwrite("inspection.png", overlay)
# TODO: expected = {"screw": 4, "washer": 3, "nut": 2}
#       print("PASS" if counts == expected and rejects == 0 else "FAIL")

Hint

Use a dict mapping label to BGR color, for example screw blue, washer green, nut orange, reject red, so a glance at inspection.png tells you what the station decided. The centroid comes from cv2.moments(c) as (m10/m00, m01/m00).

Expected Output

The console prints nine surviving components in Step 3, a descriptor table in Step 4, and a final tally that matches the placed parts, for example {'screw': 4, 'washer': 3, 'nut': 2} followed by PASS. The saved inspection.png shows every part outlined and labeled in its class color over the original tray, with no part left unclassified and no background blob mislabeled. Re-running with a different random seed in Step 1 reshuffles positions and angles but the verdict logic still classifies every part correctly, demonstrating the rotation and translation robustness the chapter built.

Stretch Goals

Add a left-hand and a right-hand bracket to the tray (one the mirror of the other) and tell them apart using only the sign of the seventh Hu moment, as in the bracket practical example of Section 5; confirm every other descriptor is identical for the pair.
Inject a defect: draw one washer with a notch bitten out of its rim and one nut with a hairline crack. Add a solidity gate (reject solidity below 0.95) so the inspection station catches both as defects, and verify the verdict flips to FAIL.
Place two screws so they touch, watch the connected-component count drop, then separate them with a distance-transform watershed (Section 6.5) before the contour stage so the count recovers.

Complete Solution

# Complete, runnable fastener-inspection station: synthesize a tray, then run
# the five-verb pipeline (threshold, clean, label, measure, decide) end to end.
import cv2
import numpy as np

rng = np.random.default_rng(0)
DARK, BRIGHT = 40, 235

def _rotate(pts, cx, cy, ang):
    M = cv2.getRotationMatrix2D((cx, cy), ang, 1.0)
    pts = np.hstack([pts, np.ones((len(pts), 1))])
    return (M @ pts.T).T.astype(np.int32)

def draw_screw(img, cx, cy, ang):
    L, Wd = 60, 14
    box = np.array([[cx-L, cy-Wd], [cx+L, cy-Wd],
                    [cx+L, cy+Wd], [cx-L, cy+Wd]], np.float32)
    cv2.fillPoly(img, [_rotate(box, cx, cy, ang)], DARK)

def draw_washer(img, cx, cy, ang):
    cv2.circle(img, (cx, cy), 34, DARK, -1)
    cv2.circle(img, (cx, cy), 15, BRIGHT, -1)        # punch the hole

def draw_nut(img, cx, cy, ang):
    r = 36
    hexp = np.array([[cx + r*np.cos(np.deg2rad(60*k)),
                      cy + r*np.sin(np.deg2rad(60*k))] for k in range(6)],
                    np.float32)
    cv2.fillPoly(img, [_rotate(hexp, cx, cy, ang)], DARK)
    cv2.circle(img, (cx, cy), 14, BRIGHT, -1)        # threaded hole

def make_tray(h=480, w=640):
    img = np.full((h, w), BRIGHT, np.uint8)
    truth = []
    def place(fn, label, n):
        for _ in range(n):
            cx, cy = int(rng.integers(70, w-70)), int(rng.integers(70, h-70))
            fn(img, cx, cy, rng.uniform(0, 180))
            truth.append(label)
    place(draw_screw, "screw", 4)
    place(draw_washer, "washer", 3)
    place(draw_nut, "nut", 2)
    img = np.clip(img.astype(float) + rng.normal(0, 6, img.shape),
                  0, 255).astype(np.uint8)
    return img, truth

# --- threshold + clean ---------------------------------------------------
tray, truth = make_tray()
_, bw = cv2.threshold(tray, 0, 255,
                      cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
clean = cv2.morphologyEx(bw, cv2.MORPH_OPEN, se)
clean = cv2.morphologyEx(clean, cv2.MORPH_CLOSE, se)

# --- label sanity check --------------------------------------------------
n, labels, stats, cents = cv2.connectedComponentsWithStats(clean, connectivity=8)
survivors = [i for i in range(1, n) if stats[i, cv2.CC_STAT_AREA] >= 200]
print(f"{len(survivors)} components survive cleanup")

# --- measure -------------------------------------------------------------
def describe(c):
    area  = cv2.contourArea(c)
    perim = cv2.arcLength(c, True)
    circ  = 4*np.pi*area/perim**2 if perim > 0 else 0.0
    sol   = area / max(cv2.contourArea(cv2.convexHull(c)), 1e-6)
    w, h  = cv2.minAreaRect(c)[1]
    aspect = max(w, h) / max(min(w, h), 1e-6)
    verts  = len(cv2.approxPolyDP(c, 0.02*perim, True))
    return dict(area=area, circularity=circ, solidity=sol,
                aspect=aspect, vertices=verts)

# --- decide --------------------------------------------------------------
def classify(d, has_hole):
    if has_hole and d["circularity"] > 0.80:
        return "washer"
    if has_hole and 5 <= d["vertices"] <= 7:
        return "nut"
    if not has_hole and d["aspect"] > 2.5:
        return "screw"
    return "reject"

contours, hier = cv2.findContours(clean, cv2.RETR_CCOMP,
                                  cv2.CHAIN_APPROX_SIMPLE)
colors = {"screw": (200, 80, 0), "washer": (0, 170, 0),
          "nut": (0, 140, 255), "reject": (0, 0, 230)}
overlay = cv2.cvtColor(tray, cv2.COLOR_GRAY2BGR)
counts = {}
for i, c in enumerate(contours):
    if hier[0][i][3] != -1 or cv2.contourArea(c) < 200:
        continue                                     # skip holes and dust
    has_hole = hier[0][i][2] != -1
    label = classify(describe(c), has_hole)
    counts[label] = counts.get(label, 0) + 1
    m = cv2.moments(c)
    px, py = int(m["m10"]/m["m00"]), int(m["m01"]/m["m00"])
    cv2.drawContours(overlay, [c], -1, colors[label], 2)
    cv2.putText(overlay, label, (px-25, py), cv2.FONT_HERSHEY_SIMPLEX,
                0.5, colors[label], 1, cv2.LINE_AA)

cv2.imwrite("inspection.png", overlay)
print("counts:", counts)
expected = {"screw": 4, "washer": 3, "nut": 2}
verdict = "PASS" if counts == expected and counts.get("reject", 0) == 0 else "FAIL"
print("verdict:", verdict)
# Representative output:
# 9 components survive cleanup
# counts: {'screw': 4, 'washer': 3, 'nut': 2}
# verdict: PASS