Part I: Image Processing
Chapter 6: Morphology, Binary Images & Shape

Connected Components & Region Properties

"For years I was just foreground. Then one Tuesday a two-pass algorithm gave me a name, an area, a centroid, and a bounding box. I am component 17 now. I contain 4,212 pixels and my eccentricity is none of your business."

A Freshly Labeled Connected Component
Big Picture

Connected-component labeling is the bridge between image processing and object reasoning: it converts a cleaned binary mask into a list of discrete things, and region properties convert each thing into a row of numbers, at which point vision problems become data problems. Everything before this section manipulated pixels; everything after it (counting, gating, classifying, the recognition pipelines of Chapter 16) manipulates objects. This section builds the bridge twice: once by hand to understand it, once with the library calls used in production.

The previous section delivered a clean mask: one connected blob per physical object, no speckle, no pinholes. The definitions are also already in place: Section 6.1 defined a connected component as a maximal set of mutually reachable pixels under a chosen adjacency. What remains is the algorithmic step of actually assigning every foreground pixel an integer identity (this pixel belongs to object 3), and then the measurement step of summarizing each identity into properties a decision rule can consume. Both steps are humble, and the second is arguably the most economically productive operation in this book: most deployed vision systems are, at bottom, regionprops plus an if-statement.

1. Labeling: Giving Pixels an Identity Beginner

A labeling of a binary image is an integer image of the same size in which background pixels hold 0 and every foreground pixel holds the index of its component: all pixels of the first object hold 1, the second 2, and so on. Figure 6.4.1 shows the transformation on a small grid, together with the measurements that fall out of it almost for free.

Binary mask label Label image 123 label area centroid 1 4 (2.5, 2.5) 2 3 (5.7, 1.7) 3 4 (3.5, 5.25) objects are now rows of numbers
Figure 6.4.1 Labeling turns a binary mask (left) into a label image (center) where each connected component carries a distinct integer, here rendered as colors. From the label image, per-object measurements (right) such as area and centroid follow by simple accumulation, and the image has become a table.

The simplest correct algorithm is flood fill: scan for an unlabeled foreground pixel, assign it the next label, and propagate that label outward through neighbors (breadth-first or depth-first) until the component is exhausted; repeat. It visits each pixel a constant number of times, so it is linear in image size, and it makes the connectivity definition of Section 6.1 executable in a screenful of code:

import numpy as np
from collections import deque

def label_floodfill(mask, connectivity=8):
    """Connected-component labeling by BFS flood fill."""
    offs = [(-1,0),(1,0),(0,-1),(0,1)]
    if connectivity == 8:
        offs += [(-1,-1),(-1,1),(1,-1),(1,1)]
    h, w = mask.shape
    labels = np.zeros((h, w), np.int32)
    current = 0
    for y in range(h):
        for x in range(w):
            if mask[y, x] and not labels[y, x]:
                current += 1                      # found a new object
                queue = deque([(y, x)])
                labels[y, x] = current
                while queue:                      # spread the label
                    cy, cx = queue.popleft()
                    for dy, dx in offs:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and not labels[ny, nx]):
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels, current

yy, xx = np.mgrid[0:9, 0:9]
ring = ((np.abs(xx - 4) + np.abs(yy - 4)) == 3)
print(label_floodfill(ring, 8)[1], label_floodfill(ring, 4)[1])
# Output:
# 1 12        (the Section 6.1 diamond: one object 8-connected, twelve 4-connected)
A complete flood-fill labeler in 25 lines, parameterized by connectivity and validated on Section 6.1's paradox diamond; readable and correct, but Python-loop slow, which is exactly the gap the library implementations close.

Production labelers use a different classic, the two-pass algorithm: a single raster scan assigns provisional labels by looking only at already-visited neighbors (up and left), records an equivalence whenever two different provisional labels meet (a U-shaped object's two arms, discovered to be one object at the bottom of the U), and resolves all equivalences with a union-find structure before a second scan rewrites final labels. Two sequential, cache-friendly sweeps and a near-constant-time union-find make it extremely fast; the open YACCLAB benchmark tracks a whole zoo of refinements (block-based scanning, SIMD, GPU variants) that push hundreds of megapixels per second. You will never need to write one, but knowing the mechanism explains the otherwise mysterious phrase "label equivalence" in library documentation, and explains why labels come out in raster order.

2. The Production Call: connectedComponentsWithStats Beginner

OpenCV bundles labeling with the most common measurements in a single call, which is the workhorse of counting pipelines:

import cv2
import numpy as np

mask = cv2.imread("cleaned_parts_mask.png", cv2.IMREAD_GRAYSCALE)

n, labels, stats, centroids = cv2.connectedComponentsWithStats(
    mask, connectivity=8)

print(f"{n - 1} objects")               # label 0 is the background
for i in range(1, n):
    x, y, w, h, area = stats[i]         # bbox corner, bbox size, pixel count
    cx, cy = centroids[i]
    print(f"  #{i}: area={area:5d}  bbox=({x},{y},{w}x{h})  "
          f"centroid=({cx:.1f},{cy:.1f})")
# Representative output:
# 4 objects
#   #1: area= 8120  bbox=(64,40,112x99)   centroid=(119.6,89.2)
#   #2: area= 7903  bbox=(241,38,109x101) centroid=(295.0,88.8)
#   #3: area=  311  bbox=(402,71,25x31)   centroid=(414.1,86.0)
#   #4: area= 8045  bbox=(78,210,110x100) centroid=(132.8,259.7)
One call, four return values: cv2.connectedComponentsWithStats yields the count, the label image, a per-label stats matrix (bounding box and area, indexed by the cv2.CC_STAT_* constants), and sub-pixel centroids; object #3's suspiciously small area marks it as a fragment, not a part.

The stats matrix supports the single most useful post-cleanup operation: area filtering. Component #3 above is an order of magnitude smaller than its siblings, almost certainly debris that survived the opening of Section 6.3. Rebuilding the mask while dropping every component outside an area band is three lines (keep = np.isin(labels, good_ids) after selecting good_ids from the stats column), and unlike a larger opening it removes the debris with zero distortion of the survivors. Area filtering and morphological cleanup are complements, not rivals: morphology fixes pixel-scale damage within objects, area filtering enforces object-scale sanity between them.

3. regionprops: The Measurement Toolkit Intermediate

When a pipeline needs more than areas and boxes, scikit-image's regionprops is the standard instrument. It computes dozens of per-region properties lazily, and its sibling regionprops_table emits them as columnar data ready for pandas, which is the moment a vision problem officially becomes a data-science problem. The properties worth knowing by heart are summarized in Table 6.4.1.

Table 6.4.1 The core regionprops measurements and what they are for.
PropertyWhat it measuresTypical decision it powers
areaPixel count of the regionSize gating; part present/absent
centroidMean pixel coordinatePick-point for a robot; position check
bboxAxis-aligned bounding boxCropping; coarse size limits
perimeterBoundary length estimateCompactness and roughness scores
eccentricityElongation of the best-fit ellipse, 0 (circle) to 1 (line)Round versus elongated part classes
orientationAngle of the ellipse's major axisGrasp angle; alignment check
solidityarea / convex hull areaDetecting bites, cracks, concavities
extentarea / bounding-box areaDistinguishing crosses from blobs
euler_numberComponents minus holes (here 1 minus holes)Counting holes: washers versus discs

Several of these will be derived properly in Section 6.6 (centroid and orientation from moments, perimeter from contours); here they are consumers' goods. The code below runs the full measurement pipeline on a parts image and lands the results in a dataframe:

import pandas as pd
from skimage import measure

lbl = measure.label(mask > 0, connectivity=2)        # 8-connectivity in 2D
df = pd.DataFrame(measure.regionprops_table(
    lbl,
    properties=["label", "area", "centroid", "eccentricity",
                "solidity", "euler_number"]))
print(df.round(3).to_string(index=False))
# Representative output:
#  label  area  centroid-0  centroid-1  eccentricity  solidity  euler_number
#      1  8120      89.21      119.62         0.214     0.987             1
#      2  7903      88.83      295.04         0.198     0.991             0
#      3   311      86.04      414.13         0.892     0.704             1
#      4  8045     259.72     132.81          0.226     0.989             1
From mask to dataframe in four statements with skimage.measure.regionprops_table: region 2's Euler number of 0 betrays a hole (one component minus one hole), region 3's high eccentricity and low solidity mark it as an elongated sliver, and downstream logic is now ordinary pandas.
Key Insight: The Image Disappears Here

After labeling and measurement, the pipeline never touches pixels again: decisions run on a table of per-object features. This is the architectural pattern of all recognition, classical and deep. The pipelines of Chapter 16 replace these nine geometric features with hand-crafted descriptor vectors, and the detectors of Chapter 23 replace them with learned ones, but the shape of the computation (image to objects to feature rows to decision) is the one established on this page.

Library Shortcut: regionprops_table Replaces Your Measurement Loop

Computing Table 6.4.1's nine properties by hand means looping over labels, masking each region, and writing each formula (the ellipse fits alone, via second-order moments, are a page of careful code); call it 60 to 80 lines. regionprops_table(lbl, properties=[...]) is one line, evaluates lazily so you pay only for requested properties, handles edge-touching regions and empty inputs, and returns NumPy columns that drop straight into pandas. It also generalizes to 3D volumes unchanged, which matters the day your masks come from a CT scanner rather than a camera. Use spacing= to get physical units instead of pixels.

4. From Measurements to Decisions Beginner

The last step of a counting or gating pipeline is almost embarrassingly simple, and that simplicity is the selling point. A bolt-kit verifier might require exactly 12 regions with area in [7000, 9000] and eccentricity below 0.4 (the washers), plus 4 regions with eccentricity above 0.85 (the bolts), and reject otherwise. Such rules execute in microseconds, can be reviewed line by line in an audit, and fail loudly rather than plausibly. Their weakness is equally plain: they generalize exactly as far as the engineer's foresight. The moment classes overlap in every measured feature, rules give out, and that is the cliff edge where Section 6.6's richer shape descriptors, then Chapter 16's learned classifiers, take over.

Practical Example: The Returns Station That Out-Counted the Detector

Who: The automation lead at an e-commerce hardware retailer, building a returns-verification station that must confirm the contents of opened fastener kits (mixed screws, nuts, washers, up to 200 pieces) poured onto a tray.

Situation: The first prototype used a fine-tuned object detector from Chapter 23's family. On dense piles it missed heavily occluded pieces and double-counted at tile boundaries, plateauing at 96 percent count accuracy, far below the 99.9 percent the business case needed, and each new kit SKU required retraining.

Problem: Counting many small, identical, high-contrast objects is precisely the regime where detection networks struggle (tiny instances, mutual occlusion) and classical machinery excels, but the team had not considered the classical route.

Decision: Rebuild the station around physics plus this chapter: a translucent backlit tray (making segmentation trivial per Chapter 2), a vibration pulse to spread the pile, then threshold, open, label, and measure. Pieces were classified by area and eccentricity bands per SKU, configured from 20 golden samples in minutes rather than retraining. Clumps flagged by implausible area were routed to the distance-transform separator of Section 6.5.

Result: 99.95 percent count accuracy at 1.8 seconds per tray on a fanless industrial PC, new SKUs onboarded by a technician without a data scientist, and the deep detector was retired from this station (it kept its job on the loading dock, where lighting could not be controlled).

Lesson: When you control the lighting, contrast is free, and once contrast is free, counting is connected components plus a dataframe. Engineer the scene before you engineer the model.

Research Frontier: Counting and Labeling in 2024-2026

Both halves of this section are active research surfaces. On the labeling side, GPU implementations matter at video rate and volume scale: RAPIDS cuCIM ports label and regionprops to CUDA for microscopy volumes, and the YACCLAB benchmark continues to absorb new block-and-union GPU labelers measured in gigapixels per second. On the counting side, the open-world counters now compete with the backlit tray: CountGD (Amini-Naieni et al., NeurIPS 2024, arXiv:2407.04619) counts arbitrary objects specified by text or visual exemplars in uncontrolled images, and SAM 2 (2024, arXiv:2408.00714) plus per-mask regionprops has become a standard zero-shot measurement pipeline in biology labs: a foundation model proposes the regions, and the 1960s measurement algebra of this section still produces the numbers that get published.

Exercise 6.4.1: Equivalences in the Wild Conceptual

Trace the two-pass algorithm by hand on a U-shaped object scanned top to bottom, writing the provisional labels each pixel receives and the moment the equivalence between the two arms is discovered. Then construct an object shape that generates not one but three pairwise equivalences that must all resolve to a single label (hint: think of a comb), and explain why a naive "relabel on discovery" strategy without union-find can degrade to quadratic time on such shapes.

Exercise 6.4.2: The Area-Filter Bake-Off Coding

Create a mask of 50 random disks (radii 8 to 15 px) plus 500 specks (1 to 2 px). Clean it two ways: (a) morphological opening with an SE just large enough to kill the specks, and (b) labeling plus area filtering at a 50 px threshold. Compare the two results to the speck-free ground truth with a pixel-level intersection-over-union score, measure both runtimes, and explain the IoU gap you observe near disk boundaries in terms of Section 6.3's corner-rounding behavior.

Exercise 6.4.3: Euler's Typography Audit Analysis

Render the 26 uppercase letters in a bold sans-serif font, label each letter image, and compute its Euler number with regionprops. Verify the relation $E = C - H$ (components minus holes) letter by letter: B should give $1 - 2 = -1$, A and D give 0, and so on. Report which letters' results depend on the foreground connectivity chosen, and connect the failures you find at thin junctions back to the Jordan pairing of Section 6.1.