"My job is to look at two million pixels and report back with a handful of lines. Compression is easy; the hard part is deciding which 1,999,500 pixels were beside the point."
A Ruthlessly Selective Edge Detector
This chapter is where the book stops processing images and starts understanding them: it turns arrays of pixel intensities into geometric structure, the edges, lines, and curves that describe what a scene contains rather than what its pixels measure. The derivative filters of Chapter 3 gave us gradient maps; here the Canny detector distills them into clean one-pixel-wide contours, the Hough transform groups those contours into lines and circles by letting pixels vote, and robust fitting sharpens the votes into precise parametric curves that survive outliers. The chapter ends by assembling every piece into a working lane-marking detector, the same architecture that shipped in early driver-assistance systems and still runs as a fallback in some of them.
Chapter Overview
Part I treated an image as a signal to be improved: denoised in Chapter 7, sharpened and filtered in Chapter 3, warped in Chapter 5. Whatever we did, the output was always another image. This chapter changes the type signature. The output of an edge detector is not an image in any useful sense; it is a claim about the world: "here, the scene changes." The output of a line detector is more abstract still: four numbers that summarize ten thousand pixels. This progression, from dense arrays toward compact symbolic descriptions, is the central project of classical computer vision, and edges are its first step because nearly every meaningful boundary in a scene (an object's silhouette, a road marking, the corner of a building) announces itself as a rapid change in intensity.
The chapter follows a deliberate arc from evidence to structure. Section 9.1 revisits the image gradient with new eyes: not as a filter output but as a measuring instrument, and confronts the gap between a gradient map (a number at every pixel) and an edge map (a decision at every pixel). Section 9.2 closes that gap with the Canny detector, a four-stage pipeline from 1986 that remains the most-used edge detector on the planet; we build every stage from scratch, then compress the whole thing into one OpenCV call. Section 9.3 asks what to do with edge pixels once we have them, and answers with one of the field's most elegant ideas: the Hough transform, which detects lines and circles by letting every edge pixel vote for every shape that could pass through it. Section 9.4 trades the Hough transform's robustness for precision, fitting parametric curves by least squares, then rescuing least squares from its fatal sensitivity to outliers with M-estimators and RANSAC, an algorithm so durable that it will reappear in Chapter 10, Chapter 13, and Chapter 14.
Section 9.5 is the payoff: a complete, working lane-marking detection pipeline that thresholds gradients and colors, warps the road into a bird's-eye view using the homography machinery of Chapter 5, gathers lane pixels with a sliding-window search, fits robust polynomials, and reports lane curvature in meters. It is a genuinely deployed architecture, and it is also a teaching instrument: every stage exercises a section of this chapter, and every failure mode (shadows, worn paint, rain) illustrates exactly why the deep learning methods of Part III were invented.
One more reason to take this chapter seriously, even in 2026: edges refuse to retire. The first convolutional layer of virtually every trained CNN rediscovers oriented edge detectors on its own, as we will see in Chapter 19. And at the far end of the book, Canny edge maps return as one of the most popular conditioning signals for controllable image generation: ControlNet and its successors steer diffusion models with precisely the one-pixel-wide contours this chapter teaches you to compute, a story told in Chapter 35. The gradient-to-geometry pipeline you learn here is not a historical exhibit; it is a load-bearing wall of the modern stack.
Four verbs carry the whole chapter, each owning one section's idea. Measure the gradient as evidence (Section 9.1); decide which evidence is an edge, the leap that Canny makes with context rather than a single threshold (Section 9.2); vote to group anonymous edge pixels into shapes, the Hough transform's election (Section 9.3); and fit to turn a coarse winner into a precise curve, robustly (Section 9.4). The lane detector of Section 9.5 runs all four in sequence on a real frame. Hold the four verbs and you hold the chapter; the recurring refrain underneath them, "detect coarsely by voting, then measure precisely by fitting", is the design pattern that returns in every geometric pipeline from Chapter 10 onward.
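To make the first two verbs concrete, here is a minimal NumPy sketch on a single synthetic scanline. The stripe position, the binomial blur kernel, and the threshold of 20 are illustrative assumptions, not values taken from the chapter, and the one-dimensional local-maximum test is only a stand-in for the non-maximum suppression that Section 9.2 develops. Measuring produces a gradient magnitude that clears the threshold at several neighboring pixels on each side of the stripe; deciding keeps one pixel per side.

```python
import numpy as np

# Synthetic scanline: gray road (60) crossed by a 5-pixel white stripe (255).
row = np.full(40, 60.0)
row[18:23] = 255.0

# Smooth before differentiating (binomial approximation of a small Gaussian).
kernel = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
smooth = np.convolve(np.pad(row, 2, mode="edge"), kernel, mode="valid")

# Measure: central-difference gradient magnitude at every pixel.
mag = np.abs(np.gradient(smooth))

# Decide: threshold, then keep only local maxima along the scanline.
strong = mag > 20.0
is_peak = (mag >= np.roll(mag, 1)) & (mag > np.roll(mag, -1))
edges = strong & is_peak

print("pixels above threshold:", int(strong.sum()))
print("edge pixels kept:", int(edges.sum()), "at columns", np.flatnonzero(edges))
```

The gap between the two printed counts is the gap between a gradient map and an edge map: thresholding alone leaves a thick band of responses, and some extra rule is needed to thin it to a contour.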
Prerequisites
This chapter leans directly on the derivative filters (Sobel, Scharr, Laplacian) of Chapter 3: Spatial Filtering & Convolution, especially Section 3.4; if gradient magnitude and orientation feel hazy, reread that section first. Thresholding and histogram reasoning from Chapter 2: Point Operations, Histograms & Thresholding appear in every section. The lane-detection capstone in Section 9.5 uses the perspective transform from Chapter 5: Geometric Transformations & Image Warping and benefits from the binary-image cleanup operations of Chapter 6: Morphology, Binary Images & Shape. As always, fluency with NumPy arrays from Chapter 0 is assumed, and the noise models of Chapter 7 explain why every detector in this chapter begins by smoothing.
Chapter Roadmap
- 9.1 What Is an Edge? Gradients Revisited Edges as rapid intensity change: edge profiles and their physical causes, gradient magnitude and orientation as measurements, why differentiation amplifies noise, and why a gradient map is not yet an edge map.
- 9.2 The Canny Edge Detector, Step by Step The 1986 detector that refuses to die: optimality criteria, non-maximum suppression, double thresholding with hysteresis, a from-scratch implementation, and the one-line OpenCV equivalent with its tuning knobs.
- 9.3 The Hough Transform: Lines & Circles Detection as voting: the polar line parameterization, accumulator arrays, peak finding, the probabilistic variant that returns segments, and the gradient trick that makes circle detection tractable.
- 9.4 Fitting Curves: Least Squares & Robust Alternatives From detection to measurement: total least squares for lines, why one outlier ruins everything, M-estimators and iteratively reweighted fitting, RANSAC and its iteration budget, and direct ellipse fitting.
- 9.5 Worked Example: Lane-Marking Detection The whole chapter assembled: gradient and color thresholding, bird's-eye perspective warp, sliding-window lane pixel search, robust polynomial fits, curvature in meters, and a sober comparison with 2024-2026 learned lane detectors.
What's Next?
Edges describe where the image changes, but they are anonymous: one stretch of contour looks much like another, which is why this chapter could group them only with strong geometric priors (lines, circles, polynomials). Chapter 10: Keypoints, Descriptors & Matching takes the complementary path: instead of long anonymous curves, it finds compact distinctive points (corners and blobs) and equips each with a fingerprint descriptor so it can be recognized again in a different photograph. That single capability, finding the same point in two images, unlocks panorama stitching, camera calibration, stereo depth, and ultimately the 3D reconstruction pipelines of Chapters 12 through 14. The RANSAC estimator you meet in Section 9.4 travels there with you; it is the standard tool for separating correct matches from false ones. Before moving on, put the whole chapter to work in the Hands-On Lab below, where you assemble measure, decide, vote, and fit into a runnable lane-departure warning tool.
Hands-On Lab: Build a Lane-Departure Warning Tool
Objective
Assemble the four verbs of this chapter (measure, decide, vote, fit) into a single runnable program: a lane-departure warning tool that reads a forward-facing driving clip frame by frame, finds the two lane markings with the gradient-to-geometry pipeline of Sections 9.1 through 9.4, fits a robust polynomial to each, computes how far the vehicle has drifted from lane center, and writes an annotated output video that flashes a warning when the drift crosses a threshold. The finished tool runs on a synthetic clip the script generates itself, so it always produces a result even without a dataset.
What You'll Practice
- Turning a gradient magnitude map into a binary edge decision with thresholding and Canny (Sections 9.1 and 9.2).
- Grouping edge pixels into line candidates with the probabilistic Hough transform cv2.HoughLinesP (Section 9.3).
- Separating the left and right markings by line slope and fitting each with a robust polynomial that survives stray votes (Section 9.4).
- Converting a pair of fitted lanes into a single departure metric (offset from lane center) and acting on it.
- Chaining the whole detector across video frames into a deployable tool with an annotated output, the architecture Section 9.5 traces to real driver-assistance systems.
Setup
Two libraries and no dataset required; the script synthesizes its own driving clip if you do not supply one. Install with:
pip install opencv-python numpy
To run on real footage instead, drop a short forward-facing clip named drive.mp4 beside the script; the loader falls back to a generated curved-lane animation when no file is found.
Steps
Step 1: Open a clip, or synthesize one
Build a frame source that yields BGR frames from drive.mp4 when present, and otherwise paints a moving pair of curved lane lines on a gray road so the lab always has something to detect.
import cv2
import numpy as np

W, H = 640, 360

def synth_frame(t):
    """One synthetic dashcam frame at phase t in [0, 1)."""
    img = np.full((H, W, 3), 60, np.uint8)          # gray road
    drift = int(40 * np.sin(2 * np.pi * t))         # car drifts side to side
    for y in range(H // 2, H):
        z = (y - H // 2) / (H // 2)                 # 0 near horizon, 1 at bumper
        half = int(40 + 180 * z)                    # lane widens toward camera
        cx = W // 2 + drift + int(60 * z * z)       # gentle right curve
        cv2.circle(img, (cx - half, y), 2, (255, 255, 255), -1)
        cv2.circle(img, (cx + half, y), 2, (255, 255, 255), -1)
    return img

def frames(path="drive.mp4", n=120):
    cap = cv2.VideoCapture(path)
    if cap.isOpened():
        while True:
            ok, f = cap.read()
            if not ok:
                break
            yield cv2.resize(f, (W, H))
        cap.release()
        return
    # TODO: no file found. Yield n synthetic frames by calling synth_frame
    # with t stepping evenly from 0 to 1 across the n frames.
    raise NotImplementedError
Hint
For the fallback, loop for i in range(n): yield synth_frame(i / n). Stepping t from 0 to 1 makes the synthetic car drift left then right, which is exactly what your warning logic needs to react to.
Step 2: Reduce a frame to lane-edge evidence
Restrict attention to the road with a triangular region of interest, then run Canny (Section 9.2) on the masked grayscale image. The mask removes sky and clutter so the Hough vote later sees only road markings.
def edge_map(bgr):
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (5, 5), 0)        # tame noise before differentiating
    edges = cv2.Canny(blur, 60, 180)                # low and high hysteresis thresholds
    roi = np.zeros_like(edges)
    # Bottom triangle; cv2.fillPoly expects int32 vertex coordinates.
    poly = np.array([[(0, H), (W, H), (W // 2, H // 2)]], dtype=np.int32)
    # TODO: fill poly white on roi with cv2.fillPoly, then return
    # the bitwise AND of edges and roi so only road edges survive.
    ...
Hint
cv2.fillPoly(roi, poly, 255) paints the triangle, then return cv2.bitwise_and(edges, roi). If you lose the markings, widen the triangle or lower the high Canny threshold from 180.
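To see what the region-of-interest mask does without OpenCV in the loop, the sketch below builds the same triangle with two NumPy inequalities and applies it to a hypothetical edge map containing three lit pixels. It is a stand-in for cv2.fillPoly followed by cv2.bitwise_and, not a pixel-exact reproduction of OpenCV's rasterization, and the helper name triangle_mask is an assumption of this example.

```python
import numpy as np

W, H = 640, 360

def triangle_mask(width, height):
    """Boolean mask of the triangle (0, H), (W, H), (W/2, H/2)."""
    ys, xs = np.mgrid[0:height, 0:width]
    depth = ys - height / 2                       # 0 at the apex row, grows downward
    half_width = depth * (width / 2) / (height / 2)
    return (depth >= 0) & (np.abs(xs - width / 2) <= half_width)

mask = triangle_mask(W, H)

# Hypothetical edge map with three "edge" pixels.
edges = np.zeros((H, W), dtype=np.uint8)
edges[100, 320] = 255    # above the horizon: should be removed
edges[300, 320] = 255    # inside the triangle: should survive
edges[300, 10] = 255     # low in the frame but outside the sloped side: removed

road_edges = np.where(mask, edges, 0).astype(np.uint8)
print("edge pixels before:", int((edges > 0).sum()),
      "after:", int((road_edges > 0).sum()))
```

The third pixel is the instructive one: it lies well below the horizon, yet the sloped side of the triangle still excludes it, which is why widening the triangle is the first thing to try when markings near the frame edge disappear.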
Step 3: Vote for line segments with the Hough transform
Apply the probabilistic Hough transform of Section 9.3 to the edge map. It returns concrete segment endpoints rather than infinite lines, which is what you want for fitting.
def hough_segments(edges):
    # TODO: call cv2.HoughLinesP on edges with rho=2, theta=np.pi/180,
    # threshold=40, minLineLength=20, maxLineGap=100. Return the array
    # of segments, or an empty list if it returns None.
    ...
Hint
lines = cv2.HoughLinesP(edges, 2, np.pi/180, 40, minLineLength=20, maxLineGap=100); then return lines if lines is not None else []. A higher threshold demands more votes per line and returns fewer, cleaner segments.
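The role of the vote threshold is easier to see in a bare accumulator. The sketch below implements the standard (non-probabilistic) Hough vote in NumPy for a handful of points; it illustrates the voting idea only and is not how cv2.HoughLinesP is implemented. The function name, the one-degree angular step, and the toy point set are assumptions of this example.

```python
import numpy as np

def hough_accumulator(points, width, height, rho_res=1.0):
    """Vote in (rho, theta) space, with theta sampled every degree."""
    thetas = np.deg2rad(np.arange(180))
    diag = np.hypot(width, height)                # largest possible |rho|
    n_rho = int(np.ceil(2 * diag / rho_res)) + 1
    acc = np.zeros((n_rho, len(thetas)), dtype=np.int32)
    cols = np.arange(len(thetas))
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        rows = np.round((rho + diag) / rho_res).astype(int)
        acc[rows, cols] += 1                      # one vote per theta column
    return acc, thetas, diag

# Twenty points on the line y = x, plus three stray "noise" points.
line_pts = [(i, i) for i in range(0, 200, 10)]
strays = [(50, 120), (150, 20), (10, 180)]
acc, thetas, diag = hough_accumulator(line_pts + strays, 200, 200)

r, t = np.unravel_index(np.argmax(acc), acc.shape)
peak_votes = int(acc[r, t])
peak_theta_deg = int(round(np.degrees(thetas[t])))
peak_rho = r * 1.0 - diag

print("peak votes:", peak_votes, "theta (deg):", peak_theta_deg,
      "rho:", round(float(peak_rho), 2))
print("cells with >= 3 votes: ", int((acc >= 3).sum()))
print("cells with >= 15 votes:", int((acc >= 15).sum()))
```

Raising the threshold from 3 to 15 discards the cells that collected only a few accidental votes and leaves the one cell that the collinear points agree on, which is the trade the threshold argument of cv2.HoughLinesP controls.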
Step 4: Split segments into left and right by slope
A left marking rises toward the image center with negative slope (in image coordinates where y grows downward), a right marking with positive slope. Sort the Hough segments into two buckets and discard near-horizontal noise.
def split_by_slope(segments):
    left, right = [], []
    for seg in segments:
        x1, y1, x2, y2 = seg[0]
        if x2 == x1:
            continue                      # skip vertical, undefined slope
        slope = (y2 - y1) / (x2 - x1)
        if abs(slope) < 0.3:
            continue                      # skip near-horizontal clutter
        # TODO: append the point pairs (x1, y1) and (x2, y2) to `left`
        # if slope < 0, otherwise to `right`.
        ...
    return np.array(left), np.array(right)
Hint
Inside the loop, build pts = [[x1, y1], [x2, y2]] and do (left if slope < 0 else right).extend(pts). Collecting endpoints, not slopes, gives the next step raw coordinates to fit.
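The sign convention is easy to get backwards, so here is a small pure-Python check of the rule on hand-picked segments. The helper name side_of and the example coordinates are assumptions chosen to resemble markings in a 640 by 360 frame; they are not taken from a real clip.

```python
def side_of(x1, y1, x2, y2, min_abs_slope=0.3):
    """Classify a segment as 'left', 'right', or 'skip' by its image-space slope."""
    if x2 == x1:
        return "skip"                    # vertical: slope undefined
    slope = (y2 - y1) / (x2 - x1)        # y grows downward in image coordinates
    if abs(slope) < min_abs_slope:
        return "skip"                    # near-horizontal clutter
    return "left" if slope < 0 else "right"

examples = {
    "left marking": (160, 359, 280, 200),
    "left marking, endpoints swapped": (280, 200, 160, 359),
    "right marking": (600, 359, 400, 200),
    "shadow across the road": (100, 300, 300, 310),
}
for name, seg in examples.items():
    print(f"{name}: {side_of(*seg)}")
```

Swapping the endpoints of a segment flips the sign of both the numerator and the denominator, so the slope and the classification are unchanged; the order in which the Hough transform happens to report endpoints does not matter.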
Step 5: Fit each lane with a polynomial
Fit a line (degree-1 polynomial) to each bucket with np.polyfit, treating y as the independent variable so a near-vertical lane is still a function. This is the least-squares fit of Section 9.4; the slope filter in Step 4 already removed the worst outliers.
def fit_lane(points):
    if len(points) < 2:
        return None
    xs, ys = points[:, 0], points[:, 1]
    # TODO: fit x as a function of y, x = a*y + b, with np.polyfit(ys, xs, 1).
    # Return a callable that maps a y value to its fitted x.
    ...
Hint
a, b = np.polyfit(ys, xs, 1) then return lambda y: a * y + b. Fitting x as a function of y (not the reverse) avoids the infinite-slope problem for vertical lanes, the same reasoning Section 9.5 uses with its quadratic fit.
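A short numerical sketch shows why the roles of x and y are swapped. It fits x = a*y + b to two synthetic markings, one steep and one exactly vertical (both invented for this example), and reports the slope the same line would need in the conventional y = m*x + c form.

```python
import numpy as np

ys = np.arange(180.0, 360.0, 10.0)            # rows from the horizon to the bumper
lanes = {
    "steep": 300.0 + 0.02 * (ys - 180.0),     # drifts 0.02 px sideways per row
    "vertical": np.full_like(ys, 300.0),      # the same column on every row
}

fits = {}
for name, xs in lanes.items():
    a, b = np.polyfit(ys, xs, 1)              # x = a*y + b
    # Slope the same line would have if written as y = m*x + c.
    dy_dx = np.inf if abs(a) < 1e-9 else 1.0 / a
    fits[name] = (a, b)
    print(f"{name:8s} a={a:+.4f} b={b:.1f} equivalent dy/dx={dy_dx:.1f}")
```

In the x = a*y + b form the coefficients stay small and finite for both markings, while the equivalent dy/dx slope grows without bound as the marking approaches vertical, which is exactly the regime lane markings occupy in a forward-facing camera.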
Step 6: Turn two lanes into a departure metric
Evaluate both fitted lanes at the bumper line (the bottom row), take the midpoint as the lane center, and compare it with the image center. A large signed gap means the vehicle has drifted out of the middle of its lane.
def departure_offset(left_fn, right_fn):
    if left_fn is None or right_fn is None:
        return None
    y = H - 1                                     # bumper line
    lane_center = (left_fn(y) + right_fn(y)) / 2
    # TODO: return the signed offset of the image center (W/2) from the
    # lane center, in pixels. Positive means the car sits right of center.
    ...
Hint
return (W / 2) - lane_center. A positive value means the camera (and therefore the vehicle) sits to the right of the lane midpoint, so the tool should prompt the driver to steer left.
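As a sanity check on the sign and size of the metric, the sketch below evaluates the same offset formula on the ground-truth geometry of the synthetic clip, reusing the drift and curve terms from synth_frame in Step 1. The helper name is an assumption of this example, and the numbers describe where the markings are painted, not what the detector will report: the straight-line fit of Step 5 extrapolates a curved marking and will generally land a few pixels away.

```python
import numpy as np

W, H = 640, 360

def true_lane_center_at_bumper(t):
    """Midpoint of the two painted markings on the bottom row, per synth_frame."""
    y = H - 1
    z = (y - H // 2) / (H // 2)
    drift = int(40 * np.sin(2 * np.pi * t))
    # The markings sit at cx - half and cx + half, so their midpoint is cx.
    return W // 2 + drift + int(60 * z * z)

offsets = {}
for t in (0.0, 0.25, 0.75):
    lane_center = true_lane_center_at_bumper(t)
    offsets[t] = W / 2 - lane_center              # same formula as departure_offset
    print(f"t={t:.2f} lane center={lane_center} offset={offsets[t]:+.0f}px")
```

Because the generator adds a curve term that shifts the lane to the right near the bumper, the painted lane midpoint is not at the image center even when the drift term is zero. Expect the measured offset on the synthetic clip to swing around a negative baseline, and keep that in mind when choosing the warning threshold in Step 7.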
Step 7: Annotate every frame and write the output video
Draw the two fitted lanes, print the offset, and flash a warning banner when the absolute offset exceeds a threshold. Write the annotated frames to lane_warning.mp4, the artifact you keep.
def annotate(frame, left_fn, right_fn, offset, threshold=35):
    out = frame.copy()
    for fn, color in [(left_fn, (0, 255, 0)), (right_fn, (0, 255, 0))]:
        if fn is None:
            continue
        ys = np.arange(H // 2, H, 5)
        # cv2.polylines expects int32 point arrays.
        pts = np.array([[int(fn(y)), int(y)] for y in ys], dtype=np.int32)
        cv2.polylines(out, [pts], False, color, 3)
    if offset is not None:
        cv2.putText(out, f"offset {offset:+.0f}px", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 255), 2)
        # TODO: if abs(offset) > threshold, draw a red "LANE DEPARTURE" banner
        # near the top of the frame with cv2.putText.
        ...
    return out

writer = cv2.VideoWriter("lane_warning.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"), 20, (W, H))
for frame in frames():
    edges = edge_map(frame)
    left, right = split_by_slope(hough_segments(edges))
    lf, rf = fit_lane(left), fit_lane(right)
    off = departure_offset(lf, rf)
    writer.write(annotate(frame, lf, rf, off))
writer.release()
print("wrote lane_warning.mp4")
Hint
For the banner: if abs(offset) > threshold: cv2.putText(out, "LANE DEPARTURE", (W//2 - 140, 70), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 3). On the synthetic clip the banner should appear at the extremes of the side-to-side drift.
Expected Output
One video file, lane_warning.mp4, roughly six seconds long at 20 frames per second. Each frame shows the two detected lane markings traced in green with a live offset readout in the corner; on the synthetic clip the car drifts left and right, so a red LANE DEPARTURE banner appears at the extremes of the swing and disappears as it returns to center. The console prints a single line, wrote lane_warning.mp4. If you supply a real drive.mp4, the green traces should hug the painted lane lines on straight road and the offset should stay near zero while you hold the lane.
Stretch Goals
- Replace the degree-1 fit in Step 5 with a degree-2 polynomial (np.polyfit(ys, xs, 2)) and add the curvature-radius formula of Section 9.5 to the on-screen readout, so the tool reports turn sharpness as well as departure.
- Make the fit robust: before np.polyfit, run a small RANSAC loop (Section 9.4) that samples two points, scores inliers within a pixel band, and keeps the best consensus set. Compare the trace stability against the plain least-squares version on a clip with dashed markings.
- Library shortcut, the Right Tool principle in action: swap your hand-built thresholding and Hough stack for a single call to the bird's-eye and sliding-window approach of Section 9.5, then contrast both with a learned detector such as CLRerNet (see the chapter bibliography). State which pipeline you would ship and why.
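For the RANSAC stretch goal, one possible starting point is sketched below. It follows the recipe stated above (sample two points, count inliers within a pixel band, keep the best consensus set) and then refits the consensus set by least squares. The function name, the 200-iteration budget, the 3-pixel band, and the synthetic test points are all assumptions of this example, and the band is measured horizontally in x rather than perpendicular to the line, which is a simplification.

```python
import numpy as np

def ransac_line(points, n_iters=200, band=3.0, seed=0):
    """Robustly fit x = a*y + b. Returns (a, b, inlier_mask) or None."""
    rng = np.random.default_rng(seed)
    xs = points[:, 0].astype(float)
    ys = points[:, 1].astype(float)
    best = None
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        if ys[i] == ys[j]:
            continue                              # horizontal pair: no x(y) model
        a = (xs[i] - xs[j]) / (ys[i] - ys[j])
        b = xs[i] - a * ys[i]
        inliers = np.abs(xs - (a * ys + b)) <= band
        if best is None or inliers.sum() > best.sum():
            best = inliers
    if best is None or best.sum() < 2:
        return None
    a, b = np.polyfit(ys[best], xs[best], 1)      # refit on the consensus set
    return a, b, best

# Thirty points on x = 0.5*y + 100, plus six stray points far to the right.
ys_in = np.linspace(180, 359, 30)
lane_pts = np.column_stack([0.5 * ys_in + 100, ys_in])
ys_out = np.array([330, 335, 340, 345, 350, 355], dtype=float)
stray_pts = np.column_stack([np.full(6, 500.0), ys_out])
points = np.vstack([lane_pts, stray_pts])

a_ls, b_ls = np.polyfit(points[:, 1], points[:, 0], 1)
a_r, b_r, mask = ransac_line(points)

print(f"least squares: a={a_ls:.3f} b={b_ls:.1f}")
print(f"RANSAC:        a={a_r:.3f} b={b_r:.1f} inliers={int(mask.sum())}")
```

In the lab pipeline, a function like this would take the place of the np.polyfit call inside fit_lane, with the returned a and b wrapped in the same lambda. The fixed seed makes runs repeatable while you compare trace stability; drop it if you want to see the run-to-run variation that a small iteration budget produces.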
Complete Solution
import cv2
import numpy as np

W, H = 640, 360

def synth_frame(t):
    """One synthetic dashcam frame at phase t in [0, 1)."""
    img = np.full((H, W, 3), 60, np.uint8)
    drift = int(40 * np.sin(2 * np.pi * t))
    for y in range(H // 2, H):
        z = (y - H // 2) / (H // 2)
        half = int(40 + 180 * z)
        cx = W // 2 + drift + int(60 * z * z)
        cv2.circle(img, (cx - half, y), 2, (255, 255, 255), -1)
        cv2.circle(img, (cx + half, y), 2, (255, 255, 255), -1)
    return img

def frames(path="drive.mp4", n=120):
    """Yield frames from the clip if it opens, else n synthetic frames."""
    cap = cv2.VideoCapture(path)
    if cap.isOpened():
        while True:
            ok, f = cap.read()
            if not ok:
                break
            yield cv2.resize(f, (W, H))
        cap.release()
        return
    for i in range(n):
        yield synth_frame(i / n)

def edge_map(bgr):
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blur, 60, 180)
    roi = np.zeros_like(edges)
    # cv2.fillPoly expects int32 vertex coordinates.
    poly = np.array([[(0, H), (W, H), (W // 2, H // 2)]], dtype=np.int32)
    cv2.fillPoly(roi, poly, 255)
    return cv2.bitwise_and(edges, roi)

def hough_segments(edges):
    lines = cv2.HoughLinesP(edges, 2, np.pi / 180, 40,
                            minLineLength=20, maxLineGap=100)
    return lines if lines is not None else []

def split_by_slope(segments):
    left, right = [], []
    for seg in segments:
        x1, y1, x2, y2 = seg[0]
        if x2 == x1:
            continue                      # vertical, undefined slope
        slope = (y2 - y1) / (x2 - x1)
        if abs(slope) < 0.3:
            continue                      # near-horizontal clutter
        pts = [[x1, y1], [x2, y2]]
        (left if slope < 0 else right).extend(pts)
    return np.array(left), np.array(right)

def fit_lane(points):
    if len(points) < 2:
        return None
    xs, ys = points[:, 0], points[:, 1]
    a, b = np.polyfit(ys, xs, 1)          # x = a*y + b
    return lambda y: a * y + b

def departure_offset(left_fn, right_fn):
    if left_fn is None or right_fn is None:
        return None
    y = H - 1                             # bumper line
    lane_center = (left_fn(y) + right_fn(y)) / 2
    return (W / 2) - lane_center

def annotate(frame, left_fn, right_fn, offset, threshold=35):
    out = frame.copy()
    for fn in (left_fn, right_fn):
        if fn is None:
            continue
        ys = np.arange(H // 2, H, 5)
        # cv2.polylines expects int32 point arrays.
        pts = np.array([[int(fn(y)), int(y)] for y in ys], dtype=np.int32)
        cv2.polylines(out, [pts], False, (0, 255, 0), 3)
    if offset is not None:
        cv2.putText(out, f"offset {offset:+.0f}px", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 255), 2)
        # Nested under the None check so a frame with a missing lane cannot crash here.
        if abs(offset) > threshold:
            cv2.putText(out, "LANE DEPARTURE", (W // 2 - 140, 70),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 3)
    return out

if __name__ == "__main__":
    writer = cv2.VideoWriter("lane_warning.mp4",
                             cv2.VideoWriter_fourcc(*"mp4v"), 20, (W, H))
    for frame in frames():
        edges = edge_map(frame)
        left, right = split_by_slope(hough_segments(edges))
        lf, rf = fit_lane(left), fit_lane(right)
        off = departure_offset(lf, rf)
        writer.write(annotate(frame, lf, rf, off))
    writer.release()
    print("wrote lane_warning.mp4")
Bibliography & Further Reading
Foundational Papers
HoughLinesP: random sampling plus on-the-fly segment extraction, the practical variant recommended throughout Sections 9.3 and 9.5.
fitEllipse, used in Section 9.4; a model of how a clever constraint turns a hard problem into an eigenvalue problem.
Recent Research (2023-2026)
Books
Tools & Libraries
cv2.Canny, including the aperture and L2-gradient flags tuned in Section 9.2.
HoughLines and HoughLinesP as used in Sections 9.3 and 9.5, with the accumulator semantics spelled out.