Section 2.1: Brightness, Contrast & Gamma Correction

"They told me to brighten up. So I added 50 to everything and clipped my feelings at 255."
A Chronically Underexposed Security Camera

Big Picture

A point operation transforms each pixel using only that pixel's own value, which means every such operation, no matter how fancy its formula, is just a curve from input intensity to output intensity. Brightness, contrast, gamma, the "curves" tool in Photoshop, the tone mapping inside your phone's camera app: all of them are functions $s = T(r)$ applied independently at every pixel, and all of them can be compiled into a 256-entry lookup table that runs at memory speed. Master the curve and you have mastered the entire family.

In Chapter 1 we followed light from photons to a quantized grid of numbers, and in Chapter 0 we learned to hold that grid as a NumPy array without mangling its dtype. Until now, though, we have only loaded, inspected, and converted images. This section is where we change one for the first time. We start with the gentlest possible change: transformations that look at a single pixel value and replace it with another, with no knowledge of any neighboring pixel. The restriction sounds severe, yet it covers nearly every exposure and tone adjustment performed in practice. The illustration below captures the deal in one image: each pixel is judged entirely on its own value, wearing blinkers to its neighbors.

Illustration: a grid of cartoon pixels in which one central pixel, wearing blinkers, is repainted a new shade by a small paint roller while ignoring the neighbors waving beside it. Every point operation judges a pixel entirely on its own merits: profound objectivity and a permanent blind spot in one tidy package.

1. Point Operations: One Pixel In, One Pixel Out Basic

Formally, a point operation (also called an intensity transformation) maps an input image $f$ to an output image $g$ via a function $T$ that depends only on the pixel's own value:

$$g(x, y) = T\big(f(x, y)\big)$$

The coordinates $(x, y)$ appear on both sides but $T$ itself ignores them: the same function is applied at every location. Contrast this with the neighborhood operations of Chapter 3, where the output at $(x,y)$ depends on a whole window of surrounding pixels. The independence of point operations has three practical consequences. First, they are trivially parallel: every pixel can be processed at once, which is why they vectorize perfectly in NumPy and run essentially for free on a GPU. Second, they can never sharpen, blur, or denoise, because those effects require comparing neighbors. Third, for 8-bit images, any point operation whatsoever can be precomputed as a table of 256 output values, a fact we will exploit shortly.
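The third consequence is easy to check numerically. The sketch below uses NumPy only; the function `T` and the random stand-in image are hypothetical, chosen just to show that evaluating a point operation at every pixel and indexing a precomputed 256-entry table give the same result.

```python
import numpy as np

def T(r):
    """A hypothetical, deliberately awkward point operation on values in [0, 255]."""
    r = np.asarray(r, dtype=np.float64)
    return np.clip(0.9 * 255.0 * np.sqrt(r / 255.0) + 10.0, 0, 255)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 6), dtype=np.uint8)   # stand-in for a real image

# Direct route: evaluate T at every pixel.
direct = np.rint(T(img)).astype(np.uint8)

# Table route: evaluate T once per possible input value, then index with the image.
table = np.rint(T(np.arange(256))).astype(np.uint8)
via_table = table[img]

print(np.array_equal(direct, via_table))   # True
```

The two routes agree exactly because the table stores the same rounded value of `T` that the direct route would compute for each of the 256 possible inputs.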

Because $T$ maps the interval $[0, 255]$ to itself, we can draw it: input intensity on the horizontal axis, output intensity on the vertical. Figure 2.1.1 plots the transfer curves of the four operations this section covers. The identity is the diagonal; brightness shifts the diagonal up; contrast rotates it steeper around a pivot; gamma bends it into a power-law curve. Reading transfer curves fluently is a genuinely useful skill: it is exactly how photographers read the "curves" panel, and how you will debug a mysterious tone shift in a data pipeline.

Figure 2.1.1 Transfer curves of common point operations. Every point operation is a curve from input intensity $r$ to output intensity $s$. Brightness translates the identity line, contrast steepens it around a pivot, and gamma bends it into a power law. The flat segments at 0 and 255 are clipping: information in those ranges is permanently destroyed.
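To make the curve shapes concrete, the sketch below tabulates four transfer curves at a few sample inputs. The specific parameters (a shift of 50, a gain of 1.5 around a pivot of 128, and a gamma of 0.5) are assumptions picked for illustration and are not necessarily the ones plotted in the figure.

```python
import numpy as np

r = np.array([0, 64, 128, 192, 255], dtype=np.float64)   # sample input intensities

def sat(x):
    """Round to the nearest level and saturate to the uint8 range."""
    return np.clip(np.rint(x), 0, 255).astype(np.uint8)

gain, pivot = 1.5, 128.0                        # assumed contrast parameters
identity   = sat(r)
brightness = sat(r + 50)                        # translate the diagonal upward
contrast   = sat(gain * (r - pivot) + pivot)    # steepen around the pivot
gamma      = sat(255.0 * (r / 255.0) ** 0.5)    # bend into a power law

# "Rotating around a pivot" is still a straight line: gain * r + offset,
# with offset = pivot * (1 - gain).
offset = pivot * (1.0 - gain)                   # -64.0 for these values
assert np.allclose(gain * (r - pivot) + pivot, gain * r + offset)

for name, curve in [("identity", identity), ("brightness", brightness),
                    ("contrast", contrast), ("gamma", gamma)]:
    print(f"{name:10s}", curve.tolist())
```

Reading the printed rows left to right is the numeric equivalent of reading the plot: the brightness row saturates at 255 on the right, the contrast row saturates at both ends, and the gamma row keeps both endpoints fixed while lifting everything in between.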

2. Brightness and Contrast: The Linear Family Basic

The workhorse point operation is affine:

$$g(x, y) = \alpha \cdot f(x, y) + \beta$$

where $\alpha$ (the gain) controls contrast and $\beta$ (the bias) controls brightness. With $\alpha > 1$ the intensity range stretches and the image looks punchier; with $\alpha < 1$ it compresses and the image looks flat. Positive $\beta$ lifts everything toward white. The subtlety is not the formula but the arithmetic around it: the result $\alpha r + \beta$ routinely lands outside $[0, 255]$, and as we saw when discussing dtypes in Chapter 0, uint8 arithmetic wraps around rather than saturating. The from-scratch implementation below does it correctly: compute in float, clip, then convert back.

# Affine point operation g = alpha*f + beta, done the safe way:
# promote to float32 first so the intermediate cannot overflow,
# clip to the displayable range, then convert back to uint8.
import numpy as np
import cv2

img = cv2.imread("scene.jpg")          # uint8 BGR, shape (H, W, 3)

def adjust_brightness_contrast(img, alpha=1.0, beta=0):
    """Apply g = alpha * f + beta with correct saturation."""
    out = img.astype(np.float32) * alpha + beta   # work in float32: no overflow
    out = np.clip(np.rint(out), 0, 255)           # round to nearest (astype truncates), then saturate
    return out.astype(np.uint8)                   # back to displayable uint8

punchy = adjust_brightness_contrast(img, alpha=1.3, beta=-20)
washed = adjust_brightness_contrast(img, alpha=0.6, beta=80)

print(img[0, 0], "->", punchy[0, 0])
# Example output for a pixel [200 180 90]:
# [200 180 90] -> [240 214 97]   (the 200 channel hit 240 = 1.3*200 - 20)

Code Fragment 1: A safe from-scratch brightness and contrast adjustment: adjust_brightness_contrast uses a float32 intermediate, an explicit clip to [0, 255], then conversion back to uint8. Skipping the clip is the classic source of psychedelic wrap-around artifacts. The spot-check pixel stays in range (1.3 × 200 − 20 = 240), so it confirms the arithmetic rather than the clipping; with these parameters any channel value of 212 or more would exceed 255 and be pinned there.
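For contrast, here is what the unsafe route looks like. This is a minimal sketch with four hypothetical pixel values; it adds 50 directly in uint8 and compares the result with the float-clip-convert path.

```python
import numpy as np

pixels = np.array([10, 100, 200, 250], dtype=np.uint8)   # hypothetical sample values
shift = np.full_like(pixels, 50)                          # uint8 array of 50s

# uint8 + uint8 stays uint8, so the sum is taken modulo 256.
wrapped = pixels + shift

# Safe route: widen first, saturate, then narrow.
safe = np.clip(pixels.astype(np.float32) + 50, 0, 255).astype(np.uint8)

print(wrapped.tolist())   # [60, 150, 250, 44]   <- 250 + 50 wrapped to 44
print(safe.tolist())      # [60, 150, 250, 255]  <- 250 + 50 saturated at 255
```

Only the brightest pixel differs, which is why the bug is easy to miss on dark test images and shows up as dark speckles inside highlights on real ones.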

The function above is six lines. OpenCV collapses it to one, and handles the saturation internally with optimized SIMD code. Whenever you find yourself writing the float-clip-convert dance for an affine adjustment, reach for the library instead.

Library Shortcut: cv2.convertScaleAbs

The six-line from-scratch version becomes a single call:

punchy = cv2.convertScaleAbs(img, alpha=1.3, beta=-20)

The entire affine adjustment, saturation included, as a single OpenCV call.

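That is a 6-to-1 line reduction, and the library offers more than convenience: convertScaleAbs performs saturating arithmetic in vectorized SIMD instructions, processes all channels at once, and never materializes a float copy of the image, so it is both faster and more memory-frugal than the NumPy version. The one behavioral difference deserves attention: it takes the absolute value of $\alpha f + \beta$ before saturating, so any pixel whose result is negative is mirrored to a positive value instead of being clamped to 0. That can happen with a perfectly ordinary negative $\beta$: with alpha=1.3 and beta=-20, every input below about 15 is affected, and an input of 0 comes out as 20 rather than 0. When negative results are possible and you want them clamped, cv2.addWeighted(img, alpha, img, 0, beta) is a saturating alternative.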
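The effect of the absolute-value step can be seen without OpenCV installed. The sketch below is a NumPy emulation, not the library call itself: it assumes the documented behavior of convertScaleAbs (scale, take the absolute value, round, saturate to uint8) and compares it with the clamp-at-zero behavior of the from-scratch function. The sample inputs are hypothetical.

```python
import numpy as np

def affine_clip(r, alpha, beta):
    """Clamp negative results to 0, like the from-scratch float-clip-convert path."""
    return np.clip(np.rint(alpha * r + beta), 0, 255).astype(np.uint8)

def affine_abs(r, alpha, beta):
    """Emulates |alpha * r + beta| followed by rounding and saturation."""
    return np.clip(np.rint(np.abs(alpha * r + beta)), 0, 255).astype(np.uint8)

r = np.array([0, 4, 10, 14, 20, 200], dtype=np.float64)   # dark inputs plus one bright one

print(affine_clip(r, 1.3, -20).tolist())   # [0, 0, 0, 0, 6, 240]
print(affine_abs(r, 1.3, -20).tolist())    # [20, 15, 7, 2, 6, 240]
```

The two agree wherever the affine result is non-negative and diverge only in the deepest shadows, where the absolute value makes the darkest pixels come out brighter than slightly less dark ones.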

Key Insight: Clipping Destroys Information Forever

Every pixel that hits 0 or 255 after a point operation has lost its identity: distinct input values were mapped to the same output, and no later processing can tell them apart. A point operation is invertible only where its curve is strictly monotonic and unclipped. This is why professional pipelines adjust exposure on raw or float data and convert to uint8 last, and why in training pipelines you should apply photometric augmentation before quantizing, a discipline that returns in Chapter 21.
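A round trip on a synthetic ramp shows the loss directly. This is a small sketch under assumed values: a shift of +100 followed by −100, applied with saturation to every possible uint8 input. The helper `shift_saturate` is introduced here for illustration only.

```python
import numpy as np

ramp = np.arange(256, dtype=np.uint8)        # every possible input value, once

def shift_saturate(values, beta):
    """Add beta in a wider integer type, then saturate back to uint8."""
    out = values.astype(np.int16) + beta
    return np.clip(out, 0, 255).astype(np.uint8)

brighter = shift_saturate(ramp, +100)        # inputs 156..255 all land on 255
restored = shift_saturate(brighter, -100)    # ...and all come back as 155

lost = int(np.count_nonzero(restored != ramp))
print(lost)                  # 100 input values cannot be recovered
print(int(restored.max()))   # 155: everything brighter has been merged
```

The shadows and midtones survive the round trip untouched; the top 100 levels collapse into a single value, which is the flat segment of the transfer curve seen from the data side.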

3. Gamma Correction: The Power-Law Curve Intermediate

Linear adjustments move all intensities by the same recipe, but human vision does not work linearly: we are far more sensitive to differences in shadows than in highlights. As covered in Chapter 1, this is why almost every stored image is gamma-encoded: the camera applies a power law that allocates more of the 256 code values to darker tones, and the display applies the inverse. The same power law, applied deliberately, is the standard tool for tonal correction. On intensities normalized to $[0, 1]$:

$$s = r^{\gamma}$$

With $\gamma < 1$ the curve bows upward and shadows brighten dramatically while highlights barely move (the red curve in Figure 2.1.1). With $\gamma > 1$ the curve bows downward and the image darkens, with midtones affected most (the orange curve). Because the endpoints 0 and 1 map to themselves, gamma never clips: it redistributes the tonal range instead of truncating it, which is exactly why it is preferred over a brightness shift for fixing exposure. The redistribution is not entirely free on 8-bit data, since neighboring codes on the flatter part of the curve can merge after requantization, but no range of inputs is pinned to 0 or 255.

# Gamma correction via a precomputed 256-entry lookup table:
# evaluate s = r**gamma once for every possible uint8 input,
# then let cv2.LUT remap millions of pixels with a single table read.
import numpy as np
import cv2

def gamma_correct(img, gamma):
    """Apply s = r**gamma on normalized intensities via a 256-entry LUT."""
    r = np.arange(256, dtype=np.float32) / 255.0   # all possible inputs, normalized
    table = np.clip(255.0 * (r ** gamma), 0, 255).astype(np.uint8)
    return cv2.LUT(img, table)                     # one table lookup per pixel

img = cv2.imread("underexposed_loading_dock.jpg")
brighter = gamma_correct(img, gamma=0.5)   # gamma < 1 lifts shadows
darker   = gamma_correct(img, gamma=2.0)   # gamma > 1 deepens them

# Spot-check the curve itself:
print(int(255 * (50/255) ** 0.5))    # 112  : deep shadow 50 jumps to 112
print(int(255 * (200/255) ** 0.5))   # 225  : highlight 200 moves only to 225

Code Fragment 2: Gamma correction implemented as a lookup table inside gamma_correct: the curve is evaluated once for the 256 possible input values, then applied to millions of pixels with a single cv2.LUT call. The printed spot-checks show the signature behavior: input 50 jumps to 112 while 200 only creeps to 225, so shadows move a lot and highlights barely move.

Note what the code does: rather than computing img ** gamma over the whole array (millions of float power operations), it evaluates the curve at the 256 possible uint8 values and lets cv2.LUT do a table lookup per pixel. The spot-check printout confirms the asymmetry that makes gamma so useful: an input of 50 jumps to 112, while 200 only creeps to 225. If you work in PyTorch data pipelines, the equivalent is torchvision.transforms.v2.functional.adjust_gamma, and randomized gamma is a standard photometric augmentation we will meet again in Chapter 21.
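One property of the table is worth checking for yourself: gamma keeps both endpoints fixed, but on 8-bit data some neighboring codes still merge where the curve is flat. The sketch below rebuilds the same kind of table with NumPy only (the helper name `gamma_table` is ours, and it mirrors the truncating conversion used above) and counts how many distinct output levels survive.

```python
import numpy as np

def gamma_table(gamma):
    """256-entry table for s = r**gamma on normalized intensities."""
    r = np.arange(256, dtype=np.float32) / 255.0
    return np.clip(255.0 * (r ** gamma), 0, 255).astype(np.uint8)

for g in (0.5, 2.0):
    table = gamma_table(g)
    distinct = len(np.unique(table))
    # Endpoints stay fixed (no clipping), yet fewer than 256 levels remain.
    print(g, int(table[0]), int(table[255]), distinct)
```

For $\gamma = 0.5$ the merging happens in the highlights, where the curve is flat; for $\gamma = 2.0$ it happens in the shadows. This is one more reason to apply tone curves to float or high-bit-depth data when the pipeline allows it.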

Warning: Know Whether Your Pixels Are Linear or Gamma-Encoded

A JPEG straight from a camera is already sRGB gamma-encoded; pixel 128 is not half the light of 255. If you average, blend, or resize such an image, you are doing math on nonlinear values, which is usually acceptable but technically wrong; if you simulate physics or relight a scene, you must linearize first (apply roughly $\gamma = 2.2$), operate, and re-encode. Mixing up the two states is one of the most common silent bugs in imaging code, and it matters again when computing the quality metrics introduced in Chapter 1.
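A two-pixel blend shows the size of the error. The sketch below uses the standard piecewise sRGB transfer function rather than the $\gamma \approx 2.2$ approximation, and averages pure black with pure white in both domains. The helper names are ours, and the pixel values are chosen for illustration.

```python
import numpy as np

def srgb_to_linear(c):
    """sRGB-encoded value in [0, 1] -> linear light in [0, 1]."""
    c = np.asarray(c, dtype=np.float64)
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(lin):
    """Linear light in [0, 1] -> sRGB-encoded value in [0, 1]."""
    lin = np.asarray(lin, dtype=np.float64)
    return np.where(lin <= 0.0031308, lin * 12.92, 1.055 * lin ** (1 / 2.4) - 0.055)

black, white = 0.0, 1.0

# Naive: average the encoded values directly.
naive = 255 * (black + white) / 2

# Physically motivated: linearize, average the light, re-encode.
linear_mix = (srgb_to_linear(black) + srgb_to_linear(white)) / 2
correct = 255 * float(linear_to_srgb(linear_mix))

print(round(naive, 1), round(correct, 1))   # 127.5 versus roughly 188
```

A 50/50 mix of black and white light should be stored as a code value near 188, not 128. The naive blend is about 60 levels too dark, which is the kind of discrepancy that shows up as dark fringes when resizing high-contrast edges.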

Practical Example: The Warehouse Cameras That "Needed Replacing"

Who: A computer-vision engineer at a logistics company running barcode and label detection on 240 fixed dock cameras.

Situation: Cameras at the loading docks faced bright doorways, so the auto-exposure metered for the daylight outside and the package labels in the foreground sat in the bottom eighth of the intensity range.

Problem: The label detector's recall at the dock cameras was 31 points below the warehouse average. The operations team proposed replacing all dock cameras with HDR units, a six-figure purchase.

Decision: Before approving hardware, the engineer added a single preprocessing step to the dock-camera streams: gamma correction with $\gamma = 0.45$ applied through a precomputed LUT, costing under a millisecond per frame on CPU.

Result: Label contrast in the shadow region roughly tripled, recall recovered to within 4 points of the warehouse average, and the camera replacement was shelved.

Lesson: When a model fails on dark inputs, try a 256-byte lookup table before a hardware budget. Point operations are the cheapest knob in the entire vision stack, so turn them first.

4. Lookup Tables: Point Operations at Production Speed Intermediate

The gamma example quietly introduced the most important implementation idea of this section: for 8-bit data there are only 256 possible input values, so any point operation, however expensive its formula, costs exactly 256 function evaluations plus one table lookup per pixel. That precomputed table of outputs is a lookup table (LUT), and we will use the acronym from here on. Better still, lookup tables compose. If you want to apply contrast, then gamma, then a tone tweak, you do not run three passes over the image; you compose the three curves into one table and run a single pass. This is precisely how camera ISPs, video pipelines, and color-grading systems work internally, and it is why colorists exchange ".cube" LUT files rather than scripts. The illustration below makes the trick concrete: compute the curve once into a row of cubbies, then hand every pixel its finished color at memory speed.

Illustration: a friendly robot beside a cabinet of pre-filled color cubbies, with a tone curve painted on the cabinet, hands a finished color chip to each grey pixel as it files past. Curve, table, compose: compute the tone curve once into 256 cubbies and every pixel after that is just a free lookup.

# Compose a whole tone pipeline (contrast, brightness, gamma) into ONE
# 256-entry table built before the video loop, so each frame costs a
# single cv2.LUT pass no matter how many stages were folded in.
import numpy as np
import cv2

def build_tone_lut(alpha=1.0, beta=0.0, gamma=1.0):
    """Compose contrast/brightness and gamma into ONE 256-entry table."""
    r = np.arange(256, dtype=np.float32)
    r = np.clip(alpha * r + beta, 0, 255) / 255.0   # stage 1: affine
    s = np.clip(255.0 * (r ** gamma), 0, 255)       # stage 2: gamma
    return s.astype(np.uint8)

lut = build_tone_lut(alpha=1.2, beta=-10, gamma=0.8)

cap = cv2.VideoCapture("dock_camera.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    corrected = cv2.LUT(frame, lut)    # entire tone pipeline: one lookup pass
    # ... feed `corrected` to the detector ...
cap.release()                          # free the video file/device handle when done

Code Fragment 3: Composing an entire tone pipeline (contrast, brightness, gamma) into a single lookup table with build_tone_lut, built once outside the video loop. Inside the cap.read() loop the whole adjustment is one cv2.LUT call per frame, regardless of how many stages were composed, which is why the per-frame cost is independent of the number of curves.

The pattern in this code, build the table once and reuse it for every frame, is worth internalizing. It moves the expensive math from once per pixel to once per possible code value, and it scales to color: a 3D LUT tabulates a function of (R, G, B) jointly on a coarse lattice and interpolates between entries, which is how film looks and display calibration are implemented. Keep this in mind when you reach learned enhancement later: several state-of-the-art methods literally predict LUTs.
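Composition also works on tables that already exist, without going back to the formulas: indexing one table with another yields the table of the composed curve. The following is a minimal NumPy-only sketch; the two curves and the helper `to_table` are hypothetical.

```python
import numpy as np

def to_table(curve):
    """Sample a curve on normalized intensities into a 256-entry uint8 table."""
    r = np.arange(256, dtype=np.float64) / 255.0
    return np.clip(np.rint(255.0 * curve(r)), 0, 255).astype(np.uint8)

contrast_lut = to_table(lambda r: 1.2 * r - 0.04)   # assumed affine stage
gamma_lut = to_table(lambda r: r ** 0.8)            # assumed gamma stage

# composed[v] == gamma_lut[contrast_lut[v]] for every possible input v.
composed = gamma_lut[contrast_lut]

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)   # stand-in frame

two_pass = gamma_lut[contrast_lut[img]]   # two lookups per pixel
one_pass = composed[img]                  # one lookup per pixel

print(np.array_equal(one_pass, two_pass))   # True
```

Composing by indexing reproduces the multi-pass result exactly, including its intermediate rounding to 8 bits. Folding the stages together in float before quantizing once, as build_tone_lut does, avoids that intermediate rounding and can therefore differ from the chained tables by an intensity level.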

The Point-Operation Recipe in Three Words: Curve, Table, Compose

If you remember nothing else from this section, remember those three words. Curve: every point operation, however fancy its formula, is a single transfer function $s = T(r)$ from input intensity to output intensity. Table: for 8-bit data that curve is exactly 256 numbers, so you precompute it once as a LUT and the per-pixel cost drops to one lookup. Compose: stacking operations means composing curves into one table, so any chain of tone adjustments still costs a single pass. Curve, table, compose is the entire family, and it is why a colorist ships a .cube file instead of a script.

Fun Fact: Gamma Is a CRT Fossil That Turned Out to Be Genius

The number 2.2 originally described the physics of cathode-ray tubes: the beam current of a CRT responded to voltage as roughly $V^{2.2}$, so cameras pre-distorted the signal with the inverse curve. When CRTs died, gamma should have died with them, except engineers noticed the pre-distortion was almost exactly the perceptual encoding you would design on purpose: it spends scarce bits where human vision is most sensitive. The fossil became the standard (sRGB), and every JPEG you have ever taken carries it.

You Could Build This: A One-File Photo-Filter LUT Pack

You now have everything needed to build a tiny command-line photo-filter tool. Use build_tone_lut to bake a handful of named looks (a warm "golden hour" curve, a flat "matte" curve, a punchy high-contrast curve) into 256-entry tables, then apply the chosen one to any image with a single cv2.LUT pass. Because tables compose, stacking two looks is still one pass, so a user can chain "matte then punchy" for free. This is exactly how Instagram-style filters and colorist .cube packs work under the hood, which makes it a satisfying first portfolio piece. Complexity: QUICK, about 20 to 30 minutes. Stretch it by loading and applying a real .cube 3D LUT file so your tool reads the same presets professional grading software exports.
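A single 256-entry table treats all channels alike, so a tinted look such as "golden hour" needs one curve per channel. The sketch below shows one possible way to do that with NumPy indexing. The gamma values are made up, the channel order is assumed to be BGR as returned by cv2.imread, and `curve_table` is a helper introduced here.

```python
import numpy as np

def curve_table(gamma):
    """256-entry gamma table, rounded to the nearest level."""
    r = np.arange(256, dtype=np.float64) / 255.0
    return np.clip(np.rint(255.0 * r ** gamma), 0, 255).astype(np.uint8)

# Hypothetical "warm" look for a BGR image: darken blue, keep green, lift red.
warm = np.stack([curve_table(1.15),    # B
                 curve_table(1.0),     # G (identity)
                 curve_table(0.85)],   # R
                axis=1)                # shape (256, 3): one column per channel

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)   # stand-in BGR image

# out[y, x, c] = warm[img[y, x, c], c]: each channel reads its own column.
out = warm[img, np.arange(3)]

print(out.shape, out.dtype)
```

Each named look then becomes one small array, and switching looks means swapping which array is used for indexing.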

5. Choosing Parameters Automatically: Percentile Stretching Intermediate

Everything so far required a human to pick $\alpha$, $\beta$, or $\gamma$. A first step toward automation is contrast stretching: map the darkest interesting intensity to 0 and the brightest to 255, linearly. Using the absolute minimum and maximum is fragile (one dead pixel ruins the stretch), so robust implementations use percentiles, typically the 2nd and 98th:

$$g = 255 \cdot \frac{\mathrm{clip}(f,\, p_2,\, p_{98}) - p_2}{p_{98} - p_2}$$

# Pick the stretch endpoints automatically from the data: map the
# low/high percentiles (not the raw min/max) to 0 and 255, so a single
# dead pixel cannot dictate the contrast of the whole frame.
import numpy as np

def autocontrast(gray, low_pct=2, high_pct=98):
    """Percentile-based contrast stretch (robust to outlier pixels)."""
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi <= lo:                       # flat image: nothing to stretch
        return gray.copy()
    out = (gray.astype(np.float32) - lo) * (255.0 / (hi - lo))
    return np.clip(out, 0, 255).astype(np.uint8)

# A hazy image occupying only [90, 170] gets remapped to use [0, 255]:
hazy = np.random.randint(90, 171, size=(480, 640), dtype=np.uint8)
stretched = autocontrast(hazy)
print(hazy.min(), hazy.max(), "->", stretched.min(), stretched.max())
# 90 170 -> 0 255

Code Fragment 4: Automatic contrast stretching in autocontrast, driven by the 2nd and 98th percentiles of the intensity distribution rather than the raw extremes. The synthetic hazy image occupies less than a third of the available range before stretching and the full [0, 255] range after, as the printed 0 255 confirms.
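The fragility of raw min/max endpoints can be demonstrated by planting two defective pixels in the same kind of synthetic hazy image. This is a sketch under assumed values: `stretch` is a helper introduced here that applies a linear stretch between explicit endpoints, and the seed and outlier positions are arbitrary.

```python
import numpy as np

def stretch(gray, lo, hi):
    """Linear stretch mapping lo -> 0 and hi -> 255, with saturation."""
    out = (gray.astype(np.float32) - lo) * (255.0 / (hi - lo))
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
hazy = rng.integers(90, 171, size=(480, 640), dtype=np.uint8)
hazy[0, 0], hazy[0, 1] = 0, 255          # one dead pixel, one hot pixel

# Endpoints from the raw extremes: the two outliers pin them to 0 and 255.
minmax = stretch(hazy, float(hazy.min()), float(hazy.max()))

# Endpoints from percentiles: the outliers are too rare to matter.
lo, hi = np.percentile(hazy, [2, 98])
robust = stretch(hazy, float(lo), float(hi))

body = slice(1, None)                    # every row except the one with outliers
print(int(minmax[body].min()), int(minmax[body].max()))   # 90 170
print(int(robust[body].min()), int(robust[body].max()))   # 0 255
```

With the extremes as endpoints the stretch degenerates to the identity and the image stays hazy, while the percentile version still expands the bulk of the pixels to the full range.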

Notice what autocontrast needed in order to act: the percentiles of the intensity distribution. We computed them directly here, but the principled tool for describing that distribution is the histogram, and once you have a histogram you can do far more than stretch endpoints: you can diagnose exposure problems, compare images, and derive optimal contrast curves and thresholds. That is the subject of the next two sections, beginning with Section 2.2.

Research Frontier: Point Operations, Learned

The lowly tone curve is an active research object. Zero-DCE (Guo et al., CVPR 2020) trains a tiny network that outputs per-pixel curve parameters for low-light enhancement with no paired training data: a learned, spatially varying version of this section's gamma correction. Retinexformer (Cai et al., ICCV 2023) couples a Retinex decomposition with a one-stage transformer and remains a strong open baseline. NILUT (Conde et al., AAAI 2024) represents professional 3D LUTs as compact implicit neural networks, so a single model can carry multiple color-grading styles onto mobile hardware. LightenDiffusion (Jiang et al., ECCV 2024) pushes the same problem into the diffusion-model framework of Part IV, and the HVI color space (Yan et al., CVPR 2025) shows that even choosing the space in which the curves act is still publishable territory. The pattern across all of them: the operations of this section are not obsolete, they are now the output of neural networks.

Exercise 2.1.1: Curve Reading Conceptual

Sketch (or describe) the transfer curve $s = T(r)$ for each of the following, and state whether the operation is invertible: (a) a negative ($s = 255 - r$); (b) brightness $+100$ on uint8 with saturation; (c) gamma 0.5 followed by gamma 2.0; (d) posterization to 4 levels. For each non-invertible case, identify exactly which input values become indistinguishable.

Exercise 2.1.2: One-Pass Tone Pipeline Coding

Extend build_tone_lut with an optional S-curve stage $s = \tfrac{1}{2}\big(1 + \tanh(k\,(r - \tfrac{1}{2}))\big)$ on normalized intensities, controlled by a strength parameter $k$. Verify on a test image that applying your composed LUT in one cv2.LUT call produces results identical (to within 1 intensity level) to applying the three stages as separate passes. Time both versions on a 4K frame with time.perf_counter.

Exercise 2.1.3: Clipping Audit Analysis

Take any well-exposed photograph and apply brightness shifts $\beta \in \{20, 40, 80, 160\}$ with saturation. For each, measure the fraction of pixels clipped at 255 and compute the PSNR (from Chapter 1) between a round trip (brighten by $\beta$ then darken by $-\beta$) and the original. Plot clipped fraction versus round-trip PSNR and explain the shape of the relationship in one paragraph.