PNG: The Lossless Graphics Standard
Complete technical deep dive into PNG file format, compression mechanisms, color handling, transparency, interlacing, and hands-on FFmpeg manipulation techniques.
Overview and History
Portable Network Graphics (PNG) is a lossless raster image format. Version 1.0 became a W3C Recommendation in October 1996 and was later published as RFC 2083. Created as a patent-free alternative to GIF, whose LZW compression was covered by patents at the time, PNG combines DEFLATE compression with rich metadata support.
Key Features
- Lossless compression preserves all pixel data
- Bit depths of 1, 2, 4, 8, or 16 bits per sample (up to 64 bits per pixel for 16-bit RGBA)
- Full per-pixel alpha transparency (8-bit or 16-bit opacity values)
- Multiple color models (grayscale, RGB, indexed)
- Adam7 progressive interlacing
- Embedded gamma and color profile data
Typical Use Cases
- Web graphics and UI elements
- Screenshots without artifacts
- Logos, icons, and rasterized vector artwork
- Scientific and medical imaging
- Technical diagrams and infographics
- Archival and long-term storage
PNG File Structure and Chunks
A PNG file consists of an 8-byte signature followed by a series of chunks. Each chunk contains metadata or image data with built-in error detection via CRC (Cyclic Redundancy Check).
File Signature
Every PNG file begins with an invariant 8-byte signature:
Decimal: 137 80 78 71 13 10 26 10
Hex:     89 50 4E 47 0D 0A 1A 0A
ASCII:   (0x89, non-ASCII) P N G CR LF SUB LF
This signature serves multiple purposes: it identifies the file as PNG, and it catches common transmission errors. The first byte has its high bit set, which exposes channels that are not 8-bit clean, while the CR LF and LF bytes expose newline translation. The letters "PNG" also remain readable in a hex editor.
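As a small illustration of the signature check, the following Python sketch compares the first 8 bytes of a buffer with the signature bytes listed above. The names `PNG_SIGNATURE` and `has_png_signature` are made up for this example, and Python is used only for illustration.

```python
# The eight signature bytes, written in decimal exactly as listed above.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def has_png_signature(data: bytes) -> bool:
    """Return True if data starts with the 8-byte PNG signature."""
    return data[:8] == PNG_SIGNATURE


# Simulate a transfer that rewrites CR LF as LF: the signature no longer
# matches, so the damage is detected before any decoding starts.
damaged = PNG_SIGNATURE.replace(b"\r\n", b"\n")

print(has_png_signature(PNG_SIGNATURE))  # True
print(has_png_signature(damaged))        # False
```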
Chunk Structure
All data following the signature is organized into chunks. Each chunk has a fixed 12-byte overhead plus variable data:
┌─────────────────────────┐
│ Length (4 bytes)        │ Big-endian unsigned 32-bit integer
├─────────────────────────┤
│ Chunk Type (4 bytes)    │ ASCII characters (e.g., "IHDR", "IDAT")
├─────────────────────────┤
│ Chunk Data (variable)   │ 0 to (2³¹ − 1) bytes
├─────────────────────────┤
│ CRC (4 bytes)           │ CRC32 of type + data
└─────────────────────────┘
Critical Chunks
PNG defines certain chunks as "critical": if a decoder meets a critical chunk it does not recognize, it cannot safely display the image and must treat the file as an error. Critical chunk types start with an uppercase letter.
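The chunk layout and the critical/ancillary naming rule can be sketched in Python as follows. This is a minimal sketch, not a full decoder: `make_chunk`, `iter_chunks`, and the `demo` byte string are hypothetical names and data for this example, and the demo stream carries a placeholder IHDR payload, so it is not a viewable image.

```python
import struct
import zlib


def make_chunk(ctype: bytes, data: bytes = b"") -> bytes:
    """Serialize one chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data, is_critical) for each chunk, verifying its CRC."""
    pos = 8  # skip the 8-byte signature
    while pos < len(png):
        (length,) = struct.unpack_from(">I", png, pos)
        ctype = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack_from(">I", png, pos + 8 + length)
        if crc != zlib.crc32(ctype + data) & 0xFFFFFFFF:
            raise ValueError(f"bad CRC in {ctype!r} chunk")
        # An uppercase first letter (bit 5 clear) marks a critical chunk.
        yield ctype, data, not (ctype[0] & 0x20)
        pos += 12 + length  # 12 bytes of overhead plus the data


# Hypothetical stream for illustration only (placeholder IHDR payload).
demo = (
    b"\x89PNG\r\n\x1a\n"
    + make_chunk(b"IHDR", bytes(13))
    + make_chunk(b"tEXt", b"Title\x00Demo")
    + make_chunk(b"IEND")
)

for ctype, data, critical in iter_chunks(demo):
    kind = "critical" if critical else "ancillary"
    print(ctype.decode("ascii"), len(data), kind)
```

Flipping any byte inside a chunk's type or data makes `iter_chunks` raise `ValueError`, which is the error detection the CRC field provides.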
IHDR (Image Header)
Critical, must be first. Contains 13 bytes:
- Width (4 bytes) – pixels
- Height (4 bytes) – pixels
- Bit Depth (1 byte) – 1, 2, 4, 8, or 16
- Color Type (1 byte) – 0, 2, 3, 4, or 6
- Compression (1 byte) – always 0 (deflate)
- Filter (1 byte) – always 0 (adaptive)
- Interlace (1 byte) – 0 (none) or 1 (Adam7)
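The 13-byte layout above maps directly onto a fixed binary record. The sketch below unpacks it with Python's `struct` module; `parse_ihdr`, `COLOR_TYPES`, and the sample values (an 800×600 RGBA image) are assumptions made for this example.

```python
import struct

COLOR_TYPES = {
    0: "grayscale",
    2: "truecolor",
    3: "indexed-color",
    4: "grayscale + alpha",
    6: "truecolor + alpha",
}


def parse_ihdr(data: bytes) -> dict:
    """Decode the 13-byte IHDR payload (all fields big-endian)."""
    if len(data) != 13:
        raise ValueError("IHDR payload must be exactly 13 bytes")
    width, height, depth, ctype, comp, filt, interlace = struct.unpack(
        ">IIBBBBB", data
    )
    if ctype not in COLOR_TYPES:
        raise ValueError(f"invalid color type {ctype}")
    return {
        "width": width,
        "height": height,
        "bit_depth": depth,
        "color_type": COLOR_TYPES[ctype],
        "compression": comp,
        "filter": filt,
        "interlaced": interlace == 1,
    }


# Hypothetical header: 800x600, 8 bits per sample, RGBA, not interlaced.
sample = struct.pack(">IIBBBBB", 800, 600, 8, 6, 0, 0, 0)
print(parse_ihdr(sample))
```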
IDAT (Image Data)
Critical, contains compressed pixel data. May span multiple chunks. Data format:
- Zlib header (2 bytes)
- Deflated scanlines
- Adler-32 checksum (4 bytes)
- Each scanline: filter type (1 byte) + pixels
IEND (Image Trailer)
Critical, must be last. Zero-length chunk marking end of file. CRC value: 0xAE426082.
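Because IEND carries no data, its CRC covers only the four type bytes and is therefore the same in every PNG file. A short Python check of that value using the standard `zlib.crc32`:

```python
import struct
import zlib

# CRC-32 over the chunk type only, since the data field is empty.
crc = zlib.crc32(b"IEND")

# Full 12-byte IEND chunk: zero length, type, CRC.
iend = struct.pack(">I", 0) + b"IEND" + struct.pack(">I", crc)

print(hex(crc))       # 0xae426082
print(iend.hex(" "))  # 00 00 00 00 49 45 4e 44 ae 42 60 82
```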
PLTE (Palette)
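Critical chunk; required for color type 3, optional for types 2 and 6, and not allowed for types 0 and 4. Holds 1–256 color entries of 3 bytes each (R, G, B), so the chunk length must be a multiple of 3. In indexed-color images, each pixel value is an index into this palette.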
Ancillary Chunks
Optional metadata chunks (lowercase first letter = ancillary). Unknown ancillary chunks can be safely ignored.
tRNS (Transparency)
For color types 0, 2, 3: transparency information. Grayscale/RGB specify transparent color; indexed-color lists alpha values per palette entry.
gAMA (Gamma)
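4-byte unsigned integer holding the image's encoding gamma × 100,000. Helps keep brightness consistent across displays. Typical values: 45455 (1/2.2, matching the display gamma of about 2.2 used by sRGB) and 100000 (linear).
cHRM (Chromaticities)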
CIE 1931 color space coordinates for RGB primaries and white point. 32 bytes total.
sRGB (sRGB Profile)
1-byte rendering intent (0–3). If present, overrides gAMA/cHRM. Ensures web-standard color.
iCCP (ICC Profile)
Embedded ICC color profile for precise device-independent color. May be large; often overrides other color chunks.
tIME (Timestamp)
7-byte modification time (year, month, day, hour, minute, second).
pHYs (Physical Pixel)
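Pixels per unit for X and Y (4 bytes each) plus a 1-byte unit specifier. Unit 0 = unknown (only the pixel aspect ratio is defined); unit 1 = meter (multiply by 0.0254 to convert to DPI).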
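The DPI conversion is a single multiplication, since one inch is 0.0254 meters. A minimal Python sketch follows; `parse_phys` is a hypothetical helper, and the sample value 2835 pixels per meter is chosen because it corresponds to roughly 72 DPI.

```python
import struct


def parse_phys(data: bytes):
    """Return (x_dpi, y_dpi) for a 9-byte pHYs payload, or None if the
    unit is unknown and only the aspect ratio is meaningful."""
    x_ppu, y_ppu, unit = struct.unpack(">IIB", data)
    if unit != 1:
        return None
    # Pixels per meter times meters per inch gives pixels per inch.
    return x_ppu * 0.0254, y_ppu * 0.0254


sample = struct.pack(">IIB", 2835, 2835, 1)  # hypothetical pHYs payload
print(parse_phys(sample))  # about 72 DPI on both axes
```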
tEXt (Text Data)
Uncompressed text key-value pairs (author, description, copyright, etc.).
zTXt (Compressed Text)
Deflate-compressed text. Useful for large metadata or comments.
Color Types and Bit Depths
PNG supports 5 color types, each with valid bit depth combinations. The color type byte in IHDR determines the image's color model.
| Color Type | Description | Bit Depths | Bytes/Pixel |
|---|---|---|---|
| 0 | Grayscale | 1, 2, 4, 8, 16 | 1/8 to 2 |
| 2 | Truecolor (RGB) | 8, 16 | 3 to 6 |
| 3 | Indexed-color (palette) | 1, 2, 4, 8 | 1/8 to 1 |
| 4 | Grayscale + Alpha | 8, 16 | 2 to 4 |
| 6 | Truecolor + Alpha (RGBA) | 8, 16 | 4 to 8 |
Bit Depth Details
Bit depth defines the number of bits per sample (a grayscale or color channel value) or, for indexed-color images, per palette index:
- 1-bit – 2 colors (black and white). Ideal for line art, binary masks.
- 2-bit – 4 colors. Rarely used; old computer graphics.
- 4-bit – 16 colors (or 16 gray levels). Historical graphics.
- 8-bit – 256 colors (or 256 gray levels). Web-standard for indexed images; standard for grayscale.
- 16-bit – 65,536 levels per channel. Professional photography, scientific imaging; larger files.
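Color type and bit depth together fix how many bytes one row of pixels occupies, which is where the Bytes/Pixel ranges in the table come from. The following Python sketch computes the row size and rejects combinations the table does not list; `CHANNELS`, `VALID_DEPTHS`, and `scanline_bytes` are names invented for this example, and the result excludes the per-row filter type byte.

```python
# Samples per pixel for each color type.
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}

# Bit depths permitted for each color type, as in the table above.
VALID_DEPTHS = {
    0: (1, 2, 4, 8, 16),
    2: (8, 16),
    3: (1, 2, 4, 8),
    4: (8, 16),
    6: (8, 16),
}


def scanline_bytes(width: int, color_type: int, bit_depth: int) -> int:
    """Bytes of pixel data in one row, rounded up to a whole byte."""
    if bit_depth not in VALID_DEPTHS.get(color_type, ()):
        raise ValueError("invalid color type / bit depth combination")
    bits = width * CHANNELS[color_type] * bit_depth
    return (bits + 7) // 8


print(scanline_bytes(10, 0, 1))   # 2  (10 one-bit pixels pad to 2 bytes)
print(scanline_bytes(10, 6, 8))   # 40 (4 bytes per RGBA pixel)
print(scanline_bytes(10, 2, 16))  # 60 (6 bytes per 16-bit RGB pixel)
```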
PNG Compression Mechanism
PNG employs a two-stage compression pipeline: filtering (prediction) followed by DEFLATE (LZ77 + Huffman). This hybrid approach achieves excellent compression without information loss.
Stage 1: Filtering (Prediction)
Before compression, each scanline (pixel row) is preprocessed with one of 5 filter algorithms, chosen independently for every row. Filters operate on bytes rather than pixels, whatever the bit depth, and a filter type byte is prepended to each scanline before the data is compressed. Filters exploit spatial correlation (adjacent pixels are often similar), turning the data into small, repetitive differences that DEFLATE compresses more effectively.
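To make the prediction step concrete, the sketch below implements the five filter types covered in this section (0 None, 1 Sub, 2 Up, 3 Average, 4 Paeth) together with their inverse. It is a minimal Python sketch under stated assumptions: `bpp` is the number of bytes per complete pixel (at least 1), neighbors outside the image count as 0, and all arithmetic is modulo 256. The function names and the sample row are made up for this example.

```python
def paeth(a: int, b: int, c: int) -> int:
    """Return whichever of left (a), up (b), upper-left (c) is closest
    to the estimate a + b - c; ties resolve in that order."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c


def predict(ftype: int, row, prev, i: int, bpp: int) -> int:
    """Predicted value of byte i for the given filter type."""
    a = row[i - bpp] if i >= bpp else 0   # left
    b = prev[i]                           # up
    c = prev[i - bpp] if i >= bpp else 0  # upper-left
    return (0, a, b, (a + b) // 2, paeth(a, b, c))[ftype]


def filter_row(ftype: int, row: bytes, prev: bytes, bpp: int) -> bytes:
    """Encoder side: store the difference from the prediction."""
    return bytes(
        (row[i] - predict(ftype, row, prev, i, bpp)) & 0xFF
        for i in range(len(row))
    )


def unfilter_row(ftype: int, filt: bytes, prev: bytes, bpp: int) -> bytes:
    """Decoder side: add the prediction back, left to right."""
    out = bytearray(len(filt))
    for i, value in enumerate(filt):
        out[i] = (value + predict(ftype, out, prev, i, bpp)) & 0xFF
    return bytes(out)


# Hypothetical 8-bit grayscale row (bpp = 1) with a steady gradient.
row = bytes([10, 20, 30, 40, 50, 60])
prev = bytes(6)  # no row above: treated as all zeros

for ftype in range(5):
    filtered = filter_row(ftype, row, prev, bpp=1)
    assert unfilter_row(ftype, filtered, prev, bpp=1) == row
    print(ftype, list(filtered))
```

On the gradient row, Sub turns the rising values into a run of identical small numbers, which is exactly the kind of repetition the DEFLATE stage handles well.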
Filter 0: None
No transformation; raw byte values are stored. Useful when neighboring pixels are uncorrelated (e.g., random noise, high-frequency content).
Output[x] = Original[x]
Filter 1: Sub
Subtracts the corresponding byte of the pixel to the left. Effective for gradual horizontal transitions and photographs.
Output[x] = Original[x] − Original[x−1]
Filter 2: Up
Subtracts the byte directly above. Exploits vertical correlation; works well for vertical gradients.
Output[x] = Original[x] − Original_above[x]
Filter 3: Average
Subtracts the average of the left and above neighbors. Balances horizontal and vertical correlation.
Output[x] = Original[x] − ⌊(Left + Up) / 2⌋
Filter 4: Paeth
Paeth predictor: computes the estimate Left + Up − UpLeft and selects whichever of Left, Up, or UpLeft is closest to it. The most sophisticated filter; it often gives the best compression on graphics.
Output[x] = Original[x] − PaethPredictor(Left, Up, UpLeft)
PaethPredictor = the p ∈ {Left, Up, UpLeft} that minimizes |p − (Left + Up − UpLeft)|, with ties broken in the order Left, Up, UpLeft
In all five formulas, x indexes bytes, x−1 denotes the corresponding byte of the previous pixel, arithmetic is modulo 256, and neighbors that fall outside the image count as 0.
Stage 2: DEFLATE Compression
After filtering, the entire preprocessed image data (all scanlines concatenated) is compressed with DEFLATE. DEFLATE combines:
DEFLATE Algorithm
- LZ77 (Lempel-Ziv) – A sliding-window algorithm that replaces repeated sequences with back-references (distance, length pairs). The window is at most 32 KB. When a sequence recurs within the window, only the distance to the previous occurrence and the match length are stored.
- Huffman Coding – Assigns variable-length binary codes to symbols: frequent values get short codes, rare values get long codes. Reduces average bits per symbol.
- Zlib Wrapper – PNG IDAT chunks use zlib format: 2-byte header (method, flags) + compressed data + 4-byte Adler-32 checksum.
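The zlib framing described in the last bullet can be inspected with Python's standard `zlib` module. This sketch compresses a made-up filtered scanline (the `raw` bytes are an assumption for illustration) and then checks the 2-byte header and the trailing Adler-32 checksum.

```python
import struct
import zlib

# Hypothetical filtered data: one filter type byte (0) plus 64 data bytes.
raw = bytes([0]) + bytes(range(16)) * 4
stream = zlib.compress(raw, 9)

# Header byte 1 (CMF): low 4 bits = method (8 means DEFLATE),
# high 4 bits = log2(window size) - 8.
cmf, flg = stream[0], stream[1]
print("method:", cmf & 0x0F)
print("window bytes:", 1 << ((cmf >> 4) + 8))

# The two header bytes, read as a 16-bit number, are a multiple of 31.
print("header check ok:", (cmf * 256 + flg) % 31 == 0)

# The last 4 bytes are the big-endian Adler-32 of the uncompressed data.
(adler,) = struct.unpack(">I", stream[-4:])
print("adler32 ok:", adler == zlib.adler32(raw))
print(len(raw), "->", len(stream), "bytes")
```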
Encoder Optimization
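PNG encoders and optimizers (e.g., libpng-based tools, ImageMagick, OptiPNG) use several strategies to minimize output size:
- Try all 5 filters on each scanline and keep the best one, judged either by actual compressed size (slow) or by a cheap heuristic such as the smallest sum of absolute filtered values.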
- Optimize DEFLATE parameters (compression level, block size) based on image type (photo vs. graphics).
- For indexed-color images, analyze palette and remap colors to minimize entropy.
- Bit-depth reduction (e.g., 8-bit → 4-bit) if image contains ≤ 16 unique colors.
- Strip ancillary chunks not needed for display (metadata, gamma, color profiles) for smaller files, accepting that removing gamma or color-profile chunks can change how colors render.
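Per-scanline filter selection can be sketched with a cheap heuristic instead of compressing every candidate: pick the filter whose output has the smallest sum of absolute values, reading each byte as a signed difference. The Python sketch below is an assumption-laden illustration, not any particular encoder's method; for brevity it compares only filters 0 (None) and 1 (Sub), and `sub_filter`, `cost`, and `choose_filter` are hypothetical names.

```python
def sub_filter(row: bytes, bpp: int) -> bytes:
    """Filter 1 (Sub): difference from the byte one pixel to the left."""
    return bytes(
        (row[i] - (row[i - bpp] if i >= bpp else 0)) & 0xFF
        for i in range(len(row))
    )


def cost(filtered: bytes) -> int:
    """Sum of absolute values, reading each byte as signed (-128..127)."""
    return sum(b if b < 128 else 256 - b for b in filtered)


def choose_filter(row: bytes, bpp: int) -> int:
    """Return the filter type (0 or 1) with the lowest heuristic cost."""
    candidates = {0: row, 1: sub_filter(row, bpp)}
    return min(candidates, key=lambda ftype: cost(candidates[ftype]))


gradient = bytes(range(100, 140))  # smooth ramp: Sub leaves mostly 1s
blank = bytes(40)                  # all zeros: filtering cannot help

print(choose_filter(gradient, 1))  # 1
print(choose_filter(blank, 1))     # 0
```

A real encoder would evaluate all five filters the same way; the heuristic is only a proxy for compressed size, so optimizers that can afford the time compress each candidate instead.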
Adam7 Interlacing
PNG supports the Adam7 interlacing algorithm, enabling progressive display. Instead of storing pixels left-to-right, top-to-bottom, Adam7 rearranges them into 7 passes. After the first pass, a low-resolution preview is visible; each pass refines the image.
Adam7 Passes
Each pass targets specific pixel positions in an 8×8 repeating pattern:
| Pass | X Start | X Step | Y Start | Y Step | Resolution |
|---|---|---|---|---|---|
| 1 | 0 | 8 | 0 | 8 | 1/8 × 1/8 |
| 2 | 4 | 8 | 0 | 8 | 2/8 × 1/8 |
| 3 | 0 | 4 | 4 | 8 | 2/8 × 2/8 |
| 4 | 2 | 4 | 0 | 4 | 4/8 × 2/8 |
| 5 | 0 | 2 | 2 | 4 | 4/8 × 4/8 |
| 6 | 1 | 2 | 0 | 2 | 8/8 × 4/8 |
| 7 | 0 | 1 | 1 | 2 | 8/8 × 8/8 |
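The start and step values in the table are enough to work out how many pixels each pass carries. The following Python sketch does that for an arbitrary image size; `ADAM7` mirrors the table rows, and `pass_sizes` is a name invented for this example.

```python
# (x_start, x_step, y_start, y_step) for passes 1-7, as in the table.
ADAM7 = [
    (0, 8, 0, 8),
    (4, 8, 0, 8),
    (0, 4, 4, 8),
    (2, 4, 0, 4),
    (0, 2, 2, 4),
    (1, 2, 0, 2),
    (0, 1, 1, 2),
]


def pass_sizes(width: int, height: int) -> list:
    """Return the (columns, rows) of the reduced image in each pass."""
    sizes = []
    for x0, dx, y0, dy in ADAM7:
        cols = max(0, (width - x0 + dx - 1) // dx)
        rows = max(0, (height - y0 + dy - 1) // dy)
        sizes.append((cols, rows))
    return sizes


# One 8x8 tile: show each pass and the cumulative share of pixels.
total = 0
for number, (cols, rows) in enumerate(pass_sizes(8, 8), start=1):
    total += cols * rows
    print(f"pass {number}: {cols}x{rows} pixels, cumulative {total / 64:.1%}")
```

For an 8×8 tile the passes carry 1, 1, 2, 4, 8, 16, and 32 pixels, so each pass from the second onward doubles the number of pixels already received. Small images can have empty passes, which is why the sizes are clamped at zero.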
Progressive Display Benefits
- Pass 1 alone provides a 1/8 × 1/8 resolution preview from just 1/64 (about 1.6%) of the pixels.
- Passes 1–3 yield 2/8 × 2/8 resolution from 1/16 (6.25%) of the pixels.
- Passes 1–4 reach 4/8 × 2/8 resolution from 1/8 (12.5%) of the pixels. By this point, the viewer can usually make out the general image structure.
- All 7 passes complete the full resolution. On slow connections, the image refines incrementally instead of leaving a blank area. The byte offset at which each pass completes depends on the image and on how well each pass compresses.
Trade-offs
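Adam7 interlacing usually increases file size, often by several percent or more: pixels within a pass are farther apart, so the filters predict less accurately, and every pass adds its own filter type bytes. Consider interlacing for large images served over slow connections (progressive perception); disable it for fast local viewing or archival.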
Transparency and Alpha Channels
PNG supports two complementary transparency mechanisms: embedded alpha channels (for color types 4 and 6) and the tRNS chunk (for types 0, 2, 3).
Alpha Channel (Types 4 & 6)
Color types 4 and 6 include a full alpha channel: an 8-bit or 16-bit value per pixel, where 0 = fully transparent and the maximum value (255 or 65535) = fully opaque. Intermediate values enable anti-aliased edges.
- Type 4 (Grayscale+Alpha) – Each pixel: grayscale (8 or 16 bits) + alpha (same bit depth). Useful for masks and overlays.
- Type 6 (RGBA) – Each pixel: red + green + blue + alpha (8 or 16 bits each). Standard for web graphics with transparency.
tRNS Chunk (Types 0, 2, 3)
For images without a built-in alpha channel, tRNS supplies simple transparency: a single fully transparent color for types 0 and 2, or per-entry alpha values for type 3:
- Type 0 (Grayscale) – tRNS contains a 2-byte grayscale value; pixels matching this value are transparent.
- Type 2 (RGB) – tRNS contains three 2-byte RGB values; pixels matching (r, g, b) are transparent.
- Type 3 (Indexed) – tRNS lists an alpha value (0–255) for each palette entry; enables gradual transparency on paletted images.
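For indexed-color images, combining PLTE and tRNS yields one RGBA value per palette entry. The Python sketch below shows the idea; `palette_to_rgba` and the three-color sample palette are hypothetical. It relies on the rule that tRNS may list fewer alpha values than there are palette entries, in which case the remaining entries are fully opaque.

```python
def palette_to_rgba(plte: bytes, trns: bytes = b"") -> list:
    """Merge PLTE (RGB triples) and tRNS (alpha bytes) into RGBA tuples."""
    if len(plte) % 3 != 0 or not 3 <= len(plte) <= 768:
        raise ValueError("PLTE must hold 1-256 three-byte entries")
    entries = len(plte) // 3
    if len(trns) > entries:
        raise ValueError("tRNS has more entries than the palette")
    # Entries not covered by tRNS default to fully opaque.
    alphas = list(trns) + [255] * (entries - len(trns))
    return [
        (plte[3 * i], plte[3 * i + 1], plte[3 * i + 2], alphas[i])
        for i in range(entries)
    ]


# Hypothetical palette: red, green, blue; alpha given for the first two.
plte = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255])
trns = bytes([0, 128])

print(palette_to_rgba(plte, trns))
# [(255, 0, 0, 0), (0, 255, 0, 128), (0, 0, 255, 255)]
```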
Alpha Composition
When rendering PNG with alpha, viewers composite the image over a background using the standard formula:
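Output = (Foreground × Alpha) + (Background × (1 − Alpha))
Here Alpha is normalized to the range 0–1 (the stored value divided by 255 or 65535), and PNG color samples are not premultiplied by alpha. PNG does not mandate a background color (the optional bKGD chunk can only suggest one), so each viewer chooses, typically white or the page background. To ensure portable appearance, images destined for the web are sometimes pre-composited against white or another specific background color, sacrificing the alpha channel but guaranteeing consistent appearance.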
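The compositing formula can be applied per channel with integer arithmetic. This is a minimal Python sketch under two assumptions: alpha is not premultiplied, and the blend is done directly on the stored sample values without converting to linear light first. The name `composite` and the sample colors are made up for this example.

```python
def composite(fg, alpha: int, bg, max_value: int = 255):
    """Blend one foreground pixel over a background pixel.

    alpha runs from 0 (transparent) to max_value (opaque); the added
    max_value // 2 rounds the integer division to the nearest value.
    """
    return tuple(
        (f * alpha + b * (max_value - alpha) + max_value // 2) // max_value
        for f, b in zip(fg, bg)
    )


red = (255, 0, 0)
white = (255, 255, 255)

print(composite(red, 255, white))  # (255, 0, 0): fully opaque
print(composite(red, 0, white))    # (255, 255, 255): fully transparent
print(composite(red, 128, white))  # (255, 127, 127): about half
```

For 16-bit images the same function applies with `max_value=65535`.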
Gamma Correction and Color Spaces
PNG stores color-critical metadata to help images display consistently across different devices. Gamma values, color space definitions, and embedded profiles tell the decoder how the stored sample values relate to actual light output.
Gamma Basics
Gamma (γ) is a non-linear mapping between pixel values and display brightness:
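Display_Intensity = Input_Value ^ γ
Encoding applies the inverse exponent (1 / γ), and that inverse is the value the gAMA chunk records. A display gamma of about 2.2 matches sRGB (the typical web and Windows default). Mac OS historically used 1.8. Cameras apply gamma encoding to match their target display, so their files do not store linear light.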
gAMA Chunk
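The gAMA chunk stores a single 4-byte unsigned integer: the encoding exponent (1 / display γ) × 100,000:
- 45455 – 1/2.2, the usual value for sRGB-like content (display γ ≈ 2.2)
- 45750 – 0.4575, display γ ≈ 2.19 (close to, but not the same as, the 2.2 value)
- 100000 – Linear (no gamma)
- 55556 – 1/1.8 (display γ ≈ 1.8, the historical Mac OS value)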
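The stored integers are reciprocals of the display exponent, scaled by 100,000 (for example 45455 ≈ 100000 / 2.2). A short Python sketch of the conversion in both directions; `encode_gama` and `decode_gama` are hypothetical helper names.

```python
import struct


def encode_gama(display_gamma: float) -> bytes:
    """4-byte gAMA payload for a given display gamma."""
    return struct.pack(">I", round(100000 / display_gamma))


def decode_gama(data: bytes) -> float:
    """Return the display gamma implied by a gAMA payload."""
    (stored,) = struct.unpack(">I", data)
    return 100000 / stored


for gamma in (2.2, 1.8, 1.0):
    payload = encode_gama(gamma)
    stored = int.from_bytes(payload, "big")
    print(gamma, "->", stored, "->", round(decode_gama(payload), 3))
```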
sRGB Chunk
If present, the sRGB chunk (1 byte) takes precedence over gAMA and cHRM and indicates that the image is in the standard sRGB color space. Its single byte is the rendering intent:
- 0 – Perceptual
- 1 – Relative colorimetric
- 2 – Saturation
- 3 – Absolute colorimetric
cHRM Chunk (Chromaticities)
Defines the RGB color space by specifying CIE 1931 chromaticity coordinates for the three primaries and white point. Each value is 4 bytes (× 100,000):
sRGB Primaries:
Red:       (0.64, 0.33)    → (64000, 33000)
Green:     (0.30, 0.60)    → (30000, 60000)
Blue:      (0.15, 0.06)    → (15000, 6000)
White D65: (0.3127, 0.329) → (31270, 32900)
ICC Profile (iCCP Chunk)
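For maximum color accuracy, PNG can embed a full ICC color profile: a binary blob describing device-specific color transformations. When present, it takes precedence over gAMA and cHRM, and it should not appear together with an sRGB chunk. Profile size varies widely; compact RGB profiles are often only a few kilobytes. Used in professional photography and print workflows.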
PNG Manipulation and Experiments via FFmpeg
FFmpeg is a versatile multimedia toolkit that can read, decode, filter, encode, and manipulate PNG files. Below are comprehensive techniques for exploring and experimenting with PNGs.
Basic Information and Inspection
Inspect PNG metadata and properties
ffprobe -v error -show_streams -show_format image.png
Outputs detailed metadata in a structured format: dimensions, pixel format (color type), bit depth, color space, frame rate (for APNG), etc.
JSON output for scripting
ffprobe -v error -print_format json -show_format -show_streams image.png | jq .
Emits the metadata as JSON and pretty-prints it with jq; useful for programmatic workflows.
View detailed stream information
ffmpeg -i image.png 2>&1 | grep -E "Stream|Duration|pixel|kb/s"
FFmpeg prints the color model, resolution, and estimated bitrate on stderr.
Format Conversion and Encoding
Convert PNG to JPEG
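ffmpeg -i image.png -q:v 2 output.jpg
Lossy conversion. For JPEG output, -q:v ranges 2–31 (2 = highest quality, 31 = lowest). JPEG has no alpha channel, so any transparency is lost.
Convert PNG to WebP
ffmpeg -i image.png -c:v libwebp -q:v 80 output.webp
WebP files are often smaller than the equivalent JPEG or PNG. Here -q:v is a 0–100 quality scale (higher = better). Use -lossless 1 for lossless WebP.
Convert PNG to AVIF (next-gen)
ffmpeg -i image.png -c:v libaom-av1 -crf 32 output.avif
AVIF offers excellent compression. libaom-av1 takes its quality target from -crf (0–63); lower -crf = better quality. Requires an FFmpeg build that includes libaom and the AVIF muxer.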
Convert to different color types
# Convert to grayscale
ffmpeg -i image.png -pix_fmt gray output_gray.png
# Convert to RGB (force truecolor)
ffmpeg -i image.png -pix_fmt rgb24 output_rgb.png
# Convert to RGBA (add alpha channel)
ffmpeg -i image.png -pix_fmt rgba output_rgba.png
# Convert to indexed (paletted) using a palette generated from the image;
# a plain -pix_fmt pal8 may fail because FFmpeg's scaler cannot target pal8 directly
ffmpeg -i image.png -vf "split[a][b];[a]palettegen[p];[b][p]paletteuse" -c:v png output_indexed.png
Bit Depth Manipulation
Convert to 8-bit per channel
ffmpeg -i input_16bit.png -pix_fmt rgb24 output_8bit.png
Reduces file size significantly (rgb48 → rgb24) at the cost of precision.
Expand to 16-bit per channel
ffmpeg -i input_8bit.png -pix_fmt rgb48be output_16bit.png
Stores 16 bits per channel for workflows that expect it; no detail is added beyond what the 8-bit source contained. PNG stores 16-bit samples in big-endian order, so rgb48be is the format the PNG encoder writes regardless of the host CPU.
Grayscale bit depth variants
# 16-bit grayscale
ffmpeg -i input.png -pix_fmt gray16be output_16bit_gray.png
# 8-bit grayscale
ffmpeg -i input.png -pix_fmt gray output_8bit_gray.png
Resizing and Scaling
Scale to fixed dimensions
ffmpeg -i input.png -vf "scale=800:600" output.png
Resizes to exactly 800×600. May distort if the aspect ratio differs.
Scale with aspect ratio preservation
ffmpeg -i input.png -vf "scale=800:600:force_original_aspect_ratio=decrease" output.png
Preserves the aspect ratio; reduces dimensions to fit within an 800×600 box.
Scale by percentage
ffmpeg -i input.png -vf "scale=iw*0.5:ih*0.5" output_50percent.png
iw = input width, ih = input height.
Cropping and Padding
Crop a rectangular region
ffmpeg -i input.png -vf "crop=400:300:100:50" output.png
Crops a 400×300 pixel region whose top-left corner is at (x=100, y=50).
Pad with color (add border)
ffmpeg -i input.png -vf "pad=1000:1000:(ow-iw)/2:(oh-ih)/2:black" output.png
Pads to 1000×1000, centering the original image on a black background. Other colors can be given by name (white) or hex value (0x00FF00 for green, 0xFFFFFF for white).
Color and Brightness Adjustments
Adjust brightness and contrast
ffmpeg -i input.png -vf "eq=brightness=0.1:contrast=1.2" output.png
brightness range: −1 to 1 (−1 = darkest, 0 = original, 1 = brightest). contrast: 1 = original.
Saturation adjustment
ffmpeg -i input.png -vf "hue=s=1.5" output.png
s=1.5 boosts saturation by 50%; s=0 = grayscale.
Colorize (apply hue tint)
ffmpeg -i input.png -vf "hue=h=120" output.png
Rotates all hues by 120°. Use h=180 for complementary hues (this is not a full color inversion, which the negate filter provides), or h=90 for a 90° rotation.
Gamma correction
ffmpeg -i input.png -vf "eq=gamma=2.2" output.png
Applies a gamma curve. Values > 1 brighten; < 1 darken.
Filtering and Effects
Blur
ffmpeg -i input.png -vf "boxblur=5" output.png
Box blur with a 5-pixel radius. For Gaussian blur: -vf "gblur=sigma=2".
Sharpen
ffmpeg -i input.png -vf "unsharp=5:5:1.5" output.png
Unsharp mask with a 5×5 kernel and 1.5× strength.
Edge detection
ffmpeg -i input.png -vf "sobel" output.png
Sobel edge detector. Related filters: prewitt, roberts, and kirsch; for Canny edge detection use edgedetect.
Denoise
ffmpeg -i input_noisy.png -vf "nlmeans=s=5:p=7:r=15" output_denoised.png
Non-local means denoising. Parameters: s (strength), p (patch size), r (research window size).
Transparency and Alpha Manipulation
Add alpha channel if missing
ffmpeg -i input.png -pix_fmt rgba output_rgba.png
Converts RGB → RGBA; the alpha channel is fully opaque (255) everywhere.
Make specific color transparent
ffmpeg -i input.png -vf "colorkey=0x00FF00:0.1:0.5" output.png
Keys out green (0x00FF00). Similarity tolerance: 0.1; blend: 0.5.
Adjust alpha channel opacity
ffmpeg -i input_rgba.png -vf "format=rgba,colorchannelmixer=aa=0.5" output.png
Scales the alpha channel by 0.5, making the image 50% more transparent. For arbitrary curves, the lut filter can remap alpha with an expression on val, e.g. lut=a=val*0.5.
Composite PNG over background
ffmpeg -i background.png -i foreground_alpha.png -filter_complex "overlay=(W-w)/2:(H-h)/2" output.png
Places the foreground (centered) over the background, respecting alpha.
Batch Processing and Frame Extraction
Extract frames from video as PNG sequence
ffmpeg -i input.mp4 -vf "fps=fps=1" frames_%04d.png
Extracts 1 frame per second: frames_0001.png, frames_0002.png, etc.
Extract specific frame at timestamp
ffmpeg -ss 00:01:30 -i input.mp4 -vframes 1 frame_at_1m30s.png
Grabs the frame at 1 minute 30 seconds.
Create video from PNG sequence
ffmpeg -framerate 24 -i frames_%04d.png -c:v libx264 -pix_fmt yuv420p output.mp4
Combines PNGs into an H.264 video at 24 fps.
Metadata and Embedding
View PNG metadata
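ffmpeg -i image.png 2>&1 | grep -i -A 5 metadata
Displays the metadata FFmpeg recognizes, such as embedded text entries; -A 5 also prints the lines that follow the "Metadata:" heading.
Copy PNG with new metadata (via FFmpeg)
ffmpeg -i input.png -metadata title="My Image" -metadata comment="Example" output.png
Sets metadata tags on the output while re-encoding the image, which is still pixel-lossless for PNG. Note that -codec copy would pass the existing PNG bytes through unchanged, so new chunks can only be written when the image is re-encoded. Whether the tags are stored as tEXt chunks depends on the FFmpeg version; verify with ffprobe.
Strip all metadata
ffmpeg -i input.png -map_metadata -1 output.png
Drops the metadata tags while re-encoding (pixel-lossless). With -codec copy the file would pass through byte-for-byte and keep every chunk. FFmpeg may still write color-related chunks it derives from the stream, so use a dedicated PNG optimizer when every ancillary chunk must go.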
Combining Multiple PNGs
Horizontal concat (side-by-side)
ffmpeg -i left.png -i right.png -filter_complex "hstack" output.png
Places two PNGs side by side; both inputs must have the same height.
Vertical concat (stacked)
ffmpeg -i top.png -i bottom.png -filter_complex "vstack" output.png
Stacks two PNGs vertically; both inputs must have the same width.
Grid layout (montage)
ffmpeg -i image1.png -i image2.png -i image3.png -i image4.png -filter_complex "
[0:v][1:v]hstack=inputs=2[top];
[2:v][3:v]hstack=inputs=2[bottom];
[top][bottom]vstack=inputs=2[v]
" -map "[v]" output.png
Arranges four PNGs in a 2×2 grid.
Advanced: APNG (Animated PNG)
Create APNG from PNG sequence
ffmpeg -framerate 10 -i frame_%04d.png -plays 0 -c:v apng output.apng
Generates an APNG that loops infinitely (-plays 0). The apng encoder is built into FFmpeg and needs no external APNG library, only a build with zlib enabled.
Extract frames from APNG
ffmpeg -i animated.apng frame_%04d.png
Extracts each animation frame as an individual PNG file.
Complex Filter Graph Example
Here's an advanced example combining multiple operations:
ffmpeg -i input.png -vf "
scale=1200:800:force_original_aspect_ratio=decrease,
pad=1200:800:(ow-iw)/2:(oh-ih)/2:white,
eq=brightness=0.05:contrast=1.1,
unsharp=5:5:1.0
" -c:v png output_processed.png
This workflow: (1) scales respecting aspect ratio, (2) pads to 1200×800 with white borders, (3) adjusts brightness/contrast, (4) sharpens, (5) encodes as PNG.
PNG vs. Other Formats
| Property | PNG | JPEG | WebP | GIF |
|---|---|---|---|---|
| Compression | Lossless | Lossy | Lossy/Lossless | Lossless (indexed) |
| Bits per Pixel | 1–64 bits | 8 or 24 bits (typical) | 24–32 bits | 1–8 bits indexed |
| Transparency | Full alpha | None | Full alpha | 1-bit (indexed) |
| Animation | APNG | None | WebP anim | Native |
| Metadata | Rich (chunks) | EXIF/IPTC | EXIF/XMP | Limited |
| Color Profiles | ICC, sRGB, gamma | ICC | ICC | None |
| Artifacts | None (lossless) | Banding, blockiness | Minimal (lossy) | Color banding |
| Typical Use | Web graphics, UI, screenshots | Photos (lossy) | Modern web (smaller files) | Animated memes/loops |
PNG Deep Dive Summary
- File Structure – Signature + chunks (IHDR, IDAT, IEND, plus ancillary). Each chunk: length, type, data, CRC.
- Compression – Two-stage: filtering (5 types: None, Sub, Up, Average, Paeth) then DEFLATE (LZ77 + Huffman).
- Color & Bit Depth – 5 color types (0, 2, 3, 4, 6); 1–16 bits per sample, up to 64 bits per pixel. Type 6 (RGBA) is standard for web graphics with transparency.
- Transparency – Full alpha channel (types 4, 6), or a single transparent color or per-palette-entry alpha via the tRNS chunk.
- Interlacing – Adam7 enables progressive display (7 passes; the first pass gives a 1/8 × 1/8 preview from about 1.6% of the pixels).
- Color Management – gAMA, cHRM, sRGB, ICC profiles ensure display consistency.
- FFmpeg – Powerful tool for conversion, scaling, filtering, metadata manipulation, frame extraction, APNG creation, and batch processing.