JPEG was standardized in 1992 by the Joint Photographic Experts Group — and 34 years later, it remains the single most common image format on the internet, in cameras, in documents, and in print. Every device, every operating system, every browser, every email client, every image editor, and every social platform handles JPEG. Nothing else has this coverage.

JPEG's longevity isn't because it's the best compressor (WebP and AVIF both beat it significantly). It's because the format achieved universal critical mass decades ago, and the switching costs for billions of devices and trillions of images are enormous. Understanding how JPEG actually works — the DCT transform, quantization, chroma subsampling, quality factor — helps you make better decisions about when to use it and how to optimize it.

This guide covers the complete compression pipeline, practical quality settings, progressive vs baseline encoding, metadata handling, and modern optimization with mozjpeg. Converting away from JPEG? Convert JPG to WebP for 25-35% smaller web files, or JPG to AVIF for 50%+ savings.

The JPEG Compression Pipeline

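JPEG compression happens in five stages, each contributing to the final file size: color space conversion, chroma subsampling, the DCT, quantization, and lossless entropy coding (described at the end of Step 4). Understanding these stages explains both JPEG's strengths and its characteristic artifacts.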

Step 1: Color Space Conversion (RGB to YCbCr)

JPEG converts RGB pixels to YCbCr: luminance (Y = brightness), blue-difference chrominance (Cb), and red-difference chrominance (Cr). This separates brightness information (which humans are very sensitive to) from color information (which humans are less sensitive to).

The conversion itself is invertible: apart from small rounding errors when values are stored as 8-bit integers, no information is lost in this step. But it enables the next step (chroma subsampling) to discard color data that human eyes won't miss. Without this separation, you couldn't selectively reduce color resolution.
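As a rough illustration of the conversion, here is a minimal sketch using the full-range BT.601 weights that JFIF files conventionally use. The function names are made up for this example, and the code works on unrounded floats, so the round trip is only exact up to floating-point and coefficient rounding.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF / BT.601 weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse transform; close to exact while values stay unrounded floats."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b


# Round-trip a few pixels and print the intermediate YCbCr values.
for pixel in [(255, 255, 255), (255, 0, 0), (12, 200, 90)]:
    ycc = rgb_to_ycbcr(*pixel)
    back = ycbcr_to_rgb(*ycc)
    print(pixel, "->", tuple(round(v, 2) for v in ycc),
          "->", tuple(round(v, 2) for v in back))
```

A neutral pixel such as pure white maps to Cb = Cr = 128, which is why gray images carry almost no chroma information.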

Step 2: Chroma Subsampling

The Cb and Cr channels are downsampled — typically halved in both dimensions (4:2:0 subsampling). This discards 75% of the color resolution while keeping full brightness resolution. For photographs, the visual impact is minimal because human eyes are 2-4x less sensitive to color detail than brightness detail.

4:4:4 - No subsampling. Full color resolution. Used when color detail matters: red text, sharp color transitions, medical imaging.

4:2:2 - Half horizontal color resolution. A middle ground that cuts the raw sample count by about 33%.

4:2:0 - Half horizontal and vertical color resolution. The standard for photographs. This single step reduces the raw sample count by 50% (from 3 samples per pixel to 1.5) with negligible visual impact on natural images.

Where chroma subsampling fails: thin red lines on white backgrounds, colored text, pixel art with precise color placement, and any content where color transitions must be pixel-sharp. For these, use PNG or set 4:4:4 subsampling.
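The effect of subsampling can be sketched in a few lines. This is an illustrative example, not what any particular encoder does: it averages each 2x2 block (encoders may use other filters), assumes even dimensions, and the function names are invented here.

```python
def subsample_420(plane):
    """Downsample a chroma plane by averaging each 2x2 block.

    `plane` is a list of rows with even width and height (an assumption of
    this sketch; real encoders pad odd dimensions).
    """
    out = []
    for r in range(0, len(plane), 2):
        row = []
        for c in range(0, len(plane[r]), 2):
            total = (plane[r][c] + plane[r][c + 1]
                     + plane[r + 1][c] + plane[r + 1][c + 1])
            row.append(total / 4)
        out.append(row)
    return out


def samples_per_image(width, height, scheme):
    """Total Y + Cb + Cr samples for a given subsampling scheme."""
    chroma_fraction = {"4:4:4": 1.0, "4:2:2": 0.5, "4:2:0": 0.25}[scheme]
    luma = width * height
    return luma + 2 * int(luma * chroma_fraction)


# A one-pixel-wide vertical color line next to a different color:
# after 4:2:0 the two columns collapse into one blended chroma value.
print(subsample_420([[200, 100], [200, 100]]))
for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, samples_per_image(4000, 3000, scheme))
```

For a 4000x3000 image this gives 36 million samples at 4:4:4 and 18 million at 4:2:0. The blended value in the first print is the reason thin colored lines smear.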

Step 3: DCT Transform (8x8 Blocks)

The image is divided into 8x8 pixel blocks. Each block is transformed from spatial domain (pixel values) to frequency domain (patterns of varying detail) using the Discrete Cosine Transform.

The DCT produces 64 coefficients per block. The coefficient at position (0,0) is the DC coefficient, which is proportional to the average brightness of the block. The remaining 63 AC coefficients represent increasingly fine patterns of brightness variation. Low-frequency coefficients (position near 0,0) capture broad shapes and gradients. High-frequency coefficients (far from 0,0) capture fine detail, sharp edges, and noise.

The 8x8 block size is fixed in JPEG. This constraint causes the characteristic "blocky" artifacts at low quality: each 8x8 block is compressed independently, and quantization creates visible boundaries between blocks. WebP (4x4 to 16x16 blocks) and AVIF (4x4 to 128x128) use variable block sizes together with deblocking filters, which makes block boundaries far less visible.
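To make the transform concrete, here is a direct (and deliberately slow) sketch of the 8x8 forward DCT using the normalization from the JPEG standard. Real encoders use fast factored versions. The name `dct_8x8` and the sample blocks are illustrative only.

```python
import math

N = 8


def dct_8x8(block):
    """Forward 2D DCT-II of an 8x8 block, using the JPEG normalization."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency (row index of the output)
        for v in range(N):      # horizontal frequency (column index)
            total = 0.0
            for x in range(N):          # pixel row
                for y in range(N):      # pixel column
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


# JPEG subtracts 128 from each 8-bit sample before the transform.
flat = [[138 - 128] * N for _ in range(N)]
print(round(dct_8x8(flat)[0][0], 6))   # DC term; every AC term is ~0
```

With this normalization the DC coefficient of a flat block is 8 times the (level-shifted) pixel value, and a block that only varies left to right has nonzero coefficients only in its first row.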

Step 4: Quantization (The Lossy Step)

This is where quality loss happens. Each of the 64 DCT coefficients is divided by a corresponding value from a quantization table, then rounded to the nearest integer. High-frequency coefficients (fine detail) are divided by larger values, so most of them round to zero. Low-frequency coefficients (broad shapes) are divided by smaller values, preserving them more precisely.

The quantization table determines the quality/size tradeoff. A high-quality setting uses small divisors (preserving more coefficients), producing a larger file. A low-quality setting uses large divisors (zeroing out more coefficients), producing a smaller file with visible artifacts.
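A minimal sketch of the divide-and-round step, using the example luminance table printed in Annex K of the JPEG standard (encoders are free to use other tables). The helper names are invented for this example. Note that Python's `round` resolves exact .5 ties to the nearest even integer, which differs slightly from some encoders.

```python
# Example luminance quantization table from Annex K of the JPEG standard.
LUMA_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=LUMA_TABLE):
    """Divide each DCT coefficient by its table entry and round (lossy)."""
    return [[round(coeffs[r][c] / table[r][c]) for c in range(8)]
            for r in range(8)]


def dequantize(levels, table=LUMA_TABLE):
    """What the decoder does: multiply back. The rounding error stays."""
    return [[levels[r][c] * table[r][c] for c in range(8)]
            for r in range(8)]


# Give every frequency the same amplitude to show how the table treats them.
coeffs = [[30.0] * 8 for _ in range(8)]
levels = quantize(coeffs)
print(levels[0])   # low frequencies survive
print(levels[7])   # high frequencies round to zero
```

The same input amplitude survives in the top-left corner and vanishes in the bottom-right, which is the table expressing "fine detail matters less".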

Once quantized, the data is losslessly compressed using zigzag scanning (converting the 8x8 matrix into a 1D sequence ordered by frequency), run-length encoding of zeros, and Huffman coding. This final stage is lossless — all the quality loss happened in quantization.
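The zigzag scan and zero run-length step can be sketched as follows. This is a simplification: real JPEG caps zero runs at 15 with a special ZRL symbol, codes the DC term separately as a difference from the previous block, and then Huffman-codes the resulting symbols. Function names are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in JPEG zigzag order for an n x n block."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals; odd diagonals run downward, even ones upward.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_length_ac(levels):
    """Encode the 63 AC levels as (zero_run, value) pairs ending in "EOB"."""
    sequence = [levels[r][c] for r, c in zigzag_order()][1:]  # skip DC
    while sequence and sequence[-1] == 0:
        sequence.pop()                  # trailing zeros collapse into EOB
    pairs, run = [], 0
    for value in sequence:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs


# A typical quantized block: a few low-frequency values, zeros elsewhere.
levels = [[0] * 8 for _ in range(8)]
levels[0][0], levels[0][1], levels[2][0] = 10, 3, -1
print(run_length_ac(levels))
```

Because quantization pushes the nonzero values toward the top-left corner, the zigzag order groups them at the front and leaves one long run of zeros that costs a single end-of-block symbol.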

Quality Factor: What the Numbers Really Mean

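JPEG quality is usually exposed as a 0-100 scale, but the scale is not standardized. Quality 85 in Photoshop and quality 85 in ImageMagick can produce different quantization tables and different file sizes, and some tools use another scale entirely (FFmpeg's MJPEG encoder, for example, takes a quantizer value where lower means better). The number is a guide, not a specification.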
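As one concrete example of such a mapping, the sketch below follows the scaling rule commonly documented for the IJG libjpeg family: the quality number becomes a percentage that scales a base table. Other tools are not obliged to do this, and the function name and the one-row base table here are for illustration only.

```python
def scale_table(base_table, quality):
    """Scale base quantization values the way IJG libjpeg maps quality 1-100."""
    quality = max(1, min(100, quality))
    # Below 50 the divisors grow quickly; above 50 they shrink linearly.
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Clamp to 1..255 so the values fit baseline JPEG's 8-bit tables.
    return [max(1, min(255, (q * scale + 50) // 100)) for q in base_table]


# First row of the Annex K example luminance table.
base_row = [16, 11, 10, 16, 24, 40, 51, 61]
for quality in (10, 50, 75, 100):
    print(quality, scale_table(base_row, quality))
```

Under this rule quality 50 returns the base table unchanged and quality 100 sets every divisor to 1, so even "100" still rounds coefficients to integers and is not lossless.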

What happens at each quality range for a typical 12MP photograph:

Quality | Typical File Size | Visual Quality | Use Case
--------|-------------------|----------------|---------
95-100 | 5-10 MB | Indistinguishable from original | Archival, printing
85-92 | 2-4 MB | Excellent, artifacts invisible at normal viewing | High-quality web, social media
75-84 | 1-2 MB | Good, minor artifacts in gradients on close inspection | General web delivery
60-74 | 500 KB-1 MB | Acceptable, artifacts visible on zoom | Thumbnails, previews
30-59 | 200-500 KB | Noticeable artifacts, blockiness in flat areas | Tiny thumbnails only
0-29 | 50-200 KB | Severe artifacts, blocky, color bleeding | Placeholder images, LQIP

The sweet spot for web delivery: quality 82-87 with mozjpeg encoding, which typically produces files 5-15% smaller than libjpeg-turbo at the same visual quality.

Progressive vs Baseline JPEG

Baseline JPEG loads top-to-bottom, one row of 8x8 blocks at a time. On a slow connection, you see the top of the image first, and the rest fills in downward as data arrives. It is fast to encode and decode.

Progressive JPEG stores multiple scans of the entire image at increasing quality. Scan 1 shows a blurry full-image preview. Scan 2 adds more detail. Scan 3 adds more. Final scan provides full quality. The entire image appears immediately (blurry) and sharpens over time.

Which to use:

  • Progressive files are typically 2-5% smaller than baseline at the same quality (more efficient Huffman coding across scans)
  • Progressive provides better perceived loading speed — users see something immediately
  • Progressive requires slightly more CPU for decoding (multiple passes)
  • Baseline is simpler and faster to encode

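For web delivery, progressive is almost always better: smaller file, better user experience during loading. mozjpeg produces progressive output by default; in many other encoders and editors it is an option you have to enable. For very small files (under ~10KB), baseline can come out smaller, so it is worth checking both.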
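To check which kind of file you already have, you can look at the start-of-frame marker in the header: SOF0 (0xFFC0) indicates baseline and SOF2 (0xFFC2) indicates progressive. The sketch below walks the header segments with the standard library only. The function name is invented here, and it assumes a well-formed file with no padding bytes between segments.

```python
# Start-of-frame markers are 0xC0-0xCF except DHT (C4), JPG (C8), DAC (CC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_encoding_mode(data: bytes) -> str:
    """Return "baseline", "progressive", or "other" from the first SOF marker."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker in SOF_MARKERS:
            return {0xC0: "baseline", 0xC2: "progressive"}.get(marker, "other")
        if marker == 0xDA:              # start of scan: the header is over
            break
        # Segment length is big-endian and includes its own two bytes.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + length
    raise ValueError("no SOF marker found")


# Usage with a hypothetical file name:
# with open("photo.jpg", "rb") as f:
#     print(jpeg_encoding_mode(f.read()))
```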

EXIF, IPTC, and XMP Metadata

JPEG files carry metadata in up to three systems, each with different purposes.

EXIF (Camera Data)

EXIF stores camera settings at capture time: camera model, lens, ISO, aperture, shutter speed, focal length, white balance, flash status, GPS coordinates, timestamp, and orientation. EXIF data is typically 5-50KB but can reach 500KB if the camera embeds a large thumbnail preview.

Privacy concern: GPS coordinates in EXIF data reveal exactly where a photo was taken. Before sharing photos publicly, strip EXIF GPS data (or all EXIF) to protect location privacy. Social media platforms (Instagram, Facebook, Twitter) strip EXIF automatically on upload. Email does not.

Orientation tag: Modern cameras store photos in the sensor's native orientation and include an EXIF orientation tag that tells software how to rotate the display. A photo taken in portrait mode is stored landscape with a rotation tag. If your JPEG appears rotated wrong, the EXIF orientation isn't being read — either the viewer doesn't support it or the metadata was stripped during conversion.
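The orientation tag takes one of eight values. The sketch below applies each one to a tiny "image" stored as a list of pixel rows, so you can see what a viewer has to do. It is a pure-Python illustration with an invented function name, not how imaging libraries implement it internally.

```python
def apply_orientation(grid, orientation):
    """Return `grid` (a list of pixel rows) as it should be displayed."""
    def mirror(g):                      # flip left-right
        return [row[::-1] for row in g]

    def flip(g):                        # flip top-bottom
        return g[::-1]

    def transpose(g):                   # swap rows and columns
        return [list(col) for col in zip(*g)]

    transforms = {
        1: lambda g: [list(row) for row in g],          # already upright
        2: mirror,
        3: lambda g: mirror(flip(g)),                   # rotate 180
        4: flip,
        5: transpose,
        6: lambda g: mirror(transpose(g)),              # rotate 90 clockwise
        7: lambda g: mirror(flip(transpose(g))),        # transverse
        8: lambda g: flip(transpose(g)),                # rotate 90 counter-clockwise
    }
    # Unknown or missing values are treated as "no rotation needed".
    return transforms.get(orientation, transforms[1])(grid)


# A portrait photo stored landscape (2 rows x 3 columns) with orientation 6.
stored = [[1, 2, 3],
          [4, 5, 6]]
for row in apply_orientation(stored, 6):
    print(row)
```

Values 6 and 8 are the ones that phone and camera portrait shots normally produce; the mirrored variants (2, 4, 5, 7) are rare in practice.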

IPTC and XMP

IPTC (International Press Telecommunications Council) stores editorial metadata: caption, photographer name, copyright, keywords, location name (city/country rather than GPS). Used extensively in photojournalism and stock photography.

XMP (Extensible Metadata Platform) is Adobe's XML-based metadata format. It can store everything EXIF and IPTC store, plus custom metadata from Lightroom presets, editing history, face tags, and more. XMP data can be 1-100KB.

For web delivery, strip IPTC and XMP unless your workflow requires them (e.g., copyright metadata for stock photos). The space savings are typically 2-50KB per image.
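Stripping metadata does not require re-encoding the image. EXIF and XMP normally live in APP1 segments and IPTC in an APP13 segment, so a tool can simply drop those segments and copy everything else. The following is a minimal standard-library sketch under that assumption; the function name is invented, and it assumes a well-formed file. It also removes the EXIF orientation tag, so rotate the pixels first if the image depends on it.

```python
METADATA_MARKERS = {0xE1, 0xED}   # APP1 (EXIF, XMP) and APP13 (IPTC)


def strip_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string without APP1/APP13 segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy it verbatim.
            out += data[pos:]
            return bytes(out)
        # Segment length is big-endian and includes its own two bytes.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        if marker not in METADATA_MARKERS:
            out += data[pos:pos + 2 + length]
        pos += 2 + length
    out += data[pos:]
    return bytes(out)
```

Other segments, including an ICC color profile in APP2, are left untouched, so colors render the same after stripping.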

mozjpeg: Modern JPEG Optimization

mozjpeg is Mozilla's optimized JPEG encoder — a drop-in replacement for libjpeg-turbo that produces files 5-15% smaller at the same quality level. It achieves this through improved quantization table selection, better trellis quantization, and optimized Huffman coding.

The results are significant at scale: a website serving 100,000 images per day at an average of 200KB transfers about 20GB daily, so a 5-15% reduction saves 1-3GB of bandwidth per day when switching from libjpeg-turbo to mozjpeg at the same quality setting. The visual output is identical or slightly better.
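The bandwidth figure is simple arithmetic, and it is worth plugging in your own traffic numbers. A small sketch, using decimal units (1 GB = 1,000,000 KB) and a function name made up for this example:

```python
def daily_savings_gb(images_per_day, avg_size_kb, reduction):
    """Bandwidth saved per day in GB (decimal units: 1 GB = 1,000,000 KB)."""
    return images_per_day * avg_size_kb * reduction / 1_000_000


# 100,000 images/day at 200 KB each is 20 GB/day before optimization.
for reduction in (0.05, 0.15):
    saved = daily_savings_gb(100_000, 200, reduction)
    print(f"{reduction:.0%} smaller files -> about {saved:.1f} GB/day saved")
```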

mozjpeg in practice:

  • ImageMagick and Sharp (libvips) can use mozjpeg as their JPEG encoder
  • Squoosh (Google's web image optimizer) uses mozjpeg
  • Most CDN image optimization pipelines (Cloudflare, Cloudinary, imgix) use mozjpeg or similar optimized encoders
  • Encoding speed is 3-5x slower than libjpeg-turbo, which is fine for build-time optimization but noticeable in real-time pipelines

Understanding JPEG Artifacts

JPEG produces characteristic visual artifacts that become visible at lower quality settings. Knowing what causes them helps you choose the right quality level.

Block artifacts: Visible 8x8 pixel grid boundaries, especially in flat-color areas. Caused by independent quantization of each 8x8 block — adjacent blocks may quantize to slightly different values, creating visible edges. Most noticeable at quality 60 and below.

Ringing artifacts (Gibbs phenomenon): Halo-like patterns around sharp edges and text. Caused by the DCT's inability to perfectly represent sharp transitions with limited frequency coefficients. Looks like bright/dark echoes around high-contrast edges.

Color bleeding: Color information spreading across edges, especially at block boundaries. Caused by chroma subsampling — the lower-resolution color channels can't track sharp color transitions as precisely as the luminance channel.

Mosquito noise: Shimmering artifacts in flat areas near edges in motion (video) or near text/lines in still images. A combination of ringing and block artifacts interacting at high-contrast boundaries.

All these artifacts are most visible on screenshots, text, line art, and computer-generated graphics. They're least visible on photographic content with natural textures and gradual transitions — which is exactly what JPEG was designed for.
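Ringing is easy to reproduce in one dimension. The sketch below takes an 8-sample hard edge, keeps only the four lowest-frequency DCT coefficients, and reconstructs it. Zeroing coefficients here stands in for coarse quantization, which is an approximation of what a JPEG encoder does, and the function names are illustrative.

```python
import math


def dct_1d(signal):
    """Orthonormal 1D DCT-II."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(signal)))
    return out


def idct_1d(coeffs):
    """Inverse of dct_1d (an orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k, c in enumerate(coeffs))
            for i in range(n)]


edge = [0, 0, 0, 0, 100, 100, 100, 100]      # a sharp dark-to-bright edge
coeffs = dct_1d(edge)
truncated = coeffs[:4] + [0.0] * 4           # drop the high frequencies
ringing = idct_1d(truncated)
print([round(v, 1) for v in ringing])
```

The reconstruction overshoots above 100 on the bright side and dips below 0 on the dark side. Those overshoots are the bright and dark echoes seen next to text and sharp edges in heavily compressed JPEGs.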

When JPEG Is Still the Right Choice

Despite being outperformed by WebP and AVIF on compression, JPEG remains the right choice in several scenarios:

  • Email: Every email client renders JPEG. WebP support in email is inconsistent. AVIF support is nonexistent. For email images, JPEG is the only safe choice besides PNG.
  • Print: Print workflows universally accept JPEG (and TIFF). Print shops won't handle WebP or AVIF. Use JPEG at quality 95+ for print, or TIFF for maximum quality.
  • Universal sharing: When you don't know what device or software the recipient uses, JPEG is the safest bet. USB drives, presentations, documents, old phones, legacy systems — JPEG works everywhere.
  • Existing JPEG files: Avoid re-encoding JPEG to JPEG. Every lossy re-encode discards more data, even at the same quality setting. If you have a JPEG source, keep the original and convert once to WebP/AVIF for web delivery.
  • Real-time encoding: JPEG encodes 5-10x faster than WebP and 10-50x faster than AVIF. For live camera feeds, real-time image processing, and latency-sensitive applications, JPEG's encoding speed matters.

JPEG is 34 years old and still the most important image format in the world. Not because it's the best compressor — it isn't — but because it achieved universal adoption before the alternatives existed, and the internet's infrastructure is built around it. Understanding how JPEG compression works helps you use it effectively: the right quality level, progressive encoding, mozjpeg optimization, and knowing when to switch to a newer format.

For web-optimized delivery, convert JPEG to a modern format: JPG to WebP typically saves 25-35%, and JPG to AVIF saves 50% or more. Convert in the other direction (WebP to JPG, AVIF to JPG) when you need universal compatibility.