These four algorithms all solve the same problem — make files smaller — but they make dramatically different tradeoffs between compression ratio, compression speed, and decompression speed. Choosing the wrong one means either wasting time (slow compression when speed matters) or wasting space (weak compression when ratio matters).

This isn't a theoretical comparison. Below are real benchmarks on text, source code, and binary data. The numbers come from compressing actual datasets on a modern multi-core CPU, not from synthetic benchmarks that don't reflect real workloads.

The short answer: use Zstandard for everything new. Use GZIP for compatibility with legacy systems. Use XZ when absolute minimum file size matters and you don't care about compression speed. Skip BZIP2 entirely — it's been outclassed by XZ since 2009.

GZIP: The Universal Baseline

GZIP (GNU zip, 1992) uses the Deflate algorithm — the same LZ77 + Huffman combination as ZIP. Jean-loup Gailly and Mark Adler created it as a free replacement for the Unix compress utility, which used the patent-encumbered LZW algorithm. GZIP won. It's now the most widely deployed compression algorithm in computing.

Where GZIP Is Used

GZIP is everywhere beyond just file archiving:

  • HTTP: Every web server and browser supports Content-Encoding: gzip. A large share of web traffic is GZIP-compressed on the wire. When you load a website, the HTML, CSS, and JavaScript are typically GZIP-compressed during transfer.
  • TAR.GZ: The default compressed tarball format for software distribution on Unix/Linux since the 1990s.
  • Databases: PostgreSQL, MySQL, and many other databases commonly use GZIP (or the underlying zlib/Deflate) to compress backups.
  • Log files: logrotate on Linux defaults to GZIP for rotating and compressing old log files.
  • APIs: Many REST APIs return GZIP-compressed responses by default.

This universality is GZIP's greatest asset. Every programming language has a GZIP library. Every operating system includes GZIP tools. Every server and client handles GZIP. When in doubt, GZIP works.

Compression Levels (1-9)

GZIP supports levels 1 (fastest, least compression) through 9 (slowest, most compression). The default is 6. Practical differences on a 100MB text file:

| Level | Output Size | Compress Speed | Decompress Speed |
|---|---|---|---|
| 1 (fast) | ~12.5 MB | ~180 MB/s | ~350 MB/s |
| 6 (default) | ~9.2 MB | ~120 MB/s | ~350 MB/s |
| 9 (best) | ~8.8 MB | ~40 MB/s | ~350 MB/s |

Notice: decompression speed is the same regardless of compression level. The level only affects compression time and ratio. Level 9 produces files barely smaller than level 6 (about 4% smaller) but takes 3x longer. Level 6 is the sweet spot for almost all use cases. Level 1 makes sense for real-time compression (streaming, HTTP on-the-fly).
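
The level tradeoff is easy to observe with Python's stdlib gzip module. The sample input below is synthetic, so the sizes won't match the table above; this is a sketch of the behavior, not a benchmark:

```python
import gzip

# Synthetic, highly compressible sample text standing in for a real file.
data = ("The quick brown fox jumps over the lazy dog. " * 20_000).encode()

for level in (1, 6, 9):
    out = gzip.compress(data, compresslevel=level)
    print(f"level {level}: {len(data)} -> {len(out)} bytes")

# Decompression works the same no matter which level produced the data.
assert gzip.decompress(gzip.compress(data, compresslevel=1)) == data
```

Higher levels spend more CPU searching for matches during compression, but the output is ordinary Deflate either way, which is why decompression speed never changes.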

BZIP2: The Obsolete Middle Ground

BZIP2 (1996) uses the Burrows-Wheeler Transform (BWT), a fundamentally different approach than GZIP's LZ77. BWT rearranges the input data to cluster similar bytes together, then uses Move-to-Front transform and Huffman coding to compress the clustered result. This achieves 10-15% better compression than GZIP on most data types.

Why BZIP2 Is Obsolete

BZIP2 was the go-to upgrade from GZIP between 1996 and 2009. Then XZ arrived and made BZIP2 pointless:

| Metric | BZIP2 | XZ | Winner |
|---|---|---|---|
| Compression ratio | Better than GZIP by 10-15% | Better than GZIP by 20-30% | XZ |
| Compression speed | ~25 MB/s | ~15 MB/s | BZIP2 (slightly) |
| Decompression speed | ~80 MB/s | ~200 MB/s | XZ (2.5x faster) |
| Memory usage | ~10 MB | ~65 MB (default) | BZIP2 |

XZ produces significantly smaller files AND decompresses about 2.5x faster. BZIP2's only advantages are slightly faster compression and lower memory usage. For any use case where you chose BZIP2 over GZIP for the better ratio, XZ now gives an even better ratio with faster decompression.

The Linux kernel switched from .tar.bz2 to .tar.xz in 2013. Most Linux distributions have followed. BZIP2 is still supported everywhere, so existing .tar.bz2 files extract fine, but there's no reason to create new ones.

If you have .tar.bz2 files, convert to TAR.GZ for universality or convert to TAR.XZ for better compression.

XZ: Maximum Compression, Minimum Patience

XZ (2009) uses LZMA2, a variant of the LZMA algorithm behind 7-Zip's .7z format. It achieves the best compression ratios of any mainstream single-file compressor. The Linux kernel, most Linux distributions, and many open-source projects distribute their source as .tar.xz because the bandwidth savings at scale justify the slow compression.

When XZ Shines

XZ's strength is compress-once, decompress-many workloads. Compression is slow (5-10x slower than GZIP), but decompression is reasonably fast (roughly half GZIP's speed and far faster than BZIP2). This makes it ideal for:

  • Software distribution: You compress the release tarball once. Millions of users decompress it. Total CPU time saved is enormous.
  • Package repositories: Debian, Fedora, and Arch Linux distribute packages as .tar.xz or .tar.zst. The bandwidth savings across millions of downloads justify the one-time compression cost.
  • Long-term archival: You compress once for storage. When you eventually need the data, decompression is fast.

XZ Compression Levels and Memory

XZ levels range from 0 to 9, with an -e (extreme) option. Memory usage scales dramatically with level:

| Level | Dictionary Size | Compress Memory | Decompress Memory |
|---|---|---|---|
| -1 | 1 MB | ~10 MB | ~2 MB |
| -6 (default) | 8 MB | ~94 MB | ~10 MB |
| -9 | 64 MB | ~674 MB | ~66 MB |
| -9e (extreme) | 64 MB | ~674 MB | ~66 MB |

Level -6 (default) is a good balance. Level -9 uses 7x more memory for marginal gains (typically 2-5% smaller). On memory-constrained systems (embedded devices, CI runners with low RAM), use -1 or -3.
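
Python's stdlib lzma module exposes the same presets, including the extreme flag. A sketch on synthetic input (the memory figures in the table apply to the xz CLI; sizes here are illustrative only):

```python
import lzma

# Synthetic compressible input standing in for a real file.
data = ("import os\nimport sys\nprint(os.getcwd())\n" * 20_000).encode()

fast = lzma.compress(data, preset=1)
default = lzma.compress(data, preset=6)
# The extreme flag is OR-ed onto a numeric preset.
extreme = lzma.compress(data, preset=9 | lzma.PRESET_EXTREME)

print(len(fast), len(default), len(extreme))
assert lzma.decompress(extreme) == data  # round-trips regardless of preset
```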

Zstandard (ZSTD): The Modern Answer

Zstandard was developed by Yann Collet at Facebook (now Meta) and released in 2016. It breaks the traditional compression/speed tradeoff that all previous algorithms imposed. At comparable compression ratios to GZIP, Zstandard compresses 3-5x faster. At comparable ratios to XZ, Zstandard compresses 10-20x faster. Decompression is 3-5x faster than everything else in this comparison.

22 Compression Levels: From Real-Time to Ultra

Zstandard offers levels 1 through 22 (with negative levels for even faster, lower-ratio compression). This range is far wider than GZIP (1-9) or XZ (0-9):

| ZSTD Level | Comparable To | Compress Speed | Ratio (1GB text) |
|---|---|---|---|
| 1 | GZIP -1 ratio, 3x faster | ~500 MB/s | ~42% |
| 3 (default) | GZIP -6 ratio, 3x faster | ~400 MB/s | ~38% |
| 9 | Between GZIP -9 and BZIP2 | ~100 MB/s | ~35% |
| 19 | XZ -6 ratio, 2-3x faster | ~15 MB/s | ~30% |
| 22 | XZ -9 ratio, slightly faster | ~5 MB/s | ~28% |

Decompression speed is consistently 700-900 MB/s regardless of compression level. This is Zstandard's secret weapon — even at ultra-high compression levels that match XZ's ratio, decompression is 3-4x faster than XZ.

Dictionary Pre-Training: ZSTD's Unique Feature

Zstandard supports dictionary training — you can provide a set of sample files, and Zstandard builds a dictionary of common patterns. The dictionary is then used during compression and must be available during decompression. This dramatically improves compression for small files that share common patterns.

Example: compressing individual JSON API responses (1-5KB each). Without a dictionary, GZIP achieves maybe 60% compression. With a Zstandard dictionary trained on 1,000 sample responses, compression reaches 85-90% because the dictionary already contains the common keys, values, and structures.

Dictionary compression is used by: Cloudflare (compressing HTTP responses), databases (compressing small records), and game engines (compressing network packets). It's not typically used for file archiving, but it's a powerful feature for application-level compression.

Adoption Status in 2026

Zstandard adoption has been rapid but isn't universal yet:

  • Linux: Arch Linux uses .tar.zst for all packages. Fedora supports it. The zstd package is available on all major distributions.
  • HTTP: Content-Encoding: zstd is supported in Chrome 123+, Firefox 126+, and Safari 17.4+. Server-side support is growing.
  • Docker: Docker supports Zstandard-compressed image layers (since Docker 23).
  • 7-Zip: Version 21+ supports .tar.zst and Zstandard compression within .7z archives.
  • Gaps: Windows' built-in tools don't support .tar.zst. macOS Archive Utility doesn't support it. Python's stdlib only gained Zstandard support in 3.14 (PEP 784); older versions need a third-party wrapper to handle .tar.zst. These gaps are closing but not closed.

Head-to-Head Benchmarks

All benchmarks run on an 8-core AMD Ryzen 7, 32GB RAM, NVMe storage. Single-threaded compression unless noted. Decompress speed is always single-threaded (multi-threaded decompression is uncommon).

Text Data (1GB English Wikipedia dump)

| Algorithm | Compressed Size | Ratio | Compress Time | Decompress Time |
|---|---|---|---|---|
| GZIP -6 | 341 MB | 3.0x | 8.2s (122 MB/s) | 2.8s (357 MB/s) |
| GZIP -9 | 326 MB | 3.1x | 22s (45 MB/s) | 2.8s (357 MB/s) |
| BZIP2 -9 | 254 MB | 3.9x | 38s (26 MB/s) | 12s (83 MB/s) |
| XZ -6 | 210 MB | 4.8x | 58s (17 MB/s) | 4.9s (204 MB/s) |
| XZ -9 | 195 MB | 5.1x | 110s (9 MB/s) | 5.0s (200 MB/s) |
| ZSTD -3 | 325 MB | 3.1x | 2.4s (417 MB/s) | 1.3s (769 MB/s) |
| ZSTD -9 | 275 MB | 3.6x | 9.5s (105 MB/s) | 1.4s (714 MB/s) |
| ZSTD -19 | 218 MB | 4.6x | 62s (16 MB/s) | 1.5s (667 MB/s) |

Key finding: ZSTD -3 matches GZIP -9 compression ratio while compressing 9x faster and decompressing 2.2x faster. ZSTD -19 nearly matches XZ -6 ratio while decompressing 3.3x faster.
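
A miniature version of this comparison can be run with Python's stdlib (bz2, gzip, lzma; Zstandard is omitted since it isn't in older stdlibs). The corpus below is synthetic, so the ratios will not match the Wikipedia numbers above:

```python
import bz2
import gzip
import lzma

# Small synthetic stand-in corpus; absolute numbers depend entirely on the data.
data = " ".join(f"token{i % 97} value={i % 13}" for i in range(100_000)).encode()

results = {
    "gzip -6": gzip.compress(data, compresslevel=6),
    "bzip2 -9": bz2.compress(data, compresslevel=9),
    "xz -6": lzma.compress(data, preset=6),
}
for name, out in results.items():
    print(f"{name}: {len(data)} -> {len(out)} bytes ({len(data) / len(out):.1f}x)")
```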

Source Code (Linux kernel 6.x source tree, 1.3GB uncompressed)

| Algorithm | Compressed Size | Ratio | Compress Time | Decompress Time |
|---|---|---|---|---|
| GZIP -6 | 178 MB | 7.3x | 10.8s | 3.7s |
| BZIP2 -9 | 122 MB | 10.7x | 50s | 15.8s |
| XZ -6 | 92 MB | 14.1x | 78s | 6.3s |
| ZSTD -3 | 165 MB | 7.9x | 3.1s | 1.7s |
| ZSTD -19 | 98 MB | 13.3x | 85s | 2.0s |

Source code shows the biggest differences between algorithms. XZ -6 produces a file 48% smaller than GZIP -6. ZSTD -19 gets within 6% of XZ -6's ratio while decompressing 3x faster.

Binary Data (500MB compiled executables)

| Algorithm | Compressed Size | Ratio | Compress Time | Decompress Time |
|---|---|---|---|---|
| GZIP -6 | 198 MB | 2.5x | 4.1s | 1.4s |
| BZIP2 -9 | 172 MB | 2.9x | 19s | 6.1s |
| XZ -6 | 135 MB | 3.7x | 28s | 2.5s |
| ZSTD -3 | 185 MB | 2.7x | 1.2s | 0.7s |
| ZSTD -19 | 142 MB | 3.5x | 30s | 0.8s |

Binary data (executables, shared libraries) compresses less than text but still shows significant differences. XZ and ZSTD -19 achieve 3.5-3.7x compression. Note that both 7z and XZ benefit from BCJ (branch/call/jump) filters on executables, which these raw algorithm benchmarks don't use.

Which Algorithm to Use: Decision Framework

  • New projects, no legacy constraints: Zstandard. Level 3 for general use, level 9-15 for distribution, level 19-22 for archival. It's faster at every ratio tier.
  • Maximum compatibility (any system, any decade): GZIP. Level 6 (default). Every system since the 1990s handles .gz files.
  • Absolute smallest file, don't care about compression speed: XZ level 6 or 9. Used for Linux kernel releases, package repositories, and any compress-once/decompress-many scenario.
  • Log file compression: Zstandard level 1-3. Logs are highly compressible and generated continuously — you need fast compression. ZSTD level 1 compresses at 500+ MB/s.
  • HTTP response compression: GZIP for universal browser support. Zstandard for modern browsers (Chrome 123+, Firefox 126+, Safari 17.4+). Brotli (not covered here — it's Google's web-focused algorithm) is the most common GZIP alternative for HTTP.
  • Database backups: Zstandard. PostgreSQL's pg_dump piped to zstd -T0 (multi-threaded) compresses faster than the dump generates data.
  • Replacing existing BZIP2 usage: Switch to XZ if ratio matters, GZIP if speed matters, or Zstandard for the best of both.

The compression landscape has a clear hierarchy in 2026: Zstandard is the best general-purpose choice, GZIP is the compatibility fallback, and XZ is the maximum-compression specialist. BZIP2 is a historical artifact — better than GZIP when it was the only alternative, but outclassed by XZ in ratio and by Zstandard in speed.

For archive conversions between these formats: TAR.GZ to TAR.XZ for smaller files, TAR.XZ to TAR.GZ for compatibility, TAR.BZ2 to TAR.GZ to modernize old archives, or TAR.BZ2 to TAR.XZ to get better compression with faster decompression.