Archives serve two purposes that people frequently conflate: bundling multiple files into one, and compressing the result to save space. Some formats do both (ZIP, 7Z, RAR). Others deliberately separate the concerns — TAR bundles files without compressing, then a second tool (GZIP, BZIP2, XZ, or Zstandard) compresses the tarball. This separation isn't a design flaw; it's a feature that gives you control over which compression algorithm to use.
The choice between archive formats comes down to three factors: compression ratio (how small does it get?), compatibility (can the recipient open it?), and speed (how long does compression/decompression take?). No format wins all three. ZIP sacrifices compression ratio for universal compatibility. 7Z achieves the best ratios but requires third-party software. TAR.GZ balances all three for Unix environments. Here's the full breakdown.
ZIP: The Universal Default
ZIP was created by Phil Katz in 1989 (PKZIP). Windows has shipped built-in ZIP support since the Windows Me and XP era (2000-2001), and Mac OS X has offered it in the Finder since version 10.3 (2003). Every major operating system can create and extract ZIP files without installing anything. That alone makes it the right choice for 80% of use cases.
How ZIP Works
ZIP compresses each file independently using the Deflate algorithm (by default). This per-file compression means you can extract a single file from a ZIP archive without decompressing everything else — a real advantage for large archives where you only need one file.
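To make the single-file extraction point concrete, here is a minimal sketch using Python's standard `zipfile` module. The member names and contents are made up for illustration; the point is that reading one member does not touch the others.

```python
import io
import zipfile


def build_sample_zip() -> bytes:
    """Build a small in-memory ZIP with three members (hypothetical names)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("docs/readme.txt", "hello from the readme\n")
        zf.writestr("data/a.csv", "id,value\n1,10\n")
        zf.writestr("data/b.csv", "id,value\n2,20\n")
    return buf.getvalue()


def read_one_member(zip_bytes: bytes, name: str) -> bytes:
    """Read a single member by name.

    The archive's directory records where each member starts, so only the
    requested member is decompressed.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.read(name)


archive = build_sample_zip()
print(read_one_member(archive, "data/b.csv").decode())
```

With a real archive on disk you would pass a path to `zipfile.ZipFile` instead of an in-memory buffer.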
The tradeoff is compression efficiency. Because files are compressed independently, ZIP can't exploit redundancy between files. If your archive contains 500 similar XML files, ZIP compresses each one separately and misses the massive cross-file redundancy that solid compression would catch.
Modern ZIP implementations support algorithms beyond Deflate: LZMA, Bzip2, and Zstandard (ZSTD). But these newer methods reduce compatibility — Windows' built-in ZIP handler only understands Deflate. If you use LZMA compression inside a ZIP, the recipient needs 7-Zip or WinRAR to open it, defeating ZIP's compatibility advantage.
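If you are unsure whether an archive will open in a basic ZIP handler, you can inspect which compression method each entry uses. The sketch below relies on Python's standard `zipfile` module; treating "stored" (no compression) as safe alongside Deflate is an assumption added here, and the helper names are hypothetical.

```python
import io
import zipfile

# Assumption: entries that are stored (uncompressed) or Deflate-compressed
# are the ones a basic ZIP handler can be expected to read.
WIDELY_SUPPORTED = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}


def make_zip(method: int) -> bytes:
    """Create a one-file in-memory ZIP using the given compression method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        zf.writestr("notes.txt", "example " * 200)
    return buf.getvalue()


def uses_only_widely_supported_methods(zip_bytes: bytes) -> bool:
    """Return True if every entry is stored or Deflate-compressed."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return all(info.compress_type in WIDELY_SUPPORTED for info in zf.infolist())


print(uses_only_widely_supported_methods(make_zip(zipfile.ZIP_DEFLATED)))
print(uses_only_widely_supported_methods(make_zip(zipfile.ZIP_LZMA)))
```

A check like this is cheap to run before sending an archive to someone whose tooling you don't control.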
ZIP32 vs ZIP64
Classic ZIP (ZIP32) stores sizes and offsets in 32-bit fields and the entry count in a 16-bit field, which gives it hard limits: neither an individual file nor the archive as a whole can exceed 4GB, and the archive can't contain more than 65,535 files. ZIP64, introduced in 2001, removes these limits. Most modern tools create ZIP64 automatically when needed, but very old software may choke on ZIP64 archives. In practice, this is only a concern if your recipient is running software from before 2005.
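The limits come down to simple arithmetic on field widths. The following sketch is a conservative estimate of whether a planned archive would need ZIP64; the function name and the `estimated_archive_size` parameter are hypothetical, and real tools make this decision for you.

```python
# Classic ZIP keeps sizes/offsets in 32-bit fields and the entry count in a
# 16-bit field. The all-ones value in each field is reserved as a marker that
# means "look in the ZIP64 record", so it cannot be used as a real value.
ZIP32_SIZE_MARKER = 0xFFFFFFFF   # 4 GiB - 1
ZIP32_COUNT_MARKER = 0xFFFF      # 65,535


def needs_zip64(file_sizes: list[int], estimated_archive_size: int) -> bool:
    """Conservatively estimate whether an archive requires ZIP64.

    file_sizes: uncompressed size of each file in bytes.
    estimated_archive_size: rough size of the finished archive in bytes
        (an assumption supplied by the caller).
    """
    too_many_entries = len(file_sizes) >= ZIP32_COUNT_MARKER
    file_too_large = any(size >= ZIP32_SIZE_MARKER for size in file_sizes)
    archive_too_large = estimated_archive_size >= ZIP32_SIZE_MARKER
    return too_many_entries or file_too_large or archive_too_large


print(needs_zip64([100, 200], 300))
print(needs_zip64([5 * 1024**3], 1024))
```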
ZIP Encryption: AES-256 Only
ZIP supports two encryption methods. ZipCrypto is the legacy method — it's fast but cryptographically broken. Known-plaintext attacks can crack ZipCrypto in minutes if the attacker knows any file in the archive (and many archives contain predictable files like XML manifests). Never use ZipCrypto for anything sensitive.
AES-256 is the modern option and is genuinely secure. 7-Zip, WinRAR, and macOS (via Keka) all support AES-256 ZIP encryption. Windows' built-in ZIP handler has historically understood only ZipCrypto, so don't count on it to open AES-256 encrypted ZIPs; recipients on Windows should expect to need a third-party tool such as 7-Zip.
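You can check which kind of encryption an existing archive uses without knowing the password. This sketch reads entry metadata with Python's standard `zipfile` module. It assumes the common WinZip-style AES layout, where encrypted entries set bit 0 of the flags and AES entries carry compression method 99; the function names are hypothetical, and the "non-aes" label covers ZipCrypto as well as any other non-AES scheme.

```python
import zipfile

ENCRYPTED_FLAG = 0x1   # bit 0 of the general-purpose flags marks an encrypted entry
AES_METHOD_ID = 99     # method value used by WinZip-style AES entries (assumption)


def encryption_kind(info: zipfile.ZipInfo) -> str:
    """Classify one entry as 'none', 'aes', or 'non-aes' (e.g. ZipCrypto)."""
    if not info.flag_bits & ENCRYPTED_FLAG:
        return "none"
    return "aes" if info.compress_type == AES_METHOD_ID else "non-aes"


def audit(path: str) -> dict[str, str]:
    """Map every entry name in the archive at `path` to its encryption kind."""
    with zipfile.ZipFile(path) as zf:
        return {info.filename: encryption_kind(info) for info in zf.infolist()}
```

Note that Python's `zipfile` can list AES entries but cannot decrypt them; the audit only reads metadata.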
7Z: Maximum Compression
7Z is the native format of 7-Zip, created by Igor Pavlov in 1999. It consistently produces the smallest archives among mainstream formats, typically 30-70% smaller than ZIP for the same content. The format is open (LGPL licensed), the reference implementation is free, and it's the go-to choice when file size matters more than convenience.
LZMA2 and Solid Compression
7Z defaults to LZMA2 compression, which uses a dictionary-based algorithm with much larger dictionary sizes than Deflate (up to 1.5GB vs Deflate's 32KB). Larger dictionaries mean the compressor can find and exploit repetitive patterns across a much wider window of data, producing dramatically smaller output for files with long-range redundancy.
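The effect of the window size is easy to reproduce. In this sketch the input is a 100,000-byte block of pseudo-random data followed by an exact copy of itself, so the only redundancy sits 100,000 bytes back, which is beyond Deflate's 32KB window but well inside LZMA's default dictionary. It uses Python's `zlib` and `lzma` modules as stand-ins for ZIP and 7Z; the data is synthetic.

```python
import lzma
import random
import zlib

# Deterministic pseudo-random block: incompressible on its own.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(100_000))

# The second copy starts 100,000 bytes after the first one.
data = block + block

deflate_size = len(zlib.compress(data, 9))       # 32KB window: cannot see the repeat
lzma_size = len(lzma.compress(data, preset=6))   # multi-megabyte dictionary: can

print(f"raw:     {len(data):>7} bytes")
print(f"deflate: {deflate_size:>7} bytes")
print(f"lzma:    {lzma_size:>7} bytes")
```

Deflate ends up storing both copies almost verbatim, while LZMA encodes the second copy as a reference to the first.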
7Z also supports solid compression — treating all files in the archive as a single continuous data stream. Instead of compressing each file independently (like ZIP), 7Z concatenates file contents and compresses the whole stream. If your archive contains 1,000 similar log files, solid compression exploits the cross-file redundancy that ZIP misses entirely.
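A rough way to see what solid compression buys you is to compress a set of similar files one at a time and then as a single concatenated stream. This sketch uses Python's `lzma` module for both cases, so the only difference is per-file versus solid. The XML-like files are synthetic: they share a large body and differ only in an id.

```python
import lzma
import random

rng = random.Random(42)

# A body shared by every file; the random keys make it hard to shrink
# within a single file, but it repeats exactly across files.
shared_body = "".join(
    f"<item key='{rng.getrandbits(64):016x}'/>\n" for _ in range(300)
)


def make_xml(doc_id: int) -> bytes:
    """Build one synthetic XML-like file that differs only in its id."""
    return f"<doc id='{doc_id}'>\n{shared_body}</doc>\n".encode()


files = [make_xml(i) for i in range(100)]

# Per-file: each file is compressed on its own, as ZIP does.
per_file_total = sum(len(lzma.compress(f)) for f in files)

# Solid: all files are treated as one continuous stream, as solid 7Z does.
solid_total = len(lzma.compress(b"".join(files)))

print(f"per-file total: {per_file_total} bytes")
print(f"solid total:    {solid_total} bytes")
```

Real file sets are rarely this uniform, so expect smaller gaps in practice.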
The downside of solid compression: you can't extract a single file efficiently. To get file #500, the decompressor must first decompress everything that precedes it in the same solid block (files #1 through #499 if the whole archive is one block). For large archives where you frequently extract individual files, this is a real penalty. You can disable solid mode, or cap the solid block size, in 7-Zip's settings if random access matters.
The Compatibility Problem
Native support for 7Z is thin. Windows 11 added basic 7Z extraction in 2023 (23H2 update), but creating 7Z archives on that release still requires third-party software, as does everything on older Windows versions. macOS needs Keka or The Unarchiver. Linux needs p7zip. This means sending a 7Z file to a non-technical recipient often results in "I can't open this", and then you re-send as ZIP anyway.
Use 7Z for: personal backups, archiving for long-term storage, sharing with developers or tech-savvy colleagues, and any scenario where you control both ends of the pipeline. Use ZIP when the recipient might not have 7-Zip installed. Convert 7Z to ZIP when you need to share with someone who can't open it.
TAR.GZ: The Unix Standard
TAR (Tape Archive, 1979) bundles files into a single stream — preserving filenames, directory structure, permissions, ownership, timestamps, and symlinks. It does not compress. GZIP (1992) then compresses the resulting .tar stream. The two-tool pipeline is the standard way to distribute software and create backups on Linux and macOS.
Why TAR Preserves What ZIP Doesn't
TAR preserves Unix file permissions (rwxr-xr-x), user/group ownership, symbolic links, hard links, device nodes, and extended attributes. ZIP's permission handling is incomplete and platform-dependent — extract a ZIP on Linux and file permissions may be wrong or missing. This is why source code and Linux system backups use TAR, not ZIP.
TAR also preserves the exact directory structure, including empty directories (which ZIP sometimes omits). For backing up a server configuration or distributing a project with its directory hierarchy intact, TAR is the reliable choice.
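To see what TAR actually records, here is a small sketch with Python's standard `tarfile` module. It builds an in-memory archive containing an executable script, a symlink, and an empty directory, then reads the metadata back. All names, modes, and the owner are made up for illustration.

```python
import io
import tarfile


def build_tar() -> bytes:
    """Create an uncompressed in-memory tar with a file, a symlink, and an empty dir."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        script = b"#!/bin/sh\necho hello\n"
        info = tarfile.TarInfo("bin/run.sh")
        info.size = len(script)
        info.mode = 0o755            # rwxr-xr-x
        info.uname = "deploy"        # hypothetical owner name
        tf.addfile(info, io.BytesIO(script))

        link = tarfile.TarInfo("bin/run")
        link.type = tarfile.SYMTYPE  # symbolic link entry
        link.linkname = "run.sh"
        tf.addfile(link)

        empty_dir = tarfile.TarInfo("logs")
        empty_dir.type = tarfile.DIRTYPE
        empty_dir.mode = 0o750
        tf.addfile(empty_dir)
    return buf.getvalue()


def describe(tar_bytes: bytes) -> dict[str, dict]:
    """Return the stored metadata for each member, keyed by name."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tf:
        return {
            m.name: {
                "mode": m.mode,
                "owner": m.uname,
                "is_symlink": m.issym(),
                "is_dir": m.isdir(),
                "linkname": m.linkname,
            }
            for m in tf.getmembers()
        }


for name, meta in describe(build_tar()).items():
    print(name, oct(meta["mode"]), meta)
```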
Sequential Access Only
TAR is a stream format designed for tape drives. There's no central directory — to list files or extract one file, you must read from the beginning. GZIP makes this worse: since the entire TAR stream is compressed as one unit, extracting any file requires decompressing everything before it. This is the opposite of ZIP's random access capability.
For archives you'll create once and extract fully (software distribution, backups), sequential access doesn't matter. For archives where users routinely extract individual files, ZIP is more practical.
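The sequential nature of a .tar.gz shows up clearly if you read one in streaming mode. This sketch uses `tarfile` with the `r|gz` mode, which forbids seeking, and counts how many members have to be passed before the wanted one is reached. File names and contents are synthetic.

```python
import io
import tarfile


def build_tar_gz(n_files: int) -> bytes:
    """Create an in-memory .tar.gz with n small text files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for i in range(n_files):
            payload = f"contents of file {i}\n".encode()
            info = tarfile.TarInfo(f"file_{i:03d}.txt")
            info.size = len(payload)
            tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def extract_by_scanning(tar_gz_bytes: bytes, wanted: str) -> tuple[bytes, int]:
    """Stream through the archive until `wanted` is found.

    Returns the file contents and the number of members that had to be
    read (and decompressed) before reaching it.
    """
    scanned = 0
    with tarfile.open(fileobj=io.BytesIO(tar_gz_bytes), mode="r|gz") as tf:
        for member in tf:
            if member.name == wanted:
                return tf.extractfile(member).read(), scanned
            scanned += 1
    raise KeyError(wanted)


payload, skipped = extract_by_scanning(build_tar_gz(10), "file_007.txt")
print(payload.decode().strip(), "| members read first:", skipped)
```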
Compression Variants: GZ vs BZ2 vs XZ vs ZST
TAR itself doesn't compress, so you choose your compression algorithm separately. Here's how they compare on a practical level:
| Suffix | Algorithm | Compression Ratio | Compress Speed | Decompress Speed | Adoption |
|---|---|---|---|---|---|
| .tar.gz (.tgz) | GZIP (Deflate) | Good | Fast | Very fast | Universal |
| .tar.bz2 (.tbz2) | Bzip2 (BWT) | Better (~10-15% over gz) | Slow | Slow | Declining |
| .tar.xz (.txz) | XZ (LZMA2) | Best (~20-30% over gz) | Very slow | Fast | Growing (Linux kernel uses this) |
| .tar.zst | Zstandard | Similar to xz | Very fast | Very fast | Growing fast (Meta/Facebook) |
TAR.GZ is the safe default: universally supported and fast. TAR.XZ gives the best compression and is used by major Linux distributions for package distribution (the Linux kernel source tarball is also .tar.xz). TAR.BZ2 is largely obsoleted by XZ, which compresses better and decompresses faster. TAR.ZST (Zstandard, developed by Facebook/Meta) is the newest contender: at its higher compression levels it comes close to XZ's ratio while compressing and decompressing dramatically faster. Arch Linux and Fedora use Zstandard for packages.
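If you want numbers for your own data rather than a general table, a quick comparison is easy to script. This sketch times the three compressors that ship with Python (`gzip`, `bz2`, `lzma`) on a synthetic log-like sample. The suffix labels only indicate which compressor sits behind each TAR variant (no tar archive is created), and Zstandard is left out because it is not assumed to be available in the standard library here.

```python
import bz2
import gzip
import lzma
import time


def sample_text(lines: int = 20_000) -> bytes:
    """Generate synthetic, repetitive log-like text."""
    return "".join(
        f"GET /api/items/{n % 500} status=200 bytes={n * 37 % 9000}\n"
        for n in range(lines)
    ).encode()


# Compressor behind each TAR suffix, as (compress, decompress) pairs.
CODECS = {
    ".tar.gz": (gzip.compress, gzip.decompress),
    ".tar.bz2": (bz2.compress, bz2.decompress),
    ".tar.xz": (lzma.compress, lzma.decompress),
}


def compare(data: bytes) -> dict[str, tuple[int, float]]:
    """Return {suffix: (compressed size in bytes, compression time in seconds)}."""
    results = {}
    for suffix, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        if decompress(packed) != data:
            raise RuntimeError(f"round trip failed for {suffix}")
        results[suffix] = (len(packed), elapsed)
    return results


data = sample_text()
for suffix, (size, seconds) in compare(data).items():
    print(f"{suffix:9} {size:>8} bytes  {seconds:.3f}s  (raw {len(data)} bytes)")
```

Swap `sample_text()` for a representative slice of your real data; ratios and timings vary a lot with content.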
For cross-archive conversions: TAR.GZ to ZIP, TAR.GZ to 7Z, ZIP to TAR.GZ.
RAR: Proprietary but Capable
RAR is a proprietary format created by Eugene Roshal (RAR stands for Roshal Archive). WinRAR is the shareware utility that everyone has used on a 40-day "trial" for 20 years. The format offers good compression (between ZIP and 7Z), solid compression support, and one genuinely unique feature: recovery records.
Recovery records add redundant error-correction data to the archive. If the file gets partially corrupted (bad disk sector, incomplete download), WinRAR can reconstruct the damaged portion. No other mainstream format offers this. For archival storage on media that might degrade, RAR's recovery records are legitimately useful.
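The idea behind redundant recovery data can be illustrated with a toy example. The sketch below is not RAR's actual scheme, which uses stronger error-correcting codes; it is a simple XOR parity block that can rebuild exactly one lost block, provided you know which block is missing. All names are hypothetical.

```python
from functools import reduce
from typing import Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(blocks: list[bytes]) -> bytes:
    """XOR all equally sized data blocks together into one parity block."""
    return reduce(xor_bytes, blocks)


def repair(blocks: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Rebuild a single missing block (marked as None) from the parity block."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("this toy scheme can rebuild exactly one lost block")
    survivors = [block for block in blocks if block is not None]
    # XOR-ing the parity with every surviving block leaves the lost block.
    rebuilt = reduce(xor_bytes, survivors, parity)
    repaired = list(blocks)
    repaired[missing[0]] = rebuilt
    return repaired


original = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(original)
damaged = [b"AAAA", None, b"CCCC"]
print(repair(damaged, parity))
```

The cost is the same trade-off recovery records make: extra bytes stored up front in exchange for the ability to repair damage later.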
That said, RAR is proprietary. Only RARLAB's own tools (WinRAR and its command-line rar utility) can create RAR files. Extraction is more widely supported (7-Zip, The Unarchiver, and many Linux tools can extract RAR), but the format's future depends entirely on RARLAB's continued development. For distribution, prefer ZIP or 7Z, which are open formats with multiple implementations. Convert RAR to ZIP or RAR to 7Z for better compatibility.
Real-World Compression Benchmarks
| Test Data | Raw Size | ZIP (Deflate) | 7Z (LZMA2) | TAR.GZ | TAR.XZ |
|---|---|---|---|---|---|
| Mixed documents (Word, PDF, images) | 1 GB | ~380 MB | ~280 MB | ~350 MB | ~260 MB |
| Source code (JavaScript project) | 1 GB | ~120 MB | ~60 MB | ~100 MB | ~55 MB |
| Log files (repetitive text) | 1 GB | ~80 MB | ~18 MB | ~65 MB | ~15 MB |
| Already-compressed media (MP4, JPG) | 1 GB | ~990 MB | ~985 MB | ~988 MB | ~984 MB |
Key observations from these numbers:
- Source code and log files show the biggest format differences. 7Z and TAR.XZ exploit long-range redundancy that ZIP/GZIP miss, producing archives roughly 2x to 4x smaller on highly repetitive data (for the log files above, ~18 MB versus ~80 MB)
- Mixed documents show moderate differences. JPG and PDF files within the archive are already compressed internally, limiting further gains. The text-based documents (Word DOCX is XML in a ZIP) drive most of the compression
- Already-compressed media barely compresses at all. MP4, JPG, MP3, and PNG are already entropy-coded. Archiving them saves effectively zero space — you're just bundling files. Don't waste CPU time on maximum compression for media files; use ZIP or TAR with no compression (tar -cf, not tar -czf)
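The last observation can be turned into a quick pre-check: compress a sample of the data and skip compression if almost nothing is gained. In this sketch random bytes stand in for already-compressed media, the 5% threshold is an arbitrary assumption, and the returned strings are `tarfile` modes (`"w"` corresponds to `tar -cf`, `"w:gz"` to `tar -czf`).

```python
import os
import zlib


def compression_gain(sample: bytes) -> float:
    """Fraction of space saved by Deflate on the sample (0.0 means no gain)."""
    if not sample:
        return 0.0
    return 1 - len(zlib.compress(sample, 6)) / len(sample)


def pick_tar_mode(sample: bytes, threshold: float = 0.05) -> str:
    """Choose a tarfile mode: plain bundling unless compression clearly pays off."""
    return "w:gz" if compression_gain(sample) > threshold else "w"


media_like = os.urandom(100_000)       # stand-in for MP4/JPG payloads
text_like = b"log line\n" * 10_000     # highly repetitive text

print("media-like:", pick_tar_mode(media_like))
print("text-like: ", pick_tar_mode(text_like))
```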
Format Comparison at a Glance
| Feature | ZIP | 7Z | TAR.GZ | TAR.XZ | RAR |
|---|---|---|---|---|---|
| Compression ratio | Good | Excellent | Good | Excellent | Very good |
| Compression speed | Fast | Slow | Fast | Very slow | Moderate |
| Decompression speed | Very fast | Fast | Very fast | Fast | Fast |
| Windows native | Full | Extract only (11 23H2+) | Extract only (11 23H2+) | Extract only (11 23H2+) | Extract only (11 23H2+) |
| macOS native | Full | No | Full | Full | No |
| Linux native | Via unzip (usually pre-installed) | Via p7zip | Full | Full | Via unrar |
| Random file access | Yes | No (solid mode) | No | No | No (solid mode) |
| Encryption | AES-256 / ZipCrypto | AES-256 | None (use GPG) | None (use GPG) | AES-256 |
| Unix permissions | Partial | No | Full | Full | No |
| Recovery records | No | No | No | No | Yes |
| License | Open | Open (LGPL) | Open (GNU) | Open (GNU) | Proprietary |
Decision Framework
- Sending files to anyone: ZIP. No explanation needed, no software installation required. Convert 7Z to ZIP or TAR.GZ to ZIP before sharing
- Archiving for long-term storage: 7Z with LZMA2. The 30-70% space savings over ZIP compound across large archives. Storage is cheap but not free
- Linux source code or backups: TAR.GZ or TAR.XZ. TAR preserves permissions and symlinks that ZIP doesn't. Use .tar.gz for speed, .tar.xz for size
- Maximum compression, don't care about speed: TAR.XZ or 7Z. Both use LZMA2 and achieve similar ratios. TAR.XZ is more common on Linux; 7Z is more common on Windows
- Fast compression AND good ratio: TAR.ZST (Zstandard). It compresses nearly as well as XZ but 5-10x faster. Adoption is growing rapidly
- Archiving to storage media that might degrade: RAR with recovery records. This is RAR's one genuinely unique capability
- Compressing already-compressed files (JPG, MP4, MP3): Don't bother with heavy compression. Use ZIP or uncompressed TAR — you're just bundling, not compressing
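The framework above can be written down as a small lookup, which is handy if you script your archiving. This is only one possible reading of the list: the parameter names, the order in which the rules are checked, and the returned labels are all assumptions made for illustration.

```python
def choose_format(
    recipient_unknown: bool = False,
    already_compressed: bool = False,
    needs_recovery: bool = False,
    unix_metadata: bool = False,
    priority: str = "balanced",   # "balanced", "size", or "speed"
) -> str:
    """Suggest an archive format following the decision list (one interpretation)."""
    if recipient_unknown:
        return "zip"                       # opens anywhere without extra software
    if already_compressed:
        return "zip or uncompressed tar"   # just bundling, not compressing
    if needs_recovery:
        return "rar"                       # recovery records
    if unix_metadata:
        # Permissions and symlinks matter: stay with TAR.
        return "tar.xz" if priority == "size" else "tar.gz"
    if priority == "size":
        return "7z"
    if priority == "speed":
        return "tar.zst"
    return "zip"


print(choose_format(recipient_unknown=True))
print(choose_format(unix_metadata=True, priority="size"))
print(choose_format(priority="speed"))
```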
For 80% of people, ZIP is the answer. It works everywhere, it's fast, and the compression is good enough. The remaining 20% — developers, sysadmins, and anyone serious about storage efficiency — should use 7Z for maximum compression or TAR.GZ/TAR.XZ for Unix-native workflows.
If you have an archive in the wrong format, convert it: 7Z to ZIP for sharing, ZIP to 7Z for archival, RAR to ZIP to escape the proprietary format, or ZIP to TAR.GZ for Linux environments. The file contents are identical; only the container and compression change. Keep in mind that Unix permissions and symlinks stored in a TAR may not survive a conversion to ZIP.