7z is the format you use when file size actually matters. Created by Igor Pavlov as the native format for his 7-Zip archiver (first released in 1999), 7z consistently produces the smallest archives among mainstream formats. On text-heavy content like source code, log files, and documents, 7z archives are 30-70% smaller than equivalent ZIPs. On highly repetitive data, the difference can exceed 80%.

The format achieves this through two mechanisms: LZMA2, a compression algorithm with dictionary sizes up to 1.5GB (vs Deflate's 32KB in ZIP), and solid compression, which treats all files as one continuous stream to exploit cross-file redundancy. Both are configurable, and understanding when to use each is key to getting the most from the format.

7z is open-source (LGPL license), extensible, and free. The only real downside is that no operating system opens it natively, though Windows 11 added extraction support in late 2023.

Igor Pavlov and the Design of 7z

Igor Pavlov released 7-Zip in 1999 as a free, open-source archiver for Windows. The 7z format was designed alongside the tool, but the two are intentionally separate: the format specification is open, and anyone can implement a 7z reader or writer. The LZMA SDK is released into the public domain, making it one of the most freely licensed compression implementations available.

Pavlov's design philosophy was straightforward: maximize compression ratio, support modern encryption, and keep the format extensible. Unlike ZIP (which has accumulated decades of backward-compatible extensions) or RAR (proprietary and closed), 7z was designed from scratch with a modular architecture. The container format supports pluggable compression methods, filter chains (like the BCJ filter for executables), and multiple encryption algorithms — all cleanly separated.

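7-Zip itself is actively maintained. The 23.x and 24.x releases added an ARM64 branch filter, the ability to unpack Zstandard (.zst) archives, and multi-threading improvements. Official 7-Zip does not write Zstandard-compressed 7z archives; that capability comes from third-party forks. The tool opens ZIP, RAR, TAR, GZIP, BZIP2, XZ, WIM, ISO, and dozens of other formats, making it the Swiss Army knife of archive tools.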

LZMA and LZMA2: The Compression Engines

LZMA (Lempel-Ziv-Markov chain Algorithm) is Pavlov's compression algorithm and the reason 7z files are so small. It combines three techniques: LZ77-style dictionary matching with very large dictionary sizes, context-based probability models (the "Markov chain" in the name) that predict each upcoming bit, and range coding (a fast, byte-oriented form of arithmetic coding) for the final bit-level encoding.

Dictionary Size: Why It Matters So Much

LZMA's dictionary sizes range from 64KB to 1.5GB. ZIP's Deflate uses a fixed 32KB window. At the maximum setting that is a roughly 49,000x difference (1536MB / 32KB = 49,152), and it is the single biggest reason 7z compresses better. A larger dictionary means the algorithm remembers more previously-seen data and can find duplicate patterns separated by millions of bytes.

Concrete example: a 200MB log file with repeating patterns every 50,000 lines. Deflate (32KB window) can only match duplicates that occur within the previous 32,768 bytes. LZMA with a 64MB dictionary can match duplicates up to 64MB apart. In a case like this, LZMA might compress the log file to 5MB where Deflate manages 35MB. Same data, same lossless output, 7x smaller.
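The window effect is easy to reproduce with Python's standard library. The sketch below is an illustration, not a benchmark: it builds synthetic data (one incompressible block repeated every 256KB, an assumption chosen to sit beyond Deflate's reach) and compresses it with zlib's Deflate and with LZMA2 at two dictionary sizes. The helper names are made up for this example.

```python
import lzma
import random
import zlib


def make_long_range_data(block_size=256 * 1024, repeats=8, seed=0):
    """Return one incompressible block repeated `repeats` times.

    Each copy starts `block_size` bytes after the previous one, which is
    farther back than Deflate's 32 KiB window can look.
    """
    rng = random.Random(seed)
    block = rng.getrandbits(8 * block_size).to_bytes(block_size, "little")
    return block * repeats


def lzma2_size_with_dict(data, dict_size):
    """Compressed size using LZMA2 with an explicit dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))


data = make_long_range_data()
deflate_size = len(zlib.compress(data, 9))
# Dictionary shorter than the repeat distance: the copies are invisible.
small_dict_size = lzma2_size_with_dict(data, 64 * 1024)
# Dictionary longer than the whole input: every copy becomes a match.
large_dict_size = lzma2_size_with_dict(data, 4 * 1024 * 1024)

print(f"input:              {len(data):>9,} bytes")
print(f"Deflate (32 KiB):   {deflate_size:>9,} bytes")
print(f"LZMA2 (64 KiB):     {small_dict_size:>9,} bytes")
print(f"LZMA2 (4 MiB):      {large_dict_size:>9,} bytes")
```

LZMA2 with a 64KB dictionary does no better than Deflate here, which suggests the gain in this case comes from the window size rather than from the entropy coder.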

The default dictionary size in 7-Zip is 16MB (for the "Normal" compression level) or 64MB (for "Ultra"). You can set it up to 1536MB, but diminishing returns kick in quickly, and the compressor needs 12x the dictionary size in RAM. A 1.5GB dictionary requires ~18GB of RAM for compression.
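The memory arithmetic can be checked in a few lines. This sketch simply applies the 12x rule of thumb quoted above; the function name and the "Maximum" label are made up for illustration, and real usage varies with the match finder and thread count.

```python
def estimate_compression_ram_mb(dict_size_mb, factor=12):
    """Rule of thumb from the text: about `factor` times the dictionary size."""
    return dict_size_mb * factor


for label, dict_mb in [("Normal", 16), ("Ultra", 64), ("Maximum", 1536)]:
    ram_mb = estimate_compression_ram_mb(dict_mb)
    print(f"{label:<8} {dict_mb:>5} MB dictionary -> about {ram_mb:,} MB of RAM")
```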

LZMA2: Multi-Threading and Better Handling

LZMA2 is an improved wrapper around LZMA that adds two critical features: multi-threaded compression (the input is split into blocks that compress in parallel) and better handling of incompressible data (LZMA2 stores incompressible blocks uncompressed rather than wasting bytes on failed compression attempts).

LZMA2 is the default method in 7z. There's little reason to use LZMA1 unless you need compatibility with very old decompressors. LZMA2 produces nearly identical compression (occasionally a fraction larger, because blocks are compressed independently) while being significantly faster on multi-core processors: a 6-core CPU compresses roughly 4-5x faster with LZMA2 than with single-threaded LZMA.
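Both ideas can be sketched with Python's lzma module. This is not how 7-Zip implements LZMA2 internally; it is a minimal illustration of the principle, assuming independent 1MB blocks compressed as separate .xz streams in a thread pool (CPython's lzma is expected to release the GIL while compressing). The function name and sample data are invented for the example.

```python
import lzma
import os
from concurrent.futures import ThreadPoolExecutor


def compress_blocks_parallel(data, block_size=1 << 20, workers=4):
    """Split `data` into blocks and compress each block independently."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so streams line up with their blocks.
        streams = list(pool.map(lzma.compress, blocks))
    return b"".join(streams)


text = b"GET /index.html 200 OK\n" * 200_000  # highly compressible
noise = os.urandom(1 << 20)                    # effectively incompressible

packed = compress_blocks_parallel(text)
# lzma.decompress() decodes concatenated streams back to back.
restored = lzma.decompress(packed)
print(f"text:  {len(text):,} -> {len(packed):,} bytes, round trip ok: {restored == text}")

# Incompressible input is stored nearly verbatim rather than expanded.
noise_packed = lzma.compress(noise)
overhead = len(noise_packed) - len(noise)
print(f"noise: {len(noise):,} -> {len(noise_packed):,} bytes (overhead {overhead} bytes)")
```

The trade-off is visible in the design: blocks that cannot see each other compress in parallel, but matches never cross a block boundary.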

Solid Compression: Cross-File Redundancy

Solid compression is 7z's second major advantage over ZIP. Instead of compressing each file independently (per-file compression, like ZIP), solid mode concatenates all files into a single continuous data stream and compresses the entire stream as one unit.

Why this matters: if your archive contains 500 JavaScript files from a React project, they share enormous amounts of common text — import statements, React API calls, JSX syntax, variable naming patterns. Per-file compression (ZIP) compresses each file individually and can't exploit any of this cross-file similarity. Solid compression sees the entire 500-file stream and replaces repeated patterns across all files.

Real numbers: 500 similar .js files totaling 50MB. ZIP (Deflate, per-file): 8.2MB. 7z (LZMA2, solid): 2.1MB. That's 4x smaller because solid compression captures the massive cross-file redundancy in similar source files.
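The per-file versus solid difference can be reproduced on synthetic data. The sketch below is an assumption-laden illustration: the "source files" are generated text that share a block of import lines, and both modes use the same raw LZMA2 stream so only the grouping differs. Names such as make_similar_files are hypothetical.

```python
import lzma
import random

RAW_LZMA2 = [{"id": lzma.FILTER_LZMA2, "preset": 6}]


def raw_lzma2_size(data):
    """Size of `data` as a raw LZMA2 stream (no container overhead)."""
    return len(lzma.compress(data, format=lzma.FORMAT_RAW, filters=RAW_LZMA2))


def make_similar_files(count=200, seed=1):
    """Build made-up source files that all start with the same imports."""
    rng = random.Random(seed)
    shared_imports = "".join(
        f"import {{ Widget{n} }} from './widgets/Widget{n}';\n"
        for n in (rng.randrange(10_000) for _ in range(40))
    )
    files = []
    for i in range(count):
        body = "".join(
            f"export const value{i}_{j} = '{rng.getrandbits(64):016x}';\n"
            for j in range(5)
        )
        files.append((shared_imports + body).encode("utf-8"))
    return files


files = make_similar_files()
# ZIP-style: every file is compressed on its own and the sizes are summed.
per_file_total = sum(raw_lzma2_size(f) for f in files)
# Solid-style: all files are concatenated and compressed as one stream.
solid_total = raw_lzma2_size(b"".join(files))

print(f"uncompressed: {sum(len(f) for f in files):,} bytes")
print(f"per-file:     {per_file_total:,} bytes")
print(f"solid:        {solid_total:,} bytes")
```

In per-file mode the shared imports are paid for once per file; in the solid stream they are paid for once in total.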

The Random Access Tradeoff

Solid compression has a cost: you lose random access to individual files. To extract file #500 from a solid archive, the decompressor must decompress the entire stream up to file #500. On a large archive, this means extracting a single small file might decompress gigabytes of data first.

7-Zip mitigates this with configurable solid block sizes. Instead of treating the entire archive as one block, you can set the solid block size (e.g., 64MB). Files within each 64MB block use solid compression, but blocks are independent. Extracting a file only requires decompressing its block, not the entire archive.
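A minimal sketch of the solid-block idea follows. It is not the 7z container layout; it assumes a simple in-memory index and uses lzma.compress for each block, with hypothetical helper names, to show why extracting one file only touches one block.

```python
import lzma


def build_solid_blocks(files, block_limit):
    """Group (name, data) pairs into independently compressed solid blocks.

    Returns (blocks, index), where index maps name -> (block_no, offset, size).
    """
    blocks, index = [], {}
    current, current_size = [], 0

    def flush():
        nonlocal current, current_size
        if current:
            blocks.append(lzma.compress(b"".join(current)))
            current, current_size = [], 0

    for name, data in files:
        # Start a new block when this file would push the block past the limit.
        if current and current_size + len(data) > block_limit:
            flush()
        index[name] = (len(blocks), current_size, len(data))
        current.append(data)
        current_size += len(data)
    flush()
    return blocks, index


def extract_one(blocks, index, name):
    """Decompress only the block that holds `name`, then slice the file out."""
    block_no, offset, size = index[name]
    stream = lzma.decompress(blocks[block_no])
    return stream[offset:offset + size]


files = [(f"file{i}.txt", f"contents of file {i}\n".encode() * 1000) for i in range(10)]
blocks, index = build_solid_blocks(files, block_limit=64 * 1024)
print(f"{len(files)} files stored in {len(blocks)} solid blocks")
print("file7.txt lives in block", index["file7.txt"][0])
```

A smaller block_limit means less data to decompress per extraction but fewer files sharing each dictionary, which is the same trade-off the solid block size setting exposes.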

For archives you create once and extract fully (backups, distribution packages), solid compression is always worth it. For archives where users frequently extract individual files, reduce the solid block size or disable solid mode entirely.

AES-256 Encryption with Filename Protection

7z uses AES-256 in CBC mode for encryption, with a key derived from the password via a SHA-256-based key derivation function. The number of SHA-256 iterations is stored in the archive as a power of two (7-Zip uses 2^19 = 524,288 rounds), making brute-force attacks costly.
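To give a feel for what iterated hashing costs, here is a rough sketch of such a key derivation. The exact input layout (salt, then the password as UTF-16LE, then an 8-byte little-endian counter, all fed into one running SHA-256) follows third-party descriptions of the 7z AES coder and should be treated as an assumption, not a verified reimplementation. The function name is made up.

```python
import hashlib
import time


def derive_7z_key(password, cycles_power=19, salt=b""):
    """Iterated SHA-256 over salt + UTF-16LE password + 8-byte counter.

    Assumption: this layout mirrors third-party descriptions of the 7z
    AES coder; check the 7-Zip source before relying on it.
    """
    encoded = password.encode("utf-16-le")
    digest = hashlib.sha256()
    for counter in range(1 << cycles_power):
        digest.update(salt + encoded + counter.to_bytes(8, "little"))
    return digest.digest()  # 32 bytes, the size of an AES-256 key


start = time.perf_counter()
key = derive_7z_key("correct horse battery staple")
elapsed = time.perf_counter() - start
print(f"derived a {len(key) * 8}-bit key in {elapsed:.2f} s for one password guess")
```

Every password guess an attacker tries has to repeat all 524,288 rounds, which is what makes the iteration count matter.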

The standout feature: 7z can encrypt filenames, not just file contents. In an encrypted ZIP, anyone can see the names, sizes, and timestamps of all files without the password — only the file data is encrypted. In a 7z archive with filename encryption enabled, the entire directory listing is encrypted. Without the password, you can't even see what's inside.

This matters more than people realize. Filenames like "2026_tax_return.pdf" or "acquisition_target_acme_corp.docx" leak sensitive information even when the content is encrypted. If you're encrypting archives for security (not just casual password protection), always enable filename encryption.

In 7-Zip, filename encryption is a checkbox: "Encrypt file names" in the archive creation dialog. On the command line, use 7z a -p -mhe=on archive.7z files/ (the bare -p switch prompts for the password, and -mhe=on turns on header encryption).

Multi-Volume 7z Archives

7z supports splitting archives into volumes of a specified size. This is useful for email attachment limits (typically 25MB), cloud storage upload limits, or fitting archives onto FAT32-formatted drives (4GB file size limit).

Command line: 7z a -v25m archive.7z files/ creates 25MB volumes (archive.7z.001, archive.7z.002, etc.). The recipient needs all volumes to extract. 7-Zip reassembles them automatically when you open the first volume.
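Conceptually, volumes behave like consecutive slices of a single archive byte stream. The sketch below assumes exactly that (plain slicing, no per-volume headers) and uses a tiny volume size and random bytes as a stand-in for real archive data; the function names are hypothetical.

```python
import os


def split_into_volumes(archive_bytes, volume_size, base_name="archive.7z"):
    """Cut an archive's bytes into numbered volumes: .001, .002, ..."""
    volumes = {}
    starts = range(0, len(archive_bytes), volume_size)
    for number, start in enumerate(starts, start=1):
        volumes[f"{base_name}.{number:03d}"] = archive_bytes[start:start + volume_size]
    return volumes


def join_volumes(volumes):
    """Reassemble the archive by concatenating volumes in name order."""
    return b"".join(volumes[name] for name in sorted(volumes))


payload = os.urandom(70_000)  # stand-in for the bytes of a real .7z file
volumes = split_into_volumes(payload, volume_size=25_000)
for name, chunk in volumes.items():
    print(f"{name}: {len(chunk):,} bytes")
print("round trip ok:", join_volumes(volumes) == payload)
```

This also shows why a missing volume is fatal: the archive stream has a hole in it, and nothing in the remaining pieces can fill it.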

Unlike RAR's multi-volume support (which has been standard in the piracy scene for decades), 7z multi-volume archives are less common and less tested across tools. If maximum compatibility matters, consider splitting with the generic split command on Linux or using RAR's multi-volume format instead.

Preprocessor Filters: BCJ, Delta, and More

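7z supports preprocessor filters that transform data before compression to improve the ratio. The most important is BCJ (Branch, Call, Jump), which normalizes branch target addresses in executables. In compiled x86 code, CALL and JUMP instructions usually encode their target as an offset relative to the instruction's own position, so two calls to the same function from different places produce different bytes. BCJ converts these relative offsets to absolute addresses, so every call to the same target becomes an identical byte sequence that the compressor can match. The decoder reverses the transformation exactly.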

On a collection of Windows .exe and .dll files, BCJ + LZMA2 typically compresses 5-15% better than LZMA2 alone. 7-Zip applies the appropriate BCJ filter automatically when it detects executable content. Variants exist for x86, ARM, ARM64, PowerPC, IA-64, and SPARC.
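The effect of a branch filter can be shown with a toy version. The sketch below is deliberately simplified and is not 7-Zip's BCJ implementation: it only handles the x86 CALL opcode (0xE8), rewrites each 32-bit relative operand as an absolute target, and runs on synthetic "code" made of a few fixed byte snippets and calls to eight made-up function addresses.

```python
import lzma
import random

CALL_OPCODE = 0xE8  # x86 CALL with a 32-bit relative operand


def convert_calls(code, encoding):
    """Rewrite CALL operands: relative -> absolute if encoding, else back."""
    out = bytearray(code)
    i = 0
    while i + 5 <= len(out):
        if out[i] == CALL_OPCODE:
            value = int.from_bytes(out[i + 1:i + 5], "little")
            next_instr = i + 5  # relative offsets count from the next instruction
            value = value + next_instr if encoding else value - next_instr
            out[i + 1:i + 5] = (value & 0xFFFFFFFF).to_bytes(4, "little")
            i += 5  # skip the operand so it is never mistaken for an opcode
        else:
            i += 1
    return bytes(out)


def make_fake_code(call_sites=3000, seed=7):
    """Synthetic machine code: short snippets followed by relative CALLs."""
    rng = random.Random(seed)
    snippets = [bytes([0x55, 0x48, 0x89, 0xE5, k, 0x90, 0x90, 0xC3]) for k in range(4)]
    targets = [0x1000 * (k + 1) for k in range(8)]
    code = bytearray()
    for _ in range(call_sites):
        code += rng.choice(snippets)
        rel = (rng.choice(targets) - (len(code) + 5)) & 0xFFFFFFFF
        code += bytes([CALL_OPCODE]) + rel.to_bytes(4, "little")
    return bytes(code)


code = make_fake_code()
filtered = convert_calls(code, encoding=True)
plain_size = len(lzma.compress(code))
filtered_size = len(lzma.compress(filtered))
print(f"LZMA2 alone:      {plain_size:,} bytes")
print(f"toy BCJ + LZMA2:  {filtered_size:,} bytes")
print("reversible:", convert_calls(filtered, encoding=False) == code)
```

The real filters add heuristics to avoid converting bytes that merely look like branch instructions, so gains on real binaries are smaller than on this contrived input.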

The Delta filter is useful for uncompressed audio (WAV, raw PCM) and other data where neighboring samples are similar. It stores the difference between each byte and the byte a fixed, configurable distance earlier (for example 2 bytes for 16-bit mono samples or 4 for 16-bit stereo), which LZMA2 then compresses very efficiently.
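Python's lzma module exposes a Delta filter that can be chained in front of LZMA2, which makes the effect easy to try. The sketch below uses synthetic 16-bit mono PCM (a slow sine wave plus light noise, an assumption standing in for real audio) and a byte distance of 2 so each byte is compared with the same byte of the previous sample.

```python
import lzma
import math
import random
import struct


def make_pcm(seconds=1.0, rate=44_100, freq=47.0, seed=3):
    """Synthesize 16-bit mono little-endian PCM: a slow sine plus light noise."""
    rng = random.Random(seed)
    samples = [
        int(12_000 * math.sin(2 * math.pi * freq * n / rate)) + rng.randint(-2, 2)
        for n in range(int(seconds * rate))
    ]
    return struct.pack(f"<{len(samples)}h", *samples)


def xz_size(data, filters):
    """Compressed size of `data` using the given filter chain."""
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))


pcm = make_pcm()
lzma2 = {"id": lzma.FILTER_LZMA2, "preset": 6}
plain_size = xz_size(pcm, [lzma2])
# dist=2: subtract the byte that sits one 16-bit sample earlier.
delta_chain = [{"id": lzma.FILTER_DELTA, "dist": 2}, lzma2]
delta_size = xz_size(pcm, delta_chain)

print(f"raw PCM:        {len(pcm):,} bytes")
print(f"LZMA2 alone:    {plain_size:,} bytes")
print(f"Delta + LZMA2:  {delta_size:,} bytes")
```

Choosing the distance to match the sample layout matters: with the wrong distance the filter subtracts unrelated bytes and can make compression worse.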

7z's Real Downsides

Slow compression. LZMA2 at Ultra settings (64MB dictionary) is 5-10x slower than Deflate for compression. Decompression is far faster than compression, though usually still somewhat slower than Deflate's, so creating 7z archives is where the time goes. On a 10GB dataset, ZIP creation might take 2 minutes; 7z at Ultra might take 15-20 minutes.

No native OS support for creation. Windows 11 23H2 can extract 7z files, but creating them requires 7-Zip. macOS requires Keka or The Unarchiver. Linux requires p7zip or the 7z package. Sending a 7z file to a non-technical recipient often results in confusion.

Limited Unix metadata preservation. 7z does not store file ownership (user and group), and handling of permissions and symbolic links varies between implementations and versions. If you archive a Linux project with executable scripts and symlinks in 7z, they may not survive extraction intact. For Unix-native archiving, use TAR (optionally compressed with LZMA2 via .tar.xz).

No recovery records. Unlike RAR, 7z has no built-in error correction. A single corrupted byte can make the archive unextractable, especially in solid mode where corruption affects all subsequent files in the block. For archival on potentially degrading media, either add external PAR2 recovery files or use RAR with recovery records.

7z is the right choice when compression ratio matters and you control the tooling on both ends. For personal backups, archiving source code, storing datasets, or distributing files between developers, it produces dramatically smaller archives than ZIP with no loss of data integrity.

The main scenarios where 7z is the wrong choice: sharing with non-technical recipients (use ZIP), archiving Unix systems with permissions (use TAR.XZ), and storage on potentially degrading media (use RAR with recovery records). For everything else, 7z is the most efficient open-source archive format available. Convert your ZIPs to 7Z to see the size difference yourself.