You download a ZIP file. Inside is another ZIP file. Inside that is a TAR.GZ. Inside that are the actual files you need. This isn't unusual — it happens when someone zips a folder that already contains archives, when email workflows add layers of compression, or when software packages contain other packages (Java WARs containing JARs, Docker images containing tar layers).
Nested archives are annoying but manageable. The real danger is malicious nested archives — zip bombs designed to consume all available disk space or crash extraction tools. This guide covers both the practical how-to and the security awareness you need.
Why Nested Archives Happen
- Email workflows: Alice zips a folder containing project files (some of which are .tar.gz or .7z archives). Bob receives it, adds his own files, and zips everything again. Carol adds more files and zips again. Three levels of nesting, each added by a different person trying to share files via email.
- Download aggregation: A download site bundles multiple software packages (each already archived) into a single ZIP for convenience. The user downloads one file and gets a ZIP containing 10 other archives.
- Software packaging: A Java WAR file (ZIP) contains JAR files (also ZIP). A Docker image is a TAR containing layer TARs. An npm package (TGZ) might contain a vendored dependency that's also a TGZ. These are intentional nesting for packaging purposes.
- Backup chains: A backup tarball of a directory that contains other tarballs. If you back up /home/user/ and the user has downloaded .tar.gz files, your backup contains nested archives by default.
- Data hoarding: Downloaded content archives (from torrents, file sharing, web scraping) often arrive pre-archived. Archiving a collection of these creates nested archives automatically.
Recursive Extraction: Tools and Techniques
No mainstream archive tool (7-Zip, WinRAR, tar) recursively extracts nested archives by default. They extract the outer archive, leaving inner archives as regular files. You need an extra step — or a specialized tool — to handle inner archives.
Manual Multi-Pass Extraction
The straightforward approach: extract, find inner archives, extract those, repeat.
# Extract outer archive
7z x download.zip -oextracted/
# Find and extract all inner archives
find extracted/ -name '*.zip' -exec 7z x {} -o{}_extracted \;
find extracted/ -name '*.tar.gz' -exec tar -xzf {} -C extracted/ \;
find extracted/ -name '*.7z' -exec 7z x {} -o{}_extracted \;
find extracted/ -name '*.rar' -exec 7z x {} -o{}_extracted \;

This works but is tedious for deep nesting. For a one-off extraction, it's fine. For automated pipelines, script it with a depth limit to prevent infinite loops.
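The two-pass idea can be exercised end to end with a throwaway nested tarball. A self-contained sketch (all /tmp/nest paths are invented for the demo; assumes tar and gzip are available):

```shell
#!/bin/bash
# Build a nested test archive: inner.tar.gz packed inside outer.tar.gz
set -e
rm -rf /tmp/nest && mkdir -p /tmp/nest/demo /tmp/nest/outer /tmp/nest/out
echo "payload" > /tmp/nest/demo/file.txt
tar -czf /tmp/nest/outer/inner.tar.gz -C /tmp/nest demo
tar -czf /tmp/nest/outer.tar.gz -C /tmp/nest outer

# Pass 1: extract the outer archive
tar -xzf /tmp/nest/outer.tar.gz -C /tmp/nest/out
# Pass 2: find the inner archive it contained and extract that too
inner=$(find /tmp/nest/out -name '*.tar.gz' | head -n 1)
mkdir -p "${inner}_contents"
tar -xzf "$inner" -C "${inner}_contents"
find "${inner}_contents" -name file.txt
```

The `_contents` suffix keeps each inner archive's output next to the archive itself, which makes deep nesting easy to trace afterward.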
Scripted Recursive Extraction
#!/bin/bash
# Recursive archive extraction with depth limit
MAX_DEPTH=5
extract_recursive() {
local dir="$1" depth="$2"
[ "$depth" -gt "$MAX_DEPTH" ] && echo "Max depth reached" && return
find "$dir" \( -name '*.zip' -o -name '*.7z' -o -name '*.rar' \
-o -name '*.tar.gz' -o -name '*.tar.bz2' -o -name '*.tar.xz' \
-o -name '*.tgz' \) -print0 | while IFS= read -r -d '' archive; do
local outdir="${archive}_contents"
mkdir -p "$outdir"
7z x "$archive" -o"$outdir" -y 2>/dev/null || \
tar -xf "$archive" -C "$outdir" 2>/dev/null
extract_recursive "$outdir" $((depth + 1))
done
}
extract_recursive "$1" 0

The MAX_DEPTH limit is critical — without it, a maliciously crafted nested archive could trigger infinite recursion or exhaust disk space. Five levels of nesting handles any legitimate use case.
Zip Bombs: Decompression Attacks
A zip bomb is an archive designed to consume enormous resources when extracted. The most famous is 42.zip (2001): a 42KB ZIP file that decompresses to 4.5 petabytes. It achieves this through recursive nesting: the ZIP contains 16 ZIP files, each of which contains 16 more, and so on, 5 layers deep. The bottom layer contains a 4.3GB file filled with zeros (which compresses to almost nothing).
16^5 = 1,048,576 files times 4.3GB each = 4.5 petabytes. The 42KB compressed representation costs almost nothing to store or transfer, but attempting to extract it will fill any available disk and crash most systems.
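The arithmetic is easy to verify with shell integer math (a quick sketch; the 4.3GB leaf size is kept in tenths of a GB so everything stays integral):

```shell
#!/bin/bash
# 42.zip arithmetic: 16 archives per layer, 5 layers, 4.3 GB per leaf file
files=$(( 16 ** 5 ))                      # bottom-layer file count
total_tb=$(( files * 43 / 10 / 1000 ))    # 43 tenths-of-GB each -> total TB
echo "$files files, ~${total_tb} TB (~4.5 PB)"
```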
Non-Recursive Zip Bombs (More Dangerous)
Modern zip bombs don't rely on nesting — they exploit ZIP's format to achieve extreme compression ratios in a single layer. David Fifield's overlapping-files zip bomb (2019) creates a 42KB ZIP that decompresses to 5.5GB without any nesting. The technique uses overlapping files in the ZIP central directory — multiple directory entries point to the same compressed data block, so one block of compressed zeros is "extracted" as thousands of separate files.
Non-recursive bombs bypass the depth-limit defenses that block classic nested bombs. They look like normal single-layer ZIP files to antivirus scanners and extraction tools, making them harder to detect.
The ratio reveals the danger: a legitimate ZIP rarely achieves better than 20:1 compression (even on highly compressible text). A zip bomb achieves 100,000:1 or higher. Checking the compression ratio before extraction is the best defense.
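That ratio check is easy to script. A sketch, assuming unzip is available for the listing (ratio_of and check_zip are made-up names):

```shell
#!/bin/bash
# Warn when an archive's claimed uncompressed size dwarfs its size on disk
ratio_of() {              # ratio_of COMPRESSED_BYTES UNCOMPRESSED_BYTES
  echo $(( $2 / ($1 > 0 ? $1 : 1) ))
}

check_zip() {             # check_zip FILE [THRESHOLD] -> nonzero if suspicious
  local f=$1 threshold=${2:-100} c u r
  c=$(wc -c < "$f")
  u=$(unzip -l "$f" | awk 'END {print $1}')   # totals line: bytes, file count
  r=$(ratio_of "$c" "$u")
  if [ "$r" -ge "$threshold" ]; then
    echo "SUSPICIOUS: ${r}:1 compression ratio" >&2
    return 1
  fi
  echo "ok: ${r}:1"
}
```

At 42KB compressed claiming 4.5PB, 42.zip scores over 100 billion to 1; even Fifield's single-layer bomb scores around 130,000:1, far past any legitimate ratio.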
Detection and Prevention
Protect yourself from decompression bombs:
- Check the ratio. Before extracting, list the archive contents and compare compressed vs uncompressed sizes. 7z l suspicious.zip shows both. If the uncompressed size is thousands of times larger than the compressed size, it's suspicious. Normal compression ratios for mixed data: 2:1 to 10:1. For text: up to 20:1. Anything above 100:1 warrants caution.
- Set extraction limits. Use tools that support output size limits. 7-Zip doesn't, but you can monitor disk space during extraction: timeout 60 7z x suspicious.zip -o/tmp/extract/ with a disk space check.
- Extract in a constrained environment. Use a Docker container or VM with a disk limit. If the extraction fills the constrained disk, it doesn't affect your main system: docker run --rm -v ./archives:/data -w /data --tmpfs /extract:size=1G alpine sh -c "apk add p7zip && 7z x suspicious.zip -o/extract"
- Limit recursion depth. If recursively extracting nested archives, set a hard maximum depth (3-5 levels). Legitimate nesting rarely exceeds 3 levels.
- Monitor during extraction. Watch disk usage with df -h during extraction of untrusted archives. Kill the extraction if disk usage grows unexpectedly fast.
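The monitoring step can be automated: run the extraction in the background and kill it when free space drops below a floor. A sketch (space_guard is a made-up helper name; adjust the floor to taste):

```shell
#!/bin/bash
# Kill a process when free space in DIR drops below MIN_FREE_KB
space_guard() {           # space_guard PID DIR MIN_FREE_KB
  local free
  while kill -0 "$1" 2>/dev/null; do
    free=$(df -kP "$2" | awk 'NR==2 {print $4}')   # POSIX df: col 4 = available KB
    if [ "$free" -lt "$3" ]; then
      kill "$1" 2>/dev/null
      echo "aborted: free space below ${3}KB" >&2
      return 1
    fi
    sleep 1
  done
  return 0
}

# Usage sketch, pairing it with the 7z command from the text:
# 7z x suspicious.zip -o/tmp/extract -y & space_guard $! /tmp/extract 1048576
```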
Disk Space Estimation for Nested Archives
Before extracting nested archives, estimate the space required:
# Check uncompressed size of outer archive
7z l archive.zip | tail -1
# Shows total uncompressed size
# For TAR.GZ
gunzip -l archive.tar.gz
# Shows compressed and uncompressed sizes
# Rough estimate: you need at least:
# (compressed size) + (uncompressed size) + (inner archive uncompressed sizes)
# Plus working space for the extraction process

Rule of thumb: have 3x the uncompressed size of the outer archive available. If the outer archive is 1GB compressed and 5GB uncompressed, have 15GB free. This accounts for: the compressed file (1GB), the extracted contents including inner archives (5GB), the inner archives extracted (up to 5GB more), plus headroom for the extraction process.
For untrusted archives: check the compression ratio first. If 7z reports that a 1MB archive contains 50GB of data, proceed very carefully — or don't proceed at all.
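For .tar.gz files, the 3x rule can be computed straight from gzip's metadata. A sketch (est_free_needed is a made-up name; note that gzip's stored size field wraps at 4GB, so treat the result as a lower bound for very large archives):

```shell
#!/bin/bash
# Estimate free space to reserve before extracting a .tar.gz: 3x uncompressed
est_free_needed() {       # est_free_needed FILE.tar.gz -> bytes to have free
  local u
  u=$(gzip -l "$1" | awk 'NR==2 {print $2}')   # column 2: uncompressed bytes
  echo $(( u * 3 ))
}
```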
Legitimate Nested Archives in Software
Not all nested archives are accidental or malicious. Some are by design:
- Java WAR/EAR files: A WAR (ZIP) contains JAR files (also ZIP), which contain .class files. This is intentional packaging — WARs bundle a web application with its dependencies.
- Docker images: A docker save creates a TAR containing layer TARs and JSON manifests. This is the designed export format.
- Debian packages: A .deb is an ar archive containing control.tar.gz and data.tar.xz. Two levels of archiving by design.
- OCI/Docker image registries: Image layers are compressed tarballs within a manifest structure.
- App bundles: macOS .app, Android .apk (ZIP), and iOS .ipa (ZIP) all contain internal archives or compressed resources by design.
These legitimate nested archives don't need recursive extraction — the tools that consume them (Java servlet containers, Docker daemon, dpkg) handle the internal structure natively.
Safe Extraction Practices
- List before extracting. Always run 7z l or tar -tf before extracting to see what's inside and how large it is.
- Extract to a dedicated directory. Never extract into your home directory or system directories. Create a temporary extraction directory: mkdir /tmp/extract-$(date +%s) && cd $_
- Watch for path traversal. Malicious archives can contain entries with ../../../etc/passwd paths that write files outside the extraction directory. Modern versions of tar and 7z protect against this, but always extract with -o (7z) or -C (tar) to confine extraction to a specific directory.
- Set a timeout. Use timeout 300 7z x archive.zip to kill extraction if it takes more than 5 minutes — a sign of a decompression bomb or corrupted archive.
- Verify after extraction. Check that the extracted files are what you expected. Unexpected executables, scripts, or symlinks in an archive that should contain documents are red flags.
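A minimal wrapper tying these practices together, sketched with tar (safe_extract is a made-up name; the same shape works with 7z x and -o):

```shell
#!/bin/bash
# List first, extract into a fresh confined directory, with a time limit
safe_extract() {          # safe_extract ARCHIVE.tar -> prints extraction dir
  local dest
  dest=$(mktemp -d /tmp/extract-XXXXXX) || return 1
  tar -tf "$1" > /dev/null || return 1             # 1. list; bail on bad input
  timeout 300 tar -xf "$1" -C "$dest" || return 1  # 2. confined, time-limited
  echo "$dest"
}
```

The caller gets the fresh directory path back, which makes the post-extraction verification step a simple find or ls over a known location.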
Nested archives are mostly an annoyance — an extra extraction step between you and the files you want. The security dimension (zip bombs) is the more important consideration: always check compression ratios on untrusted archives, limit recursion depth, and extract in constrained environments when dealing with unknown sources.
For straightforward archive handling: ZIP to 7Z, TAR.GZ to ZIP, RAR to ZIP — ChangeThisFile handles single-layer archive conversion. For nested archives, you'll need to extract the outer layer first, then convert or extract the inner archives.