Backup archives have different requirements than sharing archives. When you share a file, compatibility is priority one — the recipient needs to open it. When you create a backup, the priorities are: reliable restoration, fast creation, small storage footprint, and long-term format stability.
The wrong format choice for backups costs you either storage (weak compression), time (slow compression on daily backups), or data (format doesn't preserve the metadata you need). This guide covers the practical tradeoffs and gives concrete recommendations based on what you're backing up.
Backup Format Comparison
| Format | Unix Permissions | Compression | Speed | Cross-Platform | Best For |
|---|---|---|---|---|---|
| TAR.ZSTD | Full | Very good | Very fast | Linux/macOS | Daily server backups |
| TAR.XZ | Full | Excellent | Slow | Linux/macOS | Weekly full backups, archival |
| TAR.GZ | Full | Good | Fast | Universal | Compatibility-first backups |
| ZIP | Partial | Good | Fast | Universal | Cross-platform, Windows users |
| 7Z | None | Excellent | Slow | Needs 7-Zip | Windows-only environments |
The clear hierarchy: TAR.ZSTD for daily automated backups (fast, good ratio, preserves everything). TAR.XZ for weekly/monthly archival backups (best ratio, still preserves everything). ZIP only when the backup needs to be opened on Windows without extra software. Avoid 7Z for Unix backups — it loses file permissions and symlinks.
TAR.ZSTD: The Modern Backup Standard
Zstandard compression with TAR archiving is the ideal combination for automated daily backups:
- Fast compression (300-500 MB/s): A 50GB server backup compresses in 2-3 minutes, not 30-60 minutes with XZ. This matters for backup windows — if your backup script runs during a maintenance window or low-traffic period, faster compression means shorter disruption.
- Good ratio: ZSTD at default level (3) achieves ratios comparable to GZIP -9, meaning you're not sacrificing meaningful storage for the speed gain.
- Multi-threaded: `tar cf - directory/ | zstd -T0 > backup.tar.zst` uses all available CPU cores. On an 8-core server, compression throughput exceeds 1 GB/s.
- Full metadata: TAR preserves permissions, ownership, symlinks, timestamps, and extended attributes — everything needed for a faithful system restore.
The only limitation: ZSTD decompression requires the zstd tool, which isn't installed by default on all systems. If your restore scenario involves booting from a minimal rescue environment, verify that zstd is available. For maximum portability in disaster recovery, TAR.GZ is safer.
# Daily backup with Zstandard (multi-threaded)
tar cf - /var/www /etc /home | zstd -T0 -3 > backup-$(date +%Y%m%d).tar.zst
# Restore
zstd -dc backup-20260319.tar.zst | tar xf - -C /
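In practice the daily command is usually wrapped in a small script that also records a hash and prunes old archives. A minimal POSIX sh sketch; the `daily_backup` helper name, the date-stamped naming, and the 14-day retention are assumptions to adapt:

```shell
# daily_backup SOURCE_DIR BACKUP_DIR RETENTION_DAYS
# Archives SOURCE_DIR into BACKUP_DIR with multi-threaded zstd, records a
# SHA-256 next to the archive, and prunes archives past the retention window.
daily_backup() {
    src=$1; dest=$2; keep_days=$3
    stamp=$(date +%Y%m%d)
    mkdir -p "$dest"
    # Archive the source tree with multi-threaded zstd at the default level (3)
    tar cf - -C "$src" . | zstd -T0 -3 > "$dest/backup-$stamp.tar.zst"
    # Record a SHA-256 alongside the archive for later integrity checks
    ( cd "$dest" && sha256sum "backup-$stamp.tar.zst" > "backup-$stamp.tar.zst.sha256" )
    # Drop archives (and their hashes) older than the retention window
    find "$dest" -name "backup-*.tar.zst*" -mtime +"$keep_days" -delete
}

# Example invocation (paths are illustrative):
# daily_backup /var/www /var/backups/daily 14
```

Keeping the hash file next to the archive means the verification step later in this guide needs no extra bookkeeping.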
TAR.XZ: Maximum Compression for Archival
When storage cost matters more than backup speed — weekly full backups stored for years, off-site archival to cloud storage, or environments where bandwidth is expensive — TAR.XZ gives the smallest files:
- 20-30% smaller than TAR.GZ: On a 100GB backup, that's 20-30GB less storage. Over years of retained backups, the savings compound.
- Slow to create: XZ compression at default settings is 5-10x slower than GZIP. A 100GB backup might take 60-90 minutes to compress with XZ vs 10-15 minutes with GZIP. For weekly backups where the window is generous, this is acceptable.
- Fast to decompress: Despite slow compression, XZ decompression is fast (~200 MB/s) — comparable to GZIP. The restore scenario is quick.
A common pattern: daily backups with ZSTD (fast), weekly full backups with XZ (small), and monthly archival backups with XZ that get moved to cold storage.
# Weekly full backup with XZ (using multi-threaded xz)
tar cf - /var /etc /home | xz -T0 -6 > weekly-backup-$(date +%Y%m%d).tar.xz
# Restore
tar -xJf weekly-backup-20260319.tar.xz -C /
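The mixed schedule (daily ZSTD, weekly XZ, monthly archival) translates directly into cron entries. The script paths below are placeholders for wrappers around the commands shown above, not real tools:

```shell
# Daily ZSTD backup at 02:00
0 2 * * * /usr/local/sbin/daily-backup-zstd.sh
# Weekly XZ full backup, Sundays at 03:00
0 3 * * 0 /usr/local/sbin/weekly-backup-xz.sh
# Monthly archival backup moved to cold storage, 1st of each month at 04:00
0 4 1 * * /usr/local/sbin/monthly-archive.sh
```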
Incremental vs Full Backups
Full backups archive everything. Incremental backups archive only files changed since the last backup. The format choice affects how well each strategy works:
- Full backups: Any format works. Compress the entire dataset. Simple to create, simple to restore. Downside: large and slow if the dataset is large.
- Incremental with TAR: GNU tar supports incremental backups natively with `--listed-incremental`. It stores a snapshot file recording each file's state; on subsequent runs, it archives only files modified since the last snapshot. This is the most common incremental backup method on Linux.
- Incremental with ZIP: Not natively supported. You'd need external tooling to track which files changed and add only those to a new ZIP. Possible but awkward.
- Incremental with 7Z: Not supported. 7z has no built-in incremental mechanism.
Incremental TAR is efficient but has a critical gotcha: to restore, you need the full backup plus every incremental backup in order. If any incremental is corrupted or missing, you can't restore past that point. Verify the chain regularly.
# Full backup (creates snapshot file)
tar --listed-incremental=/var/backups/snapshot.snar \
-czf /var/backups/full-$(date +%Y%m%d).tar.gz /home
# Incremental (only changed files since last snapshot)
tar --listed-incremental=/var/backups/snapshot.snar \
-czf /var/backups/incr-$(date +%Y%m%d).tar.gz /home
# Restore: apply full, then each incremental in order
tar --listed-incremental=/dev/null -xzf full-20260315.tar.gz -C /
tar --listed-incremental=/dev/null -xzf incr-20260316.tar.gz -C /
tar --listed-incremental=/dev/null -xzf incr-20260317.tar.gz -C /
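The "verify the chain" step can be automated. A sketch, assuming the date-stamped `full-`/`incr-` naming from the example above and a hypothetical `verify_chain` helper:

```shell
# verify_chain BACKUP_DIR
# Lists the contents of the full backup and every incremental in date order
# (the YYYYMMDD stamps sort lexically), failing on the first unreadable
# archive so you know exactly where the restore chain breaks.
verify_chain() {
    dir=$1
    for archive in "$dir"/full-*.tar.gz "$dir"/incr-*.tar.gz; do
        [ -e "$archive" ] || continue   # skip unmatched globs
        if tar -tzf "$archive" > /dev/null 2>&1; then
            echo "OK   $archive"
        else
            echo "FAIL $archive" >&2
            return 1
        fi
    done
}
```

Listing an archive end to end forces tar to decompress every block, so this catches corruption anywhere in the file, not just in the header.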
Splitting Large Backup Archives
Large backups may need splitting for: cloud storage upload limits, email-based off-site transfer, or FAT32-formatted external drives (4GB file size limit). Two approaches:
- split command (post-compression): Create the archive first, then split it into chunks. Simple, and works with any format: `tar czf - /data | split -b 4G - backup-part-`
- Archive-native splitting: 7z supports multi-volume archives natively: `7z a -v4g backup.7z /data`. This is slightly more convenient because 7z handles reassembly automatically.
For restoration, the chunks must be concatenated before extraction: `cat backup-part-* | tar xzf -` (for split tarballs), or open the first volume in 7-Zip (for multi-volume 7z).
Cloud backup services (AWS S3, Backblaze B2) handle large files natively with multipart uploads, so splitting is mainly needed for physical media or services with per-file limits.
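Chunked backups benefit from per-part hashes, so a damaged chunk can be identified before reassembly rather than discovered as a corrupt archive afterward. A sketch, with `split_archive` as an assumed helper name:

```shell
# split_archive ARCHIVE CHUNK_SIZE
# Splits an existing archive into fixed-size chunks and records a SHA-256
# for each chunk, so a damaged part can be pinpointed before reassembly.
split_archive() {
    archive=$1; chunk=$2
    split -b "$chunk" "$archive" "$archive.part-"
    sha256sum "$archive".part-* > "$archive.parts.sha256"
}

# Verify the parts, then reassemble:
#   sha256sum -c archive.tar.gz.parts.sha256
#   cat archive.tar.gz.part-* > archive.tar.gz
```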
Backup Verification: Hash Checking
A backup you can't restore is not a backup. Verification has two components:
- Integrity checking: Verify the archive file hasn't been corrupted during storage or transfer. Generate a SHA-256 hash immediately after creation and verify it before restoration.
- Content verification: Verify that files inside the archive are correct and complete. TAR can list contents without extracting (`tar -tf archive.tar.gz`). 7z can test archive integrity (`7z t archive.7z`).
# Generate hash after backup creation
sha256sum backup-20260319.tar.zst > backup-20260319.tar.zst.sha256
# Verify before restoration
sha256sum -c backup-20260319.tar.zst.sha256
# Test TAR archive integrity (list all files)
tar -tzf backup-20260319.tar.gz > /dev/null
# Test 7z archive integrity
7z t backup-20260319.7z
Schedule automated verification. Don't wait until you need to restore to discover your backups are corrupted. A weekly cron job that tests random backup archives catches corruption early.
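A weekly spot check can look like this. The `spot_check` name, the `.tar.gz` glob, and the expectation of an adjacent `.sha256` file are assumptions; adapt them to your naming scheme:

```shell
# spot_check BACKUP_DIR
# Picks one backup at random, verifies its stored SHA-256, then confirms
# tar can list the archive end to end. Suitable for a weekly cron job.
spot_check() (
    cd "$1" || exit 1
    # Pick a random gzip tarball (adjust the glob for .tar.zst / .tar.xz)
    pick=$(find . -maxdepth 1 -name '*.tar.gz' | shuf -n 1)
    [ -n "$pick" ] || { echo "no backups found" >&2; exit 1; }
    # Integrity: compare against the hash recorded at creation time
    sha256sum -c "$pick.sha256" || exit 1
    # Content: force a full decompression pass by listing every entry
    tar -tzf "$pick" > /dev/null && echo "OK $pick"
)
```

The function body runs in a subshell, so the `cd` does not leak into the caller.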
Storage Media Considerations
The storage medium affects format choice:
- Local NAS/SAN: No file size limits, fast I/O. Use TAR.ZSTD for speed or TAR.XZ for size. No splitting needed.
- Cloud storage (S3, B2, GCS): Bandwidth costs money. TAR.XZ's smaller size saves transfer costs. Multipart upload handles large files, so no splitting is needed. Consider encryption (`tar cf - /data | xz | gpg -c > backup.tar.xz.gpg`) since the data is stored off-site.
- External USB drives (often FAT32/exFAT): FAT32 has a 4GB file size limit. Split archives if the backup exceeds 4GB, or format the drive as exFAT (no practical file size limit, compatible with Windows/macOS/Linux).
- Optical media (BD-R, M-DISC): Capacity is limited (25-128GB per disc). Use TAR.XZ for maximum compression. Consider RAR with recovery records — optical media degrades over time, and recovery records can repair moderate corruption.
- Tape (LTO): TAR is tape's native format — it was literally designed for tape (Tape ARchive). Use `tar cf /dev/st0 /data` for direct-to-tape backup. Compression is handled by the tape drive's hardware, so don't compress in software.
Restoration Testing: The Most Important Step
Every backup strategy needs regular restoration testing. A backup format that's untested is a hypothesis, not a plan.
- Monthly test restores: Pick a random backup from the retention pool. Extract it to a temporary location. Verify that files are complete, permissions are correct, and the application/service works from the restored data.
- Document the restore procedure: Write down every command needed to restore from your backup archives. Include the tools required (`zstd`, `xz`, `7z`, `gpg`). Store this documentation separately from the backups — if you lose access to the machine, the documentation should still be reachable.
- Test on a different machine: Restoring on the same machine that created the backup hides environment-dependent issues. Test on a clean machine or VM to verify that all required tools are available and the restore procedure is self-contained.
The best backup format is the one you've successfully restored from. Run the test.
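A minimal test-restore harness, assuming gzip tarballs and a hypothetical `test_restore` name. The sanity checks here are deliberately generic; extend them with application-specific validation (does the service start from the restored data?):

```shell
# test_restore ARCHIVE
# Restores a backup into a scratch directory and runs basic sanity checks:
# the extraction succeeds and the restored tree is non-empty. The scratch
# directory is left in place for manual inspection; delete it afterwards.
test_restore() {
    archive=$1
    scratch=$(mktemp -d)
    # Extract into a scratch directory, never over live data
    tar -xzf "$archive" -C "$scratch" || { rm -rf "$scratch"; return 1; }
    count=$(find "$scratch" -type f | wc -l | tr -d ' ')
    echo "restored $count files to $scratch"
    # A real test would also check permissions, ownership, and symlinks
    [ "$count" -gt 0 ]
}
```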
The backup format decision is simpler than the sharing format decision. If you're backing up a Unix/Linux system: TAR.ZSTD for daily speed, TAR.XZ for weekly archival, both preserve full metadata. If you're backing up for cross-platform access: ZIP. If you're archiving to potentially degrading media: RAR with recovery records.
Then test the restore. Every month. From a different machine if possible. The format only matters if the restoration works. To migrate existing backups, recompress TAR.GZ as TAR.XZ to shrink them, or repackage TAR.GZ as ZIP to make them Windows-accessible.