What's the difference between /ebook and /screen in Ghostscript?

/ebook downsamples images to 150 DPI and uses JPEG compression. /screen drops to 72 DPI, which shows up as blur on standard monitors. Use /ebook as your default: it's the sweet spot between size and quality.

Will compression break my PDF links and bookmarks?

Ghostscript /ebook preserves hyperlinks and bookmarks, but it flattens interactive form fields. PyMuPDF lossless preserves everything, including form fields. Neither approach strips links or navigation.

My PDF got bigger after Ghostscript compression — why?

The input images were already compressed at or below 150 DPI. Ghostscript re-encoded them with overhead. Run pdfimages -list on the input to check. If images are already small/JPEG, use PyMuPDF lossless only.

Can I compress just one page of a large PDF?

Ghostscript doesn't support per-page compression directly. Split with pdftk or PyPDF2 (extract page N), compress the page PDF, then merge back. The ChangeThisFile API also accepts page= to select a single page.
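
The split, compress, merge sequence can be sketched as three commands. This is a minimal sketch that only builds the argument lists, assuming pdftk and gs are on the PATH; the function name and the intermediate file names (page.pdf, page-small.pdf, merged.pdf) are hypothetical. Check on your own files whether bookmarks survive the merge step.

```python
from typing import List


def single_page_commands(src: str, page: int, total_pages: int,
                         page_pdf: str = "page.pdf",
                         page_small: str = "page-small.pdf",
                         out: str = "merged.pdf") -> List[List[str]]:
    """Return three argv lists: extract page N, compress it, merge it back."""
    if not 1 <= page <= total_pages:
        raise ValueError("page out of range")

    # Step 1: pull page N out into its own PDF.
    extract = ["pdftk", src, "cat", str(page), "output", page_pdf]

    # Step 2: compress only that page with the /ebook preset.
    compress = ["gs", "-sDEVICE=pdfwrite", "-dPDFSETTINGS=/ebook",
                "-dNOPAUSE", "-dQUIET", "-dBATCH",
                f"-sOutputFile={page_small}", page_pdf]

    # Step 3: rebuild the document from the original (handle A) with the
    # compressed page (handle B) swapped in at position N.
    ranges = []
    if page > 1:
        ranges.append(f"A1-{page - 1}")
    ranges.append("B1")
    if page < total_pages:
        ranges.append(f"A{page + 1}-end")
    merge = ["pdftk", f"A={src}", f"B={page_small}", "cat", *ranges, "output", out]

    return [extract, compress, merge]


for cmd in single_page_commands("input.pdf", page=3, total_pages=10):
    print(" ".join(cmd))
```

Pass each list to subprocess.run(cmd, check=True) in order to actually run the steps.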

What's the file size limit on the API?

25MB on the free tier. Most PDFs are well under this. If you have larger scan PDFs, split them first (pdftk input.pdf burst) then compress each part.
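
A quick pre-upload check tells you whether a file needs splitting at all. This is a rough sketch: it assumes "25MB" means 25,000,000 bytes (the API may count differently), and the helper names are hypothetical. The part count is only a lower bound, since pages are rarely the same size.

```python
import math
import os

# Assumption: the free-tier limit is measured in decimal megabytes.
FREE_TIER_LIMIT_BYTES = 25 * 1000 * 1000


def min_parts(size_bytes: int, limit: int = FREE_TIER_LIMIT_BYTES) -> int:
    """Smallest number of parts needed so each part could fit under the limit."""
    return max(1, math.ceil(size_bytes / limit))


def min_parts_for_file(path: str) -> int:
    """Same calculation, reading the size from disk."""
    return min_parts(os.path.getsize(path))


print(min_parts(80_000_000))  # an 80MB scan needs at least 4 parts
```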

How to Compress a PDF Without Losing Quality

"Compress PDF" means different things depending on the file. A scan-heavy PDF shrinks when you downsample embedded images. A text-only PDF shrinks when you strip unused objects and apply flate compression. Treating both the same way is why most compression tools either do nothing or visibly degrade your document.

This guide covers three techniques ordered by aggressiveness: lossless object cleanup, image-aware compression, and the ChangeThisFile API for environments where you don't want to install Ghostscript or PyMuPDF.

TL;DR

Text-only PDF: PyMuPDF garbage collect + deflate. File shrinks 20-40%, images untouched.
Scan/image-heavy PDF: Ghostscript /ebook preset downsample to 150 DPI. Typical 60-80% size reduction, no visible loss on screen.
No local install: POST to https://changethisfile.com/v1/convert with target=pdf.

Why PDFs get bloated

PDFs accumulate fat in three ways:

Embedded images at print DPI. A scanner creates 600 DPI TIFFs. Acrobat embeds them verbatim. A 20-page scan PDF can be 50MB when 5MB would look identical on screen.
Incremental save cruft. Every time you edit and resave a PDF, Acrobat appends a new revision rather than rewriting the file. A heavily edited document can carry several times the bytes it needs.
Embedded fonts (unsubsetted). A 5MB TrueType font embedded in full when only 30 glyphs are used.

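Lossless compression attacks the second and third causes (incremental-save cruft and unsubsetted fonts). Image resampling attacks the first (print-DPI images). The right technique depends on what's making your file large: pdfinfo yourdoc.pdf shows the page count and file size, pdfimages -list yourdoc.pdf shows image resolution, and pdffonts yourdoc.pdf shows whether fonts are subsetted.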
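
To turn that check into a yes/no answer, you can scan the text that pdfimages -list prints. This is a sketch under one assumption: that each data row has 16 whitespace-separated columns with x-ppi and y-ppi in positions 13 and 14, which matches the Poppler output I have seen but may differ between versions. The function names and the sample listing are illustrative.

```python
from typing import List, Tuple


def image_ppi(listing: str) -> List[Tuple[int, int]]:
    """Extract (x_ppi, y_ppi) for each image row in `pdfimages -list` text."""
    result = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with a page number; the header and dashed rule do not.
        if len(fields) < 16 or not fields[0].isdigit():
            continue
        result.append((int(fields[12]), int(fields[13])))
    return result


def worth_downsampling(listing: str, target_ppi: int = 150) -> bool:
    """True if any image is stored above the target resolution."""
    return any(max(x, y) > target_ppi for x, y in image_ppi(listing))


# Illustrative listing, not output from a real file.
sample = """\
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    5100  6600  gray    1   8  jpeg   no         7  0   600   600 2048K 6.1%
   2     1 image    1275  1650  gray    1   8  jpeg   no        12  0   150   150  310K  15%
"""
print(image_ppi(sample))
print(worth_downsampling(sample))
```

If worth_downsampling returns False, the images are already at or below 150 DPI and the lossless pass is the safer choice.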

Method 1: Lossless compression with PyMuPDF

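PyMuPDF's save options perform garbage collection and flate (zlib) compression without touching image quality at all. Font subsetting is a separate step: call doc.subset_fonts() before saving (depending on the PyMuPDF version, this may need the fontTools package).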

pip install PyMuPDF

import fitz  # PyMuPDF
import os

def compress_pdf_lossless(in_path: str, out_path: str) -> dict:
    doc = fitz.open(in_path)
    doc.save(
        out_path,
        garbage=4,      # remove unused objects, compact the xref, merge duplicate objects and streams
        deflate=True,   # apply flate compression to uncompressed streams
        deflate_images=True,  # deflate image streams that are stored uncompressed
        clean=True,     # clean and sanitize content streams
    )
    doc.close()
    before = os.path.getsize(in_path)
    after = os.path.getsize(out_path)
    return {
        "before_mb": round(before / 1e6, 2),
        "after_mb": round(after / 1e6, 2),
        "reduction_pct": round((1 - after / before) * 100, 1),
    }

result = compress_pdf_lossless("document.pdf", "document-compressed.pdf")
print(result)  # {'before_mb': 8.4, 'after_mb': 5.1, 'reduction_pct': 39.3}

Typical results on text PDFs: 20-40% smaller. On already-compressed image PDFs: 5-15%. This never degrades visual quality.

Method 2: Ghostscript image resampling (best for scans)

Ghostscript can downsample embedded images to screen DPI. The /ebook preset targets 150 DPI — indistinguishable from 300-600 DPI on any display under 200 PPI.
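
The arithmetic behind the savings is worth seeing once. This is a small worked example, assuming a US Letter page (8.5 x 11 inches) scanned edge to edge; the helper name is made up for illustration. Pixel count is not file size, because JPEG compression varies with content, but it shows why downsampling dominates every other technique on scans.

```python
def pixels(width_in: float, height_in: float, dpi: int) -> int:
    """Number of pixels in a page image at the given resolution."""
    return round(width_in * dpi) * round(height_in * dpi)


letter = (8.5, 11.0)  # assumption: US Letter page, full-bleed scan

at_600 = pixels(*letter, 600)  # 5100 x 6600 pixels
at_150 = pixels(*letter, 150)  # 1275 x 1650 pixels

print(at_600, at_150, at_600 / at_150)
# Quartering the DPI cuts the pixel count by (600 / 150) ** 2 = 16x.
```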

# Install
apt install ghostscript   # Linux
brew install ghostscript  # macOS

# /screen = 72 DPI (tiny, visibly degraded — avoid)
# /ebook  = 150 DPI (web/email quality — recommended)
# /printer = 300 DPI (print quality — conservative)
# /prepress = 300 DPI + color profiles kept

gs -sDEVICE=pdfwrite \
   -dCompatibilityLevel=1.4 \
   -dPDFSETTINGS=/ebook \
   -dNOPAUSE -dQUIET -dBATCH \
   -sOutputFile=compressed.pdf \
   input.pdf

import subprocess
import os

def ghostscript_compress(in_path: str, out_path: str, preset: str = "/ebook") -> dict:
    """preset: /screen /ebook /printer /prepress"""
    result = subprocess.run([
        "gs",
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS={preset}",
        "-dNOPAUSE", "-dQUIET", "-dBATCH",
        f"-sOutputFile={out_path}",
        in_path,
    ], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"Ghostscript failed: {result.stderr}")
    before = os.path.getsize(in_path)
    after = os.path.getsize(out_path)
    return {
        "before_mb": round(before / 1e6, 2),
        "after_mb": round(after / 1e6, 2),
        "reduction_pct": round((1 - after / before) * 100, 1),
    }

# 20-page scan PDF: 18.2MB → 3.4MB (81% reduction)
print(ghostscript_compress("scan.pdf", "scan-compressed.pdf"))

Real benchmarks (20-page document scan):

/screen: 18.2MB → 1.1MB (94% — too aggressive, visible blur)
/ebook: 18.2MB → 3.4MB (81% — sweet spot)
/printer: 18.2MB → 7.8MB (57% — conservative)
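
The percentages above follow from the same formula the wrapper functions use. A short check, using only the sizes listed in the benchmark:

```python
def reduction_pct(before_mb: float, after_mb: float) -> float:
    """Size reduction as a percentage of the original size."""
    return round((1 - after_mb / before_mb) * 100, 1)


BEFORE_MB = 18.2
after_by_preset = {"/screen": 1.1, "/ebook": 3.4, "/printer": 7.8}

for preset, after_mb in after_by_preset.items():
    print(f"{preset}: {reduction_pct(BEFORE_MB, after_mb)}%")
```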

Method 3: ChangeThisFile API

POST the PDF with target=pdf. The API runs Ghostscript /ebook server-side — no local installs.

curl -X POST https://changethisfile.com/v1/convert \
  -H "Authorization: Bearer ctf_sk_your_key_here" \
  -F "file=@document.pdf" \
  -F "target=pdf" \
  --output compressed.pdf

import os
import requests

API_KEY = "ctf_sk_your_key_here"

def compress_pdf(in_path: str, out_path: str) -> dict:
    # Upload the PDF as multipart form data with target=pdf
    with open(in_path, "rb") as f:
        resp = requests.post(
            "https://changethisfile.com/v1/convert",
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
            data={"target": "pdf"},
            timeout=120,
        )
    resp.raise_for_status()
    # The response body is the compressed PDF
    with open(out_path, "wb") as f:
        f.write(resp.content)
    before = os.path.getsize(in_path)
    after = os.path.getsize(out_path)
    return {"reduction_pct": round((1 - after/before)*100, 1)}

print(compress_pdf("big.pdf", "small.pdf"))

const fs = require('fs');
const FormData = require('form-data');
const fetch = require('node-fetch'); // node-fetch v2: v3 is ESM-only and cannot be loaded with require()

async function compressPdf(inPath, outPath) {
  const form = new FormData();
  form.append('file', fs.createReadStream(inPath));
  form.append('target', 'pdf');

  const res = await fetch('https://changethisfile.com/v1/convert', {
    method: 'POST',
    headers: { 'Authorization': 'Bearer ctf_sk_your_key_here', ...form.getHeaders() },
    body: form,
  });
  if (!res.ok) throw new Error(await res.text());
  const buf = await res.buffer();
  fs.writeFileSync(outPath, buf);
}

Edge cases and gotchas

Password-protected PDFs. Ghostscript cannot compress password-protected PDFs by default. Decrypt first with qpdf --decrypt input.pdf decrypted.pdf; if the file needs a password to open, supply it with --password=YOUR_PASSWORD.
Already-compressed images. If images are already JPEG-compressed at 150 DPI, Ghostscript /ebook may produce a slightly larger file (re-encoding overhead). Check input DPI with pdfimages -list input.pdf.
Vector/text PDFs don't benefit from Ghostscript image resampling. Use PyMuPDF lossless instead — Ghostscript can even inflate pure-vector PDFs by rebuilding the content stream inefficiently.
Form fields and annotations. Ghostscript flattens interactive forms. If the PDF has fillable fields you need to preserve, use PyMuPDF lossless. The API runs Ghostscript /ebook server-side, so confirm on a sample form that it keeps the fields before relying on it.
PDF/A compliance. Ghostscript's /ebook may break PDF/A compliance markers. For archival PDFs, use /prepress or PyMuPDF.
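
Because Ghostscript can make some files larger (already-compressed images, pure-vector PDFs), one simple guard is to compare sizes and keep whichever file is smaller. This is a minimal sketch; the function name and return values are hypothetical, and it assumes the compressed candidate has already been written to disk.

```python
import os
import shutil


def keep_smaller(original: str, candidate: str, final: str) -> str:
    """Copy the smaller of two files to `final` and report which one was kept."""
    if os.path.getsize(candidate) < os.path.getsize(original):
        shutil.copyfile(candidate, final)
        return "candidate"
    # The compressed file is no smaller, so fall back to the untouched original.
    shutil.copyfile(original, final)
    return "original"
```

Call it after either compression function, for example keep_smaller("input.pdf", "compressed.pdf", "final.pdf"), so a bad result never replaces a better original.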

Scaling tips for bulk compression

from pathlib import Path
import concurrent.futures
import subprocess, os

def compress_one(pdf_path: Path, out_dir: Path) -> str:
    out = out_dir / pdf_path.name
    subprocess.run([
        "gs", "-sDEVICE=pdfwrite", "-dPDFSETTINGS=/ebook",
        "-dNOPAUSE", "-dQUIET", "-dBATCH",
        f"-sOutputFile={out}", str(pdf_path),
    ], check=True)
    before = pdf_path.stat().st_size
    after = out.stat().st_size
    return f"{pdf_path.name}: {before//1024}KB → {after//1024}KB"

input_dir = Path("./pdfs")
out_dir = Path("./pdfs-compressed")
out_dir.mkdir(exist_ok=True)

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futs = [pool.submit(compress_one, p, out_dir) for p in input_dir.glob("*.pdf")]
    for f in concurrent.futures.as_completed(futs):
        print(f.result())

Ghostscript is CPU-bound, so max_workers=os.cpu_count() fully saturates a machine. For API batching, stay under 10 concurrent requests on the free tier (1,000 req/month limit).
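
For the API route, the two limits above (concurrency and monthly quota) can be folded into a small planning step before you submit anything. This is a sketch with assumed numbers: max_concurrent=9 is one reading of "under 10", and the function name and return keys are hypothetical. You need to track used_this_month yourself.

```python
def plan_api_batch(n_files: int, used_this_month: int,
                   monthly_quota: int = 1000, max_concurrent: int = 9) -> dict:
    """Decide how many files to send now and how many workers to use."""
    remaining = max(monthly_quota - used_this_month, 0)
    send = min(n_files, remaining)        # never exceed the monthly quota
    workers = min(max_concurrent, send)   # never exceed the concurrency cap
    return {"send": send, "defer": n_files - send, "workers": workers}


print(plan_api_batch(250, used_this_month=900))
```

Pass the returned workers value as max_workers to a ThreadPoolExecutor and hold the deferred files for the next billing period.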

Match the technique to the PDF type. For scanned documents, Ghostscript /ebook is the industry standard. For text/vector PDFs, PyMuPDF's lossless pass is always safe. Get a free API key (1,000 conversions/month) if you'd rather skip the install.