How do I keep custom fonts from breaking pagination?

Embed fonts in the source DOCX before conversion. In Word: File > Options > Save > Embed fonts in the file. The PDF renderer then has the exact fonts and pagination matches Word output.

Can I convert .doc (legacy Word format) too?

Yes. The API supports source=doc, target=pdf, so the same code works with a different source value. LibreOffice headless also converts .doc files. docx2pdf accepts only .docx input, so it does not cover this case.

What's the file size limit?

Free tier: 25MB. Pro tier: 100MB. For larger files (long manuscripts, books), use the async /v1/jobs endpoint with webhook callback.
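
The tier limits above lend themselves to a pre-flight size check before you pick an endpoint. The following is a minimal sketch, not part of the API: the helper names are hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption, since the limits are not stated as decimal or binary. The request shape of /v1/jobs is left out because it is not documented here.

```python
import os

# Tier limits taken from the answer above. Treating 1 MB as 1,000,000 bytes
# is an assumption; confirm the exact limits for your account.
MB = 1_000_000
SYNC_LIMITS = {"free": 25 * MB, "pro": 100 * MB}


def needs_async(size_bytes: int, tier: str = "free") -> bool:
    """Return True when a file exceeds the tier's synchronous size limit."""
    return size_bytes > SYNC_LIMITS[tier]


def file_needs_async(path: str, tier: str = "free") -> bool:
    """Same check, reading the file size from disk."""
    return needs_async(os.path.getsize(path), tier)
```

Calling file_needs_async("manuscript.docx", "pro") before uploading lets you route oversized files to the async endpoint instead of waiting for a rejected request.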

Does the API preserve hyperlinks?

Yes. Both internal and external hyperlinks are preserved as clickable PDF annotations.

What about tracked changes and comments?

Tracked changes and comments are not rendered in the PDF output by default — same behavior as Word's File > Save As PDF when 'Print Layout' is the selected view. Accept changes in the source DOCX before converting if you want them included.

Can I convert DOCX to PDF/A for archival?

Not via the standard /v1/convert endpoint yet — this is on the roadmap. For now, convert to PDF first, then post-process with Ghostscript locally to convert to PDF/A-1b or PDF/A-2b.
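
The Ghostscript post-processing step might look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, the binary is assumed to be on PATH as gs (on Windows it is usually gswin64c), and strict PDF/A compliance typically also requires a PDFA_def.ps file pointing at an ICC profile, which is omitted here. Validate the result with a PDF/A checker before relying on it for archival.

```python
import subprocess

# Map the PDF/A flavors mentioned above to Ghostscript's -dPDFA value.
PDFA_LEVELS = {"1b": 1, "2b": 2}


def ghostscript_pdfa_command(src_pdf: str, dst_pdf: str, level: str = "2b") -> list:
    """Build a Ghostscript command line that rewrites a PDF as PDF/A."""
    return [
        "gs",                              # assumed binary name; differs on Windows
        f"-dPDFA={PDFA_LEVELS[level]}",
        "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=RGB",
        "-dPDFACompatibilityPolicy=1",     # drop non-conforming features, keep going
        f"-sOutputFile={dst_pdf}",
        src_pdf,
    ]


def pdf_to_pdfa(src_pdf: str, dst_pdf: str, level: str = "2b") -> None:
    """Run Ghostscript; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(
        ghostscript_pdfa_command(src_pdf, dst_pdf, level),
        check=True,
        timeout=300,
    )
```

Keeping the command builder separate from the subprocess call makes it easy to log or inspect the exact arguments when a conversion misbehaves.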

How to Convert DOCX to PDF in Python (3 Methods + API)

Converting DOCX to PDF while preserving formatting is notoriously hard because Microsoft Word's rendering is the de facto reference. The DOCX format itself is well specified, but the visual output depends on which fonts are installed, how the renderer handles mixed content, and a long tail of quirks. Free Python options exist, but they all have meaningful tradeoffs.

This guide shows three working approaches with cross-platform notes and the formatting fidelity each one delivers.

Method 1: ChangeThisFile API (works anywhere)

The API uses LibreOffice headless on its servers, which gives you the best free renderer without the install pain. Get a free API key — 1,000 conversions/month on the free tier.

import requests

API_KEY = "sk_test_your_key_here"

def docx_to_pdf(docx_path: str, output_path: str) -> None:
    with open(docx_path, "rb") as f:
        response = requests.post(
            "https://changethisfile.com/v1/convert",
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
            data={"source": "docx", "target": "pdf"},
            timeout=120,
        )
    response.raise_for_status()
    with open(output_path, "wb") as out:
        out.write(response.content)

docx_to_pdf("contract.docx", "contract.pdf")

For batch conversion, parallelize with a thread pool — the API handles concurrent requests well within your rate limit:

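from concurrent.futures import ThreadPoolExecutor
import os

os.makedirs("out", exist_ok=True)  # open() will not create the output directory
files = [f for f in os.listdir("docs") if f.endswith(".docx")]

def convert(filename):
    stem = os.path.splitext(filename)[0]
    docx_to_pdf(f"docs/{filename}", f"out/{stem}.pdf")

with ThreadPoolExecutor(max_workers=8) as pool:
    # Consume the iterator so an exception raised in a worker is re-raised
    # here instead of being silently dropped.
    list(pool.map(convert, files))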

Method 2: docx2pdf (uses Microsoft Word)

docx2pdf shells out to Microsoft Word via COM automation on Windows or AppleScript on macOS. Output fidelity is perfect because it is literally Word doing the rendering. The catch: Linux is not supported, and you need Word installed.

pip install docx2pdf

from docx2pdf import convert

convert("contract.docx", "contract.pdf")

# Or batch convert a directory:
convert("docs/")  # converts all .docx files in docs/

Use docx2pdf when you are on a Windows or macOS workstation, you have Word installed, and you need 1:1 formatting fidelity (legal documents, regulatory filings). Do not use it on production servers: Word automation is fragile and slow.

Method 3: LibreOffice headless (cross-platform, free)

LibreOffice can run in headless mode and convert DOCX to PDF from the command line. This works on Linux, macOS, and Windows.

apt-get install libreoffice  # or: brew install libreoffice

import subprocess
import os

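def docx_to_pdf(docx_path: str, output_dir: str) -> str:
    # On macOS and Windows the binary is usually "soffice" and may not be on PATH.
    result = subprocess.run([
        "libreoffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", output_dir,
        docx_path,
    ], capture_output=True, text=True, timeout=120)

    base = os.path.splitext(os.path.basename(docx_path))[0]
    pdf_path = os.path.join(output_dir, f"{base}.pdf")

    # LibreOffice can exit with code 0 without writing a file, so check both.
    if result.returncode != 0 or not os.path.exists(pdf_path):
        raise RuntimeError(f"LibreOffice failed: {result.stderr}")

    return pdf_path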

pdf_path = docx_to_pdf("contract.docx", "out/")

Two important constraints with headless LibreOffice:

Single-instance bottleneck. LibreOffice headless can only run one conversion at a time per user session. Concurrent conversions queue. Spinning up multiple processes does not parallelize cleanly because they fight over the same user profile directory.
Slow startup. The first conversion in a process takes 5-10 seconds just to spin up LibreOffice. Subsequent conversions in the same long-running process are faster.
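
One common workaround for the shared-profile contention described above is to give each LibreOffice process its own throwaway profile directory through the -env:UserInstallation bootstrap parameter. The sketch below assumes the binary is on PATH as libreoffice; the helper names are hypothetical. Each fresh profile adds startup cost, so this trades latency for isolation rather than removing the slow-start problem.

```python
import subprocess
import tempfile
from pathlib import Path


def libreoffice_command(docx_path: str, output_dir: str, profile_dir: str) -> list:
    """Build a headless conversion command that uses a private profile."""
    # UserInstallation expects a file:// URI, so resolve to an absolute path first.
    profile_uri = Path(profile_dir).resolve().as_uri()
    return [
        "libreoffice",                     # assumed binary name; often "soffice"
        f"-env:UserInstallation={profile_uri}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", output_dir,
        docx_path,
    ]


def convert_isolated(docx_path: str, output_dir: str) -> None:
    """Convert one file with a temporary profile that is deleted afterwards."""
    with tempfile.TemporaryDirectory(prefix="lo_profile_") as profile_dir:
        subprocess.run(
            libreoffice_command(docx_path, output_dir, profile_dir),
            check=True,
            capture_output=True,
            timeout=120,
        )
```

Because each call owns its profile, convert_isolated can be submitted to a small process or thread pool without the workers locking each other out.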

If you need throughput, use the API instead — it pre-warms LibreOffice instances behind a queue, so individual conversions return faster.

Formatting fidelity comparison

Approach	Fidelity	Quirks
ChangeThisFile API	High (LibreOffice on server with full font set)	Custom fonts may fall back; embed fonts in source DOCX for guarantee
docx2pdf (Word)	Perfect (1:1 with Word)	Windows/macOS only, requires Word license
LibreOffice local	High (varies with installed fonts)	Slow startup, single-instance bottleneck

The most common fidelity issue is fonts. If your DOCX uses Calibri (Microsoft's default) and the renderer doesn't have Calibri installed, it falls back to a similar font and pagination shifts. The fix is to embed fonts in the DOCX before conversion (Word: File > Options > Save > Embed fonts in the file), or set explicit font fallbacks in the document style.
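
To see which fonts a document depends on before converting, you can inspect the DOCX package directly, since it is a ZIP archive. This sketch rests on two assumptions about the package layout: the fonts a document references are listed in word/fontTable.xml, and embedded fonts are stored under word/fonts/. The function name is hypothetical, and the font table can list fonts that the body text never uses.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by fontTable.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_font_report(docx_path: str) -> dict:
    """List fonts declared in a DOCX and whether any font files are embedded."""
    with zipfile.ZipFile(docx_path) as zf:
        names = zf.namelist()
        declared = []
        if "word/fontTable.xml" in names:
            root = ET.fromstring(zf.read("word/fontTable.xml"))
            for font in root.iter(f"{{{W_NS}}}font"):
                name = font.get(f"{{{W_NS}}}name")
                if name:
                    declared.append(name)
        # Assumed location of embedded font parts inside the package.
        embedded = [n for n in names if n.startswith("word/fonts/")]
    return {"declared": declared, "has_embedded_fonts": bool(embedded)}
```

If the report lists a font such as Calibri and has_embedded_fonts is False, the renderer needs that font installed or it will substitute another one.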

Production tips

Set realistic timeouts. A 100-page DOCX with images can take 30-60 seconds to convert. Use 120s timeout minimum.
Validate input before sending. Empty DOCX files, password-protected DOCX, and corrupt OOXML all fail conversion. Try opening the file with python-docx (docx.Document(path)) locally before hitting the API.
Handle the 502 case. Conversions can fail on edge-case DOCX (heavily nested tables, embedded VBA macros, broken styles). Surface the error to users with a clear message.
Cache results. If users convert the same DOCX repeatedly, hash the file and cache the PDF. Saves money and latency.
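
The validation and caching tips can be sketched with the standard library alone. This is an illustration under assumptions, not a complete solution: looks_like_docx is a lighter check than opening the file with python-docx (it only confirms the file is a ZIP archive containing word/document.xml, which empty and password-protected files are not), and cached_convert takes any callable with the signature convert(docx_path, output_path). All helper names are hypothetical.

```python
import hashlib
import zipfile
from pathlib import Path


def looks_like_docx(path: str) -> bool:
    """Cheap sanity check: a ZIP archive that contains word/document.xml."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        return "word/document.xml" in zf.namelist()


def file_sha256(path: str) -> str:
    """Hash the file in 1 MiB chunks so large documents are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_convert(docx_path: str, cache_dir: str, convert) -> str:
    """Return a cached PDF path, calling convert() only on a cache miss."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    pdf_path = cache / f"{file_sha256(docx_path)}.pdf"
    if not pdf_path.exists():
        convert(docx_path, str(pdf_path))
    return str(pdf_path)
```

Keying the cache on the content hash rather than the filename means a renamed copy of the same document still hits the cache, while an edited document under the same name does not.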

For one-off conversions on your own machine, install LibreOffice and shell out — it's free and good enough. For SaaS uploads, the API saves you from packaging LibreOffice in your Docker image and dealing with its headless quirks. Get a free API key with 1,000 conversions/month.
